# Multiple Choice Questions in NLP: Calculating Bigram Probability from Corpus Statistics Using the Maximum Likelihood Estimate

__Natural Language Processing MCQ - Bigram probability calculation using MLE__

1. Using the Maximum Likelihood Estimate (MLE), to compute the bigram probability P(w_{n}|w_{n-1}) we count the occurrences of the bigram (w_{n-1}, w_{n}) in a corpus and normalize by the count of all bigrams that start with w_{n-1}. This normalization step ensures that the estimate lies between 0 and 1.

**P(w_{n}|w_{n-1}) = Count(w_{n-1} w_{n}) / Sum_w Count(w_{n-1} w)**

Here, w ranges over any word that can follow w_{n-1}.
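The MLE formula above can be sketched in Python. This is a minimal illustration, not a production implementation; the function name `bigram_prob` and the toy token list are hypothetical:

```python
from collections import Counter

def bigram_prob(tokens, prev, word):
    """MLE bigram probability P(word | prev): the count of the bigram
    (prev, word) normalized by the count of all bigrams starting with prev."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    numer = bigrams[(prev, word)]
    denom = sum(c for (w1, _), c in bigrams.items() if w1 == prev)
    return numer / denom if denom else 0.0

# Hypothetical toy corpus: "a" is followed by "b" twice and "c" once,
# so P(b | a) = 2/3 under the MLE.
tokens = "a b a c a b".split()
print(bigram_prob(tokens, "a", "b"))
```

Because the denominator sums over every bigram that starts with `prev`, the probabilities of all continuations of `prev` sum to 1.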

This equation can be simplified by replacing the sum of bigram counts in the denominator with the unigram count of w_{n-1}. Why can we do that?

a) Bigram count can only be normalized by unigram count

b) The sum of the counts of all bigrams that start with the word w_{n-1} is equal to the unigram count of that word

c) Normalization using bigram counts would make the estimate greater than 1 in some cases.

d) None of the above.

Let us calculate the bigram probability P(increase | to) using both normalizations. Suppose that in our corpus the word "to" occurs 4 times, in the bigrams "to increase" (twice), "to be" (once), and "to fill" (once).

First, we normalize using the total count of bigrams that start with the word "to":

P(increase | to) = C("to increase") / [C("to increase") + C("to be") + C("to fill")] = 2 / [2 + 1 + 1] = 2/4 = 0.5
Next, we normalize using the unigram count of the same word "to":

P(increase | to) = C("to increase") / C("to") = 2/4 = 0.5

There are only 4 occurrences of the word "to" in the corpus, so the sum of the counts of all bigrams that start with "to" cannot exceed 4; as long as "to" is not the last word of the corpus, every occurrence of "to" starts exactly one bigram, and the two denominators are equal. For this reason, we can simplify the equation by normalizing with the unigram count instead of the sum of all bigram counts.

Answer: (b)
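The worked example can be checked in code. The corpus below is a hypothetical one constructed so that its counts match the example: C("to increase") = 2, C("to be") = 1, C("to fill") = 1, and C("to") = 4:

```python
from collections import Counter

# Hypothetical toy corpus chosen so that the counts match the example above.
tokens = ("we want to increase sales and to increase profit "
          "to be safe and to fill the gap").split()

bigrams = Counter(zip(tokens, tokens[1:]))
unigrams = Counter(tokens)

# Denominator 1: sum of counts of all bigrams that start with "to".
denom_bigram = sum(c for (w1, _), c in bigrams.items() if w1 == "to")
# Denominator 2: unigram count of "to".
denom_unigram = unigrams["to"]

p_bigram = bigrams[("to", "increase")] / denom_bigram    # 2/4 = 0.5
p_unigram = bigrams[("to", "increase")] / denom_unigram  # 2/4 = 0.5
print(denom_bigram, denom_unigram, p_bigram, p_unigram)
```

Both denominators come out to 4 and both estimates to 0.5, confirming option (b): every occurrence of "to" (except one at the very end of a corpus) starts exactly one bigram.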