## Bigram, Trigram, and N-gram in NLP: how to calculate the unigram, bigram, trigram, and n-gram probabilities of a sentence, and how to estimate them with maximum likelihood estimation

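__How to use an N-gram model to estimate the probability of a word sequence?__

Let us consider Equation 1 again.

For **a Unigram model**, how would we change Equation 1? A unigram model treats every word as independent of the words before it.

**Example:**

For **a bigram model**, how would we change Equation 1? A bigram model conditions each word only on the word immediately before it.

**Example:**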

Now, let us generalize the above examples of unigram, bigram, and trigram calculation of a word sequence into equations.

__Unigram:__

$$P(w_1 w_2 \ldots w_n) \approx \prod_{i=1}^{n} P(w_i)$$

__Bigram:__

$$P(w_1 w_2 \ldots w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})$$

In a **Bigram model**, for **i=1**, either the sentence start marker (`<s>`) or an empty string could be used as the word $w_{i-1}$. [*The empty string could be used as the start of the sentence or word sequence*].

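__Trigram:__

$$P(w_1 w_2 \ldots w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-2}\, w_{i-1})$$

In a **Trigram model**, for **i=1** and **i=2**, two empty strings could be used as the words $w_{i-1}$ and $w_{i-2}$ that fall before the start of the sequence. [*The empty strings could be used as the start of every sentence or word sequence*].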
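The padding rule can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names, the `<s>` start marker, and the toy probability table are all assumptions made up for the example, and unseen n-grams are simply given probability 0.

```python
from math import prod

def padded_ngrams(words, n, start="<s>"):
    """Yield (context, word) pairs, padding the start with n-1 markers."""
    padded = [start] * (n - 1) + list(words)
    for i in range(n - 1, len(padded)):
        # The context is the n-1 tokens immediately before position i.
        yield tuple(padded[i - n + 1:i]), padded[i]

def sequence_probability(words, cond_prob, n):
    """Multiply P(word | context) over the sequence under an n-gram model.

    cond_prob maps (context_tuple, word) -> probability.
    Pairs missing from the table are treated as probability 0.0.
    """
    return prod(cond_prob.get(pair, 0.0) for pair in padded_ngrams(words, n))

# Hypothetical bigram probabilities, chosen only for illustration.
bigram_table = {
    (("<s>",), "the"): 0.5,
    (("the",), "cat"): 0.25,
    (("cat",), "sleeps"): 0.5,
}

p = sequence_probability(["the", "cat", "sleeps"], bigram_table, n=2)
print(p)  # 0.5 * 0.25 * 0.5 = 0.0625
```

With `n=1` the context is always the empty tuple, which corresponds to the unigram case; with `n=3` two start markers are prepended, matching the trigram case.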

__How do we estimate these N-gram probabilities?__
We can use Maximum Likelihood Estimation (MLE) to estimate the Bigram and Trigram probabilities. We get the MLE estimate for the parameters of an *N*-gram model by taking counts from a corpus, and **normalizing** them so they lie between 0 and 1.
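As a small illustration of "count, then normalize", here is a sketch for the simplest case, unigram probabilities. The toy corpus and the helper name `normalize` are assumptions for this example only.

```python
from collections import Counter

def normalize(counts):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {item: c / total for item, c in counts.items()}

# Hypothetical toy corpus, already tokenized by whitespace.
tokens = "the prime minister met the prime minister of japan".split()

unigram_probs = normalize(Counter(tokens))
print(unigram_probs["prime"])  # 2 occurrences out of 9 tokens
```

Each value lies between 0 and 1, and the values together sum to 1 (up to floating-point rounding), which is what the normalization step is meant to guarantee.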

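__For Bigram probability,__

$$P(w_i \mid w_{i-1}) = \frac{\text{count}(w_{i-1}\, w_i)}{\text{count}(w_{i-1})}$$

__Example:__

$$P(\text{minister} \mid \text{prime}) = \frac{\text{count}(\text{prime minister})}{\text{count}(\text{prime})}$$

The bigram probability is calculated by dividing the **number of times the string “prime minister” appears in the given corpus** by **the total number of times the word “prime” appears in the same corpus**.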
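The same division can be sketched in Python. The corpus below is a made-up one-line example and `bigram_mle` is a hypothetical helper name; a real corpus would also need proper tokenization and sentence boundaries.

```python
from collections import Counter

def bigram_mle(tokens, prev, word):
    """MLE estimate of P(word | prev) = count(prev word) / count(prev)."""
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    unigram_counts = Counter(tokens)
    if unigram_counts[prev] == 0:
        return 0.0  # the context never occurs, so no estimate is available
    return bigram_counts[(prev, word)] / unigram_counts[prev]

# Hypothetical toy corpus.
tokens = "the prime minister of india met the prime suspect".split()

# "prime minister" occurs once and "prime" occurs twice.
p_minister_given_prime = bigram_mle(tokens, "prime", "minister")
print(p_minister_given_prime)  # 1 / 2 = 0.5
```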

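__For Trigram probability,__

$$P(w_i \mid w_{i-2}\, w_{i-1}) = \frac{\text{count}(w_{i-2}\, w_{i-1}\, w_i)}{\text{count}(w_{i-2}\, w_{i-1})}$$

__Example:__

$$P(\text{of} \mid \text{prime minister}) = \frac{\text{count}(\text{prime minister of})}{\text{count}(\text{prime minister})}$$

The trigram probability is calculated by dividing the **number of times the string “prime minister of” appears in the given corpus** by **the total number of times the string “prime minister” appears in the same corpus**.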
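A comparable sketch for the trigram case, again on a made-up corpus. The helper names `count_ngrams` and `trigram_mle` are assumptions for illustration.

```python
from collections import Counter

def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))

def trigram_mle(tokens, w1, w2, w3):
    """MLE estimate of P(w3 | w1 w2) = count(w1 w2 w3) / count(w1 w2)."""
    trigram_counts = count_ngrams(tokens, 3)
    bigram_counts = count_ngrams(tokens, 2)
    denominator = bigram_counts[(w1, w2)]
    if denominator == 0:
        return 0.0  # the two-word context never occurs
    return trigram_counts[(w1, w2, w3)] / denominator

# Hypothetical toy corpus.
tokens = "the prime minister of india met the prime minister yesterday".split()

# "prime minister of" occurs once and "prime minister" occurs twice.
p_of_given_prime_minister = trigram_mle(tokens, "prime", "minister", "of")
print(p_of_given_prime_minister)  # 1 / 2 = 0.5
```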
----------------------------------------------------------------------------------------------------------------------------


The formula for bigram and trigram is incorrect: for bigram, you skip the first word's probability and start calculating the conditional probability from the second word. The same goes for trigram.
