Monday, April 6, 2020

Bigram, Trigram, and NGram Models in NLP


Bigram, Trigram, and N-gram models in NLP: how to calculate the unigram, bigram, trigram, and n-gram probabilities of a sentence, and how to use maximum likelihood estimation to estimate these n-gram probabilities.





How to use an N-gram model to estimate the probability of a word sequence?
Let us consider Equation 1 again. For a Unigram model, how would we change Equation 1?
P(w1, w2, ..., wn) ≈ P(w1) × P(w2) × P(w3) × ... × P(wn)
Example:
P(the prime minister of our country) ≈ P(the) × P(prime) × P(minister) × P(of) × P(our) × P(country)
For a Bigram model, how would we change Equation 1?
P(w1, w2, ..., wn) ≈ P(w1 | <s>) × P(w2 | w1) × P(w3 | w2) × ... × P(wn | wn-1)
That is, the probability of the sequence is the product of the individual bigram probabilities P(wi | wi-1), where the sentence start marker <s> stands in for the word before w1.
Example:
P(the prime minister of our country) ≈ P(the | <s>) × P(prime | the) × P(minister | prime) × P(of | minister) × P(our | of) × P(country | our)

For a Trigram model, how would we change Equation 1?
P(w1, w2, ..., wn) ≈ P(w1 | <s>, <s>) × P(w2 | <s>, w1) × P(w3 | w1, w2) × ... × P(wn | wn-2, wn-1)
That is, the probability of the sequence is the product of the individual trigram probabilities P(wi | wi-2, wi-1), where two sentence start markers <s> stand in for the missing words before w1.
Example:
P(the prime minister of our country) ≈ P(the | <s>, <s>) × P(prime | <s>, the) × P(minister | the, prime) × P(of | prime, minister) × P(our | minister, of) × P(country | of, our)

Now, let us generalize the above Unigram, Bigram, and Trigram calculations of a word sequence probability into equations.
Unigram:
P(w1, w2, ..., wn) ≈ ∏ (i = 1 to n) P(wi)
Bigram:
P(w1, w2, ..., wn) ≈ ∏ (i = 1 to n) P(wi | wi-1)
In a Bigram model, for i = 1 either the sentence start marker (<s>) or an empty string is used as the word wi-1, standing in for the missing word before the start of the sentence or word sequence.
Trigram:
P(w1, w2, ..., wn) ≈ ∏ (i = 1 to n) P(wi | wi-2, wi-1)
In a Trigram model, two sentence start markers (or empty strings) are prepended to every sentence or word sequence, so that for i = 1 both wi-2 and wi-1 are markers, and for i = 2 the marker serves as wi-2.
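
To make these product formulas concrete, here is a minimal Python sketch; it is not from the original post, and the lookup tables unigram_p, bigram_p, and trigram_p are assumed to be already estimated elsewhere (for example, with the MLE counts described in the next section).

# Minimal sketch of the three product formulas above.
# unigram_p, bigram_p, and trigram_p are assumed, precomputed probability tables.

def unigram_prob(sentence, unigram_p):
    """P(w1 ... wn) ~= product of P(wi)."""
    p = 1.0
    for w in sentence:
        p *= unigram_p.get(w, 0.0)
    return p

def bigram_prob(sentence, bigram_p, start="<s>"):
    """P(w1 ... wn) ~= product of P(wi | wi-1), with <s> as the word before w1."""
    p = 1.0
    prev = start
    for w in sentence:
        p *= bigram_p.get((prev, w), 0.0)
        prev = w
    return p

def trigram_prob(sentence, trigram_p, start="<s>"):
    """P(w1 ... wn) ~= product of P(wi | wi-2, wi-1), with two <s> markers before w1."""
    p = 1.0
    prev2, prev1 = start, start
    for w in sentence:
        p *= trigram_p.get((prev2, prev1, w), 0.0)
        prev2, prev1 = prev1, w
    return p

# Example usage (the tables themselves would come from corpus counts):
# sentence = ["the", "prime", "minister", "of", "our", "country"]
# print(bigram_prob(sentence, bigram_p))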

How do we estimate these N-gram probabilities?
We can use Maximum Likelihood Estimation (MLE) to estimate the Bigram and Trigram probabilities. We get the MLE estimate for the parameters of an N-gram model by taking counts from a corpus and normalizing them so that they lie between 0 and 1.
For Bigram probability,
P(wi | wi-1) = count(wi-1 wi) / count(wi-1)
Example:
P(minister | prime) = count(prime minister) / count(prime)
The bigram probability is calculated by dividing the number of times the string “prime minister” appears in the given corpus by the total number of times the word “prime” appears in the same corpus.
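
As a rough illustration of this count-and-divide estimate, the following Python sketch builds counts from a tiny made-up corpus (an assumption for demonstration only; any tokenized corpus would work) and computes P(minister | prime) = count(prime minister) / count(prime).

from collections import Counter

# Toy corpus, made up purely for illustration; each inner list is a tokenized sentence.
corpus = [
    ["the", "prime", "minister", "of", "our", "country"],
    ["the", "prime", "minister", "addressed", "the", "nation"],
    ["a", "prime", "number", "has", "two", "factors"],
]

unigram_counts = Counter()
bigram_counts = Counter()
for sent in corpus:
    padded = ["<s>"] + sent                      # sentence start marker, as discussed above
    unigram_counts.update(padded)
    bigram_counts.update(zip(padded, padded[1:]))

def mle_bigram(w_prev, w):
    """MLE estimate: P(w | w_prev) = count(w_prev w) / count(w_prev)."""
    return bigram_counts[(w_prev, w)] / unigram_counts[w_prev]

# count(prime minister) = 2 and count(prime) = 3, so P(minister | prime) = 2/3
print(mle_bigram("prime", "minister"))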

For Trigram probability,
P(wi | wi-2, wi-1) = count(wi-2 wi-1 wi) / count(wi-2 wi-1)
Example:
P(of | prime, minister) = count(prime minister of) / count(prime minister)
The trigram probability is calculated by dividing the number of times the string “prime minister of” appears in the given corpus by the total number of times the string “prime minister” appears in the same corpus.
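
The same counting idea extends to trigrams. Below is a self-contained sketch that reuses the same assumed toy corpus as in the bigram sketch above.

from collections import Counter

# Same made-up toy corpus as in the bigram sketch above.
corpus = [
    ["the", "prime", "minister", "of", "our", "country"],
    ["the", "prime", "minister", "addressed", "the", "nation"],
    ["a", "prime", "number", "has", "two", "factors"],
]

bigram_counts, trigram_counts = Counter(), Counter()
for sent in corpus:
    padded = ["<s>", "<s>"] + sent               # two start markers for the trigram model
    bigram_counts.update(zip(padded, padded[1:]))
    trigram_counts.update(zip(padded, padded[1:], padded[2:]))

def mle_trigram(w_prev2, w_prev1, w):
    """MLE estimate: P(w | w_prev2, w_prev1) = count(w_prev2 w_prev1 w) / count(w_prev2 w_prev1)."""
    return trigram_counts[(w_prev2, w_prev1, w)] / bigram_counts[(w_prev2, w_prev1)]

# count(prime minister of) = 1 and count(prime minister) = 2, so P(of | prime, minister) = 1/2
print(mle_trigram("prime", "minister", "of"))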







