Tuesday, March 8, 2022

Solved MCQ in Natural Language Processing - Maximum Likelihood Estimate in Language Models


Natural Language Processing MCQ - Bigram probability calculation using MLE

 

1. Using the Maximum Likelihood Estimate (MLE) to compute the bigram probability P(w_n | w_{n-1}), we count the occurrences of the bigram (w_{n-1} w_n) in a corpus and normalize by the total count of all bigrams that start with w_{n-1}. This normalization step ensures that the estimate lies between 0 and 1.

P(w_n | w_{n-1}) = Count(w_{n-1} w_n) / Sum_w Count(w_{n-1} w)

Here, w ranges over every word that follows w_{n-1}.

This equation can be simplified by replacing the sum of bigram counts in the denominator with the unigram count of w_{n-1}. Why can we do that?

a)    Bigram count can only be normalized by unigram count

b)    The sum of all bigram counts that start with the word w_{n-1} is equal to the unigram count of the same word

c)     Normalization using the bigram count will make the estimate greater than 1 in some cases.

d)   None of the above.

 

Answer: (b) The sum of all bigram counts that start with the word w_{n-1} is equal to the unigram count of the same word
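
The identity behind option (b) can be written out as a short derivation. This is only a sketch in LaTeX notation. It assumes that every occurrence of w_{n-1} is followed by some token, which holds, for example, when sentences are padded with an end-of-sentence marker (an assumption that the question itself does not state).

```latex
\sum_{w} C(w_{n-1}\, w) = C(w_{n-1})
\quad\Longrightarrow\quad
P(w_n \mid w_{n-1})
  = \frac{C(w_{n-1}\, w_n)}{\sum_{w} C(w_{n-1}\, w)}
  = \frac{C(w_{n-1}\, w_n)}{C(w_{n-1})}
```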

Let us calculate the bigram probability P(increase | to) using both normalizations: by the sum of bigram counts and by the unigram count. (Note: hereafter, ‘C’ stands for ‘Count’.)

 

Normalizing by sum of all bigram counts

 

For this case, we need to normalize using the total count of bigrams that start with the word “to”.

 

P(increase | to) = C(“to increase”)/[C(“to increase”)+C(“to be”)+C(“to fill”)] = 2/[2+1+1] = 2/4 = 0.5

 

Normalizing by unigram count

 

For this case, we need to normalize using the unigram count of the same word “to”.

 

P(increase | to) = C(“to increase”)/C(“to”) = 2/4 = 0.5
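
The two calculations above can be checked with a few lines of Python. This is a minimal sketch that plugs in the counts quoted in the worked example. The variable names are illustrative and not part of the original post.

```python
# Counts as stated in the worked example (C = Count).
bigram_counts_starting_with_to = {
    ("to", "increase"): 2,
    ("to", "be"): 1,
    ("to", "fill"): 1,
}
unigram_count_to = 4  # C("to")

target = bigram_counts_starting_with_to[("to", "increase")]

# Normalize by the sum of all bigram counts that start with "to".
p_bigram_norm = target / sum(bigram_counts_starting_with_to.values())

# Normalize by the unigram count of "to".
p_unigram_norm = target / unigram_count_to

print(p_bigram_norm, p_unigram_norm)  # 0.5 0.5
```

Both denominators equal 4, so the two estimates agree.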

 

The word “to” occurs only 4 times in the corpus, and each occurrence is followed by exactly one word. Hence, the counts of all bigrams that start with “to” sum to exactly 4 (here 2 + 1 + 1), which is the unigram count of “to”. For this reason, we can simplify the equation by normalizing by the unigram count instead of the sum of all bigram counts.
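
The same check can be run on raw text. The sketch below counts unigrams and bigrams from a token list and computes the MLE bigram probability. The toy sentence is a hypothetical stand-in chosen to reproduce the counts above; it is not the corpus used in the original question. The function names are likewise illustrative.

```python
from collections import Counter


def bigram_and_unigram_counts(tokens):
    """Count unigrams and adjacent-word bigrams in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def mle_bigram_prob(tokens, prev_word, word):
    """MLE estimate of P(word | prev_word) = C(prev_word word) / C(prev_word)."""
    unigrams, bigrams = bigram_and_unigram_counts(tokens)
    if unigrams[prev_word] == 0:
        raise ValueError(f"{prev_word!r} does not occur in the corpus")
    return bigrams[(prev_word, word)] / unigrams[prev_word]


# Hypothetical toy corpus (an assumption, not the original post's corpus).
tokens = (
    "we want to increase sales and to increase profit "
    "to be able to fill orders"
).split()

unigrams, bigrams = bigram_and_unigram_counts(tokens)

# Sum of the counts of all bigrams that start with "to".
followers_of_to = sum(c for (first, _), c in bigrams.items() if first == "to")

print(unigrams["to"], followers_of_to)            # 4 4
print(mle_bigram_prob(tokens, "to", "increase"))  # 0.5
```

One caveat: the very last token of a corpus has no following word, so for that token the bigram sum is one less than its unigram count unless an end-of-sentence marker is appended. In this toy corpus that affects only “orders”, not “to”.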


 


