Tuesday, March 1, 2022

Natural Language Processing Solved MCQ - Bigram Model

Natural Language Processing MCQ - Bigram probability calculation

1. Find the probability P(Alice | is) under the bi-gram model, using the corpus given below:

                                   <s> My name is Alice </s>

                                   <s> Alice my name is </s>

                                   <s> A girl said that her name is Alice </s>

                                   <s> My daughter’s name is Alice </s>

a) 0

b) 0.75

c) 0.25

d) 0.5


Answer: (b) 0.75

The required bi-gram probability is the count of the bi-gram 'is Alice' divided by the count of the uni-gram 'is', with both counts taken from the given corpus. This ratio is the Maximum Likelihood Estimate (MLE):

P(Alice | is) = Count('is Alice')/Count('is') = 3/4 = 0.75
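The bi-gram 'is Alice' occurs 3 times and the uni-gram 'is' occurs 4 times in the given corpus. In the second sentence, 'is' is followed by </s> rather than 'Alice', so that occurrence counts toward 'is' but not toward 'is Alice'.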
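
The same counts can be checked with a short script. This is only a minimal sketch: it assumes whitespace tokenization and case-sensitive matching (so 'My' and 'my' are different tokens), and the names `corpus`, `unigram_counts`, `bigram_counts` and `bigram_mle` are illustrative, not part of the original question.

```python
from collections import Counter

# The four sentences of the corpus, with sentence boundary markers.
corpus = [
    "<s> My name is Alice </s>",
    "<s> Alice my name is </s>",
    "<s> A girl said that her name is Alice </s>",
    "<s> My daughter’s name is Alice </s>",
]

unigram_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = sentence.split()  # assumption: split on whitespace only
    unigram_counts.update(tokens)
    # Pair each token with its successor to form the bi-grams.
    bigram_counts.update(zip(tokens, tokens[1:]))


def bigram_mle(word, previous):
    """MLE of P(word | previous) = Count(previous word) / Count(previous)."""
    return bigram_counts[(previous, word)] / unigram_counts[previous]


print(bigram_mle("Alice", "is"))  # 3/4 = 0.75
```

Under the same assumptions, `bigram_mle("</s>", "is")` gives the remaining 1/4 = 0.25, because the only other token that follows 'is' in this corpus is the end-of-sentence marker.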


