Saturday, January 1, 2022

Natural Language Processing MCQ - Find the perplexity of the language model

1. Consider the following corpus:

S1: You have five minutes remaining till the end of the test

S2: You have submitted the test

S3: You are given five marks for the correct answer

Suppose that sentence S2 of the given corpus is the test set. What is the perplexity of the test set? Assume a bigram language model estimated from the whole corpus is used.

(a) 1.14

(b) 1.42

(c) 1.35

(d) 1.43

Answer: (c) 1.35

Let us first find the probability of S2 under the bigram model:

P(You|<s>) = Count(<s>, You)/Count(<s>) = 3/3 = 1.

[Hint: Numerator: the bigram “<s> You” occurs 3 times in the corpus, because “You” starts all three sentences. Denominator: the unigram “<s>” (start symbol) occurs 3 times in the corpus.]

P(have|You) = 2/3

P(submitted|have) = 1/2

P(the|submitted) = 1/1 = 1

P(test|the) = 2/4 = 1/2

P(</s>|test) = 2/2 = 1

P(S2) = 1 * (2/3) * (1/2) * 1 * (1/2) * 1 = 1/6
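
The counts above can be checked with a short script. This is a minimal sketch, assuming whitespace tokenization, case-sensitive matching, and unsmoothed maximum-likelihood estimates; the helper names (tokenize, bigram_prob, sentence_prob) are illustrative and not part of the original solution.

```python
from collections import Counter
from fractions import Fraction

corpus = [
    "You have five minutes remaining till the end of the test",
    "You have submitted the test",
    "You are given five marks for the correct answer",
]

def tokenize(sentence):
    # Assumption: split on whitespace and wrap with <s> ... </s>.
    return ["<s>"] + sentence.split() + ["</s>"]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = tokenize(sentence)
    # Count every token that can act as a context (all but the final </s>).
    unigrams.update(tokens[:-1])
    bigrams.update(zip(tokens, tokens[1:]))

def bigram_prob(prev, word):
    # Maximum-likelihood estimate: Count(prev, word) / Count(prev).
    return Fraction(bigrams[(prev, word)], unigrams[prev])

def sentence_prob(sentence):
    # Product of the bigram probabilities along the sentence.
    tokens = tokenize(sentence)
    prob = Fraction(1)
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word)
    return prob

p_s2 = sentence_prob(corpus[1])
print(p_s2)  # 1/6
```

Using Fraction keeps the arithmetic exact, so the result can be compared directly with the hand computation.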

What is perplexity?

Perplexity is the inverse probability of the test set, normalized by the number of words. It is an intrinsic evaluation method: the lower the perplexity, the better the model predicts the test set.

The perplexity (PP) of the test sentence can be computed using the following equation:

PP(W) = P(w₁w₂w₃…w_N)^(-1/N)

w₁, w₂, … are the words in the test set and N is the total count of word tokens in the test set including </s> (but excluding <s>).

For S2, N = 6 (You, have, submitted, the, test, </s>), so:

PP(S2) = (1/6)^(-1/6) = 6^(1/6) ≈ 1.35
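
The same number can be reproduced numerically. The sketch below assumes the six bigram probabilities for S2 are supplied as a list, one per counted token (including </s>, excluding <s>); the function name perplexity is illustrative. It works in log space, which is a common way to avoid underflow on longer test sets.

```python
import math

def perplexity(word_probs):
    # N is the number of predicted tokens, one probability per token.
    n = len(word_probs)
    # Sum of log-probabilities equals the log of the sequence probability.
    log_prob = sum(math.log(p) for p in word_probs)
    # PP = P(W)^(-1/N) = exp(-log P(W) / N)
    return math.exp(-log_prob / n)

# Bigram probabilities for: You, have, submitted, the, test, </s>
s2_probs = [1, 2 / 3, 1 / 2, 1, 1 / 2, 1]
pp = perplexity(s2_probs)
print(round(pp, 2))  # 1.35
```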

Reason for including begin and end sentence markers <s> and </s>

Since this sequence will cross many sentence boundaries, we need to include the begin- and end-sentence markers <s> and </s> in the probability computation. We also need to include the end-of-sentence marker </s> (but not the beginning-of-sentence marker <s>) in the total count of word tokens N.

Source: Speech and Language Processing by Daniel Jurafsky and James H. Martin
