## MCQ questions and Answers in NLP

### 1. Which of the following is an advantage of Porter stemmer over a full morphological parser?

a) The stemmer is better justified from a theoretical point of view
b) The output of a stemmer is always a valid word
c) The stemmer does not require a detailed lexicon to implement
d) None of the above

Answer: (c) The stemmer does not require a detailed lexicon to implement.

The Porter stemming algorithm removes suffixes from English words by rule. It was designed under the assumption that no stem dictionary (lexicon) is available and that the goal is to improve Information Retrieval performance. Stemming algorithms are typically rule-based: a crude heuristic process that chops off the ends of words, often including derivational affixes, in the hope of producing the right stem most of the time. A full morphological parser, by contrast, requires a detailed lexicon.
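The rule-based, lexicon-free idea can be sketched as a toy suffix stripper. This is an illustrative simplification, not the actual Porter algorithm, and the suffix list and minimum-stem length are made-up choices:

```python
# Toy rule-based stemmer in the spirit of Porter's approach (not the real
# algorithm): strip a known suffix by rule, with no lexicon involved.
SUFFIXES = ["ational", "ization", "fulness", "ement", "ing", "ed", "ly", "es", "s"]

def toy_stem(word):
    for suffix in SUFFIXES:
        # Only strip if a reasonably long stem remains (arbitrary threshold).
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word  # no rule matched; leave the word unchanged

print(toy_stem("jumping"))     # -> jump
print(toy_stem("cats"))        # -> cat
print(toy_stem("relational"))  # -> rel
```

Note that `relational` stems to `rel`, which is not a valid English word; this is exactly why option (b) is false for stemmers.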

### 2. Given a set of unigram and bigram probabilities, what is the probability of the sequence ‘<s> do Sam I like’ according to the bigram language model?

P(do|<s>) = 2/11, P(do|Sam) = 1/11, P(Sam|<s>) = 4/11, P(Sam|do) = 1/8, P(I|Sam) = 4/11, P(Sam|I) = 2/9, P(I|do) = 2/8, P(I|like) = 2/7, P(like|I) = 3/11, P(do) = 3/8, P(Sam) = 2/11, P(I) = 4/11, P(like) = 5/11
a) 3/11 * 2/11 * 4/11 * 5/11
b) 2/11 * 1/8 * 4/11 * 3/11
c) 2/11 * 1/11 * 2/9 * 2/7
d) 2/11 + 1/11 + 2/9 + 2/7

Answer: (b) 2/11 * 1/8 * 4/11 * 3/11.

Under the bigram language model, the probability of a word sequence is the product of the bigram conditional probabilities of its consecutive word pairs, where each factor is P(current word | previous word). Here: P(do|<s>) * P(Sam|do) * P(I|Sam) * P(like|I) = 2/11 * 1/8 * 4/11 * 3/11.
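The calculation can be checked exactly with Python's `fractions` module, using only the four bigram probabilities the sequence actually needs:

```python
from fractions import Fraction

# Bigram conditional probabilities from the question (only the ones used
# by the sequence '<s> do Sam I like').
bigram_p = {
    ("<s>", "do"): Fraction(2, 11),
    ("do", "Sam"): Fraction(1, 8),
    ("Sam", "I"): Fraction(4, 11),
    ("I", "like"): Fraction(3, 11),
}

def sequence_probability(tokens):
    """Multiply P(w_i | w_{i-1}) over consecutive pairs in the sequence."""
    p = Fraction(1)
    for prev, cur in zip(tokens, tokens[1:]):
        p *= bigram_p[(prev, cur)]
    return p

p = sequence_probability(["<s>", "do", "Sam", "I", "like"])
print(p)  # 2/11 * 1/8 * 4/11 * 3/11 = 3/1331
```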

### 3. Which of the following is TRUE about CRF (Conditional Random Field) and HMM (Hidden Markov Model)?

a) CRF is a generative model and HMM is a discriminative model
b) Both CRF and HMM are generative models
c) CRF is a discriminative model and HMM is a generative model
d) Both CRF and HMM are discriminative models

Answer: (c) CRF is a discriminative model and HMM is a generative model.

Both are probabilistic models, but a generative model learns the joint probability distribution P(x, y), whereas a discriminative model learns the conditional probability P(y|x). An HMM explicitly describes a prior distribution over states, not just the conditional distribution of the output given the current state, so it defines a joint distribution over states and outputs; like any generative model, it could in principle generate data following the distribution it models. A CRF, as a discriminative classifier, instead models the decision boundary between the different classes directly.
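The joint-versus-conditional distinction can be made concrete with a tiny count-based example. The (word, tag) corpus below is hypothetical, chosen only to show that the two estimates answer different questions:

```python
from collections import Counter

# Tiny labeled corpus of (word, tag) pairs -- made-up data for illustration.
data = [("can", "VERB"), ("can", "NOUN"), ("can", "VERB"), ("run", "VERB")]

joint = Counter(data)                 # counts of (x, y) pairs, for P(x, y)
n = len(data)                         # total number of examples
word_counts = Counter(x for x, _ in data)  # counts of x alone, for P(y | x)

# Generative view: joint probability of seeing the pair ("can", VERB).
p_joint = joint[("can", "VERB")] / n                     # 2/4 = 0.5
# Discriminative view: probability of tag VERB given the word "can".
p_cond = joint[("can", "VERB")] / word_counts["can"]     # 2/3
print(p_joint, p_cond)
```

A generative sequence model like an HMM estimates quantities of the first kind (and derives the second via Bayes' rule); a discriminative model like a CRF parameterizes the second kind directly.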

### 4. Knowledge of the relationship of meaning to the goals and intentions of the speaker is ________

a) Morphology
b) Semantics
c) Pragmatics
d) Discourse

Answer: (c) Pragmatics.

Pragmatics studies how people use language, with a focus on conversational implicature: the process by which a speaker implies and a listener infers meaning that is not directly stated.

### 5. Which of the following NLP problems can be solved with Hidden Markov Model (HMM)?

a) POS tagging
b) Speech recognition
c) Spelling correction
d) All of the above

Answer: (d) All of the above.

An HMM is a probabilistic sequence model: given a sequence of units (words, letters, morphemes, sentences, and so on), it computes a probability distribution over possible sequences of labels and chooses the best label sequence. POS tagging, speech recognition, and context-sensitive spelling correction can all be framed as such sequence-labeling problems.
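Decoding the best label sequence from an HMM is done with the Viterbi algorithm. Below is a minimal sketch for a two-tag POS example; all the probabilities are made-up illustrative numbers, not estimates from real data:

```python
# Toy HMM: two tags, two words, hand-picked probabilities for illustration.
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.6, "VERB": 0.4}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.6, "VERB": 0.4}}
emit_p = {"NOUN": {"dogs": 0.5, "run": 0.1},
          "VERB": {"dogs": 0.1, "run": 0.6}}

def viterbi(observations):
    # V[t][s] = probability of the best tag sequence ending in state s at time t.
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = []  # backpointers for recovering the best path
    for obs in observations[1:]:
        col, ptr = {}, {}
        for s in states:
            prev, p = max(((r, V[-1][r] * trans_p[r][s]) for r in states),
                          key=lambda x: x[1])
            col[s] = p * emit_p[s][obs]
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    best = max(states, key=lambda s: V[-1][s])
    path = [best]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["dogs", "run"]))  # -> ['NOUN', 'VERB']
```

The same dynamic-programming decoder applies whether the observations are words (POS tagging), acoustic frames (speech recognition), or observed spellings (spelling correction); only the state and emission definitions change.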

*************