
Wednesday, May 20, 2020

Natural language processing question bank 01

Differentiate between generative and discriminative models


Question:

Differentiate between generative and discriminative models.


Answer:

Generative and discriminative models are both probabilistic models.

In a generative model, we model the probability of the input x given the label y, i.e., P(x|y): the model learns the joint probability distribution P(x, y) and uses Bayes' theorem to obtain the conditional probability P(y|x).
In a discriminative model, we model the conditional probability P(y|x) directly.

  • Generative models estimate the joint distribution P(x, y), whereas discriminative models estimate the conditional distribution P(y|x) (see the sketch after this list).
  • In generative models, dependence assumptions must be specified, e.g., for the word likelihoods P(w|t) and the tag prior P(t) in an HMM tagger, whereas discriminative models can incorporate arbitrary features when modeling P(t|w) directly.
  • Generative models can be used in unsupervised learning, whereas discriminative models require labeled data and are therefore suited to (semi-)supervised learning.
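As a purely illustrative sketch of the contrast, the snippet below fits a generative classifier (Gaussian Naive Bayes, which models P(x|y) and P(y)) and a discriminative classifier (logistic regression, which models P(y|x) directly) on a synthetic dataset. The dataset and model choices are assumptions for illustration, not part of the original answer.

    # Illustrative only: contrast a generative and a discriminative classifier.
    from sklearn.datasets import make_classification
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # Generative: estimates the class prior P(y) and per-class likelihood P(x|y),
    # then classifies via Bayes' theorem: P(y|x) is proportional to P(x|y) P(y).
    generative = GaussianNB().fit(X, y)

    # Discriminative: estimates the conditional P(y|x) directly.
    discriminative = LogisticRegression().fit(X, y)

    print(generative.predict_proba(X[:1]))      # P(y|x) via Bayes' theorem
    print(discriminative.predict_proba(X[:1]))  # P(y|x) modeled directly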

*************************


Natural language processing question bank 02

Why is it better to compute likelihood and probability values in log space?


Question:

Why is it better to compute likelihood and probability values in log space?


Answer:


Computing likelihood and probability values in log space avoids the underflow problem (see below). Moreover, in log space the multiplication operations are replaced by addition operations, which are usually faster than multiplications on a modern computer.

What is Underflow?

Joint probability computations often involve multiplying many small individual probabilities. Many probabilistic algorithms multiply the probabilities of individual data points, which can drive the result below the smallest positive value the floating-point representation can hold, so it is rounded to zero. This is called underflow.
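A short Python demonstration (my own illustration, assuming double-precision floats) of the underflow and the log-space fix:

    import math

    probs = [0.01] * 1000   # 1,000 small individual probabilities

    # Naive product: 0.01 ** 1000 = 1e-2000, far below the smallest
    # representable double (~1e-308), so the result rounds to 0.0.
    product = 1.0
    for p in probs:
        product *= p
    print(product)        # 0.0  -> underflow

    # Log space: multiplication becomes addition, and the value stays
    # in a comfortable numeric range.
    log_product = sum(math.log(p) for p in probs)
    print(log_product)    # -4605.17... = 1000 * log(0.01)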


*************************



Natural language processing question bank 04

You are building an ngram model of a corpus. Should you stem the words and do the counts or leave them in the surface form? Give pros and cons and include what characteristics of the corpus might influence your decision.


Question:

You are building an ngram model of a corpus. Should you stem the words and do the counts or leave them in the surface form? Give pros and cons and include what characteristics of the corpus might influence your decision.


Answer:


  • Stemming the words means there will be fewer types, since only base forms remain. Some generalizations are thereby captured (He swam, he swims, … → he swim), but some distinctions are lost (I swim vs. she swims); the sketch after this list illustrates the effect on bigram counts.
  • Stemming is a good idea when there is a small amount of data, so that each ngram has few examples, or in highly inflected languages where each word has many different forms.
  • If a large amount of data is available, however, ngrams over the surface forms can be more powerful and precise.
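A small sketch of the trade-off, using NLTK's PorterStemmer as one possible stemmer; the toy corpus and library choice are assumptions for illustration:

    from collections import Counter
    from nltk.stem import PorterStemmer  # pip install nltk

    stemmer = PorterStemmer()
    tokens = "the cat swims the cats swim".split()

    def bigrams(seq):
        return list(zip(seq, seq[1:]))

    # Surface forms: 'cat swims' and 'cats swim' stay distinct bigrams.
    print(Counter(bigrams(tokens)))

    # Stemmed forms: both collapse to ('cat', 'swim'), pooling their counts.
    stems = [stemmer.stem(t) for t in tokens]
    print(Counter(bigrams(stems)))

Note that a stemmer only strips affixes: an irregular form such as "swam" would not be mapped to "swim" (that requires a lemmatizer), so even stemmed counts miss some generalizations.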

*************************


Natural language processing question bank 03

One common way probabilities are assigned to rules in a PCFG is to use a treebank. When will some rules have a probability of 0.0? What is the problem if some rule has a probability of 0.0? How can it be overcome?

 

Question:

One common way probabilities are assigned to rules in a PCFG is to use a treebank. When will some rules have a probability of 0.0? What is the problem if some rule has a probability of 0.0? How can it be overcome?


Answer:


When the treebank has no instance of a rule among its derivation trees, that rule will have a probability of 0.0.
The problem is that any parse tree using a rule with probability 0.0 will itself have probability 0.0, since the probability of a parse tree is computed as the product of the probabilities of all rules used in the derivation.
The same smoothing methods used to handle zero counts in language models can be used here, for example Laplace (add-one) smoothing, which starts every rule with a count of 1.
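A minimal sketch of Laplace-smoothed rule probabilities, assuming hypothetical rule counts extracted from a treebank:

    from collections import Counter

    # Hypothetical counts of VP rules from a treebank; the last rule was
    # never observed, so its unsmoothed probability would be 0.0.
    rule_counts = Counter({
        ("VP", ("V", "NP")): 8,
        ("VP", ("V",)): 2,
        ("VP", ("V", "NP", "PP")): 0,
    })

    def laplace_rule_probs(counts):
        lhs_total = Counter()   # total expansions per left-hand side
        lhs_rules = Counter()   # number of distinct rules per left-hand side
        for (lhs, rhs), c in counts.items():
            lhs_total[lhs] += c
            lhs_rules[lhs] += 1
        # Add-one smoothing: every rule starts with a pseudo-count of 1.
        return {(lhs, rhs): (c + 1) / (lhs_total[lhs] + lhs_rules[lhs])
                for (lhs, rhs), c in counts.items()}

    for rule, p in laplace_rule_probs(rule_counts).items():
        print(rule, round(p, 3))
    # VP -> V NP    : 9/13 ~ 0.692
    # VP -> V       : 3/13 ~ 0.231
    # VP -> V NP PP : 1/13 ~ 0.077  (no longer zero)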


*************************






Natural language processing question bank 06

Assume that we have the Hidden Markov Model (HMM). If each of the states can take on k different values and a total of m different observations are possible (across all states), how many parameters are required to fully define this HMM? Justify your answer.


Question:

Assume that we have the Hidden Markov Model (HMM). If each of the states can take on k different values and a total of m different observations are possible (across all states), how many parameters are required to fully define this HMM? Justify your answer.


Answer:


There are a total of three probability distributions that define the HMM: the initial probability distribution, the transition probability distribution, and the emission probability distribution.
  • There are a total of k states, so k parameters are required to define the initial probability distribution.
  • For the transition distribution, we can transition from any one of the k states to any of the k states (including staying in the same state), so k² parameters are required.
  • We need a total of km parameters for the emission probability distribution, since each of the k states can emit each of the m observations.
Thus, the total number of parameters required is k + k² + km. Note that the number of parameters does not depend on the length of the observation sequence.
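As a quick sanity check, here is a tiny helper (illustrative only, with a hypothetical POS-tagging example) that computes the total:

    def hmm_parameter_count(k: int, m: int) -> int:
        """Parameters for an HMM with k states and m observation symbols."""
        initial = k          # pi: one entry per starting state
        transition = k * k   # A: k x k state-transition matrix
        emission = k * m     # B: one entry per (state, observation) pair
        return initial + transition + emission

    # Hypothetical example: 45 POS tags, 10,000-word vocabulary.
    print(hmm_parameter_count(45, 10_000))   # 45 + 2025 + 450000 = 452070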

*************************



