Saturday, April 4, 2020

Naive Bayes classifier exercise using smoothing



Naive Bayes Classifier Solved Exercise



Question:
Assume that a Naive Bayes classifier has a vocabulary that consists of 28345 word types. Suppose that training the classifier on a collection of movie reviews gave us the following:
count(Enthiran, +) = 25, count(Enthiran, −) = 0, 𝑁+ = 40430, 𝑁− = 38299
Here, count(𝑤, 𝑐) is the number of occurrences of the word 𝑤 in documents of class 𝑐, + refers to the positive reviews class, − refers to the negative reviews class, and 𝑁𝑐 is the total number of word occurrences in documents of class 𝑐. Estimate 𝑃(Enthiran | +) and 𝑃(Enthiran | −) using Maximum Likelihood estimation with Add-k smoothing, with 𝑘 = 0.01.

Solution:
Given,
|V| = 28345
count(Enthiran, +) = 25
count(Enthiran, −) = 0
𝑁+ = 40430
𝑁− = 38299
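As per the maximum likelihood estimate, the probability of a word 𝑤 given a class 𝑐 is calculated as follows:
𝑃(𝑤 | 𝑐) = count(𝑤, 𝑐) / 𝑁𝑐
Without smoothing, this gives 𝑃(Enthiran | −) = 0 / 38299 = 0, which would force the score of every review containing "Enthiran" to zero for the negative class.
We are also asked to apply Add-k smoothing with 𝑘 = 0.01. Hence, the above equation is modified as follows:
𝑃(𝑤 | 𝑐) = (count(𝑤, 𝑐) + 𝑘) / (𝑁𝑐 + 𝑘 × |V|)
With this equation, we can calculate the probabilities:
𝑃(Enthiran | +) = (25 + 0.01) / (40430 + 0.01 × 28345) = 25.01 / 40713.45 ≈ 0.000614
𝑃(Enthiran | −) = (0 + 0.01) / (38299 + 0.01 × 28345) = 0.01 / 38582.45 ≈ 0.000000259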
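
The Add-k estimate can also be checked numerically. The following is a minimal Python sketch, not part of the original exercise; the function names mle_likelihood and add_k_likelihood are illustrative assumptions, and the counts are the ones given in the question.

```python
def mle_likelihood(count_wc, n_c):
    """Unsmoothed maximum likelihood estimate of P(w | c)."""
    return count_wc / n_c


def add_k_likelihood(count_wc, n_c, vocab_size, k):
    """Add-k smoothed estimate of P(w | c): add k to the word count
    and k * |V| to the total number of word occurrences in the class."""
    return (count_wc + k) / (n_c + k * vocab_size)


# Values given in the question.
VOCAB_SIZE = 28345   # |V|
K = 0.01             # smoothing constant k
COUNT_POS, N_POS = 25, 40430   # count(Enthiran, +), N+
COUNT_NEG, N_NEG = 0, 38299    # count(Enthiran, -), N-

# Without smoothing, the unseen word gets probability zero in the negative class.
p_neg_mle = mle_likelihood(COUNT_NEG, N_NEG)

# With Add-k smoothing, both classes get a non-zero estimate.
p_pos = add_k_likelihood(COUNT_POS, N_POS, VOCAB_SIZE, K)
p_neg = add_k_likelihood(COUNT_NEG, N_NEG, VOCAB_SIZE, K)

print(f"Unsmoothed P(Enthiran | -) = {p_neg_mle}")
print(f"Add-k P(Enthiran | +) = {p_pos:.3e}")
print(f"Add-k P(Enthiran | -) = {p_neg:.3e}")
```

Because 𝑘 is small, the smoothed estimate for the positive class stays close to the unsmoothed 25 / 40430, while the negative class moves from exactly zero to a very small positive value.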
