Machine Learning MCQ: Calculate the entropy for a decision tree given the dataset
1. Table 1 below gives the dataset used to learn a decision tree. Find the approximate entropy H(Passed). The decision tree predicts whether students pass (T for true, F for false) based on their past CGPA (H for high, M for medium, L for low) and whether they prepared (T or F).
Table 1
CGPA   Prepared   Passed
H      F          T
H      T          T
M      F          F
M      T          T
L      F          F
L      T          T
a) 0.92
b) 0.66
c) 1.92
d) 1.32
Answer: (a) 0.92
How to calculate entropy?
The formula for calculating the entropy of a variable X is as follows:
H(X) = -Σ_{i} p_{i} log_{2} p_{i}
Here, p_{i} is the probability that a randomly selected observation belongs to class i. In our example, we need the entropy of 'Passed', which has only two classes, F and T. In the dataset, 2 of the 6 rows have Passed = F and 4 of the 6 have Passed = T.
H(Passed) = -(p_{(Passed = F)} log_{2} p_{(Passed = F)} + p_{(Passed = T)} log_{2} p_{(Passed = T)})
= -(2/6 log_{2}(2/6) + 4/6 log_{2}(4/6))
= -(1/3 log_{2}(1/3) + 2/3 log_{2}(2/3))
= -(-0.5283 - 0.3900)
= 0.9183 ≈ 0.92
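The hand calculation can be checked with a few lines of Python. This is only a minimal sketch: the helper name `entropy` and the list `passed` are chosen here for illustration, and the list simply copies the 'Passed' values from the dataset (four T, two F).

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over the classes that actually occur in the data.
    return -sum((n / total) * log2(n / total) for n in counts.values())

# 'Passed' column of the dataset, in row order (assumed copy of Table 1).
passed = ["T", "T", "F", "T", "F", "T"]

h_passed = entropy(passed)
print(round(h_passed, 2))  # prints 0.92
```

Using base-2 logarithms gives the result in bits, which matches option (a).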
What is entropy?
Entropy is an information-theory metric that measures the impurity or uncertainty in a group of observations.
How does entropy help in a decision tree?
Entropy guides how a decision tree chooses its splits: at each node, the tree prefers the split that reduces impurity the most, so that the leaf (end-outcome) nodes are as pure as possible. In other words, the objective is to decrease the impurity (i.e., uncertainty or surprise) of the target column or, equivalently, to increase its homogeneity at every split of the given data.
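To make the idea of reducing impurity at a split concrete, the sketch below compares the two candidate attributes of the example dataset. It is an illustration under stated assumptions, not a full tree learner: the helper names `entropy` and `split_entropy` are hypothetical, and the three lists copy the CGPA, Prepared, and Passed columns row by row. For each attribute it computes the size-weighted average entropy of 'Passed' within the groups the split creates.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def split_entropy(feature, target):
    """Weighted average entropy of target after grouping rows by feature value."""
    groups = defaultdict(list)
    for value, label in zip(feature, target):
        groups[value].append(label)
    total = len(target)
    # Each group contributes its entropy, weighted by its share of the rows.
    return sum(len(g) / total * entropy(g) for g in groups.values())

# Columns of the example dataset, in row order (assumed copy of Table 1).
cgpa     = ["H", "H", "M", "M", "L", "L"]
prepared = ["F", "T", "F", "T", "F", "T"]
passed   = ["T", "T", "F", "T", "F", "T"]

base = entropy(passed)
for name, column in [("CGPA", cgpa), ("Prepared", prepared)]:
    after = split_entropy(column, passed)
    print(f"{name}: entropy after split = {after:.3f}, reduction = {base - after:.3f}")
```

On this dataset, splitting on Prepared leaves less entropy (about 0.459) than splitting on CGPA (about 0.667), so an entropy-based tree would prefer Prepared for the first split.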
