Saturday, October 18, 2025

10 Hot Decision Tree MCQs: Gain Ratio, Continuous Attributes & Tie-Breaking


1. The root node in a decision tree is selected based on:

A) Minimum entropy
B) Maximum information gain
C) Minimum Gini
D) Random initialization

Answer: B

Explanation: The root node is the first split in the tree, and its goal is to reduce uncertainty in the dataset as much as possible. Decision tree algorithms (such as ID3 and C4.5) calculate the information gain of every attribute, and the attribute with the highest information gain is chosen as the root because it splits the data best, producing the purest child nodes.
In short, the root is the attribute that gives the largest reduction in entropy, i.e., the highest information gain.
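To make this concrete, here is a minimal Python sketch (not taken from any particular library) that computes the information gain of each attribute on a tiny made-up dataset and picks the attribute with the highest gain as the root. The attribute values and labels are purely illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Parent entropy minus the weighted entropy of the children after splitting."""
    parent = entropy(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return parent - weighted

# Hypothetical toy data: two attributes (Outlook, Windy) and a Play label.
rows   = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes")]
labels = ["no", "no", "yes", "yes"]

gains = {i: information_gain(rows, labels, i) for i in range(2)}
print(gains)                                         # {0: 1.0, 1: 0.0}
print("root attribute:", max(gains, key=gains.get))  # Outlook (index 0)
```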



2. If every sample in a dataset has identical attribute values but the class labels are mixed, the information gain of any attribute will be:

A) 0
B) 1
C) Undefined
D) Negative

Answer: A

Explanation: If all samples have the same attribute values, splitting on any attribute does not reduce uncertainty. Child nodes after the split are exactly the same as the parent in terms of class distribution. Therefore, the weighted entropy of children = entropy of parent. So, the information gain = 0.
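A quick standalone sanity check of this, with made-up labels: when the attribute value is identical for every sample, all samples land in a single child, so the gain works out to zero.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

# Mixed labels, but every sample shares the same attribute value,
# so the "split" produces one child identical to the parent.
labels = ["yes", "yes", "no", "no"]
parent = entropy(labels)                                  # 1.0
weighted_children = (len(labels) / len(labels)) * entropy(labels)
print("information gain:", parent - weighted_children)    # 0.0
```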



3. In a two-class problem, Gini Index = 0.5 represents:

A) Maximum impurity
B) Pure split
C) Perfect classification
D) Minimum impurity

Answer: A

Explanation: Gini = 0 → node is pure (all samples belong to one class). Gini = 0.5 → node is maximally impure in a two-class problem (a 50%-50% split). A Gini Index of 0.5 means the node is completely mixed, with an equal number of samples from both classes.
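A minimal sketch of the standard Gini formula, 1 − Σ pᵢ², showing both extremes for a two-class node:

```python
def gini(probabilities):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1 - sum(p * p for p in probabilities)

print(gini([0.5, 0.5]))   # 0.5 -> maximally impure two-class node (50%-50%)
print(gini([1.0, 0.0]))   # 0.0 -> pure node, all samples in one class
```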



4. A pruned decision tree generally has:

A) Higher accuracy on training data but lower on test data
B) Lower training accuracy but better generalization
C) Equal accuracy everywhere
D) Random performance

Answer: B

Explanation: Pruning sacrifices some training accuracy to avoid overfitting. It simplifies the tree, making it slightly worse on the training data but much better on new/unseen data.

Option A: NO - this describes an overfitted tree, not a pruned one.
Option C: NO - equal accuracy everywhere is rare in practice.
Option D: NO - pruning is systematic, not random.
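As an illustration (assuming scikit-learn is installed), the sketch below compares an unpruned tree with one grown using cost-complexity pruning via ccp_alpha. Exact scores vary by dataset and split, but the pruned tree typically gives up a little training accuracy while holding up as well or better on the test set.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

full   = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_tr, y_tr)

# The full tree usually scores close to 100% on training data; the pruned tree
# is slightly lower on training but tends to generalize better.
print("full   train/test:", full.score(X_tr, y_tr), full.score(X_te, y_te))
print("pruned train/test:", pruned.score(X_tr, y_tr), pruned.score(X_te, y_te))
```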



5. In manual decision tree construction, if an attribute gives 0 information gain, what should you do?

A) Still choose it
B) Split on it partially
C) Skip it for splitting 
D) Replace missing values

Answer: C

Explanation: If an attribute gives 0 information gain, it cannot help separate classes, so you ignore it and choose a better attribute for splitting.



6. In a decision tree, if a node contains only one sample, what is its entropy?

A) 0
B) 0.5
C) 1
D) Cannot be calculated

Answer: A

Explanation: A single sample belongs to a single class → node is perfectly pure → entropy = 0.
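A one-line check, using the same entropy helper as the earlier sketch (repeated here so the snippet runs on its own):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

print(entropy(["yes"]))   # 0.0 -> a single sample is a perfectly pure node
```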



7. Which of the following splitting criteria can be used for multi-class problems, not just binary classification?

A) Gini Index
B) Entropy / Information Gain
C) Gain Ratio
D) All of the above

Answer: D

Explanation: All these measures can handle more than two classes; they just compute probabilities for each class.
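For example, a three-class node with equal class proportions still has well-defined entropy and Gini values (a small sketch with made-up proportions):

```python
import math

def entropy(probs):
    return sum(-p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    return 1 - sum(p * p for p in probs)

probs = [1/3, 1/3, 1/3]          # three classes, equally represented
print(entropy(probs))             # log2(3) ≈ 1.585
print(gini(probs))                # 1 - 3*(1/9) ≈ 0.667
```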



8. Which of the following is most likely to cause overfitting in a decision tree?

A) Shallow tree
B) Large minimum samples per leaf
C) Very deep tree with small leaves
D) Using pruning

Answer: C

Explanation: Deep trees with tiny leaves memorize training data → overfit → poor generalization. 
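A hedged sketch (again assuming scikit-learn) that contrasts an unconstrained tree with one limited by max_depth and min_samples_leaf; the exact numbers depend on the data, but the deep tree usually shows a larger gap between training and test accuracy.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep        = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
constrained = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10,
                                     random_state=0).fit(X_tr, y_tr)

for name, tree in [("deep", deep), ("constrained", constrained)]:
    print(name, "depth:", tree.get_depth(), "leaves:", tree.get_n_leaves(),
          "train:", round(tree.score(X_tr, y_tr), 3),
          "test:",  round(tree.score(X_te, y_te), 3))
```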



9. In manual construction of a decision tree, what is the first step?

A) Calculate child node entropy
B) Select root attribute based on information gain
C) Split dataset randomly
D) Prune unnecessary branches

Answer: B

Explanation: The root is chosen to maximize information gain, which reduces the initial uncertainty the most.



10. If a node’s children after a split all have entropy = 0.3 and the parent has entropy = 0.3, what does it indicate?

A) Maximum information gain
B) Node is pure
C) Overfitting
D) No information gain

Answer: D

Explanation: Information gain = Parent entropy − Weighted child entropy = 0 → the split did not improve purity.
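The arithmetic, assuming (for illustration) two equal-sized children:

```python
parent_entropy = 0.3
weighted_child_entropy = 0.5 * 0.3 + 0.5 * 0.3   # both children have entropy 0.3
print(parent_entropy - weighted_child_entropy)    # 0.0 -> no information gain
```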




 
