✔ Scroll down and test yourself — answers are hidden under the “View Answer” button.
Top 10 Decision Tree MCQs for Manual Construction & Entropy Basics
1. You have a dataset with attributes Weather = {Sunny, Rainy}, Wind = {Weak, Strong}, and the target variable Play = {Yes, No}. If all samples where Weather = Rainy have Play = No, what is the information gain of splitting on “Weather”?
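To check an answer like this by hand, you compute the entropy of the node before the split and subtract the size-weighted entropy of the child nodes. Below is a minimal Python sketch; the question only fixes the Rainy branch, so the Sunny class counts are assumed purely for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(parent_labels, child_label_groups):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = len(parent_labels)
    weighted = sum(len(c) / total * entropy(c) for c in child_label_groups)
    return entropy(parent_labels) - weighted

# Every Rainy sample is 'No' (given in the question);
# the Sunny counts are hypothetical, just to make the example runnable.
sunny = ['Yes', 'Yes', 'Yes', 'No']
rainy = ['No', 'No', 'No']

print(information_gain(sunny + rainy, [sunny, rainy]))  # ~0.52 bits for these counts
```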
2. When manually constructing a decision tree, which step comes immediately before calculating information gain for each attribute?
3. When two or more attributes have the same information gain, how does the decision tree algorithm choose the next attribute to split on?
When two or more attributes tie on information gain, most algorithms (such as C4.5) fall back on a secondary measure, typically the gain ratio (information gain divided by the split information of the attribute), or some other tie-breaking rule. Implementations without such a heuristic may simply pick one of the tied attributes at random or in a fixed order, but the standard approach is to break the tie with a heuristic like the gain ratio.
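For reference, the gain ratio divides the information gain by the split information (the entropy of the attribute's own value distribution). A minimal sketch, with made-up numbers for a hypothetical tie:

```python
from math import log2

def split_information(child_sizes):
    """Entropy of the split itself: how evenly the attribute divides the samples."""
    total = sum(child_sizes)
    return -sum((n / total) * log2(n / total) for n in child_sizes if n > 0)

def gain_ratio(info_gain, child_sizes):
    """C4.5-style gain ratio: information gain normalised by split information."""
    si = split_information(child_sizes)
    return info_gain / si if si > 0 else 0.0

# Hypothetical tie: both attributes give an information gain of 0.4 on 8 samples,
# but they partition the samples differently.
print(gain_ratio(0.4, [4, 4]))  # 0.4 / 1.00 = 0.40
print(gain_ratio(0.4, [7, 1]))  # 0.4 / 0.54 ≈ 0.74 -> wins the tie-break
```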
4. You are constructing a decision tree using Gini Index. For a node with class distribution [4 Yes, 1 No], what is the Gini value?
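For a node with 4 Yes and 1 No, the proportions are 0.8 and 0.2, so Gini = 1 - 0.8^2 - 0.2^2 = 1 - 0.64 - 0.04 = 0.32. A tiny helper to verify the arithmetic (the function name is just illustrative):

```python
def gini(class_counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((n / total) ** 2 for n in class_counts)

print(gini([4, 1]))  # 1 - 0.8**2 - 0.2**2 = 0.32
```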
5. In decision tree learning, entropy = 1 means:
- If entropy = 0, the data in that node is pure: all samples belong to the same class (no confusion).
- If entropy = 1, the data is completely impure: for a two-class problem, the classes are perfectly mixed in a 50/50 split (maximum confusion).
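Both extremes are easy to verify numerically; here is a minimal binary-entropy sketch (the function name is illustrative):

```python
from math import log2

def binary_entropy(p_yes):
    """Entropy (in bits) of a two-class node, where p_yes is the fraction of 'Yes' samples."""
    if p_yes in (0.0, 1.0):
        return 0.0  # pure node: no uncertainty
    return -(p_yes * log2(p_yes) + (1 - p_yes) * log2(1 - p_yes))

print(binary_entropy(1.0))  # 0.0 -> pure node
print(binary_entropy(0.5))  # 1.0 -> perfectly mixed (50/50) node
```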
6. Which attribute will be chosen if one has high entropy but large sample size, and another has low entropy but few samples?
7. When manually calculating entropy, what happens if all samples in a node belong to the same class?
- If a node contains samples of different classes, there is some confusion, so entropy is greater than 0.
- If all samples belong to the same class, there is no confusion at all: we are 100% sure of the class.
When there is no uncertainty, entropy = 0.
8. If attribute A reduces entropy by 0.4 and B reduces entropy by 0.2, which one should be chosen?
9. Which of the following is not a stopping criterion during manual decision tree construction?
- All records belong to the same class → the node is pure → stop splitting.
- No remaining attributes → nothing left to split on → stop splitting.
- Minimum sample size reached → the node is too small to split reliably → stop splitting.
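These criteria map directly onto the guard clauses of a recursive tree builder. A minimal sketch of the checks; the function name and the min_samples threshold are illustrative, not taken from any particular library:

```python
def should_stop(labels, remaining_attributes, min_samples=5):
    """Return True if the node should become a leaf instead of being split further."""
    if len(set(labels)) == 1:      # all records belong to the same class -> pure node
        return True
    if not remaining_attributes:   # no attributes left to split on
        return True
    if len(labels) < min_samples:  # node too small to split reliably
        return True
    return False
```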
10. Suppose two candidate splits produce child nodes with class distributions [10, 10] and [5, 0] respectively. Which split is better in terms of information gain?
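Assuming [10, 10] and [5, 0] are Yes/No class counts in the resulting child nodes, their entropies already tell the story: the [5, 0] node is pure, so (all else being equal) the split that produces it yields the higher information gain. A quick check:

```python
from math import log2

def entropy_from_counts(counts):
    """Entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)

print(entropy_from_counts([10, 10]))  # 1.0 -> maximally impure
print(entropy_from_counts([5, 0]))    # 0.0 -> pure node
```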