✔ Scroll down and test yourself — answers are hidden under the “View Answer” button.
Top 10 Decision Tree MCQs for Manual Construction & Entropy Basics
1. You have a dataset with attributes Weather = {Sunny, Rainy}, Wind = {Weak, Strong}, and the target variable Play = {Yes, No}. If all samples where Weather = Rainy have Play = No, what is the information gain of splitting on “Weather”?
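To check an answer like this by hand, you compute the entropy of the node before the split and subtract the size-weighted entropy of the child nodes. Below is a minimal Python sketch; the question only fixes the Rainy branch, so the Sunny class counts are assumed purely for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(parent_labels, child_label_groups):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = len(parent_labels)
    weighted = sum(len(c) / total * entropy(c) for c in child_label_groups)
    return entropy(parent_labels) - weighted

# Every Rainy sample is 'No' (given in the question);
# the Sunny counts are hypothetical, just to make the example runnable.
sunny = ['Yes', 'Yes', 'Yes', 'No']
rainy = ['No', 'No', 'No']

print(information_gain(sunny + rainy, [sunny, rainy]))  # ~0.52 bits for these counts
```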
2. When manually constructing a decision tree, which step comes immediately before calculating information gain for each attribute?
3. When two or more attributes have the same information gain, how does the decision tree algorithm choose the next attribute to split on?
When two or more attributes tie on information gain, most algorithms (such as C4.5) fall back on a secondary measure, typically the gain ratio (information gain divided by the split information of the attribute), or some other tie-breaking rule. Implementations without such a heuristic may simply pick one of the tied attributes at random or in a fixed order, but the standard approach is to break the tie with a heuristic like the gain ratio.
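For reference, the gain ratio divides the information gain by the split information (the entropy of the attribute's own value distribution). A minimal sketch, with made-up numbers for a hypothetical tie:

```python
from math import log2

def split_information(child_sizes):
    """Entropy of the split itself: how evenly the attribute divides the samples."""
    total = sum(child_sizes)
    return -sum((n / total) * log2(n / total) for n in child_sizes if n > 0)

def gain_ratio(info_gain, child_sizes):
    """C4.5-style gain ratio: information gain normalised by split information."""
    si = split_information(child_sizes)
    return info_gain / si if si > 0 else 0.0

# Hypothetical tie: both attributes give an information gain of 0.4 on 8 samples,
# but they partition the samples differently.
print(gain_ratio(0.4, [4, 4]))  # 0.4 / 1.00 = 0.40
print(gain_ratio(0.4, [7, 1]))  # 0.4 / 0.54 ≈ 0.74 -> wins the tie-break
```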
4. You are constructing a decision tree using Gini Index. For a node with class distribution [4 Yes, 1 No], what is the Gini value?
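For a node with 4 Yes and 1 No, the proportions are 0.8 and 0.2, so Gini = 1 - 0.8^2 - 0.2^2 = 1 - 0.64 - 0.04 = 0.32. A tiny helper to verify the arithmetic (the function name is just illustrative):

```python
def gini(class_counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((n / total) ** 2 for n in class_counts)

print(gini([4, 1]))  # 1 - 0.8**2 - 0.2**2 = 0.32
```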
5. In decision tree learning, entropy = 1 means:
- If entropy = 0, the data in that node is pure: all samples belong to the same class (no confusion).
- If entropy = 1, the data is completely impure: for a two-class problem, the classes are perfectly mixed in a 50/50 split (maximum confusion).
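Both extremes are easy to verify numerically; here is a minimal binary-entropy sketch (the function name is illustrative):

```python
from math import log2

def binary_entropy(p_yes):
    """Entropy (in bits) of a two-class node, where p_yes is the fraction of 'Yes' samples."""
    if p_yes in (0.0, 1.0):
        return 0.0  # pure node: no uncertainty
    return -(p_yes * log2(p_yes) + (1 - p_yes) * log2(1 - p_yes))

print(binary_entropy(1.0))  # 0.0 -> pure node
print(binary_entropy(0.5))  # 1.0 -> perfectly mixed (50/50) node
```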
6. Which attribute will be chosen if one has high entropy but large sample size, and another has low entropy but few samples?
7. When manually calculating entropy, what happens if all samples in a node belong to the same class?
- If a node contains samples of different classes, there is some confusion, so entropy is greater than 0.
- If all samples belong to the same class, there is no confusion at all: we are 100% sure of the class.
When there is no uncertainty, entropy = 0.
8. If attribute A reduces entropy by 0.4 and B reduces entropy by 0.2, which one should be chosen?
9. Which of the following is not a stopping criterion during manual decision tree construction?
- All records belong to the same class → the node is pure → stop splitting.
- No remaining attributes → nothing left to split on → stop splitting.
- Minimum sample size reached → the node is too small to split reliably → stop splitting.
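These criteria map directly onto the guard clauses of a recursive tree builder. A minimal sketch of the checks; the function name and the min_samples threshold are illustrative, not taken from any particular library:

```python
def should_stop(labels, remaining_attributes, min_samples=5):
    """Return True if the node should become a leaf instead of being split further."""
    if len(set(labels)) == 1:      # all records belong to the same class -> pure node
        return True
    if not remaining_attributes:   # no attributes left to split on
        return True
    if len(labels) < min_samples:  # node too small to split reliably
        return True
    return False
```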
10. Suppose two candidate splits produce child nodes with class distributions [10, 10] and [5, 0] respectively. Which split is better in terms of information gain?
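Assuming [10, 10] and [5, 0] are Yes/No class counts in the resulting child nodes, their entropies already tell the story: the [5, 0] node is pure, so (all else being equal) the split that produces it yields the higher information gain. A quick check:

```python
from math import log2

def entropy_from_counts(counts):
    """Entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)

print(entropy_from_counts([10, 10]))  # 1.0 -> maximally impure
print(entropy_from_counts([5, 0]))    # 0.0 -> pure node
```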