Week 7 solution
1. Which of the following measures is NOT used for attribute selection in decision trees?
A) Entropy
B) Information Gain
C) Chi-Square
D) K-Nearest Neighbors (KNN)
Answer: D) K-Nearest Neighbors (KNN)
Explanation:
Decision trees use measures like Entropy, Information Gain, and Chi-Square to determine the best
split at each node. KNN is a classification algorithm and is not used for attribute selection in decision
trees.
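As a quick illustration (not part of the original solution), the sketch below shows how the entropy measure used for attribute selection can be computed; the class counts are made up for the example:
```python
# A minimal sketch of the entropy measure decision trees use to score
# candidate splits. The class counts below are hypothetical.
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

print(entropy([5, 5]))   # 1.0 -- a maximally impure node
print(entropy([10, 0]))  # 0.0 -- a pure node (Python may print -0.0)
```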
2. What is the key difference between a binary split and a multiway split in decision trees?
A) Binary splits divide the data into two groups, while multiway splits create multiple child nodes.
B) Multiway splits are used only for numerical attributes, whereas binary splits are for categorical
attributes.
C) Binary splits use entropy, while multiway splits use Gini index.
D) Multiway splits always result in better accuracy than binary splits.
Answer: A) Binary splits divide the data into two groups, while multiway splits create multiple child
nodes.
Explanation:
Binary splits create two branches from a node, dividing the data into two groups.
Multiway splits allow multiple branches, creating more than two child nodes.
Both binary and multiway splits can be used for numerical or categorical attributes,
depending on the decision tree implementation.
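As an illustration of the difference (with made-up data, not from the original solution), the sketch below splits the same categorical attribute both ways:
```python
# Hypothetical rows of (attribute value, class label).
rows = [("red", 1), ("blue", 0), ("green", 1), ("red", 0), ("blue", 0)]

# Multiway split: one child node per attribute value.
multiway = {}
for color, label in rows:
    multiway.setdefault(color, []).append(label)
print(multiway)  # {'red': [1, 0], 'blue': [0, 0], 'green': [1]}

# Binary split: the same values grouped into two sets (CART-style).
left = [label for color, label in rows if color == "red"]
right = [label for color, label in rows if color != "red"]
print(left, right)  # [1, 0] [0, 1, 0]
```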
3. In decision tree pruning, which technique removes unnecessary nodes AFTER the tree has
been fully grown?
A) Pre-Pruning
B) Post-Pruning
C) Overfitting Pruning
D) Random Forest
Answer: B) Post-Pruning
Explanation:
Post-pruning (for example, Reduced Error Pruning) removes nodes after the full tree has
been built.
It evaluates subtrees and removes branches that do not significantly improve accuracy,
helping to reduce overfitting.
Pre-pruning, in contrast, stops tree growth early based on conditions like minimum samples
per split.
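Reduced Error Pruning itself is not built into scikit-learn, but cost-complexity pruning (the ccp_alpha parameter) follows the same grow-then-prune idea; the sketch below is one possible illustration, not part of the original solution (the dataset and alpha value are arbitrary choices):
```python
# Post-pruning sketch: grow a full tree, then compare it with a tree
# pruned via cost-complexity pruning (ccp_alpha).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X_tr, y_tr)

print(full.tree_.node_count, pruned.tree_.node_count)    # pruned tree is smaller
print(full.score(X_te, y_te), pruned.score(X_te, y_te))  # compare test accuracy
```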
4. How is the Chi-Square test used for decision tree splitting?
A) It calculates entropy to determine the best split.
B) It measures the statistical significance of differences between parent and child nodes.
C) It ensures all splits are binary.
D) It helps reduce the number of categorical features.
Answer: B) It measures the statistical significance of differences between parent and child nodes.
Explanation:
The Chi-Square test measures whether a split significantly improves classification by checking
differences in observed vs. expected frequencies of target variables. A higher Chi-Square value
means a better split.
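As a sketch of the same idea (hypothetical counts, not part of the original solution), SciPy's chi-square test can be applied to the contingency table of class counts in the child nodes:
```python
from scipy.stats import chi2_contingency

# Rows = child nodes after a candidate split, columns = class counts.
observed = [[30, 10],   # left child: mostly class A
            [5, 25]]    # right child: mostly class B

chi2, p_value, dof, expected = chi2_contingency(observed)
print(chi2, p_value)  # a large chi2 / small p-value suggests a significant split
```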
5. What is the main disadvantage of decision trees compared to other machine learning
algorithms?
A) Decision trees are difficult to interpret.
B) They always underfit the data.
C) They are prone to overfitting, especially with deep trees.
D) They require extensive data cleaning.
Answer: C) They are prone to overfitting, especially with deep trees.
Explanation:
Decision trees tend to overfit when they become too complex, learning noise in the training
data.
This issue can be addressed using pruning or ensemble methods like Random Forest to
improve generalization.
MCQs (Decision Tree)
1. Given entropy of parent = 1, weights of children = (3/4, 1/4), and entropies of children =
(0.9, 0). What is the information gain?
a) 0.675
b) 0.75
c) 0.325
d) 0.1
Ans: c)
Explanation: Information Gain = Entropy(Parent) − ∑ (weight × Entropy(Child))
= 1 − (3/4 × 0.9 + 1/4 × 0) = 1 − 0.675 = 0.325.
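The arithmetic can be checked in a couple of lines of Python (an illustration, not part of the original solution):
```python
parent_entropy = 1.0
weights = (3/4, 1/4)
child_entropies = (0.9, 0.0)

info_gain = parent_entropy - sum(w * e for w, e in zip(weights, child_entropies))
print(round(info_gain, 3))  # 0.325 -> option c)
```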
2. If a dataset has three classes with probabilities 0.2, 0.3, and 0.5, what is the Gini
index?
a) 0.50
b) 0.62
c) 0.42
d) 0.38
Ans: b)
Explanation: Gini = 1 − ((0.2)² + (0.3)² + (0.5)²) = 1 − (0.04 + 0.09 + 0.25) = 1 − 0.38 = 0.62
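Again, the computation is easy to verify in Python (an illustration, not part of the original solution):
```python
probs = (0.2, 0.3, 0.5)
gini = 1 - sum(p * p for p in probs)
print(round(gini, 2))  # 0.62 -> option b)
```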