Aprendizagem 2023

Lab 2: Decision Trees

Practical exercises

I. Decision tree learning

1. Consider the following dataset:

         y1   y2   y3   class
    x1    a    a    a     +
    x2    c    b    c     +
    x3    c    a    c     +
    x4    b    a    a     −
    x5    a    b    c     −
    x6    b    b    c     −

Plot the learned decision tree using information gain (Shannon entropy). Show your calculations.

Brief notes:
y1 provides the highest gain, IG(y_out | y1) = H(y_out) − H(y_out | y1) = 1 − (2/6 · 1 + 2/6 · 0 + 2/6 · 0) = 1 − 1/3 ≈ 0.67, hence it is selected as the root.
y1 correctly classifies all observations when y1 = b (all −) and when y1 = c (all +).
For the (y1 = a)-conditional data, the class entropies given y2 and given y3 are both zero, so we can select either for the next split.
There is no remaining uncertainty.
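A minimal verification sketch (not part of the original notes) that recomputes the gains above in Python:

# Sketch: Shannon entropy and information gain for the dataset in exercise 1.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "y1":    list("accbab"),
    "y2":    list("abaabb"),
    "y3":    list("acccac"),
    "class": list("+++---"),
})

def entropy(s):
    p = s.value_counts(normalize=True)
    return float(-(p * np.log2(p)).sum())

def info_gain(data, attr, target="class"):
    cond = sum(len(g) / len(data) * entropy(g[target])
               for _, g in data.groupby(attr))
    return entropy(data[target]) - cond

for attr in ["y1", "y2", "y3"]:
    print(attr, round(info_gain(df, attr), 3))   # y1: 0.667, y2: 0.082, y3: 0.0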

2. Show whether a decision tree can learn the following logical functions and, if so, plot the
corresponding decision boundaries (a quick learnability check follows the list):
a) AND
b) OR
c) XOR
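Brief note (a sketch, not part of the original notes): AND and OR are linearly separable and a tree captures each with at most two axis-aligned splits; XOR is not linearly separable, but a depth-2 tree still captures it by splitting on one variable and then on the other in each branch.

# Sketch: fit depth-2 trees to AND, OR and XOR on {0,1}^2.
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
targets = {"AND": [0, 0, 0, 1], "OR": [0, 1, 1, 1], "XOR": [0, 1, 1, 0]}

for name, y in targets.items():
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(name, tree.score(X, y))   # all three reach training accuracy 1.0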
3. Consider the following testing targets, 𝑧, and the corresponding predictions, 𝑧̂, by a decision tree:

𝑧 = [𝐴 𝐴 𝐴 𝐵 𝐵 𝐵 𝐶 𝐶 𝐶 𝐶]
𝑧̂ = [𝐵 𝐵 𝐴 𝐶 𝐵 𝐴 𝐶 𝐴 𝐵 𝐶]

a) Draw the confusion matrix


                    true
                 A    B    C
predicted   A    1    1    1
            B    2    1    1
            C    0    1    2
b) Compute the accuracy and sensitivity/recall per class
accuracy = 4/10 = 0.4, sensitivity_A = 1/3, sensitivity_B = 1/3, sensitivity_C = 2/4 = 1/2

c) Considering class C, identify precision and 𝐹1 -measure


precision_C = 2/3, F1_C = 2 · (2/3 · 1/2) / (2/3 + 1/2) = 4/7 ≈ 0.57
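A minimal check of a)-c) with sklearn.metrics (a sketch, not part of the original notes; note that sklearn places the true labels on the rows, i.e., the transpose of the matrix drawn above):

# Sketch: verify 3a)-3c) numerically.
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             recall_score, precision_score, f1_score)

z     = list("AAABBBCCCC")   # true labels
z_hat = list("BBACBACABC")   # predicted labels
labels = ["A", "B", "C"]

print(confusion_matrix(z, z_hat, labels=labels))               # rows = true, columns = predicted
print(accuracy_score(z, z_hat))                                # 0.4
print(recall_score(z, z_hat, labels=labels, average=None))     # [0.333, 0.333, 0.5]
print(precision_score(z, z_hat, labels=labels, average=None))  # [0.333, 0.25, 0.667]
print(f1_score(z, z_hat, labels=["C"], average=None))          # [0.571]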
d) Identify the accuracy, sensitivity, and precision of the random classifier
accuracy_random = 1/3, recall_random(A) = recall_random(B) = recall_random(C) = 1/3

precision_random(A) = 0.3, precision_random(B) = 0.3, precision_random(C) = 0.4

A uniform random classifier picks each of the three classes with probability 1/3, so its expected accuracy and per-class recall are 1/3, while its expected precision for each class equals that class's prevalence in the test set (3/10, 3/10 and 4/10).

4. Consider a dataset composed of 374 records, described by 6 variables, and classified according to the decision tree below. Each leaf in the tree shows the label, the number of classified records with that label, and the total number of observations in the leaf. The positive class is the minority class.

a) Compute the confusion matrix.

#A = 66 + 19 + 14 + 15 + 18 = 132
#B = 0 + 0 + 2 + 4 + 236 = 242
The minority class is A, hence it is treated as the positive class.

                         true
                    P (A)    N (B)
predicted   P (A)    114       6
            N (B)     18     236
b) Compare the accuracy of the given tree versus a pruned tree with only two nodes.
Is there any evidence towards overfitting?

Considering the training accuracy: accuracy_depth=4 = (114 + 236)/374 ≈ 0.936, accuracy_depth=2 = 0.874

Without the testing accuracy, there is insufficient evidence to conclude that the tree is overfitting the input data.

c) [optional] Are decision trees learned from high-dimensional data susceptible to underfitting?
Why does an ensemble of decision trees minimize this problem?

Assuming a limited depth, relevant information may be discarded because the tree focuses on a compact subset of the input variables. In ensemble models, such as random forests, different decision trees are learned from data subsamples and feature subspaces, leading to decisions that consider a broader set of input variables.
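As an illustration (a sketch, not part of the original notes), the sklearn parameters that control these two randomization mechanisms:

# Sketch: the two sources of diversity in a random forest.
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=100,     # many trees, each casting one vote
    bootstrap=True,       # each tree is trained on a bootstrap subsample of the data
    max_features="sqrt",  # each split considers a random subset of the input variables
    random_state=0,
)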

Programming quests
5. Following the provided Jupyter notebook on Classification, learn and evaluate a decision tree
classifier on the breast.w.arff dataset (available at the webpage) using sklearn.

Considering an 80-20 train-test split:


a) visualize the decision tree learned from the training observations with default parameters
b) compare the train and test accuracy of decision trees with a varying maximum depth
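A minimal sketch of one possible solution, assuming the file breast.w.arff sits next to the notebook and that its class attribute is named "Class" (adapt both if they differ):

# Sketch: decision tree on breast.w.arff with an 80-20 train-test split.
import pandas as pd
import matplotlib.pyplot as plt
from scipy.io import arff
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree

data, _ = arff.loadarff("breast.w.arff")             # assumed file location
df = pd.DataFrame(data)
df["Class"] = df["Class"].str.decode("utf-8")        # assumed target column name
df = df.dropna()                                     # discard records with missing values

X, y = df.drop(columns=["Class"]), df["Class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# a) visualize the tree learned with default parameters
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
plt.figure(figsize=(14, 8))
plot_tree(clf, feature_names=list(X.columns), class_names=list(clf.classes_), filled=True)
plt.show()

# b) train vs test accuracy for a varying maximum depth
for depth in range(1, 11):
    m = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(depth, m.score(X_train, y_train), m.score(X_test, y_test))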
