P02 DecisionTrees SolutionNotes

Uploaded by

Emília Morgado Santos

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views

P02 DecisionTrees SolutionNotes

Uploaded by

Emília Morgado Santos

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 3

Aprendizagem 2023

Lab 2: Decision Trees

Practical exercises

I. Decision tree learning

1. Consider the following dataset: y1 y2 y3 class

𝐱1 a a a +
𝐱2 c b c +
𝐱3 c a c +
𝐱4 b a a ─
𝐱5 a b c ─
𝐱6 b b c ─

Plot the learned decision tree using information gain (Shannon entropy). Show your calculus.

Brief notes:
y1 provides the highest gain, 𝐼𝐺(𝑦𝑜𝑢𝑡 |𝑦1 ) = 1 − 0.33, hence selected.
y1 correctly classifies all observations when 𝑦1 = 𝑏 and 𝑦1 = 𝑐
Entropies of 𝑦2 and 𝑦3 for (𝑦1 = 𝑎)-conditional data are both zero, so
we can select either.
There is no more uncertainty.

2. Show if a decision tree can learn the following logical functions and, if so, plot the corresponding
decision boundaries.
a) AND
b) OR
c) XOR
3. Consider the following testing targets, 𝑧, and the corresponding predictions, 𝑧̂, by a decision tree:

𝑧 = [𝐴 𝐴 𝐴 𝐵 𝐵 𝐵 𝐶 𝐶 𝐶 𝐶]
𝑧̂ = [𝐵 𝐵 𝐴 𝐶 𝐵 𝐴 𝐶 𝐴 𝐵 𝐶]

a) Draw the confusion matrix

true
A B C
A 1 1 1
predicted

B 2 1 1
C 0 1 2
b) Compute the accuracy and sensitivity/recall per class
1 1 1
𝑎𝑐𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = 0.4, 𝑠𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦𝐴 = , 𝑠𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦𝐵 = , 𝑠𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦𝐶 =
3 3 2

c) Considering class C, identify precision and 𝐹1 -measure

2
𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛𝐶 = , 𝐹1𝐶 = 0.57
3
d) Identify the accuracy, sensitivity, and precision of the random classifier
𝑎𝑐𝑐𝑢𝑟𝑎𝑐𝑦𝑟𝑎𝑛𝑑𝑜𝑚 = 0. (3), 𝑟𝑒𝑐𝑎𝑙𝑙𝑟𝑎𝑛𝑑𝑜𝑚 (𝐴) = 0. (3), 𝑟𝑒𝑐𝑎𝑙𝑙𝑟𝑎𝑛𝑑𝑜𝑚 (𝐵) = 0. (3), 𝑟𝑒𝑐𝑎𝑙𝑙𝑟𝑎𝑛𝑑𝑜𝑚 (𝐶) = 0. (3)

𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛𝑟𝑎𝑛𝑑𝑜𝑚 (𝐴) = 0.3, 𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛𝑟𝑎𝑛𝑑𝑜𝑚 (𝐵) = 0.3, 𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛𝑟𝑎𝑛𝑑𝑜𝑚 (𝐶) = 0.4

4. Consider a dataset composed by 374 records,

described by 6 variables, and classified according
to the decision tree below. Each leaf in the tree
shows the label, number of classified records with
the label, and total number of observations in the
leaf. The positive class is the minority class.

a) Compute the confusion matrix.

#𝐴 = 66 + 19 + 14 + 15 + 18 = 132
#𝐵 = 0 + 0 + 2 + 4 + 236 = 242
The minority class is A, hence is seen as positive.

True
P (A) N (B)
P (A) 114 6
Predicted
N (B) 18 236
b) Compare the accuracy of the given tree versus a pruned tree with only two nodes.
Is there any evidence towards overfitting?

Considering training accuracy: a𝑐𝑐𝑢𝑟𝑎𝑐𝑦depth=4 = 0.936, a𝑐𝑐𝑢𝑟𝑎𝑐𝑦depth=2 = 0.874

Without the testing accuracy, there is no sufficient evidence to assume the tree is prone to overfit input data.

c) [optional] Are decision trees learned from high-dimensional data susceptible to underfitting?
Why an ensemble of decision trees minimizes this problem?

Assuming a limited depth, relevant data may be discarded due to a focus on a compact subset of overall
input variables. In ensemble models, such as random forests, different decision trees can be learned from
data subsamples and subspaces, leading to decisions that consider a broader set of input variables.

Programming quests
5. Following the provided Jupyter notebook on Classification, learn and evaluate a decision tree
classifier on the breast.w.arff dataset (available at the webpage) using sklearn.

Considering a 80-20 train-test split:

a) visualize the decision tree learned from the training observations with default parameters
b) compare the train and test accuracy of decision trees with a varying maximum depth

Montessori Materials PDF
100% (4)
Montessori Materials PDF
7 pages
Module 2 Grammatical & Syntactic Awareness
No ratings yet
Module 2 Grammatical & Syntactic Awareness
63 pages
Harry Potter Symphony Suite
0% (5)
Harry Potter Symphony Suite
14 pages
Statistics: Data Gathering and Organizing
100% (1)
Statistics: Data Gathering and Organizing
22 pages
P02 DecisionTrees
No ratings yet
P02 DecisionTrees
2 pages
Data Mining Assignment No. 1
No ratings yet
Data Mining Assignment No. 1
7 pages
EBUS537 Theme4 Week 5
No ratings yet
EBUS537 Theme4 Week 5
26 pages
M01 Tree-Based Methods
No ratings yet
M01 Tree-Based Methods
38 pages
Lecture 7.2 - DTC Algorithm Implementation
No ratings yet
Lecture 7.2 - DTC Algorithm Implementation
7 pages
Decision Trees
No ratings yet
Decision Trees
53 pages
Classification Error: Training Errors Generalization Errors
No ratings yet
Classification Error: Training Errors Generalization Errors
39 pages
DMDM Part 2
No ratings yet
DMDM Part 2
94 pages
Practical 7 Classification Revision Questions
No ratings yet
Practical 7 Classification Revision Questions
8 pages
ID3
No ratings yet
ID3
5 pages
MIS410-Chapter6
No ratings yet
MIS410-Chapter6
47 pages
Homework3
No ratings yet
Homework3
10 pages
M6 - Model Overfitting
No ratings yet
M6 - Model Overfitting
30 pages
Lec 16,17
No ratings yet
Lec 16,17
90 pages
ML Assignment
No ratings yet
ML Assignment
7 pages
Decision Trees in Sklearn Decision Trees in Sklearn
No ratings yet
Decision Trees in Sklearn Decision Trees in Sklearn
7 pages
Exp4 - Supervised Learning
No ratings yet
Exp4 - Supervised Learning
10 pages
Classification Problems
No ratings yet
Classification Problems
53 pages
UCS622
No ratings yet
UCS622
1 page
8 To 12 Jaimeen
No ratings yet
8 To 12 Jaimeen
34 pages
Data Mining Journal 4 Kashan
No ratings yet
Data Mining Journal 4 Kashan
8 pages
LAB4
No ratings yet
LAB4
5 pages
CONTENTS
No ratings yet
CONTENTS
7 pages
ML New record (5)
No ratings yet
ML New record (5)
51 pages
601 sp09 Midterm Solutions
No ratings yet
601 sp09 Midterm Solutions
14 pages
Soft Computing Lab Practical Assignment 2
No ratings yet
Soft Computing Lab Practical Assignment 2
10 pages
Lecture 07 On Decision Trees
No ratings yet
Lecture 07 On Decision Trees
36 pages
Mlsp Lab Exp4
No ratings yet
Mlsp Lab Exp4
9 pages
Department of Electronics & Telecommunications Engineering: ETEL71A-Machine Learning and AI
No ratings yet
Department of Electronics & Telecommunications Engineering: ETEL71A-Machine Learning and AI
4 pages
Week 7 Laboratory Activity
No ratings yet
Week 7 Laboratory Activity
12 pages
ML LAB 146
No ratings yet
ML LAB 146
50 pages
22.InfoTheory-DecisionTrees-short
No ratings yet
22.InfoTheory-DecisionTrees-short
25 pages
Machine Learning CA 2
No ratings yet
Machine Learning CA 2
19 pages
Lecture 6 - Decision Trees
No ratings yet
Lecture 6 - Decision Trees
43 pages
2324-CS420-21CTT1-IA05
No ratings yet
2324-CS420-21CTT1-IA05
4 pages
Midterm 2002
No ratings yet
Midterm 2002
10 pages
AIH_Lab2
No ratings yet
AIH_Lab2
10 pages
practical 15 python
No ratings yet
practical 15 python
6 pages
5b Python Implementation of Decision Tree
No ratings yet
5b Python Implementation of Decision Tree
7 pages
Unit 1
No ratings yet
Unit 1
12 pages
Stochastic and Overfitting
No ratings yet
Stochastic and Overfitting
19 pages
Machine Learning: Decision Trees: CS540 Jerry Zhu University of Wisconsin-Madison
No ratings yet
Machine Learning: Decision Trees: CS540 Jerry Zhu University of Wisconsin-Madison
49 pages
Chapter18 2
No ratings yet
Chapter18 2
40 pages
dwm_06
No ratings yet
dwm_06
4 pages
Types of Pruning Techniques
No ratings yet
Types of Pruning Techniques
10 pages
AI & ML Notes
No ratings yet
AI & ML Notes
22 pages
Decision Tree
No ratings yet
Decision Tree
52 pages
Assign 6 Solution
No ratings yet
Assign 6 Solution
11 pages
Classification: Basic Concepts, Decision Trees, and Model Evaluation
No ratings yet
Classification: Basic Concepts, Decision Trees, and Model Evaluation
46 pages
Name: Le Ho Thao Nguyen Student ID: 20194224
No ratings yet
Name: Le Ho Thao Nguyen Student ID: 20194224
9 pages
21 Decision Trees
No ratings yet
21 Decision Trees
62 pages
2025-Lecture07-P1-ID3
No ratings yet
2025-Lecture07-P1-ID3
41 pages
Big Data Lesson 5 Lucrezia Noli
No ratings yet
Big Data Lesson 5 Lucrezia Noli
30 pages
Unit IV Da Online - PPTX 2 82
No ratings yet
Unit IV Da Online - PPTX 2 82
81 pages
DSA5102_lecture3
No ratings yet
DSA5102_lecture3
34 pages
Title: Implementation of Decision Tree Classification: Department of Computer Science and Engineering
No ratings yet
Title: Implementation of Decision Tree Classification: Department of Computer Science and Engineering
8 pages
14MachineLearningDecisionTreeRandomForest - Ipynb - Colaboratory
No ratings yet
14MachineLearningDecisionTreeRandomForest - Ipynb - Colaboratory
29 pages
MachineLearning MidTerm UMT Spring 2021
100% (1)
MachineLearning MidTerm UMT Spring 2021
12 pages
Practical No4 - 5 ML
No ratings yet
Practical No4 - 5 ML
11 pages
Calculus I Essentials
From Everand
Calculus I Essentials
Editors of REA
1/5 (1)
PROJECT GUIDELINE
No ratings yet
PROJECT GUIDELINE
5 pages
Caregiving 9: Teachnology and Livelihood Education Learning Module 9
100% (2)
Caregiving 9: Teachnology and Livelihood Education Learning Module 9
11 pages
Artificial Intelligence Career 2022 - USAII™
100% (1)
Artificial Intelligence Career 2022 - USAII™
6 pages
Resume - Ahmad Jafari Ghavamabad
No ratings yet
Resume - Ahmad Jafari Ghavamabad
6 pages
ECA2+ Tests Language Test 8B 2018
No ratings yet
ECA2+ Tests Language Test 8B 2018
4 pages
Terrapsychology Further Inquiry into Self Place and Planet 1st Edition Craig Chalquist pdf download
100% (1)
Terrapsychology Further Inquiry into Self Place and Planet 1st Edition Craig Chalquist pdf download
43 pages
March SNMP Priority Invite Round - Last Invited by Occupation
No ratings yet
March SNMP Priority Invite Round - Last Invited by Occupation
8 pages
Bursar
No ratings yet
Bursar
5 pages
Form 5A8 Curriculum Vitae (CV) For Each Proposed Professional Staff
No ratings yet
Form 5A8 Curriculum Vitae (CV) For Each Proposed Professional Staff
3 pages
Story Sack Report
No ratings yet
Story Sack Report
12 pages
Cox News Volume 8 Issue 7
No ratings yet
Cox News Volume 8 Issue 7
4 pages
Case Study - TESCO
No ratings yet
Case Study - TESCO
10 pages
Jusma Wahidah Toefl-Certificate
100% (1)
Jusma Wahidah Toefl-Certificate
2 pages
Eureka Forbes Case Study
No ratings yet
Eureka Forbes Case Study
3 pages
Theangulartutorial PDF
No ratings yet
Theangulartutorial PDF
537 pages
Gen Bio1 Module 8
100% (1)
Gen Bio1 Module 8
23 pages
21PE17 Modeling of Electrical Machines Laboratory
0% (1)
21PE17 Modeling of Electrical Machines Laboratory
3 pages
10 Qualities of A Good Teacher
No ratings yet
10 Qualities of A Good Teacher
6 pages
Wahyu Darmawan BAB II
No ratings yet
Wahyu Darmawan BAB II
16 pages
Assignment # 2 - Education For Sustainable Development
No ratings yet
Assignment # 2 - Education For Sustainable Development
3 pages
Alpadia 2025 Yaz Okulu Fiyat Listesi FWJIFC4s9jR
No ratings yet
Alpadia 2025 Yaz Okulu Fiyat Listesi FWJIFC4s9jR
6 pages
Eisenach (1685-1795) : Ambrosius Bach
No ratings yet
Eisenach (1685-1795) : Ambrosius Bach
3 pages
Isixhosa Fal p2 2023
0% (1)
Isixhosa Fal p2 2023
11 pages
X X X X: Triala National High School
50% (2)
X X X X: Triala National High School
4 pages
Product Oriented Performance Based Assessment
100% (1)
Product Oriented Performance Based Assessment
2 pages
Makalah Dan Notulen Bing Kelompok 2
No ratings yet
Makalah Dan Notulen Bing Kelompok 2
17 pages