
ITCS 6156/8156 Spring 2024

Machine Learning

Decision Trees
Instructor: Hongfei Xue
Email: [email protected]
Class Meeting: Mon & Wed, 4:00 PM – 5:15 PM, Denny 109

Some content in the slides is based on Dr. Raquel Urtasun’s lecture


Another Classification Idea
Example with Discrete Inputs

Decision Trees

Decision Tree Algorithm
Decision Boundary

Decision trees divide the feature space into axis-parallel (hyper-)rectangles.

Each rectangular region is labeled with one label (or a probability distribution over labels).
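As a concrete illustration (a minimal sketch, not from the slides; it assumes scikit-learn is available), fitting a shallow tree and printing its rules shows that every internal node tests a single feature against a threshold, which is exactly what produces axis-parallel regions:

    # Minimal sketch: each split in the printed rules compares one feature
    # to one threshold, so the induced regions are axis-parallel rectangles.
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                               n_redundant=0, random_state=0)
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["x0", "x1"]))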
Classification and Regression

Expressiveness
How to Specify Test Condition?

• Depends on attribute types
  • Nominal
  • Ordinal
  • Continuous

• Depends on number of ways to split
  • 2-way split
  • Multi-way split
Splitting Based on Nominal Attributes

• Multi-way split: Use as many partitions as distinct values.
  [Diagram: a CarType node branching into Family / Sports / Luxury]

• Binary split: Divides values into two subsets; need to find the optimal partitioning (see the sketch below).
  [Diagram: a CarType node branching into {Sports, Luxury} vs. {Family}, OR {Family, Luxury} vs. {Sports}]
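A nominal attribute with k distinct values admits 2^(k-1) - 1 candidate binary partitions, which is why the optimal one has to be searched for. A minimal sketch (an illustrative helper, not from the slides) that enumerates them for CarType:

    from itertools import combinations

    def binary_partitions(values):
        # Fix the first value on the left side to avoid mirrored duplicates,
        # then toggle membership of the remaining values.
        first, rest = values[0], values[1:]
        for r in range(len(rest)):
            for extra in combinations(rest, r):
                left = {first, *extra}
                yield left, set(values) - left

    for left, right in binary_partitions(["Family", "Sports", "Luxury"]):
        print(left, "vs.", right)   # prints the 2^2 - 1 = 3 candidate splits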
Splitting Based on Ordinal Attributes

• Multi-way split: Use as many partitions as distinct values.
  [Diagram: a Size node branching into Small / Medium / Large]

• Binary split: Divides values into two subsets; need to find the optimal partitioning.
  [Diagram: a Size node branching into {Small, Medium} vs. {Large}, OR {Small} vs. {Medium, Large}]

• What about this split? [Diagram: a Size node branching into {Small, Large} vs. {Medium}]
  It groups Small and Large together while excluding Medium, violating the attribute's ordering, so it is not a valid ordinal split.
Splitting Based on Continuous Attributes

• Different ways of handling:
  • Discretization to form an ordinal categorical attribute
  • Binary decision: (A < v) or (A ≥ v)
    • Consider all possible splits and find the best cut (sketched below)
    • Can be more computationally intensive
Splitting Based on Continuous Attributes

[Diagrams: (i) Binary split: a "Taxable Income > 80K?" node with Yes/No branches; (ii) Multi-way split: a "Taxable Income?" node with branches < 10K, [10K, 25K), [25K, 50K), [50K, 80K), > 80K]
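The "consider all possible splits" step is typically implemented by sorting the values and testing one candidate threshold between each pair of adjacent distinct values. A minimal sketch (illustrative code with made-up toy data, not from the slides; it scores cuts with the Gini impurity defined later in this lecture):

    def gini(labels):
        # Gini impurity of a list of class labels: 1 - sum_j p(j)^2.
        n = len(labels)
        return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

    def best_cut(values, labels):
        # A cut can only matter between two distinct adjacent sorted values,
        # so evaluate the midpoint of each such pair.
        pairs = sorted(zip(values, labels))
        n = len(pairs)
        best_score, best_v = float("inf"), None
        for i in range(1, n):
            if pairs[i - 1][0] == pairs[i][0]:
                continue
            v = (pairs[i - 1][0] + pairs[i][0]) / 2
            left = [y for _, y in pairs[:i]]
            right = [y for _, y in pairs[i:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_score, best_v = score, v
        return best_v, best_score

    print(best_cut([60, 70, 75, 85, 90, 95, 100, 120, 125, 220],
                   ["No", "No", "No", "Yes", "Yes", "Yes",
                    "No", "No", "No", "No"]))   # -> (97.5, 0.3)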
Learn a Decision Tree

The best tree?

• Occam's Razor: The smallest decision tree that correctly classifies all of the training examples is best.
• Finding the smallest (simplest) decision tree is an NP-hard problem [if you are interested, check: Hyafil & Rivest '76].
Choosing a Good Attribute
We Flip Two Different Coins

Quantifying Uncertainty

Entropy

Entropy of a Joint Distribution

Specific Conditional Entropy

Conditional Entropy

Information Gain
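The slide bodies above are images; as a plain-text reference for the quantities they cover, recall the standard definitions H(Y) = -Σ_y p(y) log2 p(y), H(Y | X) = Σ_x p(x) H(Y | X = x), and IG(Y; X) = H(Y) - H(Y | X). A minimal sketch (standard formulas, not copied from the slides):

    from collections import Counter
    from math import log2

    def entropy(labels):
        # H(Y) = -sum_y p(y) log2 p(y), measured in bits.
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def information_gain(xs, ys):
        # IG(Y; X) = H(Y) - sum_x p(X = x) * H(Y | X = x).
        groups = {}
        for x, y in zip(xs, ys):
            groups.setdefault(x, []).append(y)
        h_cond = sum(len(g) / len(ys) * entropy(g) for g in groups.values())
        return entropy(ys) - h_cond

    print(entropy(["H", "T"]))        # a fair coin flip: 1.0 bit
    print(entropy(["H", "H", "H"]))   # a certain outcome: 0 bits (prints -0.0)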
Constructing Decision Trees

Decision Tree Construction Algorithm
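The construction slide itself is an image; below is a minimal sketch of the standard greedy, top-down (ID3-style) procedure such slides usually present. It assumes the entropy/information_gain helpers from the previous sketch are in scope, and the representation of examples as (feature-dict, label) pairs is an illustrative choice, not the slides':

    def build_tree(examples, attributes):
        # examples: list of (features: dict, label) pairs.
        labels = [y for _, y in examples]
        if len(set(labels)) == 1:            # pure node: return its label
            return labels[0]
        if not attributes:                   # no tests left: majority label
            return max(set(labels), key=labels.count)
        # Greedy step: pick the attribute with the highest information gain.
        best = max(attributes,
                   key=lambda a: information_gain([x[a] for x, _ in examples],
                                                  labels))
        branches = {}
        for v in {x[best] for x, _ in examples}:   # one branch per value
            subset = [(x, y) for x, y in examples if x[best] == v]
            branches[v] = build_tree(subset,
                                     [a for a in attributes if a != best])
        return {best: branches}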
Back to Our Example

Attribute Selection
How to Determine the Best Split: Impurity

Before splitting: 10 records of class C0 and 10 records of class C1. Three candidate splits:

• On-Campus?   Yes: C0: 6, C1: 4 | No: C0: 4, C1: 6
• Car Type?    Family: C0: 1, C1: 3 | Sports: C0: 8, C1: 0 | Luxury: C0: 1, C1: 7
• Student ID?  c1 through c20: one record per value (either C0: 1, C1: 0 or C0: 0, C1: 1)
How to Determine the Best Split: Impurity

• Greedy approach: nodes with a homogeneous class distribution are preferred
• Need a measure of node impurity:

  C0: 5, C1: 5 (non-homogeneous, high degree of impurity)
  C0: 9, C1: 1 (homogeneous, low degree of impurity)
Measure of Impurity: GINI

• Gini index for a given node t:

      GINI(t) = 1 - Σ_j [p(j | t)]²

  (NOTE: p(j | t) is the relative frequency of class j at node t.)

• Maximum (1 - 1/n_c, where n_c is the number of classes) when records are equally distributed among all classes, implying the least interesting information
• Minimum (0) when all records belong to one class, implying the most useful information

  C1: 0, C2: 6 → Gini = 0.000
  C1: 1, C2: 5 → Gini = 0.278
  C1: 2, C2: 4 → Gini = 0.444
  C1: 3, C2: 3 → Gini = 0.500
Measure of Impurity: GINI

GINI(t) = 1 - Σ_j [p(j | t)]²

• C1: 0, C2: 6. P(C1) = 0/6 = 0, P(C2) = 6/6 = 1
  Gini = 1 - P(C1)² - P(C2)² = 1 - 0 - 1 = 0

• C1: 1, C2: 5. P(C1) = 1/6, P(C2) = 5/6
  Gini = 1 - (1/6)² - (5/6)² = 0.278

• C1: 2, C2: 4. P(C1) = 2/6, P(C2) = 4/6
  Gini = 1 - (2/6)² - (4/6)² = 0.444
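A minimal sketch (an illustrative helper, not from the slides) that reproduces the computations above from per-class record counts:

    def gini(counts):
        # GINI(t) = 1 - sum_j p(j | t)^2, with p(j | t) estimated from counts.
        n = sum(counts)
        return 1.0 - sum((c / n) ** 2 for c in counts)

    for counts in ([0, 6], [1, 5], [2, 4], [3, 3]):
        print(counts, round(gini(counts), 3))   # 0.0, 0.278, 0.444, 0.5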
Which Tree is Better?

What Makes a Good Tree?

Decision Tree Miscellany

Comparison to k-NN
Applications of Decision Trees: Xbox!
Applications of Decision Trees

Questions?