Lec4 - Decision Trees

Decision Trees

Function Approximation
Problem Setting
• Set of possible instances X
• Set of possible labels Y
• Unknown target function f : X → Y
• Set of function hypotheses H = { h | h : X → Y }
Sample Dataset
• Columns denote features Xi
• Rows denote labeled instances
• Class label denotes whether a tennis game was played
Decision Tree
• A possible decision tree for the data:

• Each internal node: tests one attribute Xi
• Each branch from a node: selects one value for Xi
• Each leaf node: predicts Y (or p(Y | x ∈ leaf))
Decision Tree
• A possible decision tree for the data:

• What prediction would we make for <outlook=sunny, temperature=hot, humidity=high, wind=weak>?
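A small sketch of answering this question, assuming the tree on the slide is the usual PlayTennis tree (Outlook at the root, then Humidity and Wind); the dictionary encoding below is mine, not from the slides:

    # Assumed tree structure (illustrative, mirrors the standard PlayTennis tree)
    tree = {
        "attribute": "outlook",
        "children": {
            "sunny":    {"attribute": "humidity", "children": {"high": "No", "normal": "Yes"}},
            "overcast": "Yes",
            "rain":     {"attribute": "wind", "children": {"strong": "No", "weak": "Yes"}},
        },
    }

    def predict(node, x):
        # Walk from the root, following the branch matching the attribute value,
        # until a leaf (a plain label string) is reached.
        while isinstance(node, dict):
            node = node["children"][x[node["attribute"]]]
        return node

    x = {"outlook": "sunny", "temperature": "hot", "humidity": "high", "wind": "weak"}
    print(predict(tree, x))   # -> "No"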
Decision Tree
• If features are continuous, internal nodes can
test the value of a feature against a threshold
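A minimal sketch of such a threshold test (the feature name and threshold below are made up for illustration):

    def route(node, x):
        # Follow one internal node's test: go left if the feature value is at or
        # below the node's threshold, otherwise go right.
        if x[node["feature"]] <= node["threshold"]:
            return node["left"]
        return node["right"]

    # e.g. a node that splits on a continuous humidity reading (illustrative values)
    node = {"feature": "humidity", "threshold": 75.0,
            "left": "low-humidity subtree", "right": "high-humidity subtree"}
    print(route(node, {"humidity": 82.0}))   # -> "high-humidity subtree"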
Decision Tree Induced Partition
Decision Tree – Decision Boundary
• Decision trees divide the feature space into axis-
parallel (hyper-)rectangles
• Each rectangular region is labeled with one label
– or a probability distribution over labels

(Figure: decision boundary.)
Expressiveness
• Decision trees can represent any boolean function of
the input attributes

Truth table row → path to leaf

• In the worst case, the tree will require exponentially many nodes
Expressiveness
Decision trees have a variable-sized hypothesis space
• As the #nodes (or depth) increases, the hypothesis
space grows
– Depth 1 (“decision stump”): can represent any boolean function of one feature
– Depth 2: any boolean function of two features; some involving three features (e.g., (x1 ∧ x2) ∨ (¬x1 ∧ ¬x3); see the sketch after this slide)
– etc.

Based on slide by Pedro Domingos
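A quick sketch (not from the slides) checking that a depth-2 tree really does represent the three-feature formula above:

    from itertools import product

    # Depth-2 tree for (x1 AND x2) OR (NOT x1 AND NOT x3):
    # the root tests x1; its children test x2 and x3 respectively.
    def tree(x1, x2, x3):
        if x1:              # root: test x1
            return x2       # depth-2 node: test x2
        return not x3       # depth-2 node: test x3

    def formula(x1, x2, x3):
        return (x1 and x2) or ((not x1) and (not x3))

    # The tree and the formula agree on every truth assignment.
    assert all(tree(*bits) == formula(*bits) for bits in product([False, True], repeat=3))
    print("depth-2 tree matches the three-feature formula")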


Another Example:
Restaurant Domain (Russell & Norvig)
Model a patron’s decision of whether to wait for a table at a restaurant

~7,000 possible cases


Decision Tree Techniques
Basic Algorithm for Top-Down
Induction of Decision Trees
[ID3, C4.5 by Quinlan]

node = root of decision tree


Main loop:
1. A ← the “best” decision attribute for the next node.
2. Assign A as decision attribute for node.
3. For each value of A, create a new descendant of node.
4. Sort training examples to leaf nodes.
5. If training examples are perfectly classified, stop.
Else, recurse over new leaf nodes.

How do we choose which attribute is best?
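A compact recursive sketch of this loop in Python (an illustration for these notes, not Quinlan's code; choose_best stands in for whatever attribute-selection heuristic is used):

    from collections import Counter

    def id3(examples, attributes, choose_best):
        # `examples` is a list of (features_dict, label) pairs;
        # `choose_best(examples, attributes)` picks the "best" attribute.
        labels = [y for _, y in examples]
        if len(set(labels)) == 1:                  # perfectly classified: stop
            return labels[0]
        if not attributes:                         # no attributes left: majority label
            return Counter(labels).most_common(1)[0][0]
        a = choose_best(examples, attributes)      # step 1: pick the "best" attribute
        node = {"attribute": a, "children": {}}
        for value in {x[a] for x, _ in examples}:  # step 3: one descendant per value of A
            subset = [(x, y) for x, y in examples if x[a] == value]   # step 4: sort examples
            rest = [b for b in attributes if b != a]
            node["children"][value] = id3(subset, rest, choose_best)  # step 5: recurse
        return node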


Choosing the Best Attribute
Key problem: choosing which attribute to split a
given set of examples
• Some possibilities are:
– Random: Select any attribute at random
– Least-Values: Choose the attribute with the smallest
number of possible values
– Most-Values: Choose the attribute with the largest
number of possible values
– Max-Gain: Choose the attribute that has the largest
expected information gain
• i.e., attribute that results in smallest expected size of subtrees
rooted at its children

• The ID3 algorithm uses the Max-Gain method of selecting the best attribute
Choosing an Attribute
Idea: a good attribute splits the examples into subsets
that are (ideally) “all positive” or “all negative”

Which split is more informative: Patrons? or Type?


Information Gain
Which test is more informative?
• Split over whether Balance exceeds 50K (branches: Less or equal 50K / Over 50K)
• Split over whether the applicant is employed (branches: Unemployed / Employed)


Information Gain
Impurity/Entropy (informal)
– Measures the level of impurity in a group of
examples
Information Gain
Impurity
(Figure: three groups of examples, from a very impure group, to a less impure group, to minimum impurity.)
Information Gain
• We want to determine which attribute in a given set
of training feature vectors is most useful for
discriminating between the classes to be learned.

• Information gain tells us how important a given attribute of the feature vectors is.

• We will use it to decide the ordering of attributes in the nodes of a decision tree.
Information Gain
Information Gain is the expected reduction in entropy of target
variable Y for data sample S, due to sorting on variable A
Day  Outlook   Temp  Humidity  Wind    PlayTennis
 1   Sunny     Hot   High      Weak    No
 2   Sunny     Hot   High      Strong  No
 3   Overcast  Hot   High      Weak    Yes
 4   Rain      Mild  High      Weak    Yes
 5   Rain      Cool  Normal    Weak    Yes
 6   Rain      Cool  Normal    Strong  No
 7   Overcast  Cool  Normal    Weak    Yes
 8   Sunny     Mild  High      Weak    No
 9   Sunny     Cool  Normal    Weak    Yes
10   Rain      Mild  Normal    Strong  Yes
11   Sunny     Mild  Normal    Strong  Yes
12   Overcast  Mild  High      Strong  Yes
13   Overcast  Hot   Normal    Weak    Yes
14   Rain      Mild  High      Strong  No
Example

A decision tree for the PlayTennis data above:

Outlook
  Sunny    → Humidity
               High   → No
               Normal → Yes
  Overcast → Yes
  Rain     → Wind
               Strong → No
               Weak   → Yes
Example
(Figure: two candidate splits, Question 1 and Question 2, each branching on Yes/No.)
Example
(Figure: two more candidate splits, Question 3 and Question 4, each branching on Yes/No.)
Example
Example
(Figure: both candidate splits start from a parent node with entropy E = 1. Question 1's Yes/No children have entropies E = 0.97 and E = 0.92; Question 2's Yes/No children have entropies E = 0.72 and E = 0.)

Information Gain

Entropy(S) = − Σ_k p_k log2 p_k,  where p_k is the fraction of examples in S belonging to class k

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) · Entropy(S_v)
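These two formulas translate directly into Python (an illustrative sketch; the function and variable names are mine, not from the slides):

    import math
    from collections import Counter

    def entropy(labels):
        # Entropy(S) = -sum_k p_k log2 p_k over the class proportions in S
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(examples, attribute):
        # Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v);
        # `examples` is a list of (features_dict, label) pairs.
        labels = [y for _, y in examples]
        gain = entropy(labels)
        n = len(examples)
        for value in {x[attribute] for x, _ in examples}:
            subset_labels = [y for x, y in examples if x[attribute] == value]
            gain -= (len(subset_labels) / n) * entropy(subset_labels)
        return gain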
Example
(The figure from the previous Example, now annotated with the information gain of each split computed with the formula above; Question 2, whose children are purer, has the larger gain.)


Example

(The PlayTennis table above is repeated as the training sample S; the next slides compute the information gain of each attribute on S.)

Example

Splitting the full sample S (E = 0.954) on Wind:
  Weak branch:   E = 0.811
  Strong branch: E = 1
Example

G(S, Wind) = 0.048

Splitting S (E = 0.954) on Humidity:
  High branch:   E = 0.985
  Normal branch: E = 0.592
Example

G(S, Wind) = 0.048
G(S, Humidity) = 0.151

Splitting S (E = 0.954) on Temp:
  Hot branch:  E = 1
  Mild branch: E = 0.92
  Cool branch: E = 0.81
Example

G(S, Wind) = 0.048
G(S, Humidity) = 0.151
G(S, Temp) = 0.042

Splitting S (E = 0.954) on Outlook:
  Sunny branch:    E = 0.971
  Overcast branch: E = 0
  Rain branch:     E = 0.971
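These gains can be re-derived from the table directly. The sketch below (my own, reusing the entropy/gain definitions from earlier) prints a gain for each attribute, and Outlook comes out largest; exact decimals depend on rounding conventions and may differ slightly from the numbers printed on the slides.

    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain(rows, attr):
        g = entropy([r["Play"] for r in rows])
        for v in {r[attr] for r in rows}:
            sub = [r["Play"] for r in rows if r[attr] == v]
            g -= (len(sub) / len(rows)) * entropy(sub)
        return g

    # Days 1-14 from the PlayTennis table: (Outlook, Temp, Humidity, Wind, Play)
    data = [
        ("Sunny", "Hot", "High", "Weak", "No"),        ("Sunny", "Hot", "High", "Strong", "No"),
        ("Overcast", "Hot", "High", "Weak", "Yes"),    ("Rain", "Mild", "High", "Weak", "Yes"),
        ("Rain", "Cool", "Normal", "Weak", "Yes"),     ("Rain", "Cool", "Normal", "Strong", "No"),
        ("Overcast", "Cool", "Normal", "Weak", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
        ("Sunny", "Cool", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "Normal", "Strong", "Yes"),
        ("Sunny", "Mild", "Normal", "Strong", "Yes"),  ("Overcast", "Mild", "High", "Strong", "Yes"),
        ("Overcast", "Hot", "Normal", "Weak", "Yes"),  ("Rain", "Mild", "High", "Strong", "No"),
    ]
    rows = [dict(zip(["Outlook", "Temp", "Humidity", "Wind", "Play"], r)) for r in data]

    for attr in ("Outlook", "Temp", "Humidity", "Wind"):
        print(attr, round(gain(rows, attr), 3))
    # Outlook yields the largest gain, so ID3 places it at the root.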
Example
Outlook gives the largest information gain, so it becomes the root; recursing within each branch gives:

Outlook
  Sunny    → Humidity
               High   → No
               Normal → Yes
  Overcast → Yes
  Rain     → Wind
               Strong → No
               Weak   → Yes
Which Tree Should We Output?
• ID3 performs a heuristic search through the space of decision trees
• It stops at the smallest acceptable tree. Why?
