
Machine Learning

-Decision Trees-
Gulden Uchyigit

Advances in AI
MACHINE LEARNING

• Introduction
• Decision trees

Russell and Norvig, AI: A Modern Approach (second edition),
Chapter 18, sections 1-3
Machine learning
• So far, we have focused on building systems that do something
  (behaviour/performance), given some knowledge.
• (Machine) learning aims to
  • improve behaviour/performance:
    • learn to perform new tasks (more)
    • increase ability on existing tasks (better)
    • increase speed on existing tasks (faster)
  • produce and increase knowledge:
    • formulate explicit concept descriptions
    • formulate explicit rules
    • discover regularities in data
    • discover the way the world behaves
Overall, machine learning promotes the autonomy of agents.

Definition
• A computer program is said to learn from experience E with respect to
  some class of tasks T and performance measure P, if its performance at
  tasks in T, as measured by P, improves with experience E.
• E.g. a computer program that learns to play checkers might improve its
  performance, as measured by its ability to win at the class of tasks
  involving playing checkers games, through experience obtained by
  playing games against itself.
Learning architecture

[Diagram: experience (e.g. inputs/outputs) and background knowledge/bias
feed a learning element; the learning element updates a reasoning/performing
element, which takes a problem/task and produces an answer/performance.]

Learning Issues

• Expressiveness - what can be learnt?


• Efficiency - how easily is learning performed?
• Transparency - can we understand what has been
learnt?
• Bias - which hypotheses are preferred?
• Background knowledge - available or not?
• Assessing performance - cross-validation and
learning curves
• Coping with noise

Assessing performance
• Cross-validation: set of examples split into
training set (to learn) + test set (to check)
• Learning curves: growing the training set, how
does the behaviour of the learnt system improve
upon the test set?

[Learning curve figure from Russell & Norvig]
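To make the assessment procedure concrete, here is a minimal Python sketch of a train/test split and a learning curve; `learn` and `accuracy` are hypothetical placeholders for whatever learner and scoring function are being assessed, not functions defined in these slides.

```python
import random

def split(examples, test_fraction=0.3):
    """Split examples into a training set (to learn) and a test set (to check)."""
    shuffled = examples[:]                       # copy so the input list is untouched
    random.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def learning_curve(examples, learn, accuracy, steps=5):
    """Grow the training set and record accuracy of the learnt system on the test set."""
    train, test = split(examples)
    curve = []
    for i in range(1, steps + 1):
        n = max(1, len(train) * i // steps)      # use a growing prefix of the training set
        model = learn(train[:n])
        curve.append((n, accuracy(model, test))) # check performance on held-out examples
    return curve
```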

Learning techniques

• Decision tree learning


• Learning neural networks
• Reinforcement learning
• Inductive logic programming

Decision Trees - Introduction

Goal: Categorization
• Given an event, predict its category. Examples:
  • Who won a given football match?
  • How should we file a given e-mail?
• Event = list of attributes. Examples:
  • Football: Who was the goalie?
  • Email: Who sent the e-mail?

Introduction (cont.)

• Use a decision tree to predict categories for new events.
• Use training data to build the decision tree.

[Diagram: training events & categories are used to build the decision tree;
new events are passed through the decision tree, which outputs a category.]
What is a Decision Tree?
• It is a classifier in the form of a tree structure, built from decision
  nodes and leaf nodes.
• Each decision node is labeled with an attribute.
• Each arc is labeled with a value for the node's attribute.
• Each leaf node is labeled with a category.
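A minimal sketch (not from the slides) of this structure in Python: decision nodes labeled with an attribute, arcs labeled with attribute values, leaves labeled with a category. The attribute and category names below are made up purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    category: str                                  # leaf node: labeled with a category

@dataclass
class DecisionNode:
    attribute: str                                 # decision node: labeled with an attribute
    branches: dict = field(default_factory=dict)   # arc value -> child (Leaf or DecisionNode)

def classify(node, event):
    """Follow the arc matching the event's value for each tested attribute down to a leaf."""
    while isinstance(node, DecisionNode):
        node = node.branches[event[node.attribute]]
    return node.category

# Tiny illustrative tree and event:
tree = DecisionNode("friends", {"yes": Leaf("attend"), "no": Leaf("stay home")})
print(classify(tree, {"friends": "yes"}))          # -> attend
```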

Example

• Whether or not I will be going to a party depends on the following five
  attributes:
  • Distance - distance of the party (short, medium, long).
  • Friends - whether or not any of my friends are going (yes, no).
  • Prior - whether or not I have any prior plans (yes, no).
  • Rain - whether or not it is raining (yes, no).
  • Tired - whether or not I am tired (yes, no).

Decision trees: example
Goal predicate: attend-party

prior commitment?
  yes -> no
  no  -> distance?
           short -> friends attending?
                      yes -> yes
                      no  -> no
           med   -> tired?
                      yes -> no
                      no  -> raining?
                               yes -> no
                               no  -> yes
           long  -> no


Reading the Decision Tree

• Decision trees implicitly define logical sentences (conjunctions of
  implications), e.g. for the tree above:

∀P attend-party(P) if not prior(P) & dist(P,short) & friends(P)
∀P attend-party(P) if not prior(P) & dist(P,med) & not tired(P) & not rain(P)

∀P attend-party(P) iff
  [not prior(P) & dist(P,short) & friends(P)] or
  [not prior(P) & dist(P,med) & not tired(P) & not rain(P)]
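The same reading written as an executable predicate (a sketch: each path to a yes-leaf becomes one conjunct of the disjunction):

```python
def attend_party(prior, dist, friends, tired, rain):
    # attend-party iff [not prior & dist = short & friends]
    #             or   [not prior & dist = med & not tired & not rain]
    return ((not prior and dist == "short" and friends) or
            (not prior and dist == "med" and not tired and not rain))

print(attend_party(prior=False, dist="short", friends=True, tired=True, rain=True))  # True
```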
Decision tree learning algorithm
• Start with a set of examples (training set), a set of attributes SA, and
  a default value for the goal predicate.
• If the set of examples is empty, then add a leaf with the default value
  for the goal predicate and terminate; otherwise
• If all examples have the same classification, then add a leaf with that
  classification and terminate; otherwise
• If the set of attributes SA is empty, then return the default value for
  the goal predicate and terminate; otherwise

Decision tree learning algorithm (cont.)
• 1) Choose an attribute A to split on.
• 2) Add a corresponding test to the tree.
• 3) Create new branches for each value of the attribute.
• 4) Assign each example to the appropriate branch.
• 5) Iterate from step 1) on each branch, with set of attributes SA - {A}
  and, as default value, the majority value for the current set of
  examples (a code sketch of the full procedure follows below).
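A sketch of the whole procedure in Python, reusing the `Leaf`/`DecisionNode` classes from the earlier sketch. Examples are assumed to be dicts with a "classification" key; `choose_attribute` is left pluggable (information gain, introduced later, is the usual choice). Branches are created only for attribute values observed in the current examples, so the empty-examples/default case would only arise with a known attribute domain.

```python
from collections import Counter

def majority(examples):
    """Most common classification in a set of examples."""
    return Counter(e["classification"] for e in examples).most_common(1)[0][0]

def learn_tree(examples, attributes, default, choose_attribute):
    if not examples:                                   # no examples: leaf with default value
        return Leaf(default)
    classes = {e["classification"] for e in examples}
    if len(classes) == 1:                              # all examples agree: leaf with that class
        return Leaf(classes.pop())
    if not attributes:                                 # attributes exhausted (e.g. noise): default
        return Leaf(default)
    a = choose_attribute(examples, attributes)         # 1) choose an attribute A to split on
    node = DecisionNode(a)                             # 2) add the corresponding test
    for v in {e[a] for e in examples}:                 # 3) one branch per observed value of A
        subset = [e for e in examples if e[a] == v]    # 4) assign examples to the branch
        node.branches[v] = learn_tree(                 # 5) iterate with SA - {A} and the
            subset, [x for x in attributes if x != a], #    majority value as the new default
            majority(examples), choose_attribute)
    return node
```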
Example: training set (step 1)
Set of attributes SA

prior dist friend tired rain classification


1. Y L N Y N N
2. N M N Y Y N
3. N S Y Y Y Y
4. N S N Y N N
5. N M Y N N Y
6. N S Y Y N Y
7. Y S Y Y N N
8. Y M Y Y Y N
9. Y L Y Y N N
10. Y L Y Y Y N
Default value: Y

Example: decision tree learning

Splitting the full training set (examples 1-10, default value Y) on the
attribute distance gives three branches:
  short: examples 3+, 4-, 6+, 7-
  med:   examples 2-, 5+, 8-
  long:  examples 1-, 9-, 10-

Example: decision tree learning

  short branch (3+, 4-, 6+, 7-), default value N:
      prior friend tired rain  classification
  3.    N     Y     Y    Y     Y
  4.    N     N     Y    N     N
  6.    N     Y     Y    N     Y
  7.    Y     Y     Y    N     N

  med branch (2-, 5+, 8-), default value N:
      prior friend tired rain  classification
  2.    N     N     Y    Y     N
  5.    N     Y     N    N     Y
  8.    Y     Y     Y    Y     N

  long branch (1-, 9-, 10-): all negative, so it becomes a leaf labeled no.

Example: decision tree learning

The short branch (3+, 4-, 6+, 7-, default value N) is split on prior
commitment:
  no:  examples 3+, 4-, 6+
  yes: example 7-
The med branch (2-, 5+, 8-, default value N) is still to be expanded; the
long branch is the leaf no.

Example: decision tree learning

Under prior commitment = no (short branch), the remaining examples are
(default value Y):
      friend tired rain  classification
  3.    Y     Y    Y     Y
  4.    N     Y    N     N
  6.    Y     Y    N     Y
The med branch table (2-, 5+, 8-, default value N) still remains to be
expanded.


Example: decision tree learning

Final tree:

distance?
  short -> prior commitment?
             no  -> friends attending?
                      yes -> yes
                      no  -> no
             yes -> no
  med   -> tired?
             no  -> yes
             yes -> no
  long  -> no

Empty set of attributes if noise

   A B  classification
1. Y N  N
2. N Y  N    (noise: examples 2 and 3 have identical attribute
3. N Y  Y     values but different classifications)

Splitting on A sends example 1 (negative) down one branch and examples
2-, 3+ down the other; splitting that branch on B still leaves 2-, 3+
together. The set of attributes is now empty, so the algorithm has to
return the default value ("?").

Choose the "best" attribute?

• Intuition:
  • The aim is to minimise the depth of the final tree.
  • Choose the attribute that provides as exact a classification as possible:
    • "perfect" attribute: the examples in each resulting set are either all
      positive or all negative
    • "useless" attribute: the proportion of positive and negative examples
      in the new sets is roughly the same as in the original set
• Information theory is used to define "perfect/useful/useless" attributes,
  by computing the information gain from choosing each attribute.

Entropy

• S is a sample of training examples
• p+ is the proportion of positive examples in S
• p- is the proportion of negative examples in S

  Entropy(S) = - p+ log2 p+ - p- log2 p-

Example

• Imagine that a sample of 14 examples contains 9 positive examples and
  5 negative examples.

  Entropy([9+,5-]) = - (9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

  Note: log2(x) = ln(x) / ln(2)
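A small Python check of this calculation (a sketch, not from the slides):

```python
from math import log2

def entropy(pos, neg):
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                        # the 0 * log2(0) term is taken to be 0
            p = count / total
            result -= p * log2(p)
    return result

print(round(entropy(9, 5), 3))           # 0.94, matching Entropy([9+,5-]) above
```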
The Entropy Function

[Plot of Entropy(S) against p+: the entropy is 0.0 when all examples are
positive (or all negative), and reaches its maximum of 1.0 when there are
equal numbers of positive and negative examples (p+ = 0.5).]

Information Gain

• Gain(S, A) = expected reduction in entropy due to sorting on A:

  Gain(S, A) ≡ Entropy(S) − Σ_{v ∈ Values(A)} (|Sv| / |S|) Entropy(Sv)

  where S is the set of training examples, A is the attribute, v ranges
  over the values of A, and Sv is the subset of S for which A has value v.
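The same formula as a Python sketch, building on the `entropy` helper above. Examples here are assumed to be dicts of attribute values with a boolean "positive" label, a representation chosen only for illustration.

```python
def entropy_of(examples):
    pos = sum(1 for e in examples if e["positive"])
    return entropy(pos, len(examples) - pos)          # entropy() from the sketch above

def information_gain(examples, attribute):
    """Entropy(S) minus the weighted entropies of the subsets Sv, one per value v of A."""
    gain = entropy_of(examples)
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        gain -= len(subset) / len(examples) * entropy_of(subset)
    return gain
```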

A Worked Example
Day   Outlook   Temperature  Humidity  Wind    PlayTennis
D1    Sunny     Hot          High      Weak    No
D2    Sunny     Hot          High      Strong  No
D3    Overcast  Hot          High      Weak    Yes
D4    Rain      Mild         High      Weak    Yes
D5    Rain      Cool         Normal    Weak    Yes
D6    Rain      Cool         Normal    Strong  No
D7    Overcast  Cool         Normal    Strong  Yes
D8    Sunny     Mild         High      Weak    No
D9    Sunny     Cool         Normal    Weak    Yes
D10   Rain      Mild         Normal    Weak    Yes
D11   Sunny     Mild         Normal    Strong  Yes
D12   Overcast  Mild         High      Strong  Yes
D13   Overcast  Hot          Normal    Weak    Yes
D14   Rain      Mild         High      Strong  No

Selecting the Root Attribute

Splitting on Humidity (S: [9+,5-], E = 0.940):
  High:   [3+,4-], E = 0.985
  Normal: [6+,1-], E = 0.592
  Gain(S, Humidity) = 0.940 − (7/14) × 0.985 − (7/14) × 0.592 = 0.151

Splitting on Wind (S: [9+,5-], E = 0.940):
  Weak:   [6+,2-], E = 0.811
  Strong: [3+,3-], E = 1.00
  Gain(S, Wind) = 0.940 − (8/14) × 0.811 − (6/14) × 1.0 = 0.048

Which attribute is the best classifier?

Selecting the Root Attribute

Splitting on Outlook (S: [9+,5-], E = 0.940):
  Sunny:    [2+,3-], E = 0.970
  Overcast: [4+,0-], E = 0
  Rain:     [3+,2-], E = 0.970
  Gain(S, Outlook) = 0.940 − (5/14) × 0.970 − (4/14) × 0 − (5/14) × 0.970 = 0.246

Information Gain Values

• Gain(S, Outlook) = 0.246
• Gain(S, Humidity) = 0.151
• Gain(S, Wind) = 0.048
• Gain(S, Temperature) = 0.029

The "best" attribute is the one with the highest information gain value.
The "worst" attribute is the one with the smallest information gain value.
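These values can be reproduced with the `information_gain` sketch above on the worked-example table (rounding only the final results, rather than the intermediate entropies as the slides do, gives 0.247 and 0.152 instead of 0.246 and 0.151):

```python
rows = [
    # Outlook, Temperature, Humidity, Wind, PlayTennis (True = Yes)
    ("Sunny", "Hot", "High", "Weak", False),    ("Sunny", "Hot", "High", "Strong", False),
    ("Overcast", "Hot", "High", "Weak", True),  ("Rain", "Mild", "High", "Weak", True),
    ("Rain", "Cool", "Normal", "Weak", True),   ("Rain", "Cool", "Normal", "Strong", False),
    ("Overcast", "Cool", "Normal", "Strong", True), ("Sunny", "Mild", "High", "Weak", False),
    ("Sunny", "Cool", "Normal", "Weak", True),  ("Rain", "Mild", "Normal", "Weak", True),
    ("Sunny", "Mild", "Normal", "Strong", True), ("Overcast", "Mild", "High", "Strong", True),
    ("Overcast", "Hot", "Normal", "Weak", True), ("Rain", "Mild", "High", "Strong", False),
]
attributes = ["Outlook", "Temperature", "Humidity", "Wind"]
examples = [dict(zip(attributes + ["positive"], row)) for row in rows]

for a in sorted(attributes, key=lambda a: -information_gain(examples, a)):
    print(a, round(information_gain(examples, a), 3))
# Outlook 0.247, Humidity 0.152, Wind 0.048, Temperature 0.029
```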

Selecting root attribute

[Partial tree: Outlook is chosen as the root. Its Sunny branch contains
D1, D2, D8, D9, D11 ([2+,3-]) and still needs an attribute; the Overcast
and Rain branches are not shown expanded.]

Which attribute goes here?

Selecting the next attribute

Splitting the Sunny subset (S: [2+,3-], E = 0.970; examples D9+, D11+,
D1-, D2-, D8-) on Temperature:
  Hot:  [D1-, D2-]    S: [0+,2-], E = ?
  Mild: [D11+, D8-]   S: [1+,1-], E = ?
  Cool: [D9+]         S: [1+,0-], E = ?
