0% found this document useful (0 votes)
10 views36 pages

Decision Tree For Class

Uploaded by

bn23mer2r15
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views36 pages

Decision Tree For Class

Uploaded by

bn23mer2r15
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 36

Decision Trees

Overview

◼ What is a Decision Tree


◼ Sample Decision Trees
◼ How to Construct a Decision Tree
◼ Problems with Decision Trees
What is a Decision Tree?
◼ An inductive learning task
 Use particular facts to make more generalized
conclusions

◼ A predictive model based on a branching


series of Boolean tests
 These smaller Boolean tests are less complex
than a one-stage classifier

◼ Let’s look at a sample decision tree…


Predicting Commute Time
Leave At If we leave at
10 AM 9 AM 10 AM and
8 AM
there are no
Stall? Accident?
cars stalled on
No Yes Long No Yes
the road, what
will our
Short Long Medium Long commute time
be?
Inductive Learning
◼ In this decision tree, we made a series of
Boolean decisions and followed the
corresponding branch
 Did we leave at 10 AM?
 Did a car stall on the road?
 Is there an accident on the road?

◼ By answering each of these yes/no


questions, we then came to a conclusion on
how long our commute might take
Decision Trees as Rules

◼ We did not have represent this tree


graphically

◼ We could have represented as a set of


rules. However, this may be much
harder to read…
Decision Tree as a Rule Set
if hour == 8am ◼ Notice that all attributes to
commute time = long not have to be used in each
else if hour == 9am path of the decision.
if accident == yes
commute time = long ◼ As we will see, all attributes
may not even appear in the
else tree.
commute time =
medium
else if hour == 10am
if stall == yes
commute time = long
else
commute time = short
How to Create a Decision Tree

◼ We first make a list of attributes that


we can measure
 These attributes (for now) must be
discrete
◼ We then choose a target attribute that
we want to predict
◼ Then create an experience table that
lists what we have seen in the past
Sample Experience Table
Example Attributes Target
Hour Weather Accident Stall Commute
D1 8 AM Sunny No No Long
D2 8 AM Cloudy No Yes Long
D3 10 AM Sunny No No Short
D4 9 AM Rainy Yes No Long
D5 9 AM Sunny Yes Yes Long
D6 10 AM Sunny No No Short
D7 10 AM Cloudy No No Short
D8 9 AM Rainy No No Medium
D9 9 AM Sunny Yes No Long
D10 10 AM Cloudy Yes Yes Long
D11 10 AM Rainy No No Short
D12 8 AM Cloudy Yes No Long
D13 9 AM Sunny No No Medium
Choosing Attributes

◼ The previous experience decision


table showed 4 attributes: hour,
weather, accident and stall
◼ But the decision tree only showed 3
attributes: hour, accident and stall
◼ Why is that?
Choosing Attributes

◼ Methods for selecting attributes show


that weather is not a discriminating
attribute
Choosing Attributes

◼ The basic structure of creating a


decision tree is the same for most
decision tree algorithms
◼ The difference lies in how we select
the attributes for the tree
Decision Tree Algorithms

◼ The basic idea behind any decision tree


algorithm is as follows:
 Choose the best attribute(s) to split the
remaining instances and make that attribute a
decision node
 Repeat this process for recursively for each child
 Stop when:
◼ All the instances have the same target attribute value
◼ There are no more attributes
◼ There are no more instances
Identifying the Best Attributes

◼ Refer back to our original decision tree


Leave At

10 AM 9 AM
8 AM

Stall? Accident?
Long
No Yes No Yes
Short Long Medium Long

◼ How did we know to split on leave at


and then on stall and accident and not
weather?
Problems with Decision Trees

◼ While decision trees classify quickly,


the time for building a tree may be
higher than another type of classifier

◼ Decision trees suffer from a problem of


errors propagating throughout a tree
 A very serious problem as the number of
classes increases
Error Propagation

◼ Since decision trees work by a series


of local decisions, what happens when
one of these local decisions is wrong?
 Every decision from that point on may be
wrong
 We may never return to the correct path
of the tree
Rotational Coolent(Yes Surface
Exp No Speed /No) Treatment(Yes/No) Surface Finish
1 High Yes Yes Good
2 High No Yes Moderate
3 Slow Yes Yes Good
4 Medium Yes No Good
5 Medium No Yes No Change
6 Medium Yes No Good
7 Slow No No Good
8 Slow No Yes Minimal Effect
9 Slow Yes Yes Good
10 High No Yes Moderate
GINI

There are four possible output variables, Good, Moderate, No


Change and Minimal

There are six instances of Good, There are two instances of


Moderate, One Instance of No Change and One Instance of
Minimal Effect

GINI=1-{(6/10)2+(2/10)2+(1/10)2+(1/10)2} = 0.58
Computation of GINI Index for Surface Treatment

It has two possible values, either Yes or No

There are seven instances of Yes and three instances of NO

For surface treatment = No, there are three instances of Good


Surface Finish

GINI=1-{(3/3)2} = 0
For surface treatment = Yes, there are two instances of Moderate
Surface Finish, three instances of Good Surface Finish, one instance
with No Effect and One instance with Minimal surface finish
GINI=1-{(2/7)2+(3/7)2+(1/7)2+(1/7)2} = 0.694

Weighted Average of Surface Treatment=0*(3/10)+0.694*(7/10)=0.486


Rotational Coolent(Yes Surface
Exp No Speed /No) Treatment(Yes/No) Surface Finish
1 High Yes Yes Good
2 High No Yes Moderate
3 Slow Yes Yes Good
4 Medium Yes No Good
5 Medium No Yes No Change
6 Medium Yes No Good
7 Slow No No Good
8 Slow No Yes Minimal Effect
9 Slow Yes Yes Good
10 High No Yes Moderate
Computation of GINI Index for Coolant
It has two possible values, either Yes or No

Both Yes and No occur 5 times

For coolent = Yes, there are 5 instances and all of them have
produced good surface finish

GINI=1-{(5/5)2} = 0

For Coolent = No, there are two instances which provides Moderate
surface finish, one instance which provides Good surface finish, one
provides Minimal surface finish and another provides No Change in the
surface finish

Gini(Coolent=No)=1-[(2/5)2+(1/5)2+(1/5)2+(1/5)2]=0.72
Weighted Average of Surface Treatment= 0*(5/10)+0.72*(5/10)=0.36
Rotational Coolent(Yes Surface
Exp No Speed /No) Treatment(Yes/No) Surface Finish
1 High Yes Yes Good
2 High No Yes Moderate
3 Slow Yes Yes Good
4 Medium Yes No Good
5 Medium No Yes No Change
6 Medium Yes No Good
7 Slow No No Good
8 Slow No Yes Minimal Effect
9 Slow Yes Yes Good
10 High No Yes Moderate
Computation of GINI Index for Rotational Speed
It has three possible values, three instances of High,
three instances of Medium, and four instances of Slow
For Rotational Speed=High, there are one instance of Good Surface
Finish and Two instances of Moderate Surface Finish
Gini(Rotational Speed=High)=1-[(2/3)2+(1/3)2]=0.444
For Rotational Speed=Medium, there are two instances of Good Surface
Finish and one instance of No Change Surface Finish
Gini(Rotational Speed=Medium)=1-[(2/3)2+(1/3)2]=0.444
For Rotational Speed=Slow, there are three instances of Good Surface
Finish and One instance of Minimal Effect Surface Finish
Gini(Rotational Speed=Slow)=1-[(3/4)2+(1/4)2]=0.375
Weighted Average of Surface Treatment=
0.444*(3/10)+0.444*(3/10)+0.375*(4/10)=0.416
Computation of GINI Index

For Coolent, GINI = 0.36


For Rotational Speed, GINI = 0.416
For Surface Treatment, GINI=0.486

Coolant is having the minimum GINI index. So it is taken as


the root of the Tree
Coolant
YES
Surface
Exp Rotational Coolant Treatment(Yes/N Surface
No Speed (Yes/No) o) Finish NO
1 High Yes Yes Good
3 Slow Yes Yes Good
4 Medium Yes No Good
6 Medium Yes No Good
9 Slow Yes Yes Good
For Coolant = Yes, the surface finish is good for all the
conditions. So, no more decision is needed Surface
Coolent(Ye Treatment(
Exp No Rotational Speed s/No) Yes/No) Surface Finish
2 High No Yes Moderate
5 Medium No Yes No Change
7 Slow No No Good
8 Slow No Yes Minimal Effect
10 High No Yes Moderate
Computation of GINI Index for Coolant = No & Rotational Speeds
For Coolant = No three are three possible Rotational Speed Conditions

For Coolent = No and Rotational Speed =High, there are two


instances of surfacefinish viz Moderate
Gini(Coolent= No and Rotational Speed=High)=1-[(2/2)2]=0
For Coolant = No and Rotational Speed =Medium, there is only one
surfacefinish outcome viz No Change
Gini(Coolent= No and Rotational Speed=Medium)=1-[(1/1)2]=0
For Coolant = No and Rotational Speed =Slow, there is only one surface
finish outcome of Good and One Outcome equal to Minimal Effect

Gini(Coolant=No & Rotational Speed=Slow)=1-[(1/2)2+(1/2)2]=0.5


Weighted Average of Coolent and Rotational Speed= 0*(2/5)+0*(1/5)+0.5*(2/5)=0.2
Computation of GINI Index for Coolant = No & Surface Treatment

For Coolant = No three are two possible Surface Treatment Conditions,


Yes and No

For Coolent = No and Surface Treatment =Yes, there are two instances of
surfacefinish viz Moderate, one instance of Surface Finish=Minimal and
another instance of No Change

Gini(Coolent=No and Surface Treatment = Yes)=1-


[(2/4)2+(1/4)2+(1/4)2]=0.625
Computation of GINI Index for Coolant = No & Surface Treatment

For Coolant = No and Surface Treatment =No, there is only one possible
outcome ie Surface Finish=Good

Gini(Coolent=No and Surface Treatment = No)=1-[(1/1)2]=0

Weighted Average of Coolant and Surface Treatment=


0*(1/5)+0.625*(4/5)=0.5
For Coolants = No and Rotational Speed, GINI = 0.2
For Coolants = No and Surface Treatment, GINI = 0.5

Rotational Speed is Chosen as the root as it has the lowest GINI index
For Cholent = No and Rotational Speed = High the
outcome is same(Moderate)

Surface
Rotational Treatment Surface
Exp No Speed Coolant(Yes/No) (Yes/No) Finish
2 High No Yes Moderate
10 High No Yes Moderate
For Cholent = No and Rotational Speed = Medium the
outcome is only one(No Change)

Surface
Rotational Coolant Treatment Surface
Exp No Speed (Yes/No) (Yes/No) Finish
5 Medium No Yes No Change
For Coolant = No and Rotational Speed = Slow the
outcome is two one being Good and another Minimal
Effect

Surface
Rotational Coolant Treatment Surface
Exp No Speed (Yes/No) (Yes/No) Finish
7 Slow No No Good
Minimal
8 Slow No Yes Effect
Decision Tree
Coolant
Yes No
Rotational Speed
Surface Finish=Good
RS=Slow RS=High

RS=Medium
Surface Surface
Treatment No Change Finish=Moderate
Yes
No

Minimal Effect
Good

You might also like