Lec4 - Decision Trees
Function Approximation
Problem Setting
• Set of possible instances X
• Set of possible labels Y
• Unknown target function f : X → Y
• Set of function hypotheses H = { h | h : X → Y }
Sample Dataset
• Columns denote features X_i
• Rows denote labeled instances
• Class label denotes whether a tennis game was played
Decision Tree
• A possible decision tree for the data:
Decision Boundary
• (Figure: the tree's tests partition the instance space into axis-parallel regions.)
Expressiveness
• Decision trees can represent any boolean function of the input attributes
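As a quick illustration (not from the slides), here is a minimal Python sketch of a depth-2 tree computing XOR, a boolean function that no single split can represent; the function name `xor_tree` is my own.

```python
# A depth-2 decision tree computing XOR(x1, x2).
# The root tests x1; each branch then tests x2 and returns a leaf label.
def xor_tree(x1: bool, x2: bool) -> bool:
    if x1:
        return not x2      # x1 = True branch: leaf is the negation of x2
    else:
        return bool(x2)    # x1 = False branch: leaf equals x2
```

Note that representing XOR over n variables this way needs a tree with exponentially many leaves, so "can represent" does not mean "can represent compactly".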
Example
• (Figures: candidate single-question splits, "Question 1" through "Question 4", each with Yes/No branches; for Questions 1 and 2 both children have entropy E = 1.)
Information Gain
• Entropy of a sample S with k class values: E(S) = − Σ_{i=1..k} p_i log2(p_i)
• Information gain of attribute A: G(S, A) = E(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) · E(S_v)
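These two formulas can be sketched directly in Python; the function names `entropy` and `info_gain` are my own, and examples are assumed to be dicts of attribute values:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """E(S) = -sum_i p_i log2(p_i), over the k class values present in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """G(S, A) = E(S) - sum_v |S_v|/|S| * E(S_v)."""
    gain = entropy(labels)
    for v in {row[attr] for row in rows}:
        # S_v: the labels of examples whose attribute attr takes value v
        subset = [y for row, y in zip(rows, labels) if row[attr] == v]
        gain -= len(subset) / len(labels) * entropy(subset)
    return gain
```

For instance, a perfectly balanced binary sample has E(S) = 1, and an attribute that separates the classes completely has gain equal to the parent entropy.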
Example

Day  Outlook  Temp  Humidity  Wind    PlayTennis
1    Sunny    Hot   High      Weak    No
2    Sunny    Hot   High      Strong  No
(… remaining training examples omitted)
Example
• Splitting on Wind: parent entropy E = 0.954; Wind = Weak gives E = 0.811, Wind = Strong gives E = 1
• G(S, Wind) = 0.048
Example
• Splitting on Humidity: Humidity = High gives E = 0.985, Humidity = Normal gives E = 0.592
Example
• Splitting on Temp: Temp = Hot gives E = 1, Temp = Mild gives E = 0.92, Temp = Cool gives E = 0.81
• G(S, Humidity) = 0.151
• G(S, Temp) = 0.042
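The branch entropies above can be checked numerically. The sketch below recomputes them from per-branch class counts, assuming the standard 14-example PlayTennis data (e.g. Wind = Weak covers 6 positive / 2 negative examples); those counts are an assumption, since the full table is not shown above.

```python
from math import log2

def entropy(pos, neg):
    """Binary entropy of a branch containing pos positive and neg negative examples."""
    total = pos + neg
    e = 0.0
    for c in (pos, neg):
        if c:  # skip empty classes, since 0 * log2(0) is taken to be 0
            p = c / total
            e -= p * log2(p)
    return e

# (pos, neg) class counts per branch — assumed from the standard
# 14-example PlayTennis dataset, which the slides only show in part.
branches = {
    "Wind=Weak": (6, 2),       # E ≈ 0.811
    "Wind=Strong": (3, 3),     # E = 1.0
    "Humidity=High": (3, 4),   # E ≈ 0.985
    "Humidity=Normal": (6, 1), # E ≈ 0.592
    "Temp=Hot": (2, 2),        # E = 1.0
    "Temp=Mild": (4, 2),       # E ≈ 0.92
    "Temp=Cool": (3, 1),       # E ≈ 0.81
}
for name, (p, n) in branches.items():
    print(f"{name}: E = {entropy(p, n):.3f}")
```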
Example
• Splitting on Outlook: Outlook = Sunny gives E = 0.971, Outlook = Overcast gives E = 0, Outlook = Rain gives E = 0.971
• Outlook yields the highest information gain and is chosen as the root

Example
• Final tree: Outlook = Sunny → test Humidity (High → No, Normal → Yes); Outlook = Overcast → Yes; Outlook = Rain → test Wind (Strong → No, Weak → Yes)
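The learned tree can be written out directly as a classifier; a minimal sketch, where the function name `predict` and the dict-based example format are my own:

```python
def predict(example):
    """Classify a PlayTennis example with the learned tree:
    Outlook at the root; Sunny tests Humidity, Rain tests Wind."""
    if example["Outlook"] == "Overcast":
        return "Yes"
    if example["Outlook"] == "Sunny":
        return "No" if example["Humidity"] == "High" else "Yes"
    # Outlook == Rain
    return "No" if example["Wind"] == "Strong" else "Yes"
```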
Which Tree Should We Output?
• ID3 performs a greedy heuristic search through the space of decision trees
• It stops at the smallest acceptable tree. Why?
• Occam's razor: prefer the simplest hypothesis consistent with the data
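The greedy top-down search described above can be sketched as a short recursive procedure. This is my own simplified version of ID3 (categorical attributes only, no pruning, trees as nested dicts), not the exact algorithm from the lecture:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """E(S) = -sum_i p_i log2(p_i) over the class values in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Grow a tree greedily: pick the highest-gain attribute, recurse, never backtrack."""
    if len(set(labels)) == 1:          # pure node -> leaf with that label
        return labels[0]
    if not attrs:                      # attributes exhausted -> majority leaf
        return Counter(labels).most_common(1)[0][0]

    def gain(a):                       # G(S, a) = E(S) - sum_v |S_v|/|S| * E(S_v)
        g = entropy(labels)
        for v in {r[a] for r in rows}:
            sub = [y for r, y in zip(rows, labels) if r[a] == v]
            g -= len(sub) / len(labels) * entropy(sub)
        return g

    best = max(attrs, key=gain)        # greedy choice of the split attribute
    tree = {best: {}}
    for v in {r[best] for r in rows}:
        sub_rows = [r for r in rows if r[best] == v]
        sub_labels = [y for r, y in zip(rows, labels) if r[best] == v]
        tree[best][v] = id3(sub_rows, sub_labels, [a for a in attrs if a != best])
    return tree
```

Because each recursive call commits to the locally best split, the search stops at the first (small) consistent tree it finds rather than exploring the full hypothesis space.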