Decision Trees Boosting Example Problem
Consider the following data, where the Y label is whether or not the child goes out to play.
Entropy of the whole dataset (5 "yes" and 5 "no" examples):

H(Y) = −P(Y = yes) log2 P(Y = yes) − P(Y = no) log2 P(Y = no) = 1
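As a quick check, here is a minimal Python sketch of that binary entropy computation (the function name entropy and the explicit 5-yes/5-no counts are illustrative; the counts are read off the branch contents listed in the splits below):

```python
from math import log2

def entropy(p_yes: float) -> float:
    """Binary entropy H(Y) = -p*log2(p) - (1-p)*log2(1-p), with 0*log2(0) treated as 0."""
    p_no = 1.0 - p_yes
    return -sum(p * log2(p) for p in (p_yes, p_no) if p > 0.0)

# The root node holds 5 "yes" and 5 "no" examples, so P(yes) = 0.5 and H(Y) = 1.
print(entropy(5 / 10))  # -> 1.0
```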
Split the root on each candidate feature and compare the entropy of the children:

Temperature:
Weather: entropy of children = 0.6
Humidity: branches {Y, Y, Y, N, N, N, N} and {Y, N, Y}; entropy of children = 0.8651
Wind: STRONG: {Y, Y, N, N, N, N}, WEAK: {N, Y, Y, Y}; entropy of children = 0.8755

Weather gives the lowest entropy of children, i.e. the highest information gain, so the root splits on Weather. The "Sunny" and "Rainy" children still mix yes and no, so each is split again below.
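A sketch of the same root-level comparison in Python. The Wind branches are copied from the lists above; the Weather branches are reconstructed from the "Sunny" and "Rainy" nodes used later plus a third all-yes branch, whose label (OVERCAST) is an assumption since its name does not appear in these notes:

```python
from math import log2

def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n) for c in set(labels))

def children_entropy(branches):
    """Weighted average entropy of the child nodes produced by a split."""
    total = sum(len(b) for b in branches)
    return sum(len(b) / total * entropy(b) for b in branches)

root = ["Y"] * 5 + ["N"] * 5                      # 5 yes, 5 no -> H = 1

# Weather split: SUNNY / OVERCAST (assumed label) / RAINY
weather = [["Y", "N", "N"], ["Y", "Y", "Y"], ["Y", "N", "N", "N"]]
# Wind split: STRONG / WEAK, copied from the lists above
wind = [["Y", "Y", "N", "N", "N", "N"], ["N", "Y", "Y", "Y"]]

print(children_entropy(weather))                  # ~0.6000
print(children_entropy(wind))                     # ~0.8755
print(entropy(root) - children_entropy(weather))  # IG(Weather) ~ 0.4000
print(entropy(root) - children_entropy(wind))     # IG(Wind)    ~ 0.1245
```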
Temperature

Split on Temperature at each node:
"Sunny" node:  HOT: N, N    MILD: Y    COOL: -
"Rainy" node:  HOT: -    MILD: N, Y, N    COOL: N

Entropy of "Sunny" node = −((1/3) log2(1/3) + (2/3) log2(2/3)) = 0.9183
Entropy of children = 0
IG = 0.9183

Entropy of "Rainy" node = −((1/4) log2(1/4) + (3/4) log2(3/4)) = 0.8113
Entropy of children = −(3/4)((1/3) log2(1/3) + (2/3) log2(2/3)) + 0 = 0.6887
IG = 0.1226
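The same numbers can be reproduced with a small information-gain helper (a sketch: the branch contents are copied from the Temperature splits above, and info_gain is an illustrative helper, not code from the course):

```python
from math import log2

def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n) for c in set(labels))

def info_gain(parent, branches):
    """IG = entropy(parent) - weighted entropy of the child branches."""
    n = len(parent)
    return entropy(parent) - sum(len(b) / n * entropy(b) for b in branches)

# "Sunny" node: HOT -> {N, N}, MILD -> {Y}
print(info_gain(["Y", "N", "N"], [["N", "N"], ["Y"]]))            # ~0.9183
# "Rainy" node: MILD -> {N, Y, N}, COOL -> {N}
print(info_gain(["Y", "N", "N", "N"], [["N", "Y", "N"], ["N"]]))  # ~0.1226
```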
Humidity

Split on Humidity at each node:
"Sunny" node:  HIGH: N, N    NORMAL: Y
"Rainy" node:  HIGH: N, Y, N    NORMAL: N

Entropy of "Sunny" node = −((1/3) log2(1/3) + (2/3) log2(2/3)) = 0.9183
Entropy of children = 0
IG = 0.9183

Entropy of "Rainy" node = −((1/4) log2(1/4) + (3/4) log2(3/4)) = 0.8113
Entropy of children = −(3/4)((1/3) log2(1/3) + (2/3) log2(2/3)) + 0 = 0.6887
IG = 0.1226
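For the "Rainy" node, the Humidity calculation is the same arithmetic written out directly (a sketch that just transcribes the two entropy expressions above):

```python
from math import log2

# "Rainy" node split on Humidity: HIGH -> {N, Y, N}, NORMAL -> {N}
h_rainy    = -((1/4) * log2(1/4) + (3/4) * log2(3/4))          # 0.8113
h_children = (3/4) * -((1/3) * log2(1/3) + (2/3) * log2(2/3))  # 0.6887
print(h_rainy - h_children)                                    # IG ~ 0.1226
```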
Wind

Split on Wind at each node:
"Sunny" node:  STRONG: N, Y    WEAK: N
"Rainy" node:  STRONG: N, N, N    WEAK: Y

Entropy of "Sunny" node = −((1/3) log2(1/3) + (2/3) log2(2/3)) = 0.9183
Entropy of children = −(2/3)((1/2) log2(1/2) + (1/2) log2(1/2)) + 0 = 0.6667
IG = 0.2516

Entropy of "Rainy" node = −((1/4) log2(1/4) + (3/4) log2(3/4)) = 0.8113
Entropy of children = 0
IG = 0.8113
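And the Wind splits, again transcribing the expressions above into plain arithmetic (a sketch for checking the two IG values):

```python
from math import log2

# "Sunny" node split on Wind: STRONG -> {N, Y}, WEAK -> {N}
h_sunny    = -((1/3) * log2(1/3) + (2/3) * log2(2/3))  # 0.9183
h_children = (2/3) * 1.0 + (1/3) * 0.0                 # STRONG is a 50/50 mix (H = 1), WEAK is pure
print(h_sunny - h_children)                            # IG ~ 0.2516

# "Rainy" node split on Wind: STRONG -> {N, N, N}, WEAK -> {Y}; both children are pure
h_rainy = -((1/4) * log2(1/4) + (3/4) * log2(3/4))     # 0.8113
print(h_rainy - 0.0)                                   # IG ~ 0.8113
```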
Step 4: Choose feature for each node to split on!

"Sunny" node: Temperature and Humidity tie for the largest IG (0.9183); the tree splits on Humidity.
"Rainy" node: Wind has the largest IG (0.8113), so the tree splits on Wind.
Final Tree!

Weather
  SUNNY → Humidity
      HIGH: N, N
      NORMAL: Y
  RAINY → Wind
      STRONG: N, N, N
      WEAK: Y
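A minimal sketch of the finished tree as nested dictionaries, with a predict helper. The feature and value names follow the tree above; only the two branches shown there are encoded, and the sample query at the end is a hypothetical new example, not one of the training rows:

```python
# Nested-dict encoding of the final tree: each internal node is
# {feature: {value: subtree-or-leaf}}; a leaf is just the predicted label.
tree = {
    "Weather": {
        "SUNNY": {"Humidity": {"HIGH": "N", "NORMAL": "Y"}},
        "RAINY": {"Wind": {"STRONG": "N", "WEAK": "Y"}},
    }
}

def predict(node, example):
    """Walk the tree until a leaf (a plain label string) is reached."""
    while isinstance(node, dict):
        feature = next(iter(node))              # the feature tested at this node
        node = node[feature][example[feature]]  # follow the branch for this example
    return node

# Hypothetical query: a rainy day with weak wind.
print(predict(tree, {"Weather": "RAINY", "Wind": "WEAK"}))  # -> "Y"
```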
Boosting
(https://www.ccs.neu.edu/home/vip/teach/MLcourse/4_boosting/slides/boosting.pdf)