Decision Trees Example Problem
Consider the following data, where the Y label is whether or not the child goes out to play.
Day  Weather  Temperature  Humidity  Wind    Play?
1    Sunny    Hot          High      Weak    No
2    Cloudy   Hot          High      Weak    Yes
3    Sunny    Mild         Normal    Strong  Yes
4    Cloudy   Mild         High      Strong  Yes
5    Rainy    Mild         High      Strong  No
6    Rainy    Cool         Normal    Strong  No
7    Rainy    Mild         High      Weak    Yes
8    Sunny    Hot          High      Strong  No
9    Cloudy   Hot          Normal    Weak    Yes
10   Rainy    Mild         High      Strong  No
Step 1: Calculate the IG (information gain) for each attribute (feature)
Initial entropy:
$$
\begin{aligned}
H(Y) &= -\sum_{y} P(Y = y) \log_2 P(Y = y) \\
&= -P(Y = \text{yes}) \log_2 P(Y = \text{yes}) - P(Y = \text{no}) \log_2 P(Y = \text{no}) \\
&= -(0.5) \log_2 (0.5) - (0.5) \log_2 (0.5) \\
&= 1
\end{aligned}
$$
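The entropy above can be checked numerically. The following is a minimal sketch, assuming the Play? column is entered as a plain Python list; the helper name `entropy` is illustrative and not part of the original notes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    # Counter only reports labels that occur, so log2(0) is never evaluated.
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


# Play? column for days 1-10, copied from the table above.
play = ["No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No"]
print(entropy(play))  # 1.0
```

With five Yes and five No labels the two classes are equally likely, which is why the entropy reaches its maximum of 1 bit.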
Temperature:
  HOT:  N, Y, N, Y
  MILD: Y, Y, N, Y, N
  COOL: N
Total entropy of this division is:
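$$
\begin{aligned}
H(Y \mid temp) &= -\sum_{x} P(temp = x) \sum_{y} P(Y = y \mid temp = x) \log_2 P(Y = y \mid temp = x) \\
&= -\Big( P(temp = H) \sum_{y} P(Y = y \mid temp = H) \log_2 P(Y = y \mid temp = H) \\
&\qquad + P(temp = M) \sum_{y} P(Y = y \mid temp = M) \log_2 P(Y = y \mid temp = M) \\
&\qquad + P(temp = C) \sum_{y} P(Y = y \mid temp = C) \log_2 P(Y = y \mid temp = C) \Big) \\
&= -\Big( (0.4)\big(\tfrac{1}{2} \log_2 \tfrac{1}{2} + \tfrac{1}{2} \log_2 \tfrac{1}{2}\big) + (0.5)\big(\tfrac{3}{5} \log_2 \tfrac{3}{5} + \tfrac{2}{5} \log_2 \tfrac{2}{5}\big) \\
&\qquad + (0.1)\big(1 \log_2 1 + 0 \log_2 0\big) \Big) \\
&= (0.4)(1) + (0.5)(0.9710) + (0.1)(0) \\
&= 0.8855
\end{aligned}
$$
(using the convention $0 \log_2 0 = 0$)
IG(Y, temp) = 1 – 0.8855 = 0.1145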
Weather:
  SUNNY:  N, Y, N
  CLOUDY: Y, Y, Y
  RAINY:  N, N, Y, N
Total entropy of this division is:
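$$
\begin{aligned}
H(Y \mid weather) &= -\sum_{x} P(weather = x) \sum_{y} P(Y = y \mid weather = x) \log_2 P(Y = y \mid weather = x) \\
&= -\Big( P(weather = S) \sum_{y} P(Y = y \mid weather = S) \log_2 P(Y = y \mid weather = S) \\
&\qquad + P(weather = C) \sum_{y} P(Y = y \mid weather = C) \log_2 P(Y = y \mid weather = C) \\
&\qquad + P(weather = R) \sum_{y} P(Y = y \mid weather = R) \log_2 P(Y = y \mid weather = R) \Big) \\
&= -\Big( (0.3)\big(\tfrac{1}{3} \log_2 \tfrac{1}{3} + \tfrac{2}{3} \log_2 \tfrac{2}{3}\big) + (0.3)\big(1 \log_2 1 + 0 \log_2 0\big) \\
&\qquad + (0.4)\big(\tfrac{1}{4} \log_2 \tfrac{1}{4} + \tfrac{3}{4} \log_2 \tfrac{3}{4}\big) \Big) \\
&= 0.6
\end{aligned}
$$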
IG(Y, weather) = 1 – 0.6 = 0.4
Humidity:
  HIGH:   Y, Y, Y, N, N, N, N
  NORMAL: Y, N, Y
Total entropy of this division is:
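$$
\begin{aligned}
H(Y \mid hum) &= -\sum_{x} P(hum = x) \sum_{y} P(Y = y \mid hum = x) \log_2 P(Y = y \mid hum = x) \\
&= -\Big( P(hum = H) \sum_{y} P(Y = y \mid hum = H) \log_2 P(Y = y \mid hum = H) \\
&\qquad + P(hum = N) \sum_{y} P(Y = y \mid hum = N) \log_2 P(Y = y \mid hum = N) \Big) \\
&= -\Big( (0.7)\big(\tfrac{3}{7} \log_2 \tfrac{3}{7} + \tfrac{4}{7} \log_2 \tfrac{4}{7}\big) + (0.3)\big(\tfrac{2}{3} \log_2 \tfrac{2}{3} + \tfrac{1}{3} \log_2 \tfrac{1}{3}\big) \Big) \\
&= (0.7)(0.9852) + (0.3)(0.9183) \\
&= 0.9651
\end{aligned}
$$
IG(Y, hum) = 1 – 0.9651 = 0.0349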
Wind:
  STRONG: Y, Y, N, N, N, N
  WEAK:   N, Y, Y, Y
Total entropy of this division is:
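$$
\begin{aligned}
H(Y \mid wind) &= -\sum_{x} P(wind = x) \sum_{y} P(Y = y \mid wind = x) \log_2 P(Y = y \mid wind = x) \\
&= -\Big( P(wind = S) \sum_{y} P(Y = y \mid wind = S) \log_2 P(Y = y \mid wind = S) \\
&\qquad + P(wind = W) \sum_{y} P(Y = y \mid wind = W) \log_2 P(Y = y \mid wind = W) \Big) \\
&= -\Big( (0.6)\big(\tfrac{2}{6} \log_2 \tfrac{2}{6} + \tfrac{4}{6} \log_2 \tfrac{4}{6}\big) + (0.4)\big(\tfrac{1}{4} \log_2 \tfrac{1}{4} + \tfrac{3}{4} \log_2 \tfrac{3}{4}\big) \Big) \\
&= 0.8755
\end{aligned}
$$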
IG(Y, wind) = 1 – 0.8755 = 0.1245
Step 2: Choose which feature to split with!
IG(Y, wind) = 0.1245
IG(Y, hum) = 0.0349
IG(Y, weather) = 0.4
IG(Y, temp) = 0.1145
Weather has the largest information gain, so it is the feature to split on first.
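Steps 1 and 2 can be reproduced in a few lines of code. The sketch below assumes the table is stored as tuples with the label in the last position; the names `ROWS`, `FEATURES`, `entropy`, and `info_gain` are illustrative choices, not part of the original notes. It prints the information gain of every feature and the feature with the largest one.

```python
from collections import Counter, defaultdict
from math import log2

FEATURES = ["Weather", "Temperature", "Humidity", "Wind"]
# Days 1-10 from the table: (Weather, Temperature, Humidity, Wind, Play?)
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Cloudy", "Hot", "High", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Cloudy", "Mild", "High", "Strong", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
    ("Rainy", "Cool", "Normal", "Strong", "No"),
    ("Rainy", "Mild", "High", "Weak", "Yes"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Cloudy", "Hot", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, col):
    """Entropy of the labels minus the weighted entropy after splitting on `col`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])  # group labels by feature value
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


gains = {name: info_gain(ROWS, i) for i, name in enumerate(FEATURES)}
for name, gain in gains.items():
    print(f"IG(Y, {name}) = {gain:.4f}")
print("split on:", max(gains, key=gains.get))  # split on: Weather
```

Grouping the labels by feature value mirrors the hand calculation: each group contributes its own entropy, weighted by the fraction of rows that fall into it.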
Step 3: Repeat for each level (sad, I know)
Temperature
  SUNNY (N, Y, N):     HOT: N, N   MILD: Y         COOL: -
  CLOUDY (Y, Y, Y):    all Y, no split needed
  RAINY (N, N, Y, N):  HOT: -      MILD: N, Y, N   COOL: N
Entropy of “Sunny” node = $-\left(\frac{1}{3} \log_2 \frac{1}{3} + \frac{2}{3} \log_2 \frac{2}{3}\right) = 0.9183$
Entropy of its children = 0
IG = 0.9183
Entropy of “Rainy” node = $-\left(\frac{1}{4} \log_2 \frac{1}{4} + \frac{3}{4} \log_2 \frac{3}{4}\right) = 0.8113$
Entropy of children = $-\frac{3}{4}\left(\frac{1}{3} \log_2 \frac{1}{3} + \frac{2}{3} \log_2 \frac{2}{3}\right) + 0 = 0.6887$
IG = 0.1226
Humidity
  SUNNY (N, Y, N):     HIGH: N, N      NORMAL: Y
  CLOUDY (Y, Y, Y):    all Y, no split needed
  RAINY (N, N, Y, N):  HIGH: N, Y, N   NORMAL: N
Entropy of “Sunny” node = $-\left(\frac{1}{3} \log_2 \frac{1}{3} + \frac{2}{3} \log_2 \frac{2}{3}\right) = 0.9183$
Entropy of its children = 0
IG = 0.9183
Entropy of “Rainy” node = $-\left(\frac{1}{4} \log_2 \frac{1}{4} + \frac{3}{4} \log_2 \frac{3}{4}\right) = 0.8113$
Entropy of children = $-\frac{3}{4}\left(\frac{1}{3} \log_2 \frac{1}{3} + \frac{2}{3} \log_2 \frac{2}{3}\right) + 0 = 0.6887$
IG = 0.1226
Wind
  SUNNY (N, Y, N):     STRONG: N, Y      WEAK: N
  CLOUDY (Y, Y, Y):    all Y, no split needed
  RAINY (N, N, Y, N):  STRONG: N, N, N   WEAK: Y
Entropy of “Sunny” node = $-\left(\frac{1}{3} \log_2 \frac{1}{3} + \frac{2}{3} \log_2 \frac{2}{3}\right) = 0.9183$
Entropy of its children = $-\frac{2}{3}\left(\frac{1}{2} \log_2 \frac{1}{2} + \frac{1}{2} \log_2 \frac{1}{2}\right) + 0 = 0.6667$
IG = 0.2516
Entropy of “Rainy” node = $-\left(\frac{1}{4} \log_2 \frac{1}{4} + \frac{3}{4} \log_2 \frac{3}{4}\right) = 0.8113$
Entropy of children = 0
IG = 0.8113
Step 4: Choose feature for each node to split on!
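“Sunny node”:
IG(Y, temp) = IG(Y, hum) = 0.9183
IG(Y, wind) = 0.2516
Temperature and Humidity tie here, so either one separates the Sunny examples perfectly.
“Rainy node”:
IG(Y, temp) = IG(Y, hum) = 0.1226
IG(Y, wind) = 0.8113
Wind has the largest gain, so the Rainy node splits on Wind.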
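The per-node gains in Steps 3 and 4 come from the same calculation applied to only the rows that reach each node. The following is a rough sketch under that reading; the names `BRANCHES`, `entropy`, and `info_gain` are illustrative, and the rows are the Sunny days (1, 3, 8) and Rainy days (5, 6, 7, 10) from the table with the Weather column dropped.

```python
from collections import Counter, defaultdict
from math import log2

FEATURES = ["Temperature", "Humidity", "Wind"]
# Rows reaching each Weather branch: (Temperature, Humidity, Wind, Play?)
BRANCHES = {
    "Sunny": [
        ("Hot", "High", "Weak", "No"),
        ("Mild", "Normal", "Strong", "Yes"),
        ("Hot", "High", "Strong", "No"),
    ],
    "Rainy": [
        ("Mild", "High", "Strong", "No"),
        ("Cool", "Normal", "Strong", "No"),
        ("Mild", "High", "Weak", "Yes"),
        ("Mild", "High", "Strong", "No"),
    ],
}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, col):
    """Entropy of the node minus the weighted entropy of its children."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


all_gains = {}
for branch, rows in BRANCHES.items():
    all_gains[branch] = {name: info_gain(rows, i) for i, name in enumerate(FEATURES)}
    print(branch, {name: round(g, 4) for name, g in all_gains[branch].items()})
# Sunny {'Temperature': 0.9183, 'Humidity': 0.9183, 'Wind': 0.2516}
# Rainy {'Temperature': 0.1226, 'Humidity': 0.1226, 'Wind': 0.8113}
```

The Cloudy branch is left out because all of its examples are Yes, so its entropy is already 0 and no split can improve it.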
Final Tree!
Weather
  SUNNY (N, Y, N) -> Humidity
    HIGH:   N, N
    NORMAL: Y
  CLOUDY (Y, Y, Y)
  RAINY (N, N, Y, N) -> Wind
    STRONG: N, N, N
    WEAK:   Y
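To see the final tree in action, it can be written as a nested dictionary and used to classify examples. This is only an illustrative encoding: the `TREE` layout, the `predict` helper, and the dictionary keys are assumptions made for this sketch, not something the notes prescribe.

```python
COLUMNS = ["Weather", "Temperature", "Humidity", "Wind", "Play?"]
# Days 1-10 from the table at the top of the example.
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Cloudy", "Hot", "High", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Cloudy", "Mild", "High", "Strong", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
    ("Rainy", "Cool", "Normal", "Strong", "No"),
    ("Rainy", "Mild", "High", "Weak", "Yes"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Cloudy", "Hot", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
]

# Internal nodes are dicts; leaves are the predicted label.
TREE = {
    "feature": "Weather",
    "branches": {
        "Sunny": {"feature": "Humidity", "branches": {"High": "No", "Normal": "Yes"}},
        "Cloudy": "Yes",
        "Rainy": {"feature": "Wind", "branches": {"Strong": "No", "Weak": "Yes"}},
    },
}


def predict(tree, example):
    """Follow the branch matching the example's feature value until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        node = node["branches"][example[node["feature"]]]
    return node


examples = [dict(zip(COLUMNS, row)) for row in ROWS]
correct = sum(predict(TREE, ex) == ex["Play?"] for ex in examples)
print(f"{correct}/{len(examples)} training examples classified correctly")
# 10/10 training examples classified correctly
```

Every leaf of the tree is pure, so all ten training examples are classified correctly; this says nothing about how the tree would perform on unseen days.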
Boosting
(https://www.ccs.neu.edu/home/vip/teach/MLcourse/4_boosting/slides/boosting.pdf)