Decision Tree
Basic algorithm (a greedy algorithm)
• Tree is constructed in a top-down recursive divide-and-conquer manner
• At start, all the training examples are at the root
• Attributes are categorical (if continuous-valued, they are discretized in advance)
• Examples are partitioned recursively based on selected attributes
• Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain); a minimal sketch of the whole loop follows
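This greedy loop is easy to state in code. Below is a minimal Python sketch of the same ID3-style recursion; the helper names (`entropy`, `info_gain`, `build_tree`) and the rows-as-dicts representation are my own illustrative choices, not a library API.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Parent entropy minus the weighted entropy of the partitions on attr."""
    gain = entropy([r[target] for r in rows])
    for value in set(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

def build_tree(rows, attributes, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attributes:      # pure node, or no tests left
        return Counter(labels).most_common(1)[0][0]  # return the majority label
    attr = max(attributes, key=lambda a: info_gain(rows, a, target))
    branches = {}
    for value in set(r[attr] for r in rows):         # partition on the chosen test
        subset = [r for r in rows if r[attr] == value]
        branches[value] = build_tree(subset, attributes - {attr}, target)
    return {attr: branches}
```

Called on the 14 training examples below with `attributes = {"Outlook", "Temp", "Humidity", "Windy"}`, this recursion reproduces the tree derived step by step in the rest of the section.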
Decision Tree: Information Gain
Sl. No. Outlook Temp Humidity Windy Play Golf
1 Rainy Hot High Weak No
2 Rainy Hot High Strong No
3 Overcast Hot High Weak Yes
4 Sunny Mild High Weak Yes
5 Sunny Cool Normal Weak Yes
6 Sunny Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Rainy Mild High Weak No
9 Rainy Cool Normal Weak Yes
10 Sunny Mild Normal Weak Yes
11 Rainy Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Sunny Mild High Strong No
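To follow the computations along in code, the table can be transcribed as a small Python list; as a first check, the entropy of the class column (9 Yes, 5 No) comes out at the 0.940 used throughout the section. A sketch; the tuple layout is a convenience of mine:

```python
from math import log2

# (Outlook, Temp, Humidity, Windy, Play Golf) for the 14 rows above.
data = [
    ("Rainy",    "Hot",  "High",   "Weak",   "No"),
    ("Rainy",    "Hot",  "High",   "Strong", "No"),
    ("Overcast", "Hot",  "High",   "Weak",   "Yes"),
    ("Sunny",    "Mild", "High",   "Weak",   "Yes"),
    ("Sunny",    "Cool", "Normal", "Weak",   "Yes"),
    ("Sunny",    "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Rainy",    "Mild", "High",   "Weak",   "No"),
    ("Rainy",    "Cool", "Normal", "Weak",   "Yes"),
    ("Sunny",    "Mild", "Normal", "Weak",   "Yes"),
    ("Rainy",    "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High",   "Strong", "Yes"),
    ("Overcast", "Hot",  "Normal", "Weak",   "Yes"),
    ("Sunny",    "Mild", "High",   "Strong", "No"),
]

yes = sum(1 for row in data if row[-1] == "Yes")   # 9
no = len(data) - yes                               # 5
H = -(yes / 14) * log2(yes / 14) - (no / 14) * log2(no / 14)
print(f"{H:.3f}")  # 0.940
```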
Decision Tree: Information Gain
Entropy of the class variable: $Entropy(S) = 0.940$

Wind     Yes  No
Weak      6    2
Strong    3    3
Total     9    5

$P(S_{Weak}) = \frac{\text{No. of Weak}}{\text{Total}} = \frac{8}{14}$, with $Entropy(S_{Weak}) = 0.811$
$P(S_{Strong}) = \frac{\text{No. of Strong}}{\text{Total}} = \frac{6}{14}$, with $Entropy(S_{Strong}) = 1$

$IG(S, Wind) = Entropy(S) - \sum_{i=1}^{n} p(x_i)\,Entropy(x_i)$

$IG(S, Wind) = 0.940 - \left(\tfrac{8}{14}\right)(0.811) - \left(\tfrac{6}{14}\right)(1) = 0.048$
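The same figure can be reproduced from the contingency counts alone; a minimal sketch (the `entropy` helper is mine):

```python
from math import log2

def entropy(counts):
    """Entropy of a (Yes, No) count pair; empty classes contribute nothing."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

parent = (9, 5)                               # class counts over all 14 rows
wind = {"Weak": (6, 2), "Strong": (3, 3)}     # counts from the table above
n = sum(parent)
ig = entropy(parent) - sum(sum(c) / n * entropy(c) for c in wind.values())
print(f"{ig:.3f}")  # 0.048
```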
Outlook    Yes  No
Sunny       3    2
Overcast    4    0
Rainy       2    3
Total       9    5

$P(S_{Sunny}) = \frac{\text{No. of Sunny}}{\text{Total}} = \frac{5}{14}$, $P(S_{Overcast}) = \frac{4}{14}$, $P(S_{Rainy}) = \frac{5}{14}$

$Entropy(S_{Sunny}) = -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} = 0.970951$
$Entropy(S_{Overcast}) = 0$ (all four Overcast examples are Yes)
$Entropy(S_{Rainy}) = 0.970951$

$IG(S, Outlook) = 0.940 - \left(\tfrac{5}{14}\right)(0.970951) - \left(\tfrac{4}{14}\right)(0) - \left(\tfrac{5}{14}\right)(0.970951) = 0.246$
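The same check for Outlook (exact arithmetic gives 0.247; the slide's 0.246 comes from carrying the rounded 0.940):

```python
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

outlook = {"Sunny": (3, 2), "Overcast": (4, 0), "Rainy": (2, 3)}
ig = entropy((9, 5)) - sum(sum(c) / 14 * entropy(c) for c in outlook.values())
print(f"{ig:.3f}")  # 0.247 (0.246 above, which uses Entropy(S) = 0.940)
```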
Temperature  Yes  No
Hot           2    2
Mild          4    2
Cool          3    1
Total         9    5

$P(S_{Hot}) = \frac{\text{No. of Hot}}{\text{Total}} = \frac{4}{14}$, $P(S_{Mild}) = \frac{6}{14}$, $P(S_{Cool}) = \frac{4}{14}$

$Entropy(S_{Hot}) = -\tfrac{2}{4}\log_2\tfrac{2}{4} - \tfrac{2}{4}\log_2\tfrac{2}{4} = 1$
$Entropy(S_{Mild}) = 0.918296$, $Entropy(S_{Cool}) = 0.811278$

$IG(S, Temp) = 0.940 - \left(\tfrac{4}{14}\right)(1) - \left(\tfrac{6}{14}\right)(0.918296) - \left(\tfrac{4}{14}\right)(0.811278) = 0.028937$
Humidity  Yes  No
High       3    4
Normal     6    1
Total      9    5

$P(S_{High}) = \frac{\text{No. of High}}{\text{Total}} = \frac{7}{14}$, $P(S_{Normal}) = \frac{7}{14}$
$Entropy(S_{High}) = 0.985228$, $Entropy(S_{Normal}) = 0.591673$

$IG(S, Humidity) = 0.940 - \left(\tfrac{7}{14}\right)(0.985228) - \left(\tfrac{7}{14}\right)(0.591673) = 0.151$
Decision Tree: Information Gain

$IG(S, Outlook) = 0.246$
$IG(S, Temp) = 0.029$
$IG(S, Wind) = 0.048$
$IG(S, Humidity) = 0.151$

Outlook has the highest information gain, so it is selected as the root node:

Outlook
├─ Sunny: (split further)
├─ Overcast: Yes
└─ Rainy: (split further)
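Putting the four computations together: a short sketch that transcribes the count tables above and picks the root by maximum information gain:

```python
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

def info_gain(splits, parent=(9, 5)):
    n = sum(parent)
    return entropy(parent) - sum(sum(c) / n * entropy(c) for c in splits.values())

splits = {
    "Outlook":  {"Sunny": (3, 2), "Overcast": (4, 0), "Rainy": (2, 3)},
    "Temp":     {"Hot": (2, 2), "Mild": (4, 2), "Cool": (3, 1)},
    "Humidity": {"High": (3, 4), "Normal": (6, 1)},
    "Windy":    {"Weak": (6, 2), "Strong": (3, 3)},
}
gains = {a: round(info_gain(s), 3) for a, s in splits.items()}
print(max(gains, key=gains.get), gains)
# Outlook {'Outlook': 0.247, 'Temp': 0.029, 'Humidity': 0.152, 'Windy': 0.048}
```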
Decision Tree: Information Gain (Sunny Branch)

Entropy of the Sunny subset (3 Yes, 2 No): $Entropy(S_{Sunny}) = 0.970$

Wind (Sunny)  Yes  No
Weak           3    0
Strong         0    2
Total          3    2

$P(S_{Weak}) = \frac{3}{5}$, with $Entropy(S_{Weak}) = 0$; $P(S_{Strong}) = \frac{\text{No. of Strong}}{\text{Total}} = \frac{2}{5}$, with $Entropy(S_{Strong}) = 0$

$IG(Sunny, Wind) = Entropy(S_{Sunny}) - \sum_{i=1}^{n} p(x_i)\,Entropy(x_i) = 0.970 - \tfrac{3}{5}(0) - \tfrac{2}{5}(0) = 0.970$

Temperature (Sunny)  Yes  No
Mild                  2    1
Cool                  1    1
Total                 3    2

$P(S_{Mild}) = \frac{3}{5}$, $P(S_{Cool}) = \frac{2}{5}$; $Entropy(S_{Mild}) = 0.918296$, $Entropy(S_{Cool}) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1$

$IG(Sunny, Temp) = 0.970 - \tfrac{3}{5}(0.918296) - \tfrac{2}{5}(1) = 0.020$

Humidity (Sunny)  Yes  No
High               1    1
Normal             2    1
Total              3    2

$Entropy(S_{High}) = 1$, $Entropy(S_{Normal}) = 0.918296$

$IG(Sunny, Humidity) = 0.970 - \tfrac{2}{5}(1) - \tfrac{3}{5}(0.918296) = 0.020$
Decision Tree: Information Gain (Sunny Branch)

$IG(Sunny, Wind) = 0.970$
$IG(Sunny, Temp) = 0.020$
$IG(Sunny, Humidity) = 0.020$

Wind has the highest information gain among the attributes available on the Sunny branch, so it becomes the decision node there:

Outlook
├─ Sunny: Windy (Weak → Yes, Strong → No)
├─ Overcast: Yes
└─ Rainy: (split further)
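The branch choice can be verified the same way, using only class counts inside the five Sunny rows (helpers repeated from the earlier sketch):

```python
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

def info_gain(splits, parent):
    n = sum(parent)
    return entropy(parent) - sum(sum(c) / n * entropy(c) for c in splits.values())

sunny = (3, 2)  # class counts within the Sunny subset
print(f"{info_gain({'Weak': (3, 0), 'Strong': (0, 2)}, sunny):.3f}")  # 0.971 Wind
print(f"{info_gain({'Mild': (2, 1), 'Cool': (1, 1)}, sunny):.3f}")    # 0.020 Temp
print(f"{info_gain({'High': (1, 1), 'Normal': (2, 1)}, sunny):.3f}")  # 0.020 Humidity
```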
Decision Tree: Information Gain (Rainy Branch)

Entropy of the Rainy subset (2 Yes, 3 No): $Entropy(S_{Rainy}) = 0.970$

Wind (Rainy)  Yes  No
Weak           1    2
Strong         1    1
Total          2    3

$P(S_{Weak}) = \frac{\text{No. of Weak}}{\text{Total}} = \frac{3}{5}$, with $Entropy(S_{Weak}) = 0.918296$
$P(S_{Strong}) = \frac{\text{No. of Strong}}{\text{Total}} = \frac{2}{5}$, with $Entropy(S_{Strong}) = 1$

$IG(Rainy, Wind) = Entropy(S_{Rainy}) - \sum_{i=1}^{n} p(x_i)\,Entropy(x_i) = 0.970 - \tfrac{3}{5}(0.918296) - \tfrac{2}{5}(1) = 0.020$

Temperature (Rainy)  Yes  No
Hot                   0    2
Mild                  1    1
Cool                  1    0
Total                 2    3

$P(S_{Hot}) = \frac{2}{5}$, $P(S_{Mild}) = \frac{2}{5}$, $P(S_{Cool}) = \frac{1}{5}$
$Entropy(S_{Hot}) = 0$, $Entropy(S_{Mild}) = 1$, $Entropy(S_{Cool}) = -\tfrac{1}{1}\log_2\tfrac{1}{1} = 0$

$IG(Rainy, Temp) = 0.970 - \tfrac{2}{5}(0) - \tfrac{2}{5}(1) - \tfrac{1}{5}(0) = 0.570$
Humidity (Rainy)  Yes  No
High               0    3
Normal             2    0
Total              2    3

$P(S_{High}) = \frac{\text{No. of High}}{\text{Total}} = \frac{3}{5}$, with $Entropy(S_{High}) = 0$
$P(S_{Normal}) = \frac{\text{No. of Normal}}{\text{Total}} = \frac{2}{5}$, with $Entropy(S_{Normal}) = 0$

$IG(Rainy, Humidity) = 0.970 - \tfrac{3}{5}(0) - \tfrac{2}{5}(0) = 0.970$

Humidity has the highest information gain on the Rainy branch, so it becomes the decision node there.
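And the same check for the Rainy branch (helpers as above); Humidity's pure children give it the full 0.971:

```python
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

def info_gain(splits, parent):
    n = sum(parent)
    return entropy(parent) - sum(sum(c) / n * entropy(c) for c in splits.values())

rainy = (2, 3)  # class counts within the Rainy subset
print(f"{info_gain({'Weak': (1, 2), 'Strong': (1, 1)}, rainy):.3f}")                # 0.020 Wind
print(f"{info_gain({'Hot': (0, 2), 'Mild': (1, 1), 'Cool': (1, 0)}, rainy):.3f}")   # 0.571 Temp
print(f"{info_gain({'High': (0, 3), 'Normal': (2, 0)}, rainy):.3f}")                # 0.971 Humidity
```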
The completed tree:

Outlook
├─ Sunny: Windy (Weak → Yes, Strong → No)
├─ Overcast: Yes
└─ Rainy: Humidity (Normal → Yes, High → No)
Decision Tree: Gini Index
Gini Index
• Faster to compute compared to other impurity measures like entropy
• More sensitive to changes in class probabilities, which can be beneficial for certain datasets
• Can be biased towards attributes with more categories, which might not always be desirable
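These properties are easy to see with a minimal impurity helper (a sketch, not a library call):

```python
def gini(counts):
    """Gini impurity 1 - sum(p_i^2) for a tuple of class counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

print(f"{gini((9, 5)):.4f}")  # 0.4592: impurity of the class variable (0.4591 below)
print(gini((4, 0)))           # 0.0: a pure node, e.g. the Overcast partition
print(gini((7, 7)))           # 0.5: the maximum for two classes
```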
Gini of the class variable: $Gini(S) = 1 - \left(\tfrac{9}{14}\right)^2 - \left(\tfrac{5}{14}\right)^2 = 0.4591$

Outlook   Yes  No  Count
Sunny      3    2    5
Overcast   4    0    4
Rainy      2    3    5
Total      9    5   14

$G(S, Outlook) = 1 - \left(\tfrac{5}{14}\right)\left(1 - \left(\tfrac{3}{5}\right)^2 - \left(\tfrac{2}{5}\right)^2\right) - \left(\tfrac{4}{14}\right)\left(1 - \left(\tfrac{4}{4}\right)^2 - \left(\tfrac{0}{4}\right)^2\right) - \left(\tfrac{5}{14}\right)\left(1 - \left(\tfrac{2}{5}\right)^2 - \left(\tfrac{3}{5}\right)^2\right) = 0.657143$

Here $G(S, A)$ denotes one minus the weighted Gini impurity of the child nodes, so the Gini Gain $Gini(S) - G(S, A)$ is most negative for the best split, the one whose children have the lowest weighted impurity.

Temperature  Yes  No  Count
Hot           2    2    4
Mild          4    2    6
Cool          3    1    4
Total         9    5   14

$G(S, Temp) = 1 - \left(\tfrac{4}{14}\right)\left(1 - \left(\tfrac{2}{4}\right)^2 - \left(\tfrac{2}{4}\right)^2\right) - \left(\tfrac{6}{14}\right)\left(1 - \left(\tfrac{4}{6}\right)^2 - \left(\tfrac{2}{6}\right)^2\right) - \left(\tfrac{4}{14}\right)\left(1 - \left(\tfrac{3}{4}\right)^2 - \left(\tfrac{1}{4}\right)^2\right) = 0.559524$

$Gini\ Gain(S, Temp) = 0.4591 - 0.559524 = -0.10042$
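As a numeric check, the sketch below computes the weighted child impurity directly; the $G(S, A)$ above is one minus that quantity (tiny differences come from the truncated 0.4591):

```python
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def weighted_child_gini(splits):
    n = sum(sum(c) for c in splits.values())
    return sum(sum(c) / n * gini(c) for c in splits.values())

outlook = {"Sunny": (3, 2), "Overcast": (4, 0), "Rainy": (2, 3)}
temp = {"Hot": (2, 2), "Mild": (4, 2), "Cool": (3, 1)}

w = weighted_child_gini(outlook)
print(f"{1 - w:.6f}")                          # 0.657143, the G(S, Outlook) above
print(f"{gini((9, 5)) - (1 - w):.5f}")         # -0.19796 (-0.19804 with 0.4591)
print(f"{1 - weighted_child_gini(temp):.6f}")  # 0.559524, the G(S, Temp) above
```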
Decision Tree: Gini Index

Humidity  Yes  No  Count
High       3    4    7
Normal     6    1    7
Total      9    5   14

$G(S, Humidity) = 1 - \left(\tfrac{7}{14}\right)\left(1 - \left(\tfrac{3}{7}\right)^2 - \left(\tfrac{4}{7}\right)^2\right) - \left(\tfrac{7}{14}\right)\left(1 - \left(\tfrac{6}{7}\right)^2 - \left(\tfrac{1}{7}\right)^2\right) = 0.632653$

$Gini\ Gain(S, Humidity) = 0.4591 - 0.632653 = -0.17355$
Decision Tree: Gini Index

Wind    Yes  No  Count
Weak     6    2    8
Strong   3    3    6
Total    9    5   14

$G(S, Windy) = 1 - \left(\tfrac{8}{14}\right)\left(1 - \left(\tfrac{6}{8}\right)^2 - \left(\tfrac{2}{8}\right)^2\right) - \left(\tfrac{6}{14}\right)\left(1 - \left(\tfrac{3}{6}\right)^2 - \left(\tfrac{3}{6}\right)^2\right) = 0.571429$

$Gini\ Gain(S, Windy) = 0.4591 - 0.571429 = -0.11233$
Decision Tree: Gini Gain

$Gini\ Gain(S, Outlook) = 0.4591 - 0.657143 = -0.19804$
$Gini\ Gain(S, Temp) = 0.4591 - 0.559524 = -0.10042$
$Gini\ Gain(S, Humidity) = 0.4591 - 0.632653 = -0.17355$
$Gini\ Gain(S, Windy) = 0.4591 - 0.571429 = -0.11233$

Outlook has the most negative Gini Gain, i.e. the lowest weighted Gini impurity in its children, so it is selected as the root node:

Outlook
├─ Sunny: (split further)
├─ Overcast: Yes
└─ Rainy: (split further)
Decision Tree: Gini Index (Sunny Branch)

$Gini(S_{Sunny}) = 1 - \left(\tfrac{3}{5}\right)^2 - \left(\tfrac{2}{5}\right)^2 = 0.48$

Temperature (Sunny)  Yes  No  Count
Mild                  2    1    3
Cool                  1    1    2
Total                 3    2    5

$G(Sunny, Temp) = 1 - \left(\tfrac{3}{5}\right)\left(1 - \left(\tfrac{2}{3}\right)^2 - \left(\tfrac{1}{3}\right)^2\right) - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2\right) = 0.5333$

$Gini\ Gain(Sunny, Temp) = 0.48 - 0.5333 = -0.0533$
Decision Tree: Gini Index (Sunny Branch)

Humidity (Sunny)  Yes  No  Count
High               1    1    2
Normal             2    1    3
Total              3    2    5

$G(Sunny, Humidity) = 1 - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2\right) - \left(\tfrac{3}{5}\right)\left(1 - \left(\tfrac{2}{3}\right)^2 - \left(\tfrac{1}{3}\right)^2\right) = 0.5333$

$Gini\ Gain(Sunny, Humidity) = 0.48 - 0.5333 = -0.0533$
Decision Tree: Gini Index (Sunny Branch)

Wind (Sunny)  Yes  No  Count
Weak           3    0    3
Strong         0    2    2
Total          3    2    5

$G(Sunny, Windy) = 1 - \left(\tfrac{3}{5}\right)\left(1 - \left(\tfrac{3}{3}\right)^2 - \left(\tfrac{0}{3}\right)^2\right) - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{0}{2}\right)^2 - \left(\tfrac{2}{2}\right)^2\right) = 1$

$Gini\ Gain(Sunny, Windy) = 0.48 - 1 = -0.52$

Windy gives the most negative Gini Gain (both of its children are pure, so their weighted Gini impurity is lowest), so it is chosen for the Sunny branch:

Outlook
├─ Sunny: Windy (Weak → Yes, Strong → No)
├─ Overcast: Yes
└─ Rainy: (split further)
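In code, picking the attribute whose children have the lowest weighted Gini impurity reproduces the choice of Windy (helpers repeated from the sketch above):

```python
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def weighted_child_gini(splits):
    n = sum(sum(c) for c in splits.values())
    return sum(sum(c) / n * gini(c) for c in splits.values())

candidates = {
    "Temp":     {"Mild": (2, 1), "Cool": (1, 1)},
    "Humidity": {"High": (1, 1), "Normal": (2, 1)},
    "Windy":    {"Weak": (3, 0), "Strong": (0, 2)},
}
impurity = {a: round(weighted_child_gini(s), 4) for a, s in candidates.items()}
print(min(impurity, key=impurity.get), impurity)
# Windy {'Temp': 0.4667, 'Humidity': 0.4667, 'Windy': 0.0}
```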
Decision Tree: Gini Index (Rainy Branch)

$Gini(S_{Rainy}) = 1 - \left(\tfrac{2}{5}\right)^2 - \left(\tfrac{3}{5}\right)^2 = 0.48$

Temperature (Rainy)  Yes  No  Count
Hot                   0    2    2
Mild                  1    1    2
Cool                  1    0    1
Total                 2    3    5

$G(Rainy, Temp) = 1 - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{0}{2}\right)^2 - \left(\tfrac{2}{2}\right)^2\right) - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2\right) - \left(\tfrac{1}{5}\right)\left(1 - \left(\tfrac{1}{1}\right)^2 - \left(\tfrac{0}{1}\right)^2\right) = 0.8$

$Gini\ Gain(Rainy, Temp) = 0.48 - 0.8 = -0.32$
Decision Tree: Gini Index (Rainy Branch)

Humidity (Rainy)  Yes  No  Count
High               0    3    3
Normal             2    0    2
Total              2    3    5

$G(Rainy, Humidity) = 1 - \left(\tfrac{3}{5}\right)\left(1 - \left(\tfrac{0}{3}\right)^2 - \left(\tfrac{3}{3}\right)^2\right) - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{2}{2}\right)^2 - \left(\tfrac{0}{2}\right)^2\right) = 1$

$Gini\ Gain(Rainy, Humidity) = 0.48 - 1 = -0.52$
Decision Tree: Gini Index (Rainy Branch)

Wind (Rainy)  Yes  No  Count
Weak           1    2    3
Strong         1    1    2
Total          2    3    5

$G(Rainy, Wind) = 1 - \left(\tfrac{3}{5}\right)\left(1 - \left(\tfrac{1}{3}\right)^2 - \left(\tfrac{2}{3}\right)^2\right) - \left(\tfrac{2}{5}\right)\left(1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2\right) = 0.5333$

$Gini\ Gain(Rainy, Wind) = 0.48 - 0.5333 = -0.0533$
Decision Tree: Gini Gain (Rainy Branch)

$Gini\ Gain(Rainy, Temp) = -0.32$
$Gini\ Gain(Rainy, Humidity) = 0.48 - 1 = -0.52$
$Gini\ Gain(Rainy, Wind) = -0.0533$

Humidity has the most negative Gini Gain (both of its children are pure, the lowest weighted Gini impurity), so it is chosen for the Rainy branch. The completed tree:

Outlook
├─ Sunny: Windy (Weak → Yes, Strong → No)
├─ Overcast: Yes
└─ Rainy: Humidity (Normal → Yes, High → No)
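The finished tree is small enough to state as a plain classification function; a sketch (the function and argument names are mine):

```python
def predict(outlook, windy=None, humidity=None):
    """Classify Play Golf with the tree learned above (Outlook at the root)."""
    if outlook == "Overcast":
        return "Yes"                                    # pure leaf
    if outlook == "Sunny":
        return "Yes" if windy == "Weak" else "No"       # Sunny branch tests Windy
    return "Yes" if humidity == "Normal" else "No"      # Rainy branch tests Humidity

# Spot checks against rows 1 and 10 of the training table.
print(predict("Rainy", windy="Weak", humidity="High"))    # No
print(predict("Sunny", windy="Weak", humidity="Normal"))  # Yes
```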