
Decision Tree

Entropy
• Entropy is a measure of the uncertainty of a random variable; it
characterizes the impurity of an arbitrary collection of examples.
The higher the entropy, the higher the information content.

H(S) = − Σ p(x) log₂ p(x)

H(S) = measure of the amount of uncertainty in the data S


e.g. Toss of a coin
• P(H) = P(T) = 0.5
• The outcome is random, so the entropy is 1
• P(H) = 1, P(T) = 0
• No randomness, so the entropy is 0
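
As a quick sanity check, here is a minimal Python sketch of the entropy formula above, applied to the two coin-toss cases (the helper name `entropy` is mine, not from the slides):

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)), skipping zero-probability terms."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # fair coin, outcome fully random -> 1.0
print(entropy([1.0, 0.0]))  # coin that always lands heads, no randomness -> 0.0
```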

• Information gain
Information gain, also known as Kullback-Leibler divergence and denoted IG(S, A), is the effective change in entropy after deciding on a particular attribute A.

IG(S, A) = H(S) − H(S, A)
         = H(S) − Σ P(x) × H(Sₓ)

where x ranges over the possible values of attribute A.
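
A small Python sketch of this weighted-entropy form of information gain (the helper names `entropy_of_labels` and `information_gain`, and the dict-per-row data representation, are illustrative choices, not from the slides):

```python
from collections import Counter
import math

def entropy_of_labels(labels):
    """Entropy (in bits) of a list of class labels, e.g. ['Yes', 'No', 'Yes']."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """IG(S, A) = H(S) - sum over values x of A of P(x) * H(S_x).

    rows   : list of dicts, e.g. {"Outlook": "Sunny", "Wind": "Weak", ...}
    labels : parallel list of class labels ("Yes"/"No")
    """
    total = len(labels)
    groups = {}  # class labels grouped by the value the attribute takes
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    weighted = sum(len(g) / total * entropy_of_labels(g) for g in groups.values())
    return entropy_of_labels(labels) - weighted
```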
Dataset
Day Outlook Temperature Humidity Wind Play Golf
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
1. Play(Yes) = 9, Play(No) = 5, Total = 14
Entropy(S) = H(S) = − Σ P(x) log₂ P(x)
           = − (9/14) log₂(9/14) − (5/14) log₂(5/14) = 0.94
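
The same number can be reproduced directly in Python (using base-2 logarithms, as above):

```python
import math

p_yes, p_no = 9 / 14, 5 / 14
h_s = -(p_yes * math.log2(p_yes) + p_no * math.log2(p_no))
print(round(h_s, 3))  # 0.94
```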

2. Compute the information gain of every attribute and pick the one with the highest gain.

IG(S, Wind) = H(S) − Σ P(x) × H(Sₓ)
where x ranges over the possible values of the attribute Wind.
Total = 14, Wind(Weak) = 8, Wind(Strong) = 6
P(S_weak) = 8/14, P(S_strong) = 6/14
Out of the 8 Weak examples, 6 are Yes and 2 are No, so
Entropy(S_weak) = − (6/8) log₂(6/8) − (2/8) log₂(2/8) = 0.811
Out of the 6 Strong examples, 3 are Yes and 3 are No, so
Entropy(S_strong) = − (3/6) log₂(3/6) − (3/6) log₂(3/6) = 1
IG(S, Wind) = H(S) − Σ P(x) × H(Sₓ)
            = H(S) − P(S_weak) × H(S_weak) − P(S_strong) × H(S_strong)
            = 0.940 − (8/14) × 0.811 − (6/14) × 1 = 0.048
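
A short numeric check of the Wind calculation above (the two-class helper `h` is mine, not from the slides):

```python
import math

def h(pos, neg):
    """Entropy of a two-class node given its pos/neg example counts."""
    total = pos + neg
    return sum(-(c / total) * math.log2(c / total) for c in (pos, neg) if c > 0)

h_s = h(9, 5)       # 0.940  (entropy of the whole dataset)
h_weak = h(6, 2)    # 0.811  (entropy of the Weak subset)
h_strong = h(3, 3)  # 1.0    (entropy of the Strong subset)
ig_wind = h_s - (8 / 14) * h_weak - (6 / 14) * h_strong
print(round(ig_wind, 3))  # 0.048
```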
Similarly –
• IG(S, Outlook) = 0.246
• IG(S, Temperature) = 0.029
• IG(S, Humidity) = 0.151
• IG(S, Wind) = 0.048

The highest information gain is IG(S, Outlook), so Outlook becomes the root of the decision tree:

[Tree diagram: Outlook at the root, with branches Sunny, Overcast and Rain; the Overcast branch is already pure and ends in a Yes leaf]
• To further split the Sunny node, consider only the Sunny examples:
Temperature Humidity Wind Play
Hot High Weak No
Hot High Strong No
Mild High Weak No
Cool Normal Weak Yes
Mild Normal Strong Yes

• IG(Sunny, Humidity) = 0.97 ----------- Highest
• IG(Sunny, Temperature) = 0.57
• IG(Sunny, Wind) = 0.019

In the same way, S_rain gives Wind as the attribute with the highest information gain, so the decision tree becomes:

[Tree diagram: Outlook at the root; the Sunny branch splits on Humidity (High → No, Normal → Yes); the Overcast branch is a Yes leaf; the Rain branch splits on Wind (Weak → Yes, Strong → No)]
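
One compact way to write this final tree down is as a nested Python dictionary; the representation and the `predict` helper are illustrative choices of this sketch, not part of the slides:

```python
decision_tree = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"Wind": {"Weak": "Yes", "Strong": "No"}},
    }
}

def predict(tree, example):
    """Follow the branches matching the example's attribute values until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[example[attribute]]
    return tree

print(predict(decision_tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # Yes
print(predict(decision_tree, {"Outlook": "Rain", "Wind": "Strong"}))       # No
```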
ID3 Algorithm
1. Create a root node for the tree.
2. If all the examples are positive, return a leaf node labelled positive.
3. Else if all the examples are negative, return a leaf node labelled negative.
4. Calculate the entropy of the current state, H(S).
5. For each attribute x, compute the entropy with respect to x, H(S, x).
6. Select the attribute with the maximum IG(S, x).
7. Remove the attribute that offers the highest IG from the set of attributes.
8. Repeat until no attributes remain or the decision tree consists only of leaf nodes (a Python sketch of these steps follows below).
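
A minimal, self-contained recursive sketch of these steps (the function names, the dict-per-row representation, and the majority-vote fallback when no attributes remain are my own choices; the slides only state the steps above):

```python
from collections import Counter
import math

def entropy_of_labels(labels):
    """Entropy (in bits) of a list of class labels, e.g. ['Yes', 'No', 'Yes']."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

def id3(rows, labels, attributes):
    """rows: list of dicts {attribute: value}; labels: parallel list of 'Yes'/'No';
    attributes: attribute names still available for splitting."""
    # Steps 2-3: if every example carries the same label, return that label as a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Stopping case of step 8: no attributes left, fall back to the majority label.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]

    # Steps 4-6: choose the attribute with the maximum information gain IG(S, x).
    def gain(attribute):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attribute], []).append(label)
        weighted = sum(len(g) / len(labels) * entropy_of_labels(g) for g in groups.values())
        return entropy_of_labels(labels) - weighted

    best = max(attributes, key=gain)

    # Step 7: drop the chosen attribute, then recurse on each of its values (step 8).
    remaining = [a for a in attributes if a != best]
    node = {best: {}}
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node[best][value] = id3([r for r, _ in pairs], [l for _, l in pairs], remaining)
    return node
```

Run on the 14-example dataset above with attributes ["Outlook", "Temperature", "Humidity", "Wind"], this should reproduce the tree shown earlier, with Outlook at the root.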
