Decision Tree Practice
Dataset:
Hours of Study | Attendance | CGPA (> 3.0)
---------------|------------|-------------
Low            | Low        | No
Low            | High       | No
Medium         | Low        | No
Low            | Low        | No
Gini formula:
Gini = 1 − Σ (p_i)²
● Total instances = 8
● Yes (CGPA > 3.0) = 4 instances, p(Yes) = 4/8 = 0.5
● No (CGPA ≤ 3.0) = 4 instances, p(No) = 4/8 = 0.5

Gini(root) = 1 − (0.5² + 0.5²) = 1 − (0.25 + 0.25) = 1 − 0.5 = 0.5
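The root Gini above can be checked with a short Python sketch (the `gini` helper is mine, not part of the worksheet):

```python
def gini(counts):
    """Gini impurity 1 - sum(p_i^2) for a list of class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Root node: 4 Yes and 4 No out of 8 instances
print(gini([4, 4]))  # → 0.5
```

A pure node (all one class) gives `gini([4, 0]) == 0`, which is why pure children contribute 0 to the weighted sums below.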
Split on Hours of Study:
Weighted Gini = (3/8)×0 + (2/8)×0.5 + (3/8)×0.375 = 0.265625
Split on Attendance:
Weighted Gini = (4/8)×0.375 + (2/8)×0.5 + (2/8)×0 = 0.3125
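The weighted Gini of a split is the size-weighted average of the child Ginis. A minimal sketch, using the child sizes and Gini values from the two splits above (the `weighted_gini` helper is an assumed name, not from the worksheet):

```python
def weighted_gini(sizes, ginis):
    """Size-weighted average of per-child Gini impurities."""
    total = sum(sizes)
    return sum(n / total * g for n, g in zip(sizes, ginis))

# Hours of Study: children of sizes 3, 2, 3 with Ginis 0, 0.5, 0.375
print(weighted_gini([3, 2, 3], [0, 0.5, 0.375]))  # → 0.265625
# Attendance: children of sizes 4, 2, 2 with Ginis 0.375, 0.5, 0
print(weighted_gini([4, 2, 2], [0.375, 0.5, 0]))  # → 0.3125
```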
Since Hours of Study gives the lower weighted Gini (0.265625), it is chosen as the root split.

    Hours of Study
       /  |  \
Dataset:
Color | Size | Fruit (Apple/NotApple)
Entropy formula:
Entropy = − Σ p_i · log₂(p_i)
● Total instances = 8
● Apple = 4 instances, p(Apple) = 4/8 = 0.5
● NotApple = 4 instances, p(NotApple) = 4/8 = 0.5

Entropy(root) = −(0.5×log₂(0.5) + 0.5×log₂(0.5)) = −(0.5×(−1) + 0.5×(−1)) = 1
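The root entropy can likewise be verified with a small Python helper (the `entropy` name is mine; it skips zero counts since 0·log₂0 is taken as 0):

```python
import math

def entropy(counts):
    """Shannon entropy -sum(p_i * log2 p_i) over nonzero class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Root node: 4 Apple and 4 NotApple
print(entropy([4, 4]))              # → 1.0
# A 1-vs-2 mixed child, as in the Green branch below
print(round(entropy([1, 2]), 3))    # → 0.918
```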
Split on Color:
Weighted Entropy = (3/8)×0 + (3/8)×0.918 + (2/8)×0 = 0.34425
Split on Size:
Weighted Entropy = (2/8)×0 + (3/8)×1 + (3/8)×0.918 = 0.71925
Conclusion:
Information Gain = Entropy(root) − Weighted Entropy, so IG(Color) = 1 − 0.34425 = 0.65575.
Since Color has the highest Information Gain (0.65575), it is chosen as the root split.
Color
/ | \
Red Green Yellow
(Apple) (1 Apple, 2 NotApple) (NotApple)
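The entropy example can be sketched end to end: the child sizes (3 Red, 3 Green, 2 Yellow; 2/3/3 for Size) and the mixed Green branch are taken from the worksheet, while the helper names and the exact (unrounded) entropy of the 1-vs-2 branch are my own:

```python
import math

def entropy(counts):
    """Shannon entropy over nonzero class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def info_gain(parent_entropy, sizes, child_entropies):
    """Parent entropy minus the size-weighted average of child entropies."""
    total = sum(sizes)
    weighted = sum(n / total * e for n, e in zip(sizes, child_entropies))
    return parent_entropy - weighted

root = entropy([4, 4])        # 1.0
green = entropy([1, 2])       # ≈ 0.918, the mixed Green branch

# Color: Red (pure), Green (1 Apple, 2 NotApple), Yellow (pure)
ig_color = info_gain(root, [3, 3, 2], [0, green, 0])
# Size: one pure child, one maximally mixed child, one 1-vs-2 child
ig_size = info_gain(root, [2, 3, 3], [0, 1, green])

print(ig_color > ig_size)  # → True: Color wins the root split, as in the tree above
```

Using the unrounded entropy gives IG(Color) ≈ 0.6556 rather than the worksheet's 0.65575 (which rounds the branch entropy to 0.918); the ranking of the splits is unchanged.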