Decision Tree Practice


Question 1: CGPA Prediction (Using Gini Index)

Dataset:

Hours of Study | Attendance | CGPA (> 3.0)
---------------|------------|-------------
Low            | Low        | No
High           | High       | Yes
Medium         | Medium     | Yes
Low            | High       | No
Medium         | Low        | No
High           | Medium     | Yes
Low            | Low        | No
High           | Low        | Yes

Step 1: Calculate Gini Index for the root node:

Gini Index formula:

Gini = 1 − ∑ (p_i)²

where p_i is the probability of each class (Yes or No in this case).

● Total instances = 8
● Yes (CGPA > 3.0) = 4 instances, p(Yes) = 4/8 = 0.5
● No (CGPA ≤ 3.0) = 4 instances, p(No) = 4/8 = 0.5

Gini(root) = 1 − (0.5² + 0.5²) = 1 − (0.25 + 0.25) = 1 − 0.5 = 0.5
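
This root value is easy to verify in a few lines of Python. The sketch below is illustrative only; the `gini` helper and the hard-coded label list are not part of the exercise:

```python
from collections import Counter

def gini(labels):
    """Gini index: 1 - sum(p_i^2) over the class probabilities."""
    total = len(labels)
    return 1 - sum((count / total) ** 2 for count in Counter(labels).values())

# Root node: 4 'Yes' and 4 'No' out of 8 instances
root_labels = ["No", "Yes", "Yes", "No", "No", "Yes", "No", "Yes"]
print(gini(root_labels))  # 0.5
```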

Step 2: Calculate Gini Index for splits based on Hours of Study:

1. For 'Low' Hours of Study:
   ○ 3 instances, all 'No'
   ○ Gini(Low) = 1 − (0/3)² − (3/3)² = 0
2. For 'Medium' Hours of Study:
   ○ 2 instances, 1 'Yes', 1 'No'
   ○ Gini(Medium) = 1 − (1/2)² − (1/2)² = 1 − 0.25 − 0.25 = 0.5
3. For 'High' Hours of Study:
   ○ 3 instances, all 'Yes'
   ○ Gini(High) = 1 − (3/3)² − (0/3)² = 0

Weighted Gini for Hours of Study:

Weighted Gini = (3/8) × 0 + (2/8) × 0.5 + (3/8) × 0 = 0.125

Gini Gain for Hours of Study:

Gini Gain = 0.5 − 0.125 = 0.375
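
The same arithmetic can be reproduced with a short, self-contained sketch. The dictionary keys and function names below are made up for illustration; the second print anticipates the Attendance result derived in Step 3:

```python
from collections import Counter

def gini(labels):
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_gain(rows, feature, target):
    """Parent Gini minus the size-weighted Gini of each child node."""
    parent = [r[target] for r in rows]
    gain = gini(parent)
    for value in set(r[feature] for r in rows):
        child = [r[target] for r in rows if r[feature] == value]
        gain -= len(child) / len(rows) * gini(child)
    return gain

data = [
    {"study": "Low", "attend": "Low", "cgpa": "No"},
    {"study": "High", "attend": "High", "cgpa": "Yes"},
    {"study": "Medium", "attend": "Medium", "cgpa": "Yes"},
    {"study": "Low", "attend": "High", "cgpa": "No"},
    {"study": "Medium", "attend": "Low", "cgpa": "No"},
    {"study": "High", "attend": "Medium", "cgpa": "Yes"},
    {"study": "Low", "attend": "Low", "cgpa": "No"},
    {"study": "High", "attend": "Low", "cgpa": "Yes"},
]
print(gini_gain(data, "study", "cgpa"))   # 0.375
print(gini_gain(data, "attend", "cgpa"))  # 0.1875 (see Step 3)
```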

Step 3: Calculate Gini Index for splits based on Attendance:

1. For 'Low' Attendance:
   ○ 4 instances, 1 'Yes', 3 'No'
   ○ Gini(Low) = 1 − (1/4)² − (3/4)² = 1 − 0.0625 − 0.5625 = 0.375
2. For 'Medium' Attendance:
   ○ 2 instances, all 'Yes'
   ○ Gini(Medium) = 1 − (2/2)² − (0/2)² = 0
3. For 'High' Attendance:
   ○ 2 instances, 1 'Yes', 1 'No'
   ○ Gini(High) = 1 − (1/2)² − (1/2)² = 0.5

Weighted Gini for Attendance:

Weighted Gini = (4/8) × 0.375 + (2/8) × 0 + (2/8) × 0.5 = 0.3125

Gini Gain for Attendance:

Gini Gain = 0.5 − 0.3125 = 0.1875


Conclusion: Since Hours of Study has the highest Gini Gain (0.375 vs 0.1875 for Attendance), it is chosen as the root split.

          Hours of Study
          /      |      \
       Low    Medium     High
     (3 No) (1 Yes, 1 No) (3 Yes)
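
As a cross-check against a library implementation, here is a minimal sketch using scikit-learn and pandas (assuming both are installed; column names are illustrative). Note that scikit-learn builds binary splits over one-hot indicator columns, so the printed tree is shaped differently from the multiway split above, but the root test should land on one of the Hours-of-Study columns:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.DataFrame({
    "study":  ["Low", "High", "Medium", "Low", "Medium", "High", "Low", "High"],
    "attend": ["Low", "High", "Medium", "High", "Low", "Medium", "Low", "Low"],
    "cgpa":   ["No", "Yes", "Yes", "No", "No", "Yes", "No", "Yes"],
})

# scikit-learn trees need numeric input, so one-hot encode the categories.
X = pd.get_dummies(df[["study", "attend"]])
clf = DecisionTreeClassifier(criterion="gini", random_state=0)
clf.fit(X, df["cgpa"])
print(export_text(clf, feature_names=list(X.columns)))
```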


Question 2: Fruit Classification (Using Entropy)

Dataset:

Color  | Size   | Fruit (Apple/NotApple)
-------|--------|-----------------------
Red    | Small  | Apple
Green  | Large  | NotApple
Yellow | Medium | NotApple
Red    | Medium | Apple
Green  | Small  | Apple
Yellow | Large  | NotApple
Red    | Large  | Apple
Green  | Medium | NotApple

Step 1: Calculate Entropy for the root node:

Entropy formula:

Entropy = −∑ p_i · log₂(p_i)

where p_i is the probability of each class.

● Total instances = 8
● Apple = 4 instances, p(Apple) = 4/8 = 0.5
● NotApple = 4 instances, p(NotApple) = 4/8 = 0.5

Entropy(root) = −(0.5 × log₂(0.5) + 0.5 × log₂(0.5)) = −(0.5 × −1 + 0.5 × −1) = 1
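
As with the Gini index, the root entropy takes only a few lines of Python to verify (a minimal sketch; the function name and label list are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: -sum(p_i * log2(p_i)) over the class probabilities."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

# Root node: 4 'Apple' and 4 'NotApple' out of 8 instances
root = ["Apple", "NotApple", "NotApple", "Apple", "Apple", "NotApple", "Apple", "NotApple"]
print(entropy(root))  # 1.0
```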

Step 2: Calculate Entropy for splits based on Color:

1. For 'Red' Color:
   ○ 3 instances, all 'Apple'
   ○ Entropy(Red) = 0 (since all are 'Apple')
2. For 'Green' Color:
   ○ 3 instances, 1 'Apple', 2 'NotApple'
   ○ Entropy(Green) = −(1/3 × log₂(1/3) + 2/3 × log₂(2/3)) ≈ 0.918
3. For 'Yellow' Color:
   ○ 2 instances, all 'NotApple'
   ○ Entropy(Yellow) = 0

Weighted Entropy for Color:

Weighted Entropy = (3/8) × 0 + (3/8) × 0.918 + (2/8) × 0 ≈ 0.344

Information Gain for Color:

Information Gain = 1 − 0.344 ≈ 0.656
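
A compact, self-contained sketch reproducing this number (the dict keys and helper names are made up for the example):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, feature, target):
    """Parent entropy minus the size-weighted entropy of each child node."""
    parent = [r[target] for r in rows]
    gain = entropy(parent)
    for value in set(r[feature] for r in rows):
        child = [r[target] for r in rows if r[feature] == value]
        gain -= len(child) / len(rows) * entropy(child)
    return gain

fruits = [
    {"color": "Red", "size": "Small", "fruit": "Apple"},
    {"color": "Green", "size": "Large", "fruit": "NotApple"},
    {"color": "Yellow", "size": "Medium", "fruit": "NotApple"},
    {"color": "Red", "size": "Medium", "fruit": "Apple"},
    {"color": "Green", "size": "Small", "fruit": "Apple"},
    {"color": "Yellow", "size": "Large", "fruit": "NotApple"},
    {"color": "Red", "size": "Large", "fruit": "Apple"},
    {"color": "Green", "size": "Medium", "fruit": "NotApple"},
]
print(round(info_gain(fruits, "color", "fruit"), 3))  # 0.656
```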

Step 3: Calculate Entropy for splits based on Size:

1. For 'Small' Size:
   ○ 2 instances, all 'Apple'
   ○ Entropy(Small) = 0
2. For 'Medium' Size:
   ○ 3 instances, 1 'Apple', 2 'NotApple'
   ○ Entropy(Medium) = −(1/3 × log₂(1/3) + 2/3 × log₂(2/3)) ≈ 0.918
3. For 'Large' Size:
   ○ 3 instances, 1 'Apple', 2 'NotApple'
   ○ Entropy(Large) ≈ 0.918

Weighted Entropy for Size:

Weighted Entropy = (2/8) × 0 + (3/8) × 0.918 + (3/8) × 0.918 ≈ 0.689

Information Gain for Size:

Information Gain = 1 − 0.689 ≈ 0.311

Conclusion:

Since Color has the highest Information Gain (0.656 vs 0.311 for Size), it is chosen as the root split.

               Color
            /    |     \
         Red   Green    Yellow
   (3 Apple) (1 Apple, 2 NotApple) (2 NotApple)
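
The greedy procedure used in both questions (pick the attribute with the highest gain, then recurse on each branch) can also be sketched end to end. This is a minimal ID3-style illustration under the entropy criterion, not production code; all names are made up for the example:

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, feature, target):
    parent = [r[target] for r in rows]
    gain = entropy(parent)
    for value in set(r[feature] for r in rows):
        child = [r[target] for r in rows if r[feature] == value]
        gain -= len(child) / len(rows) * entropy(child)
    return gain

def build_tree(rows, features, target):
    """ID3-style: return a leaf for pure nodes, else split on the best feature."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    best = max(features, key=lambda f: info_gain(rows, f, target))
    rest = [f for f in features if f != best]
    return {best: {value: build_tree([r for r in rows if r[best] == value], rest, target)
                   for value in set(r[best] for r in rows)}}

fruits = [
    {"color": "Red", "size": "Small", "fruit": "Apple"},
    {"color": "Green", "size": "Large", "fruit": "NotApple"},
    {"color": "Yellow", "size": "Medium", "fruit": "NotApple"},
    {"color": "Red", "size": "Medium", "fruit": "Apple"},
    {"color": "Green", "size": "Small", "fruit": "Apple"},
    {"color": "Yellow", "size": "Large", "fruit": "NotApple"},
    {"color": "Red", "size": "Large", "fruit": "Apple"},
    {"color": "Green", "size": "Medium", "fruit": "NotApple"},
]
print(build_tree(fruits, ["color", "size"], "fruit"))
# Expected shape: Color at the root, with Red -> Apple and Yellow -> NotApple leaves
```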
