
1. Show the steps of the decision tree induction algorithm for the following data.

The concept to be learnt is Grades (Grades are the class labels).
- If two or more attributes have similar Information Gain, pick any of them.
- Since the best possible entropy is zero, if the average entropy of an attribute is
zero, there is no need to check other candidate attributes.

#    Grades   Hardworking   Intelligent   Unlucky
1.   Good     Yes           Yes           No
2.   Bad      No            Yes           Yes
3.   Bad      Yes           No            Yes
4.   Good     Yes           Yes           No
5.   Good     Yes           Yes           No
6.   Bad      No            Yes           No
7.   Bad      Yes           No            No

The entropy of the data:

There are 3 samples of Good Grades and 4 samples of Bad Grades.
Hence the entropy of the data is {- (3/7) log2 (3/7) - (4/7) log2 (4/7)} =
0.524 + 0.461 = 0.985
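
As a quick check, the same value can be reproduced with a short Python sketch (the entropy helper and the label list are illustrative, not part of the original exercise):

    from math import log2
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (in bits) of a list of class labels."""
        total = len(labels)
        # 0 * log2(0) is taken as 0, so absent classes simply never appear in the sum.
        return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

    grades = ["Good", "Bad", "Bad", "Good", "Good", "Bad", "Bad"]
    print(round(entropy(grades), 3))  # 0.985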

First, we calculate which of the three attributes is best for the root node.

Attribute “Hardworking”
Branch Hardworking = Yes (5 samples fulfill this condition)
There are 3 samples of Good Grades and 2 samples of Bad Grades.
Entropy is {- (3/5) log2 (3/5) - (2/5) log2 (2/5)} = 0.442 + 0.53 = 0.972
Branch Hardworking = No (2 samples fulfill this condition)
Both of these 2 samples are of Bad Grades.
Entropy is {- (0/2) log2 (0/2) - (2/2) log2 (2/2)} = 0
Average Entropy of Attribute “Hardworking” = 5/7 * 0.972 + 2/7 * 0 = 0.694
Information Gain of Attribute “Hardworking” = 0.985 – 0.694 = 0.291

Attribute “Intelligent”
Branch Intelligent = Yes (5 samples fulfill this condition)
There are 3 samples of Good Grades and 2 samples of Bad Grades.
Entropy is {- (3/5) log2 (3/5) - (2/5) log2 (2/5)} = 0.442 + 0.53 = 0.972
Branch Intelligent = No (2 samples fulfill this condition)
Both of these 2 samples are of Bad Grades.
Entropy is {- (0/2) log2 (0/2) - (2/2) log2 (2/2)} = 0
Average Entropy of Attribute “Intelligent” = 5/7 * 0.972 + 2/7 * 0 = 0.694
Information Gain of Attribute “Intelligent” = 0.985 – 0.694 = 0.291

Attribute “Unlucky”
Branch Unlucky = Yes (2 samples fulfill this condition)
Both of these 2 samples are of Bad Grades.
Entropy is {- (0/2) log2 (0/2) - (2/2) log2 (2/2)} = 0
Branch Unlucky = No (5 samples fulfill this condition)
There are 3 samples of Good Grades and 2 samples of Bad Grades.
Entropy is {- (3/5) log2 (3/5) - (2/5) log2 (2/5)} = 0.442 + 0.53 = 0.972
Average Entropy of Attribute “Unlucky” = 5/7 * 0.972 + 2/7 * 0 = 0.694
Information Gain of Attribute “Unlucky” = 0.985 – 0.694 = 0.291
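
The three gains can also be verified programmatically. Below is a minimal sketch (the information_gain helper and the tuple layout of the rows are assumptions for illustration). Note that full-precision arithmetic gives about 0.292 for each attribute; the 0.291 above comes from subtracting the rounded intermediate values.

    from math import log2
    from collections import Counter

    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

    # Rows follow the table above: (Grades, Hardworking, Intelligent, Unlucky)
    data = [
        ("Good", "Yes", "Yes", "No"),
        ("Bad",  "No",  "Yes", "Yes"),
        ("Bad",  "Yes", "No",  "Yes"),
        ("Good", "Yes", "Yes", "No"),
        ("Good", "Yes", "Yes", "No"),
        ("Bad",  "No",  "Yes", "No"),
        ("Bad",  "Yes", "No",  "No"),
    ]

    def information_gain(rows, attr_index):
        """Entropy of the class column minus the weighted entropy after splitting."""
        labels = [r[0] for r in rows]
        weighted = 0.0
        for value in {r[attr_index] for r in rows}:
            branch = [r[0] for r in rows if r[attr_index] == value]
            weighted += len(branch) / len(rows) * entropy(branch)
        return entropy(labels) - weighted

    for name, idx in [("Hardworking", 1), ("Intelligent", 2), ("Unlucky", 3)]:
        # Each prints about 0.292; 0.291 in the text comes from 0.985 - 0.694.
        print(name, round(information_gain(data, idx), 3))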

Since all three Information Gains are equal, we may select any of these attributes.
Suppose we choose the attribute “Hardworking” as the root node.

---------------------------------------

The samples in the branch “Hardworking” = No have homogeneous class labels (all Bad),
hence we do not grow this branch.

The branch “Hardworking” = Yes has to be grown. We test the two remaining
attributes, “Intelligent” and “Unlucky”.

The entropy of the samples for the branch “Hardworking” = Yes:


There are a total of 5 samples.
There are 3 samples of Good Grades and 2 samples of Bad Grades.
Entropy is {- (3/5) log2 (3/5) - (2/5) log2 (2/5)} = 0.442 + 0.53 = 0.972

Attribute “Intelligent”
Branch Intelligent = Yes (3 samples fulfill this condition)
All the 3 samples are of Good Grades. Hence Entropy = 0
Branch Intelligent = No (2 samples fulfill this condition)
Both of these 2 samples are of Bad Grades. Hence Entropy = 0
Average Entropy of Attribute “Intelligent” = 3/5 * 0 + 2/5 * 0 = 0
Information Gain of Attribute “Intelligent” = 0.972 – 0 = 0.972
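
This check can be sketched on the Hardworking = Yes subset as well (the subset list and helper names are illustrative); since both branches are pure, the gain equals the subset entropy, about 0.971 (rounded to 0.972 above):

    from math import log2
    from collections import Counter

    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

    # (Grades, Intelligent) for the five Hardworking = Yes samples (rows 1, 3, 4, 5, 7)
    subset = [("Good", "Yes"), ("Bad", "No"), ("Good", "Yes"),
              ("Good", "Yes"), ("Bad", "No")]

    labels = [grade for grade, _ in subset]
    weighted = sum(
        len([g for g, i in subset if i == value]) / len(subset)
        * entropy([g for g, i in subset if i == value])
        for value in {"Yes", "No"}
    )
    # Both branches are pure, so the weighted entropy is 0.
    print(round(entropy(labels) - weighted, 3))  # 0.971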

Since we cannot obtain a better result, there is no need to test the other
attribute “Unlucky”. However, just for the sake of practice, we do the calculations as
follows:

Attribute “Unlucky”
Branch Unlucky = Yes (1 sample fulfills this condition)
The sample is of Bad Grades. Hence Entropy = 0
Branch Unlucky = No (4 samples fulfill this condition)
Three of them are of Good Grades and one is of Bad Grades.
Entropy is {- (3/4) log2 (3/4) - (1/4) log2 (1/4)} = 0.311 + 0.5 = 0.811
Average Entropy of Attribute “Unlucky” = 1/5 * 0 + 4/5 * 0.811 = 0.6488
Information Gain of Attribute “Unlucky” = 0.972 - 0.6488 = 0.323
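
Putting the two splits together, the finished tree can be written as a tiny prediction function (a sketch only; the function and argument names are illustrative):

    # Hardworking at the root, Intelligent under the Hardworking = Yes branch;
    # every leaf is a pure class.
    def predict_grade(hardworking: str, intelligent: str) -> str:
        if hardworking == "No":
            return "Bad"        # Hardworking = No: both training samples are Bad
        if intelligent == "Yes":
            return "Good"       # Hardworking = Yes and Intelligent = Yes: all Good
        return "Bad"            # Hardworking = Yes and Intelligent = No: all Bad

    print(predict_grade("Yes", "Yes"))  # Good (matches sample 1)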
