
Entropy and Information Gain
Entropy
 Entropy is a measure of disorder or uncertainty; the goal of machine learning models, and of data scientists in general, is to reduce that uncertainty.
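As a small sketch (not part of the original slides; the function name and sample labels are illustrative only), Shannon entropy can be computed directly from the relative frequencies of the class labels:

import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(labels).values())

# A perfectly mixed sample is maximally uncertain (1 bit for two classes);
# a pure sample has zero uncertainty.
print(entropy(["Yes", "No", "Yes", "No"]))    # 1.0
print(entropy(["Yes", "Yes", "Yes", "Yes"]))  # 0.0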
High, Low Entropy
 “High Entropy”
 X is from a uniform-like distribution
 Flat histogram
 Values sampled from it are less predictable
 “Low Entropy”
 X is from a varied distribution (peaks and valleys)
 Histogram has many lows and highs
 Values sampled from it are more predictable
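As a quick numeric illustration (not in the original slides): a fair coin has a flat histogram and entropy −0.5 log₂(0.5) − 0.5 log₂(0.5) = 1 bit, whereas a heavily biased 0.9/0.1 coin has entropy of only about 0.47 bits, so its outcomes are much easier to predict.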
Decision tree classification
To build a decision tree, we need to calculate two types of entropy using frequency tables, as follows:

a) Entropy using the frequency table of one attribute:
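The formula image from the original slide does not survive in this text version; for reference, the standard single-attribute entropy (computed from the class frequency table) is

Entropy(S) = Σ −pᵢ log₂(pᵢ)

where pᵢ is the proportion of examples in S belonging to class i.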


Entropy
 b) Entropy using the frequency table of two attributes:
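Again the formula image is not reproduced here; for reference, the standard two-attribute form is the weighted average of the branch entropies:

Entropy(T, X) = Σ P(c) · Entropy(c), summed over the values c of attribute X,

where P(c) is the fraction of examples taking value c.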
Information Gain
The information gain is based on the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).
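Stated with the two entropies defined above, this is the standard definition:

Gain(T, X) = Entropy(T) − Entropy(T, X)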
Information gain
 Step 1: Calculate the entropy of the target.
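For example, assuming the standard 14-example play-tennis data used later in these slides (9 Yes, 5 No):

Entropy(PlayTennis) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) ≈ 0.940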
Information gain cont…
 Step 2:
 The dataset is then split on the different attributes. The entropy for each branch is calculated.
 It is then added proportionally to get the total entropy for the split.
 The resulting entropy is subtracted from the entropy before the split.
 The result is the Information Gain, or decrease in entropy (a short code sketch follows).
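A minimal Python sketch of Step 2 (an assumption, not the slides' own code; the row format and function names are illustrative). Each row is assumed to be a dict such as {"Outlook": "Rain", "Wind": "Weak", "Play": "Yes"}:

from collections import Counter
import math

def entropy(labels):
    # Same helper as sketched earlier: Shannon entropy in bits.
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy before the split minus the proportionally weighted entropy of each branch."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[attribute] for row in rows}:
        branch = [row[target] for row in rows if row[attribute] == value]
        after += len(branch) / len(rows) * entropy(branch)   # weight by branch size
    return before - after                                    # the information gain

Calling information_gain(rows, "Wind", "Play") would then return the gain for splitting on Wind.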
Information gain cont…
Information gain cont..

Step 3: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches, and repeat the same process on every branch (a minimal sketch of this recursion follows).
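A simplified recursive sketch of Steps 3–4 (again an assumption, not the slides' code; it reuses the entropy() and information_gain() helpers from the previous sketch):

from collections import Counter

def build_tree(rows, attributes, target):
    """Simplified ID3-style sketch; relies on information_gain() defined earlier."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:                  # entropy 0: this branch is a leaf (Step 4a)
        return labels[0]
    if not attributes:                         # nothing left to split on: majority label
        return Counter(labels).most_common(1)[0][0]
    # Step 3: pick the attribute with the largest information gain as the decision node.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    tree = {best: {}}
    # Divide the dataset by its branches and repeat the same process on every branch (Step 4b).
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = build_tree(subset, remaining, target)
    return tree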
Information gain cont..
 Step 4a: A branch with entropy of 0 is a leaf node.
Information gain cont..
 Step 4b: A branch with entropy greater than 0 needs further splitting.
Information gain cont…
 A decision tree can easily be transformed into a set of rules by tracing each path from the root node to a leaf node.
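For example, assuming the standard play-tennis tree built in the following slides, two of the resulting rules would be: IF Outlook = Overcast THEN Play = Yes, and IF Outlook = Rain AND Wind = Weak THEN Play = Yes.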
Decision Trees
When do I play tennis?
Decision Tree
Is the decision tree correct?
 Let’s check whether the split on the Wind attribute is correct.
 We need to show that the Wind attribute has the highest information gain.
When do I play tennis?
Wind attribute – 5 records match
Note: calculate the entropy only on the examples that got “routed” to our branch of the tree (Outlook = Rain).
Calculation
 Let S = {D4, D5, D6, D10, D14}
 Entropy:
H(S) = −(3/5) log₂(3/5) − (2/5) log₂(2/5) = 0.971
 Information Gain
IG(S, Temp) = H(S) − H(S | Temp) = 0.01997
IG(S, Humidity) = H(S) − H(S | Humidity) = 0.01997
IG(S, Wind) = H(S) − H(S | Wind) = 0.971
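A small check of these numbers, assuming the five Outlook = Rain records carry the standard play-tennis attribute values (the tuples below spell out that assumption):

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

# (Temp, Humidity, Wind, Play) for D4, D5, D6, D10, D14 -- assumed standard values
rows = [
    ("Mild", "High",   "Weak",   "Yes"),
    ("Cool", "Normal", "Weak",   "Yes"),
    ("Cool", "Normal", "Strong", "No"),
    ("Mild", "Normal", "Weak",   "Yes"),
    ("Mild", "High",   "Strong", "No"),
]

h_s = entropy([r[3] for r in rows])             # ≈ 0.971
for i, name in enumerate(["Temp", "Humidity", "Wind"]):
    h_split = sum(
        len([r for r in rows if r[i] == v]) / len(rows)
        * entropy([r[3] for r in rows if r[i] == v])
        for v in {r[i] for r in rows}
    )
    print(name, round(h_s - h_split, 5))        # Temp, Humidity ≈ 0.01997; Wind ≈ 0.97095

Wind has by far the largest gain, confirming it as the split attribute for this branch.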
Assignment #01
 Imagine your own example for classification.
 Everyone should have a different example.
 What will be the root node?
 Make rules after finalizing the decision tree.
 Calculate entropy and IG.
Note:
 23rd Feb 2021 is the last date to submit.
 No handwritten assignments will be accepted.
 Copied assignments will be graded “0”.
 No late submissions will be accepted.
