Entropy and IG
Information Gain
Entropy
Entropy is a measure of disorder or uncertainty.
Reducing that uncertainty is a central goal of
machine learning models and of data science
in general.
High, Low Entropy
“High Entropy”
X is from a uniform-like distribution
Flat histogram
Values sampled from it are less predictable
“Low Entropy”
X is from a non-uniform (peaks and valleys) distribution
Histogram has many lows and highs
Values sampled from it are more predictable
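The contrast above can be checked numerically. The following is a minimal sketch, assuming the usual Shannon definition of entropy, H(X) = -sum of p(x) * log2 p(x), measured in bits (the base-2 logarithm is an assumption; the slides do not fix a base). The helper name shannon_entropy and the two toy samples are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # Sum -p * log2(p) over every distinct value that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy samples (illustrative only, not from the slides).
flat = ["a", "b", "c", "d"] * 25        # flat histogram: each value appears 25 times
peaked = ["a"] * 97 + ["b", "c", "d"]   # peaked histogram: one value dominates

print(shannon_entropy(flat))    # 2.0 bits, the maximum for four outcomes
print(shannon_entropy(peaked))  # much lower: the next value is easy to guess
```

The flat sample reaches the maximum possible entropy for four outcomes, while the peaked sample scores far lower, which matches the "less predictable" versus "more predictable" distinction.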
Decision Tree Classification
To build a decision tree, we need to calculate two types of
entropy using frequency tables, as follows: