Chapter 3: Decision Tree Algorithms
Mark A. Magumba
Decision Trees: ID3 (Iterative Dichotomizer 3)
• Decision trees, in their basic form, operate on categorical data
• They are primarily classification algorithms
• However, they can be modified into regression trees
• One basic algorithm is ID3, which is based on entropy
• The entropy of some set X is given by:

$$Entropy(X) = -\sum_{i} p_i \log_2 p_i$$

where $p_i$ is the probability of the i-th outcome
Entropy
• Entropy is a measure of uncertainty: it is least (zero) when all the probability mass is in one outcome, and maximal when the probability mass is uniformly distributed (see the sketch below)
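As a minimal sketch of this definition (the `entropy` helper name and the count-based interface are my own, not from the slides), entropy can be computed from a list of outcome counts:

```python
from math import log2

def entropy(counts):
    """Entropy (in bits) of a distribution given as a list of outcome counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

print(entropy([7, 7]))   # 1.0: uniform mass -> maximal uncertainty
print(entropy([14, 0]))  # -0.0 (i.e. zero): all mass in one outcome -> least uncertainty
```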
ID3 Concrete Example: Data
The worked example below uses the classic "play tennis" data set (Quinlan, 1986); its class counts (9 Yes, 5 No) match the entropy values computed in the following steps:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
ID3 Steps
1. Compute the entropy of the data (Entropy(S))
On our data (9 Yes and 5 No examples out of 14):

$$Entropy(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.94$$

(verified numerically in the sketch below)
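As a quick numeric check (a sketch; the 9 and 5 are the Yes/No counts from the table above):

```python
from math import log2

# Class probabilities for the 14 examples: 9 Yes, 5 No
p_yes, p_no = 9 / 14, 5 / 14
entropy_s = -p_yes * log2(p_yes) - p_no * log2(p_no)
print(round(entropy_s, 2))  # 0.94
```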
2. Next, compute the entropy given some attribute value
This can be expressed as:

$$Entropy(S|v) = -\sum_{i} p(i|v)\log_2 p(i|v)$$

In other words, this time we take the class probabilities given each value v in the attribute's set of values V
Entropy given v
For Outlook, v = "Sunny": the 5 Sunny examples split into 2 Yes and 3 No, so

$$Entropy(S|\text{"Sunny"}) = -\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5} = 0.97$$
Entropy given v
• Similarly, the entropy for the other branches of Outlook can be computed (see the sketch below):
• $Entropy(S|\text{"Rain"}) = 0.97$
• $Entropy(S|\text{"Overcast"}) = 0$
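All three branch entropies can be checked with a small loop (a sketch; the (Yes, No) counts per Outlook value come from the table above):

```python
from math import log2

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

# (Yes, No) counts for each value of Outlook
branches = {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 2)}
for value, counts in branches.items():
    print(value, round(entropy(counts), 2))
# Sunny 0.97, Overcast -0.0, Rain 0.97
```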
Information Gain
• Next we compute the information gain of the attribute
• This can be obtained by:

$$Gain(S, A) = Entropy(S) - \sum_{v \in V} \frac{|S_v|}{|S|}\, Entropy(S|v)$$

• For Outlook this gives $0.94 - \frac{5}{14}(0.97) - \frac{4}{14}(0) - \frac{5}{14}(0.97) = 0.247$; the attribute with the highest gain is chosen as the split, and the procedure recurses on each branch
• The ID3 algorithm may also favor features with many branches, leading to sub-optimal solutions
• Solution: updates to the algorithm, such as the C4.5 algorithm, adjust for this algorithmically. C4.5 normalizes information gain by dividing it by the split information, which is given by:

$$SplitInfo(S, A) = -\sum_{v \in V} \frac{|S_v|}{|S|}\log_2\frac{|S_v|}{|S|}$$

(both quantities are computed in the sketch below)
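Putting the pieces together, a sketch of the gain and C4.5 gain-ratio computation for Outlook (the function and variable names are my own; the counts come from the table above):

```python
from math import log2

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

n = 14
branches = {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 2)}

# Gain(S, Outlook): parent entropy minus the weighted branch entropies
gain = entropy([9, 5]) - sum(sum(c) / n * entropy(c) for c in branches.values())

# SplitInfo(S, Outlook): entropy of the branch sizes themselves
split_info = entropy([sum(c) for c in branches.values()])

print(round(gain, 3))               # 0.247
print(round(gain / split_info, 3))  # C4.5 gain ratio: 0.156
```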
Random Forests
• An additional solution to these problems is random forests
• Random forest algorithms generate multiple trees, each trained on a random subset of the data
• The final decision is made by aggregating the outputs of these different trees, e.g. by majority vote for classification (see the sketch below)
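As an illustrative sketch using scikit-learn (a library choice of mine, not part of the original slides; the iris data set stands in for any classification task), which implements exactly this grow-many-trees-and-aggregate scheme:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each grown on a bootstrap sample of the training data;
# the forest predicts by aggregating the individual trees' votes
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))  # accuracy on the held-out split
```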