DT Classifier
Entropy
Entropy is a measure of the randomness in the information being processed. The higher the entropy, the harder it is to draw any conclusions from that information. For a node with p "Yes" and n "No" examples, E(p, n) = -(p/(p+n))*log2(p/(p+n)) - (n/(p+n))*log2(n/(p+n)); a pure node (p = 0 or n = 0) has entropy 0.
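As a minimal sketch (Python is assumed here; it is not part of the original slides), the two-class entropy E(p, n) used in the calculations below can be computed like this:

```python
import math

def entropy(p, n):
    """Two-class entropy E(p, n) for p 'Yes' and n 'No' examples."""
    total = p + n
    if p == 0 or n == 0:          # a pure node has zero entropy
        return 0.0
    py, pn = p / total, n / total
    return -py * math.log2(py) - pn * math.log2(pn)

print(entropy(2, 3))   # 0.9709..., i.e. the 0.971 used throughout this walkthrough
```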
For the Sunny branch (2 Yes, 3 No), E(sunny) = 0.971.
E(sunny, Humidity) = (3/5)*E(0, 3) + (2/5)*E(2, 0) = 0
Now calculate the information gain:
IG(sunny, Humidity) = 0.971 - 0.0 = 0.971
Now consider the Wind attribute (Outlook = Sunny):

                  Play
             Yes   No   Total
Wind  Weak    1     2     3
      Strong  1     1     2
                          5
E(sunny, Wind) = (3/5)*E(1, 2) + (2/5)*E(1, 1) = 0.95098
Now calculate the information gain:
IG(sunny, Wind) = 0.971 - 0.95098 = 0.020
We get:
IG(sunny, Temperature) = 0.571
IG(sunny, Humidity) = 0.971
IG(sunny, Wind) = 0.020
Here IG(sunny, Humidity) is the largest value, so Humidity is the attribute placed under the Sunny branch.
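These three gains can be checked with a short sketch (hypothetical Python; the Humidity and Wind counts come from the calculations above, while the Temperature counts are an assumption taken from the standard play-tennis data):

```python
import math

def entropy(p, n):
    total = p + n
    if p == 0 or n == 0:
        return 0.0
    py, pn = p / total, n / total
    return -py * math.log2(py) - pn * math.log2(pn)

def info_gain(parent, children):
    """parent = (yes, no); children = list of (yes, no) counts per attribute value."""
    total = sum(parent)
    weighted = sum((y + n) / total * entropy(y, n) for y, n in children)
    return entropy(*parent) - weighted

sunny = (2, 3)                                   # 2 Yes, 3 No under Outlook = Sunny
print(info_gain(sunny, [(0, 3), (2, 0)]))        # Humidity: 0.971
print(info_gain(sunny, [(1, 2), (1, 1)]))        # Wind:     0.020
# Temperature counts (Hot 0/2, Mild 1/1, Cool 1/0) are an assumption from the
# standard play-tennis data; they reproduce the stated 0.571.
print(info_gain(sunny, [(0, 2), (1, 1), (1, 0)]))
```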
Now consider the Rain branch.
E(rain) = -(3/5)*log2(3/5) - (2/5)*log2(2/5) = 0.971

Now calculate the information gain of Temperature (Outlook = Rain):

                  Play
             Yes   No   Total
Temp  Mild    2     1     3
      Cool    1     1     2
                          5

E(rain, Temperature) = (3/5)*E(2, 1) + (2/5)*E(1, 1) = 0.95098
IG(rain, Temperature) = 0.971 - 0.95098 = 0.020
Now consider Humidity (Outlook = Rain):

                      Play
                 Yes   No   Total
Humidity  High    1     1     2
          Normal  2     1     3
                              5

E(rain, Humidity) = (3/5)*E(2, 1) + (2/5)*E(1, 1) = 0.95098
IG(rain, Humidity) = 0.971 - 0.95098 = 0.020
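The same check works for the Rain branch; with the counts tabulated above, both splits give the same gain (a hypothetical Python sketch, not from the slides):

```python
import math

def entropy(p, n):
    if p == 0 or n == 0:
        return 0.0
    t = p + n
    return -(p / t) * math.log2(p / t) - (n / t) * math.log2(n / t)

rain = (3, 2)                       # 3 Yes, 2 No under Outlook = Rain

def gain(children, parent=rain):
    t = sum(parent)
    return entropy(*parent) - sum((y + n) / t * entropy(y, n) for y, n in children)

print(gain([(2, 1), (1, 1)]))   # Temperature (Mild, Cool): 0.020
print(gain([(1, 1), (2, 1)]))   # Humidity (High, Normal):  0.020
```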
Choose the attribute that has the higher Gini gain. Gini gain is highest for Outlook, so we can choose it as our root node.
Now you have an idea of how to proceed further: repeat the same steps we used in the ID3 algorithm.
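For reference, Gini gain is computed the same way as information gain but with Gini impurity, 1 - sum(p_i^2), in place of entropy. A minimal sketch (hypothetical Python; the counts reused here are the Sunny/Humidity counts from above, purely for illustration):

```python
def gini(p, n):
    """Gini impurity 1 - sum(p_i^2) for a two-class node."""
    t = p + n
    return 1.0 - (p / t) ** 2 - (n / t) ** 2

def gini_gain(parent, children):
    t = sum(parent)
    weighted = sum((y + n) / t * gini(y, n) for y, n in children)
    return gini(*parent) - weighted

# Illustration with the Sunny/Humidity counts from the tables above:
print(gini_gain((2, 3), [(0, 3), (2, 0)]))   # 0.48: a pure split maximizes Gini gain
```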
Decision Tree Algorithm
• Decision Tree is a Supervised learning technique that can be used for both Classification and Regression problems, but mostly it is preferred for solving Classification problems (see the sketch after this list).
• It is a tree-structured classifier, where internal nodes represent the
features of a dataset, branches represent the decision
rules and each leaf node represents the outcome.
• In a Decision tree, there are two types of nodes: the Decision Node and the Leaf Node. Decision nodes are used to make decisions and have multiple branches, whereas Leaf nodes are the outputs of those decisions and do not contain any further branches.
• It is called a decision tree because, similar to a tree, it starts with the
root node, which expands on further branches and constructs a tree-
like structure.
• A decision tree simply asks a question, and based on the answer (Yes/No), it further splits the tree into subtrees.
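As a hedged illustration of the points above (assuming scikit-learn is available; the toy arrays are made up), the same tree-structured model can be fit for either classification or regression:

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Toy, made-up data: two numeric features per sample.
X = [[0, 1], [1, 1], [1, 0], [0, 0]]
y_class = ["yes", "yes", "no", "no"]     # classification target
y_reg = [1.0, 0.9, 0.1, 0.2]             # regression target

clf = DecisionTreeClassifier(criterion="entropy", max_depth=2)  # entropy-based splits
clf.fit(X, y_class)
print(clf.predict([[0, 1]]))             # -> ['yes']

reg = DecisionTreeRegressor(max_depth=2)
reg.fit(X, y_reg)
print(reg.predict([[1, 0]]))             # -> a value near 0.1
```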
Why use Decision Trees?
• Decision Trees usually mimic human thinking ability while making a decision, so they are easy to understand.
• The logic is easy to follow because it is based on if/else conditions.
Assumptions
• At the beginning, we consider the whole
training set as the root.
• Feature values are preferred to be categorical. If the values are continuous, they are discretized prior to building the model (see the binning sketch after this list). On the basis of attribute values, records are distributed recursively.
• We use statistical methods for ordering attributes as the root or internal nodes.
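A minimal sketch of the discretization step mentioned above (hypothetical Python with NumPy; the values and bin edges are arbitrary):

```python
import numpy as np

temperature = np.array([64, 68, 71, 75, 80, 85])   # a continuous feature (made-up values)
bins = [70, 80]                                     # arbitrary cut points: <70, 70-79, >=80
labels = np.array(["cool", "mild", "hot"])

# np.digitize returns the bin index for each value; map it to a categorical label
categorical = labels[np.digitize(temperature, bins)]
print(categorical)   # ['cool' 'cool' 'mild' 'mild' 'hot' 'hot']
```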
Decision Tree Terminologies
• Root Node: Root node is from where the decision tree starts. It represents the entire
dataset, which further gets divided into two or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated
further after getting a leaf node.
• Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes
according to the given conditions.
• Branch/Sub Tree: A subtree formed by splitting a node of the tree.
• Pruning: Pruning is the process of removing the unwanted branches from the tree.
• Parent/Child node: A node that splits into sub-nodes is called the parent node of those sub-nodes, and the sub-nodes are called its child nodes.
How does the Decision Tree
algorithm Work?
• Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
• Step-2: Find the best attribute in the dataset using Attribute Selection
Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values of the best attribute.
• Step-4: Generate the decision tree node, which contains the best attribute.
• Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; such final nodes are called leaf nodes. (A recursive sketch of these steps follows the list.)
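A compact recursive sketch of these five steps (hypothetical Python, using information gain as the ASM; the dataset format, a list of dicts plus a target key, is an assumption rather than the slides' notation):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Information gain of splitting `rows` on `attr` (Step-2: the ASM)."""
    parent = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return parent - remainder

def build_tree(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # pure subset: stop and emit a leaf (Step-5)
        return labels[0]
    if not attrs:                      # no attributes left: majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))   # Steps 2 and 4
    tree = {best: {}}
    for value in set(r[best] for r in rows):      # Step-3: one subset per value
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attrs if a != best]
        tree[best][value] = build_tree(subset, remaining, target)  # Step-5: recurse
    return tree
```

Calling build_tree on the full play-tennis table with attributes ["Outlook", "Temperature", "Humidity", "Wind"] and target "Play" would return a nested dict with Outlook at the root, consistent with the walkthrough above.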
Attribute Selection Measures