Decision Tree
1. Root Node: The topmost node, which divides the data into two or more sets. The attribute tested at this node is selected using attribute selection techniques.
2. Branch or Sub-Tree: A part of the entire decision tree is called a branch or sub-tree.
3. Splitting: Dividing a node into two or more sub-nodes based on if-else conditions.
4. Decision Node: A sub-node that is split into further sub-nodes is called a decision node.
5. Leaf or Terminal Node: The end of the decision tree, where a node cannot be split into further sub-nodes.
6. Pruning: Removing a sub-node from the tree is called pruning.
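As a rough illustration of these terms, here is a minimal Python sketch of a decision-tree node structure; the Node class and the attribute values used are made up for illustration only, not taken from any library.

# Minimal sketch of a decision-tree node (illustrative, not from any library).
class Node:
    def __init__(self, attribute=None, label=None):
        self.attribute = attribute   # attribute tested at this node (None for a leaf)
        self.label = label           # class label if this is a leaf/terminal node
        self.children = {}           # branch value -> child sub-tree

    def is_leaf(self):
        return self.label is not None

# Root node splits on "Outlook"; each branch leads to a decision node or a leaf.
root = Node(attribute="Outlook")
root.children["Overcast"] = Node(label="Yes")        # leaf (pure class)
root.children["Sunny"] = Node(attribute="Humidity")  # decision node, split further
root.children["Rain"] = Node(attribute="Wind")       # decision node, split further

# Pruning removes a sub-tree and replaces it with a leaf.
root.children["Rain"] = Node(label="Yes")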
Cond..
• In a decision tree, the major challenge is to identify the attribute for the root node at each level. This process is known as attribute selection.
We have two popular attribute selection measures:
1. Information Gain
2. Gini Index
• 1. Information Gain
When we use a node in a decision tree to partition the training instances into smaller subsets, the entropy changes. Information gain is a measure of this change in entropy.
Definition: Suppose S is a set of instances, A is an attribute, Sv is the subset of S with A = v, and Values(A) is the set of all possible values of A; then
Gain(S, A) = Entropy(S) - Σ (|Sv| / |S|) × Entropy(Sv), summed over all v in Values(A),
i.e., the entropy of the parent node minus the weighted entropy of the child nodes.
• For example, in a binary classification problem (two classes),
we can calculate the entropy of the data sample as follows:
Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)
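As a rough sketch of how the entropy and information gain above can be computed, assuming simple lists of class labels (the yes/no counts below are made-up examples, not data from the text):

import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels: -sum(p * log2(p)) over the classes.
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(parent_labels, child_subsets):
    # Entropy of the parent minus the weighted entropy of the child subsets.
    total = len(parent_labels)
    weighted = sum(len(s) / total * entropy(s) for s in child_subsets)
    return entropy(parent_labels) - weighted

# Binary example: 9 "yes" and 5 "no" instances split by some attribute into two subsets.
parent = ["yes"] * 9 + ["no"] * 5
children = [["yes"] * 6 + ["no"] * 1, ["yes"] * 3 + ["no"] * 4]
print(round(entropy(parent), 3))                      # 0.94
print(round(information_gain(parent, children), 3))   # entropy reduction from the split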
Gini index
• Gini Index is a metric to measure how often a randomly chosen
element would be incorrectly identified.
• This means that an attribute with a lower Gini index should be preferred.
• Sklearn supports the “gini” criterion for the Gini Index, and it uses “gini” by default.
• The formula for the calculation of the Gini Index is given below:
Gini = 1 - Σ (pi)^2, where pi is the probability of an instance belonging to class i.
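As a minimal sketch, the Gini index can be computed directly, and the sklearn criterion mentioned above can be set explicitly; the toy feature matrix X and labels y below are made up for illustration:

from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def gini(labels):
    # Gini index: 1 - sum(p_i^2) over the class probabilities p_i.
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

print(round(gini(["yes"] * 9 + ["no"] * 5), 3))   # 0.459 for a 9/5 class split

# sklearn's DecisionTreeClassifier uses the Gini index by default;
# criterion="gini" simply makes the choice explicit.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # toy feature matrix (illustrative only)
y = [0, 0, 1, 1]
clf = DecisionTreeClassifier(criterion="gini").fit(X, y)
print(clf.predict([[1, 0]]))           # -> [1]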
What are the steps in the ID3 algorithm?
1. Calculate the entropy of the whole dataset.
2. Calculate the information gain for every candidate attribute.
3. Select the attribute with the maximum information gain, make it the decision node, and split the data on its values.
4. Repeat the same procedure on each subset until every branch ends in a pure leaf node or no attributes remain.
Here, the attribute with the maximum information gain is Outlook. So, the decision tree built so far splits on Outlook at the root.
Cond ..
• Here, when Outlook == Overcast, the subset is of a pure class (Yes).
Now, we have to repeat the same procedure for the rows whose Outlook value is Sunny, and then for the rows whose Outlook value is Rain.
• Now, we find the best attribute for splitting the data with Outlook = Sunny (dataset rows = [1, 2, 8, 9, 11]).
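As a sketch of this step, assuming the Outlook = Sunny rows come from the usual play-tennis dataset (the table itself is not reproduced in this text, so the attribute values below are an assumption):

import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attribute, target="Play"):
    # Information gain of splitting `rows` on `attribute`.
    labels = [r[target] for r in rows]
    gain = entropy(labels)
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

# Assumed Outlook = Sunny rows (play-tennis style data, for illustration only).
sunny = [
    {"Temperature": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Temperature": "Hot",  "Humidity": "High",   "Wind": "Strong", "Play": "No"},
    {"Temperature": "Mild", "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Temperature": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Temperature": "Mild", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
]
for attr in ["Temperature", "Humidity", "Wind"]:
    print(attr, round(info_gain(sunny, attr), 3))
# On this subset, Humidity gives the highest gain (it separates Yes and No perfectly).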