Decision Tree Algorithm
Piranti Cerdas 2021
https://towardsai.net/p/programming/decision-trees-explained-with-a-practical-example-fe47872d3b53
Introduction
A decision tree is one of the supervised machine learning algorithms. It can be used for both regression and classification problems, yet it is mostly used for classification. A decision tree follows a set of if-else conditions to visualize the data and classify it according to those conditions.
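For a concrete starting point, here is a minimal sketch of fitting a decision tree classifier with scikit-learn; the iris dataset is only a stand-in for any classification problem.

# Minimal decision tree classifier sketch (scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(criterion="gini", random_state=42)  # "entropy" is the other common criterion
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))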
Overview
Root Node: This attribute is used for dividing the data into two or more sets. The feature attribute in this node is selected based on Attribute Selection Techniques.
Branch or Sub-Tree: A part of the entire decision tree is
called a branch or sub-tree.
Splitting: Dividing a node into two or more sub-nodes
based on if-else conditions.
Decision Node: A sub-node that splits into further sub-nodes is called a decision node.
Leaf or Terminal Node: This is the end of the decision tree
where it cannot be split into further sub-nodes.
Pruning: Removing a sub-node from the tree is called
pruning.
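To connect these terms, here is a hypothetical tree written as if-else conditions; the feature names and thresholds are invented purely for illustration.

# Hypothetical example only: features and thresholds are made up.
def predict(applicant):
    if applicant["income"] >= 5000:            # root node: first split
        if applicant["credit_history"] == 1:   # decision node: further split
            return "approve"                   # leaf node
        return "reject"                        # leaf node
    return "reject"                            # leaf node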
Working of Decision Tree
1. The root node feature is selected based on the results
from the Attribute Selection Measure(ASM).
2. The ASM is applied recursively to each sub-node until a leaf or terminal node is reached, i.e., the node cannot be split into further sub-nodes.
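A simplified sketch of this recursive procedure, assuming each row stores its class label in the last position; asm_score is a placeholder for any Attribute Selection Measure, scored so that lower is better.

# Simplified recursive tree building; asm_score stands in for an ASM
# such as the Gini index (lower score = better split).
def build_tree(rows, features, asm_score, min_rows=2):
    labels = [row[-1] for row in rows]
    # Stop when the node is pure, too small, or no features remain: leaf node.
    if len(set(labels)) == 1 or len(rows) < min_rows or not features:
        return {"leaf": max(set(labels), key=labels.count)}
    # Select the best feature according to the ASM.
    best = min(features, key=lambda f: asm_score(f, rows))
    node = {"feature": best, "children": {}}
    # Split the rows on each observed value of the chosen feature.
    for value in set(row[best] for row in rows):
        subset = [row for row in rows if row[best] == value]
        remaining = [f for f in features if f != best]
        node["children"][value] = build_tree(subset, remaining, asm_score, min_rows)
    return node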
What is Attribute Selection Measure (ASM)?
Attribute Subset Selection Measure is a technique used in the data mining process for data reduction. Data reduction is necessary for better analysis and prediction of the target variable.
The two main ASM techniques are:
+ Gini index
+ Information Gain (ID3)
Gini index
The Gini index, or Gini impurity, measures the probability of a particular variable being wrongly classified when it is chosen randomly.
When the Gini index is used as the criterion for the algorithm to select the feature for the root node, the feature with the least Gini index is selected.
Gini = 1 − Σᵢ (Pᵢ)², where Pᵢ is the probability of an object being classified into a particular class.
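A minimal sketch of computing the Gini impurity from the class labels in a node:

from collections import Counter

def gini(labels):
    # Gini = 1 - sum over classes of (class probability)^2
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["yes", "yes", "yes"]))       # 0.0: a pure node
print(gini(["yes", "no", "yes", "no"]))  # 0.5: a maximally mixed two-class node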
Information Gain (ID3)
Entropy is the main concept behind this algorithm. The measure of how much information a feature or attribute gives about a class is called information gain, and the ID3 algorithm selects the feature with the maximum information gain for splitting. By using this method, the level of entropy is reduced from the root node toward the leaf nodes.
E(S) = −Σᵢ pᵢ log₂(pᵢ), where pᵢ denotes the probability of class i and E(S) denotes the entropy of the set S. The information gain of a split is the entropy of the parent node minus the weighted average entropy of its child nodes; the feature or attribute with the highest information gain is used as the root for the splitting.
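A minimal sketch of the entropy and information-gain calculation that matches the formula above:

from collections import Counter
from math import log2

def entropy(labels):
    # E(S) = -sum over classes of p_i * log2(p_i)
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent_labels, child_label_groups):
    # Parent entropy minus the weighted average entropy of the children.
    n = len(parent_labels)
    weighted = sum(len(g) / n * entropy(g) for g in child_label_groups)
    return entropy(parent_labels) - weighted

parent = ["yes", "yes", "no", "no"]
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0: a perfect split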
Example
Dataset: https://www.kaggle.com/madhansing/bank-loan2?select=madfhantr.csv
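A hedged sketch of how this dataset might be used end to end. The file name follows the link above, but the column names (including the Loan_Status target) are assumptions about the CSV and must be checked against the actual file.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: the CSV has a categorical target column named "Loan_Status";
# adjust the names after inspecting the real file.
df = pd.read_csv("madfhantr.csv").dropna()            # drop incomplete rows for simplicity
X = pd.get_dummies(df.drop(columns=["Loan_Status"]))  # one-hot encode categorical features
y = df["Loan_Status"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
tree = DecisionTreeClassifier(criterion="entropy", random_state=42)  # ID3-style criterion
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))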