Decision Tree Algorithm
o Decision Tree is a supervised learning technique that can be used for both classification and regression problems, but it is mostly preferred for solving classification problems. It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.
o It is a graphical representation for getting all the possible
solutions to a problem/decision based on given conditions.
o A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees.
Why Decision tree?
Below are the two reasons for using the Decision tree:
o Decision trees usually mimic human thinking while making a decision, so they are easy to understand.
o The logic behind the decision tree can be easily understood because
it shows a tree-like structure.
Decision Tree Terminologies
Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is further divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split any further once a leaf node is reached.
Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes
according to the given conditions.
Branch/Sub Tree: A subtree formed by splitting a node of the tree.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: A node that is divided into sub-nodes is called the parent node, and the sub-nodes are called its child nodes.
How does the Decision Tree algorithm Work?
o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using Attribute
Selection Measure (ASM).
o Step-3: Divide S into subsets that contain the possible values of the best attribute.
o Step-4: Generate the decision tree node, which contains the best
attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified any further; such final nodes are called leaf nodes (see the code sketch below).
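The steps above can be pictured as a short recursive procedure. The following is a minimal, illustrative sketch in plain Python, assuming the dataset is given as rows whose last element is the class label and using information gain as the Attribute Selection Measure; the helper names (entropy, split_by, best_attribute, build_tree) are made up for this sketch and do not come from any library.

```python
from collections import Counter
import math

def entropy(rows):
    # Impurity of a set of rows; row[-1] is the class label.
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split_by(rows, attr):
    # Group rows by the value they take for attribute index `attr` (Step-3).
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row)
    return groups

def best_attribute(rows, attributes):
    # Attribute with the highest information gain (Step-2, the ASM).
    def gain(attr):
        subsets = split_by(rows, attr).values()
        weighted = sum(len(s) / len(rows) * entropy(s) for s in subsets)
        return entropy(rows) - weighted
    return max(attributes, key=gain)

def build_tree(rows, attributes):
    labels = {row[-1] for row in rows}
    if len(labels) == 1 or not attributes:          # stop: create a leaf node
        return Counter(row[-1] for row in rows).most_common(1)[0][0]
    attr = best_attribute(rows, attributes)         # Step-2
    node = {attr: {}}                               # Step-4: decision node
    remaining = [a for a in attributes if a != attr]
    for value, subset in split_by(rows, attr).items():
        node[attr][value] = build_tree(subset, remaining)  # Step-5: recurse
    return node

# Tiny made-up dataset: [attribute 0, attribute 1, class label]
rows = [["sunny", "yes", "accept"], ["rainy", "no", "decline"],
        ["sunny", "no", "accept"], ["rainy", "yes", "accept"]]
print(build_tree(rows, attributes=[0, 1]))
```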
Example: Suppose there is a candidate who has a job offer and wants to decide whether he should accept the offer or not.
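As a minimal illustration, such a tree can be written out as nested yes/no questions; the specific questions used here (acceptable salary, short commute, cab facility) are hypothetical placeholders chosen only to make the sketch concrete.

```python
def accept_offer(salary_ok, commute_short, cab_provided):
    # Hypothetical decision tree for the job-offer example: each internal
    # node asks a yes/no question, and each leaf is the final decision.
    if not salary_ok:                                   # root node
        return "Decline offer"
    if commute_short:                                   # decision node
        return "Accept offer"
    return "Accept offer" if cab_provided else "Decline offer"

print(accept_offer(salary_ok=True, commute_short=False, cab_provided=True))
```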
Attribute Selection Measures
The two popular techniques used as the Attribute Selection Measure (ASM) are:
o Information Gain
o Gini Index
1. Information Gain:
o It calculates how much information a feature provides us about a class.
o According to the value of information gain, we split the node and build the
decision tree.
o A decision tree algorithm always tries to maximize the value of
information gain, and a node/attribute having the highest information gain
is split first. It can be calculated using the below formula:
Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]
Entropy: Entropy is a metric to measure the impurity in a given attribute.
It specifies randomness in data. Entropy can be calculated as:
Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
Where,
o S = the set of samples at the node
o P(yes) = probability of 'yes' in S
o P(no) = probability of 'no' in S
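For example, if a node contains 14 samples of which 9 are 'yes' and 5 are 'no' (an illustrative split, not taken from a specific dataset), then Entropy(S) = −(9/14) log2(9/14) − (5/14) log2(5/14) ≈ 0.94; an attribute whose split produces purer subsets (lower weighted-average entropy) therefore gives a higher information gain.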
2. Gini Index:
o Gini index is a measure of impurity or purity used while creating a decision
tree in the CART(Classification and Regression Tree) algorithm.
o An attribute with a low Gini index should be preferred over one with a high Gini index.
o The Gini index produces only binary splits, and the CART algorithm relies on it to create those splits.
o Gini index can be calculated using the below formula:
Gini Index = 1 − ∑j Pj², where Pj is the proportion of samples belonging to class j at the node.
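A minimal sketch of this calculation in Python is shown below; the class proportions are made-up values used only to illustrate how the index behaves for impure and pure nodes.

```python
def gini_index(class_probabilities):
    # Gini Index = 1 - sum of squared class proportions at a node.
    return 1.0 - sum(p * p for p in class_probabilities)

# Made-up class proportions for illustration:
print(gini_index([0.6, 0.4]))  # 0.48 -> fairly impure node
print(gini_index([1.0, 0.0]))  # 0.0  -> pure node (only one class present)
```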
Advantages of the Decision Tree
o It is simple to understand, as it follows the same process a human follows when making a decision in real life.
o It can be very useful for solving decision-related problems.
o It helps to think about all the possible outcomes for a problem.
o It requires less data cleaning than many other algorithms.
Disadvantages of the Decision Tree
o A decision tree can contain many layers, which makes it complex.
o It may have an overfitting issue, which can be resolved using the Random
Forest algorithm.
o As the number of class labels grows, the computational complexity of the decision tree may increase.
Python Implementation of Decision Tree
The implementation follows the same general steps, which are given below:
o Data Pre-processing step
o Fitting a Decision-Tree algorithm to the Training set
o Predicting the test result
o Test accuracy of the result (creation of the confusion matrix)
o Visualizing the test set result.
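A minimal scikit-learn sketch of these steps is given below. The dataset (scikit-learn's built-in breast-cancer data), the 75/25 split, and plotting the fitted tree as the visualization step are assumptions made only to keep the example self-contained and runnable.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import confusion_matrix, accuracy_score
import matplotlib.pyplot as plt

# Data pre-processing step: load a dataset (the built-in breast-cancer data
# is used here only as a stand-in) and split it into training and test sets.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fitting a Decision Tree classifier to the training set
# (criterion="entropy" corresponds to splitting by information gain).
classifier = DecisionTreeClassifier(criterion="entropy", random_state=0)
classifier.fit(X_train, y_train)

# Predicting the test result.
y_pred = classifier.predict(X_test)

# Test accuracy of the result (creation of the confusion matrix).
print(confusion_matrix(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))

# Visualizing the result: here the fitted tree itself is plotted.
plot_tree(classifier, max_depth=2, filled=True)
plt.show()
```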