Decision Tree-31-01-2025
Duration: 55 min AIML Credit: 4 ML | AI3201
• The goal of using a Decision Tree is to create a training model that can be used to predict the class or value of the target variable by learning simple decision rules inferred from prior data (training data).
• In Decision Trees, to predict a class label for a record we start from the root of the tree. We compare the value of the root attribute with the record's corresponding attribute and, on the basis of the comparison, follow the branch for that value and jump to the next node.
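• A minimal Python sketch of this traversal (the node structure, attribute names, and the example tree below are hypothetical, for illustration only):

def predict(node, record):
    # Internal nodes are dicts that test one attribute; a leaf is a plain class label.
    while isinstance(node, dict):
        attribute = node["attribute"]       # attribute tested at this node
        value = record[attribute]           # the record's value for that attribute
        node = node["branches"][value]      # follow the branch matching the value
    return node                             # reached a leaf: return its class label

# Hypothetical tree and record.
tree = {"attribute": "Outlook",
        "branches": {"Sunny": {"attribute": "Humidity",
                               "branches": {"High": "No", "Normal": "Yes"}},
                     "Overcast": "Yes",
                     "Rain": "Yes"}}
print(predict(tree, {"Outlook": "Sunny", "Humidity": "Normal"}))   # -> Yes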
• Types of decision trees are based on the type of target variable we have. It can be of two types:
1. Categorical Variable Decision Tree: a Decision Tree that has a categorical target variable is called a Categorical Variable Decision Tree.
2. Continuous Variable Decision Tree: a Decision Tree that has a continuous target variable is called a Continuous Variable Decision Tree.
• Example: Let's say we have a problem of predicting whether a customer will pay his renewal premium with an insurance company (yes/no). Here we know that the income of customers is a significant variable, but the insurance company does not have income details for all customers. Since income is an important variable, we can build a decision tree to predict customer income based on occupation, product, and various other variables. In this case, we are predicting values for a continuous variable, so this is a Continuous Variable Decision Tree (whereas the yes/no renewal prediction itself would use a Categorical Variable Decision Tree).
Leaf / Terminal Node: Nodes that do not split further are called Leaf or Terminal nodes.
•Decision trees classify the examples by sorting them down the tree from the root to
some leaf/terminal node, with the leaf/terminal node providing the classification of
the example.
•Each node in the tree acts as a test case for some attribute, and each edge
descending from the node corresponds to the possible answers to the test case. This
process is recursive in nature and is repeated for every subtree rooted at the new
node.
Assumptions while creating Decision Tree
•Below are some of the assumptions we make while using a Decision Tree:
•In the beginning, the whole training set is considered as the root.
•Feature values are preferred to be categorical. If the values are continuous then they are
discretized prior to building the model.
•Records are distributed recursively on the basis of attribute values.
•The order of placing attributes as the root or as internal nodes of the tree is decided using a statistical approach.
•Decision Trees follow a Sum of Products (SOP) representation, also known as Disjunctive Normal Form (DNF). For a class, every branch from the root of the tree to a leaf node having that class is a conjunction (product) of attribute values, and the different branches ending in that class form a disjunction (sum); a small illustrative rule is shown after this list.
•The primary challenge in decision tree implementation is to identify which attribute to consider as the root node and at each level. Handling this is known as attribute selection. We have different attribute selection measures to identify the attribute that can be considered as the root node at each level.
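• As a small illustration of the SOP / DNF view (the attributes here are hypothetical, in the style of the classic play-tennis example): each root-to-leaf path is a conjunction (AND) of attribute tests, and all paths ending in the same class are joined by a disjunction (OR), e.g.

(Outlook = Overcast) \lor (Outlook = Sunny \land Humidity = Normal) \Rightarrow Play = Yes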
How do Decision Trees work?
•The decision of making strategic splits heavily affects a tree’s accuracy. The decision
criteria are different for classification and regression trees.
•Decision trees use multiple algorithms to decide to split a node into two or more sub-
nodes. The creation of sub-nodes increases the homogeneity of resultant sub-nodes. In
other words, we can say that the purity of the node increases with respect to the target
variable. The decision tree splits the nodes on all available variables and then selects the
split which results in the most homogeneous sub-nodes.
•The algorithm selection is also based on the type of target variable. Let us look at some
algorithms used in Decision Trees:
ID3 → (Iterative Dichotomiser 3)
C4.5 → (successor of ID3)
CART → (Classification And Regression Tree)
CHAID → (Chi-square Automatic Interaction Detection; performs multi-level splits when computing classification trees)
MARS → (multivariate adaptive regression splines)
The ID3 algorithm builds decision trees using a top-down greedy search approach through
the space of possible branches with no backtracking. A greedy algorithm, as the name
suggests, always makes the choice that seems to be the best at that moment.
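• A minimal Python sketch of this greedy, top-down choice, assuming records are given as (attribute-dictionary, label) pairs (the data layout and the toy rows are assumptions for illustration):

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a list of class labels.
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute):
    # Entropy before the split minus the weighted entropy of the resulting subsets.
    before = entropy([label for _, label in rows])
    after = 0.0
    for value in {features[attribute] for features, _ in rows}:
        subset = [label for features, label in rows if features[attribute] == value]
        after += (len(subset) / len(rows)) * entropy(subset)
    return before - after

def best_attribute(rows, attributes):
    # The greedy step: pick the locally best split, with no backtracking.
    return max(attributes, key=lambda a: information_gain(rows, a))

# Hypothetical toy records.
rows = [({"Outlook": "Sunny", "Windy": "No"}, "No"),
        ({"Outlook": "Sunny", "Windy": "Yes"}, "No"),
        ({"Outlook": "Overcast", "Windy": "No"}, "Yes"),
        ({"Outlook": "Rain", "Windy": "No"}, "Yes")]
print(best_attribute(rows, ["Outlook", "Windy"]))   # -> Outlook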
Attribute Selection Measures
• If the dataset consists of N attributes then deciding which attribute to place at the root
or at different levels of the tree as internal nodes is a complicated step. Just randomly selecting any node to be the root can't solve the issue; if we follow a random approach, it may give us bad results with low accuracy.
• For solving this attribute selection problem, researchers worked and devised some solutions. They suggested using criteria like:
• Entropy,
• Information gain,
• Gini index,
• Gain Ratio,
• Reduction in Variance
• Chi-Square
• These criteria calculate a value for every attribute. The values are sorted, and attributes are placed in the tree following that order, i.e., the attribute with the highest value (in the case of information gain) is placed at the root.
• While using Information Gain as a criterion, we assume attributes to be categorical,
and for the Gini index, attributes are assumed to be continuous.
Entropy
• Entropy is a measure of the randomness in the information being processed. The higher the
entropy, the harder it is to draw any conclusions from that information. Flipping a coin is an
example of an action that provides information that is random.
• For a binary variable, the entropy H(X) is zero when the probability is either 0 or 1. The entropy is maximum when the probability is 0.5, because that reflects perfect randomness in the data and there is no chance of perfectly determining the outcome.
• ID3 follows the rule: a branch with an entropy of zero is a leaf node, and a branch with entropy more than zero needs further splitting.
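• In its standard form (for a set S with c classes and class proportions p_i), the entropy referred to above is:

Entropy(S) = - \sum_{i=1}^{c} p_i \log_2 p_i

For a binary variable this reduces to H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p), which is 0 at p = 0 or p = 1 and reaches its maximum of 1 bit at p = 0.5.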
Information Gain
• Information gain or IG is a statistical property that measures how well a given
attribute separates the training examples according to their target classification.
Constructing a decision tree is all about finding an attribute that returns the highest
information gain and the smallest entropy.
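• In the standard weighted form (where each subset's entropy is weighted by its share of the records, w_j = |subset j| / |before|), the gain of a split is:

Gain(before, after) = Entropy(before) - \sum_{j=1}^{K} w_j \, Entropy(j, after)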
• Where “before” is the dataset before the split, K is the number of subsets generated by the
split, and (j, after) is subset j after the split.
Gini Index
• You can understand the Gini index as a cost function used to evaluate splits in the
dataset. It is calculated by subtracting the sum of the squared probabilities of each
class from one. It favours larger partitions and is easy to implement, whereas information gain favours smaller partitions with distinct values.
• Gini Index works with the categorical target variable “Success” or “Failure”. It
performs only Binary splits.
• A higher value of the Gini index implies higher inequality and higher heterogeneity.
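• In its standard form (for c classes with proportions p_i), the Gini index is:

Gini = 1 - \sum_{i=1}^{c} p_i^2

For a binary "Success"/"Failure" node with proportions p and q this is 1 - (p^2 + q^2); the steps below work with the complementary node score p^2 + q^2, where a higher score means a purer node.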
1. Calculate the Gini score for the sub-nodes from the probabilities of success (p) and failure (q): p² + q².
2. Calculate the Gini score for the split using the weighted Gini score of each node of that split; a short sketch of both steps is given below.
• CART (Classification and Regression Tree) uses the Gini index method to
create split points.
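• A minimal Python sketch of the two steps above, assuming a binary "Success"/"Failure" target (the labels and the candidate split below are hypothetical):

def gini_node(labels):
    # Step 1: node score p^2 + q^2 from the proportions of success (p) and failure (q).
    n = len(labels)
    p = labels.count("Success") / n
    q = labels.count("Failure") / n
    return p * p + q * q

def gini_split(left_labels, right_labels):
    # Step 2: weight each sub-node's score by its share of the records.
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini_node(left_labels) \
         + (len(right_labels) / n) * gini_node(right_labels)

# Hypothetical candidate split; CART would prefer the candidate split with the higher
# weighted score (equivalently, the lower Gini impurity 1 - score).
left = ["Success", "Success", "Failure"]
right = ["Failure", "Failure", "Failure", "Success"]
print(round(gini_split(left, right), 3))   # 0.595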
Gain ratio
• Information gain is biased towards choosing attributes with a large number of
values as root nodes. It means it prefers the attribute with a large number of
distinct values.
• C4.5, an improvement of ID3, uses Gain ratio which is a modification of
Information gain that reduces its bias and is usually the best option. Gain ratio
overcomes the problem with information gain by taking into account the number
of branches that would result before making the split. It corrects information gain
by taking the intrinsic information of a split into account.
• Consider a dataset of users and their movie genre preferences based on variables like gender, age group, rating, and so on. With the help of information gain, you split at ‘Gender’ (assuming it has the highest information gain); now the variables ‘Age Group’ and ‘Rating’ could be equally important, and the gain ratio penalizes the variable with more distinct values, which helps us decide the split at the next level.
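• In its standard C4.5 form, with w_j = |subset j| / |before| as before, the gain ratio divides the information gain by the split information (the entropy of the split itself):

GainRatio(before, after) = \frac{Entropy(before) - \sum_{j=1}^{K} w_j \, Entropy(j, after)}{- \sum_{j=1}^{K} w_j \log_2 w_j}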
• Where “before” is the dataset before the split, K is the number of subsets
generated by the split, and (j, after) is subset j after the split.
Reduction in Variance
• Reduction in Variance is used when the target variable is continuous (regression trees): the candidate split whose sub-nodes have the lowest weighted variance is selected. In the variance formula below, X̄ (X-bar) is the mean of the values, X is an actual (individual) value, and n is the number of values.
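• The variance of a node, in the standard form:

Variance = \frac{\sum (X - \bar{X})^2}{n}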
Chi-Square
• The acronym CHAID stands for Chi-squared Automatic Interaction Detector. It is
one of the oldest tree classification methods. It finds out the statistical significance of the differences between sub-nodes and the parent node. We
measure it by the sum of squares of standardized differences between observed
and expected frequencies of the target variable.
• It works with the categorical target variable “Success” or “Failure”. It can perform two or more splits. The higher the value of Chi-Square, the higher the statistical significance of the differences between the sub-node and the parent node.
• It generates a tree called CHAID (Chi-square Automatic Interaction Detector).
• Mathematically, Chi-squared is represented as:
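\chi^2 = \sum \frac{(Observed - Expected)^2}{Expected}

(the standard Pearson form, summing over the classes of the target variable in each sub-node)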
• In the pruning example, the ‘Age’ attribute on the left-hand side of the tree has been pruned because it carries more importance on the right-hand side of the tree, hence removing overfitting.
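• As an illustrative, library-level sketch of pruning (not part of the original slides): scikit-learn's cost-complexity pruning parameter ccp_alpha cuts back branches that contribute little, trading a bit of training fit for better generalization. The dataset and the alpha value here are arbitrary choices for demonstration.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A fully grown tree versus a cost-complexity pruned tree.
unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_train, y_train)

print("unpruned test accuracy:", unpruned.score(X_test, y_test))
print("pruned   test accuracy:", pruned.score(X_test, y_test))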
Random Forest
• Random Forest is an example of ensemble learning, in which we combine
multiple machine learning algorithms to obtain better predictive performance.
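• A minimal scikit-learn sketch of this idea (the dataset and parameters are illustrative assumptions, not from the slides): many decision trees are trained on bootstrap samples, each split considers a random subset of features, and the trees vote on the final class.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# An ensemble of 100 trees; each split looks at a random sqrt-sized subset of features.
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
print(cross_val_score(forest, X, y, cv=5).mean())   # mean cross-validated accuracy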
Bagging
• A technique known as bagging is used to create an ensemble of trees, where multiple training sets are generated by sampling with replacement.
• In the bagging technique, N bootstrap samples are drawn from the data set using randomized sampling with replacement. Then, using a single learning algorithm, a model is built on each sample. Finally, the resulting predictions are combined in parallel using voting (for classification) or averaging (for regression).
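• A minimal scikit-learn sketch of bagging (illustrative only; by default BaggingClassifier uses decision trees as the base models):

from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

# 50 bootstrap samples drawn with replacement; one tree per sample (the default
# base estimator is a decision tree); predictions are combined by majority vote.
X, y = load_iris(return_X_y=True)
bagged = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0)
print(cross_val_score(bagged, X, y, cv=5).mean())   # mean cross-validated accuracy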