
Unit 4 – Decision Trees

Prepared By: Nivetha Raju


Department of Computer Science and
Engineering
Decision Tree

• A decision tree is a graphical representation for obtaining all the possible solutions to a problem/decision based on given conditions.
• It is called a decision tree because, like a tree, it starts from the root node, which expands into further branches and builds a tree-like structure.
• To build the tree, the CART algorithm (Classification and Regression Tree) is used.
• A decision tree simply asks a question and, based on the answer (Yes/No), splits further into subtrees.
Decision Tree Terminologies
• Root Node: Root node is from where the decision tree
starts. It represents the entire dataset, which further
gets divided into two or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated further once a leaf node is reached.
• Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according to the given conditions.
• Branch/Sub-tree: A subtree formed by splitting a node of the tree.
• Pruning: Pruning is the process of removing unwanted branches from the tree.
• Parent/Child node: A node that is divided into sub-nodes is called the parent node, and the sub-nodes are called its child nodes.
Decision Tree

• It is a hierarchical data structure implementing the divide-and-conquer strategy.
• It is an efficient nonparametric method used for classification and regression.
Decision Tree
• A decision tree is a hierarchical model for supervised learning whereby the local region is
identified in a sequence of recursive splits in a smaller number of steps.
• It is a hierarchical data structure implementing the divide-and-conquer strategy.
• It is composed of internal decision nodes and terminal leaves
• It is also a nonparametric model in the sense that we do not assume any parametric form for the class densities; the tree structure is not fixed a priori, but the tree grows, and branches and leaves are added, during learning.
• Each decision node m implements a test function fm(x) with discrete outcomes labeling the
branches.
• Given an input, at each node, a test is applied and one of the branches is taken depending
on the outcome.
• This process starts at the root and is repeated recursively until a leaf node is hit, at which
point the value written in the leaf constitutes the output
• Each leaf node has an output label,
• classification - class code
• regression - numeric value.
• A leaf node defines a localized region in the input space where instances falling in this
region have the same labels (in classification), or very similar numeric outputs (in
regression).
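
As a minimal sketch of this recursive test-and-branch process (not the author's code; the Node layout and field names are assumptions), prediction with a univariate binary tree can be written as:

```python
# A minimal sketch of recursive decision-tree prediction.
# The node layout (feature index, threshold, children, leaf output) is assumed.

class Node:
    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, output=None):
        self.feature = feature      # index of the input dimension tested at this node
        self.threshold = threshold  # split threshold w_m for a numeric attribute
        self.left = left            # branch taken when x[feature] > threshold
        self.right = right          # branch taken when x[feature] <= threshold
        self.output = output        # class label (classification) or value (regression) at a leaf

def predict(node, x):
    """Start at the root and apply the test f_m(x) at each decision node
    until a leaf is hit; the value stored in the leaf is the output."""
    if node.output is not None:           # leaf node: return its label/value
        return node.output
    if x[node.feature] > node.threshold:  # test function f_m(x)
        return predict(node.left, x)
    return predict(node.right, x)

# Example: a tiny tree on one numeric feature x0 with threshold 50.
tree = Node(feature=0, threshold=50,
            left=Node(output="yes"), right=Node(output="no"))
print(predict(tree, [72]))  # -> "yes"
```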
Example
Univariate Trees

• The test uses only one of the input dimensions.
• If the used input dimension is discrete, the decision node checks its value and takes the corresponding branch, implementing an n-way split.
Binary Split
• Numeric xi: binary split: xi > wm,
  where wm is a suitably chosen threshold value.
• The decision node divides the input space into two:
  Lm = {x | xi > wm} and Rm = {x | xi ≤ wm}
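
A small illustrative sketch of such a binary split on a numeric attribute (the function name binary_split and the NumPy-based data layout are assumptions):

```python
import numpy as np

def binary_split(X, feature, threshold):
    """Partition samples by the univariate test x[feature] > threshold.

    Returns (L_m, R_m): rows taking the 'greater than' branch and the rest."""
    mask = X[:, feature] > threshold
    return X[mask], X[~mask]

X = np.array([[30.0], [45.0], [60.0], [80.0]])
L, R = binary_split(X, feature=0, threshold=50.0)
# L contains the rows with x0 > 50, R the rows with x0 <= 50
```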
Decision Trees

• Tree induction is the construction of the tree from a training sample.
• There may be many trees that fit a given sample; we are interested in finding the smallest one (in terms of the number of nodes and the complexity of the decision nodes), but finding the smallest tree is NP-complete.
• So we are forced to use local search procedures based on heuristics that give reasonable trees in reasonable time.
• Tree learning algorithms are greedy: at each step, starting at the root with the complete training data, we look for the best split.
Classification Trees
• For node m, Nm instances reach m, and N_m^i of them belong to class Ci:

  $\hat{P}(C_i \mid x, m) \equiv p_m^i = \dfrac{N_m^i}{N_m}$

• Node m is pure if $p_m^i$ is 0 or 1 for all i.
• One measure of impurity is entropy:

  $I_m = -\sum_{i=1}^{K} p_m^i \log_2 p_m^i$

Entropy is a measure of impurity: high entropy means a high level of disorder (i.e., a low level of purity).
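
A minimal sketch of this impurity measure, computing the entropy of a node from its per-class counts (the function name is an assumption):

```python
import numpy as np

def node_entropy(class_counts):
    """Entropy I_m = -sum_i p_m^i log2 p_m^i, from per-class counts N_m^i."""
    counts = np.asarray(class_counts, dtype=float)
    p = counts / counts.sum()      # p_m^i = N_m^i / N_m
    p = p[p > 0]                   # terms with p = 0 contribute 0
    return float(np.sum(p * np.log2(1.0 / p)))   # same as -sum p * log2(p)

print(node_entropy([8, 8]))    # maximally impure two-class node -> 1.0
print(node_entropy([16, 0]))   # pure node -> 0.0
```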
Best Split

• If node m is pure, generate a leaf and stop; otherwise split and continue recursively.
• Impurity after the split: of the Nm instances, Nmj take branch j, and N_mj^i of them belong to class Ci:

  $\hat{P}(C_i \mid x, m, j) \equiv p_{mj}^i = \dfrac{N_{mj}^i}{N_{mj}}$

• Find the variable and split that minimize the impurity after the split (among all variables, and among all split positions for numeric variables):

  $I'_m = -\sum_{j=1}^{n} \frac{N_{mj}}{N_m} \sum_{i=1}^{K} p_{mj}^i \log_2 p_{mj}^i$
Information Gain:
• Information gain is the measurement of changes in
entropy after the segmentation of a dataset based on
an attribute.
• It calculates how much information a feature provides
us about a class.
• According to the value of information gain, we split the
node and build the decision tree.
• A decision tree algorithm always tries to maximize the
value of information gain, and a node/attribute having
the highest information gain is split first. It can be
calculated using the below formula:

• Information Gain = Entropy(S) − [Weighted Avg. × Entropy(each feature)]
Information Gain:
• Entropy: Entropy is a metric to measure the impurity in
a given attribute. It specifies randomness in data.
Entropy can be calculated as:

• Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
• Where,
• S = the set of all samples
• P(yes) = probability of yes
• P(no) = probability of no
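
Putting the two formulas together, a minimal sketch of information gain for a binary (yes/no) target; the function names and the example counts are assumptions:

```python
import math

def entropy_yes_no(n_yes, n_no):
    """Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)."""
    total = n_yes + n_no
    e = 0.0
    for n in (n_yes, n_no):
        if n > 0:
            p = n / total
            e -= p * math.log2(p)
    return e

def information_gain(parent_counts, child_counts_per_branch):
    """Information Gain = Entropy(S) - weighted average of the child entropies."""
    parent_total = sum(parent_counts)
    weighted_child = sum(sum(c) / parent_total * entropy_yes_no(*c)
                         for c in child_counts_per_branch)
    return entropy_yes_no(*parent_counts) - weighted_child

# Illustrative counts only: 16 yes / 14 no, split into two branches.
print(information_gain((16, 14), [(12, 1), (4, 13)]))  # about 0.38
```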
Gini Index:
• The Gini index is a measure of impurity (or purity) used while creating a decision tree in the CART (Classification and Regression Tree) algorithm.
• An attribute with a low Gini index should be preferred over one with a high Gini index.
• CART creates only binary splits, and it uses the Gini index to choose them.
• The Gini index can be calculated using the formula below:

  $\text{Gini Index} = 1 - \sum_{j} P_j^{2}$
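
A minimal sketch of the Gini computation (the function name and example counts are assumptions):

```python
def gini_index(class_counts):
    """Gini Index = 1 - sum_j P_j^2, where P_j is the proportion of class j."""
    total = sum(class_counts)
    return 1.0 - sum((n / total) ** 2 for n in class_counts)

print(gini_index([16, 14]))  # mixed node -> about 0.498
print(gini_index([30, 0]))   # pure node  -> 0.0
```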


Classification tree construction
1. GenerateTree(X):
Step 1: Check whether the entropy of the current node is below a threshold (θ). If yes, create a leaf node labeled with the majority class.
Step 2: If not, select the best attribute to split on using SplitAttribute(X).
Step 3: For each branch of the selected attribute, generate the tree recursively for the subset of data that falls into that branch.
2. SplitAttribute(X):
Step 1: Initialize the minimum entropy (MinEnt) to a maximum value.
Step 2: For each attribute, check whether it is discrete or numeric:
Discrete attributes: split the data into subsets based on the values of the attribute and compute the entropy of the split.
Numeric attributes: consider all possible split points for the attribute and calculate the entropy of each.
Step 3: Choose the attribute and split that minimize the entropy.
Step 4: Return the best attribute (bestf) with the lowest entropy.
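
Below is a hedged, self-contained sketch of GenerateTree/SplitAttribute for discrete attributes only; the data layout, names, and the value of the threshold θ are assumptions, not the author's implementation:

```python
import math
from collections import Counter

THETA = 0.3  # assumed entropy threshold for creating a leaf

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_attribute(rows, labels, attributes):
    """Return the attribute whose weighted split entropy is minimal."""
    best_f, min_ent = None, float("inf")
    for f in attributes:
        ent = 0.0
        for v in set(r[f] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[f] == v]
            ent += len(subset) / len(labels) * entropy(subset)
        if ent < min_ent:
            best_f, min_ent = f, ent
    return best_f

def generate_tree(rows, labels, attributes):
    """Recursively grow a classification tree, represented as nested dicts."""
    if entropy(labels) < THETA or not attributes:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class
    f = split_attribute(rows, labels, attributes)
    tree = {"attribute": f, "branches": {}}
    for v in set(r[f] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[f] == v]
        tree["branches"][v] = generate_tree(
            [rows[i] for i in idx], [labels[i] for i in idx],
            [a for a in attributes if a != f])
    return tree

# Toy data loosely modeled on the loan example in the slides (values illustrative).
rows = [{"Balance": "<50K", "Residence": "OWN"},
        {"Balance": "<50K", "Residence": "RENT"},
        {"Balance": ">50K", "Residence": "OWN"},
        {"Balance": ">50K", "Residence": "OTHER"}]
labels = ["write-off", "write-off", "non-write-off", "non-write-off"]
print(generate_tree(rows, labels, ["Balance", "Residence"]))
```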
Example-Entropy
• Consider an example where we are building a decision tree to
predict whether a loan given to a person would result in a write-
off or not. Our entire population consists of 30 instances. 16
belong to the write-off class and the other 14 belong to the non-
write-off class. We have two features, namely “Balance” that can
take on two values -> “< 50K” or “>50K” and “Residence” that
can take on three values -> “OWN”, “RENT” or “OTHER”.
• We want to see how a decision tree algorithm would decide which attribute to split on first, i.e., which of the two features provides more information about (or reduces more uncertainty in) our target variable, using the concept of entropy.
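
As a quick numeric check (not part of the original slides), the entropy of the parent node follows directly from the class counts given above:

```python
import math

# Parent node of the loan example: 16 write-off, 14 non-write-off out of 30.
p_wo, p_nwo = 16 / 30, 14 / 30
parent_entropy = -p_wo * math.log2(p_wo) - p_nwo * math.log2(p_nwo)
print(round(parent_entropy, 3))  # about 0.997 -> an almost maximally impure node
```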
Solution – step 1
• Feature 1 – Balance
Step 2
Feature 2 - Residence
Solution - Inference

• E(Balance) < E(Residence) => the child nodes from splitting on Balance are purer than those from splitting on Residence.
• If you look at the graph, however, the left-most node for Residence is also very pure, but this is where the weighted averages come into play.
• Even though that node is very pure, it contains the smallest share of the total observations, and as a result it contributes only a small portion of its purity when we calculate the total entropy from splitting on Residence.
• Conclusion: a decision tree algorithm would use this result to make the first split on our data using Balance.
Regression Trees
A regression tree is a decision tree where the values at the endpoint
nodes are continuous rather than discrete. That is, the regression tree
predicts a real-valued output rather than a class label
Ex: Predicting a person’s salary based on their age and occupation

Let us say that, for node m, Xm is the subset of X reaching node m; namely, it is the set of all x ∈ X satisfying all the conditions in the decision nodes on the path from the root until node m.

In regression, the goodness of a split is measured by the mean square error from the estimated value.
Regression Trees
• If at a node, the error is acceptable, that is, Em < θr , then a
leaf node is created and it stores the gm value.
• If the error is not acceptable, data reaching node m is split
further such that the sum of the errors in the branches is
minimum.
• As in classification, at each node, we look for the attribute
(and split threshold for a numeric attribute) that minimizes
the error, and then we continue recursively.
$b_{mj}(x) = \begin{cases} 1 & \text{if } x \in X_{mj}: x \text{ reaches node } m \text{ and takes branch } j \\ 0 & \text{otherwise} \end{cases}$

where $g_{mj}$, the estimated value in branch j, is the mean of the required outputs $r^t$ of the instances taking that branch, and $E'_m$ is the total error after the split:

$g_{mj} = \dfrac{\sum_t b_{mj}(x^t)\, r^t}{\sum_t b_{mj}(x^t)}$

$E'_m = \dfrac{1}{N_m} \sum_j \sum_t \left(r^t - g_{mj}\right)^2\, b_{mj}(x^t)$
Model Selection in Trees
Pruning Trees
• Any decision based on too few instances causes high variance and hence generalization error.
• Removing subtrees gives better generalization (decreases variance).
• Prepruning: Stopping tree construction early on before it is
full is called prepruning the tree.
• Postpruning: Grow the tree full until all leaves are pure and
we have no training error. We then find subtrees that cause
overfitting and we prune them.
• Prepruning is faster, postpruning is more accurate (requires a
separate pruning set)
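
As a practical illustration (not from the slides), postpruning can be approximated with cost-complexity pruning in scikit-learn, using a separate validation set as the pruning set to pick the pruning strength:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Grow the tree fully, then evaluate increasingly pruned versions on a
# separate validation (pruning) set and keep the best one.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)
best_tree, best_score = None, -1.0
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)
    score = tree.score(X_val, y_val)
    if score >= best_score:
        best_tree, best_score = tree, score
print(best_score, best_tree.get_n_leaves())
```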
Rule Extraction from Trees
• A decision tree does its own feature extraction.
• We can build a tree and then take only those features used by the tree as inputs to another learning method.
• Another main advantage of decision trees is interpretability: the decision nodes carry conditions that are simple to understand.
• Rule base: set of IF-THEN rules
• Knowledge Extraction: rule base allows knowledge extraction. It allows experts to verify the
model learned from data.
• Rule support - the percentage of training data covered by the rule
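
For instance (a tooling choice assumed here, not taken from the slides), scikit-learn can print the decision nodes of a fitted tree as readable IF-THEN-style conditions:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
# Print the tree's conditions in a human-readable, rule-like form.
print(export_text(tree, feature_names=load_iris().feature_names))
```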
Learning Rules
• Rule induction is similar to tree induction but
• tree induction is breadth-first,
• rule induction is depth-first; one rule at a time
• Rule set contains rules; rules are conjunctions of terms
• A rule is said to cover an example if the example satisfies all the conditions of
the rule.
• Sequential covering: Generate rules one at a time until all positive examples
are covered
• Example: IREP (Fürnkranz and Widmer, 1994), Ripper (Cohen, 1995)
IREP
• IREP (Incremental Reduced Error Pruning) is an algorithm used for rule induction in machine learning. It builds a set of rules (a rule set) to classify data, learning one rule at a time, to capture patterns in the data.
• Steps of IREP:
1.Generate a rule: It starts by finding a rule that covers many
positive examples (examples belonging to the class of interest).
2.Prune the rule: It simplifies the rule by removing unnecessary parts
to avoid overfitting to the training data.
3.Remove covered examples: Once a rule is created, the examples
covered by that rule are removed from the dataset.
4.Repeat: The algorithm repeats this process, generating more rules,
until all the positive examples are covered.
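
A hedged skeleton of this sequential-covering loop; rule growing and pruning are left as abstract callables, and all names are assumptions:

```python
def sequential_covering(examples, learn_one_rule, prune_rule, covers):
    """Greedily learn rules one at a time until no positive examples remain.

    learn_one_rule(examples) -> a rule covering many positive examples
    prune_rule(rule, examples) -> a simplified rule (reduced-error pruning)
    covers(rule, example) -> True if the example satisfies all rule conditions
    """
    rule_set = []
    remaining = list(examples)
    while any(ex["label"] == "positive" for ex in remaining):
        rule = prune_rule(learn_one_rule(remaining), remaining)
        still_uncovered = [ex for ex in remaining if not covers(rule, ex)]
        if len(still_uncovered) == len(remaining):
            break                       # guard: the new rule covers nothing
        rule_set.append(rule)
        remaining = still_uncovered     # remove covered examples and repeat
    return rule_set
```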
Multivariate Trees
In a multivariate tree, at a decision node, all input dimensions can be used and thus it is
more general.
Example Problems
1. https://www.youtube.com/watch?v=zNYdkpAcP-g&t=595s

2. https://www.youtube.com/watch?v=wefc_36d5mU&t=52s

3. https://www.youtube.com/watch?v=y6VwIcZAUkI&t=113s
