
Decision Trees

Decision Trees
• A Decision Tree is a non-parametric supervised learning technique that can
be used for both classification and regression problems.

• It is a tree-structured classifier in which internal nodes represent the
features of a dataset, branches represent the decision rules, and each leaf
node represents the outcome.

• A decision tree contains two types of nodes: decision nodes and leaf nodes.

• Decision nodes are used to make decisions and have multiple branches.

• Leaf nodes are the outcomes of those decisions and do not contain any further
branches.
Decision Tree
• The decisions or tests are performed on the basis of the features of the
given dataset.

• It is a graphical representation for obtaining all possible solutions to a
problem/decision based on given conditions.

• A decision tree simply asks a question and, based on the answer (Yes/No),
splits further into subtrees.
Examples
Types of Decision Trees
• There are two main types of Decision Trees:

• Classification trees (Yes/No types): the decision or outcome variable is
categorical, such as Yes/No.

• Regression trees (continuous data types): the decision or outcome variable is
continuous, e.g. a number like 123.

• Iterative Dichotomiser 3 (ID3) is a classical algorithm for building
classification trees.
Why use Decision Trees?

• Decision Trees mimic the way humans make decisions, so they are easy to
understand.

• The logic behind a decision tree can be easily followed because it shows a
tree-like structure.
Decision Tree Terminologies
• Root Node: The root node is where the decision tree starts. It represents the
entire dataset, which is further divided into two or more homogeneous sets.

• Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split
further after reaching a leaf node.

• Splitting: Splitting is the process of dividing a decision node/root node into
sub-nodes according to the given conditions.

• Branch/Sub-Tree: A subtree formed by splitting the tree.

• Pruning: Pruning is the process of removing unwanted branches from the tree.

• Parent/Child node: A node that is split into sub-nodes is called their parent
node, and the resulting sub-nodes are called its child nodes.
Working principles of the Decision Tree algorithm
• Step-1: Begin the tree with the root node, say S, which contains the complete
dataset.

• Step-2: Find the best attribute in the dataset using an Attribute Selection
Measure (ASM).

• Step-3: Divide S into subsets, one for each possible value of the best
attribute.

• Step-4: Generate the decision tree node that contains the best attribute.

• Step-5: Recursively build new decision trees from the subsets created in
Step-3. Continue this process until a stage is reached where the nodes cannot
be split further; such final nodes are called leaf nodes. A minimal sketch of
this recursive loop is shown below.
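
The steps above can be written as a short recursive procedure. The following Python sketch is illustrative only and is not part of the original slides; build_tree and asm_score are hypothetical names, and asm_score stands in for whichever Attribute Selection Measure is chosen in Step-2.

# Minimal sketch of Steps 1-5, assuming each row is a dictionary and
# asm_score(rows, attribute, target) returns the score of splitting on attribute.
def build_tree(rows, attributes, asm_score, target="label"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                  # Step-5 stop: node is pure -> leaf
        return labels[0]
    if not attributes:                         # no attributes left -> majority-class leaf
        return max(set(labels), key=labels.count)
    best = max(attributes, key=lambda a: asm_score(rows, a, target))   # Step-2
    node = {best: {}}                          # Step-4: decision node for the best attribute
    for value in set(r[best] for r in rows):   # Step-3: one subset per attribute value
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        node[best][value] = build_tree(subset, remaining, asm_score, target)   # Step-5: recurse
    return node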
Attribute Selection Measures
• An Attribute Selection Measure (ASM) is used to select the best attribute for
the nodes of the tree. Two popular measures are:

• Information Gain
• Gini Index

• Information Gain
• Information gain is the change in entropy after segmenting a dataset on an
attribute.
• It measures how good an attribute is at predicting the class of each training
example.
• The node is split on the attribute with the highest information gain, and the
decision tree is built accordingly.
Entropy
• Entropy, also called Shannon entropy and denoted H(S) for a finite set S, is a
measure of the amount of uncertainty or randomness in the data.

Entropy H(S) = -P(Yes) log₂ P(Yes) - P(No) log₂ P(No)

Where,
• S = the set of samples
• P(Yes) = probability of Yes
• P(No) = probability of No
Example
• For the set S = {Y, Y, Y, N, N, N, N, N}
• Total instances: 8
• Instances of N: 5
• Instances of Y: 3

Entropy H(S) = -(3/8) log₂(3/8) - (5/8) log₂(5/8) ≈ 0.954

• If the number of Yes equals the number of No, then P(Yes) = P(No) = 0.5 and
Entropy(S) = 1.
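
The formula and the worked example can be checked with a few lines of Python. This is a small illustrative helper, not from the slides, assuming the class labels are given as a plain list:

import math

def entropy(labels):
    # H(S) = -sum over classes of p * log2(p)
    total = len(labels)
    return -sum(p * math.log2(p)
                for p in (labels.count(c) / total for c in set(labels))
                if p > 0)

print(entropy(["Y"] * 3 + ["N"] * 5))   # S = {Y,Y,Y,N,N,N,N,N} -> about 0.954
print(entropy(["Y", "N"]))              # equal Yes/No counts   -> 1.0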
Information Gain
• Information Gain = Entropy(S) - [(Weighted Avg) × Entropy(each subset)]

or

Information Gain = H(S) - H(S|X)
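
Building on the entropy helper sketched above, the weighted-average form of information gain can be written directly. The list-of-dictionaries data layout and the function name are assumptions made for illustration:

def information_gain(rows, attribute, target="label"):
    # IG = H(S) - sum over values v of (|Sv| / |S|) * H(Sv)
    base = entropy([r[target] for r in rows])
    total = len(rows)
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return base - remainder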


Gini Index
• The Gini index is a measure of impurity or purity used while creating a
decision tree in the CART (Classification and Regression Tree) algorithm.
• An attribute with a low Gini index should be preferred over one with a high
Gini index.
• CART creates only binary splits, and it uses the Gini index to choose them.
• The Gini index can be calculated using the formula below:

Gini Index = 1 - Σⱼ (Pⱼ)², where Pⱼ is the probability of class j.
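
As with entropy, the Gini index is straightforward to compute from the class proportions. This helper follows the 1 - Σ P² form given above and is a sketch, not CART's actual implementation:

def gini_index(labels):
    # Gini = 1 - sum over classes of p^2 ; 0 means a perfectly pure node
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

print(gini_index(["Y"] * 3 + ["N"] * 5))   # mixed node -> about 0.469
print(gini_index(["Y", "Y", "Y"]))         # pure node  -> 0.0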
Decision Tree classifier (ID3)
Decision tree generation consists of two phases:

Tree construction
• Initially, all the training examples are at the root.
• Attributes are categorical (if continuous-valued, they are discretized in
advance).
• Examples are partitioned based on selected attributes.
• Attributes are selected on the basis of a heuristic or statistical measure
(e.g., information gain).

Tree pruning
• Identify and remove branches that reflect noise or outliers.
Decision Tree classifier (ID3)
ID3 Steps
• Calculate the Information Gain of each feature.

• If not all rows belong to the same class, split the dataset S into subsets
using the feature for which the Information Gain is maximum.

• Make a decision tree node using the feature with the maximum Information
Gain.

• If all rows belong to the same class, make the current node a leaf node with
that class as its label.

• Repeat for the remaining features until we run out of features or the
decision tree consists entirely of leaf nodes. A sketch combining these steps
with the helpers above is given below.
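
Plugging the information_gain helper into the build_tree skeleton sketched earlier gives an ID3-style learner; a small traversal function then classifies new rows. Both function names are hypothetical and the code remains a sketch under the same assumptions as before:

def id3(rows, attributes, target="label"):
    # ID3 = recursive splitting with information gain as the Attribute Selection Measure
    return build_tree(rows, attributes, information_gain, target)

def predict(tree, row):
    # Walk internal nodes of the form {attribute: {value: subtree}} down to a leaf label.
    # Note: an attribute value never seen during training raises a KeyError in this sketch.
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        tree = tree[attribute][row[attribute]]
    return tree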
Example: dataset of COVID-19 infection

ID   Fever   Cough   Breathing Issues   Infected
1    NO      NO      NO                 NO
2    YES     YES     YES                YES
3    YES     YES     NO                 NO
4    YES     NO      YES                YES
5    YES     YES     YES                YES
6    NO      YES     NO                 NO
7    YES     NO      YES                YES
8    YES     NO      YES                YES
9    NO      YES     YES                YES
10   YES     YES     NO                 YES
11   NO      YES     NO                 NO
12   NO      YES     YES                YES
13   NO      YES     YES                NO
14   YES     YES     NO                 NO
Example Cont’d
• Of the 14 rows in our dataset S, there are 8 rows with the target value YES
and 6 rows with the target value NO.

• The entropy of S is calculated as:

Entropy H(S) = -(8/14) * log₂(8/14) - (6/14) * log₂(6/14)

= 0.99

• The next step is to calculate the Information Gain for each feature.


Example Cont’d
• Information Gain calculation for Fever:
In our dataset there are 8 rows with YES for Fever; among them, 6 rows have
target value YES and 2 rows have target value NO.

Total rows: |S| = 14

For v = YES, |Sᵥ| = 8
Entropy(Sᵥ) = -(6/8) * log₂(6/8) - (2/8) * log₂(2/8) = 0.81

For v = NO, |Sᵥ| = 6
Entropy(Sᵥ) = -(2/6) * log₂(2/6) - (4/6) * log₂(4/6) = 0.91

Expanding the summation in the IG formula:

IG(S, Fever) = Entropy(S) - (|Sʏᴇꜱ| / |S|) * Entropy(Sʏᴇꜱ) - (|Sɴᴏ| / |S|) *
Entropy(Sɴᴏ)

IG(S, Fever) = 0.99 - (8/14) * 0.81 - (6/14) * 0.91 = 0.13


Example Cont’d
• Information Gain calculation for Cough:
In our dataset there are 10 rows with YES for Cough; among them, 5 rows have
target value YES and 5 rows have target value NO. The remaining 4 rows
(Cough = NO) have 3 YES and 1 NO. The 10 rows with Cough = YES are shown below:

Fever   Cough   Breathing Issues   Infected
YES     YES     YES                YES
YES     YES     NO                 NO
YES     YES     YES                YES
NO      YES     NO                 NO
NO      YES     YES                YES
YES     YES     NO                 YES
NO      YES     NO                 NO
NO      YES     YES                YES
NO      YES     YES                NO
YES     YES     NO                 NO

Total rows: |S| = 14

For v = YES, |Sᵥ| = 10
Entropy(Sᵥ) = -(5/10) * log₂(5/10) - (5/10) * log₂(5/10) = 1.00

For v = NO, |Sᵥ| = 4
Entropy(Sᵥ) = -(3/4) * log₂(3/4) - (1/4) * log₂(1/4) = 0.81

Expanding the summation in the IG formula:

IG(S, Cough) = Entropy(S) - (|Sʏᴇꜱ| / |S|) * Entropy(Sʏᴇꜱ) - (|Sɴᴏ| / |S|) *
Entropy(Sɴᴏ)

IG(S, Cough) = 0.99 - (10/14) * 1.00 - (4/14) * 0.81 = 0.04


Example Cont’d
• Information Gain calculation for Breathing Issues:
In our dataset there are 8 rows with YES for Breathing Issues; among them,
7 rows have target value YES and 1 row has target value NO. These 8 rows are
shown below:

ID   Fever   Cough   Breathing Issues   Infected
2    YES     YES     YES                YES
4    YES     NO      YES                YES
5    YES     YES     YES                YES
7    YES     NO      YES                YES
8    YES     NO      YES                YES
9    NO      YES     YES                YES
12   NO      YES     YES                YES
13   NO      YES     YES                NO

Total rows: |S| = 14

For v = YES, |Sᵥ| = 8
Entropy(Sᵥ) = -(7/8) * log₂(7/8) - (1/8) * log₂(1/8) = 0.54

For v = NO, |Sᵥ| = 6
Entropy(Sᵥ) = -(1/6) * log₂(1/6) - (5/6) * log₂(5/6) = 0.65

Expanding the summation in the IG formula:

IG(S, Breathing Issues) = Entropy(S) - (|Sʏᴇꜱ| / |S|) * Entropy(Sʏᴇꜱ) -
(|Sɴᴏ| / |S|) * Entropy(Sɴᴏ)

IG(S, Breathing Issues) = 0.99 - (8/14) * 0.54 - (6/14) * 0.65 = 0.40
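
The three information gain values above can be reproduced by running the earlier entropy and information_gain sketches on the 14-row table; the data list below simply transcribes that table:

data = [
    {"Fever": f, "Cough": c, "Breathing Issues": b, "Infected": i}
    for f, c, b, i in [
        ("NO", "NO", "NO", "NO"),     ("YES", "YES", "YES", "YES"),
        ("YES", "YES", "NO", "NO"),   ("YES", "NO", "YES", "YES"),
        ("YES", "YES", "YES", "YES"), ("NO", "YES", "NO", "NO"),
        ("YES", "NO", "YES", "YES"),  ("YES", "NO", "YES", "YES"),
        ("NO", "YES", "YES", "YES"),  ("YES", "YES", "NO", "YES"),
        ("NO", "YES", "NO", "NO"),    ("NO", "YES", "YES", "YES"),
        ("NO", "YES", "YES", "NO"),   ("YES", "YES", "NO", "NO"),
    ]
]

print(round(entropy([r["Infected"] for r in data]), 2))            # 0.99
for feature in ["Fever", "Cough", "Breathing Issues"]:
    print(feature, round(information_gain(data, feature, "Infected"), 2))
# Fever 0.13, Cough 0.04, Breathing Issues 0.4 -> Breathing Issues becomes the root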
Example Cont’d
• Since the feature Breathing Issues has the highest Information Gain, it is
used to create the root node. Hence, after this initial step our tree looks
like this:

• Next, from the remaining two unused features, namely Fever and Cough, we
decide which one is best for the left branch of Breathing Issues.
Example Cont’d

• Since the left branch of Breathing Issues denotes YES, we work with the
subset of the original data, i.e. the set of rows having YES in the Breathing
Issues column. These 8 rows are shown below:

Information Gain(Sʙʏ, Fever) = 0.20
Information Gain(Sʙʏ, Cough) = 0.09

• The IG of Fever is greater than that of Cough, so we select Fever as the left
branch of Breathing Issues (the small check below reproduces these values).
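
The same information_gain helper, restricted to the rows where Breathing Issues is YES, reproduces the subset gains quoted above, assuming the data list from the previous sketch:

breathing_yes = [r for r in data if r["Breathing Issues"] == "YES"]
for feature in ["Fever", "Cough"]:
    print(feature, round(information_gain(breathing_yes, feature, "Infected"), 2))
# Fever 0.2, Cough 0.09 -> Fever becomes the left child of Breathing Issues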
Example Cont’d

Our tree now looks like this:

But, since there is only one unused feature left, we have no other choice but to make it the right
branch of the root node. So our tree now looks like this:
Example Cont’d

• There are no more unused features, so we stop here and move to the final step
of creating the leaf nodes. For the left leaf node of Fever, we look at the
subset of rows from the original dataset that have both Breathing Issues and
Fever equal to YES.

• Since all the values in the target column are YES, we label the left leaf node
as YES, but to make it more descriptive we label it Infected.

• Similarly, for the right node of Fever we look at the subset of rows from the
original dataset that have Breathing Issues = YES and Fever = NO.
Example Cont’d

• Here the values are mixed (rows 9, 12 and 13 give two YES and one NO), so the
leaf is labelled with the majority class of this subset. We repeat the same
process for the node Cough (the right branch of the root); here both left and
right leaves turn out to be the same, i.e. NO or Not Infected, as shown below:
