ML Lec-12
LECTURE-12
BY
Dr. Ramesh Kumar Thakur
Assistant Professor (II)
School Of Computer Engineering
v Decision Tree is a supervised learning technique that can be used for both classification and
regression problems, but it is mostly preferred for solving classification problems.
v It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches
represent the decision rules and each leaf node represents the outcome.
v In a Decision tree, there are two types of nodes: the decision node and the leaf node. Decision nodes are
used to make decisions and have multiple branches, whereas leaf nodes are the outputs of those
decisions and do not contain any further branches.
v The decisions or tests are performed on the basis of the features of the given dataset.
v It is a graphical representation for getting all the possible solutions to a problem/decision based on
given conditions.
v It is called a decision tree because, similar to a tree, it starts with the root node, which expands on further
branches and constructs a tree-like structure.
v Invented by Ross Quinlan, ID3 uses a top-down greedy approach to build a decision tree.
v In simple words, the top-down approach means that we start building the tree from the top and the
greedy approach means that at each iteration we select the best feature at the present moment to
create a node.
v In general, ID3 is used only for classification problems with nominal (categorical) features.
v ID3 algorithm selects the best feature at each step while building a Decision tree.
v So the answer to the question: ‘How does ID3 select the best feature?’ is that ID3 uses Information Gain
or just Gain to find the best feature.
v Information Gain calculates the reduction in the entropy and measures how well a given feature
separates or classifies the target classes.
v The feature with the highest Information Gain is selected as the best one.
v In simple words, Entropy is the measure of disorder and the Entropy of a dataset is the measure of
disorder in the target feature of the dataset.
v In the case of binary classification (where the target column has only two types of classes), entropy is 0 if
all values in the target column are homogeneous (i.e. belong to the same class) and is 1 if the target column
has an equal number of values for both classes.
v Denoting our dataset as S, the entropy is calculated as:
Entropy(S) = - ∑ pᵢ * log₂(pᵢ) ; i = 1 to n
v where,
v n is the total number of classes in the target column (in our case n = 2, i.e. YES and NO)
v pᵢ is the probability of class ‘i’ or the ratio of “number of rows with class i in the target column” to the
“total number of rows” in the dataset.
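v As a small illustration (not part of the lecture slides), the entropy of a target column can be computed with a
few lines of Python; the entropy() helper below is an assumed name:

import math
from collections import Counter

def entropy(target_values):
    """Entropy of a list of class labels, e.g. ['YES', 'NO', 'YES', ...]."""
    total = len(target_values)
    counts = Counter(target_values)
    # Sum of -p_i * log2(p_i) over the classes present in the column
    return sum(-(count / total) * math.log2(count / total) for count in counts.values())

print(entropy(['YES'] * 7 + ['NO'] * 7))   # 1.0 -> equal split, maximum disorder
print(entropy(['YES'] * 14))               # 0.0 -> homogeneous column, no disorder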
I. Calculate the Information Gain of each feature.
II. Considering that all rows don't belong to the same class, split the dataset S into subsets using the feature
for which the Information Gain is maximum.
III. Make a decision tree node using the feature with the maximum Information Gain.
IV. If all (or most) of the rows belong to the same class, make the current node a leaf node with that class as
its label.
V. Repeat for the remaining features until we run out of features, or the decision tree has all leaf nodes (see
the sketch below).
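v A minimal Python sketch of these steps is given below, assuming the dataset is a list of row dictionaries and
reusing the entropy() helper sketched above. The names information_gain, id3, rows, features and target are
illustrative (not from the lecture), and the gain used is the standard ID3 one:
IG(S, A) = Entropy(S) − ∑ᵥ (|Sᵥ| / |S|) · Entropy(Sᵥ)

def information_gain(rows, feature, target):
    """Entropy of the whole set minus the weighted entropy of each subset split on `feature`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [row for row in rows if row[feature] == value]
        remainder += (len(subset) / len(rows)) * entropy([row[target] for row in subset])
    return base - remainder

def id3(rows, features, target):
    labels = [row[target] for row in rows]
    # Step IV: a pure node, or one with no unused features left, becomes a leaf with the majority class
    if len(set(labels)) == 1 or not features:
        return max(set(labels), key=labels.count)
    # Steps I and III: pick the feature with the maximum Information Gain and make a node for it
    best = max(features, key=lambda f: information_gain(rows, f, target))
    tree = {best: {}}
    # Steps II and V: split on each value of the chosen feature and recurse on the remaining features
    for value in set(row[best] for row in rows):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, [f for f in features if f != best], target)
    return tree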
v The first step is to find the best feature, i.e. the one that has the maximum Information Gain (IG).
v We’ll calculate the IG for each of the features now, but for that, we first need to calculate the entropy of S.
v From the total of 14 rows in our dataset S, there are 8 rows with the target value YES and 6 rows with the
target value NO. The entropy of S is calculated as:
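v With pYES = 8/14 and pNO = 6/14, this works out (using the entropy formula above) to:
Entropy(S) = − (8/14) · log₂(8/14) − (6/14) · log₂(6/14) ≈ 0.99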
v Note: If all the values in our target column are the same, the entropy will be zero (meaning that it has no
randomness).
v Next, from the remaining two unused features, namely, Fever and Cough, we decide which one is the best
for the left branch of Breathing Issues.
v Since the left branch of Breathing Issues denotes YES, we will work with the subset of the original data, i.e.
the set of rows having YES as the value in the Breathing Issues column. These 8 rows are shown below:
v Next, we calculate the IG for the features Fever and Cough using the subset Sʙʏ (Set Breathing Issues Yes)
v Note: For IG calculation the Entropy will be calculated from the subset Sʙʏ and not the original dataset S.
v IG(Sʙʏ, Fever) = 0.20
v IG(Sʙʏ, Cough) = 0.09
v IG of Fever is greater than that of Cough, so we select Fever as the left branch of Breathing Issues.
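v (A hypothetical illustration, reusing the information_gain() helper sketched earlier; sby_rows and 'Infected'
are illustrative names for the Sʙʏ subset and its target column.)

best_feature = max(['Fever', 'Cough'],
                   key=lambda f: information_gain(sby_rows, f, 'Infected'))
# picks 'Fever', since its Information Gain (0.20) is larger than that of Cough (0.09)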
v Our tree now looks like this:
v Next, we find the feature with the maximum IG for the right branch of Breathing Issues. But, since there is
only one unused feature left, we have no other choice but to make it the right branch of the root node.
v So our tree now looks like this:
v There are no more unused features, so we stop here and jump to the final step of creating the leaf nodes.
v For the left leaf node of Fever, we see the subset of rows from the original dataset that have both Breathing
Issues and Fever values as YES.
v Since all the values in the target column are YES, we label the left leaf node as YES, but to make it
more logical we label it Infected.
v Similarly, for the right node of Fever we see the subset of rows from the original data set that have
Breathing Issues value as YES and Fever as NO.
v Here, not all but most of the values are NO; hence NO, i.e. Not Infected, becomes our right leaf node.
v Our tree, now, looks like this:
v We repeat the same process for the node Cough; however, here both the left and right leaves turn out to be
the same, i.e. NO or Not Infected, as shown below:
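v Putting the branches together, the full tree described above can be sketched in text as follows (a plain-text
reconstruction of the figure):

Breathing Issues
├── YES → Fever
│          ├── YES → Infected
│          └── NO  → Not Infected
└── NO  → Cough
           ├── YES → Not Infected
           └── NO  → Not Infected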
v The right node of Breathing Issues is as good as just a leaf node with the class 'Not Infected'. This is one
of the drawbacks of ID3: it does not do pruning.
v Pruning is a mechanism that reduces the size and complexity of a Decision tree by removing
unnecessary nodes.
v Another drawback of ID3 is overfitting, or high variance, i.e. it learns the training dataset so well that it fails
to generalize to new data; this can be mitigated using the Random Forest algorithm.
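v As an illustration of that last point (not from the lecture), scikit-learn's RandomForestClassifier averages many
trees grown on bootstrap samples; below is a minimal sketch on toy data reusing the same column names
(scikit-learn grows CART trees rather than ID3 trees, and the nominal features are one-hot encoded first):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Illustrative toy rows; the real dataset from the lecture has 14 rows
df = pd.DataFrame({
    "Fever":            ["YES", "NO", "YES", "NO"],
    "Cough":            ["NO", "YES", "YES", "NO"],
    "Breathing Issues": ["YES", "YES", "NO", "NO"],
    "Infected":         ["YES", "YES", "NO", "NO"],
})
X = pd.get_dummies(df.drop(columns=["Infected"]))   # one-hot encode nominal features
y = df["Infected"]

# An ensemble of trees on bootstrap samples reduces the variance of a single tree
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(model.predict(X[:1]))   # predicted class for the first toy row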
v Advantages of the Decision Tree
1. It is simple to understand, as it follows the same process that a human follows while making a
decision in real life.
2. It can be very useful for solving decision-related problems.
3. It helps to think about all the possible outcomes for a problem.
4. It requires less data cleaning compared to other algorithms.