
Decision Tree

- by Ms. Ashwini D. Khairkar


Decision Tree
• The decision tree algorithm falls under the category of supervised
learning. Decision trees can be used to solve both regression and
classification problems.
• A decision tree uses a tree representation to solve the problem, in
which each leaf node corresponds to a class label and attributes are
represented at the internal nodes of the tree.
• Any Boolean function on discrete attributes can be represented using
a decision tree.
Contd.
• There are two main types of Decision Trees:
1. Classification trees (Yes/No types)
• What we have seen above is an example of a classification tree, where
the outcome was a variable like 'fit' or 'unfit'. Here the decision
variable is categorical.
2. Regression trees (continuous data types)
• Here the decision or outcome variable is continuous, e.g. a
number like 123.
Working
Now that we know what a decision tree is, we will see how it works
internally. There are many algorithms that construct decision trees,
but one of the best known is the ID3 algorithm. ID3 stands for
Iterative Dichotomiser 3. Before discussing the ID3 algorithm, we will
go through a few definitions.
Below are some assumptions we make while using a decision tree:
• At the beginning, we consider the whole training set as the root.
• Feature values are preferred to be categorical. If the values are
continuous, they are discretized prior to building the model.
• Records are distributed recursively on the basis of attribute values.
• We use statistical methods for ordering attributes as the root or as
internal nodes.
Contd.
Important terminology

1. Root Node: This attribute is used for dividing the data into two or more
sets. The feature attribute at this node is selected based on attribute
selection techniques.
2. Branch or Sub-Tree: A part of the entire decision tree is called a branch
or sub-tree.
3. Splitting: Dividing a node into two or more sub-nodes based on if-else
conditions.
4. Decision Node: A sub-node that splits into further sub-nodes is called a
decision node.
5. Leaf or Terminal Node: This is the end of the decision tree, where a node
cannot be split into further sub-nodes.
6. Pruning: Removing a sub-node from the tree is called pruning.
Contd.
• In a decision tree, the major challenge is the identification of the
attribute for the root node at each level. This process is known as
attribute selection. We have two popular attribute selection measures:
1. Information Gain
2. Gini Index
• 1. Information Gain
When we use a node in a decision tree to partition the training instances
into smaller subsets, the entropy changes. Information gain is a measure of
this change in entropy.
Definition: Suppose S is a set of instances, A is an attribute, Sv is the
subset of S with A = v, and Values(A) is the set of all possible values of A.
Then
Information Gain: Gain(S, A) = Entropy(S) - Σ (|Sv| / |S|) * Entropy(Sv),
summed over v in Values(A), i.e. the entropy of the parent node minus the
weighted average entropy of the child nodes.
• For example, in a binary classification problem (two classes),
we can calculate the entropy of the data sample as follows:
Contd.
Entropy(S) = -P(yes) * log2 P(yes) - P(no) * log2 P(no)
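As a minimal sketch (not part of the original slides), the binary entropy above can be computed in Python as follows; the function name binary_entropy is purely illustrative.

import math

def binary_entropy(p_yes, p_no):
    """Entropy of a two-class sample: -P(yes)*log2 P(yes) - P(no)*log2 P(no).
    A probability of 0 contributes 0 by convention (0 * log2 0 = 0)."""
    total = 0.0
    for p in (p_yes, p_no):
        if p > 0:
            total -= p * math.log2(p)
    return total

# A perfectly mixed node (half yes, half no) has the maximum entropy of 1 bit.
print(binary_entropy(0.5, 0.5))   # 1.0
# A pure node (all yes) has zero entropy.
print(binary_entropy(1.0, 0.0))   # 0.0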
Gini Index
• The Gini Index is a metric that measures how often a randomly chosen
element would be incorrectly classified.
• An attribute with a lower Gini index should be preferred.
• Scikit-learn supports the "gini" criterion for the Gini Index, and it
uses "gini" by default.
• The formula for the calculation of the Gini Index is given below.
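The formula referenced here did not survive the export of the slide; the standard definition is Gini(S) = 1 - Σ p_i^2, where p_i is the proportion of class i in S. A small illustrative sketch (the function name gini_index is an assumption, not from the slides):

def gini_index(class_counts):
    """Standard Gini impurity: 1 - sum(p_i^2) over the class proportions.
    class_counts is a list of counts per class at a node."""
    n = sum(class_counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in class_counts)

# A perfectly mixed binary node has the maximum Gini impurity of 0.5.
print(gini_index([7, 7]))   # 0.5
# A pure node has a Gini impurity of 0.
print(gini_index([14, 0]))  # 0.0

In scikit-learn, DecisionTreeClassifier's criterion parameter defaults to "gini"; passing criterion="entropy" switches the tree to information gain instead.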
What are the steps in the ID3 algorithm?

• The steps in the ID3 algorithm are as follows:
1. Calculate the entropy of the dataset.
2. For each attribute/feature:
   2.1. Calculate the entropy for all of its categorical values.
   2.2. Calculate the information gain for the feature.
3. Find the feature with the maximum information gain.
4. Repeat until we get the desired tree.
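A minimal Python sketch of these steps (illustrative only; the helper names entropy, info_gain, and id3 are assumptions, not from the slides). It assumes the dataset is a list of dicts, each with a 'label' key plus categorical attributes.

import math
from collections import Counter

def entropy(rows):
    """Step 1: entropy of a set of rows, each a dict with a 'label' key."""
    counts = Counter(r['label'] for r in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def info_gain(rows, attr):
    """Step 2: entropy of the parent minus the weighted entropy of each
    categorical value's subset (the 'average entropy information')."""
    n = len(rows)
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == value]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(rows) - remainder

def id3(rows, attrs):
    """Steps 3-4: split on the attribute with maximum information gain
    and recurse until the node is pure or no attributes remain."""
    labels = set(r['label'] for r in rows)
    if len(labels) == 1 or not attrs:
        return Counter(r['label'] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a))
    tree = {best: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best])
    return tree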
Classification using the ID3 algorithm
Consider a dataset based on which we will determine whether to play
tennis or not.
Here, the dataset (with attributes Outlook, Temperature, Humidity, and
Wind) has binary classes (yes and no), where 9 out of 14 examples are
"yes" and 5 out of 14 are "no".
Contd.
• The complete entropy of the dataset is:
H(S) = -p(yes) * log2(p(yes)) - p(no) * log2(p(no))
     = -(9/14) * log2(9/14) - (5/14) * log2(5/14)
     = 0.41 + 0.53 = 0.94
For each attribute of the dataset, let's follow step 2 of the pseudocode:
First Attribute - Outlook
Categorical values - sunny, overcast and rain
H(Outlook=sunny) = -(2/5)*log(2/5) - (3/5)*log(3/5) = 0.971
H(Outlook=rain) = -(3/5)*log(3/5) - (2/5)*log(2/5) = 0.971
H(Outlook=overcast) = -(4/4)*log(4/4) - 0 = 0
Average Entropy Information for Outlook -
I(Outlook) = p(sunny)*H(Outlook=sunny) + p(rain)*H(Outlook=rain) + p(overcast)*H(Outlook=overcast)
           = (5/14)*0.971 + (5/14)*0.971 + (4/14)*0
           = 0.693
Information Gain = H(S) - I(Outlook) = 0.94 - 0.693 = 0.247
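As a quick check of the arithmetic above, here is a sketch that uses only the counts stated in the slides (9 yes / 5 no overall; sunny 2 yes / 3 no, rain 3 yes / 2 no, overcast 4 yes / 0 no). The helper name H is illustrative.

import math

def H(yes, no):
    """Binary entropy from class counts; empty classes contribute 0."""
    total = yes + no
    return -sum((c / total) * math.log2(c / total) for c in (yes, no) if c > 0)

h_s = H(9, 5)                                                       # ≈ 0.940
i_outlook = (5/14) * H(2, 3) + (5/14) * H(3, 2) + (4/14) * H(4, 0)  # ≈ 0.693
print(round(h_s - i_outlook, 3))                                    # gain ≈ 0.247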
Contd.
Second Attribute - Temperature
Categorical values - hot, mild, cool
H(Temperature=hot) = -(2/4)*log(2/4)-(2/4)*log(2/4) = 1
H(Temperature=cool) = -(3/4)*log(3/4)-(1/4)*log(1/4) = 0.811
H(Temperature=mild) = -(4/6)*log(4/6)-(2/6)*log(2/6) = 0.9179
Average Entropy Information for Temperature –
I(Temperature) = p(hot)*H(Temperature=hot) + p(mild)*H(Temperature=mild) + p(cool)*H(Temperature=cool)
= (4/14)*1 + (6/14)*0.9179 + (4/14)*0.811 = 0.9108
Information Gain = H(S) - I(Temperature) = 0.94 - 0.9108 = 0.0292

Third Attribute - Humidity
Categorical values - high, normal
H(Humidity=high) = -(3/7)*log(3/7) - (4/7)*log(4/7) = 0.983
H(Humidity=normal) = -(6/7)*log(6/7) - (1/7)*log(1/7) = 0.591
Average Entropy Information for Humidity -
I(Humidity) = p(high)*H(Humidity=high) + p(normal)*H(Humidity=normal)
            = (7/14)*0.983 + (7/14)*0.591 = 0.787
Information Gain = H(S) - I(Humidity) = 0.94 - 0.787 = 0.153
Contd.
• Fourth Attribute - Wind
Categorical values - weak, strong
H(Wind=weak) = -(6/8)*log(6/8)-(2/8)*log(2/8) = 0.811
H(Wind=strong) = -(3/6)*log(3/6)-(3/6)*log(3/6) = 1
Average Entropy Information for Wind –
I(Wind) = p(weak)*H(Wind=weak) + p(strong)*H(Wind=strong) = (8/14)*0.811 + (6/14)*1 = 0.892
Information Gain = H(S) - I(Wind) = 0.94 - 0.892 = 0.048

Here, the attribute with the maximum information gain is Outlook, so it becomes the root node of the decision tree built so far.
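For comparison, this root choice can be reproduced with scikit-learn. The sketch below assumes the classic 14-row Play Tennis dataset (Quinlan / Mitchell), which matches the counts used in these slides; the categorical attributes are one-hot encoded because scikit-learn trees need numeric inputs.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Classic Play Tennis data (assumed here; not reproduced in the slides).
data = pd.DataFrame({
    "Outlook":     ["Sunny","Sunny","Overcast","Rain","Rain","Rain","Overcast",
                    "Sunny","Sunny","Rain","Sunny","Overcast","Overcast","Rain"],
    "Temperature": ["Hot","Hot","Hot","Mild","Cool","Cool","Cool",
                    "Mild","Cool","Mild","Mild","Mild","Hot","Mild"],
    "Humidity":    ["High","High","High","High","Normal","Normal","Normal",
                    "High","Normal","Normal","Normal","High","Normal","High"],
    "Wind":        ["Weak","Strong","Weak","Weak","Weak","Strong","Strong",
                    "Weak","Weak","Weak","Strong","Strong","Weak","Strong"],
    "Play":        ["No","No","Yes","Yes","Yes","No","Yes",
                    "No","Yes","Yes","Yes","Yes","Yes","No"],
})

X = pd.get_dummies(data.drop(columns="Play"))   # one-hot encode the attributes
y = data["Play"]

# criterion="entropy" makes the tree use information gain, as in ID3.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(clf, feature_names=list(X.columns)))

Note that scikit-learn builds binary splits on the one-hot indicators rather than ID3's multi-way splits, so the printed tree is structured differently even though it should still test an Outlook indicator at the root.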
Contd.
• Here, when Outlook = Overcast, the subset is a pure class (Yes).
Now we have to repeat the same procedure for the rows with Outlook =
Sunny and then for the rows with Outlook = Rain.
• Now, find the best attribute for splitting the data with
Outlook = Sunny (dataset rows = [1, 2, 8, 9, 11]).
Complete entropy of the Sunny subset is:
H(Sunny) = -p(yes) * log2(p(yes)) - p(no) * log2(p(no))
         = -(2/5) * log2(2/5) - (3/5) * log2(3/5) = 0.971
Contd.
• First Attribute - Temperature
Categorical values - hot, mild, cool
H(Sunny, Temperature=hot) = -0 - (2/2)*log(2/2) = 0
H(Sunny, Temperature=cool) = -(1)*log(1) - 0 = 0
H(Sunny, Temperature=mild) = -(1/2)*log(1/2) - (1/2)*log(1/2) = 1
Average Entropy Information for Temperature -
I(Sunny, Temperature) = p(Sunny, hot)*H(Sunny, Temperature=hot) + p(Sunny, mild)*H(Sunny, Temperature=mild) + p(Sunny, cool)*H(Sunny, Temperature=cool)
                      = (2/5)*0 + (2/5)*1 + (1/5)*0 = 0.4
Information Gain = H(Sunny) - I(Sunny, Temperature) = 0.971 - 0.4 = 0.571
Contd.
• Second Attribute - Humidity

Categorical values - high, normal


H(Sunny, Humidity=high) = - 0 - (3/3)*log(3/3) = 0
H(Sunny, Humidity=normal) = -(2/2)*log(2/2)-0 = 0
Average Entropy Information for Humidity –
I(Sunny, Humidity) = p(Sunny, high)*H(Sunny, Humidity=high) + p(Sunny, normal)*H(Sunny, Humidity=normal)
= (3/5)*0 + (2/5)*0 = 0
Information Gain = H(Sunny) - I(Sunny, Humidity) = 0.971 - 0 = 0.971
Contd.
• Third Attribute - Wind

Categorical values - weak, strong


H(Sunny, Wind=weak) = -(1/3)*log(1/3)-(2/3)*log(2/3) = 0.918
H(Sunny, Wind=strong) = -(1/2)*log(1/2)-(1/2)*log(1/2) = 1
Average Entropy Information for Wind –
I(Sunny, Wind) = p(Sunny, weak)*H(Sunny, Wind=weak) + p(Sunny, strong)*H(Sunny, Wind=strong)
= (3/5)*0.918 + (2/5)*1 = 0.9508
Information Gain = H(Sunny) - I(Sunny, Wind) = 0.971 - 0.9508 = 0.0202
Contd.
• Here, the attribute with the maximum information gain is Humidity,
so Humidity becomes the next decision node under the Sunny branch of
the decision tree built so far.
Contd.
• Here, when Outlook = Sunny and Humidity = High, it is a pure
class of category "no". And when Outlook = Sunny and
Humidity = Normal, it is again a pure class of category "yes".
Therefore, we don't need to do further calculations.
• Now, find the best attribute for splitting the data with
Outlook = Rain (dataset rows = [4, 5, 6, 10, 14]).
Contd.
Complete entropy of the Rain subset is:
H(Rain) = -p(yes) * log2(p(yes)) - p(no) * log2(p(no))
        = -(3/5) * log2(3/5) - (2/5) * log2(2/5) = 0.971
First Attribute - Temperature
Categorical values - mild, cool
H(Rain, Temperature=cool) = -(1/2)*log(1/2) - (1/2)*log(1/2) = 1
H(Rain, Temperature=mild) = -(2/3)*log(2/3) - (1/3)*log(1/3) = 0.918
Average Entropy Information for Temperature -
I(Rain, Temperature) = p(Rain, mild)*H(Rain, Temperature=mild) + p(Rain, cool)*H(Rain, Temperature=cool)
                     = (3/5)*0.918 + (2/5)*1 = 0.9508
Information Gain = H(Rain) - I(Rain, Temperature) = 0.971 - 0.9508 = 0.0202
Contd.
• Second Attribute - Wind
Categorical values - weak, strong
H(Rain, Wind=weak) = -(3/3)*log(3/3) - 0 = 0
H(Rain, Wind=strong) = -0 - (2/2)*log(2/2) = 0
Average Entropy Information for Wind -
I(Rain, Wind) = p(Rain, weak)*H(Rain, Wind=weak) + p(Rain, strong)*H(Rain, Wind=strong)
              = (3/5)*0 + (2/5)*0 = 0
Information Gain = H(Rain) - I(Rain, Wind) = 0.971 - 0 = 0.971
Final desired output
• Here, the attribute with the maximum information gain under the Rain
branch is Wind. When Outlook = Rain and Wind = Strong, it is a pure class
of category "no", and when Outlook = Rain and Wind = Weak, it is again a
pure class of category "yes".
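The final tree derived above (Outlook at the root; Humidity under Sunny; Wind under Rain; Overcast always Yes) can be written as a small prediction function. This is a sketch; the function name predict_play_tennis is illustrative.

def predict_play_tennis(outlook, humidity, wind):
    """Final ID3 tree from the worked example in these slides."""
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Rain":
        return "Yes" if wind == "Weak" else "No"
    raise ValueError(f"unknown outlook value: {outlook}")

print(predict_play_tennis("Sunny", "High", "Weak"))   # No
print(predict_play_tennis("Rain", "High", "Weak"))    # Yes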
Characteristics of the ID3 algorithm

1. ID3 uses a greedy approach, so it does not guarantee an optimal
solution; it can get stuck in local optima.
2. ID3 can overfit the training data (to avoid overfitting, smaller
decision trees should be preferred over larger ones).
3. This algorithm usually produces small trees, but it does not always
produce the smallest possible tree.
4. ID3 is harder to use on continuous data (if the values of an attribute
are continuous, then there are many more places to split the data on that
attribute, and searching for the best split value can be time-consuming);
a threshold-search sketch follows below.
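Point 4 above is usually handled by sorting a continuous attribute and evaluating candidate thresholds. A minimal sketch under that assumption (the names entropy_from_labels and best_threshold, and the example numbers, are illustrative, not from the slides):

import math

def entropy_from_labels(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def best_threshold(values, labels):
    """Try a split 'value <= t' at each midpoint between consecutive sorted
    values and return the threshold with the highest information gain."""
    pairs = sorted(zip(values, labels))
    base = entropy_from_labels(labels)
    best_gain, best_t = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                      # no boundary between equal values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        gain = base - (len(left) / len(pairs)) * entropy_from_labels(left) \
                    - (len(right) / len(pairs)) * entropy_from_labels(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain

# Example: humidity measured as a number instead of high/normal.
# Picks the threshold 82.5 with an information gain of about 0.97.
print(best_threshold([85, 90, 70, 95, 80], ["No", "No", "Yes", "No", "Yes"]))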
