Data Science

Lecture # 22
Decision Tree
• Let's look at the Decision Tree model, a popular method used for classification
• By the end of this lecture, you should be able to:
• Explain how a decision tree is used for classification
• Describe the process of constructing a decision tree for classification
• Interpret how a decision tree arrives at a classification decision

Note: All Images are taken from edx.org



Decision Tree Overview
• The idea behind a decision tree is to split the data into subsets where each subset belongs to only one class
• This is accomplished by dividing the input space into pure regions
• i.e. regions with samples from only one class
• With real data, completely pure subsets may not be possible, so we divide the data into subsets that are as pure as possible
• A decision tree makes classification decisions based on decision boundaries



Classification Using Decision Tree
• The root and internal nodes have test conditions
• Each leaf node has a class label associated with it
• A decision is made by traversing the decision tree
• At each node, the answer to the test condition determines which branch to traverse
• When a leaf node is reached, the category at the leaf node determines the decision
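To make this concrete, here is a minimal sketch of classification by tree traversal. The Node class, its fields, and the example tree are hypothetical illustrations, not taken from the lecture.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Node:
    # Internal nodes carry a test condition; leaf nodes carry a class label.
    test: Optional[Callable[[dict], bool]] = None
    yes: Optional["Node"] = None      # branch taken when the test is true
    no: Optional["Node"] = None       # branch taken when the test is false
    label: Optional[str] = None       # set only on leaf nodes

def classify(node: Node, sample: dict) -> str:
    # Traverse from the root, following the branch chosen by each test,
    # until a leaf is reached; the leaf's label is the decision.
    while node.label is None:
        node = node.yes if node.test(sample) else node.no
    return node.label

# Hypothetical two-level tree in the style of the mammal example.
tree = Node(
    test=lambda s: s["warm_blooded"],
    yes=Node(
        test=lambda s: s["gives_birth"],
        yes=Node(label="mammal"),
        no=Node(label="not mammal"),
    ),
    no=Node(label="not mammal"),
)

print(classify(tree, {"warm_blooded": True, "gives_birth": True}))  # mammal
```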



Classification Using Decision Tree
• The depth of a node is the number of edges from the root to that node
• The depth of the root node is zero
• The depth of the tree is the number of edges in the longest path
• The size of the tree is the number of nodes in the tree
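As a small illustration of these definitions, the sketch below computes depth and size for a toy tree represented as nested dicts; this representation is hypothetical, not the lecture's.

```python
def depth(node: dict) -> int:
    # Depth of a tree: number of edges on the longest root-to-leaf path.
    children = node.get("children", [])
    return 0 if not children else 1 + max(depth(c) for c in children)

def size(node: dict) -> int:
    # Size of a tree: total number of nodes.
    return 1 + sum(size(c) for c in node.get("children", []))

# Root with two children; one child has a child of its own.
toy = {"children": [{"children": []}, {"children": [{"children": []}]}]}
print(depth(toy))  # 2 (root -> child -> grandchild)
print(size(toy))   # 4 nodes
```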



Example Decision Tree
• This decision tree is used to classify an animal as a mammal or not a mammal


Constructing Decision Tree
• Constructing a decision tree consists of the following steps:
• Start with all samples at a node
• i.e. start with all samples at the root node
• Add additional nodes when the data is split into subsets
• Partition the samples based on an input variable to create the purest subsets
• i.e. each subset contains as many samples as possible belonging to just one class
• Repeat to partition the data into successively purer subsets
• Continue this process until the stopping criteria are satisfied
• An algorithm for constructing a decision tree model is called an induction algorithm
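For instance, scikit-learn's DecisionTreeClassifier implements such an induction algorithm. The sketch below fits one on a tiny made-up dataset; the feature values and labels are invented purely for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training data: [income, debt] per applicant, with made-up labels.
X = [[20, 30], [25, 10], [60, 5], [80, 40], [90, 10], [30, 50]]
y = ["no", "no", "yes", "yes", "yes", "no"]   # "yes" = likely to repay

# Induction: repeatedly split on the variable/threshold that gives the
# purest subsets (Gini impurity by default), until stopping criteria hit.
model = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
model.fit(X, y)

print(export_text(model, feature_names=["income", "debt"]))
print(model.predict([[70, 15]]))  # ['yes']
```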


Greedy Approach
• At each split, the induction algorithm considers only the best way to split that particular portion of the data
• This is referred to as a greedy approach



How to Determine Best Split?
• Again, the goal is to partition the data into subsets that are as pure as possible
• In this example, the right partition produces more homogeneous subsets, since these contain more samples belonging to a single class



Impurity Measure
• Therefore, we need to measure the purity of a split
• The impurity measure of a node specifies how mixed the resulting subsets are
• We want the split that minimizes the impurity measure
• A commonly used impurity measure is the Gini index; other impurity measures are entropy and misclassification rate
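As a concrete illustration, the sketch below computes the Gini impurity of a node from its class labels, and the weighted impurity of a candidate split; the class counts are made up for the example.

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions.
    # 0 means the node is pure; higher values mean more mixing.
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    # Impurity of a split: weighted average of the children's impurities.
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

pure  = ["red"] * 8                     # gini = 0.0
mixed = ["red"] * 4 + ["blue"] * 4      # gini = 0.5
print(gini(pure), gini(mixed))

# A split that mostly separates the classes has low weighted impurity.
print(split_impurity(["red"] * 5 + ["blue"], ["blue"] * 5 + ["red"]))
```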



What Variable to Split On?
• The other factor in determining the best way to partition a node is which variable to split on
• The decision tree algorithm tests all variables to determine the best way to split each node, using a purity measure such as the Gini index to compare the various possibilities
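A minimal sketch of this search, assuming numeric features stored as rows of feature values plus a label: try every feature and every candidate threshold, and keep the split with the lowest weighted Gini impurity. The data and helper names are illustrative, not from the lecture.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Exhaustively test every (feature, threshold) pair and keep the one
    # whose children have the lowest weighted Gini impurity.
    best = None  # (weighted_gini, feature_index, threshold)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left  = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            n = len(labels)
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best

# Toy data: columns are [income, debt]; labels invented for illustration.
rows   = [[20, 30], [25, 10], [60, 5], [80, 40], [90, 10], [30, 50]]
labels = ["no", "no", "yes", "yes", "yes", "no"]
print(best_split(rows, labels))  # splits on feature 0 (income)
```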



When to Stop Splitting a Node?
• Recall that the tree induction algorithm repeatedly splits nodes to get more and more homogeneous subsets
• So when does this process stop?
• All (or x% of) samples have the same class label
• The number of samples in the node reaches a minimum value
• The change in the impurity measure is smaller than a threshold
• The maximum tree depth is reached
• Others… (not discussed here)
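Several of these stopping criteria map directly onto scikit-learn hyperparameters; the values below are arbitrary examples, not recommendations.

```python
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(
    max_depth=4,                 # stop when the maximum tree depth is reached
    min_samples_split=10,        # do not split nodes with fewer samples than this
    min_impurity_decrease=0.01,  # stop if the impurity improvement is below this threshold
)
```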



Tree Induction Example: Split 1
• Let's say we want to classify loan applicants as being likely to repay a loan, or not likely to repay a loan, based on their income and the amount of debt they have



Tree Induction Example: Split 1
• Building a decision tree for this classification problem could proceed as follows
• Consider the input space of this problem, as shown in the left figure
• One way to split this dataset into more homogeneous subsets is to consider the decision boundary where income is t1
• To the right of this decision boundary are mostly red samples
• The subsets are not completely homogeneous, but this is the best way to split the original dataset based on the variable income


Tree Induction Example: Split 2
• Income > t1 is represented at the root node
• This is the condition used to split the original dataset
• Samples with income > t1 are placed in the right subset and those with income < t1 in the left subset
• Because the right subset is almost pure, it is labeled RED
Tree Induction Example: Split 2
• RED means loan applicants likely to repay the loan
• The second step, then, is to determine how to split the region outlined in red
• The best way to split this data is specified by the second decision boundary, where debt equals t2
• This is represented in the decision tree on the right by adding a node with the condition debt > t2
• This region contains all blue samples, meaning that the loan applicant is not likely to repay the loan
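The finished two-split tree amounts to two nested tests; the sketch below writes it out directly, with made-up threshold values standing in for t1 and t2, and an assumed label for the remaining region.

```python
# Hypothetical thresholds standing in for t1 (income) and t2 (debt).
T1_INCOME = 50_000
T2_DEBT = 20_000

def classify_applicant(income: float, debt: float) -> str:
    # Root node: income > t1 -> the region labeled RED (likely to repay).
    if income > T1_INCOME:
        return "likely to repay"
    # Second split: debt > t2 -> the all-blue region (not likely to repay).
    if debt > T2_DEBT:
        return "not likely to repay"
    # Remaining region: label assumed here for illustration (majority class).
    return "likely to repay"

print(classify_applicant(income=60_000, debt=30_000))  # likely to repay
print(classify_applicant(income=30_000, debt=30_000))  # not likely to repay
```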



Decision Boundaries
• The final decision tree implements the decision boundaries shown as dashed lines in the left diagram
• The label for each region is determined by the label of the majority of the samples
• These labels are reflected in the leaf nodes of the decision tree shown on the right



Decision Boundaries
• Notice that the decision boundaries are parallel to the axes, referred to as rectilinear
• The boundaries are rectilinear because each split considers only a single variable
• Some algorithms can consider more than one variable per split
• However, each such split has to consider all combinations of the combined variables
• Such induction algorithms are more computationally intensive



Decision Tree for Classification
• There are a few important things to note about the decision tree classifier
• The resulting tree is often simple and easy to understand
• Induction is computationally inexpensive, so training a decision tree for classification can be relatively fast
• The greedy approach does not guarantee the best solution
• Decision boundaries are rectilinear, which means it may not be able to solve complicated classification problems that require complex decision boundaries
• Discuss Week 7 notebooks

