DS4 - CLS-Decision Tree
2
Decision Tree
• Example: a student’s rules for studying or playing
3
How to build a decision tree from data?
• Given data, how do we build a classifier model (a decision tree)?
4
Decision tree
Golf play: Yes (red), No (blue)
5
Decision tree
• Classification tree: to separate a dataset into classes
belonging to the response variable
• Intuitive and easy to set up
• Easy to interpret
• Usually has two classes: Yes or No (1 or 0)
• But can have more than two categories
• Regression trees are used for numeric prediction problems
6
Golf data example
7
Ideas to build decision tree
• Define the order of attributes at each step
• For problems with many attributes, each taking many different values, finding the optimal tree is often not feasible
• A simple method is to select, at each step, the best attribute based on some criterion
• For each selected attribute, we divide the data into child nodes corresponding to the values of that attribute, and then continue to apply this method to each child node
• Choosing the best attribute at each step like this is called greedy selection (sketched below)
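• A minimal Python sketch of this greedy procedure, assuming categorical attributes stored as dictionaries; the helper names (build_tree, best_attribute) are illustrative only:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Greedy step: pick the attribute whose split leaves the lowest weighted entropy."""
    def weighted_entropy(attr):
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[attr], []).append(y)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(attributes, key=weighted_entropy)  # lowest weighted entropy = highest gain

def build_tree(rows, labels, attributes):
    """Recursively split the data; stop when a node is pure or no attributes remain."""
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]        # leaf node: majority class
    attr = best_attribute(rows, labels, attributes)
    branches = {}
    for value in {row[attr] for row in rows}:
        keep = [i for i, row in enumerate(rows) if row[attr] == value]
        branches[value] = build_tree([rows[i] for i in keep],
                                     [labels[i] for i in keep],
                                     [a for a in attributes if a != attr])
    return (attr, branches)
```

Each internal node records the attribute tested and one branch per attribute value, mirroring the flowchart structure of the earlier slides.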
8
How to reduce uncertainty
• Imagine a box that can contain one of three colored balls
inside—red, yellow, and blue
• Without opening the box, if one had to “predict” which
colored ball is inside, then they are basically dealing with a
lack of information or uncertainty.
• What is the highest number of “yes/no” questions that can
be asked to reduce this uncertainty and, thus, increase our
information?
1. Is it red? No.
2. Is it yellow? No.
Then it must be blue.
• That is two questions.
ball-in-a-box problem
9
How to reduce uncertainty
• The maximum number of binary questions needed to
reduce uncertainty is essentially log(T), where the log is
taken to base 2 and T is the number of possible outcomes
• If there was only one color, that is, one outcome, then log(1)
= 0, which means there is no uncertainty
• If there are T events with equal probability of occurrence P,
then T = 1/P
• Claude Shannon defined entropy as $\log_2(1/P)$, or $-\log_2(P)$, where $P$ is the probability of an event occurring
• If the probability for all events is not identical, a weighted expression is needed and, thus, entropy, $H$, is adjusted as follows:
  $H = -\sum_{k} p_k \log_2(p_k)$
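• A quick numeric check of these formulas (illustrative probabilities only):

```python
import math

# log2(T): binary questions needed for T equally likely outcomes.
print(math.log2(1))   # 0.0 -> a single possible outcome, no uncertainty
print(math.log2(4))   # 2.0 -> four equally likely outcomes need two questions

# Weighted (Shannon) entropy for unequal probabilities.
probs = [0.5, 0.25, 0.25]
print(sum(-p * math.log2(p) for p in probs))   # 1.5 bits
```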
10
Entropy
• Graph of the binary entropy function $H(p) = -p \log_2(p) - (1-p) \log_2(1-p)$
• Pure node: $H = 0$ when $p = 0$ or $p = 1$
• Impure node: $H$ reaches its maximum value of 1 when $p = 0.5$
11
Entropy and Gini index
• If the dataset had 100 samples with 50% of each class, then the entropy of the dataset is given by $H = -0.5\log_2(0.5) - 0.5\log_2(0.5) = 1$
• On the other hand, if the data can be partitioned into two sets of 50 samples each that exclusively contain all members and all nonmembers, the entropy of either of these two partitioned sets is given by $H = -1\log_2(1) = 0$
• Any other proportion of samples within a dataset will yield entropy values between 0 and 1 (which is the maximum)
• The Gini index ($G$) is similar to the entropy measure in its characteristics and is defined as $G = 1 - \sum_k p_k^2$
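• A quick check of both measures on the two cases above:

```python
import math

def entropy(probs):
    """H = -sum(p * log2(p)) over classes with nonzero probability."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """G = 1 - sum(p^2)."""
    return 1 - sum(p ** 2 for p in probs)

print(entropy([0.5, 0.5]), gini([0.5, 0.5]))   # 1.0 0.5 -> maximum impurity
print(entropy([1.0]), gini([1.0]))             # 0.0 0.0 -> a pure partition
```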
13
Split criteria
• The measure of impurity of a dataset must be at a
maximum when all possible classes are equally
represented.
• The measure of impurity of a dataset must be zero
when only one class is represented.
• Measures such as entropy or Gini index easily meet
these criteria and are used to build decision trees.
• Different criteria will build different trees through
different biases.
14
Build Decision Tree
• Two steps:
• Step 1: Where to Split Data?
• Step 2: When to Stop Splitting Data?
• The Iterative Dichotomiser 3 (ID3) algorithm
• Another algorithm is the Classification and Regression Tree (CART)
15
Step 1: Where to Split Data?
16
Algorithm
• Consider a non-leaf node with $N$ data points, of which $N_c$ points belong to class $c$ ($c = 1, \dots, C$). The entropy of this node is:
  $H(S) = -\sum_{c=1}^{C} \frac{N_c}{N} \log_2 \frac{N_c}{N}$   (1)
• Select an attribute $x$. Based on $x$, the data points are split into $K$ child nodes $S_1, \dots, S_K$, with the number of points in each child node being $m_1, \dots, m_K$. Define:
  $H(x, S) = \sum_{k=1}^{K} \frac{m_k}{N} H(S_k)$   (2)
• The information gain based on attribute $x$ is defined as:
  $G(x, S) = H(S) - H(x, S)$
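• Equations (1) and (2) translate directly into code; a minimal sketch working from class counts:

```python
import math

def node_entropy(class_counts):
    """Eq. (1): H(S) = -sum_c (N_c / N) * log2(N_c / N)."""
    n = sum(class_counts)
    return sum(-(c / n) * math.log2(c / n) for c in class_counts if c > 0)

def information_gain(parent_counts, child_counts):
    """G(x, S) = H(S) - sum_k (m_k / N) * H(S_k), with the sum from Eq. (2)."""
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * node_entropy(child) for child in child_counts)
    return node_entropy(parent_counts) - weighted
```

The next slides apply this computation to the golf data.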
17
Example – Football team play or not?
18
Entropy at root node
With 9 Yes and 5 No among the 14 examples:
$H(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.94$
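• A quick check of this value (assuming the 9 Yes / 5 No split of the 14 examples):

```python
import math

p_yes, p_no = 9 / 14, 5 / 14
print(-p_yes * math.log2(p_yes) - p_no * math.log2(p_no))   # ~0.940
```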
19
Consider the Outlook attribute
20
Outlook attribute
• Splitting on Outlook gives: sunny (2 Yes, 3 No), overcast (4 Yes, 0 No), rain (3 Yes, 2 No)
  $H(\text{Outlook}, S) = \frac{5}{14}(0.971) + \frac{4}{14}(0) + \frac{5}{14}(0.971) = 0.69$
• Information gain: $G(\text{Outlook}, S) = 0.94 - 0.69 = 0.25$
21
Temperature, Humidity, Wind attributes
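• Repeating the gain computation for every attribute; a sketch assuming the per-value Yes/No counts of the classic 14-row golf table (the counts below are assumptions, not shown on the slide):

```python
import math

def H(yes, no):
    """Binary entropy from class counts."""
    total = yes + no
    return sum(-(c / total) * math.log2(c / total) for c in (yes, no) if c > 0)

root = H(9, 5)                                   # ~0.94
# (Yes, No) counts per attribute value, assumed from the classic golf table.
splits = {
    "Outlook":     [(2, 3), (4, 0), (3, 2)],     # sunny, overcast, rain
    "Temperature": [(2, 2), (4, 2), (3, 1)],     # hot, mild, cool
    "Humidity":    [(3, 4), (6, 1)],             # high, normal
    "Wind":        [(6, 2), (3, 3)],             # weak, strong
}
for name, counts in splits.items():
    weighted = sum((y + n) / 14 * H(y, n) for y, n in counts)
    print(name, round(root - weighted, 3))
# Outlook gives the largest gain (~0.25), so it is chosen as the first split.
```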
22
Golf data example
23
• Start by partitioning the
data on each of the four
regular attributes
• Let us start with
Outlook. There are
three categories for this
variable: sunny,
overcast, and rain.
24
• For numeric variables, possible split points to examine are
essentially averages of available values. For example, the
first potential split point for Humidity could be Average
[65,70], which is 67.5, the next potential split point could be
Average [70,75], which is 72.5, and so on.
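• Candidate split points for a numeric attribute are simply the midpoints of consecutive sorted values; an illustrative subset of Humidity values:

```python
# Sorted, distinct Humidity values (an illustrative subset of the golf data).
values = [65, 70, 75, 80, 85, 90, 95]

# The midpoint between each pair of consecutive values is a candidate split point.
candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
print(candidates)   # [67.5, 72.5, 77.5, 82.5, 87.5, 92.5]
```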
25
26
27
Step 2: When to Stop Splitting Data?
• The algorithm would need to be instructed when to stop.
• There are several situations where the process can be terminated:
• No attribute satisfies a minimum information gain threshold
• A maximal depth is reached: as the tree grows larger, not only does
interpretation get harder, but a situation called “overfitting” is induced.
• There are fewer than a certain number of examples in the current subtree: again, a mechanism to prevent overfitting.
• To prevent overfitting, tree growth may need to be restricted or
reduced, using a process called pruning
• Pre-pruning: pruning occurs before or during the growth of the tree (see the sketch below).
• There are also methods that will not restrict the number of
branches and allow the tree to grow as deep as the data will allow,
and then trim or prune those branches that do not effectively
change the classification error rates. This is called post-pruning.
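• These stopping rules map directly onto the pre-pruning hyperparameters of common implementations; a sketch using scikit-learn's DecisionTreeClassifier (the parameter values are arbitrary examples):

```python
from sklearn.tree import DecisionTreeClassifier

# Pre-pruning: restrict growth while the tree is being built.
clf = DecisionTreeClassifier(
    criterion="entropy",          # impurity measure ("gini" is the default)
    max_depth=4,                  # stop when a maximal depth is reached
    min_samples_leaf=5,           # stop when a subtree would get too few examples
    min_impurity_decrease=0.01,   # require a minimum impurity (gain-like) reduction
)
# clf.fit(X_train, y_train)       # X_train / y_train are assumed to exist
```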
28
Post pruning
• Reduced error pruning: relies on a validation set
• Regularization: add a regularization term to the loss function that grows with the number of leaf nodes (see the sketch below)
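• A sketch of post-pruning with scikit-learn's cost-complexity pruning, choosing the pruning strength on a held-out validation set in the spirit of reduced error pruning (X_train, y_train, X_val, y_val are assumed to exist):

```python
from sklearn.tree import DecisionTreeClassifier

def prune_with_validation(X_train, y_train, X_val, y_val):
    """Grow a full tree, then pick the ccp_alpha that scores best on the validation set."""
    path = DecisionTreeClassifier().cost_complexity_pruning_path(X_train, y_train)
    best_alpha, best_score = 0.0, -1.0
    for alpha in path.ccp_alphas:
        tree = DecisionTreeClassifier(ccp_alpha=alpha).fit(X_train, y_train)
        score = tree.score(X_val, y_val)          # validation accuracy
        if score > best_score:
            best_alpha, best_score = alpha, score
    return DecisionTreeClassifier(ccp_alpha=best_alpha).fit(X_train, y_train)
```

Larger ccp_alpha values correspond to stronger regularization, i.e., more branches pruned and fewer leaf nodes.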
29
Decision Tree – Model building
1. Calculate the Shannon entropy of the target variable (no partition)
2. Calculate the weighted entropy of the target variable under each independent variable
3. Compute the information gain
4. Choose the independent variable with the highest information gain as the splitting node
5. Repeat this process. If the entropy of a branch is zero, that branch becomes a leaf node
30
Summary of Decision Tree
• A decision tree model takes the form of a decision flowchart
• An attribute is tested at each node
• At the end of each decision tree path is a leaf node, where a prediction is made about the target variable based on the conditions set forth by the decision path
• The nodes split the dataset into subsets
• In a decision tree, the idea is to split the dataset based on the homogeneity of the data
• A rigorous measure of impurity is needed, one that meets certain criteria and is based on computing the proportion of the data that belongs to a class
31
• RapidMiner process files and data sets
• http://www.introdatascience.com/uploads/4/2/1/5
/42154413/second_ed_rm_process.zip
32