Decision Tree
ASMA KANWAL
LECTURER
GC UNIVERSITY, LAHORE
PROBLEM OBJECTIVE
Classification
Prediction
DEFINITION
A decision tree is a classifier in the form of a tree structure
– Decision node: specifies a test on a single attribute
– Leaf node: indicates the value of the target attribute
– Arc/edge: one outcome (split) of the attribute test
– Path: a conjunction of tests that leads to the final decision
Decision trees are powerful and popular tools for classification and prediction.
Decision trees represent rules, which can be understood by humans and used
in knowledge systems such as databases.
Rules for classifying data using attributes.
The tree consists of decision nodes and leaf nodes.
A decision node has two or more branches, each representing a value of the
attribute being tested.
A leaf node represents a homogeneous result (all instances in one class), which
requires no additional classification testing.
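To make this structure concrete, here is a minimal Python sketch of the node types and the path-following classification described above. It assumes instances are dicts keyed by attribute name; all names (DecisionNode, LeafNode, outlook, humidity) are illustrative choices, not taken from the slides.

```python
class LeafNode:
    """Leaf node: holds the value of the target attribute."""
    def __init__(self, label):
        self.label = label

class DecisionNode:
    """Decision node: tests a single attribute; each branch (arc/edge)
    corresponds to one value of that attribute."""
    def __init__(self, attribute, branches):
        self.attribute = attribute  # attribute tested at this node
        self.branches = branches    # dict: attribute value -> subtree

def classify(node, instance):
    """Follow one path (a conjunction of tests) from root to leaf."""
    while isinstance(node, DecisionNode):
        node = node.branches[instance[node.attribute]]
    return node.label

# Illustrative two-level tree and query (invented toy example).
tree = DecisionNode("outlook", {
    "overcast": LeafNode("yes"),
    "sunny": DecisionNode("humidity", {
        "high": LeafNode("no"),
        "normal": LeafNode("yes"),
    }),
})
print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # yes
```

Note how classify answers a query by walking exactly one root-to-leaf path, applying one attribute test per decision node.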
IMPORTANT TERMS
Stopping rule: a node becomes a leaf (see the sketch below) when either
Every attribute has already been included along this path through the
tree, or
The training examples associated with this leaf node all have the same
target attribute value (i.e., their entropy is zero).
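A sketch of these two conditions; the assumption that examples are dicts and that target names the target attribute is mine, for illustration only.

```python
def should_stop(examples, remaining_attributes, target):
    # Condition 1: every attribute has already been used along this path.
    if not remaining_attributes:
        return True
    # Condition 2: all examples share one target value (entropy is zero).
    return len({ex[target] for ex in examples}) == 1
```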
DISADVANTAGES OF DECISION TREE
Overfitting: overfitting occurs when the algorithm captures noise in the
data.
High variance: the model can become unstable due to small variations in
the dataset.
Low bias: a highly complex decision tree tends to have low bias, which
makes it difficult for the model to generalize to new data.
HANDLING CONTINUOUS ATTRIBUTES
Each non-leaf node is a test; its edges partition the attribute's values into
subsets (easy for a discrete attribute).
For a continuous attribute (see the threshold sketch below):
Partition the continuous values of attribute A into a discrete set of
intervals, or
Create a new Boolean attribute A_c by looking for a threshold c:
A_c = true if A < c, false otherwise.
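A small sketch of the Boolean-attribute construction. Taking candidate thresholds midway between adjacent sorted values is a common heuristic assumed here, not something the slides specify.

```python
def boolean_attribute(values, c):
    """A_c is true where A < c, false otherwise."""
    return [v < c for v in values]

def candidate_thresholds(values):
    """Midpoints between consecutive distinct sorted values of A."""
    s = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(s, s[1:])]

print(candidate_thresholds([40, 48, 60, 72, 80, 90]))
# [44.0, 54.0, 66.0, 76.0, 85.0]
print(boolean_attribute([40, 48, 60, 72, 80, 90], 66.0))
# [True, True, True, False, False, False]
```

In practice the threshold c is chosen as the candidate that maximizes information gain, after which the continuous attribute can be treated like any discrete one.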
KEY REQUIREMENTS
Entropy calculation: compute the entropy of the entire dataset (for the root node).
For every attribute:
Calculate the entropy of each of its splits.
Take the average (weighted) information entropy for the current attribute.
Calculate the information gain for the current attribute.
Pick the attribute with the highest gain.
Repeat the process recursively until the whole decision tree is built (see the
ID3-style sketch below).
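These steps are essentially the ID3 algorithm. Below is a compact, self-contained Python sketch; the representation (examples as dicts of attribute -> value, target naming the class attribute) is my assumption for illustration.

```python
import math
from collections import Counter

def entropy(examples, target):
    """Entropy of the set of examples with respect to the target attribute."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(examples, attribute, target):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(examples)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, attributes, target):
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:   # pure node: stop with a leaf
        return labels[0]
    if not attributes:          # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: info_gain(examples, a, target))
    rest = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        tree[best][value] = id3(subset, rest, target)
    return tree
```

Calling id3(examples, attribute_list, "play") (names hypothetical) returns a nested dict in which each key is the attribute tested at a decision node and each leaf is a class label.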
ENTROPY
Entropy measures the impurity of a set S: Entropy(S) = −Σ_i p_i · log2(p_i),
where p_i is the proportion of examples in S that belong to class i.
Information gain is the difference in entropy before and after splitting the
data set on an attribute.
Information gain measures the expected reduction in entropy, or uncertainty:
Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) · Entropy(S_v)
Values(A) is the set of all possible values for attribute A, and S_v is the
subset of S for which attribute A has value v: S_v = {s ∈ S | A(s) = v}.
The first term in the equation for Gain is just the entropy of the original
collection S; the second term is the expected value of the entropy after S is
partitioned using attribute A.
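To make the formula concrete, here is a worked computation on an invented toy set S (10 instances: 6 positive, 4 negative) where attribute A splits S into S_v1 (4 instances, all positive) and S_v2 (6 instances: 2 positive, 4 negative).

```python
import math

def H(p, n):
    """Entropy of a set with p positive and n negative examples."""
    total = p + n
    return -sum((c / total) * math.log2(c / total) for c in (p, n) if c)

# Gain(S, A) = Entropy(S) - (4/10)*Entropy(S_v1) - (6/10)*Entropy(S_v2)
gain = H(6, 4) - (4 / 10) * H(4, 0) - (6 / 10) * H(2, 4)
print(round(H(6, 4), 3), round(gain, 3))  # 0.971 0.42
```

The pure subset S_v1 contributes zero entropy, so splitting on A removes roughly 0.42 bits of uncertainty from the original 0.971.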
EVALUATION
Training accuracy
How many training instances can be correctly classified based on the available
data?
It is high when the tree is deep/large, or when there is little conflict among
the training instances.
However, higher training accuracy does not imply good generalization.
Testing accuracy
Given a number of new instances, how many of them can we correctly classify?
Cross validation: partition the data into k folds; train on k−1 folds and test
on the held-out fold, rotating through all folds and averaging the testing
accuracy (see the sketch below).
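A minimal k-fold cross-validation sketch; train and accuracy are hypothetical placeholders for a tree learner (e.g., the ID3 sketch above) and its evaluation function.

```python
def k_fold_indices(n, k):
    """Assign indices 0..n-1 to k roughly equal folds."""
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)
    return folds

def cross_validate(examples, k, train, accuracy):
    """Average testing accuracy over k held-out folds."""
    scores = []
    for fold in k_fold_indices(len(examples), k):
        held_out = set(fold)
        test = [examples[i] for i in fold]
        training = [ex for i, ex in enumerate(examples) if i not in held_out]
        model = train(training)
        scores.append(accuracy(model, test))
    return sum(scores) / k
```

Because every instance is tested exactly once on a model that never saw it during training, the averaged score estimates testing accuracy rather than training accuracy.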