
Decision Trees
23 June 2024 22:02


The Gini impurity is a measure of how often a randomly chosen element from the set would be
incorrectly labelled if it were labelled at random according to the distribution of labels in the
set.
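As a minimal sketch of this definition, the impurity can be computed directly from the label counts as G = 1 - Σ p_k², where p_k is the fraction of elements with label k (the function name and the example labels below are just for illustration):

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity G = 1 - sum_k p_k**2 for a collection of class labels."""
    _, counts = np.unique(np.asarray(labels), return_counts=True)
    p = counts / counts.sum()          # empirical label distribution
    return 1.0 - np.sum(p ** 2)

# A pure node has impurity 0; a 50/50 binary node has the maximum binary impurity, 0.5.
print(gini_impurity(["a", "a", "a", "a"]))  # 0.0
print(gini_impurity(["a", "a", "b", "b"]))  # 0.5
print(gini_impurity(["a", "a", "a", "b"]))  # 0.375
```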

Advantages
• Simple to understand and to interpret. Trees can be visualized (see the sketch after this list).
• Requires little data preparation. Other techniques often require data normalization, dummy variables to be created, and blank values to be removed. Note, however, that this module does not support missing values.

• The cost of using the tree (i.e., predicting data) is logarithmic in the number of data points
used to train the tree.

• Able to handle both numerical and categorical data.


• Can work on non-linear datasets.
• Can give you feature importances.
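A small sketch of the interpretability and prediction-cost points above, assuming the scikit-learn implementation; the dataset, depth limit, and variable names are illustrative only:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The fitted tree can be printed as a set of human-readable if/else rules.
print(export_text(clf, feature_names=load_iris().feature_names))

# Prediction walks a single root-to-leaf path, so on a balanced tree its cost
# is logarithmic in the number of training points.
print(clf.predict(X[:2]))
```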

Disadvantages
Decision-tree learners can create over-complex trees that do not generalize well to unseen
data. This is called overfitting. Mechanisms such as pruning, setting the minimum number of
samples required at a leaf node, or setting the maximum depth of the tree are necessary
to avoid this problem.

Decision trees can be unstable because small variations in the data might result in a
completely different tree being generated. This problem is mitigated by using decision
trees within an ensemble.

Predictions of decision trees are neither smooth nor continuous, but piecewise-constant
approximations. Therefore, they are not good at extrapolation.

This limitation is inherent to the structure of decision tree models. They are very useful
for interpretability and for handling non-linear relationships within the range of the
training data, but they aren't designed for extrapolation. If extrapolation is important for
your task, you might need to consider other types of models.
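A quick way to see the piecewise-constant behaviour and the extrapolation limit is to fit a regression tree to a simple linear trend; this is a sketch assuming scikit-learn, with synthetic data and an arbitrary depth limit:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Train on a simple linear relationship, y = 2x, for x in [0, 10).
X_train = np.arange(0, 10, 0.5).reshape(-1, 1)
y_train = 2 * X_train.ravel()

reg = DecisionTreeRegressor(max_depth=3).fit(X_train, y_train)

# Inside the training range, predictions form a step function (piecewise constant).
print(reg.predict(np.linspace(0, 10, 5).reshape(-1, 1)))

# Outside the training range, the tree simply repeats the value of its last leaf,
# so it cannot follow the linear trend beyond the data it has seen.
print(reg.predict([[20.0], [100.0]]))
```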

The importance of a feature is computed as the (normalized) total reduction of the criterion
brought by that feature. It is also known as the Gini importance.
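For instance, in scikit-learn the (normalized) importances are exposed as feature_importances_; the dataset below is just for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(data.data, data.target)

# feature_importances_ holds the total criterion (impurity) reduction per feature,
# normalized so that the values sum to 1.
for name, importance in zip(data.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```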

Pruning is a technique used in machine learning to reduce the size of decision trees and to
avoid overfitting. Overfitting happens when a model learns the training data too well,
including its noise and outliers, which results in poor performance on unseen or test data.

Decision trees are susceptible to overfitting because they can potentially create very complex
trees that perfectly classify the training data but fail to generalize to new data. Pruning helps
to solve this issue by reducing the complexity of the decision tree, thereby improving its
predictive power on unseen data.

There are two main types of pruning: pre-pruning and post-pruning.

1. Pre-pruning (Early stopping): This method halts the tree construction early. It can be done in
various ways: by setting a limit on the maximum depth of the tree, by setting a limit on the
minimum number of instances that must be in a node to allow a split, or by stopping when a split
improves the model's accuracy by less than a certain threshold.

2. Post-pruning (Cost Complexity Pruning): This method allows the tree to grow to its full size,
then prunes it. Nodes are removed from the tree based on the error complexity trade-off. The
basic idea is to replace a whole subtree by a leaf node, and assign the most common class in
that subtree to the leaf node.
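In scikit-learn, post-pruning corresponds to the ccp_alpha parameter together with cost_complexity_pruning_path; the following is a sketch of a typical workflow, with an arbitrary dataset and illustrative variable names:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Effective alphas at which whole subtrees would be collapsed into leaves.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

# Refit one tree per alpha and keep the one that does best on held-out data.
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
     for a in path.ccp_alphas),
    key=lambda tree: tree.score(X_test, y_test),
)
print(best.get_n_leaves(), best.score(X_test, y_test))
```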



Pre-pruning, also known as early stopping, is a technique where the decision tree is pruned
during the learning process as soon as it's clear that further splits will not add significant value.
There are several strategies for pre-pruning (a combined code sketch follows the list):

1. Maximum Depth: One of the simplest forms of pre-pruning is to set a limit on the maximum
depth of the tree. Once the tree reaches the specified depth during training, no new nodes are
created. This strategy is simple to implement and can effectively prevent overfitting, but if the
maximum depth is set too low, the tree might be overly simplified and underfit the data.

2. Minimum Samples Split: This is a condition where a node will only be split if the number of
samples in that node is above a certain threshold. If the number of samples is too small, then
the node is not split and becomes a leaf node instead. This can prevent overfitting by not
allowing the model to learn noise in the data.

3. Minimum Samples Leaf: This condition requires that a split at a node must leave at least a
minimum number of training examples in each of the leaf nodes. Like the minimum samples
split, this strategy can prevent overfitting by not allowing the model to learn from noise in the
data.
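A sketch of how these three pre-pruning controls map onto scikit-learn's constructor parameters; the specific values and dataset are arbitrary examples:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

clf = DecisionTreeClassifier(
    max_depth=4,            # 1. stop growing below this depth
    min_samples_split=20,   # 2. a node needs at least this many samples to be split
    min_samples_leaf=10,    # 3. every leaf must keep at least this many samples
    random_state=0,
).fit(X, y)

print(clf.get_depth(), clf.get_n_leaves())
```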
