Decision Trees in Machine Learning - by Prashant Gupta - Towards Data Science
A decision tree is drawn upside down, with its root at the top. In the image on the left, the bold text in black represents a condition/internal node, based on which the tree splits into branches/edges. The end of a branch that doesn't split anymore is the decision/leaf; in this case, whether the passenger died or survived, represented as red and green text respectively.
A real dataset will have many more features, and the tree above would be just one branch in a much bigger one, but you can't ignore the simplicity of this algorithm. Feature importance is clear and relations can be viewed easily. This methodology is more commonly known as learning a decision tree from data, and the tree above is called a classification tree, as the target is to classify a passenger as survived or died. Regression trees are represented in the same manner, except that they predict continuous values like the price of a house. In general, decision tree algorithms are referred to as CART, or Classification and Regression Trees.
So, what is actually going on in the background? Growing a tree involves deciding which features to choose and what conditions to use for splitting, along with knowing when to stop. And since a tree generally grows arbitrarily, you will need to trim it down for it to look beautiful. Let's start with a common technique used for splitting.
Recursive Binary Splitting
In this procedure, all the features are considered and different split points are tried and tested using a cost function. The split with the best (lowest) cost is selected. Consider the earlier example of the tree learned from the Titanic dataset. At the first split, or the root, all attributes/features are considered and the training data is divided into groups based on the split. With 3 features, we have 3 candidate splits. Using a cost function, we calculate how much accuracy each split will cost us. The split that costs least is chosen, which in our example is the sex of the passenger. The algorithm is recursive in nature, as each group formed can be sub-divided using the same strategy. It is also known as a greedy algorithm, because at every step it picks the split with the lowest immediate cost without looking ahead. This makes the root node the best predictor/classifier.
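To make the search concrete, here is a minimal sketch of greedy split selection in Python (the function name and the generic cost argument are mine, for illustration, not from any particular library):

```python
def best_split(X, y, cost):
    """Greedy search: try every feature and threshold, keep the cheapest split.

    X is a list of feature rows, y the matching targets, and cost a function
    that scores a (left, right) partition of the targets (lower is better).
    """
    best = None  # (cost, feature_index, threshold)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, y) if row[j] < threshold]
            right = [t for row, t in zip(X, y) if row[j] >= threshold]
            if not left or not right:
                continue  # a real split must produce two non-empty groups
            c = cost(left, right)
            if best is None or c < best[0]:
                best = (c, j, threshold)
    return best  # apply recursively to each group to grow the tree
```

For classification, cost would measure how mixed the classes in the two groups are, for example a size-weighted Gini score as defined below.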
Cost of a split
Let's take a closer look at the cost functions used for classification and regression. In both cases the cost function tries to find the most homogeneous branches, i.e. branches whose groups have similar responses. This makes sense: the more homogeneous the groups, the more sure we can be of the response for a test input that follows a certain path.
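Concretely, the standard CART cost functions are the sum of squared errors for regression and the Gini score for classification:

$$\text{Regression:}\quad \sum_{i \in \text{group}} (y_i - \hat{y})^2 \qquad\qquad \text{Classification:}\quad G = \sum_{k} p_k\,(1 - p_k)$$

where $\hat{y}$ is the mean response of a group and $p_k$ is the proportion of class $k$ inputs in a group.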
A Gini score gives an idea of how good a split is by how mixed the response classes are in the groups created by the split. Here, pk is the proportion of inputs of a given class present in a particular group. Perfect class purity occurs when a group contains only inputs from the same class, in which case each pk is either 1 or 0 and G = 0, whereas a node with a 50-50 split of classes in a group has the worst purity: for a binary classification it will have pk = 0.5 and G = 0.5.
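As a quick check of those numbers, here is a small Gini helper (a sketch; the function name is illustrative):

```python
from collections import Counter

def gini(group):
    """Gini score G = sum(pk * (1 - pk)) over the classes present in a group."""
    n = len(group)
    return sum((c / n) * (1 - c / n) for c in Counter(group).values())

print(gini(["died", "died", "survived", "survived"]))  # worst purity: 0.5
print(gini(["survived"] * 4))                          # perfect purity: 0.0
```

To score a whole split, weight each group's Gini score by its share of the inputs and add the two together.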
Pruning
The performance of a tree can be further increased by pruning. It involves removing the branches that make use of features having low importance. This way, we reduce the complexity of the tree, and thus increase its predictive power by reducing overfitting.
Pruning can start at either the root or the leaves. The simplest method starts at the leaves and replaces each node with its most popular class; the change is kept if it doesn't deteriorate accuracy. This is also called reduced error pruning. More sophisticated methods, such as cost complexity pruning, use a learning parameter (alpha) to weigh whether nodes can be removed based on the size of the sub-tree. This is also known as weakest link pruning.
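Scikit-Learn (mentioned again at the end of this article) exposes cost complexity pruning directly through the ccp_alpha parameter; a minimal sketch, with an arbitrary dataset standing in for your own:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Candidate alphas along the pruning path; larger alpha prunes more.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

for alpha in path.ccp_alphas[::10]:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}  "
          f"test accuracy={tree.score(X_test, y_test):.3f}")
```

A moderate alpha typically yields a much smaller tree with the same or better test accuracy, which is exactly the reduction in overfitting described above.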
Advantages of CART
Simple to understand, interpret, and visualize.
Can handle both numerical and categorical data. Can also handle multi-output
problems.
Decision trees require relatively little effort from users for data preparation.
Disadvantages of CART
Decision-tree learners can create over-complex trees that do not generalize the data
well. This is called overfitting.
Decision trees can be unstable, because small variations in the data might result in a completely different tree being generated. This is called variance, which can be lowered by methods like bagging and boosting (see the sketch after this list).
Greedy algorithms cannot guarantee to return the globally optimal decision tree.
This can be mitigated by training multiple trees, where the features and samples
are randomly sampled with replacement.
Decision tree learners create biased trees if some classes dominate. It is therefore recommended to balance the dataset prior to fitting the decision tree.
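As a sketch of the bagging idea from the list above: Scikit-Learn's RandomForestClassifier trains many trees on bootstrap samples with random feature subsets and averages their votes, which lowers variance (the dataset and parameters here are arbitrary):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# One high-variance tree vs. an ensemble of 100 bagged trees.
single = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print("single tree:", cross_val_score(single, X, y, cv=5).mean())
print("forest:     ", cross_val_score(forest, X, y, cv=5).mean())
```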
That covers the basics, enough to get you up to speed with decision tree learning. An improvement over plain decision tree learning is made using the technique of boosting. A popular library for implementing these algorithms is Scikit-Learn. It has a wonderful API that can get your model up and running with just a few lines of code in Python.
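For instance, fitting a classification tree really does take only a few lines (a minimal sketch; the Iris dataset stands in for your own data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion="gini" is the default: splits are scored with the Gini function above.
model = DecisionTreeClassifier(criterion="gini", max_depth=3).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```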
If you liked this article, be sure to click ❤ below to recommend it, and if you have any questions, leave a comment and I will do my best to answer.
To stay more aware of the world of machine learning, follow me. It's the best way to find out when I write more articles like this.
You can also follow me on Twitter, email me directly, or find me on LinkedIn. I'd love to hear from you.