Examples
A decision tree is a classification and prediction tool having a tree-like structure, where each
internal node denotes a test on an attribute, each branch represents an outcome of the test, and
each leaf node (terminal node) holds a class label.
Above we have a small decision tree. An important advantage of the decision tree is that it is
highly interpretable. Here, if height > 180 cm, or if height < 180 cm and weight > 80 kg, the
person is male; otherwise, female. Did you ever think about how we came up with this decision tree?
I will try to explain it using the weather dataset.
Before going further, I will explain some important terms related to decision trees.
Entropy
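Entropy is a measure of the randomness or impurity of a set of examples: it is 0 when all
examples belong to the same class and highest when the classes are evenly mixed. For a set $S$
with class proportions $p_c$, it is given by

$$
E(S) = -\sum_{c} p_c \log_2 p_c
$$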
Information Gain
Information gain can be defined as the amount of information gained about a random variable
or signal from observing another random variable. It can be considered as the difference
between the entropy of the parent node and the weighted average entropy of the child nodes.
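In formula form, for a split of a set $S$ on a feature $A$:

$$
IG(S, A) = E(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, E(S_v)
$$

where $S_v$ is the subset of examples for which $A$ takes the value $v$.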
Gini Impurity
Gini impurity is a measure of how often a randomly chosen element from the set would be
incorrectly labeled if it was randomly labeled according to the distribution of labels in the
subset.
Gini impurity is lower bounded by 0, with 0 occurring if the data set contains only one class.
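In formula form, for class proportions $p_c$:

$$
Gini(S) = 1 - \sum_{c} p_c^2
$$

For example, a node with only one class has $Gini = 1 - 1^2 = 0$, while a 50/50 binary split has
$Gini = 1 - 0.5^2 - 0.5^2 = 0.5$.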
There are many algorithms to build a decision tree. They are:
1. CART (Classification and Regression Trees) — This makes use of Gini impurity as
the metric.
2. ID3 (Iterative Dichotomiser 3) — This uses entropy and information gain as the metrics.
In this article, I will go through ID3. Once you get it, it is easy to implement the same using
CART.
Here there are four independent variables to determine the dependent variable. The
independent variables are Outlook, Temperature, Humidity, and Wind. The dependent
variable is whether to play football or not.
As the first step, we have to find the parent (root) node of our decision tree. For that, we begin
by calculating the entropy of the class label over the whole dataset.
Note: here we typically take logarithms to base 2. In total there are 14 examples, of which 9 are
yes and 5 are no; these counts give the class probabilities 9/14 and 5/14 used in the entropy calculation.
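Written out with those counts, the parent entropy is:

$$
E(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.94
$$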
From the above data, we can easily arrive at a frequency table for Outlook: how many yes and
no examples fall under sunny, overcast, and rainy.
Now we have to calculate the average weighted entropy, i.e., the sum of the entropy of each
branch weighted by the fraction of examples that fall into that branch.
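For Outlook, assuming the usual counts of the standard 14-row weather dataset (sunny: 2 yes / 3 no,
overcast: 4 yes / 0 no, rainy: 3 yes / 2 no), this works out to roughly:

$$
E(S, \text{Outlook}) = \frac{5}{14}E(\text{sunny}) + \frac{4}{14}E(\text{overcast}) + \frac{5}{14}E(\text{rainy})
\approx \frac{5}{14}(0.971) + \frac{4}{14}(0) + \frac{5}{14}(0.971) \approx 0.693
$$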
The next step is to find the information gain. It is the difference between the parent entropy and
the average weighted entropy we found above.
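With the assumed Outlook counts above, that gives:

$$
IG(S, \text{Outlook}) \approx 0.94 - 0.693 = 0.247
$$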
Now select the feature having the largest information gain. Here it is Outlook, so it forms the
first node (root node) of our decision tree.
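If you want to check these numbers yourself, here is a minimal sketch in Python. The per-value
yes/no counts below are assumptions based on the standard weather dataset (the article's own tables
are not reproduced here), and the helper names are just illustrative:

```python
from math import log2

def entropy(counts):
    """Entropy (base 2) of a list of class counts, e.g. [9, 5]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, branch_counts):
    """Parent entropy minus the weighted average entropy of the branches."""
    total = sum(parent_counts)
    weighted = sum(sum(b) / total * entropy(b) for b in branch_counts)
    return entropy(parent_counts) - weighted

# Assumed [yes, no] counts for the standard 14-row weather dataset.
parent = [9, 5]
features = {
    "Outlook":     [[2, 3], [4, 0], [3, 2]],   # sunny, overcast, rainy
    "Temperature": [[2, 2], [4, 2], [3, 1]],   # hot, mild, cool
    "Humidity":    [[3, 4], [6, 1]],           # high, normal
    "Wind":        [[6, 2], [3, 3]],           # weak, strong
}

for name, branches in features.items():
    print(f"IG({name}) = {information_gain(parent, branches):.3f}")
# Outlook should come out with the largest gain (~0.247), matching the choice above.
```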
The next step is to find the next node in our decision tree. Now we will find the one under sunny:
we have to determine which of Temperature, Humidity, or Wind has the highest information gain
on the subset of examples where Outlook is sunny.
Calculating these gains in the same way, we find that IG(sunny, Humidity) is the largest value.
So Humidity is the node that comes under sunny.
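Under the same assumed counts, the sunny branch contains 2 yes / 3 no examples, and the gains
come out roughly as:

$$
IG(\text{sunny}, \text{Humidity}) \approx 0.971, \qquad
IG(\text{sunny}, \text{Temperature}) \approx 0.571, \qquad
IG(\text{sunny}, \text{Wind}) \approx 0.020
$$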
For Humidity, from the above table we can say that play will occur if humidity is normal and
will not occur if it is high. Similarly, find the nodes under rainy.
As the next step, we will see how to do the same with CART by calculating the Gini gain. For
that, we first find the average weighted Gini impurity of Outlook, Temperature, Humidity, and Wind.
Choose the one that has the highest Gini gain. The Gini gain is highest for Outlook, so we can
choose it as our root node.
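As a rough check with the same assumed counts, the parent Gini impurity and the weighted Gini
impurity for Outlook are:

$$
Gini(S) = 1 - \left(\tfrac{9}{14}\right)^2 - \left(\tfrac{5}{14}\right)^2 \approx 0.459
$$

$$
Gini(S, \text{Outlook}) \approx \tfrac{5}{14}(0.48) + \tfrac{4}{14}(0) + \tfrac{5}{14}(0.48) \approx 0.343,
\qquad \text{Gini gain} \approx 0.459 - 0.343 = 0.116
$$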
Now you have got an idea of how to proceed further. Repeat the same steps we used in the
ID3 algorithm.
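If you prefer not to do the splits by hand, scikit-learn's DecisionTreeClassifier implements a
CART-style tree. Here is a minimal sketch; the few rows below are made up in the spirit of the
weather dataset (they are illustrative, not the article's full 14-row table):

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# A few illustrative rows shaped like the weather dataset.
data = pd.DataFrame({
    "Outlook":  ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Overcast"],
    "Humidity": ["High",  "Normal", "High",    "High",  "Normal", "Normal"],
    "Wind":     ["Weak",  "Strong", "Weak",    "Strong", "Weak",  "Strong"],
    "Play":     ["No",    "Yes",    "Yes",     "No",    "Yes",    "Yes"],
})

X = pd.get_dummies(data.drop(columns="Play"))   # one-hot encode the categorical features
y = data["Play"]

# criterion="entropy" mimics ID3-style information gain; criterion="gini" is the CART default.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)
print(export_text(clf, feature_names=list(X.columns)))
```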
Advantages:
Disadvantages:
1. More likely to overfit noisy data. The probability of overfitting on noise increases as the
tree gets deeper. A solution for this is pruning. You can read more about pruning in
my Kaggle notebook. Another way to avoid overfitting is to use bagging techniques
like Random Forest. You can read more about Random Forest in an article from
neptune.ai. A short sketch of both remedies follows below.
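As a hedged sketch of those two remedies using scikit-learn (not the exact code from the Kaggle
notebook or the neptune.ai article):

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Pruning / pre-pruning: limit how far the tree can grow so it cannot memorize noise.
pruned_tree = DecisionTreeClassifier(max_depth=3, ccp_alpha=0.01, random_state=0)

# Bagging: a random forest averages many decorrelated trees, which reduces overfitting.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# Both expose the same fit/predict API, e.g. (with your own training data):
# pruned_tree.fit(X_train, y_train); forest.fit(X_train, y_train)
```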