ML4 - Decision Trees & Random Forest

Decision trees use a tree-like graph or model of decisions and their possible consequences to help determine an outcome. Random forest is an ensemble learning method that fits multiple decision trees on various sub-samples of a dataset and uses averaging to improve the predictive accuracy and control over-fitting. Ensemble methods like bagging, boosting, and stacking combine multiple machine learning models to reduce variance and bias to improve overall accuracy.


Decision Trees

Random Forest
REFERENCES
Decision Trees
Python Decision Tree Classification Tutorial: Scikit-Learn DecisionTreeClassifier | DataCamp
Ensemble Learning & Random Forest
Ensemble Learning Methods: Bagging, Boosting and Stacking (analyticsvidhya.com)
Basic Ensemble Learning (Random Forest, AdaBoost, Gradient Boosting) - Step by Step Explained | by Lilly Chen | Towards Data Science
Decision Trees
• Flowchart-like tree structure where
• each internal node represents a feature (or attribute),
• each branch represents a decision rule, and
• each leaf node represents the outcome
• Root Node: the topmost node in a decision tree
• The tree learns to partition on the basis of feature (attribute) values
• The tree is partitioned in a recursive manner, called recursive partitioning
• Helps you in decision-making
• Its flowchart-like visualization mimics human-level thinking
• That is why decision trees are easy to understand and interpret
Decision Trees
How does the Decision Tree Algorithm work?
• Select the best feature (attribute) using an Attribute Selection Measure (ASM) to split the examples (records)
• Make that attribute a decision node and break the dataset into smaller subsets
• Build the tree by repeating this process recursively for each child until one of the following conditions is met (a sketch of the procedure follows this list):
• All the tuples belong to the same class (target value)
• There are no more remaining attributes
• There are no more instances
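A minimal, self-contained sketch of this recursive procedure, assuming the records are Python dictionaries of categorical feature values; Gini impurity (one of the selection measures introduced below) stands in for the ASM:

```python
from collections import Counter

def gini(labels):
    # Gini impurity of a set of target labels: 1 - sum of squared class probabilities
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, attributes, target):
    labels = [r[target] for r in rows]
    # Stopping conditions: all rows share the same class, or no attributes remain
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]   # leaf node = majority class

    # Attribute Selection Measure: pick the attribute whose split has the lowest weighted impurity
    def weighted_impurity(attr):
        groups = {}
        for r in rows:
            groups.setdefault(r[attr], []).append(r[target])
        return sum(len(g) / len(rows) * gini(g) for g in groups.values())

    best = min(attributes, key=weighted_impurity)
    remaining = [a for a in attributes if a != best]

    # Recursive partitioning: one subtree per value of the chosen attribute
    node = {best: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        node[best][value] = build_tree(subset, remaining, target)
    return node
```

With the COVID-19 example used later in these slides, a call would look like build_tree(rows, ['Fever', 'Cough', 'Breathing issues'], target='Infected').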
Attribute Selection Measures
• Heuristics for selecting the splitting criterion that partitions data in the
best possible manner
• They assign a score (rank) to each feature (attribute) based on how well it explains the given dataset
• The attribute with the best score is selected as the splitting attribute
• Most popular selection measures are
• Information Gain
• Gain Ratio
• Gini Index
Information Gain
• Information Gain ≡ Reduction in Entropy
• How well the given feature (attribute) separates the target classes
• Entropy ≡ Measure of disorder (in target feature)
• For Binary Classification
• Entropy = 0, if all values of target feature are homogeneous (same)
• Entropy = 1, if target feature has equal number of values for both classes
• Entropy of a dataset S: Entropy(S) = - ∑ pᵢ * log₂(pᵢ) ; i = 1 to n
• n is the total number of classes in the target column (in our case n = 2, i.e. YES and NO)
• pᵢ is the probability of class i, i.e. the ratio of “number of rows (examples) with class i in the target column” to the “total number of rows (examples)” in the dataset
• Information Gain for a feature (attribute/column) A:
• IG(S, A) = Entropy(S) - ∑ᵥ ((|Sᵥ| / |S|) * Entropy(Sᵥ)), summed over the values v of A
• Sᵥ is the set of rows (examples) in S for which the feature (column) A has value v
• |Sᵥ| is the number of rows (examples) in Sᵥ
• |S| is the number of rows (examples) in S
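The two formulas translate directly into Python; a small sketch like the following can be used to check the hand calculations on the next slides (the dictionary-of-rows representation is an assumption about how the data is stored):

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy(S) = - sum(p_i * log2(p_i)) over the classes present in the target column
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    # IG(S, A) = Entropy(S) - sum(|S_v| / |S| * Entropy(S_v)) over the values v of attribute A
    labels = [r[target] for r in rows]
    total = entropy(labels)
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        total -= len(subset) / len(rows) * entropy(subset)
    return total
```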
• Sample dataset of COVID-19
infection
• Features (attributes) for decision nodes
• Fever
• Cough
• Breathing issues
• Target
• Infected
• Classes / Values (Two : Y & N)
• From the total of 14 rows in our dataset S,
there are 8 rows with the target
value YES and 6 rows with the target
value NO. The entropy of S is calculated as:
• Entropy(S) = - ∑ pᵢ * log₂(pᵢ) ; i = 1 to n
• Entropy(S) = - (8/14) * log₂(8/14) - (6/14) * log₂(6/14) = 0.99
• IG Calculation for Fever
• The Fever feature has 8 rows with value YES and 6 rows with value NO
• As shown at left, in the 8 rows where Fever is YES, there are 6 rows with target value YES and 2 rows with target value NO
• In the 6 rows where Fever is NO, there are 2 rows with target value YES and 4 rows with target value NO (the resulting IG is worked out below)
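Plugging the counts above into the formulas gives the Information Gain of Fever; a quick check using only the numbers stated on this slide:

```python
import math

def entropy_from_counts(*counts):
    # Entropy from raw class counts: skip zero counts to avoid log2(0)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

e_s       = entropy_from_counts(8, 6)   # whole dataset: 8 YES, 6 NO          -> ~0.99
e_fever_y = entropy_from_counts(6, 2)   # rows with Fever = YES: 6 YES, 2 NO  -> ~0.81
e_fever_n = entropy_from_counts(2, 4)   # rows with Fever = NO:  2 YES, 4 NO  -> ~0.92

ig_fever = e_s - (8/14) * e_fever_y - (6/14) * e_fever_n
print(round(ig_fever, 2))               # ~0.13
```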
First Step
Since the feature Breathing issues has the highest Information Gain, it is used to create the root node
Second Step
The IG of Fever is greater than that of Cough, so we select Fever as the left branch of Breathing Issues
Third Step
Only one unused feature is left, so we have no choice but to make it (Cough) the right branch of the root node
Final Step (Creating Leaf Nodes)
- For the left leaf node of Fever, we look at the subset of rows from the original data set that have both Breathing Issues and Fever equal to YES
- Similarly, for the right node of Fever we look at the subset of rows from the original data set that have Breathing Issues equal to YES and Fever equal to NO
- We repeat the same process for the node Cough; however, here both the left and right leaves turn out to be the same, i.e. NO
Attribute Selection Measures
• Information Gain
• ID3 (Iterative Dichotomiser) decision tree algorithm uses information gain
• Gain Ratio
• C4.5, an improvement of ID3, uses an extension to information gain known as
the gain ratio
• Gini Index (Impurity)
• CART (Classification and Regression Tree) uses the Gini method to create
split points
• Attribute with the minimum Gini index is chosen as the splitting attribute
Decision Tree Classifier Building in Scikit-learn
Example Code
• Import Required Libraries
• Load Data
• Load the required Pima Indian Diabetes dataset using pandas' read_csv function
• Ensure diabetes.csv is in the current folder
• Split the dataset into features and target variable
• Split the dataset into training and test sets
• Create the Decision Tree Classifier model, train it and predict
• Evaluate the model
• Visualising Decision Trees
• pip install graphviz
• pip install pydotplus
• The export_graphviz function converts the decision tree classifier into a dot file, and pydotplus converts this dot file to a PNG or a displayable form in Jupyter (a condensed sketch of these steps follows)
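A condensed, hedged sketch of the steps above; the column names and the exact file layout follow the DataCamp tutorial this slide is based on and are assumptions about the local copy of diabetes.csv:

```python
import pandas as pd
import pydotplus
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Load data (adjust header= if your copy of the file has no header row)
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
pima = pd.read_csv("diabetes.csv", header=0, names=col_names)

# Split dataset into features and target variable
feature_cols = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age']
X = pima[feature_cols]
y = pima.label

# Split dataset into training and test sets (70% training, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Create the Decision Tree classifier, train it and predict on the test set
clf = DecisionTreeClassifier()
clf = clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Evaluate the model
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

# Visualise the tree: export_graphviz produces a dot description, pydotplus renders it to PNG
dot_data = export_graphviz(clf, out_file=None, feature_names=feature_cols,
                           class_names=['0', '1'], filled=True, rounded=True)
pydotplus.graph_from_dot_data(dot_data).write_png("diabetes_tree.png")
```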
Decision Tree Classifier Building in Scikit-learn
Example Code
• Optimization of the decision tree classifier
• Pruning - the maximum depth of the tree can be used as a control variable for pre-pruning (e.g. max_depth=3)
• A different attribute selection measure, such as entropy, can also be used (see the sketch below)
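Continuing the sketch above (re-using X_train, X_test, y_train, y_test and the metrics import), the optimized classifier from this slide would look roughly like:

```python
# Entropy as the attribute selection measure, max_depth=3 for pre-pruning
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3)
clf = clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
```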
Decision Trees - Advantages
• Easy to interpret and visualize
• Can easily capture Non-linear patterns (can create complex decision
boundaries)
• Requires less data preprocessing from the user; for example,
• there is no need to normalize / scale columns
• Missing values do not affect the process of building the tree
• Has no assumptions about distribution because of the non-parametric nature
of the algorithm
• Parametric ML Models (Linear Regression, Naïve Bayes, NN, Logistic Regression)
• Non-Parametric ML Models (k-NN, DT, SVM)
Decision Trees - Disadvantages
• Sensitive to noisy data (can overfit noisy data)
• A small variation in the data can result in a completely different decision tree
• Decision trees are biased with imbalanced datasets, so it is recommended to balance out the dataset before creating the decision tree
• Can be highly time-consuming in the training phase
RANDOM FOREST
ENSEMBLE LEARNING
Ensemble Learning Methods: Bagging, Boosting and Stacking (analyticsvidhya.com)
Basic Ensemble Learning (Random Forest, AdaBoost, Gradient Boosting) - Step by Step Explained | by Lilly Chen | Towards Data Science
RANDOM FOREST / ENSEMBLE
LEARNING
• In Real Life
• Before taking Big Decisions, we ask opinions (friends / family / colleagues)
• Prevention against being BIASED and IRRATIONAL
• For ML Models too
• Individual models may suffer from BIAS and VARIANCE
• Ensemble Learning may prevent this
• Ensemble Learning
• Making predictions based on a number of different models
• By combining individual models, ensemble model tends to be
• More flexible (less BIAS)
• Less data-sensitive (less VARIANCE)
• Ensemble ≡ Crowd opinion!!!
Ensemble Methods
• Meta-algorithms that combine several machine learning techniques
into one predictive model in order to
• Decrease variance (bagging)
• Decrease bias (boosting)
• Improve predictions (stacking)
• Individual Models
• Tend to perform poorly (low prediction accuracy)
• Are weak learners (either high bias or high variance)
• Ensemble Learning
• Combine multiple models (learners) to get one with better performance
(accuracy)
Problems in Individual Models
• High bias model (not learning data well enough)
• High variance model (learning the data too well)
Ensemble Learning aims to
• Reduce the bias if we have a weak model with high bias and low
variance
• Reduce the variance if we have a weak model with high variance and
low bias
• Produce a resulting model that is much more balanced, with low bias and low variance. The resulting model will be
• Known as a strong learner
• More generalized than the weak learners
• Able to make accurate predictions
Ensemble Learning
• Improves a model’s performance in mainly three ways:
• By reducing the variance of weak learners (BAGGING)
• By reducing the bias of weak learners (BOOSTING)
• By improving the overall accuracy of strong learners (STACKING)
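For orientation, a hedged sketch of how the three strategies map onto scikit-learn's ensemble module (the base-model choices and parameters are illustrative, not prescribed by the slides):

```python
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Bagging: many models on bootstrap samples, combined by voting -> reduces variance
bagging = BaggingClassifier(n_estimators=100)      # default base learner is a decision tree

# Boosting: learners trained sequentially, each focusing on previous mistakes -> reduces bias
boosting = AdaBoostClassifier(n_estimators=100)    # default base learner is a decision stump

# Stacking: a meta-model learns how to combine the base models' predictions -> improves accuracy
stacking = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(max_depth=3)),
                ("logreg", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression())

# Each of these is then trained like any other estimator, e.g. bagging.fit(X_train, y_train)
```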
Random Forest
• Ensemble Model
• Using Bagging as Ensemble Method
• Decision Tree as individual model
• Step 1: Select n (e.g. 1000) random subsets from the training set
• Step 2: Train n (e.g. 1000) decision trees
• one random subset is used to train one decision tree
• the optimal splits for each decision tree are based on a random subset of
features (e.g. 10 features in total, randomly select 5 out of 10 features to split)
• Step 3: Each individual tree predicts the records/candidates in the
test set, independently.
• Step 4: Make the final prediction (voting / averaging)
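A minimal sketch of these four steps done by hand, assuming X_train, y_train, X_test are NumPy arrays; in practice sklearn's RandomForestClassifier performs all of this internally:

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

n_trees, trees = 1000, []          # Step 1/2: e.g. 1000 random subsets, 1000 trees
rng = np.random.default_rng(0)

for _ in range(n_trees):
    # Step 1: draw a bootstrap sample (random subset) of the training rows
    idx = rng.choice(len(X_train), size=len(X_train), replace=True)
    # Step 2: train one tree on it; max_features="sqrt" makes each split
    # consider only a random subset of the features
    tree = DecisionTreeClassifier(max_features="sqrt")
    trees.append(tree.fit(X_train[idx], y_train[idx]))

# Step 3: every tree predicts the test records independently
all_preds = np.array([t.predict(X_test) for t in trees])

# Step 4: final prediction by majority vote across the trees
final = [Counter(col).most_common(1)[0][0] for col in all_preds.T]
```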
AdaBoosting (Adaptive Boosting)
• Ensemble Model
• Using Boosting as Ensemble method
• Using Decision Tree as individual model
• The key is learning from previous mistakes, e.g. misclassified data points (the weights of misclassified points are increased)
• Step 0: Initialize the weights of the data points. If the training set has 100 data points, then each point’s initial weight should be 1/100 = 0.01.
• Step 1: Train a decision tree
• Step 2: Calculate the weighted error rate (e) of the decision tree. The weighted error rate (e) is the fraction of wrong predictions out of the total, where each wrong prediction counts according to its data point’s weight; the higher the weight, the more the corresponding error will be weighted during the calculation of (e).
• Step 3: Calculate this decision tree’s weight in the ensemble
• the weight of this tree = learning rate * log((1 - e) / e)
• the higher the weighted error rate of a tree, 😫, the less decision power the tree will be given during the later voting
• the lower the weighted error rate of a tree, 😃, the more decision power the tree will be given during the later voting
• Step 4: Update the weights of wrongly classified points
• the new weight of each data point:
• if the model got this data point correct, the weight stays the same
• if the model got this data point wrong, the new weight of this point = old weight * e^(weight of this tree)
• Note: The higher the weight of the tree (the more accurately this tree performs), the more boost (importance) the data points misclassified by this tree will get. The weights of the data points are normalized after all the misclassified points are updated.
• Step 5: Repeat from Step 1 (until the number of trees we set to train is reached)
• Step 6: Make the final prediction
• AdaBoost makes the final prediction by adding up the weight of each tree multiplied by the prediction of each tree (a sketch of all the steps follows). The tree with the higher weight will have more influence on the final decision.
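A compact sketch of Steps 0-6 above, assuming the two classes are coded as -1 and +1; real implementations such as sklearn's AdaBoostClassifier handle the details more carefully:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_trees=50, learning_rate=1.0):
    w = np.full(len(X), 1 / len(X))                    # Step 0: equal initial weights
    trees, alphas = [], []
    for _ in range(n_trees):
        stump = DecisionTreeClassifier(max_depth=1)    # Step 1: train a weighted weak tree
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        e = np.sum(w[pred != y]) / np.sum(w)           # Step 2: weighted error rate
        e = np.clip(e, 1e-10, 1 - 1e-10)
        alpha = learning_rate * np.log((1 - e) / e)    # Step 3: this tree's weight
        w[pred != y] *= np.exp(alpha)                  # Step 4: boost misclassified points...
        w /= w.sum()                                   # ...and normalize the weights
        trees.append(stump)
        alphas.append(alpha)                           # Step 5: repeat for the next tree
    return trees, alphas

def adaboost_predict(trees, alphas, X):
    # Step 6: weighted vote -- sum of (tree weight * tree prediction), then take the sign
    scores = sum(a * t.predict(X) for t, a in zip(trees, alphas))
    return np.sign(scores)
```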
Example Code
• Load Library
• Create Dataset
• Split Training – Test Set
• Fit a Decision Tree model
• Fit a Random Forest model
• Fit an AdaBoost model (a hedged sketch of these steps follows)
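A hedged sketch of the listed steps, using a synthetic dataset since the slides do not specify one:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Create a dataset and split it into training and test sets
X, y = make_classification(n_samples=2000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit one model of each kind and compare test accuracy
models = [("Decision Tree", DecisionTreeClassifier()),
          ("Random Forest", RandomForestClassifier(n_estimators=100)),
          ("AdaBoost", AdaBoostClassifier(n_estimators=100))]
for name, model in models:
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```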
Random Forest
• One of the most popular and commonly used algorithms by Data
Scientists
• Supervised Machine Learning Algorithm that is used widely in
Classification and Regression problems
• Builds decision trees on different samples and takes their majority vote
for classification and average in case of regression
• Can handle the data set containing continuous variables, as in the case
of regression, and categorical variables, as in the case of classification
Features of Random Forest
• Diversity: Not all attributes/variables/features are considered while making
an individual tree; each tree is different.
• Immune to the curse of dimensionality: Since each tree does not consider
all the features, the feature space is reduced.
• Parallelization: Each tree is created independently out of different data and
attributes. This means we can fully use the CPU to build random forests.
• Train-Test split: In a random forest, we don’t have to segregate the data for train and test, since roughly a third of the data (the out-of-bag samples) is never seen by each individual decision tree (see the sketch below)
• Stability: Stability arises because the result is based on majority voting/
averaging
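The "Train-Test split" point corresponds to the out-of-bag estimate in scikit-learn, which can be sketched like this (synthetic data and illustrative parameters):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Each tree never sees its own out-of-bag rows, so those rows act as a built-in validation set
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)
print("Out-of-bag accuracy estimate:", rf.oob_score_)
```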
Difference between DT & RF
• Overfitting: Decision trees normally suffer from overfitting if allowed to grow without any control; random forests are created from subsets of data and the final output is based on average or majority ranking, so the problem of overfitting is taken care of
• Speed: A single decision tree is faster in computation; a random forest is comparatively slower
• Approach: A decision tree takes a data set with features as input and formulates rules to make predictions; a random forest randomly selects observations, builds decision trees, and takes the average result without relying on a fixed set of rules
