2025 Ensemble Learning

Presenter

B.E., M.Tech., Ph.D.


Senior Associate Professor, Grade 2, School of Computing Science Engineering and Artificial Intelligence
Assistant Director, Centre for Innovation in Teaching and Learning
VIT Bhopal University

"Invitation to Co-Create: Let's Collaborate"

+919945379089
Email(s): [email protected],[email protected]

Google Scholar || SCOPUS || ORCID || LinkedIn || YouTube


Data + Math + Binding Language ⇒ Machine Learning
Introduction to Machine Learning
What is Classification in Machine Learning?

https://paulvanderlaken.com/2020/01/20/animated-machine-learning-classifiers/

What is Ensemble Learning?


Ensemble learning is a machine learning technique where multiple models, often referred to as "weak learners" or "base
models," are combined to improve overall predictive performance. The core idea is that by aggregating the outputs of
multiple models, the ensemble can achieve better generalization and accuracy than any individual model alone.

For Numerical Prediction ⇒ Averaging is the technique

For Categorical Prediction ⇒ Majority Voting is the technique
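As a minimal illustration of these two aggregation rules (a sketch with made-up predictions, not taken from the slides):

```python
import numpy as np
from collections import Counter

# Numerical prediction: average the outputs of several base regressors
regression_preds = [23.1, 24.8, 22.9]            # hypothetical outputs of 3 base models
ensemble_regression = np.mean(regression_preds)  # ~23.6

# Categorical prediction: majority vote over the predicted class labels
classification_preds = ["spam", "ham", "spam"]   # hypothetical outputs of 3 base models
ensemble_class = Counter(classification_preds).most_common(1)[0][0]  # "spam"

print(ensemble_regression, ensemble_class)
```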

Why is Ensemble Learning considered in ML?


● Ensemble methods aim to improve predictive performance by combining several models into
one more reliable model.

● Ensemble methods work best when the base models make different types of errors, allowing
the combined model to compensate for individual weaknesses.

● Predictions from individual models are combined using techniques like averaging, voting, or
stacking to make the final prediction.

● Ensembles are generally more robust in terms of overfitting and noise in the data.
● The most popular ensemble methods are bagging, boosting, and stacking.
● Ensemble methods are ideal for regression and classification, where they reduce bias and
variance to boost the accuracy of models.
Categories of Ensemble Methods
Parallel ensemble techniques – e.g., Random Forest

Sequential ensemble techniques – e.g., Adaptive Boosting (AdaBoost)

Ensembles may use homogeneous base learners (all of the same type) or heterogeneous base learners (of different types).

Bias and Variance


Bias: The error is due to overly simplistic assumptions in the model. High bias can cause the model to miss important
patterns, leading to underfitting.

Variance: The error is due to sensitivity to small fluctuations in the training data. High variance can cause the
model to overfit, capturing noise instead of the underlying pattern.

A model with underfitting: Straight line failing to capture the curve.

A model with overfitting: Complex wavy line that fits every point, including noise.

A model with good / best fit: Smooth curve capturing the main trend.

Total Error = Bias² + Variance + Irreducible Error.
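For squared-error loss, this decomposition can be written more formally as follows (a standard formulation added here for completeness, where f is the true function, f-hat the learned model, and σ² the noise variance):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
```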


Sampling with replacement is called bootstrapping.

i. Random Forest can be adapted to classification or numeric prediction problems.

ii. It classifies data by voting, whereas for numeric prediction it averages the trees' outputs.

Random forest is a supervised learning algorithm that is used for both classification and regression. However, it is mainly used for classification problems. As we know, a forest is made up of trees, and more trees means a more robust forest.

Similarly, the random forest algorithm creates decision trees on data samples, gets a prediction from each of them, and finally selects the best solution by voting. It is an ensemble method that is better than a single decision tree because it reduces over-fitting by averaging the results.

The random forest is a model made up of many decision trees, and it does more than simply average the predictions of individual trees.

This model uses two key concepts:

1. A random sampling of training data points when building trees

2. Random subsets of features considered when splitting nodes

For the training phase, each tree in a random forest learns from a random sample of the data points. The samples are drawn with replacement, known as bootstrapping, which means that some samples will be used multiple times in a single tree.

A subset of all the features is considered for splitting each node in each decision tree. Generally, this is set to sqrt(n_features) for classification. For example, if there are 16 features, only 4 features will be considered for splitting at each node in each tree.

⇒ The random forest itself is a bagging methodology or Ensemble.

⇒ Random Forest builds the structure based on the Gini Impurity.
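A minimal scikit-learn sketch of these ideas (illustrative only; the dataset and parameter values are assumptions, not from the slides):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Bootstrap sampling of rows + a random subset of features at every split;
# trees split on Gini impurity by default (criterion="gini").
rf = RandomForestClassifier(
    n_estimators=100,     # number of trees in the forest
    max_features="sqrt",  # features considered at each split
    bootstrap=True,       # sample the training data with replacement
    random_state=42,
)
rf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, rf.predict(X_test)))
```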


Numerical Example: Construct a Decision Tree by using the Gini Index as the criterion

We are going to use this data sample. Here, we have 5 columns, of which 4 columns contain continuous data and the 5th column contains the class labels.

Attributes A, B, C, and D can be considered predictors, and the class labels in column E can be considered the target variable. To construct a decision tree from this data, we have to convert the continuous data into categorical data. We have chosen some threshold values to categorize each attribute.

Gini Index for Var A

Var A has a value >= 5 for 12 records out of 16, and 4 records have a value < 5.
● For Var A >= 5 & class == positive: 5/12
● For Var A >= 5 & class == negative: 7/12
  o gini(5,7) = 1 - ((5/12)^2 + (7/12)^2) = 0.486
● For Var A < 5 & class == positive: 3/4
● For Var A < 5 & class == negative: 1/4
  o gini(3,1) = 1 - ((3/4)^2 + (1/4)^2) = 0.375

Weighting each branch's Gini index by its share of the records and summing:
Gini(A) = (12/16) × 0.486 + (4/16) × 0.375 = 0.458
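The same numbers can be reproduced with a small helper function (a sketch written for this example; the counts come from the table above):

```python
def gini(counts):
    """Gini impurity for a list of class counts, e.g. [5, 7]."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Var A: split at >= 5; counts are [positive, negative] in each branch
left, right = [5, 7], [3, 1]
n = sum(left) + sum(right)            # 16 records in total
weighted = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
print(round(gini(left), 3), round(gini(right), 3), round(weighted, 3))  # 0.486 0.375 0.458
```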

Gini Index for Var B

Var B has a value >= 3 for 12 records out of 16, and 4 records have a value < 3.

● For Var B >= 3 & class == positive: 8/12
● For Var B >= 3 & class == negative: 4/12
  o gini(8,4) = 1 - ((8/12)^2 + (4/12)^2) = 0.444
● For Var B < 3 & class == positive: 0/4
● For Var B < 3 & class == negative: 4/4
  o gini(0,4) = 1 - ((0/4)^2 + (4/4)^2) = 0

Gini(B) = (12/16) × 0.444 + (4/16) × 0 = 0.333

Gini Index for Var C

Var C has a value >= 4.2 for 6 records out of 16, and 10 records have a value < 4.2.

● For Var C >= 4.2 & class == positive: 0/6
● For Var C >= 4.2 & class == negative: 6/6
  o gini(0,6) = 1 - ((0/6)^2 + (6/6)^2) = 0
● For Var C < 4.2 & class == positive: 8/10
● For Var C < 4.2 & class == negative: 2/10
  o gini(8,2) = 1 - ((8/10)^2 + (2/10)^2) = 0.32

Gini(C) = (6/16) × 0 + (10/16) × 0.32 = 0.20

Gini Index for Var D

Var D has a value >= 1.4 for 5 records out of 16, and 11 records have a value < 1.4.

● For Var D >= 1.4 & class == positive: 0/5
● For Var D >= 1.4 & class == negative: 5/5
  o gini(0,5) = 1 - ((0/5)^2 + (5/5)^2) = 0
● For Var D < 1.4 & class == positive: 8/11
● For Var D < 1.4 & class == negative: 3/11
  o gini(8,3) = 1 - ((8/11)^2 + (3/11)^2) = 0.397

Gini(D) = (5/16) × 0 + (11/16) × 0.397 = 0.273
In the case of the Gini Index, we choose the attribute with the minimum value as the root node. Among the 4 attributes, C has the minimum weighted Gini Index value of 0.2.

Hence, we choose C as the root node.


In general, the more trees in the forest, the more robust the forest is. In the same way, in a random forest classifier, a higher number of trees generally gives more accurate results.

Model Validation Metrics


Confusion Matrix - A Meme Card
❖ Sensitivity is also called recall.

F1 Score: In most real-life classification problems an imbalanced class distribution exists, so the F1-score is a better metric to evaluate the model on.

Accuracy can be used when the class distribution is balanced, while the F1-score is a better metric when the classes are imbalanced, as in the case above.

Online Calculator for the Confusion Matrix & Other Metrics


http://onlineconfusionmatrix.com/
Main Types of Ensemble Methods

1. Bagging (Bootstrap Aggregating):


How it works?

○ Take random subsets of the data (with replacement) to create different datasets (called bootstraps).

○ Train the same type of model (e.g., decision trees) on each dataset.

○ Combine the results by averaging (for regression) or voting (for classification).

● Example: Random Forest

○ Random Forest is an extension of bagging that uses decision trees. It also introduces a random selection of features at each split to increase diversity (see the sketch below).
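A minimal bagging sketch with scikit-learn (the dataset and parameters are illustrative assumptions; the `estimator` keyword requires scikit-learn ≥ 1.2, older versions call it `base_estimator`):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Bagging: train the same base model on bootstrap samples and combine by voting
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # base learner
    n_estimators=50,                     # number of bootstrap samples / models
    bootstrap=True,                      # sample the training data with replacement
    random_state=42,
)
print("CV accuracy:", cross_val_score(bagging, X, y, cv=5).mean())
```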

2. Boosting
How it works?

○ Train a model on the data.

○ Analyze where the model made errors.

○ Train a new model that focuses on correcting those errors.

○ Repeat this process multiple times, combining models in a weighted manner.

● The key difference from bagging is that models are trained sequentially, and each new model learns from the mistakes of the previous ones.

● Example: Gradient Boosting, AdaBoost

○ AdaBoost: Assigns higher weights to incorrectly predicted samples.

○ Gradient Boosting: Optimizes a loss function using gradient descent.
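A hedged sketch of both boosting variants in scikit-learn (dataset and hyperparameters are assumptions, not from the slides):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# AdaBoost: re-weights misclassified samples after each round
ada = AdaBoostClassifier(n_estimators=100, learning_rate=1.0, random_state=42)

# Gradient Boosting: each new tree fits the gradient of the loss on the previous ensemble's errors
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)

for name, model in [("AdaBoost", ada), ("GradientBoosting", gbm)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```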


3. Stacking (Stacked Generalization)
How it works?

1. Train multiple different models (e.g., decision tree, logistic regression, SVM) on the same dataset.

2. Use the predictions from these models as input features for a meta-model (e.g., logistic regression).

3. The meta-model learns to make the final predictions by combining the outputs of the base models.

● Key difference: Stacking uses multiple different types of models and combines them with a meta-model, unlike bagging and boosting, which typically use the same type of model.
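A minimal stacking sketch with scikit-learn, mirroring the three steps above (the base models and meta-model here are illustrative choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Heterogeneous base models; their cross-validated predictions feed a logistic-regression meta-model
stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=42)),
        ("svm", SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
print("CV accuracy:", cross_val_score(stack, X, y, cv=5).mean())
```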

Which models should be ensembled?

Let us consider models A, B, and C with accuracies of 87%, 82%, and 72% respectively. Suppose A and B are highly correlated, while C is not correlated with either A or B. In this scenario, instead of combining models A and B, model C should be combined with model A or model B to reduce the generalization error.
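One quick way to check how correlated two models' predictions are before ensembling them (a sketch; the prediction vectors are hypothetical):

```python
import numpy as np

# Hypothetical 0/1 predictions of two candidate models on the same validation set
preds_a = np.array([1, 0, 1, 1, 0, 1, 0, 1])
preds_b = np.array([1, 0, 1, 0, 0, 1, 0, 1])

# Pearson correlation of the prediction vectors; a low correlation suggests the
# models make different errors and are better ensembling partners
corr = np.corrcoef(preds_a, preds_b)[0, 1]
print("Prediction correlation:", round(corr, 3))
```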
6 ways to increase the quality of the model
Sample Calculation
Let us consider n = 7 (number of samples).

The initial weight attached to each observation is 1/7.

A stump is a 1-level decision tree. The main idea is that at each step we want to find the best stump, i.e., the best data split.

For each feature, we build a stump and decide which one becomes the base learner. [This is done based on Gini index, entropy, or information gain.]

The Total Error (TE) of the chosen stump is the sum of the weights of the records it misclassifies. Here, with one misclassified record, TE = 1/7.

Performance of stump = ½ ln[(1 − TE)/TE]. [The difference between log and ln is that log usually denotes base 10, while ln denotes base e, the natural log.]

= ½ ln[(1 − 1/7)/(1/7)]

= ½ ln[6]

≈ 0.896
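The same number can be checked in a couple of lines (sketch):

```python
import math

TE = 1 / 7                                   # total error of the chosen stump
performance = 0.5 * math.log((1 - TE) / TE)  # 0.5 * ln(6)
print(round(performance, 3))                 # ≈ 0.896
```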

Now we can update the weights: for every wrongly classified record we need to increase the weight, and for every correctly classified record we need to reduce the weight.

Below is the way to calculate the new weight for the correctly classified records:

New Stage Weight = Previous Stage Weight × exp(−performance)

= 1/7 × exp(−0.896)

≈ 0.058

Below is the way to calculate the new weight for the incorrectly classified records:

New Stage Weight = Previous Stage Weight × exp(+performance)

= 1/7 × exp(0.896)

≈ 0.350, so the model will pay more attention to these records in the next round.

The sum of the initial weights is 1, but after the update the weights no longer sum to 1. To normalize the weights, we need to divide each record's weight by the sum of all the record weights.
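Putting the whole weight-update step together (a sketch that reproduces the numbers above; which record is misclassified is an assumption made for illustration):

```python
import numpy as np

n = 7
weights = np.full(n, 1 / n)                  # initial weights, all 1/7
misclassified = np.zeros(n, dtype=bool)
misclassified[3] = True                      # assume a single misclassified record

TE = weights[misclassified].sum()            # 1/7
performance = 0.5 * np.log((1 - TE) / TE)    # ≈ 0.896

# Increase weights of misclassified records, decrease those of correct ones
weights = np.where(misclassified,
                   weights * np.exp(performance),    # ≈ 0.350
                   weights * np.exp(-performance))   # ≈ 0.058

weights /= weights.sum()                     # normalize so the weights sum to 1 again
print(np.round(weights, 3))
```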

Finally, each boosting technique differs from the others based on its baseline classifier, split criterion, working principle, and weighting scheme.

AdaBoost is a boosting algorithm that increases accuracy by giving more weight to the targets that the model misclassifies.

The Gradient Boosting algorithm increases accuracy by minimizing the loss function (the error, i.e., the difference between the actual and predicted values) and using it as the target for the next decision tree that is built.

Scikit-learn's Gradient Boosting uses decision trees as the base learner by default.


Stacking (Regression & Classification)
Hands-on Notebook
https://drive.google.com/file/d/1MC17y0t--JwCbnJYF7D-q3MnsFclH9NV/view?usp=sharing
