Ensemble Learning

The document discusses ensemble learning, which is the process of combining multiple machine learning models to obtain better predictive performance than could be obtained from any of the constituent models alone. It describes different types of ensembles including voting, bagging, boosting, and stacking. It also explains how combining models can lead to stronger learners even when the individual models are weak.

Uploaded by Hiba Saghir

ENSEMBLE LEARNING

The science of combining models

Wisdom of the Crowd

Wisdom of the Crowd (of machines)

• The wisdom of the crowd: aggregated judgments from a diverse group are often better than any individual judgment
• A diverse set of models is likely to make better decisions than any single model
• Combining decisions from multiple models improves the overall performance

[Figure: samples from some unknown distribution feeding six models (Model 1 through Model 6), whose decisions are combined]
What is ensemble learning?
• An ensemble is a group of predictors
• An ensemble can be a strong learner even if each predictor is a weak learner
• Provided there are a sufficient number of weak learners and they are sufficiently diverse

How does combining lead to a strong learner?
• Think of a slightly biased coin with a 51% chance of heads and a 49% chance of tails.

• Law of large numbers: as you keep tossing the coin, assuming every toss is independent of the others, the ratio of heads gets closer and closer to the probability of heads, 51%.
How does combining lead to a strong learner?
• Tossing the coin 1000 times, we will end up with roughly 510 heads and 490 tails, i.e., a majority of heads.

• Similarly, for an ensemble of 1000 classifiers, each correct 51% of the time, the probability that the majority vote is correct is close to 75%.
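The 75% figure can be checked directly: the majority of 1000 classifiers is correct exactly when more than 500 of the independent 51%-accurate votes are right, which is a binomial tail probability. A minimal sketch using only the standard library (the function name is mine):

```python
from math import comb

def majority_correct_prob(n: int, p: float) -> float:
    """Probability that a strict majority of n independent voters,
    each correct with probability p, gets the right answer."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

print(f"{majority_correct_prob(1000, 0.51):.3f}")  # roughly 0.73
```

This matches the slide's "close to 75%" claim; note it assumes the classifiers' errors are truly independent, which real ensembles only approximate.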
Ensemble Learning

• Different types of ensemble-learning approaches: voting, bagging and pasting, boosting, and stacking
• Ensembles may use the same or different learning algorithms: homogeneous vs. heterogeneous ensembles
• They may train on the same dataset or on random subsets of the data
• They may use the same or different sets of features
Voting Classifier
• Train diverse predictors on the same data

Voting Classifier
• Use voting for predictions
• Hard voting: the ensemble prediction is the prediction of the majority

Example: Predictors 1–5 predict 5, 4, 5, 4, 4, so the ensemble's prediction is 4.
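Hard voting amounts to taking the mode over the predictor outputs; a minimal plain-Python sketch using the slide's example values:

```python
from collections import Counter

def hard_vote(predictions):
    """Return the most common prediction among the base predictors."""
    return Counter(predictions).most_common(1)[0][0]

print(hard_vote([5, 4, 5, 4, 4]))  # prints 4
```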
Voting Classifier
• Use voting for predictions
• Soft voting: the ensemble prediction is the class with the highest averaged predicted probability
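Soft voting averages per-class probabilities instead of counting votes; a minimal sketch (the probability values are illustrative, not from the slide):

```python
def soft_vote(probabilities):
    """probabilities: one [p_class0, p_class1, ...] list per predictor.
    Average the class probabilities, then pick the argmax class."""
    n = len(probabilities)
    avg = [sum(p[c] for p in probabilities) / n
           for c in range(len(probabilities[0]))]
    return max(range(len(avg)), key=avg.__getitem__)

# Two classes, three predictors (illustrative probabilities):
print(soft_vote([[0.4, 0.6], [0.7, 0.3], [0.3, 0.7]]))  # prints 1
```

Note how soft voting can disagree with hard voting: here two of three predictors favor class 1, but even a 2-to-1 hard vote for class 0 could be overruled if the minority predictor were confident enough.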
Bagging and Pasting
• Use different random subsets of the samples
• Usually predictors of the same type are used

• Bagging: sampling with replacement
  • For a given predictor, a training instance may be sampled several times

• Pasting: sampling without replacement
  • Within a predictor each instance appears at most once, but instances may still be sampled several times across different predictors

• Training and predictions can be performed in parallel
Example of Bagging
Original data: 1 2 3 4 5 6 7 8 9 10 (sample size = 10)

Bootstrap 1: 7 8 10 8 2 5 10 10 5 9 → Model 1
Bootstrap 2: 1 8 5 10 5 5 9 6 3 7 → Model 2
Bootstrap 3: 1 4 9 1 2 3 2 7 3 2 → Model 3

The three models' outputs are aggregated into a combined prediction.
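The sampling difference between bagging and pasting can be sketched with the standard library's `random` module, using the slide's data 1–10 (the seed is arbitrary, chosen only for reproducibility):

```python
import random

data = list(range(1, 11))  # the slide's original data: 1..10
rng = random.Random(0)     # fixed seed, arbitrary

# Bagging: sample WITH replacement (duplicates allowed within a bootstrap)
bootstrap = rng.choices(data, k=10)

# Pasting: sample WITHOUT replacement (no duplicates within a sample)
paste = rng.sample(data, k=6)

print(bootstrap)
print(paste)
```

Repeated values can appear in `bootstrap` (as in Bootstrap 1 above, where 8 and 10 repeat), while `paste` always contains six distinct instances.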
Example of Pasting
Original data: 1 2 3 4 5 6 7 8 9 10 (sample size = 6)

Sample 1: 7 8 10 1 2 3 → Model 1
Sample 2: 1 8 5 10 9 6 → Model 2
Sample 3: 1 4 9 2 3 7 → Model 3

The three models' outputs are aggregated into a combined prediction.
Random Subspaces and Random Patches

Original data (6 instances, 5 features):

    X1  X2  X3  X4  X5
1   10  21  30  44  15
2   12  20  35  40  20
3   10  24  34  43  14
4   15  22  31  41  12
5   19  25  35  42  19
6   12  29  30  45  11

• Random subspaces: keep all instances and sample the features. In the slide's example, the three predictors see the feature subsets {X2, X4, X5}, {X1, X3, X4}, and {X1, X2, X3}, each over all six instances.
• Random patches: sample both instances and features, so each predictor sees a random subset of the rows and a random subset of the columns.


Boosting
• Unlike bagging, individual predictors are trained sequentially, each trying to correct
its predecessor.

Adaboost (Adaptive boosting)
• Instead of sampling, re-weight the samples
• Samples are given weights
• Start with uniform weighting

• At each iteration, a model is learned and the samples are re-weighted so that the next classifier focuses on samples that were wrongly predicted by the previous classifier
  • Weights of correctly predicted samples are decreased
  • Weights of incorrectly predicted samples are increased

• The final prediction is a combination of the model predictions, weighted by their respective error measures
Adaboost

[Figure: the original data D1 and the re-weighted datasets D2 and D3, each used to train one classifier (Classifiers 1–3); the three classifiers are combined into the final classifier]
Adaboost algorithm
• Each sample weight $w_i$ in the training set is initialized to $\frac{1}{m}$, where $m$ is the number of samples in the training set
• For each trained predictor $j$, compute its weighted error rate $r_j$ and its predictor weight $\alpha_j$; in the standard formulation, $r_j$ is the weight-normalized fraction of misclassified samples and $\alpha_j = \log\frac{1 - r_j}{r_j}$
• Update the sample weights: multiply the weight of each misclassified sample by $\exp(\alpha_j)$, then renormalize
• For final predictions, each predictor votes for its predicted class with weight $\alpha_j$; the class with the highest total vote wins
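One AdaBoost re-weighting round can be sketched in plain Python, following the standard update in which misclassified samples have their weights multiplied by exp(α) and the weights are then renormalized (the toy labels and predictions are illustrative):

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost step: weighted error r, predictor weight alpha,
    and the updated, renormalized sample weights."""
    r = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / sum(weights)
    alpha = math.log((1 - r) / r)
    new_w = [w * math.exp(alpha) if t != p else w
             for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return r, alpha, [w / total for w in new_w]

w0 = [0.2] * 5  # uniform initial weights, m = 5
# Sample at index 1 is misclassified; its weight will grow.
r, alpha, w1 = adaboost_round(w0, [1, 1, -1, -1, 1], [1, -1, -1, -1, 1])
print(round(r, 2), round(alpha, 2))  # 0.2 1.39
```

After the round, the misclassified sample carries weight 0.5 while the four correct ones carry 0.125 each, so the next classifier concentrates on the mistake.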
Gradient Boosting
• Unlike AdaBoost, Gradient Boosting tries to fit the new predictor to the residual errors
made by the previous predictor.

Gradient Boosting

Dataset $D = \{(x_i, y_i)\}_{i=1}^{m}$

$D_1 = \{(x_i,\ y_i)\}_{i=1}^{m}$

$D_2 = \{(x_i,\ y_i - h_1(x_i))\}_{i=1}^{m}$

$D_3 = \{(x_i,\ y_i - h_1(x_i) - h_2(x_i))\}_{i=1}^{m}$
Gradient boosting: example
• Training data: square footage of five apartments and their rent prices in dollars per month
• We use the mean (average) of the rent prices as our initial model F0
Gradient boosting: example
• Next, we train weak models Δᵢ to predict the residuals for all observations i
Gradient boosting: example
• Next, we train weak models Δᵢ to predict the residuals for all observations i
• The residuals (blue dots) get smaller as we add more learners
Gradient boosting: example
Gradient boosting: example
• Summing the three learners
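The rent example can be sketched end to end with regression stumps fit to the residuals; the square footages and rents below are hypothetical stand-ins, since the slide's actual numbers live in the figure:

```python
def fit_stump(x, residuals):
    """Find the threshold split on x minimizing squared error; return a
    predictor mapping x -> mean residual of its side of the split."""
    best = None
    for t in sorted(set(x))[1:]:
        left = [r for xi, r in zip(x, residuals) if xi < t]
        right = [r for xi, r in zip(x, residuals) if xi >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi < t else rm

x = [700, 800, 850, 900, 950]       # hypothetical square footages
y = [1000, 1150, 1300, 1400, 1500]  # hypothetical monthly rents

f0 = sum(y) / len(y)                # initial model F0: the mean rent
pred = [f0] * len(y)
for _ in range(3):                  # add three stump learners on the residuals
    residuals = [yi - pi for yi, pi in zip(y, pred)]
    stump = fit_stump(x, residuals)
    pred = [pi + stump(xi) for pi, xi in zip(pred, x)]

print([round(p) for p in pred])     # predictions approach the true rents
```

Each pass fits the next learner to what the current ensemble still gets wrong, so the squared error of `pred` against `y` shrinks with every added stump, mirroring the shrinking blue residual dots on the slide.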
Stacking (Stacked generalization)
• Basic idea: train a separate model, called the blender or meta-learner, to aggregate the predictions of the individual classifiers
• Predictions on the training set are used as features for the level 1 model
• The level 1 model is then used to make predictions on the test set
Stacking (Stacked generalization)
• Stacking with a hold-out set → Blending

Stacking
• Training the level 0 models
• The training set (features X1–X4, class label C) is split into two subsets
• Training subset 1 is used to train the level 0 models (Model 1, Model 2, Model 3); training subset 2 is held out for the next stage
Stacking
• The trained level 0 models predict on training subset 2
• Their predictions (M1, M2, M3), together with the true targets C, form the training set for the level 1 model (the generalizer), which produces the final predictions
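A minimal blending-style sketch in plain Python: the "level 0 models" are hand-written threshold rules and the "level 1 model" simply weights each one by its hold-out accuracy. Everything here (data, rules, weighting scheme) is illustrative, not a production stacking pipeline:

```python
# Toy 1-D hold-out data: the true label is 1 when x > 5 (illustrative)
holdout_x = [1, 2, 4, 6, 8, 9]
holdout_y = [0, 0, 0, 1, 1, 1]

# Three hand-written "level 0 models" (threshold rules of varying quality)
models = [lambda x: int(x > 5), lambda x: int(x > 3), lambda x: int(x > 7)]

# Level 1: weight each base model by its accuracy on the hold-out set
weights = []
for m in models:
    acc = sum(m(x) == y for x, y in zip(holdout_x, holdout_y)) / len(holdout_x)
    weights.append(acc)

def blended_predict(x):
    """Weighted vote of the base models, weights learned on the hold-out set."""
    score = sum(w * m(x) for w, m in zip(weights, models))
    return int(score > sum(weights) / 2)

print([blended_predict(x) for x in [2, 6, 10]])  # prints [0, 1, 1]
```

A real meta-learner (e.g., a logistic regression or random forest over the base predictions) would replace the accuracy weighting, but the data flow is the same: base models are trained on one subset, and the aggregator is fit on their hold-out predictions.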
Multilayer stacking

Conclusions
• Ensemble learning is about training multiple base models and combining them to obtain a strong model with better performance
  • Ideally low bias and low variance

• In bagging ensembles, instances of the same base model are trained in parallel on random subsets of the data and then aggregated
  • Random sampling reduces variance

• In boosting ensembles, instances of the same base model are trained iteratively, so that each model attempts to correct the predictions of the previous model
• Stacking ensembles use multi-stage training: different types of base models are trained at the first stage, on top of which a meta-model is trained to make predictions based on the base models' predictions
Ensemble learning on diabetes data
• Load the diabetes data and split it into a training set, a validation set, and a test set
  • 30% of the data for testing, 30% of the training set for validation
• Train various classifiers individually: decision tree, KNN, and SVM
• Voting ensemble: combine the classifiers into an ensemble using hard or soft voting
  • Use the validation set to find the best ensemble (it must outperform the individual classifiers)
  • Evaluate the best model found on the test set and compare the results

• Stacking ensemble: using the previous classifiers
  • Create a new training set (for the meta-learner) using the predictions on the validation set
  • Train a classifier (e.g., random forest) on the new training set
  • Evaluate the model on the test set and compare the results