Ensemble Learning Algorithms
With Python
Jason Brownlee
Disclaimer
The information contained within this eBook is strictly for educational purposes. If you wish to apply
ideas contained in this eBook, you are taking full responsibility for your actions.
The author has made every effort to ensure that the information within this book was correct
at the time of publication. The author does not assume, and hereby disclaims, any liability to any
party for any loss, damage, or disruption caused by errors or omissions, whether such errors or
omissions result from accident, negligence, or any other cause.
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic or
mechanical, recording or by any information storage and retrieval system, without written permission
from the author.
Acknowledgements
Special thanks to my copy editor Sarah Martin and my technical editors Michael Sanderson, Arun
Koshy, Andrei Cheremskoy, and John Halfyard.
Copyright
Edition: v1.1
Contents
Copyright i
Contents ii
Preface iii
I Introduction iv
II Bagging 2
1 Bagged Decision Trees Ensemble 3
1.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Bagging Ensemble Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Evaluate Bagging Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Bagging for Classification . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.2 Bagging for Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Bagging Hyperparameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4.1 Explore Number of Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4.2 Explore Number of Samples . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.3 Explore Alternate Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Bagging Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.1 Pasting Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.2 Random Subspace Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5.3 Random Patches Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.6 Common Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Preface
Predictive skill can be the most important outcome for some modeling projects. This can
be the case when slightly better predictions can result in a large benefit to the organization.
A popular example is Netflix, where slightly better recommendations are known to result in
better customer retention with the platform. This motivated the one-million-dollar Netflix
prize, which was won using a large ensemble of models. On predictive modeling problems where
predictive performance is most important, like machine learning competitions, ensembles are
almost universally among the top and winning solutions. Ensemble learning algorithms are
required if you want the best results.
Ensemble learning used to be an advanced subfield of machine learning, left to the experts.
This was for two main reasons. The first is that many ensemble learning algorithms are a more
complex type of model, requiring the careful training and integration of multiple other machine
learning models. This makes them challenging to implement and challenging to train correctly
in a way that avoids data leakage and, in turn, optimistically misleading results. The second
reason is that ensemble learning is computationally expensive: instead of fitting and evaluating
one model, a single ensemble requires fitting and evaluating tens, hundreds, or even thousands
of models. This used to require large computational resources and expertise with parallel programming.
Thankfully, things have changed. Desktop computers are now incredibly fast and are multi-core
by default. We also have access to a suite of advanced ensemble algorithms in modern machine
learning libraries such as scikit-learn in Python, as well as highly efficient third-party
implementations of some of the more powerful ensemble algorithms in libraries like XGBoost and
LightGBM. It has never been easier to rapidly evaluate advanced ensemble learning algorithms
on your own predictive modeling projects. The problem has shifted from how to implement
ensemble methods correctly to understanding which ensemble methods are available and how
they can be tailored to specific machine learning projects. That is why I created this book.
I designed this book to take you on a tour of the most effective ensemble machine learning
algorithms and show you exactly how they can be used to address classification and regression
problems, and how to configure and tune the techniques to get the most out of them. I wanted
to skip the theory and math for each method, which may be interesting but does not tell you how
to actually configure and use the methods, and to focus on showing you exactly how to get a result
so that you can bring modern and powerful ensemble learning algorithms to your own projects
as fast as possible. Ensemble learning is important to machine learning, and I believe that if it
is taught at the right level for practitioners, it can be a fascinating, fun, directly applicable, and
immeasurably useful toolbox of techniques. I hope that you agree.
Jason Brownlee
2021
Part I
Introduction
Welcome
Welcome to Ensemble Learning Algorithms With Python. Ensemble learning algorithms are
those techniques that combine the predictions of two or more machine learning algorithms with
the goal of improving predictive skill. Ensemble learning algorithms are a more advanced subfield
of machine learning, often turned to on machine learning projects when predictive performance
is the most important objective. As such, ensembles are widely used by top participants and
winners of competitive machine learning competitions.
Traditionally, ensembles have been challenging to implement due to their increased computa-
tional cost and complexity, which can introduce data leakage and result in optimistic estimates
of model performance. Modern libraries, such as scikit-learn and related third party libraries,
now make working with ensembles straightforward for beginners and advanced practitioners
alike. I designed this book to teach you the techniques for ensemble learning step-by-step with
concrete and executable examples in Python.
This guide was written in the top-down and results-first machine learning style that you’re
used to from Machine Learning Mastery. Working through the tutorials in this book, you will learn:
The intuition behind drawing upon a crowd or multiple experts when making important
decisions and how this intuition carries over to ensemble learning algorithms.
The benefits of ensemble learning techniques for predictive modeling for both lifting
predictive skill and improving model robustness.
How to develop and evaluate multi-model algorithms for classification and regression
problems, providing a precursor to ensemble learning.
How to develop, configure, and evaluate bagging ensembles for classification and regression
predictive modeling problems.
How to develop and evaluate extensions to bagging, such as random subspace, random
forest, and extra trees ensembles.
How to develop, configure, and evaluate adaptive boosting (AdaBoost) and gradient
ensembles for classification and regression predictive modeling problems.
How to develop, configure, and evaluate stacking ensembles for classification and regression
predictive modeling problems.
How to develop and evaluate simpler stacking ensembles such as voting and weighted
average ensembles.
How to develop and evaluate extensions to stacking, such as model blending and super
learner ensembles.
This book is not a substitute for an undergraduate course on ensemble learning (if such
courses exist) or a textbook for such a course, although it could complement such materials. For
a good list of top papers, textbooks, and other resources on ensemble learning, see the Further
Reading section at the end of each tutorial.
Algorithms were demonstrated on synthetic and small standard datasets to give you the
context and confidence to bring the techniques to your own projects.
Model configurations used were discovered through trial and error and are skillful, but
not optimized. This leaves the door open for you to explore new and possibly better
configurations.
Code examples are complete and standalone. The code for each lesson will run as-is with
no code from prior lessons or third parties needed beyond the installation of the required
packages.
A complete working example is presented with each tutorial for you to inspect and copy-paste.
All source code is also provided with the book and I would recommend running the provided
files whenever possible to avoid any copy-paste issues. The provided code was developed in a
text editor and is intended to be run on the command line. No special IDE or notebooks are
required. If you are using a more advanced development environment and are having trouble,
try running the example from the command line instead.
Machine learning algorithms are stochastic. This means that they can make different
predictions each time the same model configuration is trained on the same training data. On top of
that, each experimental problem in this book is based on generating stochastic predictions. As a
result, this means you will not get exactly the same sample output presented in this book. This
is by design. I want you to get used to the stochastic nature of the machine learning algorithms.
If this bothers you, please note:
You can re-run a given example a few times and your results should be close to the values
reported.
You can make the output consistent by fixing the random number seed (see the short sketch
after this list).
You can develop a robust estimate of the skill of a model by fitting and evaluating it
multiple times and taking the average of the final skill score (highly recommended).
All code examples were tested on a POSIX-compatible machine with Python 3. All code
examples will run on modest and modern computer hardware. I am only human, and there
may be a bug in the sample code. If you discover a bug, please let me know so I can fix it and
correct the book (and you can request a free update at any time).
Each tutorial ends with a Further Reading section that links to resources such as:
Research papers.
Webpages.
API documentation.
Open-source projects.
Wherever possible, I have listed and linked to the relevant API documentation for key objects
and functions used in each lesson so you can learn more about them. When it comes to research
papers, I have listed those that are first to use a specific technique or first in a specific problem
domain. These are not required reading but can give you more technical details, theory, and
configuration specifics if you're looking for them. Wherever possible, I have tried to link to the freely
available version of the paper on the arXiv preprint archive. You can search for and download
any of the papers listed on Google Scholar Search. Wherever possible, I have tried to link to
books on Amazon.
I don’t know everything, and if you discover a good resource related to a given lesson, please
let me know so I can update the book.
Help with a technique? If you need help with the technical aspects of a specific
operation or technique, see the Further Reading section at the end of each tutorial.
Help with APIs? If you need help with using a Python library, see the list of resources
in the Further Reading section at the end of each lesson, and also see Appendix A.
Help with your workstation? If you need help setting up your environment, I would
recommend using Anaconda and following my tutorial in Appendix B.
Next
Are you ready? Let’s dive in!
This is Just a Sample
Jason Brownlee
Part II
Bagging
Chapter 1
Bagged Decision Trees Ensemble
Bagging is an ensemble machine learning algorithm that combines the predictions from many
decision trees. It is also easy to implement given that it has few key hyperparameters and
sensible heuristics for configuring these hyperparameters. Bagging performs well in general and
provides the basis for a whole field of decision tree ensemble algorithms such as the popular
random forest and extra trees ensembles, as well as the lesser-known Pasting, Random
Subspaces, and Random Patches ensemble algorithms. In this tutorial, you will discover how to
develop Bagging ensembles for classification and regression. After completing this tutorial, you
will know:
A Bagging ensemble is an ensemble created from decision trees fit on different bootstrap samples
of a dataset.
How to use the Bagging ensemble for classification and regression with scikit-learn.
1.1 Tutorial Overview
This tutorial is divided into five parts; they are:
1. Bagging Ensemble Algorithm
2. Evaluate Bagging Ensembles
3. Bagging Hyperparameters
4. Bagging Extensions
5. Common Questions
1.2 Bagging Ensemble Algorithm
Predictions are made for regression problems by averaging the prediction across the decision
trees. Predictions are made for classification problems by taking the majority vote prediction
for the classes from across the predictions made by the decision trees. The bagged decision trees
are effective because each decision tree is fit on a slightly different training dataset, which in
turn allows each tree to have minor differences and make slightly different skillful predictions.
Technically, we say that the method is effective because the trees have a low correlation between
predictions and, in turn, prediction errors.
Decision trees, specifically unpruned decision trees, are used as they slightly overfit the
training data and have a high variance. Other high-variance machine learning algorithms can
be used, such as a k-nearest neighbors algorithm with a low k value, although decision trees
have proven to be the most effective.
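To make the procedure concrete, the following is a minimal from-scratch sketch of bagged decision
trees for classification (bootstrap sampling, one unpruned tree per sample, majority vote at
prediction time). It is illustrative only and is not the implementation used in the rest of this
chapter; the dataset parameters simply mirror those used later.
# minimal from-scratch sketch of bagged decision trees for classification (illustrative only)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
# synthetic dataset, matching the one used later in this chapter
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=5)
rng = np.random.default_rng(1)
trees = list()
for _ in range(10):
    # draw a bootstrap sample: rows sampled with replacement, same size as the training set
    ix = rng.integers(0, len(X), len(X))
    # an unpruned decision tree deliberately overfits its sample and has high variance
    trees.append(DecisionTreeClassifier().fit(X[ix], y[ix]))
# classification: majority vote across the trees (regression would average the predictions instead)
preds = np.array([tree.predict(X[:5]) for tree in trees])
vote = np.apply_along_axis(lambda col: np.bincount(col.astype(int)).argmax(), 0, preds)
print(vote)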
If perturbing the learning set can cause significant changes in the predictor con-
structed, then bagging can improve accuracy.
Bagging does not always offer an improvement. For low-variance models that already perform
well, bagging can result in a decrease in model performance.
The evidence, both experimental and theoretical, is that bagging can push a good
but unstable procedure a significant step towards optimality. On the other hand, it
can slightly degrade the performance of stable procedures.
1.3 Evaluate Bagging Ensembles
Listing 1.2: Example output from creating the synthetic classification dataset.
Next, we can evaluate a Bagging algorithm on this dataset. We will evaluate the model
using repeated stratified k-fold cross-validation, with three repeats and 10 folds. We will report
the mean and standard deviation of the accuracy of the model across all repeats and folds.
# evaluate bagging algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
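# --- the rest of this listing is cut off in this sample; the lines below are a sketch of how it
# --- might continue, consistent with the procedure described above (not the book's original code)
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# define the synthetic dataset used throughout this chapter
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=5)
# define the model with default hyperparameters
model = BaggingClassifier()
# define the evaluation procedure: stratified 10-fold cross-validation with three repeats
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the accuracy scores
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report the mean and standard deviation of accuracy
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))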
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, we can see the Bagging ensemble with default hyperparameters achieves a
classification accuracy of about 85 percent on this synthetic dataset.
Mean Accuracy: 0.856 (0.037)
Listing 1.5: Example of using bagging for making a prediction on a classification dataset.
Running the example fits the Bagging ensemble model on the entire dataset, and the model is then
used to make a prediction on a new row of data, as we might when using the model in an application.
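The book's full listing and its specific input row are omitted from this sample; a minimal sketch
of this fit-and-predict usage, with a clearly hypothetical input row, might look like the following.
# sketch of fitting a bagging ensemble and predicting a single new row (illustrative only)
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
# define the dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=5)
# fit the model on the whole dataset
model = BaggingClassifier()
model.fit(X, y)
# placeholder row with 20 input values; a real application would supply new, unseen data
row = [[0.0] * 20]
yhat = model.predict(row)
print('Predicted Class: %d' % yhat[0])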
Predicted Class: 1
Listing 1.6: Example output from using bagging for making a prediction on a classification
dataset.
Now that we are familiar with using Bagging for classification, let’s look at the API for
regression.
Listing 1.8: Example output from creating the synthetic regression dataset.
Next, we can evaluate a Bagging algorithm on this dataset. As we did in the last section,
we will evaluate the model using repeated k-fold cross-validation, with three repeats and 10
folds. We will report the mean absolute error (MAE) of the model across all repeats and folds.
The complete example is listed below.
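The book's full listing is omitted from this sample; a minimal sketch of such an evaluation is
shown below. The make_regression parameters here are illustrative assumptions rather than the
book's exact configuration.
# evaluate a bagging ensemble for regression (a sketch, not the book's original listing)
from numpy import mean
from numpy import std
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedKFold
from sklearn.ensemble import BaggingRegressor
# define a synthetic regression dataset (parameters are assumptions)
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15, noise=0.1, random_state=5)
# define the model with default hyperparameters
model = BaggingRegressor()
# define the evaluation procedure: 10-fold cross-validation with three repeats
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the (negated) MAE scores
n_scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error', cv=cv, n_jobs=-1)
# report the mean and standard deviation of MAE
print('MAE: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))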
Note: The scikit-learn API flips the sign of the MAE to transform it from minimizing error to
maximizing negative error. This means that large magnitude positive errors become large
negative errors (e.g. 100 becomes -100) and a perfect model has an error of 0.0. It
also means that we can safely ignore the sign of the mean MAE scores.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, we can see that the Bagging ensemble with default hyperparameters achieves a
MAE of about 100.
MAE: -101.133 (9.757)
Listing 1.11: Example of using bagging for making a prediction on a regression dataset.
Running the example fits the Bagging ensemble model on the entire dataset, and the model is then
used to make a prediction on a new row of data, as we might when using the model in an application.
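As with classification, the book's listing and its specific input row are not shown here; a
minimal sketch with a hypothetical row is:
# sketch of fitting a bagging regressor and predicting a single new row (illustrative only)
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
# define the dataset (parameters are assumptions)
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15, noise=0.1, random_state=5)
# fit the model on the whole dataset
model = BaggingRegressor()
model.fit(X, y)
# placeholder row with 20 input values; a real application would supply new, unseen data
row = [[0.0] * 20]
yhat = model.predict(row)
print('Prediction: %.3f' % yhat[0])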
Prediction: -134
Listing 1.12: Example output from using bagging for making a prediction on a regression
dataset.
Now that we are familiar with using the scikit-learn API to evaluate and use Bagging
ensembles, let’s look at configuring the model.
1.4 Bagging Hyperparameters
# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
Listing 1.13: Example of evaluating the effect of the number of trees in the bagging ensemble.
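The helper functions referenced by this excerpt (get_dataset(), get_models(), and evaluate_model())
are not shown in this sample. A sketch of how they might be defined for this experiment, under the
assumption that the ensemble sizes match the results reported below, is:
# sketch of the helper functions assumed by the excerpt above (not the book's original code)
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier

# create the synthetic classification dataset
def get_dataset():
    X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=5)
    return X, y

# define bagging ensembles with different numbers of trees to compare
def get_models():
    models = dict()
    for n in [10, 50, 100, 500, 1000, 5000]:
        models[str(n)] = BaggingClassifier(n_estimators=n)
    return models

# evaluate a model with repeated stratified k-fold cross-validation
def evaluate_model(model, X, y):
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    return cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)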
Running the example first reports the mean accuracy for each configured number of decision
trees.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, we can see that performance improves on this dataset until about 100 trees and
remains flat after that.
>10 0.855 (0.037)
>50 0.876 (0.035)
>100 0.882 (0.037)
>500 0.885 (0.041)
>1000 0.885 (0.037)
>5000 0.885 (0.038)
Listing 1.14: Example output from evaluating the effect of the number of trees in the bagging
ensemble.
A box and whisker plot is created for the distribution of accuracy scores for each configured
number of trees. We can see the general trend of no further improvement beyond about 100
trees.
Figure 1.1: Box Plot of Bagging Ensemble Size vs. Classification Accuracy.
n_redundant=5, random_state=5)
return X, y
# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
# evaluate the model
scores = evaluate_model(model, X, y)
# store the results
results.append(scores)
names.append(name)
# summarize the performance along the way
print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()
Listing 1.15: Example of evaluating the effect of the number of samples in the bagging ensemble.
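The top of this listing, including its get_models() helper, is cut off in this sample. A sketch of
how that helper might be defined for exploring the bootstrap sample size (assuming fractions from
10 to 100 percent of the training set) is:
# sketch of a get_models() helper for exploring the bootstrap sample size (not the book's original code)
from sklearn.ensemble import BaggingClassifier

def get_models():
    models = dict()
    # bootstrap sample sizes from 10 percent to 100 percent of the training dataset
    for frac in [i / 10.0 for i in range(1, 11)]:
        models['%.1f' % frac] = BaggingClassifier(max_samples=frac)
    return models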
Running the example first reports the mean accuracy for each sample set size.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, the results suggest that performance generally improves with an increase in
the sample size, highlighting that the default of 100 percent of the size of the training dataset
is sensible. It might also be interesting to explore a smaller sample size with a corresponding
increase in the number of trees in an effort to reduce the variance of the individual models.
>0.1 0.810 (0.036)
>0.2 0.836 (0.044)
>0.3 0.844 (0.043)
>0.4 0.843 (0.041)
Listing 1.16: Example output from evaluating the effect of the number of samples in the bagging
ensemble.
A box and whisker plot is created for the distribution of accuracy scores for each sample
size. We see a general trend of increasing accuracy with sample size.
Figure 1.2: Box Plot of Bagging Sample Size vs. Classification Accuracy.
The example below demonstrates using a KNeighborsClassifier as the base algorithm in
the bagging ensemble. Here, the algorithm is used with default hyperparameters where k is
set to 5.
# evaluate bagging with knn algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
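# note: in scikit-learn 1.2 and later, the base_estimator argument has been renamed to estimator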
model = BaggingClassifier(base_estimator=KNeighborsClassifier())
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
Listing 1.17: Example of evaluating the effect of changing the base algorithm in the bagging
ensemble.
Running the example reports the mean and standard deviation accuracy of the model.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, we can see the Bagging ensemble with KNN and default hyperparameters
achieves a classification accuracy of about 88 percent on this synthetic dataset.
Mean Accuracy: 0.888 (0.036)
Listing 1.18: Example output from evaluating the effect of changing the base algorithm in the
bagging ensemble.
We can test different values of k to find the right balance of model variance to achieve good
performance as a bagged ensemble. The below example tests bagged KNN models with k values
between 1 and 20.
# explore bagging ensemble k for knn effect on performance
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from matplotlib import pyplot
def get_dataset():
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
n_redundant=5, random_state=5)
return X, y
# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
# evaluate the model
scores = evaluate_model(model, X, y)
# store the results
results.append(scores)
names.append(name)
# summarize the performance along the way
print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()
Listing 1.19: Example of evaluating the effect of changing the configuration of KNN as the base
algorithm in the bagging ensemble.
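As before, the get_models() and evaluate_model() helpers are not shown in this sample. A sketch of
a get_models() helper for this experiment, assuming k values from 1 to 20 as described above, is:
# sketch of a get_models() helper for exploring k in the bagged KNN ensemble (not the book's original code)
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier

def get_models():
    models = dict()
    # evaluate k values from 1 to 20 for the KNN base estimator
    for k in range(1, 21):
        # base_estimator follows the book's usage; newer scikit-learn versions name this argument estimator
        models[str(k)] = BaggingClassifier(base_estimator=KNeighborsClassifier(n_neighbors=k))
    return models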
Running the example first reports the mean accuracy for each k value.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, the results suggest that a small k value, such as two to four, results in the best mean
accuracy when used in a bagging ensemble.
>1 0.884 (0.025)
>2 0.890 (0.029)
Listing 1.20: Example output from evaluating the effect of changing the configuration of KNN
as the base algorithm in the bagging ensemble.
A box and whisker plot is created for the distribution of accuracy scores for each k value.
We see a general trend of increasing accuracy with k at first, then a modest decrease in
performance as the variance of the individual KNN models used in the ensemble is reduced
with larger k values.
Figure 1.3: Box Plot of Bagging KNN Number of Neighbors vs. Classification Accuracy.
1.5 Bagging Extensions
— Pasting Small Votes for Classification in Large Databases and On-Line, 1999.
The example below demonstrates the Pasting ensemble by setting the bootstrap argument
to False and setting the number of samples used in the training dataset via max_samples to a
modest value, in this case, 50 percent of the training dataset size.
# evaluate pasting ensemble algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier(bootstrap=False, max_samples=0.5)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
We investigate a very simple, yet effective, ensemble framework that builds each
individual model of the ensemble from a random patch of data obtained by drawing
random subsets of both instances and features from the whole dataset.
The example below demonstrates the Random Patches Ensemble with decision trees created
from a random sample of the training dataset limited to 50 percent of the size of the training
dataset, and with a random subset of 10 features.
# evaluate random patches ensemble algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier(bootstrap=False, max_features=10, max_samples=0.5)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, we can see the Random Patches Ensemble achieves a classification accuracy of
about 84 percent on this dataset.
Mean Accuracy: 0.845 (0.036)
1.6 Common Questions
... it is well known that Bagging should be used with unstable learners, and generally,
the more unstable, the larger the performance improvement.
... the performance of Bagging converges as the ensemble size, i.e., the number of
base learners, grows large ...
Bagging is best suited for problems with relatively small available training datasets.
1.7 Further Reading
Papers
Bagging predictors, 1996.
https://link.springer.com/article/10.1007/BF00058655
Pasting Small Votes for Classification in Large Databases and On-Line, 1999.
https://link.springer.com/article/10.1023/A:1007563306331
Books
Pattern Classification Using Ensemble Methods, 2010.
https://amzn.to/2zxc0F7
APIs
sklearn.ensemble.BaggingClassifier API.
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html
sklearn.ensemble.BaggingRegressor API.
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html
Articles
Bootstrap aggregating, Wikipedia.
https://en.wikipedia.org/wiki/Bootstrap_aggregating
1.8 Summary
In this tutorial, you discovered how to develop Bagging ensembles for classification and regression.
Specifically, you learned:
A Bagging ensemble is an ensemble created from decision trees fit on different bootstrap samples
of a dataset.
How to use the Bagging ensemble for classification and regression with scikit-learn.
Next
In the next section, we will take a closer look at an extension to Bagging called Random
Subspace Ensembles.
This is Just a Sample
Jason Brownlee