Ensemble Learning Algorithms

With Python

Make Better Predictions with

Bagging, Boosting, and Stacking

Jason Brownlee

Disclaimer
The information contained within this eBook is strictly for educational purposes. If you wish to apply
ideas contained in this eBook, you are taking full responsibility for your actions.
The author has made every effort to ensure the accuracy of the information within this book was
correct at time of publication. The author does not assume and hereby disclaims any liability to any
party for any loss, damage, or disruption caused by errors or omissions, whether such errors or
omissions result from accident, negligence, or any other cause.
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic or
mechanical, recording or by any information storage and retrieval system, without written permission
from the author.

Acknowledgements
Special thanks to my copy editor Sarah Martin and my technical editors Michael Sanderson, Arun
Koshy, Andrei Cheremskoy, and John Halfyard.

© Copyright 2021 Jason Brownlee. All Rights Reserved.

Ensemble Learning Algorithms With Python

Edition: v1.1
Contents

Preface

I Introduction

II Bagging
1 Bagged Decision Trees Ensemble
1.1 Tutorial Overview
1.2 Bagging Ensemble Algorithm
1.3 Evaluate Bagging Ensembles
1.3.1 Bagging for Classification
1.3.2 Bagging for Regression
1.4 Bagging Hyperparameters
1.4.1 Explore Number of Trees
1.4.2 Explore Number of Samples
1.4.3 Explore Alternate Algorithm
1.5 Bagging Extensions
1.5.1 Pasting Ensemble
1.5.2 Random Subspace Ensemble
1.5.3 Random Patches Ensemble
1.6 Common Questions
1.7 Further Reading
1.8 Summary

Preface

Predictive skill can be the most important outcome for some modeling projects. This can
be the case when slightly better predictions can result in a large benefit to the organization.
A popular example is Netflix, where slightly better recommendations are known to result in
better customer retention with the platform. This motivated the one-million-dollar Netflix
prize, which was won using a large ensemble of models. On predictive modeling problems where
predictive performance is most important, like machine learning competitions, ensembles are
almost universally among the top and winning solutions. Ensemble learning algorithms are
required if you want the best results.
Ensemble learning used to be an advanced subfield of machine learning, left to the experts.
This was for two main reasons. The first is that many ensemble learning algorithms are a more
complex type of model, requiring the careful training and integration of multiple other machine
learning models. This makes them challenging to implement and challenging to train correctly
in a way that avoids data leakage, and in turn, optimistically misleading results. The second
reason is that ensemble learning is computationally expensive as instead of fitting and evaluating
one model, a single ensemble requires fitting tens, hundreds, or even thousands of models. This
used to require large computational resources and expertise with parallel programming.
Thankfully, things have changed. Desktop computers are now incredibly fast and are multi-core
by default. We also have access to a suite of advanced ensemble algorithms in modern
machine learning libraries such as scikit-learn in Python, as well as highly efficient third-party
implementations of some of the more powerful ensemble algorithms in libraries like XGBoost and
LightGBM. It has never been easier to rapidly evaluate advanced ensemble learning algorithms
on your own predictive modeling projects. The problem has shifted from how to implement
ensemble methods correctly to which ensemble methods are available and how they can be
tailored to specific machine learning projects. That is why I created this book.
I designed this book to take you on a tour of the most effective ensemble machine learning
algorithms and show you exactly how they can be used to address classification and regression
problems, and how to configure and tune the techniques to get the most out of them. I wanted
to skip the theory and math for each method, which may be interesting but do not tell you how
to actually configure and use the methods, and focus on showing you exactly how to get a result
so that you can bring modern and powerful ensemble learning algorithms to your own projects
as fast as possible. Ensemble learning is important to machine learning, and I believe that if it
is taught at the right level for practitioners, it can be a fascinating, fun, directly applicable, and
immeasurably useful toolbox of techniques. I hope that you agree.

Jason Brownlee
2021

Part I

Introduction

Welcome

Welcome to Ensemble Learning Algorithms With Python. Ensemble learning algorithms are
those techniques that combine the predictions of two or more machine learning algorithms with
the goal of improving predictive skill. Ensemble learning algorithms are a more advanced subfield
of machine learning, often turned to on machine learning projects when predictive performance
is the most important objective. As such, ensembles are widely used by top participants and
winners of machine learning competitions.
Traditionally, ensembles have been challenging to implement due to their increased computational
cost and complexity, which can introduce data leakage and result in optimistic estimates
of model performance. Modern libraries, such as scikit-learn and related third-party libraries,
now make working with ensembles straightforward for beginners and advanced practitioners
alike. I designed this book to teach you the techniques for ensemble learning step-by-step with
concrete and executable examples in Python.

Who Is This Book For?

Before we get started, let’s make sure you are in the right place. This book is for developers that
may know some applied machine learning. Maybe you know how to work through a predictive
modeling problem end-to-end, or at least most of the main steps, with popular tools. The
lessons in this book do assume a few things about you, such as:

You know your way around basic Python for programming.

You may know some basic NumPy for array manipulation.

You may know some basic scikit-learn for modeling.

This guide was written in the top-down and results-first machine learning style that you’re
used to from Machine Learning Mastery.

About Your Outcomes

This book will teach you the techniques for ensemble learning that you need to know as a
machine learning practitioner. After reading and working through this book, you will know:

The intuition behind drawing upon a crowd or multiple experts when making important
decisions and how this intuition carries over to ensemble learning algorithms.

The benefits of ensemble learning techniques for predictive modeling for both lifting
predictive skill and improving model robustness.

How to develop and evaluate multi-model algorithms for classification and regression
problems, providing a precursor to ensemble learning.

How to develop, configure, and evaluate bagging ensembles for classification and regression
predictive modeling problems.

How to develop and evaluate extensions to bagging, such as random subspace, random
forest, and extra trees ensembles.

How to develop, configure, and evaluate adaptive boosting (AdaBoost) and gradient boosting
ensembles for classification and regression predictive modeling problems.

How to develop and evaluate efficient implementations of gradient boosting ensembles,
such as extreme gradient boosting (XGBoost) and light gradient boosting machines
(LightGBM).

How to develop, configure, and evaluate stacking ensembles for classification and regression
predictive modeling problems.

How to develop and evaluate simpler stacking ensembles such as voting and weighted
average ensembles.

How to develop and evaluate extensions to stacking, such as model blending and super
learner ensembles.

This book is not a substitute for an undergraduate course on ensemble learning (if such
courses exist) or a textbook for such a course, although it could complement such materials. For
a good list of top papers, textbooks, and other resources on ensemble learning, see the Further
Reading section at the end of each tutorial.

How to Read This Book

This book was written to be read linearly, from start to finish. That being said, if you know the
basics and need help with a specific technique, then you can skip straight to that section and
get started. This book was designed for you to read on your workstation, on the screen, not on
a tablet or eReader. My hope is that you have the book open right next to your editor and run
the examples as you read about them.
This book is not intended to be read passively or be placed in a folder as a reference text. It
is a playbook, a workbook, and a guidebook intended for you to learn by doing and then apply
your new understanding with working Python examples. To get the most out of the book, I
would recommend playing with the examples in each tutorial. Extend them, break them, then
fix them.

About the Book Structure

This book was designed around major ensemble learning techniques that are directly relevant to
real-world problems. There are a lot of things you could learn about ensemble learning, from
theory to abstract concepts to APIs. My goal is to take you straight to developing an intuition
for the elements you must understand with laser-focused tutorials.
The tutorials were designed to focus on how to get results with ensemble learning methods.
As such, the tutorials give you the tools to both rapidly understand and apply each technique
or operation. There is a mixture of both tutorial lessons and practical examples to introduce
the methods and give plenty of opportunities to practice using them. Each of the tutorials is
designed to take you about one hour to read through and complete, excluding the extensions
and further reading.
You can choose to work through the lessons one per day, one per week, or at your own pace.
I think momentum is critically important, and this book is intended to be read and used, not to
sit idle. I recommend picking a schedule and sticking to it. The tutorials are divided into six
parts; they are:
Part 1: Foundation: Discover the power of ensemble learning techniques, why they are
important to getting good performance on your project, and how to develop an intuition
for what is being learned.
Part 2: Background: Discover the background required for ensemble learning including
the diversity of ensemble members, techniques for combining predictions, the complexity
of ensemble models, and the main types of ensemble methods.
Part 3: Multiple Models: Discover machine learning techniques that involve explicitly
using multiple models that provide the foundation for ensemble learning methods.
Part 4: Bagging: Discover the bootstrap aggregation, or bagging, family of ensemble
learning techniques, including random forest, extra trees, and related methods.
Part 5: Boosting: Discover the boosting family of ensemble learning techniques, including
adaptive boosting, gradient boosting, and modern efficient implementations like
extreme gradient boosting and light gradient boosting machines.
Part 6: Stacking: Discover the stacked generalization or stacking family of ensemble
learning methods, including voting, blending, and related methods.
Each part targets a specific learning outcome, and so does each tutorial within each part.
This acts as a filter to ensure you are only focused on the things you need to know to get to a
specific result and do not get bogged down in the math or near-infinite number of digressions.
The tutorials were not designed to teach you everything there is to know about each of the
methods. They were designed to give you an understanding of how they work, how to use them,
and how to interpret the results the fastest way I know how: to learn by doing.

About Python Code Examples

The code examples were carefully designed to demonstrate the purpose of a given lesson. For
this reason, the examples are highly targeted.

Algorithms were demonstrated on synthetic and small standard datasets to give you the
context and confidence to bring the techniques to your own projects.

Model configurations used were discovered through trial and error and are skillful, but
not optimized. This leaves the door open for you to explore new and possibly better
configurations.

Code examples are complete and standalone. The code for each lesson will run as-is with
no code from prior lessons or third parties needed beyond the installation of the required
packages.

A complete working example is presented with each tutorial for you to inspect and copy-paste.
All source code is also provided with the book and I would recommend running the provided
files whenever possible to avoid any copy-paste issues. The provided code was developed in a
text editor and is intended to be run on the command line. No special IDE or notebooks are
required. If you are using a more advanced development environment and are having trouble,
try running the example from the command line instead.
Many machine learning algorithms are stochastic. This means that they can make different
predictions each time the same model configuration is trained on the same training data. On top of
that, the procedures used to evaluate models in this book also make use of randomness. As a
result, you will not get exactly the same sample output presented in this book. This
is by design. I want you to get used to the stochastic nature of the machine learning algorithms.
If this bothers you, please note:

You can re-run a given example a few times and your results should be close to the values
reported.

You can make the output consistent by fixing the random number seed.

You can develop a robust estimate of the skill of a model by fitting and evaluating it
multiple times and taking the average of the final skill score (highly recommended).
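The last two points can be sketched in a few lines. This is only an illustration under assumed settings: the small dataset, the helper name evaluate(), the seed values, and the use of five folds are not taken from the book's examples.

```python
# sketch: fixing the random seed and averaging over repeated evaluations
from numpy import mean
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

# small synthetic dataset (sizes chosen only to keep the sketch fast)
X, y = make_classification(n_samples=200, n_features=10, n_informative=5, random_state=1)

# hypothetical helper: evaluate one model with 5-fold cross-validation
def evaluate(seed=None):
    model = BaggingClassifier(random_state=seed)
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=5)
    return mean(scores)

# without a seed, two runs may report different scores
print('Unseeded runs: %.3f, %.3f' % (evaluate(), evaluate()))
# with a fixed seed, the same score is reported every time
first, second = evaluate(seed=7), evaluate(seed=7)
print('Seeded runs: %.3f, %.3f' % (first, second))
# a more robust estimate: average the score over several different seeds
repeats = [evaluate(seed=s) for s in range(5)]
print('Mean over 5 seeds: %.3f' % mean(repeats))
```

Fixing the seed is useful for reproducing a result, while averaging over several runs gives a better idea of the skill you can expect in general.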

All code examples were tested on a POSIX-compatible machine with Python 3. All code
examples will run on modest and modern computer hardware. I am only human, and there
may be a bug in the sample code. If you discover a bug, please let me know so I can fix it and
correct the book (and you can request a free update at any time).

About Further Reading

Each lesson includes a list of further reading resources. This may include:

Research papers.

Books and book chapters.

Webpages.

API documentation.

Open-source projects.

Wherever possible, I have listed and linked to the relevant API documentation for key objects
and functions used in each lesson so you can learn more about them. When it comes to research
papers, I have listed those that are first to use a specific technique or first in a specific problem
domain. These are not required reading but can give you more technical details, theory, and
configuration details if you’re looking for them. Wherever possible, I have tried to link to the freely
available version of the paper on the arXiv preprint archive. You can search for and download
any of the papers listed on Google Scholar Search. Wherever possible, I have tried to link to
books on Amazon.
I don’t know everything, and if you discover a good resource related to a given lesson, please
let me know so I can update the book.

About Getting Help

You might need help along the way. Don’t worry; you are not alone.

Help with a technique? If you need help with the technical aspects of a specific
operation or technique, see the Further Reading section at the end of each tutorial.

Help with APIs? If you need help with using a Python library, see the list of resources
in the Further Reading section at the end of each lesson, and also see Appendix A.

Help with your workstation? If you need help setting up your environment, I would
recommend using Anaconda and following my tutorial in Appendix B.

Help in general? You can shoot me an email. My details are in Appendix A.

Next
Are you ready? Let’s dive in!
This is Just a Sample

Thank you for your interest in Ensemble Learning Algorithms With Python.

This is just a sample of the full text. You can purchase the complete book online from:
https://machinelearningmastery.com/ensemble-learning-algorithms-with-python/

Part II

Bagging

Chapter 1

Bagged Decision Trees Ensemble

Bagging is an ensemble machine learning algorithm that combines the predictions from many
decision trees. It is easy to implement given that it has few key hyperparameters and
sensible heuristics for configuring them. Bagging performs well in general and
provides the basis for a whole family of decision tree ensemble algorithms, such as the popular
random forest and extra trees ensemble algorithms, as well as the lesser-known Pasting, Random
Subspaces, and Random Patches ensemble algorithms. In this tutorial, you will discover how to
develop Bagging ensembles for classification and regression. After completing this tutorial, you
will know:

The Bagging ensemble is an ensemble created from decision trees fit on different samples of a
dataset.

How to use the Bagging ensemble for classification and regression with scikit-learn.

How to explore the effect of Bagging model hyperparameters on model performance.

Let’s get started.

1.1 Tutorial Overview

This tutorial is divided into five parts; they are:

1. Bagging Ensemble Algorithm

2. Evaluate Bagging Ensembles

3. Bagging Hyperparameters

4. Bagging Extensions

5. Common Questions

1.2 Bagging Ensemble Algorithm

Bootstrap Aggregation, or Bagging for short, is an ensemble machine learning algorithm.
Specifically, it is an ensemble of decision tree models, although the bagging technique can also
be used to combine the predictions of other types of models. As its name suggests, bootstrap
aggregation is based on the idea of the bootstrap sample. A bootstrap sample is a sample of a
dataset with replacement. Replacement means that a sample drawn from the dataset is replaced,
allowing it to be selected again and perhaps multiple times in the new sample. This means that
the sample may have duplicate examples from the original dataset. The bootstrap sampling
technique is used to estimate a population statistic from a small data sample. This is achieved
by drawing multiple bootstrap samples, calculating the statistic on each, and reporting the
mean statistic across all samples.
An example of using bootstrap sampling would be estimating the population mean from a
small dataset. Multiple bootstrap samples are drawn from the dataset, the mean calculated on
each, then the mean of the estimated means is reported as an estimate of the population mean.
Surprisingly, the bootstrap method provides a robust and accurate approach to estimating
statistical quantities compared to a single estimate on the original dataset.
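The procedure can be sketched with the Python standard library alone. The data below is synthetic (30 values drawn from a Gaussian with a mean of 50), and the sample size, the number of bootstrap samples, and the variable names are assumptions made for illustration.

```python
# sketch: bootstrap estimate of a population mean from a small data sample
from random import Random
from statistics import mean

rng = Random(1)
# small synthetic data sample (assumed for illustration)
data = [rng.gauss(50, 10) for _ in range(30)]

n_bootstraps = 1000
boot_means = []
for _ in range(n_bootstraps):
    # draw a bootstrap sample: same size as the data, sampled with replacement
    sample = rng.choices(data, k=len(data))
    # calculate the statistic on the bootstrap sample
    boot_means.append(mean(sample))

# report the mean of the estimated means
print('Sample mean: %.3f' % mean(data))
print('Bootstrap estimate of the mean: %.3f' % mean(boot_means))
```

The spread of the values in boot_means also gives a sense of how uncertain the estimate is, which a single calculation on the original sample cannot provide.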
This same approach can be used to create an ensemble of decision tree models. This is
achieved by drawing multiple bootstrap samples from the training dataset and fitting a decision
tree on each. The predictions from the decision trees are then combined to provide a more
robust and accurate prediction than a single decision tree (typically, but not always).

Bagging predictors is a method for generating multiple versions of a predictor and
using these to get an aggregated predictor. [...] The multiple versions are formed by
making bootstrap replicates of the learning set and using these as new learning sets.

— Bagging Predictors, 1996.

Predictions are made for regression problems by averaging the prediction across the decision
trees. Predictions are made for classification problems by taking the majority vote prediction
for the classes from across the predictions made by the decision trees. The bagged decision trees
are effective because each decision tree is fit on a slightly different training dataset, which in
turn allows each tree to have minor differences and make slightly different skillful predictions.
Technically, we say that the method is effective because the trees have a low correlation between
predictions and, in turn, prediction errors.
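To make the procedure concrete, the sketch below builds a small bagged ensemble by hand for a binary classification problem: one decision tree per bootstrap sample, combined by majority vote. It is a simplified illustration and not how scikit-learn implements its Bagging classes internally; the dataset sizes, the train/test split, the number of trees (25), and all variable names are assumptions.

```python
# sketch: manual bagging with bootstrap samples, one tree per sample, majority vote
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# small synthetic binary classification problem (sizes assumed for illustration)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

rng = np.random.default_rng(1)
n_trees = 25  # odd number so a majority vote cannot tie; not tuned
trees = []
for i in range(n_trees):
    # draw a bootstrap sample: row indices sampled with replacement
    ix = rng.integers(0, len(X_train), size=len(X_train))
    # fit an unpruned decision tree on the bootstrap sample
    tree = DecisionTreeClassifier(random_state=i)
    tree.fit(X_train[ix], y_train[ix])
    trees.append(tree)

# one row of predictions per tree, one column per test example
all_preds = np.array([tree.predict(X_test) for tree in trees])
# majority vote: predict class 1 when more than half of the trees vote for it
votes_for_one = all_preds.sum(axis=0)
yhat = (votes_for_one > n_trees / 2).astype(int)

single_acc = np.mean(trees[0].predict(X_test) == y_test)
bagged_acc = np.mean(yhat == y_test)
print('Single tree accuracy: %.3f' % single_acc)
print('Bagged trees accuracy: %.3f' % bagged_acc)
```

For a regression problem, the same loop would use a regression tree and replace the vote with the mean of the predictions across the trees.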
Decision trees, specifically unpruned decision trees, are used as they slightly overfit the
training data and have a high variance. Other high-variance machine learning algorithms can
be used, such as a k-nearest neighbors algorithm with a low k value, although decision trees
have proven to be the most effective.

If perturbing the learning set can cause significant changes in the predictor constructed,
then bagging can improve accuracy.

— Bagging Predictors, 1996.

Bagging does not always offer an improvement. For low-variance models that already perform
well, bagging can result in a decrease in model performance.

The evidence, both experimental and theoretical, is that bagging can push a good
but unstable procedure a significant step towards optimality. On the other hand, it
can slightly degrade the performance of stable procedures.

— Bagging Predictors, 1996.

1.3 Evaluate Bagging Ensembles

The scikit-learn Python machine learning library provides an implementation of Bagging
ensembles for machine learning via the BaggingRegressor and BaggingClassifier classes.
Both models operate the same way and take the same arguments that influence how the decision
trees are created. Randomness is used in the construction of the model. This means that
each time the algorithm is run on the same data, it will produce a slightly different model.
When using a machine learning algorithm that is stochastic, it is good
practice to evaluate them by averaging their performance across multiple runs or repeats of
cross-validation. When fitting a final model, it may be desirable to either increase the number
of trees until the variance of the model is reduced across repeated evaluations, or to fit multiple
final models and average their predictions. Let’s take a look at how to develop a Bagging
ensemble for both classification and regression.
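As a rough sketch of the second option, fitting multiple final models and averaging their predictions, the example below fits five Bagging regressors that differ only in their random seed. The dataset, the number of models, and the reuse of three training rows as "new" data are assumptions made purely for illustration.

```python
# sketch: average the predictions of several final models fit with different seeds
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

# synthetic regression dataset (sizes assumed for illustration)
X, y = make_regression(n_samples=500, n_features=20, n_informative=15, noise=0.1,
    random_state=5)

# fit several final models on all data, each with a different random seed
models = [BaggingRegressor(random_state=seed).fit(X, y) for seed in range(5)]

# rows to predict (training rows reused here only to keep the sketch short)
X_new = X[:3]
# one row of predictions per model, one column per example
preds = np.array([model.predict(X_new) for model in models])
# average across the models to get the final prediction for each example
yhat = preds.mean(axis=0)
print(preds.shape, yhat.shape)
```

Averaging across seeds smooths out the run-to-run differences between individual final models, at the cost of fitting and storing more than one model.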

1.3.1 Bagging for Classification

In this section, we will look at using Bagging for a classification problem. First, we can use the
make_classification() function to create a synthetic binary classification problem with 1,000
examples and 20 input features. The complete example is listed below.
# synthetic binary classification dataset
from sklearn.datasets import make_classification
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# summarize the dataset
print(X.shape, y.shape)

Listing 1.1: Example of creating the synthetic classification dataset.

Running the example creates the dataset and summarizes the shape of the input and output
components.
(1000, 20) (1000,)

Listing 1.2: Example output from creating the synthetic classification dataset.
Next, we can evaluate a Bagging algorithm on this dataset. We will evaluate the model
using repeated stratified k-fold cross-validation, with three repeats and 10 folds. We will report
the mean and standard deviation of the accuracy of the model across all repeats and folds.
# evaluate bagging algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold

from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier()
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

Listing 1.3: Example of evaluating bagging on a classification dataset.

Running the example reports the mean and standard deviation of the accuracy of the model.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
comparing the average outcome.

In this case, we can see the Bagging ensemble with default hyperparameters achieves a
classification accuracy of about 85 percent on this synthetic dataset.
Mean Accuracy: 0.856 (0.037)

Listing 1.4: Example output from evaluating bagging on a classification dataset.

We can also use the Bagging model as a final model and make predictions for classification.
First, the Bagging ensemble is fit on all available data, then the predict() function can be
called to make predictions on new data. The example below demonstrates this on our binary
classification dataset.
# make predictions using bagging for classification
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier()
# fit the model on the whole dataset
model.fit(X, y)
# make a single prediction
row = [-4.7705504, -1.88685058, -0.96057964, 2.53850317, -6.5843005, 3.45711663,
-7.46225013, 2.01338213, -0.45086384, -1.89314931, -2.90675203, -0.21214568,
-0.9623956, 3.93862591, 0.06276375, 0.33964269, 4.0835676, 1.31423977, -2.17983117,
3.1047287]
yhat = model.predict([row])
# summarize the prediction
print('Predicted Class: %d' % yhat[0])

Listing 1.5: Example of using bagging for making a prediction on a classification dataset.
Running the example fits the Bagging ensemble model on the entire dataset and then uses it
to make a prediction on a new row of data, as we might when using the model in an application.
Predicted Class: 1

Listing 1.6: Example output from using bagging for making a prediction on a classification
dataset.
Now that we are familiar with using Bagging for classification, let’s look at the API for
regression.

1.3.2 Bagging for Regression

In this section, we will look at using Bagging for a regression problem. First, we can use the
make_regression() function to create a synthetic regression problem with 1,000 examples and
20 input features. The complete example is listed below.
# synthetic regression dataset
from sklearn.datasets import make_regression
# define dataset
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15, noise=0.1,
random_state=5)
# summarize the dataset
print(X.shape, y.shape)

Listing 1.7: Example of creating the synthetic regression dataset.

Running the example creates the dataset and summarizes the shape of the input and output
components.
(1000, 20) (1000,)

Listing 1.8: Example output from creating the synthetic regression dataset.
Next, we can evaluate a Bagging algorithm on this dataset. As we did with the last section,
we will evaluate the model using repeated k-fold cross-validation, with three repeats and 10
folds. We will report the mean absolute error (MAE) of the model across all repeats and folds.
The complete example is listed below.

Note: The scikit-learn API flips the sign of the MAE to transform it from minimizing error to
maximizing negative error. This means that large magnitude positive errors become large
negative errors (e.g. 100 becomes -100) and a perfect model has no error with a value of 0.0. It
also means that we can safely ignore the sign of the mean MAE scores.

# evaluate bagging ensemble for regression

from numpy import mean
from numpy import std
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedKFold
from sklearn.ensemble import BaggingRegressor
# define dataset
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15, noise=0.1,
random_state=5)
# define the model
model = BaggingRegressor()
# define the evaluation procedure

cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error', cv=cv, n_jobs=-1)
# report performance
print('MAE: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

Listing 1.9: Example of evaluating bagging on a regression dataset.

Running the example reports the mean and standard deviation of the MAE of the model.

In this case, we can see that the Bagging ensemble with default hyperparameters achieves an
MAE of about 100.
MAE: -101.133 (9.757)

Listing 1.10: Example output from evaluating bagging on a regression dataset.
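If you would rather report the MAE as a positive number, the sign can be removed before summarizing the scores. The sketch below is one way to do this; it uses a single repeat of 5-fold cross-validation and a fixed seed only to keep it quick, which are assumptions and differ from the evaluation procedure used above.

```python
# sketch: report a positive MAE from scikit-learn's negated scores
from numpy import mean
from numpy import std
from numpy import absolute
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedKFold
from sklearn.ensemble import BaggingRegressor

# define dataset
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15, noise=0.1,
    random_state=5)
# define the model (seed fixed only so the sketch is repeatable)
model = BaggingRegressor(random_state=1)
# smaller evaluation procedure than the one used above (assumed for speed)
cv = RepeatedKFold(n_splits=5, n_repeats=1, random_state=1)
# scores come back as negative MAE values
n_scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error', cv=cv)
# drop the sign so that smaller values are better, as usual for an error
mae_scores = absolute(n_scores)
print('MAE: %.3f (%.3f)' % (mean(mae_scores), std(mae_scores)))
```

The standard deviation is unaffected by the sign change, so only the reported mean differs from the output of the original listing.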

We can also use the Bagging model as a final model and make predictions for regression.
First, the Bagging ensemble is fit on all available data, then the predict() function can be
called to make predictions on new data. The example below demonstrates this on our regression
dataset.
# bagging ensemble for making predictions for regression
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
# define dataset
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15, noise=0.1,
random_state=5)
# define the model
model = BaggingRegressor()
# fit the model on the whole dataset
model.fit(X, y)
# make a single prediction
row = [0.88950817, -0.93540416, 0.08392824, 0.26438806, -0.52828711, -1.21102238,
-0.4499934, 1.47392391, -0.19737726, -0.22252503, 0.02307668, 0.26953276, 0.03572757,
-0.51606983, -0.39937452, 1.8121736, -0.00775917, -0.02514283, -0.76089365, 1.58692212]
yhat = model.predict([row])
# summarize the prediction
print('Prediction: %d' % yhat[0])

Listing 1.11: Example of using bagging for making a prediction on a regression dataset.
Running the example fits the Bagging ensemble model on the entire dataset and then uses it
to make a prediction on a new row of data, as we might when using the model in an application.
Prediction: -134

Listing 1.12: Example output from using bagging for making a prediction on a regression
dataset.
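It can help to see what predict() does for a regression ensemble: it averages the predictions of the individual members. The sketch below checks this by hand. It assumes the fitted attributes estimators_ and estimators_features_ exposed by scikit-learn on the fitted ensemble; the small dataset and the names member_preds and manual are illustrative only.

```python
# sketch: a bagging regression prediction is the average of the member predictions
from numpy import allclose
from numpy import mean
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
# small synthetic regression problem (size chosen only to keep the run short)
X, y = make_regression(n_samples=200, n_features=20, n_informative=15, noise=0.1,
    random_state=5)
# fit the ensemble on all available data
model = BaggingRegressor(n_estimators=10, random_state=1)
model.fit(X, y)
# a few rows to predict
rows = X[:5]
# collect each member's prediction, passing only the columns that member was fit on
member_preds = [est.predict(rows[:, feats])
    for est, feats in zip(model.estimators_, model.estimators_features_)]
# average across members
manual = mean(member_preds, axis=0)
# compare with the ensemble's own prediction
print('Manual average matches predict(): %s' % allclose(manual, model.predict(rows)))
```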
Now that we are familiar with using the scikit-learn API to evaluate and use Bagging
ensembles, let’s look at configuring the model.

1.4 Bagging Hyperparameters

In this section, we will take a closer look at some of the hyperparameters you should consider
tuning for the Bagging ensemble and their effect on model performance.

1.4.1 Explore Number of Trees

An important hyperparameter for the Bagging algorithm is the number of decision trees used in
the ensemble. Typically, the number of trees is increased until the model performance stabilizes.
Intuition might suggest that more trees will lead to overfitting, although this is not the case.
Bagging and related ensembles of decision trees (like random forest) appear to be
somewhat immune to overfitting the training dataset given the stochastic nature of the learning
algorithm. The number of trees can be set via the n_estimators argument and defaults to 10.
The example below explores the effect of the number of trees with values between 10 and 5,000.
# explore bagging ensemble number of trees effect on performance
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
from matplotlib import pyplot

# get the dataset
def get_dataset():
    X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
        n_redundant=5, random_state=5)
    return X, y

# get a list of models to evaluate
def get_models():
    models = dict()
    # define number of trees to consider
    n_trees = [10, 50, 100, 500, 1000, 5000]
    for n in n_trees:
        models[str(n)] = BaggingClassifier(n_estimators=n)
    return models

# evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    # define the evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate the model and collect the results
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    return scores

# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
    # evaluate the model
    scores = evaluate_model(model, X, y)
    # store the results
    results.append(scores)
    names.append(name)
    # summarize the performance along the way
    print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()

Listing 1.13: Example of evaluating the effect of the number of trees in the bagging ensemble.
Running the example first reports the mean accuracy for each configured number of decision
trees.

In this case, we can see that performance improves on this dataset until about 100 trees and
remains flat after that.
>10 0.855 (0.037)
>50 0.876 (0.035)
>100 0.882 (0.037)
>500 0.885 (0.041)
>1000 0.885 (0.037)
>5000 0.885 (0.038)

Listing 1.14: Example output from evaluating the effect of the number of trees in the bagging
ensemble.
A box and whisker plot is created for the distribution of accuracy scores for each configured
number of trees. We can see the general trend of no further improvement beyond about 100
trees.
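A cheaper way to watch performance level off is the out-of-bag (OOB) estimate, which scores each row using only the trees that did not see it in their bootstrap sample. This is an alternative to the cross-validation procedure used in this chapter, not a replacement for it. The sketch below assumes the oob_score argument and oob_score_ attribute of BaggingClassifier, and uses a smaller dataset and a short list of tree counts chosen only for speed.

```python
# sketch: use the out-of-bag estimate to see where the number of trees levels off
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
# small synthetic classification problem (size chosen only to keep the run short)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# store the out-of-bag accuracy for each ensemble size
oob_scores = dict()
for n in [25, 50, 100]:
    # oob_score=True requires bootstrap sampling, which is the default
    model = BaggingClassifier(n_estimators=n, oob_score=True, random_state=1)
    model.fit(X, y)
    oob_scores[n] = model.oob_score_
    # summarize the performance along the way
    print('>%d %.3f' % (n, model.oob_score_))
```

Very small ensembles are left out on purpose: with only a handful of trees, some rows may never be out-of-bag and the estimate becomes unreliable.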

Figure 1.1: Box Plot of Bagging Ensemble Size vs. Classification Accuracy.

1.4.2 Explore Number of Samples

The size of the bootstrap sample can also be varied. The default is to create a bootstrap sample
that has the same number of examples as the original dataset. Using a smaller sample can
increase the variance of the resulting decision trees and could result in better overall performance.
The number of samples used to fit each decision tree is set via the max_samples argument. The
example below explores different sized samples as a ratio of the original dataset from 10 percent
to 100 percent (the default).
# explore bagging ensemble number of samples effect on performance
from numpy import mean
from numpy import std
from numpy import arange
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
from matplotlib import pyplot

# get the dataset
def get_dataset():
    X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
        n_redundant=5, random_state=5)
    return X, y

# get a list of models to evaluate
def get_models():
    models = dict()
    # explore ratios from 10% to 100% in 10% increments
    for i in arange(0.1, 1.1, 0.1):
        key = '%.1f' % i
        models[key] = BaggingClassifier(max_samples=i)
    return models

# evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    # define the evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate the model and collect the results
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    return scores

# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
    # evaluate the model
    scores = evaluate_model(model, X, y)
    # store the results
    results.append(scores)
    names.append(name)
    # summarize the performance along the way
    print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()

Listing 1.15: Example of evaluating the effect of the number of samples in the bagging ensemble.
Running the example first reports the mean accuracy for each sample set size.

In this case, the results suggest that performance generally improves with an increase in
the sample size, highlighting that the default of 100 percent of the size of the training dataset
is sensible. It might also be interesting to explore a smaller sample size with a corresponding
increase in the number of trees in an effort to reduce the variance of the individual models.
>0.1 0.810 (0.036)
>0.2 0.836 (0.044)
>0.3 0.844 (0.043)
>0.4 0.843 (0.041)
>0.5 0.852 (0.034)
>0.6 0.855 (0.042)
>0.7 0.858 (0.042)
>0.8 0.861 (0.033)
>0.9 0.866 (0.041)
>1.0 0.864 (0.042)

Listing 1.16: Example output from evaluating the effect of the number of samples in the bagging
ensemble.
A box and whisker plot is created for the distribution of accuracy scores for each sample
size. We see a general trend of increasing accuracy with sample size.
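The suggestion of pairing a smaller sample with more trees can be tried directly. The sketch below compares two configurations side by side. The specific pairing (half the rows with five times as many trees), the smaller dataset, and the configuration names are assumptions made for illustration, not recommended settings.

```python
# sketch: compare the default sample size against a smaller sample with more trees
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# small synthetic classification problem (size chosen only to keep the run short)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# two illustrative configurations
configs = dict()
configs['full sample, 10 trees'] = BaggingClassifier(max_samples=1.0, n_estimators=10,
    random_state=1)
configs['half sample, 50 trees'] = BaggingClassifier(max_samples=0.5, n_estimators=50,
    random_state=1)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=1, random_state=1)
# evaluate each configuration and store the mean accuracy
results = dict()
for name, model in configs.items():
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
    results[name] = mean(scores)
    print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
```

Which configuration comes out ahead will depend on the dataset, so treat the comparison as a starting point for your own experiments.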

Figure 1.2: Box Plot of Bagging Sample Size vs. Classification Accuracy.

1.4.3 Explore Alternate Algorithm

Decision trees are the most common algorithm used in a bagging ensemble. The reason for this
is that they are easy to configure to have a high variance and because they perform well in
general. Other algorithms can be used with bagging and must be configured to have a modestly
high variance. One example is the k-nearest neighbors algorithm where the k value can be
set to a low value. The algorithm used in the ensemble is specified via the base_estimator
argument (renamed estimator in scikit-learn 1.2 and later) and must be set to an instance of
the algorithm and algorithm configuration to use.
The example below demonstrates using a KNeighborsClassifier as the base algorithm used
in the bagging ensemble. Here, the algorithm is used with default hyperparameters where k is
set to 5.
# evaluate bagging with knn algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier(base_estimator=KNeighborsClassifier())
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

Listing 1.17: Example of evaluating the effect of changing the base algorithm in the bagging
ensemble.
Running the example reports the mean and standard deviation accuracy of the model.

In this case, we can see the Bagging ensemble with KNN and default hyperparameters
achieves a classification accuracy of about 88 percent on this synthetic dataset.
Mean Accuracy: 0.888 (0.036)

Listing 1.18: Example output from evaluating the effect of changing the base algorithm in the
bagging ensemble.
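A useful sanity check when swapping the base algorithm is to compare the bagged model against a single instance of the same algorithm. The sketch below does this for KNN. It passes the base model as the first positional argument, which sidesteps the base_estimator/estimator naming difference between scikit-learn versions; the smaller dataset and the names single and bagged are assumptions for illustration.

```python
# sketch: compare a single knn model with a bagged ensemble of knn models
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import BaggingClassifier
# small synthetic classification problem (size chosen only to keep the run short)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# a standalone model and a bagged version of the same algorithm
single = KNeighborsClassifier()
bagged = BaggingClassifier(KNeighborsClassifier(), random_state=1)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
# evaluate both models on the same folds
single_scores = cross_val_score(single, X, y, scoring='accuracy', cv=cv)
bagged_scores = cross_val_score(bagged, X, y, scoring='accuracy', cv=cv)
# report performance
print('Single KNN: %.3f (%.3f)' % (mean(single_scores), std(single_scores)))
print('Bagged KNN: %.3f (%.3f)' % (mean(bagged_scores), std(bagged_scores)))
```

If the two results are close, the base algorithm may be too stable in its current configuration for bagging to add much.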
We can test different values of k to find the right balance of model variance to achieve good
performance as a bagged ensemble. The example below tests bagged KNN models with k values
between 1 and 20.
# explore bagging ensemble k for knn effect on performance
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from matplotlib import pyplot

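# get the dataset
def get_dataset():
    X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
        n_redundant=5, random_state=5)
    return X, y

# get a list of models to evaluate
def get_models():
    models = dict()
    # evaluate k values from 1 to 20
    for i in range(1,21):
        # define the base model
        base = KNeighborsClassifier(n_neighbors=i)
        # define the ensemble model
        # (the argument is named estimator in scikit-learn 1.2 and later)
        models[str(i)] = BaggingClassifier(base_estimator=base)
    return models

# evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    # define the evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate the model and collect the results
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    return scores

# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
    # evaluate the model
    scores = evaluate_model(model, X, y)
    # store the results
    results.append(scores)
    names.append(name)
    # summarize the performance along the way
    print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()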

Listing 1.19: Example of evaluating the effect of changing the configuration of KNN as the base
algorithm in the bagging ensemble.
Running the example first reports the mean accuracy for each k value.

In this case, the results suggest a small k value such as two to four results in the best mean
accuracy when used in a bagging ensemble.
>1 0.884 (0.025)
>2 0.890 (0.029)
>3 0.886 (0.035)
>4 0.887 (0.033)
>5 0.878 (0.037)
>6 0.879 (0.042)
>7 0.877 (0.037)
>8 0.877 (0.036)
>9 0.871 (0.034)
>10 0.877 (0.033)
>11 0.876 (0.037)
>12 0.877 (0.030)
>13 0.874 (0.034)
>14 0.871 (0.039)
>15 0.875 (0.034)
>16 0.877 (0.033)
>17 0.872 (0.034)
>18 0.873 (0.036)
>19 0.876 (0.034)
>20 0.876 (0.037)

Listing 1.20: Example output from evaluating the effect of changing the configuration of KNN
as the base algorithm in the bagging ensemble.
A box and whisker plot is created for the distribution of accuracy scores for each k value.
We see a general trend of increasing accuracy with k in the beginning, then a modest
decrease in performance as the variance of the individual KNN models used in the ensemble is
reduced with larger k values.

Figure 1.3: Box Plot of Bagging KNN Number of Neighbors vs. Classification Accuracy.

1.5 Bagging Extensions

There are many modifications and extensions to the bagging algorithm in an effort to improve
the performance of the approach. Perhaps the most famous is the random forest algorithm.
There are a number of less famous, although still effective, extensions to bagging that may be
interesting to investigate. This section demonstrates some of these approaches, such as the pasting
ensemble, random subspace ensemble, and random patches ensemble. We are not comparing
the results of these extensions on the dataset, but rather providing working examples of how to
use each technique that you can copy-paste and try with your own dataset.

1.5.1 Pasting Ensemble

The Pasting Ensemble is an extension to bagging that involves fitting ensemble members based
on random samples of the training dataset instead of bootstrap samples. The approach is
designed to use smaller sample sizes than the training dataset in cases where the training dataset
does not fit into memory.
The procedure takes small pieces of the data, grows a predictor on each small piece
and then pastes these predictors together. A version is given that scales up to
terabyte data sets. The methods are also applicable to on-line learning.

— Pasting Small Votes for Classification in Large Databases and On-Line, 1999.
The example below demonstrates the Pasting ensemble by setting the bootstrap argument
to False and setting the number of samples used in the training dataset via max_samples to a
modest value, in this case, 50 percent of the training dataset size.
# evaluate pasting ensemble algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier(bootstrap=False, max_samples=0.5)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

Listing 1.21: Example of evaluating a pasting ensemble.

Running the example reports the mean and standard deviation accuracy of the model.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
In this case, we can see the Pasting ensemble achieves a classification accuracy of about 85
percent on this dataset.
Mean Accuracy: 0.848 (0.039)

Listing 1.22: Example output from evaluating a pasting ensemble.
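To confirm that pasting draws rows without replacement, we can inspect the rows each ensemble member was fit on. The sketch below assumes the estimators_samples_ attribute that scikit-learn exposes on a fitted BaggingClassifier, which holds the row indices drawn for each member. With 1,000 rows and max_samples set to 0.5, each member should see 500 distinct rows.

```python
# sketch: check that pasting gives each member a sample with no repeated rows
from numpy import unique
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# define and fit the pasting ensemble
model = BaggingClassifier(bootstrap=False, max_samples=0.5, random_state=1)
model.fit(X, y)
# report the sample size and number of distinct rows for the first few members
for idx in model.estimators_samples_[:3]:
    print('rows: %d, distinct rows: %d' % (len(idx), len(unique(idx))))
```

Running the same check with bootstrap=True would show fewer distinct rows than drawn rows, which is the practical difference between bagging and pasting.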

1.5.2 Random Subspace Ensemble

A Random Subspace Ensemble is an extension to bagging that involves fitting ensemble members
based on datasets constructed from random subsets of the features in the training dataset. It
is similar to the random forest except the data samples are random rather than a bootstrap
sample and the subset of features is selected for the entire decision tree rather than at each split
point in the tree.
The classifier consists of multiple trees constructed systematically by pseudorandomly
selecting subsets of components of the feature vector, that is, trees constructed in
randomly chosen subspaces.
— The Random Subspace Method For Constructing Decision Forests, 1998.
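As a brief preview, one way to approximate a random subspace ensemble with scikit-learn is to turn off bootstrap sampling and limit the number of features given to each member. The sketch below is a minimal illustration under those assumptions; the choice of 10 of the 20 features and the smaller dataset are arbitrary and are not taken from the cited paper.

```python
# sketch: random subspace style ensemble using the bagging class
from numpy import mean
from numpy import std
from numpy import unique
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import BaggingClassifier
# small synthetic classification problem (size chosen only to keep the run short)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# every member sees all rows, but only a random subset of 10 columns
model = BaggingClassifier(bootstrap=False, max_features=10, random_state=1)
# evaluate the model
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=5)
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
# fit once more to inspect the columns chosen for the first few members
model.fit(X, y)
for feats in model.estimators_features_[:3]:
    print(sorted(feats))
```

Note that the feature subset is fixed for each whole tree here, rather than re-drawn at each split point as in a random forest.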
A worked example of the Random Subspace Ensemble is explored in the next chapter.

1.5.3 Random Patches Ensemble

The Random Patches Ensemble is an extension to bagging that involves fitting ensemble members
based on datasets constructed from random subsets of rows (samples) and columns (features) of
the training dataset. It does not use bootstrap samples and might be considered an ensemble
that combines both the random sampling of the dataset of the Pasting ensemble and the random
sampling of features of the Random Subspace ensemble.

We investigate a very simple, yet effective, ensemble framework that builds each
individual model of the ensemble from a random patch of data obtained by drawing
random subsets of both instances and features from the whole dataset.

— Ensembles on Random Patches, 2012.

The example below demonstrates the Random Patches Ensemble with decision trees created
from a random sample of the training dataset limited to 50 percent of the size of the training
dataset, and with a random subset of 10 features.
# evaluate random patches ensemble algorithm for classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
random_state=5)
# define the model
model = BaggingClassifier(bootstrap=False, max_features=10, max_samples=0.5)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model and collect the results
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

Listing 1.23: Example of evaluating a random patches ensemble.

Running the example reports the mean and standard deviation accuracy of the model.

In this case, we can see the Random Patches Ensemble achieves a classification accuracy of
about 84 percent on this dataset.
Mean Accuracy: 0.845 (0.036)

Listing 1.24: Example output from evaluating a random patches ensemble.
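The patch size has two parts, the fraction of rows and the number of columns, and both can be tuned together. The sketch below uses a small grid search for this. The grid values, the 3-fold cross-validation, and the smaller dataset are assumptions chosen to keep the run short, not recommended settings.

```python
# sketch: grid search the row and column sizes of a random patches ensemble
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import BaggingClassifier
# small synthetic classification problem (size chosen only to keep the run short)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# define the model with bootstrap sampling turned off
model = BaggingClassifier(bootstrap=False, random_state=1)
# illustrative grid of patch sizes
grid = dict()
grid['max_samples'] = [0.3, 0.5, 0.7]
grid['max_features'] = [5, 10, 15]
# define and run the search
search = GridSearchCV(model, grid, scoring='accuracy', cv=3)
search.fit(X, y)
# report the best configuration found
print('Best: %.3f using %s' % (search.best_score_, search.best_params_))
```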

1.6 Common Questions

In this section we will take a closer look at some common sticking points you may have with
the bagging ensemble procedure.

Q. What algorithm should be used in the ensemble?

The algorithm should have a moderate variance, meaning it is moderately dependent upon the
specific training data. A decision tree, often unpruned, is the default model to use because it
works well in practice. Other algorithms can be used as long as they are configured to have a
moderate variance.

... it is well known that Bagging should be used with unstable learners, and generally,
the more unstable, the larger the performance improvement.

— Page 52, Ensemble Methods, 2012.
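One way to explore this claim on your own data is to measure how much bagging helps a high-variance learner compared with a more stable one. The sketch below compares an unpruned decision tree with a depth-1 tree (a decision stump). The choice of learners, the 25 ensemble members, and the name gains are assumptions for illustration, and the size of any gain will vary by dataset.

```python
# sketch: measure the gain from bagging for an unstable and a stable learner
from numpy import mean
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
# small synthetic classification problem (size chosen only to keep the run short)
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
    n_redundant=5, random_state=5)
# a high-variance learner and a low-variance learner
learners = dict()
learners['unpruned tree'] = DecisionTreeClassifier(random_state=1)
learners['depth-1 stump'] = DecisionTreeClassifier(max_depth=1, random_state=1)
# define the evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=1, random_state=1)
# compare each learner on its own against a bagged ensemble of the same learner
gains = dict()
for name, base in learners.items():
    single = mean(cross_val_score(base, X, y, scoring='accuracy', cv=cv))
    ensemble = BaggingClassifier(base, n_estimators=25, random_state=1)
    bagged = mean(cross_val_score(ensemble, X, y, scoring='accuracy', cv=cv))
    gains[name] = bagged - single
    print('>%s single=%.3f bagged=%.3f gain=%.3f' % (name, single, bagged, gains[name]))
```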

Q. How many ensemble members should be used?

The performance of the model will converge with an increase of the number of decision trees to
a point, then remain level. Therefore, keep increasing the number of trees until the performance
stabilizes on your dataset.

... the performance of Bagging converges as the ensemble size, i.e., the number of
base learners, grows large ...

— Page 52, Ensemble Methods, 2012.

Q. Won’t the ensemble overfit with too many trees?

No. Bagging ensembles are very unlikely to overfit in general.

Q. How large should the bootstrap sample be?

It is good practice to make the bootstrap sample as large as the original dataset size. That is
100% the size or an equal number of rows as the original dataset.
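Even at 100% of the dataset size, a bootstrap sample does not contain every row, because rows are drawn with replacement. For large datasets the expected fraction of distinct rows approaches 1 - 1/e, or roughly 63 percent. The sketch below checks this with NumPy alone; the dataset size of 10,000 and the seed are arbitrary choices for illustration.

```python
# sketch: fraction of distinct rows in a bootstrap sample the size of the dataset
from numpy import unique
from numpy.random import default_rng
# seeded generator so the run is repeatable
rng = default_rng(1)
# number of rows in the hypothetical dataset
n = 10000
# draw n row indices with replacement
sample = rng.integers(0, n, size=n)
# fraction of the original rows that appear at least once
frac = len(unique(sample)) / n
print('Distinct rows in bootstrap sample: %.3f' % frac)
```

The rows left out of each sample are what make out-of-bag evaluation possible.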

Q. What problems are well suited to bagging?

Generally, bagging is well suited to problems with small or modest sized datasets. But this is a
rough guide. If you’re unsure, try it and see.

Bagging is best suited for problems with relatively small available training datasets.

— Page 12, Ensemble Machine Learning, 2012.

1.7 Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Papers
Bagging predictors, 1996.
https://link.springer.com/article/10.1007/BF00058655

Pasting Small Votes for Classification in Large Databases and On-Line, 1999.
https://link.springer.com/article/10.1023/A:1007563306331

The Random Subspace Method For Constructing Decision Forests, 1998.
https://ieeexplore.ieee.org/abstract/document/709601

Ensembles on Random Patches, 2012.
https://link.springer.com/chapter/10.1007/978-3-642-33460-3_28

Books
Pattern Classification Using Ensemble Methods, 2010.
https://amzn.to/2zxc0F7

Ensemble Methods, 2012.
https://amzn.to/2XZzrjG

Ensemble Machine Learning, 2012.
https://amzn.to/2C7syo5

APIs
sklearn.ensemble.BaggingClassifier API.
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html

sklearn.ensemble.BaggingRegressor API.
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html

Articles
Bootstrap aggregating, Wikipedia.
https://en.wikipedia.org/wiki/Bootstrap_aggregating

1.8 Summary
In this tutorial, you discovered how to develop Bagging ensembles for classification and regression.
Specifically, you learned:

Bagging ensemble is an ensemble created from decision trees fit on different samples of a
dataset.

How to use the Bagging ensemble for classification and regression with scikit-learn.

How to explore the effect of Bagging model hyperparameters on model performance.

Next
In the next section, we will take a closer look at an extension to Bagging called Random
Subspace Ensembles.
This is Just a Sample

Thank-you for your interest in Ensemble Learning Algorithms With Python.

This is just a sample of the full text. You can purchase the complete book online from:
https://machinelearningmastery.com/ensemble-learning-algorithms-with-python/




Who Is This Book For?

• You know your way around basic Python for programming.

• You may know some basic NumPy for array manipulation.

• You may know some basic Scikit-Learn for modeling.

About Your Outcomes

• How to develop and evaluate efficient implementations of gradient boosting ensembles,

How to Read This Book

About the Book Structure

About Python Code Examples

About Further Reading

• Books and book chapters.

About Getting Help

• Help in general? You can shoot me an email. My details are in Appendix A.

Thank you for your interest in Ensemble Learning Algorithms With Python.

Bagged Decision Trees Ensemble

• How to explore the effect of Bagging model hyperparameters on model performance.

Let’s get started.

1.1 Tutorial Overview

1. Bagging Ensemble Algorithm

2. Evaluate Bagging Ensembles

1.2 Bagging Ensemble Algorithm

Bagging predictors is a method for generating multiple versions of a predictor and

— Bagging Predictors, 1996.
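The idea of "multiple versions of a predictor" can be made concrete with a small from-scratch sketch. This is not the book's code: the helper names `fit_bagged_trees` and `predict_bagged`, the dataset sizes, and the choice of 11 trees are all assumptions made for illustration. Each tree is fit on a bootstrap sample (rows drawn with replacement), and predictions are combined by majority vote.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_bagged_trees(X, y, n_trees=11, seed=1):
    """Fit n_trees decision trees, each on its own bootstrap sample (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_rows = X.shape[0]
    trees = []
    for _ in range(n_trees):
        # Bootstrap sample: draw n_rows row indices with replacement.
        idx = rng.integers(0, n_rows, size=n_rows)
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees


def predict_bagged(trees, X):
    """Majority vote over the trees; assumes binary labels 0/1 (hypothetical helper)."""
    votes = np.stack([tree.predict(X) for tree in trees])  # shape (n_trees, n_rows)
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Small synthetic problem; the sizes are illustrative assumptions.
X, y = make_classification(n_samples=200, n_features=10, random_state=1)
trees = fit_bagged_trees(X, y)
y_hat = predict_bagged(trees, X)
print('Training accuracy: %.3f' % (y_hat == y).mean())
```

An odd number of trees is used so that a binary vote cannot tie. In practice scikit-learn's BaggingClassifier handles the sampling and aggregation.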

1.3 Evaluate Bagging Ensembles

1.3.1 Bagging for Classification

Listing 1.1: Example of creating the synthetic classification dataset.
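The body of this listing is not present in this excerpt. A minimal sketch of creating a synthetic binary classification dataset with scikit-learn's make_classification might look as follows; the number of rows, the feature counts, and the seed are assumptions, not values confirmed by the text.

```python
from sklearn.datasets import make_classification

# Synthetic binary classification problem; all parameter values are assumed.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
# Summarize the shape of the input and output arrays.
print(X.shape, y.shape)  # prints (1000, 20) (1000,)
```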

from sklearn.model_selection import RepeatedStratifiedKFold

Listing 1.3: Example of evaluating bagging on a classification dataset.
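The evaluation code is not included in this excerpt beyond the RepeatedStratifiedKFold import. As a sketch under stated assumptions, a bagging classifier with default settings could be scored with repeated stratified k-fold cross-validation like this; the dataset parameters, the 10 folds with 3 repeats, and the accuracy metric are assumed.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Assumed synthetic dataset.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
# Bagging with the default base model (a decision tree) and default ensemble size.
model = BaggingClassifier()
# Evaluation procedure: 10 folds, repeated 3 times (assumed configuration).
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
# Report mean and standard deviation of accuracy across all folds and repeats.
print('Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
```

Because bootstrap sampling is stochastic and the model is unseeded here, the reported score will vary slightly from run to run.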

Listing 1.4: Example output from evaluating bagging on a classification dataset.

1.3.2 Bagging for Regression

Listing 1.7: Example of creating the synthetic regression dataset.

# evaluate bagging ensemble for regression

# define the evaluation procedure

Listing 1.9: Example of evaluating bagging on a regression dataset.
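The regression listing is also missing from this excerpt. A comparable sketch uses BaggingRegressor with RepeatedKFold and mean absolute error. The dataset parameters and the fold configuration are assumptions. Note that scikit-learn reports MAE as a negative score so that larger is always better.

```python
from numpy import mean, std
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Assumed synthetic regression dataset.
X, y = make_regression(n_samples=1000, n_features=20, n_informative=15,
                       noise=0.1, random_state=5)
# Bagging with the default base model (a decision tree regressor).
model = BaggingRegressor()
# Evaluation procedure: 10 folds, repeated 3 times (assumed configuration).
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
# Scores are negated MAE values, so they are all <= 0.
n_scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error', cv=cv)
print('MAE: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
```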

Listing 1.10: Example output from evaluating bagging on a regression dataset.

1.4 Bagging Hyperparameters

1.4.1 Explore Number of Trees

# get the dataset

# get a list of models to evaluate

# evaluate a given model using cross-validation

# evaluate the model
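Only the comments of this listing survive here. The following sketch fills in one plausible structure around those comments: a dataset helper, a dictionary of models keyed by tree count, and a cross-validation helper. The tree counts (5, 10, 50), the smaller dataset, and the 5-fold, 2-repeat procedure are assumptions chosen to keep the run short.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score


# get the dataset
def get_dataset():
    return make_classification(n_samples=500, n_features=20, n_informative=15,
                               n_redundant=5, random_state=5)


# get a list of models to evaluate
def get_models():
    # One bagging ensemble per tree count (the counts are assumed values).
    return {str(n): BaggingClassifier(n_estimators=n, random_state=1)
            for n in [5, 10, 50]}


# evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
    return cross_val_score(model, X, y, scoring='accuracy', cv=cv)


X, y = get_dataset()
results = {}
for name, model in get_models().items():
    # evaluate the model
    results[name] = evaluate_model(model, X, y)
    print('>%s %.3f (%.3f)' % (name, mean(results[name]), std(results[name])))
```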

1.4.2 Explore Number of Samples

# get the dataset

# get a list of models to evaluate

# evaluate a given model using cross-validation

>0.5 0.852 (0.034)
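The surviving output line suggests results keyed by a sample fraction such as 0.5. A sketch of such an experiment varies the max_samples argument of BaggingClassifier, which as a float sets the fraction of training rows drawn for each ensemble member. The fractions tried, the dataset, and the cross-validation settings are assumptions, so the scores printed will not match the line above.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Assumed synthetic dataset.
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)

results = {}
for fraction in [0.1, 0.5, 1.0]:
    # max_samples as a float is the share of rows drawn for each member.
    model = BaggingClassifier(max_samples=fraction, random_state=1)
    key = '%.1f' % fraction
    results[key] = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
    print('>%s %.3f (%.3f)' % (key, mean(results[key]), std(results[key])))
```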

1.4.3 Explore Alternate Algorithm

# get the dataset

# get a list of models to evaluate

# evaluate a given model using cross-validation

>3 0.886 (0.035)
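The surviving output line appears to be keyed by a small integer, which would fit a k-nearest neighbors base model with varying k, although the excerpt does not confirm this. Under that assumption, a sketch of bagging an alternate algorithm follows. The base model is passed positionally because the keyword for it has been renamed across scikit-learn versions. The k values, dataset, and cross-validation settings are assumed.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumed synthetic dataset.
X, y = make_classification(n_samples=500, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)

results = {}
for k in [1, 3, 5]:
    # Bag k-nearest neighbors models instead of decision trees.
    model = BaggingClassifier(KNeighborsClassifier(n_neighbors=k), random_state=1)
    results[str(k)] = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
    print('>%s %.3f (%.3f)' % (k, mean(results[str(k)]), std(results[str(k)])))
```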

1.5 Bagging Extensions

1.5.1 Pasting Ensemble

Listing 1.21: Example of evaluating a pasting ensemble.
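The pasting listing is not reproduced in this excerpt. Pasting differs from bagging in that each member is trained on rows drawn without replacement. One way to sketch it with scikit-learn is to set bootstrap=False and max_samples below 1.0 on BaggingClassifier. The 0.5 fraction, the dataset, and the evaluation settings are assumptions.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Assumed synthetic dataset.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
# Pasting: sample half of the rows for each member, without replacement.
model = BaggingClassifier(bootstrap=False, max_samples=0.5, random_state=1)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
print('Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

# Fit once on all data and inspect the rows used by the first member.
model.fit(X, y)
rows = model.estimators_samples_[0]
print(len(rows), len(set(rows.tolist())))  # equal counts mean no repeated rows
```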

Listing 1.22: Example output from evaluating a pasting ensemble.

1.5.2 Random Subspace Ensemble
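No listing for this section survives in the excerpt. In the random subspace method, each member sees all rows but only a random subset of the input features. A sketch with BaggingClassifier turns off row bootstrapping and limits max_features. The choice of 10 out of 20 features, the dataset, and the evaluation settings are assumptions.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Assumed synthetic dataset.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
# Random subspace: all rows, but only 10 randomly chosen features per member.
model = BaggingClassifier(bootstrap=False, max_features=10, random_state=1)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
print('Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

# Fit once on all data and inspect which feature columns each member received.
model.fit(X, y)
feature_sets = model.estimators_features_
print(sorted(feature_sets[0].tolist()))
```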

1.5.3 Random Patches Ensemble

— Ensembles on Random Patches, 2012.

Listing 1.23: Example of evaluating a random patches ensemble.
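The random patches listing is not included in this excerpt. Random patches combines the two previous ideas: each member is trained on a random subset of both rows and features. The sketch below draws both without replacement, which is one of several valid configurations. The 0.5 row fraction, the 10 features, the dataset, and the evaluation settings are assumptions.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Assumed synthetic dataset.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
# Random patches: each member gets half the rows and 10 of the 20 features.
model = BaggingClassifier(bootstrap=False, max_samples=0.5, max_features=10,
                          random_state=1)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv)
print('Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

# Fit once on all data to inspect the patch given to the first member.
model.fit(X, y)
patch_rows = model.estimators_samples_[0]
patch_cols = model.estimators_features_[0]
print(len(patch_rows), len(patch_cols))  # prints 500 10
```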

Listing 1.24: Example output from evaluating a random patches ensemble.

1.6 Common Questions

Q. What algorithm should be used in the ensemble?

— Page 52, Ensemble Methods, 2012.

Q. How many ensemble members should be used?

— Page 52, Ensemble Methods, 2012.

Q. Won’t the ensemble overfit with too many trees?

Q. How large should the bootstrap sample be?

Q. What problems are well suited to bagging?

— Page 12, Ensemble Machine Learning, 2012.

The Random Subspace Method For Constructing Decision Forests, 1998.

Ensembles on Random Patches, 2012.

Ensemble Methods, 2012.

Ensemble Machine Learning, 2012.