Ensemble Learning
In ensemble learning, each model can be considered as an independent random variable making
predictions.
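This view can be made concrete with a simple simulation: if many weak voters are each correct independently with probability slightly above 0.5, a majority vote over all of them is correct far more often than any individual voter. A minimal sketch (the 51% accuracy, the number of voters, and the number of trials are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(42)
# 1,000 independent weak "classifiers", each correct with probability 0.51
votes = rng.random((10_000, 1_000)) < 0.51
# Majority vote across the ensemble, repeated over 10,000 trials
majority_correct = votes.sum(axis=1) > 500
print(majority_correct.mean())  # well above 0.51 when the voters' errors are independent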
❖ A diverse set of predictors can also be built by using the same training algorithm for every predictor but
training them on different random subsets of the training set.
❖ The key difference between bagging and pasting lies in how these subsets are sampled:
Bagging (short for bootstrap aggregating): Sampling with replacement
Pasting: Sampling without replacement.
❖ Both bagging and pasting allow training instances to be sampled several times across multiple predictors,
but only bagging allows training instances to be sampled several times for the same predictor.
Fig. 4: A single decision tree (left) versus a bagging ensemble of 500 trees (right)
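In scikit-learn, bagging and pasting are both provided by BaggingClassifier; the only difference is the bootstrap flag. A minimal sketch, assuming the moons dataset and illustrative hyperparameters (neither is specified in the slides):

from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Bagging: each tree is trained on a bootstrap sample (sampling WITH replacement)
bag_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=500,
                            bootstrap=True, n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)

# Pasting: same setup, but each tree sees a subset drawn WITHOUT replacement
paste_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=500,
                              bootstrap=False, max_samples=0.8,
                              n_jobs=-1, random_state=42)
paste_clf.fit(X_train, y_train)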
Out-of-Bag Evaluation
❖ The BaggingClassifier samples training instances with replacement, resulting in roughly 63% of
instances being sampled on average for each predictor.
❖ The remaining ~37% are termed out-of-bag (OOB) instances and can be used for evaluation
without a separate validation set. This built-in validation allows the ensemble's predictive
performance to be assessed accurately.
❖ Out-of-bag evaluation therefore provides a convenient and efficient way to estimate the
performance of the ensemble without having to set aside a separate validation set.
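With scikit-learn, out-of-bag evaluation is requested via oob_score=True at construction time; a minimal sketch, reusing the training split from the bagging example above:

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=500,
                            oob_score=True, n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)

# Accuracy estimated on the instances each predictor never saw during training
print(bag_clf.oob_score_)
# Per-instance OOB class probabilities are also available
print(bag_clf.oob_decision_function_[:3])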
❖ Random Forests: A variant of the bagging classifier with decision trees as the base learner.
❖ However, in contrast to a plain bagging classifier, random forests by default also sample a random subset
of the features at each decision node of the trees. This process is known as feature bagging or
feature subsampling.
❖ By default, it samples √𝑛 features at each node (where 𝑛 is the total number of features).
❖ The algorithm results in greater tree diversity, which (again) trades a higher bias for a lower variance.
❖ Extra-Trees: It is possible to make trees even more random by also using random thresholds for each
feature at the decision nodes. For this, simply set splitter="random" when creating a
DecisionTreeClassifier.
❖ A forest of such extremely random trees is called an extremely randomized trees (or extra-trees for
short) ensemble.
❖ Along with a further reduction in variance (at the cost of a bit more bias), it also makes extra-trees
classifiers much faster to train than regular random forests, since the costly search for the best split
threshold at every node is skipped.
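scikit-learn implements this ensemble directly as ExtraTreesClassifier, whose API mirrors RandomForestClassifier; a minimal sketch with illustrative hyperparameters:

from sklearn.ensemble import ExtraTreesClassifier

# Same interface as RandomForestClassifier, but split thresholds are chosen at random
ext_clf = ExtraTreesClassifier(n_estimators=500, max_leaf_nodes=16,
                               n_jobs=-1, random_state=42)
ext_clf.fit(X_train, y_train)
y_pred_ext = ext_clf.predict(X_test)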
Random Forests: scikit-learn
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rnd_clf = RandomForestClassifier(n_estimators=500,
                                 max_leaf_nodes=16,
                                 n_jobs=-1, random_state=42)
rnd_clf.fit(X_train, y_train)
y_pred_rf = rnd_clf.predict(X_test)

# Roughly equivalent BaggingClassifier built from feature-subsampling decision trees
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(max_features="sqrt",
                           max_leaf_nodes=16),
    n_estimators=500, n_jobs=-1, random_state=42)
Random Forests: Feature Importance
❖ A very useful aspect of random forests is that they make it easy to measure the relative
importance of each feature.
❖ Scikit-Learn measures a feature’s importance by looking at how much the tree nodes that use that
feature reduce impurity on average, across all trees in the forest.
❖ More precisely, it is a weighted average, where each node’s weight is proportional to the number
of training samples that are associated with it.
❖ Scikit-Learn computes this score automatically for
each feature after training, then it scales the results
so that the sum of all importances is equal to 1.
>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> iris = load_iris(as_frame=True)
>>> rnd_clf = RandomForestClassifier(n_estimators=500,
random_state=42)
>>> rnd_clf.fit(iris.data, iris.target)
>>> for score, name in zip(rnd_clf.feature_importances_,
iris.data.columns):
... print(round(score, 2), name)
...
0.11 sepal length (cm)
0.02 sepal width (cm)
0.44 petal length (cm)
0.42 petal width (cm)
Fig. 4: MNIST pixel importance (according to a random forest classifier)
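The MNIST pixel-importance plot referenced above can be reproduced along these lines; the dataset fetch and the plotting choices below are assumptions, not taken from the slides:

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier

X_mnist, y_mnist = fetch_openml("mnist_784", as_frame=False, return_X_y=True)
rnd_clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)
rnd_clf.fit(X_mnist, y_mnist)

# Each of the 784 pixels is a feature; reshape importances back into a 28x28 image
plt.imshow(rnd_clf.feature_importances_.reshape(28, 28), cmap="hot")
plt.axis("off")
plt.colorbar()
plt.show()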
Boosting: AdaBoost
❖ Boosting refers to any ensemble method that can combine several weak learners into a strong
learner.
❖ The general idea of most boosting methods is to train predictors sequentially, each trying to correct
its predecessor.
❖ There are many boosting methods available, but by far the most popular are AdaBoost (short for
adaptive boosting) and gradient boosting.
❖ AdaBoost:
One way for a new predictor to correct its predecessor is to pay a bit more attention to the training
instances that the predecessor underfit. This results in new predictors focusing more and more on
the hard cases.
❖ Predictor weight: αⱼ = η log((1 − rⱼ) / rⱼ), where rⱼ is the weighted error rate of the jᵗʰ predictor and
η is the learning rate. So the predictor weight αⱼ is larger for smaller values of the error rate rⱼ, and vice versa.
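A minimal AdaBoost sketch with scikit-learn's AdaBoostClassifier (the stump depth, number of estimators, and learning rate are illustrative assumptions):

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Decision stumps trained sequentially; each round boosts the weights of the
# training instances the previous predictors got wrong
ada_clf = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                             n_estimators=30, learning_rate=0.5,
                             random_state=42)
ada_clf.fit(X_train, y_train)
y_pred_ada = ada_clf.predict(X_test)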