
Ensemble Learning

Ensemble Methods
• Ensembles are machine learning methods for combining predictions
from multiple separate models.

• The central motivation is rooted in the belief that a committee of experts working together can perform better than any single expert.

[Diagram: the training data is used to fit Model-1 through Model-n in parallel; each model then scores the test data to produce Prediction-1 through Prediction-n, which are combined into a single prediction.]

• Both regression and classification can be done using ensemble learning.

• The individual predictions can be combined by either voting (classification) or averaging (regression); a minimal sketch of this is shown below.

• The individual ensemble learners need to be:

• Different from each other, so that their errors are largely independent.

• They can be weak (only slightly better than random). Because of the number of models in an ensemble, the computational requirements are much higher than for evaluating a single model, so ensembles effectively compensate for weak individual models by performing a lot of extra computation.
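As a concrete illustration, the sketch below combines predictions from three separate scikit-learn models by majority voting. The choice of base models, dataset, and parameter values is an illustrative assumption, not part of the original slides.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Toy data for illustration only
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Three different base models, so their errors are hopefully somewhat independent
    ensemble = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("knn", KNeighborsClassifier()),
            ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ],
        voting="hard",  # majority vote; voting="soft" would average predicted probabilities
    )
    ensemble.fit(X_train, y_train)
    print("Combined (voting) accuracy:", ensemble.score(X_test, y_test))

For regression, the analogous combination is averaging the individual predictions (scikit-learn provides VotingRegressor for this).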

Common Ensemble Techniques
• Bagging (Bootstrap Aggregation)

• Reduces the chance of overfitting by training each model on only a randomly sampled (bootstrap) subset of the training data. Training can be done in parallel.

• Essentially trains a large number of "strong" learners in parallel (each model may overfit its own subset of the data).

• Combines (by averaging or voting) these learners to "smooth out" the predictions; a minimal sketch follows.
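A minimal bagging sketch with scikit-learn is given below; the use of deep decision trees as base learners and the parameter values are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    # Deep ("strong") trees, each fit on its own bootstrap sample
    bagging = BaggingClassifier(
        DecisionTreeClassifier(max_depth=None),  # unrestricted trees may overfit their subsets
        n_estimators=100,   # number of bootstrap models
        bootstrap=True,     # sample the training data with replacement
        n_jobs=-1,          # models are independent, so they can be trained in parallel
        random_state=0,
    )
    bagging.fit(X, y)
    print("Bagging training accuracy:", bagging.score(X, y))

Averaging (or voting over) the individually overfit trees is what "smooths out" the final prediction.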


• Boosting

• Trains a large number of "weak" learners in sequence. A weak learner is a simple model that is only slightly better than random (e.g. a depth-one decision tree, or "stump").

• The weights of misclassified examples are increased before training the next model, so training has to be done sequentially; a sketch of this reweighting idea follows below.

• Boosting then combines all the weak learners into a single strong learner.

Bagging uses complex models and tries to "smooth out" their predictions, while boosting uses simple models and tries to "boost" their aggregate complexity.
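The sketch below illustrates the core reweighting idea using plain scikit-learn decision stumps. The doubling weight update and the simple majority vote at the end are simplified assumptions for illustration, not the exact AdaBoost formulas.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    sample_weight = np.full(len(y), 1.0 / len(y))  # start with uniform weights
    stumps = []
    for _ in range(5):
        stump = DecisionTreeClassifier(max_depth=1)       # a weak learner
        stump.fit(X, y, sample_weight=sample_weight)      # focus on currently heavy points
        pred = stump.predict(X)
        sample_weight[pred != y] *= 2.0                   # emphasize misclassified examples
        sample_weight /= sample_weight.sum()              # renormalize
        stumps.append(stump)

    # Combine the weak learners by simple majority vote (AdaBoost would also weight each stump)
    votes = np.mean([s.predict(X) for s in stumps], axis=0)
    ensemble_pred = (votes >= 0.5).astype(int)
    print("Sequential ensemble training accuracy:", (ensemble_pred == y).mean())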
Boosting Methods

• AdaBoost (Adaptive Boosting)

• In AdaBoost, each successive learner is created with a focus on the examples that the previous learner fitted poorly.

• Each successive learner therefore concentrates more and more on the harder-to-fit examples, i.e. the points the previous tree got wrong; a usage sketch is given below.
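In scikit-learn this corresponds to AdaBoostClassifier; the stump depth and number of estimators below are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    # Sequentially fit decision stumps, reweighting the hard examples after each round
    ada = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=1),  # weak base learner (a stump)
        n_estimators=50,
        random_state=0,
    )
    ada.fit(X, y)
    print("AdaBoost training accuracy:", ada.score(X, y))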

• Gradient Boosting

• Each learner is fit on a modified version of the original data: the targets are replaced by the residuals of the previous learner (the x values stay the same).

• By fitting new models to the residuals, the overall learner gradually improves in the areas where the residuals are initially high; a minimal sketch follows.
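A minimal residual-fitting sketch for regression is shown below. The shallow trees, learning rate, and number of rounds are illustrative assumptions; scikit-learn's GradientBoostingRegressor implements the full algorithm.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

    learning_rate = 0.1
    prediction = np.zeros_like(y, dtype=float)  # start from a zero prediction
    trees = []
    for _ in range(100):
        residuals = y - prediction                     # what the ensemble still gets wrong
        tree = DecisionTreeRegressor(max_depth=2)      # a simple learner fit to the residuals
        tree.fit(X, residuals)
        prediction += learning_rate * tree.predict(X)  # nudge the ensemble toward the targets
        trees.append(tree)

    mse = np.mean((y - prediction) ** 2)
    print("Training MSE after residual fitting:", mse)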

[Illustration: successive weak learners each draw a simple boundary between the '+' and '_' examples, emphasizing the points the previous learner misclassified; their combination yields the final decision boundary.]
