
Name: Ali Haidar Roll No: FA19M2BC064

Section: IT-7th (M-1)

Assignment 1: Ensemble Methods (Bagging and Boosting)

Ensemble Method:
Ensemble methods are techniques that aim to improve the accuracy of a
model's results by combining multiple models instead of using a single
model. An ensemble method in machine learning is a multi-model system in
which different classifiers and techniques are strategically combined into
one predictive model (grouped, for example, into sequential, parallel,
homogeneous and heterogeneous methods). Ensemble methods also help to
reduce the variance of the predictions, minimize the bias of the predictive
model, and classify or predict outcomes for complex problems with better
accuracy.
Types:
• Bagging
• Boosting
• Stacking
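
Before looking at each type in detail, here is a minimal sketch of the
general idea, offered as an illustration rather than as part of the
assignment: three different classifiers are strategically combined into one
predictive model with scikit-learn's VotingClassifier. The dataset, the
choice of base models and the hyperparameters are all assumptions made for
the example.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Heterogeneous ensemble: different classifiers combined into one model.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=3)),
        ("nb", GaussianNB()),
    ],
    voting="hard",  # majority vote over the individual predictions
)
ensemble.fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))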

Bagging
Bagging, also known as bootstrap aggregating, is an ensemble learning
technique that helps improve the performance and accuracy of machine
learning algorithms. It is used to manage the bias-variance trade-off and
reduces the variance of a prediction model. Reducing variance improves
accuracy and helps prevent overfitting. Bagging is used for both regression
and classification models, and is applied most often to decision tree
algorithms.
Steps to Perform Bagging
• Consider a training set with n observations and m features. Draw a random
sample from the training dataset with replacement (a bootstrap sample).
• A subset of the m features is chosen randomly to create a model using the
sampled observations.
• The feature offering the best split out of the lot is used to split the
nodes.
• The tree is grown so that it has the best root node.
• The above steps are repeated many times, and the outputs of the individual
decision trees are aggregated to give the final prediction (see the sketch
after this list).
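
A minimal sketch of these steps, assuming scikit-learn is available; the
dataset and hyperparameters are illustrative assumptions, not part of the
assignment. BaggingClassifier draws bootstrap samples, fits one tree per
sample, and aggregates the predictions by voting.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 base learners (decision trees by default), each trained on a random
# sample drawn with replacement and on a random subset of the features;
# their individual predictions are aggregated by majority vote.
bagging = BaggingClassifier(
    n_estimators=100,
    bootstrap=True,      # sample observations with replacement
    max_features=0.5,    # each tree sees a random half of the features
    random_state=0,
)
bagging.fit(X_train, y_train)
print("bagging accuracy:", bagging.score(X_test, y_test))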

Stages of Bagging
• Bagging is carried out in two stages, bootstrapping and aggregation.
• Bootstrapping is a sampling technique in which samples are drawn from the
whole population (set) with replacement. Sampling with replacement makes
the selection procedure random. The base learning algorithm is then run on
each sample to complete the procedure.
• Aggregation in bagging is done to combine all possible outcomes of the
prediction and average out the randomness. Without aggregation, predictions
would be less accurate because not all outcomes are taken into
consideration. The aggregation is therefore based on the probabilities
produced by the bootstrapped models or on a vote or average over all
outcomes of the predictive models (a minimal sketch follows this list).
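
To make the two stages concrete, the hand-rolled sketch below separates
them explicitly: a bootstrapping loop that resamples the training data with
replacement, followed by an aggregation step that majority-votes the
individual predictions. Everything here (dataset, number of models,
settings) is an illustrative assumption.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n = len(X_train)
models = []

# Bootstrapping: draw samples from the training set WITH replacement
# and fit one base learner per bootstrap sample.
for _ in range(25):
    idx = rng.integers(0, n, size=n)   # indices drawn with replacement
    tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
    models.append(tree)

# Aggregation: combine the outcomes of all models, here by majority vote.
all_preds = np.stack([m.predict(X_test) for m in models])     # (25, n_test)
majority_vote = (all_preds.mean(axis=0) >= 0.5).astype(int)   # binary labels
print("bagged accuracy:", (majority_vote == y_test).mean())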

Boosting
• Boosting is an ensemble technique that learns from the mistakes of previous
predictors to make better predictions in the future. The technique combines
several weak base learners to form one strong learner, thus significantly
improving the predictive power of the model. Boosting works by arranging
weak learners in a sequence, such that each learner learns from the mistakes
of the previous learner in the sequence, creating progressively better
predictive models.
• Boosting takes many forms, including gradient boosting, Adaptive Boosting
(AdaBoost), and XGBoost (Extreme Gradient Boosting). AdaBoost uses weak
learners in the form of decision trees that mostly contain a single split,
popularly known as decision stumps. The first decision stump in AdaBoost is
built with all observations carrying equal weights; observations it
misclassifies are then given larger weights for the next stump (see the
AdaBoost sketch after this section).
• Gradient boosting adds predictors sequentially to the ensemble, where each
new predictor corrects the errors of its predecessors, thereby increasing
the model's accuracy. New predictors are fit to the residual errors left by
the previous predictors, and gradient descent is what lets the gradient
booster identify these errors in the learners' predictions and counter them
accordingly (a hand-rolled sketch follows this section).
• XGBoost makes use of gradient-boosted decision trees and is designed for
improved speed and performance. Because the trees in a boosted model must be
trained in sequence, gradient boosting machines can be slow to train, so
XGBoost relies heavily on an implementation optimized for computational
speed as well as on the performance of the resulting model (a minimal sketch
follows this section).
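
The following sketches illustrate the three boosting variants discussed
above. They assume scikit-learn (and, for the last one, the separate
xgboost package); the datasets, model choices and hyperparameters are
assumptions made for illustration, not prescriptions.

First, AdaBoost with decision stumps as the weak learners:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The default weak learner is a decision stump (a depth-1 tree). AdaBoost
# starts with equal sample weights and then increases the weights of the
# observations that earlier stumps misclassified.
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X_train, y_train)
print("AdaBoost accuracy:", ada.score(X_test, y_test))

Second, the core residual-fitting idea behind gradient boosting, written
out by hand on a toy regression problem:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.zeros_like(y)   # start from a trivial prediction of 0
trees = []

# Each new tree is fit to the residuals (errors) left by the previous trees,
# and its shrunken output is added to the running prediction.
for _ in range(100):
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("training MSE after boosting:", np.mean((y - prediction) ** 2))

Third, the same kind of gradient-boosted trees through the xgboost
library's scikit-learn-style interface:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier   # assumes the xgboost package is installed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient-boosted decision trees with an implementation tuned for speed.
model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)
print("XGBoost accuracy:", model.score(X_test, y_test))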
Stacking
Stacking is one of the popular ensemble modeling techniques in machine
learning. Several weak learners are trained in parallel and combined with a
meta-learner so that better predictions can be made for future data.
This ensemble technique works by feeding the predictions of the combined
weak learners into a meta-learner so that a better output prediction model
can be achieved.
In stacking, an algorithm takes the outputs of the sub-models as input and
attempts to learn how to best combine these input predictions to make a
better output prediction.
Stacking is also known as stacked generalization and is an extended form of
the model averaging ensemble technique, in which the sub-models contribute
according to their performance to build a new model with better predictions.
This new model is stacked on top of the others, which is why it is called
stacking (a minimal sketch follows).
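
A compact sketch of this idea with scikit-learn's StackingClassifier, in
which two base learners trained in parallel feed their out-of-fold
predictions to a logistic-regression meta-learner; the dataset and model
choices are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Level-0 (base) learners; their out-of-fold predictions become the
# features of the level-1 meta-learner.
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(max_depth=3)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,   # internal cross-validation used to build the meta-features
)
stack.fit(X_train, y_train)
print("stacking accuracy:", stack.score(X_test, y_test))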
Steps in Stacking
• Split the training dataset into n folds, for example using
RepeatedStratifiedKFold, as this is a common approach to preparing training
data for meta-models.
• A base model is fitted on n-1 of the folds and makes predictions for the
held-out nth fold.
• The predictions made in the above step are added to the x1_train list.
• Repeat steps 2 & 3 for the remaining folds, so that x1_train holds an
out-of-fold prediction for every training sample.
• Now the base model is trained on the whole training set, and it makes
predictions for the test data.
• Add these predictions to the y1_test list.
• In the same way, x2_train, y2_test, x3_train, and y3_test are obtained by
using Model 2 and Model 3 for training, giving the level-1 predictions.
• Now train the meta-model on the level-1 predictions, where these
predictions are used as features for the model.
• Finally, the meta-learner can be used to make predictions on test data in
the stacking model (a minimal code sketch follows these steps).
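
The steps above can be sketched by hand with scikit-learn's
cross_val_predict, which produces the out-of-fold level-1 predictions
described in steps 2-4. The variable names x1_train, y1_test and so on
follow the text; only two base models are used for brevity, a plain
StratifiedKFold stands in for the repeated variant (cross_val_predict
needs each training sample predicted exactly once), and everything else
(dataset, base models, meta-learner) is an illustrative assumption.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model1, model2 = DecisionTreeClassifier(max_depth=3), KNeighborsClassifier()
cv = StratifiedKFold(n_splits=5)

# Steps 1-4: out-of-fold predictions on the training data.
x1_train = cross_val_predict(model1, X_train, y_train, cv=cv)
x2_train = cross_val_predict(model2, X_train, y_train, cv=cv)

# Steps 5-7: refit each base model on the full training set and predict
# the test data.
y1_test = model1.fit(X_train, y_train).predict(X_test)
y2_test = model2.fit(X_train, y_train).predict(X_test)

# Steps 8-9: the level-1 predictions become features for the meta-learner.
meta_X_train = np.column_stack([x1_train, x2_train])
meta_X_test = np.column_stack([y1_test, y2_test])
meta = LogisticRegression().fit(meta_X_train, y_train)
print("stacked accuracy:", meta.score(meta_X_test, y_test))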
