Ensemble - Part 1

Bagging

Boosting
Bootstrap Aggregation (bagging)
• Bootstrap Aggregation (bagging) is an ensemble method that helps reduce overfitting in classification and regression problems.

• Bagging aims to improve the accuracy and performance of machine learning algorithms. It does this by taking random subsets of the original dataset, with replacement, and fitting either a classifier (for classification) or a regressor (for regression) to each subset, as in the sketch below.

• The predictions from the individual models are then aggregated, through a majority vote for classification or averaging for regression, increasing prediction accuracy.
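The slides include no code, so the following is a minimal bagging sketch using scikit-learn; the synthetic dataset, the choice of decision trees as base estimators, and the hyperparameter values are illustrative assumptions. (In scikit-learn releases before 1.2 the estimator keyword is named base_estimator.)

# Minimal bagging sketch (assumed example, not from the slides):
# each tree is fit on a bootstrap sample drawn with replacement, and
# predictions are combined by majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # base model fit on each bootstrap sample
    n_estimators=100,                    # number of bootstrap samples / models
    bootstrap=True,                      # sample with replacement
    random_state=42,
)
bag.fit(X_train, y_train)
print("test accuracy:", bag.score(X_test, y_test))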
Random Forest Algorithm
• Random Forest is a Supervised Learning algorithm and one of the most flexible and easy-to-use ensemble methods.

• Random Forest leverages a multitude of decision trees to create a ‘forest’ that is more accurate and robust than its individual components.

• Each decision tree in the Random Forest makes decisions based on the given data. When these trees work together, they form a ‘forest’ that can handle complex problems much better than any single tree could.

• The ‘forest’ created by the Random Forest algorithm is trained through bagging. Bagging aims to reduce the variance of complex models that overfit the training data.

• Bagging consists of bootstrapping and aggregation; combining the two words gives us ‘bagging.’ Bootstrapping is a sampling method in which samples are drawn from a set with replacement, making the selection procedure completely random. A minimal usage sketch follows.
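As with bagging, a brief usage sketch (an assumed example, not taken from the slides) illustrates the idea with scikit-learn's RandomForestClassifier; the dataset and hyperparameter values are placeholders.

# Minimal Random Forest sketch (assumed example): a bagged ensemble of
# decision trees, each also considering a random subset of features at every split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

forest = RandomForestClassifier(
    n_estimators=200,     # number of trees in the 'forest'
    max_features="sqrt",  # random feature subset considered at each split
    random_state=42,
)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))

For regression, RandomForestRegressor is used the same way, with the trees' predictions averaged instead of combined by vote.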
Random Forest for Classification
Random Forest for Regression
Boosting
• Weak Learners: Simple models that perform slightly better than random guessing.
• Sequential Training: Models are trained one after the other, with each model learning from the mistakes of the previous ones.
• Weight Adjustment: Instances that are misclassified by a model are assigned higher weights, so that subsequent models focus more on these difficult cases.
AdaBoost (Adaptive Boosting)
Algorithm
1. Initialize Weights: Assign equal weights to all training samples.
2. Train Weak Learner: Train a weak learner on the weighted training set.
3. Calculate Error: Calculate the weighted error rate of the weak learner.
4. Update Weights: Increase the weights of misclassified samples and decrease the weights of correctly classified
samples.
5. Repeat: Repeat steps 2-4 for a specified number of iterations or until a stopping criterion is met.
6. Combine Weak Learners: Combine the weak learners into a strong learner by taking a weighted majority vote (see the sketch below).
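The slides give no implementation; the following is a minimal from-scratch sketch of steps 1-6, assuming binary labels encoded as -1/+1 and scikit-learn decision stumps as the weak learners. The synthetic data at the end is also an assumption.

# Minimal from-scratch AdaBoost sketch (assumed example, not from the slides).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """Return a list of (alpha, stump) pairs trained on labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                        # step 1: equal initial weights
    ensemble = []
    for _ in range(n_rounds):                      # step 5: repeat
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)           # step 2: train weak learner
        pred = stump.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # step 3: weighted error
        alpha = 0.5 * np.log((1 - err) / err)      # learner's vote weight
        w *= np.exp(-alpha * y * pred)             # step 4: raise weights of mistakes,
        w /= w.sum()                               #         lower weights of correct ones
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(ensemble, X):
    # step 6: weighted majority vote over all weak learners
    scores = sum(alpha * stump.predict(X) for alpha, stump in ensemble)
    return np.sign(scores)

# Example usage on assumed synthetic data:
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
y = 2 * y - 1                                      # map {0, 1} labels to {-1, +1}
model = adaboost_fit(X, y)
print("train accuracy:", np.mean(adaboost_predict(model, X) == y))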
Working
• AdaBoost starts with a simple model and iteratively improves it by focusing on the samples that the previous
model misclassified.
• It assigns higher weights to these misclassified samples, ensuring that subsequent models pay more attention to
them.
• The final prediction is a weighted combination of the predictions from all weak learners.
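In practice the same procedure is available through scikit-learn's AdaBoostClassifier; the dataset and hyperparameters below are illustrative assumptions.

# AdaBoost via scikit-learn (assumed usage example); decision stumps are the
# default weak learners, and n_estimators sets the number of boosting rounds.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))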
Gradient Boosting
Algorithm
1. Initialize Model: Initialize the model with a constant value.
2. Calculate Residuals: Calculate the residuals (errors) between the current model's predictions and the true labels.
3. Train Weak Learner: Train a weak learner to predict the residuals.
4. Update Model: Update the model by adding a scaled version of the weak learner's predictions.
5. Repeat: Repeat steps 2-4 for a specified number of iterations or until a stopping criterion is met (see the sketch below).
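Below is a minimal from-scratch sketch of steps 1-5 for squared-error regression; the synthetic data, tree depth, learning rate, and number of rounds are illustrative assumptions, not values from the slides.

# Minimal from-scratch gradient boosting sketch for regression (assumed example).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_rounds = 100

prediction = np.full_like(y, y.mean())             # step 1: initialize with a constant
trees = []
for _ in range(n_rounds):                          # step 5: repeat
    residuals = y - prediction                     # step 2: residuals of current model
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)                         # step 3: weak learner fits residuals
    prediction += learning_rate * tree.predict(X)  # step 4: add scaled predictions
    trees.append(tree)

print("training MSE:", np.mean((y - prediction) ** 2))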
Working
• Gradient Boosting treats model training as an optimization problem, aiming to minimize a
loss function.
• Each weak learner is trained to predict the residuals of the previous model.
• The final prediction is the sum of the predictions from all weak learners.
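For practical use, scikit-learn's GradientBoostingRegressor (or GradientBoostingClassifier for classification) implements the same idea; the dataset and hyperparameters below are placeholders, not values from the slides.

# Gradient boosting via scikit-learn (assumed usage example).
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the n_estimators rounds adds a shallow tree scaled by learning_rate.
reg = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, max_depth=3,
                                random_state=0)
reg.fit(X_train, y_train)
print("test R^2:", reg.score(X_test, y_test))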
