Bagging
1. Bootstrap Sampling:
o Bootstrap Sampling involves creating multiple new datasets (called "bootstraps") from the original dataset by sampling with replacement. Because the sampling is with replacement, some data points may appear multiple times in the same bootstrap dataset, while others may not appear at all.
2. Training Models:
o For each bootstrap dataset, a separate model (e.g., a decision tree) is trained. If you create B bootstraps, you'll train B different models, each on its own bootstrap dataset (a minimal sketch of the full procedure follows this list).
3. Making Predictions:
o For Classification: When you need to classify a new observation, you run this observation through all B trained models. Each model makes a prediction, and then a majority voting scheme is used to determine the final class label. If there is a tie in the votes, the tie can be resolved arbitrarily.
o For Regression: For regression tasks, each model makes a prediction, and the
final prediction is the average of all these predictions.
4. Evaluating and Tuning:
o Bagging not only improves accuracy but also allows for the calculation of
standard errors and confidence intervals for the predictions, providing a
measure of uncertainty.
o The number of bootstraps B can be predetermined (e.g., set to 30) or optimized using a separate validation dataset.
5. Why Bagging Works:
o Bagging is particularly effective with models that are unstable. Unstable
models are those where small changes in the data can lead to large changes in
the model (e.g., decision trees). By training multiple models on different
subsets of the data and aggregating their results, bagging reduces variance and
improves overall model performance.
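The following is a minimal sketch of steps 1-3 above, assuming scikit-learn decision trees as the base models and an arbitrary choice of B = 30; the dataset and all variable names are only for illustration.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

B = 30                                   # number of bootstraps / models
n = X_train.shape[0]
models = []

# Steps 1-2: draw B bootstrap samples and train one model on each
for _ in range(B):
    idx = rng.integers(0, n, size=n)     # sample n rows with replacement
    models.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Step 3 (classification): majority vote across the B models; with 0/1 labels,
# rounding the mean vote implements the vote (a 15-15 tie resolves to class 0 here)
votes = np.stack([m.predict(X_test) for m in models])
bagged_pred = np.round(votes.mean(axis=0)).astype(int)
print("bagged accuracy:", (bagged_pred == y_test).mean())

# A regression version of this loop would average the B predictions instead, and
# the spread of the per-model predictions would give a rough uncertainty measure
# in the spirit of step 4.

In practice, scikit-learn's BaggingClassifier packages this same loop behind a single estimator.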
Boosting
1. Weighted Data:
o In boosting, each observation in the training data is assigned a weight. Initially, these weights are usually uniform, meaning each observation has an equal weight.
2. Iterative Process:
o Boosting works through multiple iterations or boosting rounds. In each
iteration, a new model is trained using the weighted data, and the weights are
updated based on the performance of the model.
3. Focus on Misclassified Cases:
o Observations that are misclassified by the current model are given higher
weights in the next iteration. This means the model will focus more on the
examples it previously got wrong.
4. Final Model:
o The final model is a weighted combination of all the individual models trained during the boosting process (see the sketch after this list).
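As one concrete instance of these four steps, here is a minimal AdaBoost-style sketch (AdaBoost is a common boosting algorithm; treating it as the algorithm meant above is an assumption). It uses scikit-learn decision stumps as the weak learners; T = 50, the dataset, and all variable names are illustrative choices.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
y = np.where(y == 1, 1, -1)                        # recode labels to {-1, +1}
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

T = 50                                             # number of boosting rounds
n = len(y_tr)
w = np.full(n, 1.0 / n)                            # step 1: uniform initial weights
stumps, alphas = [], []

for _ in range(T):                                 # step 2: iterate
    stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr, sample_weight=w)
    pred = stump.predict(X_tr)
    err = np.clip(np.sum(w * (pred != y_tr)), 1e-10, 1 - 1e-10)  # weighted error
    alpha = 0.5 * np.log((1 - err) / err)          # this model's weight in the vote
    w *= np.exp(-alpha * y_tr * pred)              # step 3: upweight misclassified cases
    w /= w.sum()                                   # renormalize the weights
    stumps.append(stump)
    alphas.append(alpha)

# Step 4: the final model is the sign of the weighted sum of the T predictions
scores = sum(a * s.predict(X_te) for a, s in zip(alphas, stumps))
print("boosted accuracy:", (np.sign(scores) == y_te).mean())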
Random Forests
1. Initialization:
o Given a dataset with n observations and N input features, the goal is to build a collection of decision trees to make predictions.
2. Bootstrap Sampling:
o Bootstrap Sample: For each tree, draw a bootstrap sample from the original dataset.
o This means sampling n observations from the original dataset with replacement. Some observations may be repeated, and others may be left out.
3. Building Each Tree:
o Random Feature Selection: For each node in the decision tree, randomly select m features out of the total N features. The number m is typically chosen beforehand and can be a constant like 1, 2, or the recommended floor(log2(N) + 1).
o Best Split: Among the randomly chosen features, select the best one for
splitting the data at that node based on some criterion (e.g., Gini impurity or
information gain).
o Fully Grow Trees: Continue to grow each decision tree to its full extent
without pruning. This means each tree is allowed to develop its structure fully
without trimming any branches.
4. Making Predictions:
o Classification: For a new observation, each tree in the forest votes on the class
label. The final prediction is determined by majority voting among the trees.
o Regression: For regression tasks, the prediction is the average of the predictions from all the trees in the forest (see the sketch after this list).
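A minimal sketch of the four steps above, assuming scikit-learn decision trees: DecisionTreeClassifier's max_features argument performs the per-node random feature selection, and leaving max_depth unset lets each tree grow fully without pruning. B = 100, the dataset, and the variable names are illustrative.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

n, N = X_tr.shape
B = 100                                   # number of trees in the forest
m = int(np.floor(np.log2(N) + 1))         # features considered at each split

trees = []
for _ in range(B):
    idx = rng.integers(0, n, size=n)      # step 2: bootstrap sample of the rows
    tree = DecisionTreeClassifier(max_features=m, criterion="gini")  # step 3
    trees.append(tree.fit(X_tr[idx], y_tr[idx]))

# Step 4 (classification): majority vote; with 0/1 labels, rounding the mean vote
# implements the vote (for regression, average the predictions instead)
votes = np.stack([t.predict(X_te) for t in trees])
forest_pred = np.round(votes.mean(axis=0)).astype(int)
print("forest accuracy:", (forest_pred == y_te).mean())

In practice, sklearn.ensemble.RandomForestClassifier bundles these same ingredients (bootstrap samples, per-node feature subsampling, and voting or averaging) into a single estimator.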
Diversity Among Trees: The diversity of the trees is crucial for the effectiveness of
Random Forests. This diversity is achieved through:
o Bootstrap Sampling: Different trees are trained on different subsets of the
data.
o Random Feature Selection: Each tree makes splitting decisions based on a
random subset of features, reducing the correlation between trees.
Robustness and Accuracy: By averaging or voting across many trees, Random Forests typically provide better predictive performance and are less prone to overfitting than individual decision trees (a brief comparison sketch follows).
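One quick way to probe this robustness claim is to compare a single fully grown tree against a forest under cross-validation; the dataset and parameter choices below are illustrative, and the exact scores will vary from dataset to dataset.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print("single tree CV accuracy:", cross_val_score(single_tree, X, y, cv=5).mean())
print("random forest CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())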