
Bagging and Boosting in Data Mining

Carolina Ruiz
[email protected] http://www.cs.wpi.edu/~ruiz

Motivation and Background

Problem Definition:

Given: a dataset of instances and a target concept.
Find: a model (e.g., a set of association rules, a decision tree, a neural network) that helps predict the classification of unseen instances.

Difficulties:

The model should be stable (i.e., it shouldn't depend too much on the input data used to construct it).
The model should be a good predictor (difficult to achieve when the input dataset is small).

Two Approaches

Bagging (Bootstrap Aggregating)

Leo Breiman, UC Berkeley

Boosting

Rob Schapire, AT&T Research
Jerry Friedman, Stanford U.

Bagging

Model Creation:

Create bootstrap replicates of the dataset and fit a model to each one.

Prediction:

Average/vote the predictions of the individual models.

Advantages:

Stabilizes unstable methods.
Easy to implement and parallelizable.

Bagging Algorithm

1. Create k bootstrap replicates of the dataset.
2. Fit a model to each of the replicates.
3. Average/vote the predictions of the k models.
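
The three steps above translate directly into a short program. Below is a minimal sketch in Python; it assumes numpy arrays X and y, uses scikit-learn's DecisionTreeClassifier as the base learner (the slides don't prescribe one; decision trees are the classic "unstable" choice), and merges by majority vote for classification:

import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, k=25, seed=None):
    # Steps 1 and 2: fit one model per bootstrap replicate.
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)  # sample n instances with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    # Step 3: majority vote over the k models' predictions.
    votes = np.array([m.predict(X) for m in models])  # shape (k, n_samples)
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])

Each bootstrap replicate leaves out roughly 37% of the original instances and duplicates others, which is what decorrelates the k models and makes their vote more stable than any single fitted model.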

Boosting

Creating the model:

Construct a sequence of datasets and models such that each dataset weights an instance heavily when the previous model misclassified it.

Prediction:

Merge the models in the sequence.

Advantages:

Improves classification accuracy.

Generic Boosting Algorithm


1. Equally weight all instances in the dataset.
2. For i = 1 to T:
2.1. Fit a model to the current (weighted) dataset.
2.2. Upweight poorly predicted instances.
2.3. Downweight well-predicted instances.
3. Merge the models in the sequence to obtain the final model.
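
The generic algorithm leaves the exact reweighting and merging rules open. The sketch below fills them in with AdaBoost-style choices (exponential weight updates and a vote weighted by each model's accuracy); these are one common instantiation, not something the slide prescribes. It assumes numpy arrays X and y with labels in {-1, +1} and uses decision stumps as the base model:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boosting_fit(X, y, T=50):
    n = len(X)
    w = np.full(n, 1.0 / n)                    # step 1: equal weights
    models, alphas = [], []
    for _ in range(T):                         # step 2
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)       # 2.1: fit to weighted dataset
        pred = stump.predict(X)
        err = w[pred != y].sum()               # weighted training error
        if err == 0 or err >= 0.5:             # perfect or useless model: stop
            break
        alpha = 0.5 * np.log((1 - err) / err)  # model's say in the final vote
        w *= np.exp(-alpha * y * pred)         # 2.2/2.3: up- and downweight
        w /= w.sum()                           # renormalize to a distribution
        models.append(stump)
        alphas.append(alpha)
    return models, alphas

def boosting_predict(models, alphas, X):
    # Step 3: merge the sequence by a weighted vote.
    scores = sum(a * m.predict(X) for a, m in zip(alphas, models))
    return np.sign(scores)

Note that w[pred != y].sum() is already the normalized error because the weights sum to 1, and that misclassified instances (where y * pred = -1) are multiplied by a factor greater than 1: exactly the "upweight poorly predicted instances" step.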

Conclusions and References

Boosted naïve Bayes tied for first place in the KDD Cup 1997.

Reference:

John F. Elder and Greg Ridgeway. Combining Estimators to Improve Performance. KDD-99 tutorial notes.
