Lec 18
Step 1: Create multiple data sets D1, D2, ..., Dt-1, Dt
Step 2: Build multiple classifiers C1, C2, ..., Ct-1, Ct
Step 3: Combine the classifiers into C*
Figure taken from Tan et al. book “Introduction to Data Mining”
Ensemble Classifiers (EC)
• An ensemble classifier constructs a set of ‘base
classifiers’ from the training data
• Methods for constructing an EC
• Manipulating training set
• Manipulating input features
• Manipulating class labels
• Manipulating learning algorithms
Ensemble Classifiers (EC)
• Manipulating training set
• Multiple training sets are created by resampling the data
according to some sampling distribution
• Sampling distribution determines how likely it is that an
example will be selected for training – may vary from one trial
to another
• A classifier is built from each training set using a particular
learning algorithm
• Examples: Bagging & Boosting
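A minimal sketch of the resampling step, assuming the training data is held in NumPy arrays (the function name and signature are illustrative):

```python
import numpy as np

def resample(X, y, weights=None, rng=None):
    """Create one training set D_i by sampling with replacement.

    `weights` is the sampling distribution over examples: None (uniform)
    gives a bagging-style bootstrap sample, while boosting would pass in
    weights that change from one trial to the next.
    """
    rng = rng or np.random.default_rng()
    n = len(X)
    idx = rng.choice(n, size=n, replace=True, p=weights)
    return X[idx], y[idx]

# e.g. build t = 10 training sets, one per base classifier:
# datasets = [resample(X, y) for _ in range(10)]
```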
Ensemble Classifiers (EC)
• Manipulating input features
• Subset of input features chosen to form each training set
• Subset can be chosen randomly or based on inputs given by
Domain Experts
• Good for data that has redundant features
• Random Forest is an example, which uses decision trees (DTs) as its
base classifiers
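A minimal sketch of feature-subset selection, assuming X is a NumPy feature matrix (names are illustrative):

```python
import numpy as np

def random_feature_subset(X, n_features, rng=None):
    """Pick a random subset of input features (columns) for one base classifier.

    The chosen column indices are returned as well, so the same features
    can be used again when the classifier makes predictions.
    """
    rng = rng or np.random.default_rng()
    cols = rng.choice(X.shape[1], size=n_features, replace=False)
    return X[:, cols], cols
```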
Ensemble Classifiers (EC)
• Manipulating class labels
• Useful when the number of classes is sufficiently large
• Training data is transformed into a binary class problem by
randomly partitioning the class labels into 2 disjoint subsets,
A0 & A1
• Re-labelled examples are used to train a base classifier
• By repeating the relabelling and model-building steps
several times, an ensemble of base classifiers is obtained
• How is a new tuple classified?
• Example – error-correcting output coding (p. 307)
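A minimal sketch of the relabelling step, assuming integer class labels in a NumPy array; the random half/half partition is just one possible choice:

```python
import numpy as np

def relabel_binary(y, rng=None):
    """Randomly partition the original class labels into two disjoint subsets
    A0 and A1, then relabel every tuple as 0 (class in A0) or 1 (class in A1)."""
    rng = rng or np.random.default_rng()
    classes = rng.permutation(np.unique(y))
    split = max(1, len(classes) // 2)
    A0 = set(classes[:split])          # the remaining classes form A1
    return np.where(np.isin(y, list(A0)), 0, 1)
```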
Ensemble Classifiers (EC)
• Manipulating learning algorithm
• Learning algorithms can be manipulated in such a way that
applying the algorithm several times on the same training
data may result in different models
• Example – ANN can produce different models by changing
network topology or the initial weights of links between
neurons
• Example – ensemble of DTs can be constructed by
introducing randomness into the tree growing procedure –
instead of choosing the best split attribute at each node, we
randomly choose one of the top k attributes
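A minimal sketch of the randomized split choice, where `gain` is a hypothetical scoring function (e.g. information gain) supplied by the tree-growing code:

```python
import random

def choose_split_attribute(candidates, gain, k=3):
    """Rank the candidate split attributes by their score and pick one of the
    top k at random, instead of always taking the single best split."""
    ranked = sorted(candidates, key=gain, reverse=True)
    return random.choice(ranked[:k])
```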
Ensemble Classifiers (EC)
• The first three approaches are generic – they can be applied to any
classifier
• The fourth approach depends on the type of classifier used
• Base classifiers can be generated sequentially or in
parallel
Ensemble Classifiers
• Ensemble methods work better with ‘unstable
classifiers’
• Classifiers that are sensitive to minor perturbations in
the training set
• Examples:
– Decision trees
– Rule-based
– Artificial neural networks
Why does it work?
• Suppose there are 25 base classifiers
– Each classifier has error rate, e = 0.35
– Assume classifiers are independent
– Probability that the ensemble classifier makes a wrong
prediction (i.e., at least 13 of the 25 base classifiers are wrong):
$\sum_{i=13}^{25} \binom{25}{i} e^{i} (1-e)^{25-i} = 0.06$
– Check for yourself that this is correct!
Example taken from Tan et al. book “Introduction to Data Mining”
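A quick way to check the number above, using only the Python standard library:

```python
from math import comb

e = 0.35   # error rate of each of the 25 independent base classifiers
p_wrong = sum(comb(25, i) * e**i * (1 - e)**(25 - i) for i in range(13, 26))
print(round(p_wrong, 2))   # 0.06 -- far lower than the 0.35 of a single classifier
```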
Examples of Ensemble Methods
Example taken from Tan et al. book “Introduction to Data Mining”
Bagging
• Accuracy of bagging:
$Acc(M) = \sum_{i=1}^{k} \big(0.632 \cdot Acc(M_i)_{test\_set} + 0.368 \cdot Acc(M_i)_{train\_set}\big)$
• Works well for small data sets
• Example (actual class labels):
X: 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
y:   1   1   1  -1  -1  -1  -1   1   1   1
Example taken from Tan et al. book “Introduction to Data Mining”
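A small numeric sketch of the estimate above; the per-round accuracies are made up for illustration, and the sum is averaged over the k rounds here so that the result stays between 0 and 1 (an assumption on my part):

```python
# Hypothetical accuracies for k = 3 bagging rounds (illustrative numbers only)
acc_test = [0.70, 0.80, 0.70]    # Acc(M_i) measured on held-out samples
acc_train = [0.90, 1.00, 0.90]   # Acc(M_i) measured on the bootstrap training samples

k = len(acc_test)
acc_M = sum(0.632 * te + 0.368 * tr for te, tr in zip(acc_test, acc_train)) / k
print(round(acc_M, 3))   # ~0.807
```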
Bagging
• Decision Stump
• A single-level binary decision tree
• Entropy-based split: x <= 0.35 or x <= 0.75
• Accuracy at most 70% on the actual class labels
Example taken from Tan et al. book “Introduction to Data Mining”
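A minimal sketch of a decision stump on the example data set, confirming that no single split does better than 70%:

```python
import numpy as np

# Example data set from the slides
X = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
y = np.array([1, 1, 1, -1, -1, -1, -1, 1, 1, 1])

def fit_stump(X, y, weights=None):
    """One-level binary tree: predict `left` for x <= threshold, -left otherwise.
    The threshold and orientation are chosen to minimize (weighted) error."""
    if weights is None:
        weights = np.full(len(X), 1.0 / len(X))
    best = None
    for thr in np.unique(X):
        for left in (1, -1):
            pred = np.where(X <= thr, left, -left)
            err = np.sum(weights[pred != y])
            if best is None or err < best[0]:
                best = (err, thr, left)
    return best   # (weighted error, threshold, label on the <= side)

print(fit_stump(X, y))   # best stump still misclassifies 3 of 10 points (70% accuracy)
```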
Bagging
[Figure: the bagging rounds on the example data set – taken from Tan et al. book “Introduction to Data Mining”]
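A compact sketch of bagging with decision stumps on the same example data, combined by simple majority vote (round count and seed are arbitrary):

```python
import numpy as np

X = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
y = np.array([1, 1, 1, -1, -1, -1, -1, 1, 1, 1])

def fit_stump(Xs, ys):
    """Best one-level split (threshold, label for the <= side) on a bootstrap sample."""
    best = None
    for thr in np.unique(Xs):
        for left in (1, -1):
            err = np.mean(np.where(Xs <= thr, left, -left) != ys)
            if best is None or err < best[0]:
                best = (err, thr, left)
    return best[1], best[2]

rng = np.random.default_rng(0)
stumps = []
for _ in range(11):                                        # 11 bagging rounds
    idx = rng.choice(len(X), size=len(X), replace=True)    # bootstrap sample of D
    stumps.append(fit_stump(X[idx], y[idx]))

# Majority vote of the stumps on the original data
votes = sum(np.where(X <= thr, left, -left) for thr, left in stumps)
print(np.sign(votes))   # the bagged vote is usually far better than any single stump
```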
Boosting
• Equal weights are assigned to each training tuple
(1/d for round 1)
• After a classifier Mi is learned, the weights are
adjusted to allow the subsequent classifier Mi+1 to
“pay more attention” to tuples that were
misclassified by Mi.
• Final boosted classifier M* combines the votes of
each individual classifier
• Weight of each classifier’s vote is a function of its
accuracy
• AdaBoost – a popular boosting algorithm
AdaBoost
• Input:
– Training set D containing d tuples
– k rounds
– A classification learning scheme
• Output:
– A composite model
AdaBoost
• Data set D containing d class-labeled tuples (X1,y1),
(X2,y2), (X3,y3),….(Xd,yd)
• Initially assign equal weight 1/d to each tuple
• To generate k base classifiers, we need k rounds or
iterations
• In round i, tuples from D are sampled with
replacement to form Di (of size d)
• Each tuple’s chance of being selected depends on its
weight
AdaBoost
• Base classifier Mi, is derived from training tuples of
Di
• Error of Mi is tested using Di
• Weights of training tuples are adjusted depending
on how they were classified
– Correctly classified: Decrease weight
– Incorrectly classified: Increase weight
• Weight of a tuple indicates how hard it is to classify
it (directly proportional)
AdaBoost
• Some classifiers may be better at classifying some
“hard” tuples than others
• We finally have a series of classifiers that
complement each other!
• Error rate of model Mi:
$error(M_i) = \sum_{j=1}^{d} w_j \cdot err(X_j)$
where err(Xj) = 1 if Xj is misclassified and 0 otherwise
• If classifier error exceeds 0.5, we abandon it
• Try again with a new Di and a new Mi derived from it
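A minimal sketch of the weighted error and the abandon-and-retry rule, where `predict` stands for the trained base classifier's prediction function (a hypothetical name):

```python
import numpy as np

def weighted_error(predict, X, y, weights):
    """error(M_i) = sum_j w_j * err(X_j), with err(X_j) = 1 for a
    misclassified tuple and 0 otherwise."""
    err = (predict(X) != y).astype(float)
    return np.sum(weights * err)

# error_i = weighted_error(model_i.predict, X, y, weights)
# if error_i > 0.5:
#     ...   # discard M_i, draw a fresh sample D_i and train a new M_i
```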
AdaBoost
• error(Mi) affects how the weights of the training tuples are updated
• If a tuple is correctly classified in round i, its weight is multiplied by $\frac{error(M_i)}{1 - error(M_i)}$
• Adjust the weights of all correctly classified tuples
• The weights of all tuples (including the misclassified ones) are then normalized
• Normalization factor = $\frac{sum\_of\_old\_weights}{sum\_of\_new\_weights}$
• The weight of classifier Mi's vote is $\log\frac{1 - error(M_i)}{error(M_i)}$
AdaBoost
• The lower a classifier's error rate, the more accurate it is, and therefore
the higher its voting weight should be
• The weight of classifier Mi's vote is
$\log\frac{1 - error(M_i)}{error(M_i)}$
• For each class c, sum the weights of each classifier that
assigned class c to X (unseen tuple)
• The class with the highest sum is the WINNER!
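A minimal sketch of the weight update and the weighted vote just described (illustrative names: `errors` holds error(M_i) for each round, `predictions` the class each M_i assigns to the unseen tuple):

```python
import numpy as np

def update_weights(weights, correct, error_i):
    """Multiply the weights of correctly classified tuples (boolean mask `correct`)
    by error(M_i) / (1 - error(M_i)), then rescale so the total weight is unchanged."""
    new_w = weights.copy()
    new_w[correct] *= error_i / (1.0 - error_i)
    return new_w * (weights.sum() / new_w.sum())   # normalization factor

def classify(predictions, errors, classes):
    """For each class c, add up log((1 - error)/error) of every classifier that
    voted for c; the class with the highest total is the winner."""
    votes = {c: 0.0 for c in classes}
    for pred, err in zip(predictions, errors):
        votes[pred] += np.log((1.0 - err) / err)
    return max(votes, key=votes.get)
```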
Example: AdaBoost
• Base classifiers: C1, C2, …, CT
• Error rate:
$\epsilon_i = \frac{1}{N} \sum_{j=1}^{N} w_j \, \delta\big(C_i(x_j) \neq y_j\big)$
• Importance of a classifier:
$\alpha_i = \frac{1}{2} \ln\!\left(\frac{1 - \epsilon_i}{\epsilon_i}\right)$
Example: AdaBoost
• Weight update:
$w_i^{(j+1)} = \frac{w_i^{(j)}}{Z_j} \times \begin{cases} \exp(-\alpha_j) & \text{if } C_j(x_i) = y_i \\ \exp(\alpha_j) & \text{if } C_j(x_i) \neq y_i \end{cases}$
where $Z_j$ is the normalization factor
• If any intermediate round produces an error rate higher than 50%, the weights are reverted to 1/n and the re-sampling procedure is repeated
• Classification:
$C^{*}(x) = \arg\max_{y} \sum_{j=1}^{T} \alpha_j \, \delta\big(C_j(x) = y\big)$
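Putting the formulas together, a compact sketch of the whole AdaBoost loop with decision stumps as base classifiers on the example data set; it re-weights the tuples directly rather than re-sampling, which is a simplifying assumption:

```python
import numpy as np

X = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
y = np.array([1, 1, 1, -1, -1, -1, -1, 1, 1, 1])

def fit_stump(X, y, w):
    """Best one-level split under the current tuple weights."""
    best = None
    for thr in np.unique(X):
        for left in (1, -1):
            pred = np.where(X <= thr, left, -left)
            err = np.sum(w[pred != y])
            if best is None or err < best[0]:
                best = (err, thr, left)
    return best

T, n = 5, len(X)
w = np.full(n, 1.0 / n)
stumps, alphas = [], []
for _ in range(T):
    eps, thr, left = fit_stump(X, y, w)
    if eps > 0.5:                        # error above 50%: revert weights, skip the round
        w = np.full(n, 1.0 / n)
        continue
    alpha = 0.5 * np.log((1 - eps) / eps)
    pred = np.where(X <= thr, left, -left)
    w = w * np.exp(-alpha * y * pred)    # exp(-alpha) if correct, exp(+alpha) if wrong
    w = w / w.sum()                      # Z_j, the normalization factor
    stumps.append((thr, left))
    alphas.append(alpha)

# C*(x): weighted vote of the base classifiers
scores = sum(a * np.where(X <= thr, left, -left) for a, (thr, left) in zip(alphas, stumps))
print(np.sign(scores))   # matches y after a few rounds, unlike any single stump
```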
Illustrating AdaBoost
[Figure: three boosting rounds (B1–B3) on the example data set; the initial weights of the data points used for training are all equal.
Round 1 (B1): predictions + + + - - - - - - -, example point weights 0.0094, 0.0094, 0.4623, α = 1.9459
Round 2 (B2): predictions - - - - - - - - + +, example point weights 0.3037, 0.0009, 0.0422, α = 2.9323
Round 3 (B3): predictions + + + + + + + + + +, example point weights 0.0276, 0.1819, 0.0038, α = 3.8744
Overall: + + + - - - - - + +]