Chapter 06
BAGGING/ADABOOST
By
YOUSUF SK
ASSISTANT PROF.
PIET - CSE
SUBTITLE: UNDERSTANDING THE APPLICATION
OF ADABOOST IN RISK MANAGEMENT
INTRODUCTION TO ADABOOST
1.BIAS REDUCTION:
BY COMBINING MULTIPLE WEAK LEARNERS, ADABOOST REDUCES BIAS, MAKING THE OVERALL
MODEL MORE FLEXIBLE AND CAPABLE OF CAPTURING COMPLEX PATTERNS IN THE DATA.
2.CONTROLLED VARIANCE:
EVEN THOUGH EACH WEAK LEARNER MAY HAVE HIGH VARIANCE, THE AGGREGATION PROCESS
HELPS TO AVERAGE OUT THEIR ERRORS, REDUCING THE OVERALL VARIANCE.
3.IMPROVING GENERALIZATION:
ENHANCED ACCURACY: ITERATIVE FOCUS ON MISCLASSIFIED INSTANCES IMPROVES
ACCURACY ON TRAINING DATA AND UNSEEN DATA.
ADAPTIVE WEIGHTING: DYNAMICALLY ADJUSTS WEIGHTS OF TRAINING INSTANCES TO
FOCUS ON CHALLENGING PARTS OF THE DATASET.
4.MODEL AGGREGATION:
WEIGHTED MAJORITY VOTING: FINAL PREDICTION IS MADE BY WEIGHTED MAJORITY VOTE
OF ALL WEAK LEARNERS, WITH WEIGHTS BASED ON THEIR ACCURACY.
CUMULATIVE LEARNING: COMBINES ALL WEAK LEARNERS TO CREATE A STRONG
CLASSIFIER, LEVERAGING COLLECTIVE KNOWLEDGE.
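The ideas above can be sketched with scikit-learn's `AdaBoostClassifier`, which by default combines depth-1 decision trees (stumps) by weighted majority voting. The synthetic dataset and parameter values below are illustrative, not from the slides:

```python
# Minimal AdaBoost sketch: 50 decision stumps combined by weighted voting.
# Dataset is synthetic (make_classification), chosen only for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each boosting round reweights misclassified training instances so the
# next weak learner focuses on the challenging parts of the dataset.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

The per-learner weights used in the final vote are exposed as `clf.estimator_weights_`, reflecting each weak learner's accuracy.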
Unsupervised Machine Learning
In contrast to supervised learning, unsupervised learning doesn't require labelled data for training.
Instead, it aims to discover hidden patterns and insights within the dataset. Just like how humans
learn, unsupervised learning enables models to act on unlabelled datasets without explicit guidance.
Due to the absence of corresponding output data, unsupervised learning cannot be directly applied
to regression or classification problems. Its primary goal is to uncover a dataset's inherent structure,
group data based on similarities, and represent the dataset in a more condensed form.
Consider a dataset of images of cats and dogs (Figure 1). Without prior training on this specific
dataset, the algorithm's task is to identify each image's distinctive features independently. The
algorithm will then cluster the images into groups based on similarities.
Clustering
Clustering is an unsupervised learning technique used to group a set of objects in such a way that objects in the same
group (or cluster) are more similar to each other than to those in other groups. The goal is to identify natural groupings
within the data based on inherent similarities.
A clustering criterion function is a mathematical measure used to evaluate the quality of the clusters formed
by a clustering algorithm. It quantifies how well the objects within each cluster are similar to each other and
how distinct the clusters are from one another. Common clustering criterion functions include:
ELBOW METHOD
FOLLOWING ARE THE STEPS IN THE ELBOW METHOD TO FIND THE OPTIMAL NUMBER OF CLUSTERS:
· EXECUTE K-MEANS CLUSTERING ON A GIVEN DATASET FOR DIFFERENT K VALUES (1-10).
· FOR EACH VALUE OF K, CALCULATE THE WCSS (WITHIN-CLUSTER SUM OF SQUARES) VALUE.
· PLOT A CURVE BETWEEN THE CALCULATED WCSS VALUES AND THE NUMBER OF CLUSTERS K.
· THE SHARP POINT OF BEND, WHERE THE PLOT LOOKS LIKE AN ARM, IS CONSIDERED
THE BEST VALUE OF K.
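The elbow-method steps can be sketched as follows; in scikit-learn the WCSS for a fitted k-means model is available as `inertia_`. The blob dataset is illustrative, not from the slides:

```python
# Elbow method sketch: run k-means for k = 1..10 and record WCSS.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with 4 true clusters (illustrative assumption).
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

wcss = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    wcss.append(km.inertia_)  # within-cluster sum of squares for this k

# Plotting wcss against k (e.g. with matplotlib) would show a sharp bend
# near k = 4, the "elbow" that indicates the best number of clusters.
```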
FIGURE 5: ELBOW GRAPH
SILHOUETTE COEFFICIENT
DEFINITION: MEASURES HOW SIMILAR A POINT IS TO ITS OWN CLUSTER COMPARED TO OTHER CLUSTERS.
OBJECTIVE: MAXIMIZE THE SILHOUETTE COEFFICIENT, WHICH RANGES FROM -1 TO 1.
𝑠(𝑖) = (𝑏(𝑖) − 𝑎(𝑖)) / max(𝑎(𝑖), 𝑏(𝑖))
WHERE 𝑎(𝑖) IS THE AVERAGE DISTANCE FROM POINT 𝑖 TO OTHER POINTS IN ITS OWN CLUSTER, AND 𝑏(𝑖) IS
THE MINIMUM AVERAGE DISTANCE FROM POINT 𝑖 TO POINTS IN ANOTHER CLUSTER.
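A quick sketch of the silhouette coefficient using scikit-learn's `silhouette_score`, which averages 𝑠(𝑖) over all points. The blob dataset is an illustrative assumption:

```python
# Silhouette coefficient sketch: well-separated blobs should score high.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=1)
labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X)

# Mean of s(i) over all points; ranges from -1 (bad) to 1 (dense, well separated).
score = silhouette_score(X, labels)
```

Comparing `score` across different values of k is an alternative to the elbow method for choosing the number of clusters.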
K-MEANS CLUSTERING
DEFINITION: K-MEANS CLUSTERING IS A POPULAR UNSUPERVISED LEARNING ALGORITHM THAT PARTITIONS A
DATASET INTO K DISTINCT, NON-OVERLAPPING CLUSTERS. EACH CLUSTER IS REPRESENTED BY ITS CENTROID,
WHICH IS THE MEAN OF THE DATA POINTS IN THE CLUSTER.
1.INITIALIZATION STEP:
RANDOMLY SELECT K DATA POINTS FROM THE DATASET AS THE INITIAL CENTROIDS.
2.ASSIGNMENT STEP:
ASSIGN EACH DATA POINT TO THE NEAREST CENTROID. THIS FORMS K CLUSTERS BASED ON THE EUCLIDEAN
DISTANCE.
3.UPDATE STEP:
RECALCULATE THE CENTROID OF EACH CLUSTER AS THE MEAN OF ALL DATA POINTS ASSIGNED TO THAT
CLUSTER.
4.REPEAT:
REPEAT THE ASSIGNMENT AND UPDATE STEPS UNTIL THE CENTROIDS NO LONGER CHANGE SIGNIFICANTLY OR
A MAXIMUM NUMBER OF ITERATIONS IS REACHED.
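The four steps above can be sketched directly in NumPy. This is a minimal from-scratch illustration (it does not handle the edge case of a cluster becoming empty, which production implementations must):

```python
# From-scratch k-means sketch mirroring the four steps in the slides.
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Initialization: pick k distinct data points as initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # 2. Assignment: each point goes to its nearest centroid (Euclidean).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update: each centroid becomes the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. Repeat until the centroids no longer change significantly.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

In practice scikit-learn's `KMeans` (with smarter k-means++ initialization and multiple restarts) would be used instead.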
EUCLIDEAN DISTANCE
THE EUCLIDEAN DISTANCE BETWEEN TWO POINTS 𝑝 AND 𝑞 IN N DIMENSIONS IS
𝑑(𝑝, 𝑞) = √((𝑝₁ − 𝑞₁)² + (𝑝₂ − 𝑞₂)² + … + (𝑝ₙ − 𝑞ₙ)²).
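A one-line check of the Euclidean distance with NumPy (the two example points are arbitrary):

```python
# Euclidean distance between two illustrative 2-D points.
import numpy as np

p = np.array([1.0, 2.0])
q = np.array([4.0, 6.0])
d = np.linalg.norm(p - q)  # sqrt((1-4)^2 + (2-6)^2) = sqrt(9 + 16)
print(d)  # → 5.0
```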
HIERARCHICAL CLUSTERING
DEFINITION: HIERARCHICAL CLUSTERING IS AN UNSUPERVISED LEARNING ALGORITHM USED TO GROUP DATA
POINTS INTO NESTED CLUSTERS IN A HIERARCHICAL STRUCTURE. IT CAN BE VISUALIZED USING A
DENDROGRAM, WHICH ILLUSTRATES THE MERGING OR SPLITTING OF CLUSTERS AT VARIOUS LEVELS OF
SIMILARITY.
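A dendrogram can be built with SciPy's hierarchical clustering utilities; the two small synthetic groups below are an illustrative assumption:

```python
# Hierarchical clustering sketch: build the merge tree and its dendrogram layout.
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
# Two tight groups of 5 points each (illustrative data).
X = np.vstack([rng.normal(0, 0.3, (5, 2)), rng.normal(4, 0.3, (5, 2))])

# Each row of Z records one merge: the two clusters joined, their distance,
# and the size of the new cluster.
Z = linkage(X, method="ward")

# dendrogram(Z) would draw the tree; no_plot=True just computes the layout.
tree = dendrogram(Z, no_plot=True)
```

Cutting the dendrogram at a chosen height (e.g. with `scipy.cluster.hierarchy.fcluster`) yields a flat clustering at that level of similarity.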
AGGLOMERATIVE HIERARCHICAL CLUSTERING ALGORITHM
DEFINITION: AGGLOMERATIVE HIERARCHICAL CLUSTERING IS AN UNSUPERVISED LEARNING ALGORITHM THAT
STARTS WITH EACH DATA POINT AS AN INDIVIDUAL CLUSTER AND ITERATIVELY MERGES THE CLOSEST PAIRS OF
CLUSTERS UNTIL ALL POINTS ARE IN A SINGLE CLUSTER OR THE DESIRED NUMBER OF CLUSTERS IS ACHIEVED.
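Agglomerative clustering is available directly in scikit-learn; the blob dataset below is illustrative:

```python
# Agglomerative clustering sketch: bottom-up merging until 3 clusters remain.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=60, centers=3, cluster_std=0.5, random_state=0)

# Starts with 60 singleton clusters and merges the closest pair (Ward
# linkage) until the requested number of clusters is reached.
labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)
```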