Lecture 3
Contents
3.1 Understanding Ensembles, K-fold cross validation, Boosting, Stumping, XGBoost
3.2 Bagging, Subagging, Random Forest, Comparison with Boosting, Different ways to
combine classifiers
In this approach the dataset is divided into k equal-sized samples. Each individual sample is called a fold.
In each round, k-1 folds are used to fit the prediction function and the remaining fold is used for testing.
The steps for k-fold cross-validation are:
1. Divide the dataset into k samples (folds)
2. For each iteration:
- Reserve one fold as the test dataset
- Use the remaining folds for training
- Evaluate the performance of the model on the reserved test fold
For example, consider 5-fold cross-validation: the dataset is divided into 5 folds. In the 1st iteration the first fold is reserved as the test dataset and the remaining folds are used for training. In the 2nd iteration the second fold is used for testing and the remaining folds are used for training. This continues until every fold has been used once as a test dataset, as sketched in the code below.
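The following sketch, assuming scikit-learn with the Iris dataset and a decision tree as an illustrative model, walks through the same 5-fold procedure: each fold is held out once for testing while the other four folds are used for training.

```python
# A minimal sketch of 5-fold cross-validation
# (dataset, model choice and scoring are illustrative assumptions).
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=42)

scores = []
for fold, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    # Reserve one fold for testing, train on the remaining k-1 folds
    model = DecisionTreeClassifier(random_state=42)
    model.fit(X[train_idx], y[train_idx])
    acc = accuracy_score(y[test_idx], model.predict(X[test_idx]))
    scores.append(acc)
    print(f"Fold {fold}: accuracy = {acc:.3f}")

print(f"Mean accuracy over 5 folds = {sum(scores) / len(scores):.3f}")
```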
Stratified k-fold cross-validation is very similar to k-fold cross-validation, with some minor changes.
This method is based on stratification, which means arranging the data so that each fold is a good representative of the complete dataset.
It is a good method for handling bias and variance.
For example, in a mobile-price dataset the prices of some mobile gadgets are high compared to others, so the categories are unevenly represented; stratified k-fold cross-validation keeps each fold representative of all of them.
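A minimal sketch, assuming scikit-learn and a made-up imbalanced label, showing how StratifiedKFold keeps the class proportions roughly the same in every fold:

```python
# Sketch of stratified k-fold splitting; the labels are made up for illustration
# (80% of class 0, 20% of class 1) to mimic an imbalanced dataset.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(100).reshape(-1, 1)   # dummy feature matrix
y = np.array([0] * 80 + [1] * 20)   # imbalanced class labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y), start=1):
    # Each test fold keeps roughly the same 80/20 class ratio as the full dataset
    ratio = y[test_idx].mean()
    print(f"Fold {fold}: test size = {len(test_idx)}, fraction of class 1 = {ratio:.2f}")
```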
Advantages of cross-validation:
- Overfitting: it reduces the risk of overfitting because the model is evaluated on data that was not seen during training.
- Model selection: it makes it possible to compare the performance of different models and select the best one.
- Data efficiency: all of the data is used for both training and testing across the folds, which makes the method data efficient.
Disadvantages of cross-validation:
- Expensive: it has a high computational cost when the model is complex and takes a long time to train, because the model is trained k times.
- Time consuming: the more complex models there are to train and combine, the more time is required for training and testing.
- Bias-variance trade-off: the choice of the number of folds affects the trade-off, since some splits may result in high variance while others may result in high bias.
3.2 Boosting:
3.3 Stumping:
A decision stump is a one-level decision tree: it consists of a single internal node (the root node) connected directly to the terminal nodes (the leaf nodes).
Decision stumps are used as weak learners in ensemble learning techniques such as boosting and bagging.
In this technique the decision is based on a single input feature, so it is also called the 1-rule technique.
When used for binary classification, a threshold value is chosen first: if the input value is greater than the threshold the example is classified as 1, and if it is less than or equal to the threshold it is classified as 0.
There are several ways to build a stump depending on the type of input: for a nominal feature the stump can be built so that each possible value of the feature gets its own leaf, or the values can be grouped into categories with one leaf per category.
[Figure: a decision stump with a single Yes/No split at the root and two leaf nodes labelled Boys and Girls.]
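A minimal sketch, assuming scikit-learn, of a decision stump built as a depth-1 decision tree and used as the weak learner inside AdaBoost; the dataset is illustrative:

```python
# A decision stump is simply a decision tree limited to depth 1 (one split).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

stump = DecisionTreeClassifier(max_depth=1)  # one root node, two leaf nodes
# Older scikit-learn versions use base_estimator= instead of estimator=
boosted = AdaBoostClassifier(estimator=stump, n_estimators=50, random_state=0)
boosted.fit(X, y)
print("Training accuracy of boosted stumps:", boosted.score(X, y))
```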
3.4 XGBoost:
XGBoost stands for eXtreme Gradient Boosting. It can handle large datasets easily and achieves good performance in classification and regression.
It combines many weak models to build a strong prediction model. This helps us understand the data and make better decisions.
The advantages of XGBoost are its speed, ease of use and good performance on large datasets.
One important feature of XGBoost is that it can handle missing values in real-world data without any preprocessing, and it can train on large datasets in a small amount of time.
It is used in many applications such as recommendation systems, Kaggle competitions, click-through prediction systems and so on.
It allows its parameters to be tuned, so the model can be highly optimized and highly personalized.
It is also useful for managing overfitting, because it applies regularization to the trees and their leaf weights.
It follows parallel learning, so it is easily scalable on clusters. It supports both classification and regression models, as sketched below.
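A minimal sketch, assuming the xgboost Python package is installed, showing a classifier trained on an illustrative dataset that contains missing values (which XGBoost handles without extra preprocessing); the parameter values are illustrative, not recommendations:

```python
# Sketch of XGBoost classification; NaN entries are left in the data on purpose
# to illustrate XGBoost's built-in handling of missing values.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X[np.random.default_rng(0).random(X.shape) < 0.05] = np.nan  # inject 5% missing values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBClassifier(
    n_estimators=200,   # number of boosted trees (weak models)
    learning_rate=0.1,  # shrinkage applied to each new tree
    max_depth=3,        # depth of each individual tree
    n_jobs=-1,          # parallel tree construction
)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```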
Gradient boosting uses a series of models and combines them to achieve a highly accurate model.
To add a new model to the existing ensemble it uses a gradient descent step: the new model is fitted to the gradient of the loss of the current ensemble, as in the sketch below.
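A minimal sketch, using scikit-learn regression trees, of this core gradient boosting idea for squared-error loss: each new weak model is fitted to the residuals (the negative gradient) of the current ensemble. The dataset and parameter values are illustrative.

```python
# Minimal gradient boosting loop for squared-error loss.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

learning_rate = 0.1
prediction = np.full_like(y, y.mean(), dtype=float)  # start from a constant model
trees = []

for _ in range(100):
    residuals = y - prediction                        # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)     # add the new weak model
    trees.append(tree)

print("Training MSE after boosting:", np.mean((y - prediction) ** 2))
```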
Suppose we have a set of input features. If the target output variable we want to predict is in a continuous format, a regression algorithm is used.
In that case it is our responsibility to guide the model with the data so that a highly accurate model can be achieved.
If instead we have a set of input features and want to predict a target output feature that is in a categorical format, a classification algorithm is used. Here the model is guided by past observations in the dataset.
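A minimal sketch, again assuming the xgboost package, of the regression case with a continuous target; a categorical target would use XGBClassifier instead, as shown earlier. The dataset and parameters are illustrative.

```python
# Continuous target -> regression with XGBRegressor.
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
reg = XGBRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)
reg.fit(X, y)
print("Prediction for the first example:", reg.predict(X[:1]))
```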
3.5 Bagging:
The figure above shows how bagging works. Bagging creates subsets of the original data by sampling with replacement: it generates each subset by bootstrap resampling and trains a separate model on each subset.
The final prediction is obtained by averaging or voting over all of the individual prediction models.
A bagging classifier can use different base classifiers such as decision trees, neural networks, linear classifiers and so on.
Algorithm:
- Subsets are created from the original dataset by bootstrap resampling with replacement.
- A base classifier is created for each subset.
- Each classifier is trained on its subset in parallel, independently of the others.
- The final prediction model is developed by averaging or voting over all the predictions, as sketched below.
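A minimal sketch, assuming scikit-learn, of bagging with decision trees as the base classifier; the dataset and parameter values are illustrative:

```python
# Bagging: bootstrap-resampled subsets, one base classifier per subset,
# final prediction by majority vote over all classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # base classifier (older scikit-learn: base_estimator=)
    n_estimators=50,                     # number of bootstrap subsets / models
    bootstrap=True,                      # sample each subset with replacement
    n_jobs=-1,                           # train the models in parallel
    random_state=0,
)
bagging.fit(X_train, y_train)
print("Test accuracy:", bagging.score(X_test, y_test))
```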
Review Questions:
Summary