
UNIVERSITY OF ENGINEERING & MANAGEMENT, KOLKATA

Course Name : AI & ML

Basic Concept

The Ensemble methods


• Ensemble methods, also known as committee-based learning or learning with multiple classifier systems, train multiple hypotheses to solve the same problem. One of the most common examples of ensemble modeling is the random forest, where a number of decision trees are used to predict outcomes.
• An ensemble contains a number of hypotheses, or learners, which are usually generated from training data with the help of a base learning algorithm. Most ensemble methods use a single base learning algorithm to produce homogeneous base learners (homogeneous ensembles); some other methods use multiple learning algorithms and thus produce heterogeneous ensembles. Ensemble methods are well known for their ability to boost weak learners.


Why Use Ensemble Methods?


• Learning algorithms that output only a single hypothesis tend to suffer from three issues: the statistical problem, the computational problem and the representational problem, which can be partly overcome by applying ensemble methods.
• A learning algorithm that suffers from the statistical problem is said to have high variance. An algorithm that exhibits the computational problem is sometimes described as having computational variance, and a learning algorithm that suffers from the representational problem is said to have high bias. These three fundamental issues are the three important ways in which existing learning algorithms fail. Ensemble methods promise to reduce both the bias and the variance arising from these shortcomings of standard learning algorithms.

• Different Techniques
• Some of the commonly used ensemble techniques are discussed below.
Bagging
• Bagging, or Bootstrap Aggregation, is a powerful, effective and simple ensemble method. The method builds multiple versions of a training set using the bootstrap, i.e. sampling with replacement, and it can be used with any type of model for classification or regression. Bagging is only effective when using unstable, non-linear models (unstable meaning that a small change in the training set can cause a significant change in the model); a minimal sketch follows.
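As an illustration only, the snippet below is a minimal bagging sketch using scikit-learn's BaggingClassifier; the dataset and parameter values are placeholders chosen for the example, and in scikit-learn versions before 1.2 the estimator argument is named base_estimator.

```python
# Minimal bagging sketch (illustrative dataset and hyperparameters).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unstable, non-linear base model (a fully grown decision tree) + bootstrap sampling.
bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # base learner ("base_estimator" in older versions)
    n_estimators=50,                     # number of bootstrap replicates
    bootstrap=True,                      # sample the training set with replacement
    random_state=0,
)
bag.fit(X_train, y_train)
print("Bagging test accuracy:", bag.score(X_test, y_test))
```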

Boosting
• Boosting is a meta-algorithm which can be viewed as a model averaging
method. It is the most widely used ensemble method and one of the
most powerful learning ideas. This method was originally designed for
classification but it can also be profitably extended to regression. The
original boosting algorithm combined three weak learners to generate a
strong learner.
Stacking
• Stacking is concerned with combining multiple classifiers generated by using different learning algorithms on a single dataset, which consists of pairs of feature vectors and their classifications. This technique consists of two phases: in the first phase, a set of base-level classifiers is generated, and in the second phase, a meta-level classifier is learned which combines the outputs of the base-level classifiers (a sketch follows).
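A minimal stacking sketch, assuming scikit-learn's StackingClassifier is available; the choice of base-level classifiers, the logistic-regression meta-learner and the dataset are illustrative only.

```python
# Minimal stacking sketch: heterogeneous base-level classifiers + a meta-level classifier.
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Phase 1: base-level classifiers trained on the same dataset.
base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=3)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Phase 2: a meta-level classifier learns to combine the base-level outputs.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X, y)
print("Training accuracy:", stack.score(X, y))
```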

Applications Of Ensemble Methods


• Ensemble methods can be used as overall diagnostic procedures for more conventional model building. The larger the difference in fit quality between one of the stronger ensemble methods and a conventional statistical model, the more information the conventional model is probably missing.
• Ensemble methods can be used to evaluate the relationships between explanatory variables and the response in conventional statistical models. Predictors or basis functions overlooked in a conventional model may surface with an ensemble approach.
• With the help of ensemble methods, the selection process can be better captured and the probability of membership in each treatment group estimated with less bias.
• One could use ensemble methods to implement the covariance adjustments inherent in multiple regression and related procedures: one would "residualize" the response and the predictors of interest with ensemble methods.

• How to generate an ensemble of classifiers?


– Bagging

– Boosting

Bagging


Bootstrapping is the method of randomly creating samples of data out of a population, with replacement, to estimate a population parameter.
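For instance, one bootstrap sample can be drawn with NumPy as sketched below; the data values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
population = np.array([2.3, 4.1, 3.8, 5.0, 2.9, 4.4])   # illustrative observations
# A bootstrap sample: same size as the data, drawn with replacement.
sample = rng.choice(population, size=population.size, replace=True)
print(sample, sample.mean())   # one bootstrap estimate of the population mean
```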


Steps to Perform Bagging


• Consider there are n observations and m features in the training set. Select a random sample from the training dataset with replacement (a bootstrap sample).
• A subset of the m features is chosen randomly to create a model using the sampled observations.
• The feature offering the best split out of the lot is used to split the nodes.
• The tree is grown so that you have the best root nodes.
• The above steps are repeated n times, and the outputs of the individual decision trees are aggregated to give the best prediction (see the sketch after this list).
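A rough sketch of these steps, under the assumption that the inputs are NumPy arrays with non-negative integer class labels, that the base models are scikit-learn decision trees, and that the per-split random feature subset is handled by the tree's max_features option; this is illustrative, not a reference implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_predict(X_train, y_train, X_test, n_rounds=10, seed=0):
    """Bootstrap-aggregated decision trees with majority voting (sketch)."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    votes = []
    for _ in range(n_rounds):
        # Step 1: bootstrap sample of the observations (with replacement).
        idx = rng.integers(0, n, size=n)
        # Steps 2-4: grow a tree; a random subset of features is considered
        # at each split via max_features (assumption made for this sketch).
        tree = DecisionTreeClassifier(max_features="sqrt")
        tree.fit(X_train[idx], y_train[idx])
        votes.append(tree.predict(X_test))
    # Step 5: aggregate the individual predictions by majority vote.
    votes = np.array(votes)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```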


Advantages of Bagging in Machine Learning


• Bagging minimizes the overfitting of data
• It improves the model’s accuracy
• It deals with higher dimensional data
efficiently


• Bagging
• Sampling with replacement from the training data:

Data ID:            1   2   3   4   5   6   7   8   9   10
Bagging (Round 1):  7   8   10  8   2   5   10  10  5   9
Bagging (Round 2):  1   4   9   1   2   3   2   7   3   2
Bagging (Round 3):  1   8   5   10  5   5   9   6   3   7

• Build a classifier on each bootstrap sample
• Each record has probability (1 − 1/n)^n of not being selected in a given bootstrap sample, i.e. of ending up as test data
• On average, the training data therefore contain about 1 − (1 − 1/n)^n of the original records


• This method is also called the 0.632 bootstrap

– In a single draw, a particular training record has a probability of 1 − 1/n of not being picked
– Thus its probability of ending up in the test data (never selected in n draws) is:

  \left(1 - \frac{1}{n}\right)^{n} \approx e^{-1} \approx 0.368

– This means the training data will contain approximately 63.2% of the (distinct) instances
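A quick numerical check of this limit (the value of n is arbitrary):

```python
n = 1000
p_out = (1 - 1/n) ** n          # probability a given record is never drawn
print(round(p_out, 4))          # ~0.3677, close to e**-1 ~ 0.368
print(round(1 - p_out, 4))      # ~0.6323: fraction of distinct records in the sample
```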


Bagging Example: Assume that the training data are points x on [0, 1] with labels (from the original number-line figure):

  +1 for x ≤ 0.3,   −1 for 0.4 ≤ x ≤ 0.7,   +1 for x ≥ 0.8

Goal: find a collection of 10 simple thresholding classifiers that collectively can classify the data correctly.
– Each simple (or weak) classifier is of the form:
  x ≤ K ⇒ class = +1 or −1, depending on which value yields the lowest error, where K is determined by entropy minimization.
Bagging (applied to the training data): accuracy of the ensemble classifier = 100%


Bagging Summary
• Works well if the base classifiers are unstable
(complement each other)
• Increased accuracy because it reduces the
variance of the individual classifier
• Does not focus on any particular instance of the
training data
– Therefore, less susceptible to model over-fitting
when applied to noisy data
• What if we want to focus on particular instances of the training data?
Definition

• WHAT IS RANDOM FOREST?


• Random forest is a supervised learning algorithm. The "forest" it builds is an ensemble of decision trees, usually trained with the "bagging" method. The general idea of the bagging method is that a combination of learning models improves the overall result.
• Put simply: random forest builds multiple decision trees and merges them together to get a more accurate and stable prediction.


• Random forest adds additional randomness to the model while growing the trees. Instead of searching for the most important feature while splitting a node, it searches for the best feature among a random subset of features. This creates wide diversity, which generally results in a better model.
• Therefore, in random forest, only a random subset of the features is taken into consideration by the algorithm for splitting a node. You can even make trees more random by additionally using random thresholds for each feature rather than searching for the best possible thresholds (as a normal decision tree does).

DIFFERENCE BETWEEN DECISION TREES AND RANDOM FORESTS
• While random forest is a collection of decision trees, there are some differences.
• If you input a training dataset with features and labels into a decision tree, it will
formulate some set of rules, which will be used to make the predictions.
• For example, to predict whether a person will click on an online advertisement, you might
collect the ads the person clicked on in the past and some features that describe his/her
decision. If you put the features and labels into a decision tree, it will generate some rules
that help predict whether the advertisement will be clicked or not. In comparison, the
random forest algorithm randomly selects observations and features to build several
decision trees and then averages the results.
• Another difference is "deep" decision trees might suffer from overfitting. Most of the time,
random forest prevents this by creating random subsets of the features and building
smaller trees using those subsets. Afterwards, it combines the subtrees. It's important to
note this doesn’t work every time and it also makes the computation slower, depending on
how many trees the random forest builds.

• ADVANTAGES AND DISADVANTAGES OF THE RANDOM FOREST ALGORITHM
• One of the biggest advantages of random forest is its versatility. It can be used for both regression and
classification tasks, and it’s also easy to view the relative importance it assigns to the input features.
• Random forest is also a very handy algorithm because the default hyperparameters it uses often produce a good prediction result. Understanding the hyperparameters is pretty straightforward, and there are also not that many of them.
• One of the biggest problems in machine learning is overfitting, but most of the time this won’t happen
thanks to the random forest classifier. If there are enough trees in the forest, the classifier won’t overfit
the model.
• The main limitation of random forest is that a large number of trees can make the algorithm too slow
and ineffective for real-time predictions. In general, these algorithms are fast to train, but quite slow to
create predictions once they are trained. A more accurate prediction requires more trees, which
results in a slower model. In most real-world applications, the random forest algorithm is fast enough
but there can certainly be situations where run-time performance is important and other approaches
would be preferred.
• And, of course, random forest is a predictive modeling tool and not a descriptive tool, meaning if
you're looking for a description of the relationships in your data, other approaches would be better.


• How does the Random Forest algorithm work?

• Random Forest works in two phases: the first is to create the random forest by combining N decision trees, and the second is to make predictions using each tree created in the first phase.
The working process can be explained in the steps below (a minimal sketch follows the list):
• Step-1: Select random K data points from the training set.
• Step-2: Build the decision trees associated with the selected data points (subsets).
• Step-3: Choose the number N of decision trees that you want to build.
• Step-4: Repeat Steps 1 & 2.
• Step-5: For new data points, find the predictions of each decision tree, and assign the new data points to the category that wins the majority of votes.
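A minimal sketch of these steps with scikit-learn's RandomForestClassifier; the dataset and hyperparameter values are illustrative (n_estimators plays the role of N, and max_features controls the random feature subset mentioned earlier).

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # Step 3: the number N of decision trees
    max_features="sqrt",   # random subset of features considered at each split
    bootstrap=True,        # Step 1: random samples of the data points
    random_state=0,
)
forest.fit(X_train, y_train)                  # Steps 1-4: build the trees
print(forest.predict(X_test[:5]))             # Step 5: majority vote across trees
print("Test accuracy:", forest.score(X_test, y_test))
```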

• Applications of Random Forest


• There are mainly four sectors where random forest is mostly used:
• Banking: The banking sector mostly uses this algorithm for the identification of loan risk.
• Medicine: With the help of this algorithm, disease trends and risks of
the disease can be identified.
• Land Use: We can identify the areas of similar land use by this
algorithm.
• Marketing: Marketing trends can be identified using this algorithm.


• Lazy learners
• Lazy learners simply store the training data and wait until test data appear. When they do, classification is conducted based on the most related data in the stored training data. Compared to eager learners, lazy learners spend less time training but more time predicting.
• Ex. k-nearest neighbor, case-based reasoning
• Eager learners
• Eager learners construct a classification model based on the given training data before receiving data for classification. They must be able to commit to a single hypothesis that covers the entire instance space. Due to the model construction, eager learners take a long time to train but less time to predict.
• Ex. Decision Tree, Naive Bayes, Artificial Neural Networks

• k-Nearest-Neighbor Method
– first described in the early 1950s
– It has since been widely used in the area of pattern recognition.
– The training instances are described by n attributes.
– Each instance represents a point in an n-dimensional space.
– A k-nearest-neighbor classifier searches the pattern space for the k training instances that are closest to the unknown instance.


• The nearest neighbor can be defined in terms of Euclidean distance, dist(X1, X2).
• The Euclidean distance between two points or instances, say X1 = (x11, x12, …, x1n) and X2 = (x21, x22, …, x2n), is:

  dist(X_1, X_2) = \sqrt{\sum_{i=1}^{n} (x_{1i} - x_{2i})^2}

– Nominal attributes: distance is either 0 or 1
– Refer to cluster analysis for more distance metrics
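A small sketch of this distance computation, assuming the attribute values are numeric and stored as NumPy arrays:

```python
import numpy as np

def euclidean_distance(x1, x2):
    """dist(X1, X2) = sqrt(sum_i (x1i - x2i)^2) for numeric attribute vectors."""
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    return np.sqrt(np.sum((x1 - x2) ** 2))

print(euclidean_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0]))   # 5.0
```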
KNN Example (worked-example figure omitted)

• For k-nearest-neighbor classification, the unknown


instance is assigned the most common class among its
k nearest neighbors.
• When k = 1, the unknown instance is assigned the
class of the training instance that is closest to it in
pattern space.
• Nearest-neighbor classifiers can also be used for
prediction, that is, to return a real-valued prediction
for a given unknown instance.
– In this case, the classifier returns the average value of the
real-valued labels associated with the k nearest neighbors of
the unknown instance.

• Distances for categorical attributes:


– A simple method is to compare the corresponding
value of the attribute in instance X1 with that in
instance X2.
– If the two are identical (e.g., instances X1 and X2 both
have the color blue), then the difference between the
two is taken as 0, otherwise 1.
– Other methods may incorporate more sophisticated schemes for differential grading (e.g., where a larger difference score is assigned, say, for blue and white than for blue and black).

• Handling missing values:


– In general, if the value of a given attribute A is missing
in instance X1 and/or in instance X2, we assume the
maximum possible difference.
– For categorical attributes, we take the difference value
to be 1 if either one or both of the corresponding values
of A are missing.
– If A is numeric and missing from both instances X1
and X2, then the difference is also taken to be 1.
◆ If only one value is missing and the other (which we’ll call v’) is
present and normalized, then we can take the difference to be
either |1 - v’| or |0 – v’| , whichever is greater.

• Typically, we normalize the values of each attribute in advance.
• This helps prevent attributes with initially large ranges (such as income) from outweighing attributes with initially smaller ranges (such as binary attributes).
• Min-max normalization:

  v' = \frac{v - \min_A}{\max_A - \min_A}

– all attribute values then lie between 0 and 1
– For more information on normalization methods, refer to the data preprocessing section
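A minimal min-max normalization sketch; the income values are made up for illustration (note that a constant attribute, where max equals min, would need special handling).

```python
import numpy as np

def min_max_normalize(values):
    """Rescale a numeric attribute so that all values lie in [0, 1]."""
    v = np.asarray(values, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

income = np.array([30_000, 45_000, 60_000, 120_000])   # illustrative attribute
print(min_max_normalize(income))   # [0.  0.167  0.333  1.] (approximately)
```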

• Determining a good value for k:


– k can be determined experimentally.
– Starting with k = 1, we use a test set to estimate the
error rate of the classifier.
– This process can be repeated each time by
incrementing k to allow for one more neighbor.
– The k value that gives the minimum error rate may be selected.
– In general, the larger the number of training instances, the larger the value of k will be (a selection sketch follows).
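One way to sketch this search, assuming scikit-learn's KNeighborsClassifier and a held-out test set for estimating the error rate (the dataset and the range of k are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

best_k, best_err = None, 1.0
for k in range(1, 16):                          # try k = 1, 2, ..., 15
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    err = 1.0 - knn.score(X_test, y_test)       # estimated error rate
    if err < best_err:
        best_k, best_err = k, err
print("Selected k:", best_k, "with error rate", round(best_err, 3))
```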

• Boosting
• An iterative procedure to adaptively change
distribution of training data by focusing
more on previously misclassified records
– Initially, all N records are assigned equal
weights
– Unlike bagging, weights may change at the end
of a boosting round


Steps for Boosting


• Initialise the dataset and assign equal weight to each data point.
• Provide this as input to the model and identify the wrongly classified data points.
• Increase the weights of the wrongly classified data points.
• if (got the required results)
      go to step 5
  else
      go to step 2
• End (a minimal AdaBoost sketch follows these steps)
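These steps are the idea behind AdaBoost; the snippet below is a minimal scikit-learn sketch with an illustrative dataset and hyperparameters (in scikit-learn versions before 1.2 the estimator argument is named base_estimator).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Weak learner: a decision stump; each round re-weights the misclassified points.
boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    random_state=0,
)
boost.fit(X_train, y_train)
print("Boosting test accuracy:", boost.score(X_test, y_test))
```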


• Explanation:
The AdaBoost algorithm is commonly illustrated with four panels, B1–B4 (figure omitted here). Let's try to understand it in a stepwise process:
• B1 consists of 10 data points which consist of two types namely plus(+) and minus(-) and 5 of
which are plus(+) and the other 5 are minus(-) and each one has been assigned equal weight
initially. The first model tries to classify the data points and generates a vertical separator line
but it wrongly classifies 3 plus(+) as minus(-).
• B2 consists of the 10 data points from the previous model in which the 3 wrongly classified
plus(+) are weighted more so that the current model tries more to classify these pluses(+)
correctly. This model generates a vertical separator line that correctly classifies the previously
wrongly classified pluses(+) but in this attempt, it wrongly classifies three minuses(-).
• B3 consists of the 10 data points from the previous model in which the 3 wrongly classified
minus(-) are weighted more so that the current model tries more to classify these minuses(-)
correctly. This model generates a horizontal separator line that correctly classifies the
previously wrongly classified minuses(-).
• B4 combines together B1, B2, and B3 in order to build a strong prediction model which is much
better than any individual model used.


• Records that are wrongly classified will have their weights increased
• Records that are classified correctly will have their weights decreased

Data ID:             1   2   3   4   5   6   7   8   9   10
Boosting (Round 1):  7   3   2   8   7   9   4   10  6   3
Boosting (Round 2):  5   4   9   4   2   5   1   7   4   2
Boosting (Round 3):  4   4   8   10  4   5   4   6   3   4

• Example 4 is hard to classify


• Its weight is increased, therefore it is more likely to
be chosen again in subsequent rounds

Boosting
• Equal weights (1/N) are assigned to each training instance in the first round
• After a classifier Ci is learned, the weights are
adjusted to allow the subsequent classifier
Ci+1 to “pay more attention” to data that were
misclassified by Ci.
• Final boosted classifier C* combines the votes of
each individual classifier
– Weight of each classifier’s vote is a function of its
accuracy
• Adaboost – popular boosting algorithm


Adaboost (Adaptive Boost)


• Input:
– Training set D containing N instances
– T rounds
– A classification learning scheme
• Output:
– A composite model


• Adaboost: Training Phase


• Training data D contain N labeled data (X1,y1), (X2,y2 ),
(X3,y3),….(XN,yN)
• Initially, assign an equal weight of 1/N to each data point
• To generate T base classifiers, we need T rounds or iterations
• In round i, data from D are sampled with replacement to form Di (of size N)
• Each data point's chance of being selected in the next rounds depends on its weight
– Each time, the new sample is generated directly from the training data D with different sampling probabilities according to the weights; these weights are not zero


• Base classifier Ci is derived from the training data Di
• Error of Ci is tested using Di
• Weights of training data are adjusted depending
on how they were classified
– Correctly classified: Decrease weight
– Incorrectly classified: Increase weight
• Weight of a data indicates how hard it is to
classify it (directly proportional)


Adaboost: Testing Phase


• The lower a classifier's error rate, the more accurate it is, and therefore the higher its weight for voting should be
• Weight of classifier Ci's vote:

  \alpha_i = \frac{1}{2}\ln\frac{1 - \epsilon_i}{\epsilon_i}

  where \epsilon_i is the error rate of Ci

• Testing:
– For each class c, sum the weights of each classifier that assigned
class c to X (unseen data)
– The class with the highest sum is the WINNER!

Example: Error and Classifier Weight in AdaBoost

• Base classifiers: C1, C2, …, CT

• Error rate (i = index of the classifier, j = index of the instance):

  \epsilon_i = \frac{1}{N}\sum_{j=1}^{N} w_j\,\delta\big(C_i(x_j) \neq y_j\big)

• Importance of a classifier:

  \alpha_i = \frac{1}{2}\ln\frac{1 - \epsilon_i}{\epsilon_i}
Example: Data Instance Weight in AdaBoost

• Assume: N training data in D, T rounds, (xj, yj) are the training data, and Ci, αi are the classifier and weight of the i-th round, respectively.
• Weight update on all training data in D:

  w_j^{(i+1)} = \frac{w_j^{(i)}}{Z_i} \times \begin{cases} e^{-\alpha_i} & \text{if } C_i(x_j) = y_j \\ e^{\alpha_i} & \text{if } C_i(x_j) \neq y_j \end{cases}

  where Z_i is a normalization factor chosen so that the updated weights sum to 1.
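A small numerical sketch of the error, importance and weight-update formulas above, assuming labels in {−1, +1}; all values are made up for illustration.

```python
import numpy as np

y_true = np.array([+1, +1, -1, -1, +1])      # true labels (illustrative)
y_pred = np.array([+1, -1, -1, -1, +1])      # predictions of the round-i classifier
w = np.full(len(y_true), 1 / len(y_true))    # current instance weights (sum to 1)

miss = y_pred != y_true
eps = np.sum(w[miss])                        # weighted error rate of C_i
alpha = 0.5 * np.log((1 - eps) / eps)        # importance of the classifier

# Increase weights of misclassified data, decrease weights of correct data.
w = w * np.exp(np.where(miss, alpha, -alpha))
w = w / w.sum()                              # normalize (the Z_i factor)
print(round(eps, 3), round(alpha, 3), np.round(w, 3))
```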
Illustrating AdaBoost: initial weights for each data point (data points for training); figure omitted.

Thank You
