
Ensemble Learning in Machine Learning
- Anuradha Srinivasaraghavan
Email: [email protected]
Mob: 9967534067
Contents
• About Supervised Learning Algorithms
• Common Classifier models
• Bias-Variance tradeoff
• Why Ensemble Learning
• What is Ensemble Learning
• Types of Ensemble Learning
– Bagging
– Boosting
– Stacking
• Conclusion
About Supervised Learning Algorithms
Machine Learning
• Machine Learning is about turning data into information
• It lies at the intersection of Computer Science, Engineering, and Statistics
Machine Learning Methods
• ML tasks are generally classified into two broad categories
• Supervised learning: trains algorithms on example input and output data that has been labeled by humans
• Unsupervised learning: gives the algorithm unlabeled data and lets it find structure within its input data on its own
What are Supervised Learning Algorithms?
• Supervised learning algorithms use datasets with labeled outputs for all the inputs
• A part of the dataset (typically 3/4) is used for training the model
• The model is then validated on the test set (the remaining 1/4)
• Based on its accuracy on the test set, the model can then be used for prediction purposes (a code sketch of this workflow follows)
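A minimal sketch of this split-and-score workflow, assuming scikit-learn and its bundled iris dataset (the slides name neither a library nor a dataset):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labeled dataset: every input row in X has a known output label in y.
X, y = load_iris(return_X_y=True)

# Hold out 1/4 of the data for testing; train on the remaining 3/4.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Any classifier would do here; a decision tree is used for illustration.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Accuracy on the held-out test set decides whether to use the model.
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))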
Prominent Supervised Learning Algorithms
• Based on the type of output, there are two kinds of supervised learning algorithms
• If the output is categorical, the model is called a classification model (a classifier)
• If the output is continuous, it is called a regression model
Common classifier models
• Decision Tree
• Support Vector Machine
• Logistic regression
• Artificial Neural networks
• Naïve Bayes Classifier
Decision Tree
• A decision tree is a flowchart-like tree
structure, where
– each internal node (non-leaf node) denotes a test
on an attribute,
– each branch represents an outcome of the test,
and
– each leaf node (or terminal node) holds a class
label.
• The topmost node in a tree is the root node.
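As a concrete illustration of this flowchart structure, here is a sketch assuming scikit-learn and the iris dataset (neither of which the slides specify):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed flowchart readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# In the printout, each "feature <= threshold" line is an internal node
# (a test on an attribute); each "class: ..." line is a leaf node
# holding a class label. The first line is the root node.
print(export_text(tree, feature_names=list(iris.feature_names)))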
Support Vector Machine (SVM)
• SVM is a supervised machine learning
algorithm for both classification and
regression.
• The goal of a support vector machine is to find
the optimal separating hyperplane which
maximizes the margin of the training data
• Consider a plot of the height and weight of several people, used to distinguish between men and women.
• Using an SVM we can answer the following question:
• Given a particular data point (weight and height), is the person a man or a woman?
• For instance: if someone is 175 cm tall and weighs 80 kg, is that person a man or a woman?
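A hedged sketch of this height/weight example, with invented data values and scikit-learn's SVC (none of which come from the slides):

import numpy as np
from sklearn.svm import SVC

# Illustrative (invented) training data: [height_cm, weight_kg].
X = np.array([[160, 55], [165, 60], [158, 50], [170, 62],
              [180, 80], [175, 75], [185, 90], [178, 84]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = woman, 1 = man

# A linear kernel finds the maximum-margin separating hyperplane.
clf = SVC(kernel="linear").fit(X, y)

# The question from the slide: 175 cm and 80 kg -> man or woman?
print("man" if clf.predict([[175.0, 80.0]])[0] == 1 else "woman")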
What is a separating hyperplane?
• A separating hyperplane is the decision boundary that divides the two classes; in two dimensions it is simply a line, and the SVM picks the one with the largest margin to the nearest training points (the support vectors).
Logistic Regression
• Logistic regression is a classification technique used to find the probability that a data point belongs to one class or the other
• The logit function and maximum likelihood estimation are used to compute the probability of the output class
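A minimal sketch of class-probability estimation with logistic regression, assuming scikit-learn and its breast-cancer dataset (both assumptions, not from the slides):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# The model is fitted by maximum likelihood; max_iter is raised so the
# solver converges on this unscaled data.
clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# predict_proba returns P(class 0) and P(class 1) for each sample.
print(clf.predict_proba(X_test[:3]))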
Artificial neural networks
• Artificial neural networks classify inputs by passing them through layers of weighted connections and non-linear activation functions, with the weights learned from the labeled training data.
Accuracy of the models
• The accuracy of the model is the percentage of test-set examples whose outputs are classified correctly
Bias Variance Tradeoff
Under-fitting and Over-fitting
Bias Variance Tradeoff
• If our model is too simple and has very few parameters, it may have high bias and low variance
• If our model has a large number of parameters, it is going to have high variance and low bias
• A good balance has to be found without over-fitting or under-fitting the data.
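The tradeoff can be made concrete with a small sketch (the data, polynomial degrees, and library are all assumptions, not from the slides): a degree-1 fit under-fits (high bias), a degree-15 fit over-fits (high variance), and a middle degree balances the two.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)
X_train, X_test, y_train, y_test = X[::2], X[1::2], y[::2], y[1::2]

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    # Over-fitting shows up as a low train error but a high test error;
    # under-fitting shows up as both errors being high.
    print(degree,
          round(mean_squared_error(y_train, model.predict(X_train)), 4),
          round(mean_squared_error(y_test, model.predict(X_test)), 4))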
Ensemble Learning
What is Ensemble Learning?
• Ensemble learning is the art of combining a diverse set of learners (individual models) to improve the stability and predictive power of the overall model.
• The way all the predictions are combined together is what is termed Ensemble Learning.
Example of Ensemble Learning
• Before investing in company XYZ, a person asks for advice on whether the stock price will increase by more than 6% in the next 6 months. The advisors, each right a different fraction of the time, form a diverse set of learners:
– Employee of XYZ: 70%
– Financial advisor of XYZ: 75%
– Stock market trader: 70%
– Employee of a competitor: 60%
– Market research team: 70%
– Social media expert: 65%
• If all six give the same advice, the probability that every one of them is wrong is 0.30 × 0.25 × 0.30 × 0.40 × 0.25 × 0.35 = 0.0007875, so the combined advice is right with probability 1 − 0.0007875 = 0.9992125 ≈ 99.92%.
• Keywords: combining diverse learners.
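A quick check of this arithmetic in plain Python, just to verify the numbers above:

# Each advisor's chance of being wrong (1 minus their accuracy).
error_rates = [0.30, 0.25, 0.30, 0.40, 0.25, 0.35]

p_all_wrong = 1.0
for p in error_rates:
    p_all_wrong *= p

print(p_all_wrong)      # 0.0007875
print(1 - p_all_wrong)  # 0.9992125 -> ~99.92%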
Why are models different?
• Ensemble learning is useful when different classifier models have different levels of accuracy. Some possible reasons for the differences are:
– Difference in population
– Difference in hypothesis
– Difference in modeling technique
– Difference in initial seed
Types of Ensemble Learning
• If several weak learners are combined, we get a strong learner
• There are different techniques for combining them; based on the technique, there are three types of ensemble learning:
– Bagging
– Boosting
– Stacking
Bagging (or Bootstrap AGGregating)
• Bagging considers homogeneous weak learners, trains them independently of each other in parallel, and combines them following some kind of deterministic averaging process
• A lower-variance model is obtained
– Regression models: averaging
– Classifier models: voting
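A minimal bagging sketch, assuming scikit-learn's BaggingClassifier and the iris data (the slides name neither):

from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 homogeneous weak learners (decision trees by default), each trained
# in parallel on its own bootstrap sample; class predictions are
# combined by voting.
bag = BaggingClassifier(n_estimators=50, random_state=0)
bag.fit(X_train, y_train)
print("Bagging test accuracy:", bag.score(X_test, y_test))

For regression, the analogous BaggingRegressor averages the base models' outputs instead of voting.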
Boosting
• Boosting is a sequential method that considers homogeneous weak learners, trains them one after another in a very adaptive way (each base model depends on the previous ones), and combines them following a deterministic strategy
• Lower-bias models are obtained
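A minimal boosting sketch under the same assumptions (scikit-learn, iris data), using AdaBoost as one concrete boosting algorithm:

from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Shallow trees are trained one after another; each new tree gives more
# weight to the examples the previous trees got wrong.
boost = AdaBoostClassifier(n_estimators=50, random_state=0)
boost.fit(X_train, y_train)
print("Boosting test accuracy:", boost.score(X_test, y_test))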
Stacking
• Stacking considers heterogeneous weak learners, trains them in parallel, and combines them by training a meta-model that outputs a prediction based on the different weak models' predictions
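A minimal stacking sketch (again assuming scikit-learn and iris): two heterogeneous base learners feed a logistic-regression meta-model.

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Heterogeneous base learners, trained in parallel ...
base_learners = [("tree", DecisionTreeClassifier(random_state=0)),
                 ("svm", SVC(random_state=0))]

# ... and a meta-model trained on their predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_train, y_train)
print("Stacking test accuracy:", stack.score(X_test, y_test))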
Conclusion
• Ensemble learning trains multiple models to solve the same problem and combines them to get better performance
• The hypothesis is that if weak learners are combined the right way, robust models can be obtained
• In bagging, instances of weak models are trained in parallel and their outputs are aggregated
Conclusion Contd…
• In boosting, the models are trained in sequence, with each model concentrating on the examples the previous ones classified wrongly
• Stacking works on heterogeneous models and builds a meta-model to predict outputs based on the outputs returned by the base models.
