Lecture 6
Machine Learning

Ensemble Methods in Machine Learning

Presented by:
Dr. Amran Hossain
Associate Professor
CSE, DUET
Ensemble Methods, what are they?
• An ensemble method is a machine learning technique that combines several base models to produce one optimal predictive model. To understand this definition better, let's step back to the ultimate goal of machine learning and model building.
• Ensemble learning is a general meta-approach to machine learning that seeks better predictive performance by combining the predictions from multiple models.
• Ensemble methods aim to improve the accuracy of results by combining multiple models instead of using a single model; the combined models can increase accuracy significantly.
Types of Ensemble Methods
BAGGING
• Bagging, short for bootstrap aggregating, is applied mainly to classification and regression.
• It improves the accuracy of models, typically decision trees, by reducing variance to a large extent. The reduction in variance helps prevent overfitting, which is a challenge for many predictive models.
• Bagging consists of two steps: bootstrapping and aggregation.
• Bootstrapping is a sampling technique in which samples are drawn from the whole population (set) with replacement. Sampling with replacement makes the selection procedure random. The base learning algorithm is then run on each sample (a small sampling sketch follows below).
• Aggregation combines the outcomes of all the individual predictive models, for example by voting or averaging, so that the final prediction takes every model's output into account; without aggregation, predictions would ignore most of the models' outcomes.
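A minimal sketch of bootstrap sampling, assuming a toy NumPy array stands in for the training set (the array, seed, and number of subsamples are illustrative assumptions, not from the slides):

import numpy as np

rng = np.random.default_rng(42)
data = np.arange(10)  # toy "training set" of 10 examples (indices 0..9)

# Draw 3 bootstrap subsamples, each the same size as the original set.
# Sampling WITH replacement means some examples repeat and some are left out.
for b in range(3):
    subsample = rng.choice(data, size=len(data), replace=True)
    print(f"bootstrap sample {b}: {subsample}")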
BAGGING
• BAGGing, or Bootstrap AGGregating, gets its name because it combines Bootstrapping and Aggregation to form one ensemble model.
• Given a sample of data, multiple bootstrapped subsamples are drawn.
• A Decision Tree is fit on each of the bootstrapped subsamples.
• After each subsample's Decision Tree has been formed, the trees' predictions are aggregated (majority vote for classification, averaging for regression) to form the final predictor, as in the sketch below.
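As one possible concrete realization of this process, here is a minimal sketch with scikit-learn's BaggingClassifier (the Iris dataset and the parameter values are illustrative assumptions; the default base learner is a decision tree):

from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 base learners (decision trees by default), each fit on its own bootstrap
# sample of the training set; their votes are aggregated into the final prediction.
bagger = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0)
bagger.fit(X_train, y_train)
print("bagging accuracy:", bagger.score(X_test, y_test))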
BOOSTING
• Boosting is an ensemble technique that learns from previous predictors' mistakes to make better predictions in the future.
• The technique combines several weak base learners to form one strong learner, thus significantly improving the predictive power of models.
• Boosting works by arranging weak learners in a sequence, such that each weak learner learns from the mistakes of the previous learner in the sequence, yielding progressively better predictive models.
• Boosting takes many forms, including gradient boosting, Adaptive Boosting (AdaBoost), and XGBoost (Extreme Gradient Boosting).
• AdaBoost uses weak learners in the form of decision trees, most often trees with a single split, popularly known as decision stumps.
• AdaBoost starts with all observations carrying equal weights; after each stump, the weights of misclassified observations are increased so that the next stump focuses on them (a minimal sketch follows below).
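A minimal AdaBoost sketch with scikit-learn's AdaBoostClassifier (the breast-cancer dataset and the number of estimators are illustrative assumptions):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The default base learner is a depth-1 decision tree (a decision stump).
# After each round, misclassified samples receive larger weights, so the
# next stump focuses on the examples the ensemble still gets wrong.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(X_train, y_train)
print("AdaBoost accuracy:", ada.score(X_test, y_test))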
BOOSTING
• Gradient boosting adds predictors sequentially to the ensemble, with each new predictor correcting its predecessor, thereby increasing the model's accuracy. New predictors are fit to counter the errors of the previous predictors.
• The key property of boosting ensembles is the idea of correcting prediction errors. The models are fit and added to the ensemble sequentially, such that the second model attempts to correct the predictions of the first model, the third corrects the second, and so on.
• Gradient descent helps the gradient booster identify the errors in the learners' predictions and counter them accordingly (see the sketch below).
• XGBoost is an implementation of gradient-boosted decision trees that provides improved speed and performance.
• Gradient boosting relies heavily on computational speed and on the performance of the base model; because model training must follow a sequence, gradient boosted machines can be slow to train.
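A minimal gradient boosting sketch with scikit-learn's GradientBoostingClassifier (the synthetic dataset and the parameter values are illustrative assumptions):

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Trees are added one at a time; each new tree is fit to the gradient of the
# loss (roughly, the residual errors) of the ensemble built so far, so it
# corrects the mistakes of the predictors that came before it.
gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbm.fit(X_train, y_train)
print("gradient boosting accuracy:", gbm.score(X_test, y_test))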
Stacking
• Stacking, another ensemble method, is often referred to as stacked generalization. It works by training a learning algorithm to combine the predictions of several other learning algorithms. Stacking has been successfully applied to regression, density estimation, distance learning, and classification. It can also be used to measure the error rate involved in bagging.
• Stacked Generalization, or stacking for short, is an ensemble method that seeks a diverse group of members by varying the model types fit on the training data and then uses another model to combine their predictions.
• Stacking is a general procedure in which a learner is trained to combine the individual learners. The individual learners are called the first-level learners, while the combiner is called the second-level learner, or meta-learner (see the sketch below).
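A minimal stacking sketch with scikit-learn's StackingClassifier, using two different first-level learners and a logistic-regression meta-learner (the dataset and the choice of learners are illustrative assumptions):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# First-level (base) learners of different types produce diverse predictions.
base_learners = [("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
                 ("svm", SVC(probability=True, random_state=0))]

# The second-level learner (meta-learner) is trained on the base learners'
# predictions and decides how to combine them.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_train, y_train)
print("stacking accuracy:", stack.score(X_test, y_test))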
Random Forest
• Random Forest is a popular machine learning algorithm that belongs to the supervised learning techniques. It can be used for both Classification and Regression problems in ML.
• It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
• As the name suggests, "Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset."
• Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of those predictions, outputs the final prediction.
Random Forest
Assumptions for Random Forest
• Since the random forest combines multiple trees to predict the class of the dataset, it is possible that some decision trees predict the correct output while others do not. Together, however, the trees as a group predict the correct output. Therefore, below are two assumptions for a better Random Forest classifier:
• There should be some actual values in the feature variables of the dataset, so that the classifier can predict accurate results rather than guessed results.
• The predictions from the individual trees must have very low correlations with each other.
Random Forest
• Why use Random Forest?
• Below are some points that explain why we should use the Random Forest algorithm:
• It takes less training time than many other algorithms.
• It predicts output with high accuracy, and it runs efficiently even on large datasets.
• It can also maintain accuracy when a large proportion of the data is missing.
How does Random Forest algorithm work?
• Random Forest works in two phases: the first is to create the random forest by combining N decision trees, and the second is to make a prediction for each tree created in the first phase.
• The working process can be explained in the steps below (a minimal sketch follows the steps):
• Step-1: Select K random data points from the training set.
• Step-2: Build the decision trees associated with the selected data points (subsets).
• Step-3: Choose the number N of decision trees you want to build.
• Step-4: Repeat Steps 1 and 2 until N trees have been built.
• Step-5: For a new data point, find the prediction of each decision tree and assign the new data point to the category that wins the majority vote.
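A minimal Random Forest sketch with scikit-learn's RandomForestClassifier (the Iris dataset and the parameter values are illustrative assumptions); N from the steps above corresponds to n_estimators:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# N = 100 trees; each tree is grown on a bootstrap sample of the training set
# and considers a random subset of features at every split.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# For a new data point, every tree votes and the majority class wins.
print("predicted class:", forest.predict(X_test[:1]))
print("random forest accuracy:", forest.score(X_test, y_test))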
Example of Random Forest
• Suppose there is a dataset that contains multiple fruit images, and this dataset is given to the Random Forest classifier. The dataset is divided into subsets, and each subset is given to one decision tree. During the training phase, each decision tree produces a prediction result; when a new data point arrives, the Random Forest classifier predicts the final decision based on the majority of those results.

Applications of Random Forest


There are four main sectors where Random Forest is mostly used:
1. Banking: The banking sector mostly uses this algorithm to identify loan risk.
2. Medicine: With the help of this algorithm, disease trends and disease risks can be identified.
3. Land Use: Areas of similar land use can be identified with this algorithm.
4. Marketing: Marketing trends can be identified using this algorithm.
Random Forest (RF) Example Using a Dataset
• Dataset (figure)
• Random selection of features (figure)
• Drawing a random decision tree (figure)
• Classification prediction with a new dataset: prediction class from each tree (figure)
Random Forest (RF)
• Calculating and combining the majority vote:
• Tree predictions: (1, 0, 1, 1)
• The majority vote is 1, so the final class is 1.
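A tiny sketch of the majority-vote step, assuming the four tree predictions (1, 0, 1, 1) from above:

from collections import Counter

tree_predictions = [1, 0, 1, 1]  # one prediction per tree
final_class = Counter(tree_predictions).most_common(1)[0][0]
print("final class:", final_class)  # 1, since three of the four trees voted 1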


• Advantages of Random Forest
• Random Forest is capable of performing both Classification and Regression tasks.
• It is capable of handling large datasets with high dimensionality.
• It enhances the accuracy of the model and helps prevent overfitting.
• Powerful and highly accurate.
• No need to normalize the features.
• Can handle many features at once.
• Trees can be trained in parallel.
• Produces good predictions that are easy to understand.

• Disadvantages of Random Forest
• Although random forest can be used for both classification and regression tasks, it is not as well suited to regression tasks.
1. It can sometimes be biased toward certain features.
2. Slow: one of the major disadvantages of random forest is that, due to the presence of a large number of trees, the algorithm can become quite slow and ineffective for real-time predictions.
3. It cannot be used for linear methods.
4. It performs worse on high-dimensional data.
5. Since random forest is a predictive modeling tool and not a descriptive one, other methods are a better choice if you are trying to describe the relationships in your data.
Difference between Decision Tree and Random Forest
