
Machine Learning

By: Ammara Yaseen


Agenda:
1. Overfitting and Underfitting
2. Ensemble Learning
3. Bagging and Boosting
4. Random Forest
5. OOB Evaluation and error
Overfitting
• A statistical model is said to be overfitted when it performs well on the training data but does not make accurate predictions on testing data.
• When a model is trained too long or with too much capacity, it starts learning from the noise and inaccurate entries in our data set.
• Broadly speaking, overfitting means our training has focused so much on the particular training set that the model has missed the general pattern. As a result, the model is not able to adapt to new data because it is too closely tied to the training set.
• Ways to avoid overfitting include using a linear algorithm if we have linear data, or constraining parameters such as the maximal depth if we are using decision trees.
Overfitting

• Reasons for Overfitting:
• High variance and low bias.
• The model is too complex.
• The size of the training data is not enough.

• Techniques to Reduce Overfitting
• Increase the training data.
• Reduce model complexity (see the sketch after this list).
• Early stopping during the training phase (monitor the loss over the training period and stop training as soon as the loss begins to increase).
• Ridge regularization and Lasso regularization.
• Use dropout for neural networks to tackle overfitting.
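
As a rough illustration of reducing model complexity, the sketch below (an assumption for illustration, not part of the original slides) limits the depth of a scikit-learn decision tree; the shallower tree usually generalizes better than an unconstrained one.

# Minimal sketch: limiting tree depth to reduce overfitting (illustrative only)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)                  # no depth limit
shallow_tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)  # constrained depth

print("unconstrained:", deep_tree.score(X_train, y_train), deep_tree.score(X_test, y_test))
print("max_depth=4  :", shallow_tree.score(X_train, y_train), shallow_tree.score(X_test, y_test))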
Underfitting
• Underfitting, on the other hand, means the model
has not captured the underlying logic of the data. It
doesn’t know what to do with the task we’ve given it
and, therefore, provides an answer that is far from
correct.
Underfitting

Reasons for Underfitting

• The model is too simple, so it may not be capable of representing the complexities in the data.
• The input features used to train the model are not adequate representations of the underlying factors influencing the target variable.
• The size of the training dataset used is not enough.
• Excessive regularization is used to prevent overfitting, which constrains the model from capturing the data well.
• Features are not scaled.
Techniques to Reduce Underfitting
• Increase model complexity (a short sketch follows this list).
• Increase the number of features by performing feature engineering.
• Remove noise from the data.
• Increase the number of epochs or the duration of training to get better results.
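
As one hedged illustration of increasing model complexity (the dataset and models below are assumptions, not from the original slides), adding polynomial features lets a linear model fit a curved relationship it would otherwise underfit.

# Minimal sketch: adding polynomial features to reduce underfitting (illustrative only)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)   # quadratic target

plain = LinearRegression().fit(X, y)                                           # too simple: underfits
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

print("linear R^2    :", plain.score(X, y))
print("polynomial R^2:", poly.score(X, y))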
Good Fit

Ideally, a model that makes predictions with zero error is said to have a good fit on the data. This situation is achievable at a spot between overfitting and underfitting. To understand it, we have to look at the performance of our model over time, while it is learning from the training dataset.

As time passes, our model keeps learning, and the error on the training and testing data keeps decreasing. If it learns for too long, the model becomes more prone to overfitting due to the presence of noise and less useful details, and the performance of the model decreases. In order to get a good fit, we stop at a point just before the testing error starts increasing (a rough sketch follows). At this point, the model is said to perform well on the training dataset as well as on our unseen testing dataset.
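
The stopping rule described above can be sketched roughly as follows (an illustrative assumption, not the slides' own code): train incrementally and stop once the validation error stops improving.

# Minimal sketch: stop training just before validation error starts increasing (illustrative only)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = SGDClassifier(loss="log_loss", random_state=0)   # logistic loss; older scikit-learn versions use loss="log"
best_error, patience = np.inf, 0
for epoch in range(100):
    model.partial_fit(X_train, y_train, classes=np.unique(y))
    val_error = 1.0 - model.score(X_val, y_val)
    if val_error < best_error:
        best_error, patience = val_error, 0
    else:
        patience += 1
        if patience >= 5:          # validation error has stopped improving: stop here
            break
print("stopped at epoch", epoch, "with validation error", round(best_error, 3))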
Overfitting, Appropriate Fitting, Underfitting
Ensemble Learning

Ensemble learning is a machine learning technique that enhances accuracy and resilience in
forecasting by merging predictions from multiple models. It aims to mitigate errors or biases
that may exist in individual models by leveraging the collective intelligence of the ensemble.

The underlying concept behind ensemble learning is to combine the outputs of diverse
models to create a more precise prediction. By considering multiple perspectives and utilizing
the strengths of different models, ensemble learning improves the overall performance of
the learning system. This approach not only enhances accuracy but also provides resilience
against uncertainties in the data. By effectively merging predictions from multiple models,
ensemble learning has proven to be a powerful tool in various domains, offering more robust
and reliable forecasts.
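
A hedged example of combining the outputs of diverse models (the specific models and dataset below are assumptions for illustration) is a simple voting ensemble:

# Minimal sketch: combining diverse models with majority voting (illustrative only)
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",                     # each model casts one vote per sample
)
ensemble.fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))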
Bagging and Boosting

Bagging is an Ensemble Learning technique which aims to reduce learning error through the implementation of a set of homogeneous machine learning algorithms. The key idea of bagging is the use of multiple base learners which are trained separately on random samples from the training set and which, through a voting or averaging approach, produce a more stable and accurate model.
Bagging and Boosting

The two main components of the bagging technique are random sampling with replacement (bootstrapping) and the set of homogeneous machine learning algorithms (ensemble learning). The bagging process is quite easy to understand: first, “n” subsets are extracted from the training set, then these subsets are used to train “n” base learners of the same type. To make a prediction, each of the “n” learners is fed the test sample, and the outputs of the learners are averaged (in the case of regression) or voted on (in the case of classification); a code sketch follows below.
Whole bagging process
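
A minimal sketch of the process just described, assuming scikit-learn's BaggingClassifier and a synthetic dataset (both assumptions for illustration): each base tree is trained on a bootstrap sample and the ensemble votes on the final class.

# Minimal sketch: bagging with bootstrap samples and majority voting (illustrative only)
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),   # "n" homogeneous base learners (older versions call this base_estimator)
    n_estimators=50,                      # n = 50 bootstrap subsets, one tree per subset
    bootstrap=True,                       # sample with replacement
    random_state=0,
)
bagging.fit(X_train, y_train)
print("bagging accuracy:", bagging.score(X_test, y_test))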
Boosting

Boosting is an Ensemble Learning technique that, like bagging, makes use of a set of base learners to improve the stability and effectiveness of an ML model. The idea behind a boosting architecture is the generation of sequential hypotheses, where each hypothesis tries to improve or correct the mistakes made by the previous one.

The central idea of boosting is the implementation of homogeneous ML algorithms in a sequential way, where each of these ML algorithms tries to improve the stability of the model by focusing on the errors made by the previous ML algorithm. The way in which the errors of each base learner are taken into account by the next base learner in the sequence is the key differentiator between the variations of the boosting technique.
Boosting
(AdaBoost diagram: datasets B1, B2, B3 and the combined model B4)
The Explanation for Training the Boosting Model:

The diagram above explains the AdaBoost algorithm in a very simple way. Let’s try to understand it in a stepwise process:

• B1 consists of 10 data points of two types, plus(+) and minus(-): 5 are plus(+) and the other 5 are minus(-), and each one is initially assigned equal weight. The first model tries to classify the data points and generates a vertical separator line, but it wrongly classifies 3 plus(+) as minus(-).

• B2 consists of the 10 data points from the previous model, in which the 3 wrongly classified plus(+) are weighted more, so that the current model tries harder to classify these pluses(+) correctly. This model generates a vertical separator line that correctly classifies the previously wrongly classified pluses(+), but in this attempt it wrongly classifies three minuses(-).

• B3 consists of the 10 data points from the previous model, in which the 3 wrongly classified minus(-) are weighted more, so that the current model tries harder to classify these minuses(-) correctly. This model generates a horizontal separator line that correctly classifies the previously wrongly classified minuses(-).

• B4 combines B1, B2, and B3 in order to build a strong prediction model which is much better than any individual model used; a short code sketch follows.
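
A hedged sketch of the same idea with scikit-learn's AdaBoostClassifier (the dataset and settings below are illustrative assumptions): each shallow tree is fitted to a re-weighted version of the data that emphasizes the previous tree's mistakes.

# Minimal sketch: AdaBoost with shallow trees as sequential base learners (illustrative only)
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # a "stump", like the single separator lines in B1-B3
    n_estimators=50,                                # 50 sequential hypotheses (older versions call this base_estimator)
    random_state=0,
)
boosting.fit(X_train, y_train)
print("AdaBoost accuracy:", boosting.score(X_test, y_test))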
Boosting

Disadvantages of Boosting Algorithms

Boosting algorithms also have some disadvantages; these are:
• Boosting algorithms are vulnerable to outliers.
• It is difficult to use boosting algorithms for real-time applications.
• They are computationally expensive for large datasets.
Random Forest
Random Forest is a technique that uses ensemble learning: it combines many weak classifiers to provide solutions to complex problems.
As the name suggests, a random forest consists of many decision trees. Rather than depending on one tree, it takes the prediction from each tree and, based on the majority vote of those predictions, predicts the final output.

The main difference between a single decision tree and a Random Forest is that Random Forest is a bagging method that uses subsets of the original dataset to make predictions, and this property of Random Forest helps to overcome overfitting. Instead of building a single decision tree, Random Forest builds a number of decision trees with different sets of observations. One big advantage of this algorithm is that it can be used for classification as well as regression problems.
Random Forest

Steps involved in the Random Forest Algorithm

Step 1 – We first make subsets of our original data. We perform row sampling and feature sampling, meaning we select rows and columns with replacement and create subsets of the training dataset.

Step 2 – We create an individual decision tree for each subset we take.

Step 3 – Each decision tree gives an output.

Step 4 – The final output is decided by majority voting if it is a classification problem, and by averaging if it is a regression problem (a short sketch follows these steps).
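
These steps correspond to what scikit-learn's RandomForestClassifier performs internally; the snippet below is a minimal, assumed illustration rather than the slides' own code.

# Minimal sketch: Random Forest with row and feature sampling handled internally (illustrative only)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,        # Steps 1-2: 100 bootstrap subsets, one tree per subset
    max_features="sqrt",     # feature sampling: each split considers a random subset of columns
    bootstrap=True,          # row sampling with replacement
    random_state=0,
)
forest.fit(X_train, y_train)
print("random forest accuracy:", forest.score(X_test, y_test))  # Steps 3-4: trees vote, majority wins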
OOB (Out Of Bag) Evaluation

Since sampling is done with replacement, about one-third of the data is not used to train each tree, and this data is called the out-of-bag samples.

To get the OOB evaluation we need to set a parameter called oob_score to True. The score we get from the OOB samples and the score on the test dataset are roughly the same. In this way, we can use these left-out samples to evaluate our model (see the sketch below).
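
A minimal sketch of OOB evaluation, assuming the same kind of illustrative dataset as above: with oob_score=True, scikit-learn scores each training sample using only the trees that did not see it during training.

# Minimal sketch: out-of-bag evaluation with oob_score=True (illustrative only)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
forest.fit(X_train, y_train)

print("OOB score :", round(forest.oob_score_, 3))                 # estimated from the left-out samples
print("test score:", round(forest.score(X_test, y_test), 3))      # usually close to the OOB score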
