Bagging and Boosting
Ensemble methods are machine learning techniques that combine several base models in order to
produce one optimal predictive model.
This is where ensemble methods come in handy. Rather than relying on one Decision Tree and hoping we made
the right decision at each split, ensemble methods allow us to take a sample of Decision Trees into
account, calculate which features to use or questions to ask at each split, and make a final prediction
based on the aggregated results of the sampled Decision Trees.
https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/
https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f
BAGGing, or Bootstrap AGGregating, gets its name because it combines Bootstrapping and
Aggregation to form one ensemble model. Given a sample of data, multiple bootstrapped subsamples
are pulled, and a Decision Tree is formed on each of the bootstrapped subsamples. After each subsample's
Decision Tree has been formed, an algorithm is used to aggregate over the Decision Trees to form the
most accurate predictor.
In short: given a dataset, bootstrapped subsamples are pulled, a Decision Tree is formed on each
bootstrapped sample, and the results of each tree are aggregated to yield the strongest, most accurate predictor.
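Below is a minimal sketch of that procedure. The toy dataset, the number of trees, and the majority-vote aggregation are assumptions for illustration; none of them come from the sources above.

```python
# BAGGing sketch: bootstrapped subsamples -> one Decision Tree per subsample -> majority vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)  # toy data (assumption)
rng = np.random.default_rng(0)
n_trees = 25
trees = []

for _ in range(n_trees):
    # Bootstrap: sample indices with replacement, same size as the original data
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Aggregate: majority vote over the individual trees' predictions
votes = np.stack([t.predict(X) for t in trees])      # shape: (n_trees, n_samples)
y_pred = (votes.mean(axis=0) >= 0.5).astype(int)     # binary majority vote
print("training accuracy of the bagged ensemble:", (y_pred == y).mean())
```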
Good Resources:
https://towardsdatascience.com/boosting-and-adaboost-clearly-explained-856e21152d3e
https://www.educba.com/bagging-and-boosting/?source=leftnav
https://towardsdatascience.com/ensemble-methods-bagging-boosting-and-stacking-c9214a10a205
Bootstrapping
Confidence Interval for a Proportion.
We take the sample data, copy it over and over, and treat the copies as the population; this is the concept of
the bootstrap. Each bootstrap sample is the same size as the original sample but is drawn from it at random.
The first bootstrap sample came out 24/16; this one comes out 20/20. In StatKey the original sample has a count
of 24 out of 40. We then generate many bootstrap samples and plot the resulting bootstrap distribution.
https://youtu.be/655X9eZGxls
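A small sketch of the bootstrap confidence interval for a proportion described above. The 24-out-of-40 sample comes from the video; the number of resamples and the 95% percentile interval are assumptions for illustration.

```python
# Bootstrap CI for a proportion: resample with replacement, record the proportion each time.
import numpy as np

sample = np.array([1] * 24 + [0] * 16)   # 24 "successes" out of 40 observations
rng = np.random.default_rng(0)

# Each bootstrap sample is the same size as the original and drawn with replacement
props = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(10_000)
])

# A 95% percentile confidence interval for the proportion
lo, hi = np.percentile(props, [2.5, 97.5])
print(f"observed proportion: {sample.mean():.2f}, 95% bootstrap CI: ({lo:.2f}, {hi:.2f})")
```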
Bagging
Bagging refers to Bootstrap Aggregating. There is another way we can build an ensemble of learners:
we can build them using the same learning algorithm but train each learner on a different set of data.
This is what we call bootstrap aggregating, or bagging.
What we do is create a number of subsets of the data; think of them as different bags of data, each one
a subset of the original data. We collect this data randomly: we create M different bags, where each bag
contains n' data instances chosen at random with replacement.
n: number of instances in the original data
M: number of bags
n' < n: about 60% of n, so each bag has about 60% as many training instances
Now we use each of these collections of data to train a different model. We now have M different models,
each one trained on slightly different data. Just as with an ensemble of different learning algorithms,
here we have an ensemble of different models that we query in the same way:
we query each model with the same X and collect all of their outputs.
We take the Y output of each model and take their mean, and that mean is the Y of the ensemble.
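A sketch of that procedure, under assumed choices (a toy regression dataset, decision-tree regressors, M = 20 bags of n' ≈ 0.6·n instances); the source describes the idea but gives no code.

```python
# Bagging: M bags drawn with replacement, one model per bag, mean of outputs as the ensemble's y.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=300)   # toy regression data (assumption)

n, M = len(X), 20
n_prime = int(0.6 * n)                                   # each bag holds ~60% as many instances

models = []
for _ in range(M):
    idx = rng.integers(0, n, size=n_prime)               # chosen at random with replacement
    models.append(DecisionTreeRegressor(max_depth=4).fit(X[idx], y[idx]))

# Query every model with the same X and average their outputs
X_query = np.linspace(-3, 3, 5).reshape(-1, 1)
y_ensemble = np.mean([m.predict(X_query) for m in models], axis=0)
print(y_ensemble)
```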
https://www.udacity.com/course/machine-learning-for-trading--ud501
https://youtu.be/2Mg8QD0F1dQ
Boosting
Boosting is a fairly simple variation on bagging that strives to improve the learners by focusing on areas
where the system is not performing well. One of the most well-known algorithms is AdaBoost,
where "Ada" stands for adaptive.
We build our first bag of data, D1, in the usual way: we select randomly from our training data and train a
model in the usual way. The next thing we do is different: we take all of our training data and use it to test
the model, in order to discover which points, which x's and y's, are not well predicted. There will be some
points in the training data with significant error.
Now, when we build our next bag of data, D2, we again choose randomly from the original data, but each
instance is weighted according to its error. So the points that had significant error are more likely to get
picked to go into D2. We build a model from this data, and when we test it, we again test the training data on
both models, M1 and M2, and again measure the error across the data. We repeat this until we have built the
number of models we want or the number of bags is exhausted.
So, bagging is simply choosing subsets of the data at random with replacement, creating each bag in the
same way. Boosting adds to this idea: in subsequent bags we preferentially choose those data instances
that the overall system so far has modeled poorly. A rough sketch of this is shown below.
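The sketch below follows the boosting-by-resampling description above: each new bag over-samples the training points the current ensemble predicts poorly. The dataset, learner, number of bags, and the simple re-weighting rule are assumptions for illustration; this is not the exact AdaBoost weight-update formula.

```python
# Simplified boosting by resampling: misclassified points get larger sampling weights.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=0)  # toy data (assumption)
rng = np.random.default_rng(0)
n, n_bags = len(X), 10
weights = np.full(n, 1.0 / n)          # start with uniform sampling weights
models = []

for _ in range(n_bags):
    # Build the next bag: points with larger error weights are more likely to be picked
    idx = rng.choice(n, size=n, replace=True, p=weights)
    models.append(DecisionTreeClassifier(max_depth=2).fit(X[idx], y[idx]))

    # Test the ensemble so far on ALL training data and re-weight the misclassified points
    votes = np.mean([m.predict(X) for m in models], axis=0) >= 0.5
    errors = (votes != y).astype(float)
    weights = errors + 0.1              # small floor so every point keeps some chance
    weights /= weights.sum()

ensemble_pred = np.mean([m.predict(X) for m in models], axis=0) >= 0.5
print("training accuracy of the boosted ensemble:", (ensemble_pred == y).mean())
```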
https://www.youtube.com/watch?v=GM3CDQfQ4sw