15 AdaBoost
Introduction to Ensemble Techniques
● Ensemble methods are techniques that create multiple models and then combine them to produce improved results.
● Ensemble methods usually produce more accurate solutions than a single model would.
Let's understand the concept of an ensemble technique with an example. Suppose you are the director of a company and you have created a new product on a very important and interesting topic. Now, you want to take preliminary feedback (ratings) on the product before making it public. What are the possible ways by which you can do that?
A: You may ask one of your friends to rate the product for you.
Now it's entirely possible that the person you have chosen loves you very much and doesn't want to break your heart by providing a 1-star rating to the horrible work you have created.
B: Another way could be to ask a large, diverse group of people, say 50 colleagues and strangers with different backgrounds, to rate the product.
The responses, in this case, would be more generalized and practical, since now you have people with different sets of skills.
And as it turns out, this is a better approach to getting honest ratings than the previous case.
From this example we can conclude that a diverse group of people is likely to make better decisions than a single individual.
Similarly, a diverse group of models tends to produce better results than a single model. This diversification is what ensemble techniques exploit.
Simple Ensemble Techniques
1. Max Voting
2. Averaging
3. Weighted Averaging
Max voting: The max voting method is generally used for classification problems. In this technique, multiple models are used to make predictions for each data point. The prediction of each model is treated as a 'vote', and the prediction made by the majority of the models is used as the final prediction.
Suppose there are five people giving a rating for a movie. Three of them rate it 4 and two rate it 5; since the majority rating is 4, the final rating is taken as 4.
Ratings: 5, 4, 4, 5, 4 → Final rating (mode): 4
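A minimal sketch of max voting in Python; the five votes below are the hypothetical per-model predictions from the example above:

from collections import Counter

# Hypothetical predictions (votes) from five models for one data point.
votes = [5, 4, 4, 5, 4]

# The final prediction is the most common vote (the mode).
final_prediction = Counter(votes).most_common(1)[0][0]
print(final_prediction)  # 4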
Averaging: Similar to the max voting technique, multiple predictions are made for each data point in averaging. In this method, we take an average of the predictions from all the models and use it as the final prediction.
Averaging can be used for making predictions in regression problems or while calculating probabilities in classification problems.
Ratings: 5, 4, 4, 5, 4 → Final rating (average): 4.4
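A minimal sketch of averaging in Python, using the same hypothetical ratings:

ratings = [5, 4, 4, 5, 4]

# The final prediction is the mean of all the models' predictions.
final_prediction = sum(ratings) / len(ratings)
print(final_prediction)  # 4.4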
Weighted Average: This is an extension of the averaging method. All models are assigned different weights, defining the importance of each model for the prediction.
Ratings: 5, 4, 4, 5, 4 → Final rating (weighted average): 4.41
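A minimal sketch of the weighted average; the weights below (0.23 for the first two raters, 0.18 for the remaining three) are an assumption chosen so that the result matches the 4.41 above:

ratings = [5, 4, 4, 5, 4]
weights = [0.23, 0.23, 0.18, 0.18, 0.18]  # assumed weights; they sum to 1

# Each rating contributes in proportion to its model's weight.
final_prediction = sum(r * w for r, w in zip(ratings, weights))
print(round(final_prediction, 2))  # 4.41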
Advanced Ensemble Techniques
There are several advanced ensemble techniques, but bagging and boosting are the most widely used.
Bagging: Bootstrapping is a sampling technique in which we create subsets of observations from the original dataset, with replacement. The size of each subset is the same as the size of the original set.
The bagging (Bootstrap Aggregating) technique uses these subsets (bags) to get a fair idea of the distribution of the complete set. In practice, the subsets created for bagging may also be smaller than the original set.
Steps of Bagging
1. Multiple subsets are created from the original dataset, selecting observations with replacement.
2. A base model (weak model) is created on each of these subsets.
3. The models run in parallel and are independent of each other.
4. The final predictions are determined by combining the predictions from all the models.
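A minimal sketch of these steps using scikit-learn's BaggingClassifier; the dataset and parameter values are illustrative assumptions (in scikit-learn 1.2+ the first parameter is named estimator; older versions call it base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic dataset.
X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each of the 10 trees is trained independently on a bootstrap sample.
bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # the base (weak) model
    n_estimators=10,
    bootstrap=True,  # sample with replacement
    random_state=42,
)
bag.fit(X_train, y_train)
print(bag.score(X_test, y_test))  # predictions combined across all trees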
Boosting
Suppose a data point is incorrectly predicted by the first model, and then by the next model as well (probably by all models). Will combining the predictions still provide better results? Such situations are taken care of by boosting.
Boosting is a sequential process, where each subsequent model attempts to correct the errors of the previous model; the succeeding models are dependent on the previous ones.
Thus, the boosting technique combines a number of weak learners into a single strong learner. The AdaBoost steps below show one concrete way this works.
AdaBoost algorithm
● AdaBoost is short for Adaptive Boosting. It was the first really successful boosting algorithm developed for binary classification.
● AdaBoost is typically used with short decision trees (decision stumps).
● Multiple sequential models are created, each correcting the errors of the previous model.
● It assigns higher weights to observations that were incorrectly predicted, so that later models focus on predicting them correctly.
Steps to perform AdaBoost
● Initially, all data points in the dataset have equal weights.
● A model is built on a subset of the data.
● Using this model, predictions are made for the whole dataset.
● Errors are calculated by comparing the predicted values with the actual values.
● While creating the next model, higher weights are assigned to the data points that were predicted incorrectly, so that the next model focuses on them.
● This process is repeated until the error function stops changing or the maximum number of estimators is reached.
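Below is a minimal from-scratch sketch of these steps for binary labels in {-1, +1}, using decision stumps as the weak learners. It is an illustrative implementation of the classic (discrete) AdaBoost update, not a production one:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_estimators=10):
    # y: NumPy array of labels in {-1, +1}
    n = len(y)
    w = np.full(n, 1.0 / n)                   # step 1: equal initial weights
    stumps, alphas = [], []
    for _ in range(n_estimators):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)      # step 2: fit a weak model on weighted data
        pred = stump.predict(X)               # step 3: predict on the whole dataset
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # step 4: weighted error
        alpha = 0.5 * np.log((1 - err) / err) # importance of this model
        w *= np.exp(-alpha * y * pred)        # step 5: up-weight the misclassified points
        w /= w.sum()                          # renormalize the weights
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # Final prediction: weighted vote of all the weak learners.
    return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))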
Parameters of the AdaBoost algorithm
● base_estimator: Specifies the base estimator, that is, the machine learning algorithm to be used as the weak learner.
● n_estimators: Defines the number of base estimators (50 by default in scikit-learn's AdaBoostClassifier); a higher value can give better results.
● learning_rate: Shrinks the contribution of each model at each boosting iteration. There is a trade-off between learning_rate and n_estimators.
● max_depth: Defines the maximum depth of the individual base estimator (e.g., the decision tree); it is a parameter of the base estimator rather than of AdaBoost itself. Tune it for best performance.
● random_state: An integer value that fixes the random number generation. A definite value of random_state will always produce the same results given the same parameters and training data.
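A minimal sketch of these parameters with scikit-learn's AdaBoostClassifier; the dataset and parameter values are illustrative assumptions (in scikit-learn 1.2+ the first parameter is named estimator; older versions call it base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic dataset.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # short tree; max_depth is set on the base estimator
    n_estimators=100,    # number of base estimators
    learning_rate=0.5,   # trade-off with n_estimators
    random_state=0,      # reproducible results
)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))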
Advantages of AdaBoost
● It is easy to use, since it has relatively few parameters to tune.
● AdaBoost is relatively resistant to overfitting.
● It is used to improve the accuracy of weak classifiers.
● It extends beyond binary classification and has found use cases in text and image classification.
Disadvantages of AdaBoost
● AdaBoost learns progressively, so it is necessary to ensure the quality of the data.
● AdaBoost is sensitive to noisy data and outliers, so these should be removed before model training.
● It is slower compared to some other boosting algorithms.
Thank You...