Ensemble, Random Forest
Nidhin Pattaniyil
Table of Contents
- Voting Classifier
- Bagging and Pasting
- Random Forests
- Boosting
- Stacking
Introduction
- Ensemble: group of predictors
- Ensemble Learning: aggregate predictions of a group of predictors
- Ensemble method: ensemble learning algorithm
- Ensemble Methods:
- Bagging
- Boosting
- Stacking
- Ensembles work best when the predictors are as independent from one another as possible
Voting Classifiers
- Hard Voting Classifier: predict the class that gets the most votes
Voting Classifiers: Soft Voting
- Example: clf1 -> [0.2, 0.8], clf2 -> [0.1, 0.9], clf3 -> [0.8, 0.2]
- With equal weights (0.33 each), the class probabilities are averaged as follows:
- Prob of Class 0 = 0.33*0.2 + 0.33*0.1 + 0.33*0.8 = 0.363
- Prob of Class 1 = 0.33*0.8 + 0.33*0.9 + 0.33*0.2 = 0.627
- The probability predicted by the ensemble classifier is [36.3%, 62.7%], so it predicts Class 1 (see the sketch below)
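The weighted averaging above can be reproduced in a few lines of NumPy; this is only an illustrative sketch of the arithmetic, not how scikit-learn implements soft voting internally:

```python
import numpy as np

# Predicted probabilities for [Class 0, Class 1] from each classifier
probas = np.array([
    [0.2, 0.8],  # clf1
    [0.1, 0.9],  # clf2
    [0.8, 0.2],  # clf3
])
weights = np.array([0.33, 0.33, 0.33])  # equal weights, as on the slide

ensemble_proba = weights @ probas  # weighted sum of class probabilities
print(ensemble_proba)              # [0.363 0.627]
print(ensemble_proba.argmax())     # 1 -> soft voting predicts Class 1
```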
Accuracy comparison of individual classifiers vs. the ensemble (see the sketch below):
- Logistic Regression: 0.864
- RandomForestClassifier: 0.896
- SVC: 0.888
- VotingClassifier: 0.904
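A minimal sketch of how such a comparison can be produced with scikit-learn's VotingClassifier; the make_moons dataset and hyperparameters are illustrative assumptions, so the exact scores will differ:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

log_clf = LogisticRegression(random_state=42)
rnd_clf = RandomForestClassifier(random_state=42)
svm_clf = SVC(random_state=42)

voting_clf = VotingClassifier(
    estimators=[("lr", log_clf), ("rf", rnd_clf), ("svc", svm_clf)],
    voting="hard",  # majority vote; use voting="soft" to average predicted probabilities
)

# Train each classifier and the ensemble, then compare test accuracy
for clf in (log_clf, rnd_clf, svm_clf, voting_clf):
    clf.fit(X_train, y_train)
    print(clf.__class__.__name__, clf.score(X_test, y_test))
```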
Bagging and Pasting
- Use the same training algorithm but train each predictor on a different random subset of the training set
- Two types (see the sketch below):
- Bagging: sampling with replacement
- Pasting: sampling without replacement
- Each individual predictor has a higher bias than one trained on the full training set
- The ensemble has a similar bias but a lower variance than a single predictor trained on the original training set
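A minimal scikit-learn sketch of both sampling schemes (dataset and hyperparameters are placeholder assumptions); BaggingClassifier implements bagging with bootstrap=True and pasting with bootstrap=False:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Bagging: each tree is trained on a bootstrap sample (with replacement)
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1, random_state=42)

# Pasting: same idea, but each subset is sampled without replacement
paste_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=False, n_jobs=-1, random_state=42)

for name, clf in [("bagging", bag_clf), ("pasting", paste_clf)]:
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))
```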
Bagging and Pasting
- The ensemble's predictions will likely generalize better than a single Decision Tree's
Out-of-bag Dataset
Reference: https://www.analyticsvidhya.com/blog/2021/03/gradient-boosting-machine-for-data-scientists/
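With bagging, each predictor sees only part of the training set (about 63% of the instances on average when max_samples equals the training-set size), so the remaining out-of-bag (OOB) instances provide a free validation estimate. A minimal sketch, with illustrative data and hyperparameters:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# oob_score=True evaluates each tree on the training instances it never saw
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    bootstrap=True, oob_score=True, n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)

print(bag_clf.oob_score_)             # OOB accuracy estimate, no separate validation set needed
print(bag_clf.score(X_test, y_test))  # typically close to the OOB estimate
```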
Gradient Boosting (step 1)
- Train model 1
- Compute its predictions
Gradient Boosting (step 2)
- Using the predictions, compute the residuals (actual minus predicted)
- Save model 1's predictions
Gradient Boosting (step 3)
- Train a new model (model 2) where the target is the residual error from model 1
- Save model 2's predictions
- Repeat for further models
Gradient Boosting
- The first model predicts the original target
- Each subsequent model's target is the previous model's residual error (see the sketch below)
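A hand-rolled sketch of this loop with decision-tree regressors (the synthetic data and tree depth are placeholder assumptions); each new tree is fit to the previous residuals, and the ensemble's prediction is the sum of the individual trees' predictions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.rand(200, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(200)  # noisy quadratic target

# Step 1: train the first model on the original target and compute its predictions
tree1 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)

# Step 2: compute the residuals and train the next model on them
residuals1 = y - tree1.predict(X)
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, residuals1)

# Step 3: repeat for further models
residuals2 = residuals1 - tree2.predict(X)
tree3 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, residuals2)

# The ensemble predicts by summing the predictions of all trees
X_new = np.array([[0.1]])
print(sum(tree.predict(X_new) for tree in (tree1, tree2, tree3)))
```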
Gradient Boosting
- XGBoost, LightGBM, and CatBoost are other popular gradient boosting libraries (see the sketch below)
- Gradient boosting is also widely used for ranking (learning to rank)
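In practice the loop above is handled by a library; a minimal sketch with scikit-learn's GradientBoostingRegressor (hyperparameters are illustrative, and XGBoost, LightGBM, and CatBoost expose a very similar fit/predict interface):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(42)
X = rng.rand(200, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(200)

# learning_rate scales each tree's contribution;
# more trees with a lower rate usually generalizes better
gbrt = GradientBoostingRegressor(
    max_depth=2, n_estimators=100, learning_rate=0.1, random_state=42)
gbrt.fit(X, y)
print(gbrt.predict([[0.1]]))
```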
Stacking
- Instead of using hard voting, train a model (the "blender", or meta-learner) to perform the aggregation
- Training (see the sketch below):
- Create a hold-out dataset (split the training set into split 1 and split 2)
- Train the first-layer classifiers on split 1
- Get the classifiers' predictions on split 2 and use them as training features
- The blender is trained on the first layer's predictions
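scikit-learn automates this recipe with StackingClassifier, which uses cross-validated (out-of-fold) predictions from the first layer to train the final estimator; a minimal sketch with illustrative estimator choices:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stacking_clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(),  # the blender, trained on first-layer predictions
    cv=5,                                  # out-of-fold predictions play the role of split 2
)
stacking_clf.fit(X_train, y_train)
print(stacking_clf.score(X_test, y_test))
```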
Summary
- Ensemble methods: Bagging / Boosting / Stacking
- Voting: Hard or Soft Voting
- Sample Training Data / Sample Features
- Random Forests: bagging of tree classifiers; feature importance, OOB score (see the sketch below)
- Boosting: AdaBoost / Gradient Boosting
- Stacking: model to perform aggregation
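As a closing sketch, RandomForestClassifier bundles the bagged-tree idea with per-feature importances and an OOB score; the iris dataset and hyperparameters here are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()

rnd_clf = RandomForestClassifier(
    n_estimators=500, oob_score=True, n_jobs=-1, random_state=42)
rnd_clf.fit(iris.data, iris.target)

print(rnd_clf.oob_score_)  # out-of-bag accuracy estimate
for name, importance in zip(iris.feature_names, rnd_clf.feature_importances_):
    print(name, round(importance, 3))
```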