
Complete Machine Learning Guide to Parameter Tuning in Gradient Boosting (GBM) in Python


Aarshay Jain
15 Jun, 2022 • 14 min read

Overview
Learn parameter tuning in the gradient boosting algorithm using Python
Understand how to adjust the bias-variance trade-off in machine learning for gradient boosting

Introduction
If you have been using GBM as a ‘black box’ till now, maybe it’s time for you to open it and see how it actually works!
This article is inspired by Owen Zhang’s (Chief Product Officer at DataRobot and Kaggle Rank 3) approach shared at NYC Data Science Academy. He delivered a ~2-hour talk, and I intend to condense it and present the most precious nuggets here.
Boosting algorithms play a crucial role in dealing with the bias-variance trade-off. Unlike bagging algorithms, which only control for high variance in a model, boosting controls both aspects (bias & variance) and is considered more effective. A sincere understanding of GBM here should give you the much-needed confidence to deal with such critical issues.
In this article, I’ll disclose the science behind using GBM in Python and, most importantly, how you can tune its parameters and obtain incredible results.
If you are completely new to the world of Ensemble
learning, you can enrol in this free course which covers all
the techniques in a structured manner: Ensemble Learning
and Ensemble Learning Techniques

Special Thanks: Personally, I would like to acknowledge the timeless support provided by Mr. Sudalai Rajkumar, currently AV Rank 2. This article wouldn’t be possible without his guidance. I am sure the whole community will benefit from the same.


Table of Contents
1. How Boosting Works?
2. Understanding GBM Parameters
3. Tuning Parameters (with Example)

1. How Boosting Works?


Boosting is a sequential technique which works on the principle of ensemble. It combines a set of weak learners and delivers improved prediction accuracy. At any instant t, the model outcomes are weighed based on the outcomes of the previous instant t-1. The outcomes predicted correctly are given a lower weight and the ones misclassified are weighted higher. This technique is followed for a classification problem, while a similar technique is used for regression.
Let’s understand it visually:

Observations:
1. Box 1: Output of First Weak Learner (from the left)
Initially, all points have the same weight (denoted by their size).
The decision boundary predicts 2 +ve and 5 -ve points correctly.
2. Box 2: Output of Second Weak Learner
The points classified correctly in box 1 are given a lower weight and vice versa.
The model now focuses on the high-weight points and classifies them correctly. But others are misclassified now.
A similar trend can be seen in box 3 as well. This continues for many iterations. In the end, all models are given a weight depending on their accuracy and a consolidated result is generated.
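The reweighting described above is the classic AdaBoost flavour of boosting; gradient boosting replaces explicit sample weights with gradients of a loss function, but the intuition is the same. As a toy illustration of the weight update (a sketch of the idea, not scikit-learn's internals), consider:

import numpy as np

# Toy sketch of boosting's sample reweighting (AdaBoost-style), not GBM internals
def update_weights(weights, y_true, y_pred, alpha=0.5):
    misclassified = (y_true != y_pred)
    weights = weights * np.exp(alpha * misclassified)  # upweight mistakes
    return weights / weights.sum()                     # renormalize to sum to 1

w = np.ones(7) / 7                        # Box 1: all points weighted equally
y = np.array([1, 1, 0, 0, 0, 0, 0])
pred = np.array([1, 1, 0, 0, 0, 1, 1])    # two points misclassified
print(update_weights(w, y, pred))         # misclassified points now weigh more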
Did I whet your appetite? Good. Refer to these articles
(focus on GBM right now):
Learn Gradient Boosting Algorithm for better
predictions (with codes in R)
Quick Introduction to Boosting Algorithms in Machine
Learning
Getting smart with Machine Learning – AdaBoost and
Gradient Boost
2. GBM Parameters
The overall parameters of this ensemble model can be
divided into 3 categories:
1. Tree-Specific Parameters: These affect each
individual tree in the model.
2. Boosting Parameters: These affect the boosting
operation in the model.
3. Miscellaneous Parameters: Other parameters for
overall functioning.
I’ll start with tree-specific parameters. First, let’s look at the general structure of a decision tree:

The parameters used for defining a tree are further


explained below. Note that I’m using scikit-learn (python)
specific terminologies here which might be different in
other software packages like R. But the idea remains the
same.
1. min_samples_split
Defines the minimum number of samples (or
observations) which are required in a node to be
considered for splitting.
Used to control over-fitting. Higher values prevent
a model from learning relations which might be
highly specific to the particular sample selected
for a tree.
Too high values can lead to under-fitting; hence, it should be tuned using CV.
2. min_samples_leaf
Defines the minimum samples (or observations)
required in a terminal node or leaf.
Used to control over-fitting similar to
min_samples_split.
Generally, lower values should be chosen for imbalanced class problems because the regions in which the minority class will be in majority will be very small.
3. min_weight_fraction_leaf
Similar to min_samples_leaf but defined as a
fraction of the total number of observations
instead of an integer.
Only one of #2 and #3 should be defined.
4. max_depth
The maximum depth of a tree.
Used to control over-fitting, as higher depth will allow the model to learn relations very specific to a particular sample.
Should be tuned using CV.
5. max_leaf_nodes
The maximum number of terminal nodes or leaves
in a tree.
Can be defined in place of max_depth. Since
binary trees are created, a depth of ‘n’ would
produce a maximum of 2^n leaves.
If this is defined, GBM will ignore max_depth.
6. max_features
The number of features to consider while searching for the best split. These will be randomly selected.
As a thumb-rule, the square root of the total number of features works great, but we should check up to 30-40% of the total number of features.
Higher values can lead to over-fitting, but it depends on the case.
Before moving on to other parameters, let’s see the overall pseudo-code of the GBM algorithm for 2 classes:
1. Initialize the outcome
2. Iterate from 1 to the total number of trees
2.1 Update the weights for targets based on the previous run
2.2 Fit the model on a selected subsample of data
2.3 Make predictions on the full set of observations
2.4 Update the output with current results, taking into account the learning rate
3. Return the final output.
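To make this pseudo-code concrete, here is a minimal from-scratch sketch for the 2-class case with log-loss, fitting each tree to the residuals (negative gradients) of the running estimate. It is an illustrative toy under those assumptions, not scikit-learn’s actual implementation:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def toy_gbm_fit(X, y, n_trees=50, learning_rate=0.1, max_depth=3):
    # Step 1: initialize the outcome with the log-odds of the positive class
    p = np.clip(y.mean(), 1e-6, 1 - 1e-6)
    F = np.full(len(y), np.log(p / (1 - p)))
    trees = []
    for _ in range(n_trees):                      # Step 2: iterate over trees
        residual = y - 1 / (1 + np.exp(-F))       # ~2.1: 'weights' become residuals
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residual)                     # 2.2: fit the weak learner
        F += learning_rate * tree.predict(X)      # 2.3-2.4: update the output
        trees.append(tree)
    return F, trees                               # 3: final scores and the ensemble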

This is an extremely simplified (probably naive) explanation of GBM’s working. The parameters which we have considered so far will affect step 2.2, i.e., model building. Let’s consider another set of parameters for managing boosting:
1. learning_rate
This determines the impact of each tree on the
final outcome (step 2.4). GBM works by starting
with an initial estimate which is updated using the
output of each tree. The learning parameter
controls the magnitude of this change in the
estimates.
Lower values are generally preferred as they make the model robust to the specific characteristics of each tree, thus allowing it to generalize well.
Lower values would require a higher number of trees to model all the relations and will be computationally expensive.
2. n_estimators
The number of sequential trees to be modeled (step 2).
Though GBM is fairly robust to a higher number of trees, it can still overfit at a point. Hence, this should be tuned using CV for a particular learning rate.
3. subsample
The fraction of observations to be selected for
each tree. Selection is done by random sampling.
Values slightly less than 1 make the model robust
by reducing the variance.
Typical values of ~0.8 generally work fine but can be fine-tuned further.
Apart from these, there are certain miscellaneous
parameters which affect overall functionality:
1. loss
It refers to the loss function to be minimized in
each split.
It can have various values for classification and regression cases. Generally, the default values work fine. Other values should be chosen only if you understand their impact on the model.
2. init
This affects initialization of the output.
This can be used if we have made another model
whose outcome is to be used as the initial
estimates for GBM.
3. random_state
The random number seed so that same random
numbers are generated every time.
This is important for parameter tuning. If we don’t
fix the random number, then we’ll have different
outcomes for subsequent runs on the same
parameters and it becomes difficult to compare
models.
It can potentially result in overfitting to a particular
random sample selected. We can try running
models for different random samples, which is
computationally expensive and generally not used.
4. verbose
The type of output to be printed when the model
fits. The different values can be:
0: no output generated (default)
1: output generated for trees in certain
intervals
>1: output generated for all trees
5. warm_start
This parameter has an interesting application and can help a lot if used judiciously.
Using this, we can fit additional trees on previous fits of a model. It can save a lot of time, and you should explore this option for advanced applications.
6. presort
Select whether to presort data for faster splits.
It makes the selection automatically by default, but it can be changed if needed. (Note that this parameter has been removed in recent scikit-learn versions.)
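To see all three categories in one place, here is a hedged sketch of a scikit-learn constructor call with the knobs discussed above; the values are placeholders for illustration, not recommendations:

from sklearn.ensemble import GradientBoostingClassifier

gbm = GradientBoostingClassifier(
    # Tree-specific parameters
    max_depth=8, min_samples_split=500, min_samples_leaf=50, max_features='sqrt',
    # Boosting parameters
    learning_rate=0.1, n_estimators=100, subsample=0.8,
    # Miscellaneous parameters
    random_state=10, verbose=0, warm_start=False,
)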
I know it’s a long list of parameters, but I have simplified it for you in an excel file which you can download from my GitHub repository.

3. Parameter Tuning with Example


We will take the dataset from the Data Hackathon 3.x AV hackathon. The details of the problem can be found on the competition page. You can download the data set from here. I have performed the following steps:
1. City variable dropped because of too many categories
2. DOB converted to Age | DOB dropped
3. EMI_Loan_Submitted_Missing created which is 1 if
EMI_Loan_Submitted was missing else 0 | Original
variable EMI_Loan_Submitted dropped
4. EmployerName dropped because of too many
categories
5. Existing_EMI imputed with 0 (median) since only 111
values were missing
6. Interest_Rate_Missing created which is 1 if
Interest_Rate was missing else 0 | Original variable
Interest_Rate dropped
7. Lead_Creation_Date dropped because it made little intuitive impact on the outcome
8. Loan_Amount_Applied, Loan_Tenure_Applied imputed
with median values
9. Loan_Amount_Submitted_Missing created which is 1 if
Loan_Amount_Submitted was missing else 0 | Original
variable Loan_Amount_Submitted dropped
10. Loan_Tenure_Submitted_Missing created which is 1 if
Loan_Tenure_Submitted was missing else 0 | Original
variable Loan_Tenure_Submitted dropped
11. LoggedIn, Salary_Account dropped
12. Processing_Fee_Missing created which is 1 if
Processing_Fee was missing else 0 | Original variable
Processing_Fee dropped
13. Source – top 2 kept as is and all others combined into a different category
14. Numerical and One-Hot-Coding performed
For those who have the original data from the competition, you can check out these steps from the data_preparation iPython notebook in the repository.
Let’s start by importing the required libraries and loading the data:
Python Code:


#Import libraries:
import pandas as pd
import numpy as np

#Import sklearn modules
from sklearn.ensemble import GradientBoostingClassifier  #GBM algorithm
from sklearn.model_selection import GridSearchCV, cross_val_score  #Grid search and CV
from sklearn import metrics

import matplotlib.pylab as plt
#%matplotlib inline
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 12, 4

train = pd.read_csv('train_modified.csv', encoding='ISO-8859-1')
target = 'Disbursed'
IDcol = 'ID'

Before proceeding further, let’s define a function which will help us create GBM models and perform cross-validation.
def modelfit(alg, dtrain, predictors, performCV=True, printFeatureImportance=True, cv_folds=5):
    #Fit the algorithm on the data
    alg.fit(dtrain[predictors], dtrain['Disbursed'])

    #Predict training set:
    dtrain_predictions = alg.predict(dtrain[predictors])
    dtrain_predprob = alg.predict_proba(dtrain[predictors])[:, 1]

    #Perform cross-validation:
    if performCV:
        cv_score = cross_val_score(alg, dtrain[predictors], dtrain['Disbursed'],
                                   cv=cv_folds, scoring='roc_auc')

    #Print model report:
    print("\nModel Report")
    print("Accuracy : %.4g" % metrics.accuracy_score(dtrain['Disbursed'].values, dtrain_predictions))
    print("AUC Score (Train): %f" % metrics.roc_auc_score(dtrain['Disbursed'], dtrain_predprob))

    if performCV:
        print("CV Score : Mean - %.7g | Std - %.7g | Min - %.7g | Max - %.7g" %
              (np.mean(cv_score), np.std(cv_score), np.min(cv_score), np.max(cv_score)))

    #Print Feature Importance:
    if printFeatureImportance:
        feat_imp = pd.Series(alg.feature_importances_, predictors).sort_values(ascending=False)
        feat_imp.plot(kind='bar', title='Feature Importances')
        plt.ylabel('Feature Importance Score')

The code is pretty self-explanatory. Please feel free to drop a note in the comments if you find any challenges in understanding any part of it.
Let’s start by creating a baseline model. In this case, the evaluation metric is AUC, so using any constant value will give 0.5 as the result. Typically, a good baseline can be a GBM model with default parameters, i.e. without any tuning. Let’s find out what it gives:
#Choose all predictors except target & IDcols
predictors = [x for x in train.columns if x not in [target, IDcol]]
gbm0 = GradientBoostingClassifier(random_state=10)
modelfit(gbm0, train, predictors)

So, the mean CV score is 0.8319 and we should expect our model to do better than this.
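As a quick sanity check of the constant-prediction claim above, a minimal sketch (assuming the train DataFrame already loaded): any constant score carries no ranking information, so AUC is 0.5.

const_pred = np.full(len(train), 0.5)     # same score for every observation
print(metrics.roc_auc_score(train[target], const_pred))  # -> 0.5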

General Approach for Parameter Tuning
As discussed earlier, there are two types of parameters to be tuned here – tree-based and boosting parameters.
There are no optimum values for the learning rate, as low values always work better, given that we train on a sufficient number of trees.
Though GBM is robust enough not to overfit with increasing trees, a high number of trees for a particular learning rate can still lead to overfitting. But as we reduce the learning rate and increase the number of trees, the computation becomes expensive and would take a long time to run on standard personal computers.
Keeping all this in mind, we can take the following approach:
1. Choose a relatively high learning rate. Generally, the default value of 0.1 works, but somewhere between 0.05 and 0.2 should work for different problems.
2. Determine the optimum number of trees for this learning rate. This should range around 40-70. Remember to choose a value on which your system can work fairly fast. This is because it will be used for testing various scenarios and determining the tree parameters.
3. Tune tree-specific parameters for the decided learning rate and number of trees. Note that we can choose different parameters to define a tree, and I’ll take up an example here.
4. Lower the learning rate and increase the estimators proportionally to get more robust models.

Fix learning rate and number of estimators for tuning tree-based parameters
In order to decide on boosting parameters, we need to set some initial values of other parameters. Let’s take the following values:
1. min_samples_split = 500 : This should be ~0.5-1% of the total number of observations (with this data’s ~87K rows, that range is roughly 435-870). Since this is an imbalanced class problem, we’ll take a small value from the range.
2. min_samples_leaf = 50 : Can be selected based on intuition. This is just used for preventing overfitting, and again a small value because of imbalanced classes.
3. max_depth = 8 : Should be chosen (5-8) based on the number of observations and predictors. This data has 87K rows and 49 columns, so let’s take 8 here.
4. max_features = ‘sqrt’ : It’s a general thumb-rule to start with the square root.
5. subsample = 0.8 : This is a commonly used start value.
Please note that all the above are just initial estimates and will be tuned later. Let’s take the default learning rate of 0.1 here and check the optimum number of trees for that. For this purpose, we can do a grid search and test out values from 20 to 80 in steps of 10.
#Choose all predictors except target & IDcols
predictors = [x for x in train.columns if x not in [target, IDcol]]
param_test1 = {'n_estimators': range(20, 81, 10)}
gsearch1 = GridSearchCV(
    estimator=GradientBoostingClassifier(learning_rate=0.1, min_samples_split=500,
                                         min_samples_leaf=50, max_depth=8,
                                         max_features='sqrt', subsample=0.8,
                                         random_state=10),
    param_grid=param_test1, scoring='roc_auc', n_jobs=4, cv=5)
gsearch1.fit(train[predictors], train[target])

The output can be checked using the following command (grid_scores_ from older scikit-learn versions has been replaced by cv_results_):
gsearch1.cv_results_, gsearch1.best_params_, gsearch1.best_score_
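If you want the per-candidate scores in a readable form, a hedged sketch using cv_results_:

# Mean cross-validated AUC for each n_estimators value tried
results = pd.DataFrame(gsearch1.cv_results_)
print(results[['param_n_estimators', 'mean_test_score', 'std_test_score']])
print(gsearch1.best_params_, gsearch1.best_score_)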

As you can see, here we got 60 as the optimal number of estimators for a 0.1 learning rate. Note that 60 is a reasonable value and can be used as is. But it might not be the same in all cases. Other situations:
1. If the value is around 20, you might want to try lowering the learning rate to 0.05 and re-run the grid search.
2. If the values are too high (~100), tuning the other parameters will take a long time, and you can try a higher learning rate.
Tuning tree-specific parameters
Now let’s move on to tuning the tree parameters. I plan to do this in the following stages:
1. Tune max_depth and min_samples_split
2. Tune min_samples_leaf
3. Tune max_features
The order of tuning variables should be decided carefully. You should take the variables with a higher impact on the outcome first. For instance, max_depth and min_samples_split have a significant impact, and we’re tuning those first.
Important Note: I’ll be doing some heavy-duty grid searches in this section which can take 15-30 mins or even more to run depending on your system. You can vary the number of values you are testing based on what your system can handle.
To start with, I’ll test max_depth values of 5 to 15 in steps
of 2 and min_samples_split from 200 to 1000 in steps of
200. These are just based on my intuition. You can set
wider ranges as well and then perform multiple iterations
for smaller ranges.
param_test2 = {'max_depth': range(5, 16, 2), 'min_samples_split': range(200, 1001, 200)}
gsearch2 = GridSearchCV(
    estimator=GradientBoostingClassifier(learning_rate=0.1, n_estimators=60,
                                         max_features='sqrt', subsample=0.8,
                                         random_state=10),
    param_grid=param_test2, scoring='roc_auc', n_jobs=4, cv=5)
gsearch2.fit(train[predictors], train[target])
gsearch2.cv_results_, gsearch2.best_params_, gsearch2.best_score_

Here, we have run 30 combinations, and the ideal values are 9 for max_depth and 1000 for min_samples_split. Note that 1000 is an extreme value which we tested. There is a fair chance that the optimum value lies above that, so we should check for some higher values as well.
Here, I’ll take the max_depth of 9 as optimum and not try different values for higher min_samples_split. It might not always be the best idea, but here, if you observe the output closely, a max_depth of 9 works better in most of the cases. Also, we can test for 5 values of min_samples_leaf, from 30 to 70 in steps of 10, along with higher min_samples_split.
param_test3 = {'min_samples_split': range(1000, 2100, 200),
               'min_samples_leaf': range(30, 71, 10)}
gsearch3 = GridSearchCV(
    estimator=GradientBoostingClassifier(learning_rate=0.1, n_estimators=60,
                                         max_depth=9, max_features='sqrt',
                                         subsample=0.8, random_state=10),
    param_grid=param_test3, scoring='roc_auc', n_jobs=4, cv=5)
gsearch3.fit(train[predictors], train[target])
gsearch3.cv_results_, gsearch3.best_params_, gsearch3.best_score_

Here we get the optimum values as 1200 for min_samples_split and 60 for min_samples_leaf. Also, we can see the CV score increasing to 0.8396 now. Let’s fit the model again with these values and have a look at the feature importance.
modelfit(gsearch3.best_estimator_, train, predictors)

If you compare the feature importance of this model with
the baseline model, you’ll find that now we are able to
derive value from many more variables. Also, earlier it
placed too much importance on some variables but now it
has been fairly distributed.
Now let’s tune the last tree-parameter, i.e. max_features, by trying 7 values from 7 to 19 in steps of 2.
param_test4 = {'max_features': range(7, 20, 2)}
gsearch4 = GridSearchCV(
    estimator=GradientBoostingClassifier(learning_rate=0.1, n_estimators=60,
                                         max_depth=9, min_samples_split=1200,
                                         min_samples_leaf=60, subsample=0.8,
                                         random_state=10),
    param_grid=param_test4, scoring='roc_auc', n_jobs=4, cv=5)
gsearch4.fit(train[predictors], train[target])
gsearch4.cv_results_, gsearch4.best_params_, gsearch4.best_score_

Here, we find that the optimum value is 7, which is also the square root. So our initial value was the best. You might be anxious to check for lower values, and you should if you like. I’ll stay with 7 for now. With this, we have the final tree-parameters as:
min_samples_split: 1200
min_samples_leaf: 60
max_depth: 9
max_features: 7
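For reference, a sketch collecting these tuned values into one estimator (subsample is still at its initial 0.8 here; the learning rate and tree count will be rescaled in the next step):

tuned_tree_params = dict(max_depth=9, min_samples_split=1200,
                         min_samples_leaf=60, max_features=7)
gbm_tree_tuned = GradientBoostingClassifier(learning_rate=0.1, n_estimators=60,
                                            subsample=0.8, random_state=10,
                                            **tuned_tree_params)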

Tuning subsample and making models with a lower learning rate
The next step would be to try different subsample values. Let’s take the values 0.6, 0.7, 0.75, 0.8, 0.85, 0.9.
param_test5 = {'subsample': [0.6, 0.7, 0.75, 0.8, 0.85, 0.9]}
gsearch5 = GridSearchCV(
    estimator=GradientBoostingClassifier(learning_rate=0.1, n_estimators=60,
                                         max_depth=9, min_samples_split=1200,
                                         min_samples_leaf=60, max_features=7,
                                         random_state=10),
    param_grid=param_test5, scoring='roc_auc', n_jobs=4, cv=5)
gsearch5.fit(train[predictors], train[target])
gsearch5.cv_results_, gsearch5.best_params_, gsearch5.best_score_

Here, we found 0.85 as the optimum value. Finally, we have all the parameters needed. Now, we need to lower the learning rate and increase the number of estimators proportionally. Note that these trees might not be the most optimum values but a good benchmark.
As the trees increase, it will become increasingly computationally expensive to perform CV and find the optimum values. To give you some idea of the model performance, I have included the private leaderboard scores for each. Since the data is not open, you won’t be able to replicate that, but it’ll be good for understanding.
Let’s decrease the learning rate to half, i.e. 0.05, with twice (120) the number of trees.
predictors = [x for x in train.columns if x not in [target, IDcol]]
gbm_tuned_1 = GradientBoostingClassifier(learning_rate=0.05, n_estimators=120,
                                         max_depth=9, min_samples_split=1200,
                                         min_samples_leaf=60, subsample=0.85,
                                         random_state=10, max_features=7)
modelfit(gbm_tuned_1, train, predictors)

Private LB Score: 0.844139


Now let’s reduce it to one-tenth of the original value, i.e. 0.01, for 600 trees.
predictors = [x for x in train.columns if x not in [target, IDcol]]
gbm_tuned_2 = GradientBoostingClassifier(learning_rate=0.01, n_estimators=600,
                                         max_depth=9, min_samples_split=1200,
                                         min_samples_leaf=60, subsample=0.85,
                                         random_state=10, max_features=7)
modelfit(gbm_tuned_2, train, predictors)


Private LB Score: 0.848145


Let’s decrease it to one-twentieth of the original value, i.e. 0.005, for 1200 trees.
predictors = [x for x in train.columns if x not in [target, IDcol]]
gbm_tuned_3 = GradientBoostingClassifier(learning_rate=0.005, n_estimators=1200,
                                         max_depth=9, min_samples_split=1200,
                                         min_samples_leaf=60, subsample=0.85,
                                         random_state=10, max_features=7,
                                         warm_start=True)
modelfit(gbm_tuned_3, train, predictors, performCV=False)

Private LB Score: 0.848112


Here we see that the score reduced very slightly. So let’s run it for 1500 trees.
predictors = [x for x in train.columns if x not in [target, IDcol]]
gbm_tuned_4 = GradientBoostingClassifier(learning_rate=0.005, n_estimators=1500,
                                         max_depth=9, min_samples_split=1200,
                                         min_samples_leaf=60, subsample=0.85,
                                         random_state=10, max_features=7,
                                         warm_start=True)
modelfit(gbm_tuned_4, train, predictors, performCV=False)


Private LB Score: 0.848747


Therefore, now you can clearly see that this is a very important step, as the private LB score improved from ~0.844 to ~0.849, which is a significant jump.
Another hack that can be used here is the ‘warm_start’ parameter of GBM. You can use it to increase the number of estimators in small steps and test different values without having to retrain from scratch every time. You can also download the iPython notebook with all these model codes from my GitHub account.
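A minimal sketch of that warm_start pattern, assuming the tuned parameters above: each subsequent fit call grows the existing ensemble instead of retraining from scratch.

gbm_ws = GradientBoostingClassifier(learning_rate=0.005, n_estimators=1200,
                                    max_depth=9, min_samples_split=1200,
                                    min_samples_leaf=60, subsample=0.85,
                                    max_features=7, random_state=10,
                                    warm_start=True)
gbm_ws.fit(train[predictors], train[target])
for n in (1350, 1500):
    gbm_ws.set_params(n_estimators=n)      # grow to n trees, reusing prior fits
    gbm_ws.fit(train[predictors], train[target])
    train_auc = metrics.roc_auc_score(train[target],
                                      gbm_ws.predict_proba(train[predictors])[:, 1])
    print(n, train_auc)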
If you like this article and want to read a similar
post for XGBoost, check this out – Complete Guide
to Parameter Tuning in XGBoost

End Notes
This article was based on developing a GBM ensemble learning model end-to-end. We started with an introduction to boosting, which was followed by a detailed discussion of the various parameters involved. The parameters were divided into 3 categories, namely tree-specific, boosting and miscellaneous parameters, depending on their impact on the model.
Finally, we discussed the general approach towards tackling a problem with GBM and also worked out the AV Data Hackathon 3.x problem through that approach.
I hope you found this useful and now feel more confident in applying GBM to solve a data science problem.
You can try this out in our upcoming signature hackathon Date Your Data.
Did you like this article? Would you like to share some
other hacks which you implement while making GBM
models? Please feel free to drop a note in the comments
below and I’ll be glad to discuss.
Do you want to apply your analytical skills and test your potential? Then participate in our Hackathons and compete with top Data Scientists from all over the world.

Aarshay Jain
15 Jun 2022
Aarshay graduated with an MS in Data Science from Columbia University in 2017 and is currently an ML Engineer at Spotify New York. He works at the intersection of applied research and engineering while designing ML solutions to move product metrics in the required direction. He specializes in designing ML system architecture, developing offline models and deploying them in production for both batch and real-time prediction use cases.


Responses From Readers



anurag
22 Feb, 2016
Great Article!! Can you do this for SVM,XGBoost, deep
learning and neural networks.
Jignesh Vyas
23 Feb, 2016
Wow great article, pretty much detailed and easy to
understand. Am a great fan of articles posted on this
site. Keep up the good work !
Pallavi
23 Feb, 2016
absolutely fantastic article. Loved the step by step
approach. Would love to read more of these on SVM,
deep learning Also it would be fantastic to have R
