Hyperparameter Tuning
1. Hyperparameters are like handles available to the modeler to control the
behavior of the algorithm used for modeling.
3. To get a list of hyperparameters for a given algorithm, call the get_params() function,
e.g. to get the support vector classifier's hyperparameters:
    from sklearn.svm import SVC

    svc = SVC()        # instantiate with default hyperparameter values
    svc.get_params()   # returns a dict mapping hyperparameter names to their current values
4. Hyperparameters are not learnt from the data the way other model parameters
are. For example, the attribute coefficients in a linear model are learnt from the data,
while the cost of error is supplied as a hyperparameter.
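A minimal sketch of this distinction, assuming scikit-learn's Ridge regressor on
synthetic data (both choices are illustrative):

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge

    X, y = make_regression(n_samples=100, n_features=3, random_state=0)

    ridge = Ridge(alpha=1.0)   # alpha is a hyperparameter: set by the modeler, not learnt
    ridge.fit(X, y)
    print(ridge.coef_)         # coefficients are model parameters: learnt from the data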
8. The testing data should be separately transformed* using the same functions
that were used to transform the rest of the data for model building and
hyperparameter tuning.
* This applies to any transformation where rows influence each other, e.g. z-score scaling.
One-hot encoding does not fall into this category; it can be done before splitting the data.
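A minimal sketch, assuming z-score scaling with scikit-learn's StandardScaler
(the dataset and split are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training data only
    X_test_scaled = scaler.transform(X_test)        # reuse the same mean/std on the test data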
RandomizedSearchCV
1. Random search differs from grid search: instead of providing a discrete set of
values to explore for each hyperparameter (a parameter grid), we provide a
statistical distribution.
2. Values for the different hyperparameters are picked at random from these
distributions.
3. The motivation for using random search in place of grid search is that, in many
cases, hyperparameters are not equally important.
"A Gaussian process analysis of the function from hyper-parameters to validation set performance reveals that
for most data sets only a few of the hyper-parameters really matter, but that different hyper-parameters
are important on different data sets. This phenomenon makes grid search a poor choice for configuring
algorithms for new data sets." (Bergstra & Bengio, 2012)
4. In contrast to GridSearchCV, not all parameter combinations are evaluated. A fixed number of
parameter settings is sampled from the specified distributions.
5. The number of parameter settings that are tried is given by n_iter.
6. If all parameters are presented as a list, sampling without replacement is performed. If at least one
parameter is given as a distribution, sampling with replacement is used. It is highly recommended to
use continuous distributions for continuous parameters.
7. Random search has a higher chance of hitting the right combination than grid search, as the
sketch below illustrates.
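A minimal sketch of RandomizedSearchCV, assuming an SVC on the iris dataset; the
distributions, search ranges, and n_iter value are illustrative, not recommendations:

    from scipy.stats import loguniform
    from sklearn.datasets import load_iris
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Continuous distributions for continuous hyperparameters (sampled with replacement)
    param_distributions = {
        "C": loguniform(1e-2, 1e2),      # cost of error
        "gamma": loguniform(1e-4, 1e0),  # RBF kernel coefficient
        "kernel": ["rbf", "linear"],     # a discrete list is also allowed
    }

    search = RandomizedSearchCV(
        SVC(),
        param_distributions=param_distributions,
        n_iter=20,        # number of parameter settings sampled
        cv=5,
        random_state=0,
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)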