
Hyperparameter Tuning

Bert Gollnick
Hyperparameter Tuning
Introduction

Reasons for tuning hyperparameters:

▪ training / inference time
▪ improving results
▪ convergence


Hyperparameter Tuning
Tunable Hyperparameters

▪ Hyperparameters have a huge impact on model performance
▪ Intuition: check multiple combinations of parameters and pick the best
▪ Available packages: Ray Tune, Optuna, skorch
▪ Hyperparameter groups:
  ▪ network topology: number of layers, number of nodes, activation functions, layer types, …
  ▪ network objects: loss function, optimizer
  ▪ model training: learning rate, batch size, number of epochs
Hyperparameter Tuning
Batch Size

(Diagram: the dataset is split into batches that are fed to the model one at a time.)

                          Small batch size    Large batch size
GPU utilization                  ↓                   ↑
Iterations per epoch             ↑                   ↓
Training stability               ↑                   ↓

Best practice: a batch size of 32 is often a good default.
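The "iterations per epoch" row is simple arithmetic: for a fixed dataset size, halving the batch size doubles the number of optimizer steps per epoch. A minimal sketch (the dataset size of 60,000 is an assumption for illustration, e.g. the MNIST training set):

```python
import math

def iterations_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of optimizer steps per epoch (last partial batch included)."""
    return math.ceil(n_samples / batch_size)

n = 60_000  # illustrative dataset size
print(iterations_per_epoch(n, 8))    # → 7500
print(iterations_per_epoch(n, 32))   # → 1875
print(iterations_per_epoch(n, 256))  # → 235 (234.375 rounded up)
```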
Hyperparameter Tuning
Epochs

                        Low number of epochs    High number of epochs
Training time                    ↓                       ↑
Inference time                   –                       –  (unaffected)
Model performance                ↓                       ↑  (but diminishing returns)
Stability                        ↑                       ↓  (instability possible)
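One common way to pick the number of epochs without guessing (a standard technique, though not named on this slide) is early stopping: keep training until the validation loss stops improving for a few epochs. A minimal sketch with made-up loss values; the function and its `patience` parameter are illustrative, not from the slides:

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch (index) at which training would stop: the first
    epoch where the validation loss has not improved for `patience`
    consecutive epochs. Falls back to the last epoch otherwise."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # stop; the best weights were at best_epoch
    return len(val_losses) - 1

# Validation loss improves, then plateaus (made-up numbers):
losses = [1.0, 0.7, 0.5, 0.45, 0.46, 0.47, 0.48, 0.49]
print(early_stopping_epoch(losses, patience=3))  # → 6
```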
Hyperparameter Tuning
Hidden Layers

(Diagram: a network with few hidden layers vs. one with many hidden layers,
each between an input layer and an output layer.)

                                      Few hidden layers    Many hidden layers
Ability to learn complex patterns            ↓                      ↑
Training time                                ↓                      ↑
Inference time                               ↓                      ↑
Risk of overfitting                          ↓                      ↑
Hyperparameter Tuning
Nodes within a Layer

(Diagram: a network with few nodes per hidden layer vs. one with many nodes.)

                                      Few nodes    Many nodes
Ability to learn complex patterns         ↓             ↑
Training time                             ↓             ↑
Inference time                            ↓             ↑
Risk of overfitting                       ↓             ↑
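The reason both layer count and node count drive training time, inference time, and overfitting risk is that each adds parameters. A small sketch of the parameter count of a fully connected network (the layer sizes are made up for illustration):

```python
def mlp_param_count(layer_sizes):
    """Total weights + biases of a fully connected network, where
    layer_sizes lists the width of every layer, e.g. [784, 128, 10]
    = 784 inputs, one hidden layer of 128 nodes, 10 outputs."""
    return sum(
        n_in * n_out + n_out  # weight matrix + bias vector per layer
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# More nodes or more layers → more parameters:
print(mlp_param_count([784, 32, 10]))        # → 25450
print(mlp_param_count([784, 128, 10]))       # → 101770
print(mlp_param_count([784, 128, 128, 10]))  # → 118282
```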
Hyperparameter Tuning
Types of Search

▪ grid search
  ▪ define a search space (set of parameters with limiting values)
  ▪ evaluate every possible combination
  ▪ e.g. learning_rate = [0.1, 0.2], batch_size = [2, 4, 8]:

        Run    learning_rate    batch_size
        0      0.1              2
        1      0.2              2
        2      0.1              4
        3      0.2              4
        4      0.1              8
        5      0.2              8

  ▪ good for checking well-known parameters

▪ random search
  ▪ picks points from the configuration space at random
  ▪ good for discovery
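The grid above can be generated mechanically: the Cartesian product of the parameter lists gives every combination (2 × 3 = 6 runs), while random search just samples from the same space. A minimal sketch using only the standard library:

```python
import itertools
import random

# Search space from the slide
search_space = {
    "learning_rate": [0.1, 0.2],
    "batch_size": [2, 4, 8],
}

# Grid search: enumerate every combination (2 x 3 = 6 runs)
keys = list(search_space)
grid = [dict(zip(keys, values))
        for values in itertools.product(*search_space.values())]
for run, params in enumerate(grid):
    print(run, params)

# Random search: sample a point from the configuration space
random.seed(0)
sample = {k: random.choice(v) for k, v in search_space.items()}
print("random sample:", sample)
```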
Hyperparameter Tuning
skorch

▪ skorch: a scikit-learn compatible neural network library that wraps PyTorch
▪ Repo: https://github.com/skorch-dev/skorch
▪ Works as a scikit-learn wrapper for PyTorch
▪ Can be integrated into:
  ▪ sklearn pipelines
  ▪ grid search