Model Hyperparameter Tuning
Bert Gollnick
Introduction
[Diagram: tunable hyperparameters of a model: network architecture (e.g. hidden layers), loss function, optimizer, learning rate, batch size, number of training epochs]
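As a minimal sketch (all sizes and names below are illustrative, not from the slides), here is where each of these hyperparameters enters a typical PyTorch setup:

```python
import torch
import torch.nn as nn

# Hypothetical configuration collecting the hyperparameters above.
config = {
    "hidden_size": 64,      # network architecture
    "learning_rate": 1e-3,  # learning rate
    "batch_size": 32,       # batch size (used by the DataLoader)
    "epochs": 20,           # number of training epochs
}

# Network: one hidden layer; 10 input features and 2 classes assumed.
model = nn.Sequential(
    nn.Linear(10, config["hidden_size"]),
    nn.ReLU(),
    nn.Linear(config["hidden_size"], 2),
)
loss_fn = nn.CrossEntropyLoss()  # loss function
optimizer = torch.optim.Adam(    # optimizer
    model.parameters(), lr=config["learning_rate"]
)
```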
Batch Size

[Diagram: training data split into batches that are fed to the model]

                       low batch size   high batch size
GPU utilization              ↓                 ↑
Iterations per epoch         ↑                 ↓
Training stability           ↑                 ↓

Best practice often: 32
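In PyTorch the batch size is a DataLoader argument; a small sketch with dummy data (dataset size and feature count assumed) makes the iterations trade-off concrete:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset: 1000 samples with 10 features each (assumed sizes).
dataset = TensorDataset(torch.randn(1000, 10),
                        torch.randint(0, 2, (1000,)))

# A larger batch size means fewer iterations per epoch
# (and typically better GPU utilization per step).
for batch_size in (2, 32, 256):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    print(batch_size, "->", len(loader), "iterations per epoch")
# prints: 2 -> 500, 32 -> 32, 256 -> 4
```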
Epochs

Number of epochs       low   high
Training time           ↓     ↑
Inference time          -     -
Model performance       ↓     ↑ (but diminishing)
Stability               ↑     ↓ (instability possible)
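The epoch count only scales the outer training loop; inference cost is unaffected. A toy loop (model, data, and learning rate assumed) showing where it enters:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data and model, only to place the hyperparameter.
dataset = TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,)))
loader = DataLoader(dataset, batch_size=32, shuffle=True)
model = nn.Linear(10, 2)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

epochs = 20  # training time grows linearly with this number
for epoch in range(epochs):
    for X, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
```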
Grid Search & Random Search
▪ grid search
  ▪ define a search space (a set of parameters with limiting values)
  ▪ evaluate each possible combination, as in the table below
  ▪ e.g. learning_rate = [0.1, 0.2], batch_size = [2, 4, 8]
  ▪ good for checking well-known parameters
▪ random search
  ▪ picks random points from the configuration space
  ▪ good for discovery

Run   learning_rate   batch_size
 0    0.1             2
 1    0.2             2
 2    0.1             4
 3    0.2             4
 4    0.1             8
 5    0.2             8
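Both strategies fit in a few lines of plain Python; the grid half below enumerates exactly the six runs from the table, while the random half draws a handful of samples (the count of 3 is an arbitrary choice):

```python
import itertools
import random

# Search space from the slide.
learning_rates = [0.1, 0.2]
batch_sizes = [2, 4, 8]

# Grid search: evaluate every combination (2 x 3 = 6 runs).
for run, (bs, lr) in enumerate(itertools.product(batch_sizes, learning_rates)):
    print("grid run", run, {"learning_rate": lr, "batch_size": bs})

# Random search: sample points from the configuration space instead.
random.seed(0)
for run in range(3):
    params = {"learning_rate": random.choice(learning_rates),
              "batch_size": random.choice(batch_sizes)}
    print("random run", run, params)
```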
skorch
▪ Repo: https://github.com/skorch-dev/skorch
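skorch wraps a PyTorch module in a scikit-learn-compatible estimator, so sklearn's GridSearchCV can drive the tuning. A sketch in the spirit of the examples in the skorch README (module, data, and parameter values are illustrative):

```python
import numpy as np
import torch.nn as nn
from sklearn.model_selection import GridSearchCV
from skorch import NeuralNetClassifier

# Toy data: skorch expects float32 features and int64 class labels.
X = np.random.randn(200, 10).astype(np.float32)
y = np.random.randint(0, 2, 200).astype(np.int64)

class Net(nn.Module):
    def __init__(self, num_units=10):
        super().__init__()
        self.hidden = nn.Linear(10, num_units)
        self.relu = nn.ReLU()
        self.out = nn.Linear(num_units, 2)

    def forward(self, X):
        return self.out(self.relu(self.hidden(X)))

# The wrapper makes the net behave like any sklearn estimator.
net = NeuralNetClassifier(
    Net,
    criterion=nn.CrossEntropyLoss,  # Net returns raw logits
    verbose=0,
)

# The "module__" prefix routes a parameter to the Net constructor.
params = {
    "lr": [0.05, 0.1],
    "max_epochs": [5, 10],
    "module__num_units": [10, 20],
}
gs = GridSearchCV(net, params, cv=3, scoring="accuracy")
gs.fit(X, y)
print(gs.best_params_, gs.best_score_)
```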