SWALR#
- class torch.optim.swa_utils.SWALR(optimizer, swa_lr, anneal_epochs=10, anneal_strategy='cos', last_epoch=-1)[source]#
Anneals the learning rate in each parameter group to a fixed value.
This learning rate scheduler is meant to be used with the Stochastic Weight Averaging (SWA) method (see torch.optim.swa_utils.AveragedModel).
- Parameters
optimizer (torch.optim.Optimizer) – wrapped optimizer
swa_lr (float or list) – the learning rate value for all param groups together, or a list with one value per group.
anneal_epochs (int) – number of epochs in the annealing phase (default: 10)
anneal_strategy (str) – "cos" or "linear"; specifies the annealing strategy: "cos" for cosine annealing, "linear" for linear annealing (default: "cos")
last_epoch (int) – the index of the last epoch (default: -1)
The SWALR scheduler can be used together with other schedulers to switch to a constant learning rate late in training, as in the example below.
Example
>>> loader, optimizer, model = ...
>>> lr_lambda = lambda epoch: 0.9
>>> scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer,
>>>     lr_lambda=lr_lambda)
>>> swa_scheduler = torch.optim.swa_utils.SWALR(optimizer,
>>>     anneal_strategy="linear", anneal_epochs=20, swa_lr=0.05)
>>> swa_start = 160
>>> for i in range(300):
>>>     for input, target in loader:
>>>         optimizer.zero_grad()
>>>         loss_fn(model(input), target).backward()
>>>         optimizer.step()
>>>     if i > swa_start:
>>>         swa_scheduler.step()
>>>     else:
>>>         scheduler.step()
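In practice SWALR is paired with an AveragedModel, which maintains the running average of the weights that SWA ultimately uses. A minimal runnable sketch of that pairing follows; the tiny linear model, the list-of-batches loader, and the epoch counts are hypothetical placeholders, not part of this API:

```python
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

# Hypothetical tiny model and data; "loader" is just a list of (input, target) batches.
model = nn.Linear(4, 1)
loader = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(5)]
loss_fn = nn.MSELoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer, lr_lambda=lambda epoch: 0.9)
swa_model = AveragedModel(model)
swa_scheduler = SWALR(optimizer, swa_lr=0.05, anneal_epochs=3, anneal_strategy="linear")

swa_start = 5  # epoch after which we switch to the SWA schedule
for epoch in range(10):
    for input, target in loader:
        optimizer.zero_grad()
        loss_fn(model(input), target).backward()
        optimizer.step()
    if epoch > swa_start:
        swa_model.update_parameters(model)  # fold current weights into the running average
        swa_scheduler.step()                # anneal toward, then hold at, swa_lr
    else:
        scheduler.step()

update_bn(loader, swa_model)  # recompute BatchNorm statistics for the averaged model
```

After the annealing phase (here 3 of the 4 post-`swa_start` epochs), the learning rate sits exactly at `swa_lr`, and `swa_model` holds the average of the 4 snapshots it was given.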
- get_last_lr()[source]#
Get the most recent learning rates computed by this scheduler.
- Returns
A list of learning rates with entries for each of the optimizer's param_groups, with the same types as their group["lr"]s.
Note
The returned Tensors are copies, and never alias the optimizer's group["lr"]s.
- get_lr()[source]#
Compute the next learning rate for each of the optimizer's param_groups.
Uses anneal_func to interpolate between each group's group["lr"] and group["swa_lr"] over anneal_epochs epochs. Once anneal_epochs is reached, keeps the learning rate fixed at group["swa_lr"].
- Returns
A list of learning rates for each of the optimizer's param_groups, with the same types as their current group["lr"]s.
Note
If you're trying to inspect the most recent learning rate, use get_last_lr() instead.
Note
The returned Tensors are copies, and never alias the optimizer's group["lr"]s.
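The interpolation described above can be sketched in plain Python. This is a hypothetical re-implementation of the idea for illustration (the function names and the exact iteration scheme are assumptions, not the scheduler's actual code):

```python
import math

def cosine_anneal(t):
    # t in [0, 1]: 0 at the start of the annealing phase, 1 at the end.
    return (1 - math.cos(math.pi * t)) / 2

def linear_anneal(t):
    return t

def annealed_lr(initial_lr, swa_lr, step, anneal_epochs, anneal_func):
    # Clamp so that after anneal_epochs steps the rate stays fixed at swa_lr.
    t = min(step / anneal_epochs, 1.0)
    alpha = anneal_func(t)
    return initial_lr * (1 - alpha) + swa_lr * alpha

# The rate moves from 0.1 toward 0.05, then holds there.
annealed_lr(0.1, 0.05, 0, 10, cosine_anneal)   # → 0.1
annealed_lr(0.1, 0.05, 10, 10, cosine_anneal)  # → 0.05
annealed_lr(0.1, 0.05, 20, 10, cosine_anneal)  # still 0.05
```

With the "linear" strategy the rate moves in equal increments; with "cos" it changes slowly at both ends of the annealing window and fastest in the middle.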
- load_state_dict(state_dict)[source]#
Load the scheduler’s state.
- Parameters
state_dict (dict) – scheduler state. Should be an object returned from a call to state_dict().
- state_dict()[source]#
Return the state of the scheduler as a dict.
It contains an entry for every variable in self.__dict__ which is not the optimizer or anneal_func.
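Because the state is a plain dict, checkpointing a partially annealed schedule is straightforward. A minimal sketch, assuming a throwaway linear model and SGD optimizer for illustration:

```python
import torch
from torch.optim.swa_utils import SWALR

model = torch.nn.Linear(2, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
swa_scheduler = SWALR(optimizer, swa_lr=0.05, anneal_epochs=5)
for _ in range(3):
    optimizer.step()
    swa_scheduler.step()

state = swa_scheduler.state_dict()  # plain dict; safe to persist with torch.save

# Later: rebuild the scheduler and restore where annealing left off.
restored = SWALR(optimizer, swa_lr=0.05, anneal_epochs=5)
restored.load_state_dict(state)
```

Note that the optimizer itself is not part of this state and must be checkpointed separately with its own state_dict().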
- step(epoch=None)[source]#
Step the scheduler.
- Parameters
epoch (int, optional) –
Deprecated since version 1.4: If provided, sets last_epoch to epoch and uses _get_closed_form_lr() if it is available. This is not universally supported. Use step() without arguments instead.
Note
Call this method after calling the optimizer's step().