Gradient-based Hyperparameter Optimization through Reversible Learning

Maclaurin, Dougal; Duvenaud, David; Adams, Ryan P.

arXiv:1502.03492 (stat)

[Submitted on 11 Feb 2015 (v1), last revised 2 Apr 2015 (this version, v3)]

Abstract: Tuning hyperparameters of learning algorithms is hard because gradients are usually unavailable. We compute exact gradients of cross-validation performance with respect to all hyperparameters by chaining derivatives backwards through the entire training procedure. These gradients allow us to optimize thousands of hyperparameters, including step-size and momentum schedules, weight initialization distributions, richly parameterized regularization schemes, and neural network architectures. We compute hyperparameter gradients by exactly reversing the dynamics of stochastic gradient descent with momentum.
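
To make the idea of reversing stochastic gradient descent with momentum concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the toy loss, the function names (grad, forward, reverse), and the particular form of the momentum update are all hypothetical choices made for this example. The sketch uses exact rational arithmetic (Python's fractions.Fraction) so that each step can be undone without rounding error. With ordinary floating-point numbers the division by the momentum coefficient would generally lose information, so a practical method needs some additional mechanism that this sketch does not attempt to reproduce.

```python
from fractions import Fraction


def grad(w):
    """Gradient of the hypothetical toy loss L(w) = 0.5 * w**2."""
    return w


def forward(w, v, lr, gamma, steps):
    """Run SGD with momentum for `steps` iterations (assumed update rule)."""
    for _ in range(steps):
        v = gamma * v - (1 - gamma) * grad(w)  # velocity update
        w = w + lr * v                         # weight update
    return w, v


def reverse(w, v, lr, gamma, steps):
    """Undo `forward` one step at a time, recovering the initial (w, v)."""
    for _ in range(steps):
        w = w - lr * v                           # undo the weight update
        v = (v + (1 - gamma) * grad(w)) / gamma  # undo the velocity update
    return w, v


# Exact rationals make every step invertible without rounding error.
w0, v0 = Fraction(3), Fraction(0)
lr, gamma = Fraction(1, 10), Fraction(9, 10)

w_T, v_T = forward(w0, v0, lr, gamma, steps=50)
w_back, v_back = reverse(w_T, v_T, lr, gamma, steps=50)
```

Because the training trajectory can be regenerated backwards from the final weights and velocity, a reverse-mode derivative pass can revisit each step without storing the whole trajectory in memory. In this sketch, (w_back, v_back) equals (w0, v0) exactly only because of the rational arithmetic assumed above.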

Comments:	10 figures. Submitted to ICML
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1502.03492 [stat.ML]
	(or arXiv:1502.03492v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1502.03492

Submission history

From: David Duvenaud
[v1] Wed, 11 Feb 2015 23:52:36 UTC (235 KB)
[v2] Fri, 13 Feb 2015 19:26:39 UTC (235 KB)
[v3] Thu, 2 Apr 2015 17:40:44 UTC (235 KB)
