Theoretical Analysis of Auto Rate-Tuning by Batch Normalization

Arora, Sanjeev; Li, Zhiyuan; Lyu, Kaifeng

arXiv:1812.03981 (cs)

[Submitted on 10 Dec 2018]

Abstract: Batch Normalization (BN) has become a cornerstone of deep learning across diverse architectures, appearing to help optimization as well as generalization. While the idea makes intuitive sense, theoretical analysis of its effectiveness has been lacking. Here theoretical support is provided for one of its conjectured properties, namely, the ability to allow gradient descent to succeed with less tuning of learning rates. It is shown that even if the learning rate of scale-invariant parameters (e.g., the weights of each layer with BN) is fixed to a constant (say, $0.3$), gradient descent still approaches a stationary point (i.e., a solution where the gradient is zero) at a rate of $T^{-1/2}$ in $T$ iterations, asymptotically matching the best bound for gradient descent with well-tuned learning rates. A similar result with convergence rate $T^{-1/4}$ is also shown for stochastic gradient descent.
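The intuition behind the fixed-learning-rate claim can be illustrated with a small sketch. It is not taken from the paper: the toy loss, the helper names (`loss`, `grad`, `run_gd`), and the starting point are all assumptions chosen for illustration. The sketch relies on a standard fact about any scale-invariant function $f$, i.e. one with $f(cw) = f(w)$ for $c > 0$: the gradient is orthogonal to $w$, so a gradient descent step with learning rate $\eta$ gives $\|w_{t+1}\|^2 = \|w_t\|^2 + \eta^2 \|\nabla f(w_t)\|^2$. The norm can only grow, which shrinks the step taken on the direction $w/\|w\|$ (roughly $\eta / \|w_t\|^2$) without any manual tuning.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss(w, u):
    # Hypothetical toy loss: 1 - cos(angle between w and u), with u a unit vector.
    # It depends only on the direction w / ||w||, so it is scale-invariant,
    # like the loss of a layer whose output is normalized.
    return 1.0 - dot(w, u) / math.sqrt(dot(w, w))

def grad(w, u):
    # Analytic gradient of the toy loss above with respect to w.
    norm = math.sqrt(dot(w, w))
    wu = dot(w, u)
    return [-(ui / norm - wu * wi / norm**3) for wi, ui in zip(w, u)]

def run_gd(w, u, lr=0.3, steps=200):
    # Plain gradient descent with a fixed, untuned learning rate.
    # Records (||w||^2, ||grad||) before each update.
    history = []
    for _ in range(steps):
        g = grad(w, u)
        history.append((dot(w, w), math.sqrt(dot(g, g))))
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w, history

u = [1.0, 0.0, 0.0]      # unit target direction (assumed)
w0 = [0.1, 0.2, -0.1]    # small initial weights, so early gradients are large
w_final, history = run_gd(w0, u)

print("squared norm, first five steps:", [round(h[0], 4) for h in history[:5]])
print("final gradient norm:", history[-1][1])
print("final loss:", loss(w_final, u))
```

Starting from a small norm, the first step is large and inflates $\|w\|$; later steps are automatically damped, and the gradient norm decays toward zero even though `lr` never changes. The sketch only demonstrates the mechanism on one toy function and says nothing about the $T^{-1/2}$ rate itself.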

Comments:	22 pages
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1812.03981 [cs.LG]
	(or arXiv:1812.03981v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1812.03981

Submission history

From: Zhiyuan Li
[v1] Mon, 10 Dec 2018 18:58:12 UTC (297 KB)