Mollifying Networks

Gulcehre, Caglar; Moczulski, Marcin; Visin, Francesco; Bengio, Yoshua

Computer Science > Machine Learning

arXiv:1608.04980 (cs)

[Submitted on 17 Aug 2016]

Title:Mollifying Networks

Authors:Caglar Gulcehre, Marcin Moczulski, Francesco Visin, Yoshua Bengio

View PDF

Abstract:The optimization of deep neural networks can be more challenging than traditional convex optimization problems due to the highly non-convex nature of the loss function, e.g. it can involve pathological landscapes such as saddle-surfaces that can be difficult to escape for algorithms based on simple gradient descent. In this paper, we attack the problem of optimization of highly non-convex neural networks by starting with a smoothed -- or \textit{mollified} -- objective function that gradually has a more non-convex energy landscape during the training. Our proposition is inspired by the recent studies in continuation methods: similar to curriculum methods, we begin learning an easier (possibly convex) objective function and let it evolve during the training, until it eventually goes back to being the original, difficult to optimize, objective function. The complexity of the mollified networks is controlled by a single hyperparameter which is annealed during the training. We show improvements on various difficult optimization tasks and establish a relationship with recent works on continuation methods for neural networks and mollifiers.

Subjects:	Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1608.04980 [cs.LG]
	(or arXiv:1608.04980v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1608.04980

Submission history

From: Çağlar Gülçehre [view email]
[v1] Wed, 17 Aug 2016 14:37:34 UTC (347 KB)

Computer Science > Machine Learning

Title:Mollifying Networks

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Mollifying Networks

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators