Unitary Evolution Recurrent Neural Networks

Arjovsky, Martin; Shah, Amar; Bengio, Yoshua


arXiv:1511.06464 (cs)

[Submitted on 20 Nov 2015 (v1), last revised 25 May 2016 (this version, v4)]


Abstract: Recurrent neural networks (RNNs) are notoriously difficult to train. When the eigenvalues of the hidden-to-hidden weight matrix deviate from absolute value 1, optimization becomes difficult due to the well-studied issue of vanishing and exploding gradients, especially when trying to learn long-term dependencies. To circumvent this problem, we propose a new architecture that learns a unitary weight matrix, with eigenvalues of absolute value exactly 1. The challenge we address is that of parameterizing unitary matrices in a way that does not require expensive computations (such as eigendecomposition) after each weight update. We construct an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned. Optimization with this parameterization becomes feasible only when considering hidden states in the complex domain. We demonstrate the potential of this architecture by achieving state-of-the-art results on several hard tasks involving very long-term dependencies.
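The idea of composing structured unitary building blocks can be illustrated with a minimal sketch. The particular blocks used here (diagonal phase matrices, Householder reflections, a fixed permutation, and a unitary discrete Fourier transform), their ordering, and all function names are assumptions chosen for illustration; the abstract does not name them. Each block is unitary, so their product is unitary too, and no eigendecomposition or re-projection is needed after the parameters change. The sketch also compares the composed map with rescaled copies whose eigenvalues have modulus 0.9 or 1.1, ignoring nonlinearities and inputs.

```python
import cmath
import math
import random


def diag_phase(thetas, v):
    """Apply D = diag(exp(i*theta_j)); every entry has modulus 1, so D is unitary."""
    return [cmath.exp(1j * t) * x for t, x in zip(thetas, v)]


def reflect(u, v):
    """Apply the Householder reflection R = I - 2 u u^* / ||u||^2, which is unitary."""
    norm_sq = sum(abs(a) ** 2 for a in u)
    inner = sum(a.conjugate() * x for a, x in zip(u, v))  # u^* v
    return [x - 2 * a * inner / norm_sq for a, x in zip(u, v)]


def permute(perm, v):
    """Apply a fixed permutation of the coordinates (a unitary matrix)."""
    return [v[p] for p in perm]


def dft(v, inverse=False):
    """Unitary DFT (scaled by 1/sqrt(n)); naive O(n^2) version for clarity."""
    n = len(v)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x * cmath.exp(sign * 2j * math.pi * j * k / n)
                  for j, x in enumerate(v))
        out.append(acc / math.sqrt(n))
    return out


def norm(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def make_params(n, rng):
    """Hypothetical parameter set: three phase vectors, two reflection vectors, one permutation."""
    phases = [[rng.uniform(-math.pi, math.pi) for _ in range(n)] for _ in range(3)]
    vecs = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(2)]
    perm = rng.sample(range(n), n)
    return phases, vecs, perm


def apply_w(params, v):
    """Apply the composed map; the ordering of blocks is an illustrative assumption."""
    (d1, d2, d3), (u1, u2), perm = params
    v = diag_phase(d1, v)
    v = dft(v)
    v = reflect(u1, v)
    v = permute(perm, v)
    v = diag_phase(d2, v)
    v = dft(v, inverse=True)
    v = reflect(u2, v)
    return diag_phase(d3, v)


def evolve(params, h, steps, scale=1.0):
    """Iterate h <- scale * W h; scale != 1 gives eigenvalues of modulus `scale`."""
    for _ in range(steps):
        h = [scale * x for x in apply_w(params, h)]
    return h


rng = random.Random(0)
params = make_params(8, rng)
h0 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
for scale in (0.9, 1.0, 1.1):
    ratio = norm(evolve(params, h0, 200, scale)) / norm(h0)
    print(f"scale={scale}: norm ratio after 200 steps = {ratio:.3e}")
```

With `scale=1.0` the norm ratio stays at 1 up to floating-point error, while the rescaled variants shrink or grow geometrically with the number of steps. In this linear setting the same matrix products govern backpropagated gradients, which is the vanishing and exploding behaviour the abstract refers to.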

Subjects:	Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
Cite as:	arXiv:1511.06464 [cs.LG]
	(or arXiv:1511.06464v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1511.06464

Submission history

From: Martin Arjovsky
[v1] Fri, 20 Nov 2015 00:37:33 UTC (802 KB)
[v2] Sat, 28 Nov 2015 18:42:08 UTC (854 KB)
[v3] Thu, 18 Feb 2016 00:52:28 UTC (914 KB)
[v4] Wed, 25 May 2016 23:34:38 UTC (919 KB)
