Learning Activation Functions: A new paradigm of understanding Neural Networks

Goyal, Mohit; Goyal, Rajan; Lall, Brejesh

arXiv:1906.09529v1 (cs)

[Submitted on 23 Jun 2019 (this version), latest version 9 Dec 2020 (v3)]

Abstract: There has been limited research in the domain of activation functions, most of which has focused on making neural networks (NNs) easier to optimize. However, to develop a deeper understanding of deep learning, it becomes important to look more carefully at the nonlinear component of NNs. In this paper, we aim to provide a generic form of activation function, along with appropriate mathematical grounding, so as to allow for future insights into the working of NNs. We propose "Self-Learnable Activation Functions" (SLAF), which are learned during training and are capable of approximating most of the existing activation functions. A SLAF is given as a weighted sum of pre-defined basis elements, which can serve as a good approximation of the optimal activation function. The coefficients of these basis elements allow a search over the entire space of continuous functions (which contains all the conventional activations). We propose various training routines that can be used to achieve good performance with SLAF-equipped neural networks (SLNNs). We prove that SLNNs can approximate any neural network with Lipschitz continuous activations to within any arbitrary error, highlighting their capacity and possible equivalence with standard NNs. Also, SLNNs can be completely represented as a collection of finite-degree polynomials up to the very last layer, obviating several hyperparameters such as width and depth. Since the optimization of SLNNs is still a challenge, we show that using SLAF along with standard activations (like ReLU) can provide performance improvements with only a small increase in the number of parameters.
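The "weighted sum of pre-defined basis elements" described in the abstract can be illustrated with a minimal sketch. It assumes a monomial basis (1, x, x^2, ...) and fits the coefficients to tanh by least squares on a fixed interval; the function name `slaf`, the degree, and the interval are illustrative choices, not taken from the paper, where the coefficients are learned during network training instead.

```python
import numpy as np

def slaf(x, coeffs):
    """Weighted sum of basis elements, assuming the monomial basis 1, x, x^2, ..."""
    powers = np.stack([x ** i for i in range(len(coeffs))], axis=-1)
    return powers @ coeffs

# Illustration only: fit the coefficients to tanh on [-2, 2] by least squares.
# (Assumption: in an actual SLNN these coefficients would be trained by
# gradient descent together with the network weights.)
xs = np.linspace(-2.0, 2.0, 201)
degree = 7
basis = np.stack([xs ** i for i in range(degree + 1)], axis=-1)
coeffs, *_ = np.linalg.lstsq(basis, np.tanh(xs), rcond=None)

# Largest deviation from tanh on the fitting grid.
max_err = np.max(np.abs(slaf(xs, coeffs) - np.tanh(xs)))
print(f"max |slaf - tanh| on [-2, 2]: {max_err:.2e}")
```

With coefficients `[0, 1]` the same function reduces to the identity, which shows how different coefficient choices select different activations from one shared basis.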

Comments:	Article submitted to the Journal of Machine Learning Research (JMLR) before being uploaded to arXiv
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:1906.09529 [cs.LG]
	(or arXiv:1906.09529v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.09529

Submission history

From: Mohit Goyal
[v1] Sun, 23 Jun 2019 01:54:36 UTC (371 KB)
[v2] Mon, 8 Jul 2019 14:56:10 UTC (876 KB)
[v3] Wed, 9 Dec 2020 04:13:25 UTC (390 KB)
