An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks
Abstract
1. Introduction
2. Related Work
3. Proposed EANAF
3.1. Formulation
3.2. Analysis of Proposed EANAF
3.2.1. Smoothness
3.2.2. Asymmetry
3.2.3. Unsaturation
3.2.4. Non-Monotonicity and Self-Regularity
3.3. Discussions
4. Experimental Results
4.1. Experimental Settings
4.2. Comparison Experiments on ResNet
4.3. Experimental Results on CSPDarknet
4.4. Comparisons of Efficiency
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. LeCun, Y.; Bottou, L. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
2. Gatys, L.A.; Ecker, A.S.; Bethge, M. Image Style Transfer Using Convolutional Neural Networks. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
3. Cheng, J.; Dong, L.; Lapata, M. Long Short-Term Memory-Networks for Machine Reading. arXiv 2016, arXiv:1601.06733.
4. Bishop, C.M. Neural Networks for Pattern Recognition. In Advances in Computers; Clarendon Press: Oxford, UK, 1993; pp. 119–166.
5. Yukun, S.; Xiaohang, G.; Duoli, Z.; Gaoming, D. The piecewise non-linear approximation of the sigmoid function and its implementation in FPGA. Appl. Electron. Technol. 2017, 43, 49–51.
6. Apicella, A.; Donnarumma, F.; Isgrò, F.; Prevete, R. A survey on modern trainable activation functions. Neural Netw. 2021, 138, 14–32.
7. Szandała, T. Review and Comparison of Commonly Used Activation Functions for Deep Neural Networks. In Bio-Inspired Neurocomputing; Springer: Singapore, 2020; pp. 203–224.
8. Bingham, G.; Miikkulainen, R. Discovering Parametric Activation Functions. Neural Netw. 2022, 148, 48–65.
9. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2016, arXiv:1512.03385.
10. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 42, 318–327.
11. Bochkovskiy, A.; Wang, C.Y.; Liao, H. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
12. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision 2014, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 740–755.
13. Huaguang, Z.; Zhiliang, W.; Ming, L.I.; Quan, Y.-B. Generalized Fuzzy Hyperbolic Model: A Universal Approximator. J. Autom. Sin. 2004, 30, 416–422.
14. Chang, C.H.; Zhang, E.H.; Huang, S.H. Softsign Function Hardware Implementation Using Piecewise Linear Approximation. In Proceedings of the 2019 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Taipei, Taiwan, 3–6 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–2.
15. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems; Association for Computing Machinery: New York, NY, USA, 2012; pp. 1097–1105.
16. Nair, V.; Hinton, G.E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 21–24 June 2010; Omnipress: Madison, WI, USA, 2010; pp. 807–814.
17. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
18. Maas, A.L.; Hannun, A.Y.; Ng, A.Y. Rectifier nonlinearities improve neural network acoustic models. In Proceedings of the ICML, Atlanta, GA, USA, 16–21 June 2013; p. 3.
19. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. arXiv 2015, arXiv:1502.01852.
20. Xu, B.; Wang, N.; Chen, T.; Li, M. Empirical evaluation of rectified activations in convolutional network. arXiv 2015, arXiv:1505.00853.
21. El Jaafari, I.; Ellahyani, A.; Charfi, S. Parametric rectified nonlinear unit (PRenu) for convolution neural networks. Signal Image Video Process. 2021, 15, 241–246.
22. Clevert, D.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units. arXiv 2015, arXiv:1511.07289.
23. Barron, J.T. Continuously Differentiable Exponential Linear Units. arXiv 2017, arXiv:1704.07483v1.
24. Klambauer, G.; Unterthiner, T.; Mayr, A.; Hochreiter, S. Self-Normalizing Neural Networks. arXiv 2017, arXiv:1706.02515.
25. Hendrycks, D.; Gimpel, K. Gaussian error linear units (GELUs). arXiv 2016, arXiv:1606.08415.
26. Chao, Y.; Su, Z. Symmetrical Gaussian Error Linear Units (SGELUs). arXiv 2019, arXiv:1911.03925.
27. Dugas, C.; Bengio, Y.; Belisle, F.; Nadeau, C. Incorporating second order functional knowledge into learning algorithms. In Advances in Neural Information Processing Systems 13, Proceedings of the 2000 Neural Information Processing Systems (NIPS) Conference, Denver, CO, USA, 28–30 November 2000; MIT Press: Cambridge, MA, USA, 2000; pp. 472–478.
28. Ramachandran, P.; Zoph, B.; Le, Q.V. Searching for activation functions. arXiv 2017, arXiv:1710.05941.
29. Misra, D. Mish: A Self Regularized Non-Monotonic Neural Activation Function. arXiv 2020, arXiv:1908.08681.
30. Howard, A.; Sandler, M.; Chu, G.; Chen, L.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V.; et al. Searching for MobileNetV3. arXiv 2019, arXiv:1905.02244.

Property | ReLU [15,16] | LeakyReLU [18] | ReLU6 [17] | ELU [22] | GELU [25] | Sigmoid [5] | Tanh [13] | Softsign [14] | Softplus [27] | Swish [28] | h-Swish [30] | EANAF
---|---|---|---|---|---|---|---|---|---|---|---|---
Smoothness | × | × | × | √ | √ | √ | √ | √ | √ | √ | × | √
Asymmetry | √ | √ | × | √ | √ | × | × | × | √ | √ | √ | √
Unsaturation | √ | √ | × | √ | √ | × | × | × | × | √ | √ | √
Non-monotonicity and self-regularity | √ | √ | × | √ | × | × | × | × | √ | √ | √ | √
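
Several of the properties listed in the table can be probed numerically for activations whose formulas are standard. The following is a minimal sketch under stated assumptions: it uses the textbook definitions of ReLU, Tanh, and Swish (with β = 1) only, it does not implement EANAF because its formula is not reproduced in this section, and the helper names, sample points, and tolerances are hypothetical choices made for illustration.

```python
import math

# Textbook definitions (assumed here; EANAF itself is not implemented).
def relu(x):
    return max(0.0, x)

def swish(x):
    # Swish with beta = 1, i.e. x * sigmoid(x).
    return x / (1.0 + math.exp(-x))

def kink_free_at_zero(f, h=1e-6, tol=1e-3):
    # Compare one-sided difference quotients at 0; a mismatch signals a kink.
    right = (f(h) - f(0.0)) / h
    left = (f(0.0) - f(-h)) / h
    return abs(right - left) < tol

def is_odd(f, samples=(0.5, 1.0, 2.0, 5.0), tol=1e-9):
    # Odd symmetry about the origin: f(-x) == -f(x) at every sample.
    return all(abs(f(-x) + f(x)) < tol for x in samples)

def saturates_right(f, tol=1e-6):
    # A saturating function stops growing for large positive inputs.
    return abs(f(50.0) - f(25.0)) < tol

def min_on_grid(f, lo=-10.0, step=0.01, n=2001):
    # Location and value of the smallest output on a uniform grid.
    xs = [lo + step * i for i in range(n)]
    return min(((x, f(x)) for x in xs), key=lambda p: p[1])

for name, f in (("ReLU", relu), ("Tanh", math.tanh), ("Swish", swish)):
    print(name,
          "kink-free at 0:", kink_free_at_zero(f),
          "| odd-symmetric:", is_odd(f),
          "| saturates for x > 0:", saturates_right(f))

x_min, y_min = min_on_grid(swish)
print(f"Swish dips to about {y_min:.3f} near x = {x_min:.2f}")
```

The kink test only inspects the origin, which is where the piecewise-linear functions in the table change slope; it is not a general smoothness proof. The negative dip of Swish found by `min_on_grid` is the feature usually associated with non-monotonic activations.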

Activation Function | mAP | AP50 | AP75
---|---|---|---
ReLU [15,16] | 32.5% | 50.9% | 34.8%
Swish [28] | 35.7% | 53.1% | 36.8%
Proposed EANAF | 37.1% | 60.0% | 37.4%

Activation Function | mAP | AP50 | AP75
---|---|---|---
ReLU [15,16] | 39.6% | 58.6% | 42.3%
Swish [28] | 41.2% | 63.8% | 45.3%
Proposed EANAF | 43.2% | 65.7% | 47.3%
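
The two accuracy tables are easier to compare when the gains of EANAF over each baseline are stated in percentage points. The short sketch below recomputes them from the reported values; the dictionary keys `table_a` and `table_b` are hypothetical labels for the two tables in their order of appearance, and no numbers beyond those in the tables are used.

```python
# Reported (mAP, AP50, AP75) in percent, copied from the two accuracy tables.
results = {
    "table_a": {
        "ReLU": (32.5, 50.9, 34.8),
        "Swish": (35.7, 53.1, 36.8),
        "EANAF": (37.1, 60.0, 37.4),
    },
    "table_b": {
        "ReLU": (39.6, 58.6, 42.3),
        "Swish": (41.2, 63.8, 45.3),
        "EANAF": (43.2, 65.7, 47.3),
    },
}

def gains(table, baseline):
    # Percentage-point difference EANAF - baseline for each metric.
    return tuple(round(e - b, 1)
                 for e, b in zip(table["EANAF"], table[baseline]))

for label, table in results.items():
    for baseline in ("ReLU", "Swish"):
        d_map, d_ap50, d_ap75 = gains(table, baseline)
        print(f"{label}: EANAF vs {baseline}: "
              f"mAP +{d_map}, AP50 +{d_ap50}, AP75 +{d_ap75}")
```

These are absolute differences in percentage points, not relative improvements.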

Activation | Data Type | Forward Pass | Backward Pass
---|---|---|---
ReLU | FP32 | 224.2 μs ± 621.8 ns | 419.3 μs ± 1.238 μs
Swish | FP32 | 342.7 μs ± 1.026 μs | 497.3 μs ± 1.357 μs
EANAF | FP32 | 372.0 μs ± 1.852 μs | 529.1 μs ± 1.882 μs
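
The efficiency table can be summarized as relative overheads. The sketch below divides the reported mean timings of EANAF by those of ReLU and Swish; it uses only the mean values from the table, ignores the reported spreads, and the variable and function names are hypothetical.

```python
# Mean timings in microseconds, copied from the efficiency table (FP32).
timings_us = {
    "ReLU": {"forward": 224.2, "backward": 419.3},
    "Swish": {"forward": 342.7, "backward": 497.3},
    "EANAF": {"forward": 372.0, "backward": 529.1},
}

def overhead(baseline, phase):
    # Ratio of EANAF's mean time to the baseline's mean time for one phase.
    return timings_us["EANAF"][phase] / timings_us[baseline][phase]

def total_overhead(baseline):
    # Same ratio for forward and backward passes combined.
    total = lambda name: sum(timings_us[name].values())
    return total("EANAF") / total(baseline)

for baseline in ("ReLU", "Swish"):
    print(f"EANAF vs {baseline}: "
          f"forward x{overhead(baseline, 'forward'):.3f}, "
          f"backward x{overhead(baseline, 'backward'):.3f}, "
          f"combined x{total_overhead(baseline):.3f}")
```

On these figures EANAF is within roughly 10% of Swish in both passes, while the gap to ReLU is considerably larger in the forward pass than in the backward pass.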

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chai, E.; Yu, W.; Cui, T.; Ren, J.; Ding, S. An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks. Symmetry 2022, 14, 1027. https://doi.org/10.3390/sym14051027