A Shallow Network with Combined Pooling for Fast Traffic Sign Recognition
Abstract
1. Introduction
2. Related Works
3. The Shallow CNNs
3.1. Convolutional Layer
3.2. Subsampling Layer
3.3. Full-Connected Layer & Softmax-Loss Layer
3.4. Overall Architecture
4. Experiments
4.1. Dataset
4.2. Experimental Analysis
- HOGv+KELM [15]: It proposes a new method combining the ELM algorithm and the HOGv feature. The features are extracted with HOGv, an improved variant of HOG.
- SHOG5-SBRP2 [16]: It proposes a compact yet discriminative SHOG descriptor, and chooses two sparse analytical non-linear classifiers for classification.
- Complementary Features [17]: The extracted 6252-D feature vector combines a 2560-D HOG feature, a 1568-D Gabor filter feature, and a 2124-D LBP feature.
- HOS-LDA [18]: It extracts the features by HOS-based entropies and textures, and uses LDA to maximize the between-class covariance while minimizing the within-class covariance.
- Multi-scale CNNs [24]: The output of every stage of the automatically learned hierarchy of invariant features is fed to the classifier. Features are learned in these CNNs.
- Committee of CNNs [23]: It is a collection of CNNs in which a single CNN has seven hidden layers. Features are learned in these CNNs.
- Human (best individual) [33]: Eight test persons were confronted with a randomly selected, but fixed subset of 500 images of the validation set. The best-performing one was selected to classify the test set.
- Ensemble CNNs [25]: It proposes a hinge-loss stochastic gradient descent method to train CNNs. Features are learned in these CNNs.
- CNN+ELM [26]: It takes the CNNs as the feature extractor while removing the full-connected layer after training. The ELM is chosen as the classifier. Features are learned in these CNNs.
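The committee entry in the list above describes a collection of CNNs without saying how their outputs are merged. The following is a minimal sketch of one common way to do it, assuming each member outputs per-class probabilities and the committee averages them; the averaging rule, the function names, and the example values are illustrative assumptions, not details taken from the text.

```python
def average_committee(prob_rows):
    """Average the per-class probabilities produced by several classifiers.

    prob_rows holds one probability list per committee member, all for the
    same input image (hypothetical data layout).
    """
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    return [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]


def predict(prob_rows):
    """Return the index of the class with the highest averaged probability."""
    avg = average_committee(prob_rows)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical outputs of three committee members over three classes.
member_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
committee_label = predict(member_probs)
print(committee_label)
```

In this toy case the first member alone would pick class 0, while the averaged committee picks class 1, which is the kind of disagreement an ensemble is meant to smooth out.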
5. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
References
- Wollmer, M.; Blaschke, C.; Schindl, T. Online driver distraction detection using long short-term memory. IEEE Trans. Intell. Transp. Syst. 2011, 12, 574–582.
- Qadri, M.T.; Asif, M. Automatic number plate recognition system for vehicle identification using optical character recognition. In Proceedings of the 2009 International Conference on Education Technology and Computer, Singapore, 17–20 April 2009; pp. 335–338.
- Wu, T.; Ranganathan, A. A practical system for road marking detection and recognition. In Proceedings of the 2012 International Conference on the Intelligent Vehicles Symposium (IV), Madrid, Spain, 3–7 June 2012; pp. 25–30.
- De la Escalera, A.; Armingol, J.M.; Mata, M. Traffic sign recognition and analysis for intelligent vehicles. Image Vis. Comput. 2003, 21, 247–258.
- Fu, M.Y.; Huang, Y.S. A survey of traffic sign recognition. In Proceedings of the 2010 International Conference on Wavelet Analysis and Pattern Recognition, Qingdao, China, 11–14 July 2010; pp. 119–124.
- Gudigar, A.; Chokkadi, S.; Raghavendra, U.; Acharya, U.R. Multiple thresholding and subspace based approach for detection and recognition of traffic sign. Multimed. Tools Appl. 2016, 76, 1–19.
- Geronimo, D.; Lopez, A.M.; Sappa, A.D. Survey of pedestrian detection for advanced driver assistance systems. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1239–1258.
- Mogelmose, A.; Trivedi, M.M.; Moeslund, T.B. Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey. IEEE Trans. Intell. Transp. Syst. 2012, 13, 1484–1497.
- Gudigar, A.; Chokkadi, S.; Raghavendra, U. A review on automatic detection and recognition of traffic sign. Multimed. Tools Appl. 2016, 75, 333–364.
- Liu, H.; Liu, Y.; Sun, F. Robust exemplar extraction using structured sparse coding. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 1816–1821.
- Lowe, D.G. Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; pp. 1150–1157.
- Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Hawaii, HI, USA, 8–14 December 2001; pp. 511–518.
- Liu, H.; Liu, Y.; Sun, F. Traffic sign recognition using group sparse coding. Inf. Sci. 2014, 266, 75–89.
- Sun, Z.L.; Wang, H.; Lau, W.S. Application of BW-ELM model on traffic sign recognition. Neurocomputing 2014, 128, 153–159.
- Huang, Z.; Yu, Y.; Gu, J. An efficient method for traffic sign recognition based on extreme learning machine. IEEE Trans. Cybern. 2016, 47, 920–933.
- Kassani, P.H.; Teoh, A.B.J. A new sparse model for traffic sign classification using soft histogram of oriented gradients. Appl. Soft Comput. 2017, 52, 231–346.
- Tang, S.; Huang, L.L. Traffic sign recognition using complementary features. In Proceedings of the 2013 2nd IAPR Asian Conference on Pattern Recognition, Okinawa, Japan, 5–8 November 2013; pp. 210–214.
- Gudigar, A.; Chokkadi, S.; Raghavendra, U. Local texture patterns for traffic sign recognition using higher order spectra. Pattern Recognit. Lett. 2017.
- Lange, S.; Riedmiller, M. Deep auto-encoder neural networks in reinforcement learning. In Proceedings of the 2010 International Joint Conference on Neural Networks (IJCNN), Barcelona, Spain, 18–23 July 2010; pp. 1–8.
- Hinton, G.E. Deep belief networks. Scholarpedia 2009, 4, 5947.
- Krizhevsky, A.; Sutskever, I.; Hinton, G. Imagenet classification with deep convolutional neural networks. In Proceedings of the Twenty-Sixth Annual Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105.
- Lawrence, S.; Giles, C.L.; Tsoi, A.C. Face recognition: A convolutional neural-network approach. IEEE Trans. Neural Netw. 1997, 8, 98–113.
- Cireşan, D.; Meier, U.; Masci, J. A committee of neural networks for traffic sign classification. In Proceedings of the 2011 International Joint Conference on Neural Networks (IJCNN), San Jose, CA, USA, 31 July–5 August 2011; pp. 1918–1921.
- Sermanet, P.; LeCun, Y. Traffic sign recognition with multi-scale convolutional networks. In Proceedings of the 2011 International Joint Conference on Neural Networks (IJCNN), San Jose, CA, USA, 31 July–5 August 2011; pp. 2809–2813.
- Jin, J.; Fu, K.; Zhang, C. Traffic sign recognition with hinge loss trained convolutional neural networks. IEEE Trans. Intell. Transp. Syst. 2014, 15, 1991–2000.
- Zeng, Y.; Xu, X.; Fang, Y. Traffic sign recognition using deep convolutional networks and extreme learning machine. In Proceedings of the International Conference on Intelligent Science and Big Data Engineering, Suzhou, China, 14–16 June 2015; pp. 272–280.
- Boureau, Y.L.; Ponce, J.; LeCun, Y. A theoretical analysis of feature pooling in visual recognition. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 111–118.
- Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. Aistats 2011, 15, 315–323.
- He, K.; Zhang, X.; Ren, S. Spatial pyramid pooling in deep convolutional networks for visual recognition. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 346–361.
- Boureau, Y.L.; Bach, F.; LeCun, Y. Learning mid-level features for recognition. In Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York, NY, USA, 13–18 June 2010; pp. 2559–2566.
- Szegedy, C.; Liu, W.; Jia, Y. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 July 2015; pp. 1–9.
- Vedaldi, A.; Lenc, K. Matconvnet: Convolutional neural networks for MATLAB. In Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia, 26–30 October 2015; pp. 689–692.
- Stallkamp, J.; Schlipsing, M.; Salmen, J. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Netw. 2012, 32, 323–332.
Layer | Type | Number of Maps and Neurons | Kernel Size | Stride | Pad |
---|---|---|---|---|---|
1 | input | 3 maps of 32 × 32 neurons | - | - | - |
2 | convolutional | 32 maps of 32 × 32 neurons | 5 × 5 | 1 | 2 |
2 | ReLU | 32 maps of 32 × 32 neurons | - | - | 0 |
3 | average-pooling | 32 maps of 16 × 16 neurons | 3 × 3 | 2 | [0 1 0 1] |
4 | convolutional | 32 maps of 16 × 16 neurons | 5 × 5 | 1 | 2 |
4 | ReLU | 32 maps of 16 × 16 neurons | - | - | 0 |
5 | average-pooling | 32 maps of 8 × 8 neurons | 3 × 3 | 2 | [0 1 0 1] |
6 | convolutional | 64 maps of 8 × 8 neurons | 5 × 5 | 1 | 2 |
6 | ReLU | 64 maps of 8 × 8 neurons | - | - | 0 |
7 | max-pooling | 64 maps of 4 × 4 neurons | 3 × 3 | 2 | [0 1 0 1] |
8 | convolutional | 64 maps of 1 × 1 neurons | 4 × 4 | 1 | 0 |
8 | ReLU | 64 maps of 1 × 1 neurons | - | - | 0 |
9 | full-connected | 64 maps of 1 × 1 neurons | 1 × 1 | 1 | 0 |
10 | softmax-loss | 43 neurons | - | - | - |
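The map sizes in the architecture table can be checked with the usual output-size formula, floor((size + pad_before + pad_after - kernel) / stride) + 1. The sketch below is a hedged illustration: it assumes the pad entry [0 1 0 1] follows the MatConvNet [top bottom left right] order (no padding before, one pixel after, on each axis) and that a scalar pad applies to both sides. The helper names are hypothetical.

```python
def out_size(size, kernel, stride, pad_before, pad_after):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + pad_before + pad_after - kernel) // stride + 1


# (name, kernel, stride, pad_before, pad_after), read off the table rows.
# ReLU and the 1 x 1 full-connected layer keep the size and are omitted.
LAYERS = [
    ("conv 5x5", 5, 1, 2, 2),
    ("average-pooling 3x3", 3, 2, 0, 1),
    ("conv 5x5", 5, 1, 2, 2),
    ("average-pooling 3x3", 3, 2, 0, 1),
    ("conv 5x5", 5, 1, 2, 2),
    ("max-pooling 3x3", 3, 2, 0, 1),
    ("conv 4x4", 4, 1, 0, 0),
]


def trace_sizes(input_size, layers):
    """Return the map size after each layer, starting from input_size."""
    sizes = []
    size = input_size
    for _name, kernel, stride, pad_before, pad_after in layers:
        size = out_size(size, kernel, stride, pad_before, pad_after)
        sizes.append(size)
    return sizes


sizes = trace_sizes(32, LAYERS)
for (name, *_), size in zip(LAYERS, sizes):
    print(f"{name}: {size} x {size}")
```

Under these assumptions the trace reproduces the 32, 16, 16, 8, 8, 4, 1 progression listed in the table, which shows why the asymmetric one-pixel pad is needed for the 3 × 3, stride-2 pooling to halve each map exactly.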
Method | Speed Limits | Other Prohibitions | Derestriction | Mandatory | Danger | Unique |
---|---|---|---|---|---|---|
HOGv+KELM [15] | 99.54 | 100 | 98.33 | 99.94 | 98.96 | 99.95 |
Complementary Features [17] | 98.56 | 99.73 | 92.50 | 99.55 | 97.31 | 99.90 |
Multi-Scale CNNs [24] | 98.61 | 99.87 | 94.44 | 97.18 | 98.03 | 98.63 |
Committee of CNNs [23] | 99.47 | 99.93 | 99.72 | 99.89 | 99.07 | 99.22 |
Human (Best Individual) [33] | 98.32 | 99.87 | 98.89 | 100 | 99.21 | 100 |
Our Method | 99.93 | 99.80 | 99.44 | 100 | 99.13 | 99.90 |
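To make the per-category comparison easier to read, the short sketch below finds the top entry in each column of the results table. The numbers are copied from the table and treated as percentages; the variable and function names are hypothetical, and ties are reported as a list rather than broken arbitrarily.

```python
# Per-category values copied from the results table (assumed to be accuracy in %).
CATEGORIES = [
    "Speed Limits", "Other Prohibitions", "Derestriction",
    "Mandatory", "Danger", "Unique",
]
ACCURACY = {
    "HOGv+KELM [15]": [99.54, 100, 98.33, 99.94, 98.96, 99.95],
    "Complementary Features [17]": [98.56, 99.73, 92.50, 99.55, 97.31, 99.90],
    "Multi-Scale CNNs [24]": [98.61, 99.87, 94.44, 97.18, 98.03, 98.63],
    "Committee of CNNs [23]": [99.47, 99.93, 99.72, 99.89, 99.07, 99.22],
    "Human (Best Individual) [33]": [98.32, 99.87, 98.89, 100, 99.21, 100],
    "Our Method": [99.93, 99.80, 99.44, 100, 99.13, 99.90],
}


def best_per_category(accuracy, categories):
    """Map each category to (top value, sorted list of methods that reach it)."""
    result = {}
    for i, category in enumerate(categories):
        top = max(row[i] for row in accuracy.values())
        winners = sorted(name for name, row in accuracy.items() if row[i] == top)
        result[category] = (top, winners)
    return result


best = best_per_category(ACCURACY, CATEGORIES)
for category, (top, winners) in best.items():
    print(f"{category}: {top} ({', '.join(winners)})")
```

Read this way, the proposed method leads on speed limits and ties the best human on mandatory signs, while other entries lead the remaining columns.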
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhang, J.; Huang, Q.; Wu, H.; Liu, Y. A Shallow Network with Combined Pooling for Fast Traffic Sign Recognition. Information 2017, 8, 45. https://doi.org/10.3390/info8020045