Paper 3
https://doi.org/10.1007/s00500-020-04860-5
Abstract
The physical movement of the human hand produces gestures, and hand gesture recognition supports advances in
automated vehicle movement systems. In this paper, human hand gestures are detected and recognized using a
convolutional neural network (CNN) classification approach. The process flow consists of hand region of interest
segmentation using a mask image, finger segmentation, normalization of the segmented finger image and finger
recognition using the CNN classifier. The hand region is segmented from the whole image using mask images. Adaptive
histogram equalization is used as the enhancement method to improve the contrast of the image. A connected component
analysis algorithm is used to segment the fingertips from the hand image. The segmented finger regions are given to
the CNN classifier, which classifies the image into various classes. The proposed hand gesture detection and
recognition methodology using the CNN classification approach with the enhancement technique achieves high
performance compared with state-of-the-art methods.
15240 P. S. Neethu et al.
2 Literature survey

Zuocai Wang et al. (2018) proposed a hand gesture recognition system using a particle filtering approach. The authors
applied this filtering approach to hand gesture images with the same background and obtained 92.1% sensitivity,
84.7% specificity and 90.6% accuracy. Suguna and Neethu (2017) extracted shape features from hand gesture images for
the classification of hand gesture images into various classes. These extracted features were then trained and
classified using the k-means clustering algorithm. Marium et al. (2017) proposed a hand gesture recognition system
using a convexity algorithm approach. The authors

3 Proposed methodology

In this paper, human hand gestures are detected and recognized using a CNN classification approach. The process flow
consists of hand ROI segmentation using a mask image, finger segmentation, normalization of the segmented finger
image and finger recognition using the CNN classifier. Figure 2 shows the proposed flow of the hand gesture
recognition system.

The proposed algorithm for the hand gesture recognition system is given in the following.
An efficient method for human hand gesture detection and recognition using deep learning… 15241
Start;
End;
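The finger segmentation step in the process flow relies on connected component analysis. The paper's own procedure is not recoverable from the text, so the following is only a minimal sketch of the general technique: it labels 4-connected foreground regions of a binary mask with a breadth-first search. The function name `label_components`, the choice of 4-connectivity and the toy mask are assumptions made for illustration.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask.

    mask is a list of rows containing 0 (background) or 1 (foreground).
    Returns (labels, count), where labels has the same shape as mask and
    holds 0 for background or a region index starting at 1.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Unvisited foreground pixel: start a new region here.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Hypothetical toy mask: three separate "fingers" above a "palm" row.
toy_mask = [
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
labels, count = label_components(toy_mask)
print(count)
```

In a real pipeline the mask would come from the hand segmentation stage, and each labelled region could then be cropped and normalized before classification.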
Gaussian filter of size 3 * 3 for 128 filters, the fourth convolutional layer is designed with a Gaussian filter of
size 3 * 3 for 256 filters, and the fifth convolutional layer is designed with a Gaussian filter of size 3 * 3 for
512 filters. The final fully connected layer is a standard feed forward neural network. This fully connected layer
produces the final classification responses.

3.3.1 Convolutional layers

Each convolutional layer acts as a feature extractor and extracts an individual feature set from the input source
image for the classification process. The feature map is constructed by integrating the neuron factors which are
obtained from the convolutional layers. All neurons within a feature map have weights that are constrained to be
equal; however, different feature maps within the same convolutional layer have different weights so that several
features can be extracted at each location.

The feature map at the ith level can be determined using the following equation,

$$Y_i = f(W_i I)$$

where $W_i$ is the internal weight at the ith level and $I$ is the input source image.

3.3.2 Pooling layers

The purpose of the pooling layers is to reduce the spatial resolution of the feature maps which are obtained from
the convolutional layers. Two pooling techniques are used in pattern recognition, average pooling and max pooling,
as depicted in Fig. 8. Average pooling does not preserve the original pixel values of the source image, whereas max
pooling retains the original pixel values of the source hand gesture image. Hence, in this paper, the max pooling
aggregation methodology is used, which determines the maximum value from each feature map and passes these maximum
values to the next layer. This can be illustrated in the following equation.

$$P_i = \max(Y_i)$$

In this paper, max pooling with a filter of size 2 × 2 and a stride of 2 is used, a configuration that is common in
practice. This paper uses four max pooling layers in order to obtain the optimum classification accuracy.
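The feature map and pooling equations can be traced on a small example. The sketch below is illustrative only and rests on several assumptions that the text does not state: the product of the weight and the image is taken to be a 2D sliding-window (valid) convolution, the function f is taken to be a ReLU, and the 3 * 3 kernel uses unnormalized integer Gaussian-like weights. The helper names and the 6 x 6 ramp image are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (valid mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Assumed choice for f: clamp negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by stride."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, stride):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, stride):
            row.append(max(feature_map[r + i][c + j]
                           for i in range(size) for j in range(size)))
        pooled.append(row)
    return pooled


# Hypothetical 6 x 6 input whose pixel value is 6 * row + column.
image = [[6 * r + c for c in range(6)] for r in range(6)]
# Assumed unnormalized 3 x 3 Gaussian-like weights (they sum to 16).
kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]

feature_map = relu(conv2d_valid(image, kernel))  # Y_i = f(W_i I), 4 x 4
pooled = max_pool(feature_map, size=2, stride=2)  # P_i = Max(Y_i), 2 x 2
print(pooled)  # [[224, 256], [416, 448]]
```

With a 2 × 2 window and a stride of 2, each pooling layer halves both spatial dimensions, which is why the 4 x 4 feature map becomes 2 x 2 here.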
Fig. 7 Internal architecture of proposed CNN classifier for hand gesture recognition
The proposed hand gesture detection and recognition methodology is simulated using Python version 2.7 open-source
simulation software. This open-source software is distributed through Python scientific distributions. The Python
software package includes the Spyder, Anaconda, Keras, pandas and Theano modules. These modules are license free and
available as open tools. Each module is integrated in the Python kernel, and the Python programming language is used
to simulate the proposed work. The Python software is installed on Windows 8 with 4 GB of internal memory and
executed on a Core i3 processor.

The proposed hand gesture detection and recognition methodology is applied to the images which are openly available
in the Kawulok et al. (2012) dataset. This dataset

where TP is the true positive, which represents the total number of correctly recognized hand gesture images (154
images), and TN is the true negative, which represents the total number of correctly recognized non-hand gesture
images (50 images). FP is the false positive, which represents the total number of wrongly recognized hand gesture
images (six images), and FN is the false negative, which represents the total number of wrongly recognized non-hand
gesture images (five images). The values of sensitivity, specificity and accuracy lie between 0 and 100 and are
expressed in %. Higher values of these parameters indicate a higher efficiency of the proposed hand gesture
detection and recognition methodology. Table 1 shows the analysis of the recognition rate of the proposed method
with respect to different gesture class images and pooling techniques.

Table 1 Analysis of recognition rate of proposed method with respect to different gesture class images and pooling techniques

| Gesture class | Number of gesture images | Number of gestures correctly recognized | Recognition rate (%) using average pooling | Recognition rate (%) using max pooling |
|---|---|---|---|---|
| Class 1 | 200 | 196 | 98 | 99 |
| Class 2 | 200 | 195 | 97.5 | 98.5 |
| Class 3 | 200 | 194 | 97 | 98.5 |
| Class 4 | 200 | 194 | 97 | 99 |
| Class 5 | 200 | 193 | 96.5 | 99 |
| Class 6 | 200 | 192 | 96 | 98.5 |
| Class 7 | 200 | 193 | 96.5 | 98.5 |
| Class 8 | 200 | 193 | 96.5 | 98.5 |
| Overall | 1600 | 1550 | 96.8 | 98.7 |

Table 2 shows the performance analysis of the proposed hand gesture recognition method in terms of sensitivity,
specificity and accuracy with recognition rate. The proposed hand gesture detection and recognition methodology
using the CNN classification approach stated in this paper achieves 96.8% sensitivity, 89.2% specificity, 94.8%
accuracy and a 96.2% recognition rate.

Table 2 Performance analysis of proposed hand gesture recognition method

| Performance analysis parameters | Experimental results (%) |
|---|---|
| Sensitivity | 98.1 |
| Specificity | 93.4 |
| Accuracy | 96.2 |
| Recognition rate | 98.7 |

When the proposed hand gesture detection methodology is applied with the CNN classification approach only, it
achieves 91.5% sensitivity, 82.7% specificity and 91.6% accuracy with a 90.7% recognition rate, as stated in
Table 3. These simulation results are not suitable for real-time recognition of hand gesture images for different
applications. Hence, the CNN classification approach is combined with the CC algorithm in order to improve the
performance of the hand gesture detection system. This integrated approach achieved 96.8% sensitivity, 89.2%
specificity and 94.8% accuracy with a 96.2% recognition rate.

The CNN with CC methodology improved the sensitivity rate by 5.4%, the specificity rate by 7.8%, the accuracy rate
by 3.4% and the recognition rate by 6% over the CNN without CC methodology. The proposed hand gesture recognition
system (AHE + connected component analysis + CNN classification approach) obtained 98.1% sensitivity, 93.4%
specificity and 96.2% accuracy with a 98.7% recognition rate. From Table 3, it is clear that the proposed hand
gesture detection methodology using the CNN classification approach combined with the CC algorithm provides better
results than the CNN classification approach alone.

Table 3 Analysis of proposed hand gesture detection methodologies using CNN

| Performance analysis parameters (%) | CNN classification approach | Connected component analysis + CNN classification approach | AHE + connected component analysis + CNN classification approach |
|---|---|---|---|
| Sensitivity | 91.5 | 96.8 | 98.1 |
| Specificity | 82.7 | 89.2 | 93.4 |
| Accuracy | 91.6 | 94.8 | 96.2 |
| Recognition rate | 90.7 | 96.2 | 98.7 |

The proposed methodology is also analyzed using the SVM classification approach. The methodology with the AHE +
connected component analysis + SVM classification approach achieved 92.1% sensitivity, 89.9% specificity and 93.5%
accuracy with a 91.5% recognition rate, as depicted in Table 4.

Table 4 Analysis of proposed hand gesture detection methodologies using SVM

| Performance analysis parameters (%) | SVM classification approach | Connected component analysis + SVM classification approach | AHE + connected component analysis + SVM classification approach |
|---|---|---|---|
| Sensitivity | 89.1 | 90.5 | 92.1 |
| Specificity | 78.7 | 87.5 | 89.9 |
| Accuracy | 87.5 | 91.6 | 93.5 |
| Recognition rate | 88.2 | 90.5 | 91.5 |

The proposed CNN-based hand gesture recognition system is compared with the SVM classifier. The kernel of this SVM
classifier is categorized into two types, linear and nonlinear. The regression pattern of the linear kernel SVM is
more exponentially curved than the regression pattern of the nonlinear kernel SVM classifier. In this paper, the CNN
classification results are compared with the linear kernel SVM type.

Table 5 shows the performance comparison of the proposed hand gesture detection and recognition methodology with
state-of-the-art methods in terms of sensitivity, specificity and accuracy. The proposed methodology is compared
with the conventional methods of Wang et al. (2018), Marium et al. (2017), Rahman and Afrin (2013) and Rao et al.
(2009). Wang et al. (2018) used a particle filtering method for hand gesture recognition and achieved 92.1%
sensitivity, 84.7% specificity and 90.6% accuracy. Marium et al. (2017) used a convexity algorithm for hand gesture
recognition and achieved 90.7% sensitivity, 82.1% specificity and 87.5% accuracy. Rahman and Afrin (2013) used a
support vector machine for hand gesture recognition and achieved 89.6% sensitivity, 79.9% specificity and 85.7%
accuracy. Rao et al. (2009) used a hidden Markov model for hand gesture recognition and achieved 90.1% sensitivity,
82.6% specificity and 90.6% accuracy.

Table 5 Performance comparisons of proposed methodology with state of art in terms of sensitivity, specificity and accuracy

| Authors and year | Methodology | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|---|
| Proposed methodology (in this paper) | CNN classification approach based on CC algorithm with enhancement technique | 98.1 | 93.4 | 96.2 |
| Sharma et al. (2019) | Static hand gesture algorithm | 93.2 | 91.1 | 91.8 |
| Wang et al. (2019) | Super-pixel earth mover's distance classification algorithm | 94.2 | 90.8 | 92.6 |
| Wang et al. (2018) | Particle filtering method | 92.1 | 84.7 | 90.6 |
| Chaikhumpha and Chomphuwiset (2018) | Hidden Markov models | 90.5 | 91.7 | 91.6 |
| Marium et al. (2017) | Convexity algorithm | 90.7 | 82.1 | 87.5 |
| Rahman and Afrin (2013) | Support vector machine | 89.6 | 79.9 | 85.7 |
| Rao et al. (2009) | Hidden Markov model | 90.1 | 82.6 | 90.6 |

The analysis is also performed using recognition time, which is computed as the total execution time for recognizing
a single hand gesture image in an automated process. Recognition time is essential for real-time applications that
process hand gestures under different environmental conditions. Table 6 shows the performance comparison of the
proposed methodology with state-of-the-art methods in terms of recognition time (s).

Table 6 Performance comparisons of proposed methodology with state of art in terms of recognition time (s)

| Authors and year | Methodology | Recognition time (s) |
|---|---|---|
| Proposed methodology (in this paper) | CNN classification approach based on CC algorithm with enhancement technique | 0.356 |
| Proposed methodology (in this paper) | SVM classification approach based on CC algorithm with enhancement technique | 0.674 |
| Sharma et al. (2019) | Static hand gesture algorithm | 0.810 |
| Wang et al. (2019) | Super-pixel earth mover's distance classification algorithm | 0.768 |
| Wang et al. (2018) | Particle filtering method | 0.789 |
| Chaikhumpha and Chomphuwiset (2018) | Hidden Markov models | 0.973 |
| Marium et al. (2017) | Convexity algorithm | 1.372 |
| Rahman and Afrin (2013) | Support vector machine | 2.102 |
| Rao et al. (2009) | Hidden Markov model | 1.357 |

The proposed CNN classification approach based on the CC algorithm with the enhancement technique stated in this
paper consumed 0.356 s as recognition time, whereas the conventional methods in Wang et al. (2018) consumed 0.789 s,
Chaikhumpha and Chomphuwiset (2018) consumed 0.973 s, Marium et al. (2017) consumed 1.372 s, Rahman and Afrin (2013)
consumed 2.102 s and Rao et al. (2009) consumed 1.357 s. From Table 6, it is clear that the proposed methodology for
gesture recognition consumed less recognition time when compared with conventional methodologies.

5 Conclusions

In this paper, a deep learning convolutional neural network-based hand gesture detection and recognition methodology
is proposed. The proposed method segments the fingertips from the hand gesture image, and these fingertips are then
given as input to the CNN classifier. The CNN classification approach trains and classifies the test hand gesture
image, which is obtained from an open access image dataset. The performance of the proposed hand gesture detection
and recognition methodology is analyzed in terms of sensitivity, specificity, accuracy and recognition rate. The
proposed hand gesture detection and recognition methodology using the CNN classification approach stated in this
paper achieves 98.1% sensitivity, 93.4% specificity, 96.2% accuracy and a 98.7% recognition rate.

Compliance with ethical standards

Conflict of interest All authors state that there is no conflict of interest.

Human and animal rights Humans/animals are not involved in this work.
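The equations that define sensitivity, specificity and accuracy did not survive in the text, so the sketch below assumes the usual confusion-matrix definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP) and accuracy = (TP + TN) / (TP + TN + FP + FN), each expressed in %. The function name `classification_metrics` is hypothetical. Applied to the counts quoted in the results (TP = 154, TN = 50, FP = 6, FN = 5), these definitions give values that agree, after truncation to one decimal place, with the 96.8%, 89.2% and 94.8% reported for the CNN with CC configuration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) in percent.

    Uses the standard confusion-matrix definitions, which are an
    assumption here because the paper's equations are not in the text.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Counts quoted in the results section of the paper.
se, sp, acc = classification_metrics(tp=154, tn=50, fp=6, fn=5)
print("sensitivity = {:.2f}%".format(se))
print("specificity = {:.2f}%".format(sp))
print("accuracy    = {:.2f}%".format(acc))
```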
References

Ashfaq T, Khurshid K (2016) Classification of hand gestures using Gabor filter with Bayesian and naïve Bayes classifier. Int J Adv Comput Sci Appl 7(3):276–279
Chaikhumpha T, Chomphuwiset P (2018) Real-time two hand gesture recognition with condensation and hidden Markov models. In: 2018 international workshop on advanced image technology (IWAIT), Chiang Mai, 2018, pp 1–4
Elmezain M, Al-Hamadi A, Michaelis B (2010) A robust method for hand gesture segmentation and recognition using forward spotting scheme in conditional random fields. In: Proceedings of the 20th international conference on pattern recognition (ICPR'10), August 2010, pp 3850–3853
Hong J, Kim ES, Lee H-J (2012) Rotation-invariant hand posture classification with a convexity defect histogram. In: IEEE international symposium on circuits and systems (ISCAS) 20–23 May 2012, pp 774–777
Kapil Y, Bhattacharya J (2016) Real-time hand gesture detection and recognition for human computer interaction. In: Berretti S, Thampi S, Srivastava P (eds) Intelligent systems technologies and applications. Springer, Cham, pp 559–567
Kawulok M, Grzejszczak T, Nelapa J, Knyc M (2012) Database for hand gesture recognition. http://sun.aei.polsl.pl/*mkawulok/gestures/. Accessed 12 May 2018
Liu K, Chen C, Jafari R, Kehtarnavaz N (2014) Fusion of inertial and depth sensor data for robust hand gesture recognition. Sens J IEEE 14(6):1898–1903
Manresa-Yee C, Varona J, Mas R, Perales FJ (2005) Hand tracking and gesture recognition for human–computer interaction. Electronic letters on computer vision and image analysis 5(3):96–104
Marium A, Rao D, Crasta DR, Acharya K, D'Souza R (2017) Hand gesture recognition using webcam. Am J Intell Syst 7(3):90–94
Mitra S, Acharya T (2007) Gesture recognition: a survey. IEEE Trans Syst Man Cybern Part C Appl Rev 37(3):311–324
Park S, Yu S, Kim J, Kim S, Lee S (2012) 3D hand tracking using Kalman filter in depth space. EURASIP J Adv Signal Process 2012(1), 36
Rahman MH, Afrin J (2013) Article: hand gesture recognition using multiclass support vector machine. Int J Comput Appl 74(1):39–43
Rao J, Gao T, Gong Z, Jiang Z (2009) Low cost hand gesture learning and recognition system based on hidden Markov model. In: Proceedings of the 2009 2nd international symposium on information science and engineering (ISISE '09), IEEE, Shanghai, China, December 2009, pp 433–438
Rautaray SS, Agrawal A (2015) Vision based hand gesture recognition for human computer interaction: a survey. Artif Intell Rev 43(1):1–54
Ren Z, Yuan J, Meng J, Zhang Z (2013) Robust part-based hand gesture recognition using Kinect sensor. IEEE Trans Multimed 15(5):1110–1120
Sharma S, Jain S, Khushboo (2019) A static hand gesture and face recognition system for blind people. In: 2019 6th international conference on signal processing and integrated networks (SPIN), Noida, India, 2019, pp 534–539
Suguna R, Neethu PS (2017) Hand gesture recognition using shape features. Int J Pure Appl Math 117(8):51–54
Tauseef H, Fahiem MA, Farhan S (2009) Recognition and translation of hand gestures to Urdu alphabets using a geometrical classification. In: Proceedings of IEEE international conference on visualisation (VIZ), Barcelona, Spain, pp 213–217
Wang Z, Chen B, Wu J (2018) Effective inertial hand gesture recognition using particle filtering based trajectory matching. J Electric Comput Eng 2018, 6296013
Wang Y, Jung C, Yun I, Kim J (2019) SPFEMD: super-pixel based finger earth mover's distance for hand gesture recognition. In: ICASSP 2019–2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), Brighton, United Kingdom, 2019, pp 4085–4089
Yao Y, Fu Y (2014) Contour model-based hand-gesture recognition using the Kinect sensor. IEEE Trans Circ Syst Video Technol 24(11):1935–1944
Yasukochi N, Mitome A, Ishii R (2008) A recognition method of restricted hand shapes in still image and moving image as a man–machine interface. In: IEEE conference on human system interactions, pp 306–310
Yörük E, Konukoglu E, Sankur B, Darbon J (2006) Shape-based hand recognition. IEEE Trans Image Process 15(7):1803–1815

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.