Hand Gesture Recognition Based Virtual Mouse Events
Dr Radha Shankarmani
Department of Information Technology
Sardar Patel Institute of Technology
Mumbai, India
[email protected]
Abstract—This paper proposes a virtual mouse application based on the tracking of different hand gestures. The system eliminates the dependency on any external hardware required to perform mouse actions. A built-in camera tracks the user’s hands, predefined gestures are recognized and the corresponding mouse events are executed. This system has been implemented in Python using OpenCV and PyAutoGUI. Researchers have studied background conditions, effects of differences in illuminance and skin colour individually. However, the proposed system aims to take into account all the above factors to build an application most suitable in the real world.

Keywords— Image Processing, Computer Vision, Gesture Recognition, Color Detection, Virtual Mouse, Human-Computer Interaction

I. INTRODUCTION

The most commonly employed method of cursor control in computer systems is the use of a mouse or a touchpad. Although it provides comfort and ease of use, this technology is not free of hardware. In the proposed system, an effort is made to completely replace physical hardware devices for cursor control with a gesture recognition-based system.

Gesture recognition, a subdomain of computer vision, consists of a set of images, which may or may not be a video sequence, given as input. The predefined gestures, if any, are detected as the output [5]. This system consists of components such as image segmentation, pattern detection, statistical modelling, etc. [12].

Skin detection is a primary subcategory of image segmentation. This feature helps in determining hand regions in a reasonable amount of time. Identification of skin-like pixels in an image is a binary classification problem formulated in [8]. Different colour spaces like RGB and HSV have been tested and offered various threshold values to discriminate skin pixels from non-skin ones [9].

Once the skin pigments are extracted, face detection is performed. There are mainly two ways to build skin colour models: nonparametric skin modelling method and parametric techniques [10]. This paper implements the Haar feature-based classification system where models are trained from positive and negative images [11]. Once the faces are identified, we subtract them from the image so that the system can focus solely on hand gesture recognition.

The first step towards this is detecting the hand contour, i.e., the continuous parts of the hand along the boundary with the same colour and intensity [2]. This is followed by creating a convex hull and identifying the convexity defects. The resultant image is passed into a model trained by convolutional neural networks to identify the gesture. Centroid identification is done to track the hand. Lastly, PyAutoGUI libraries are used to trigger the corresponding mouse clicks and cursor movement.

This paper is organized as follows: Section II describes in brief, the literature survey; Section III explains the proposed system and the methodology in detail; Section IV shows the results of the model implementation; Section V finally concludes the paper.

II. LITERATURE SURVEY

Human-Computer Interaction consists of two main approaches for Hand Gesture Recognition, hardware-based and vision-based. One of the first hardware-based approaches, proposed by Quam in 1990, used data gloves to recognize gestures.

The method required the user to wear a bulky data glove which was inconvenient and made it difficult to perform certain gestures. Vision-based HCI is further classified into marker-based and marker-less approaches. The former requires the user to wear colour markers or colour caps, while the latter works on the principle of skin detection and hand segmentation [4].

Various approaches have been proposed for the use of colour caps for the detection of fingertips [1] [6]. The intensity of these pixels in the grayscale image distinguishes the fingertips from the rest of the frame. However, these systems do not perform skin pixel detection and hand segmentation from stored frames. Secondly, the accuracy obtained for a noisy background is much less than that obtained for a plain background.

Siam et al. have applied a similar procedure involving two basic steps, marker detection, and marker tracking. Coloured markers have resulted in lower computational time and increased the accuracy of gesture detection. They have implemented the sliding window algorithm which, although optimized, requires high powered processors to deliver better results. Tracking accuracy has also been observed to reduce in case of hand movement at a greater speed [4].
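
The skin-pixel classification and face subtraction described in the Introduction can be illustrated with a small sketch. It uses only Python's standard library instead of OpenCV so that it runs anywhere. The threshold values, the function names (is_skin, skin_mask, clear_region) and the face bounding box are assumptions made for illustration and are not the values or code used in this paper.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255

# Illustrative thresholds only (assumed, not taken from the paper):
# hue up to 50 degrees, saturation in [0.15, 0.70], value at least 0.35.
H_MAX_DEG = 50.0
S_MIN, S_MAX = 0.15, 0.70
V_MIN = 0.35


def is_skin(pixel: Pixel) -> bool:
    """Binary skin / non-skin decision for one RGB pixel, made in HSV space."""
    r, g, b = (channel / 255.0 for channel in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component is in 0..1
    return h * 360.0 <= H_MAX_DEG and S_MIN <= s <= S_MAX and v >= V_MIN


def skin_mask(image: List[List[Pixel]]) -> List[List[int]]:
    """Return a 0/1 mask with the same shape as the image (rows of pixels)."""
    return [[1 if is_skin(p) else 0 for p in row] for row in image]


def clear_region(mask: List[List[int]],
                 box: Tuple[int, int, int, int]) -> List[List[int]]:
    """Zero out a rectangle given as (x, y, width, height).

    The box stands in for a face detected by some other component, so that
    skin-coloured face pixels are not mistaken for the hand.
    """
    x, y, w, h = box
    return [
        [0 if (y <= r < y + h and x <= c < x + w) else value
         for c, value in enumerate(row)]
        for r, row in enumerate(mask)
    ]
```

For example, is_skin((224, 172, 105)) is true under these assumed thresholds, while a pure blue or pure white pixel is rejected. In practice the thresholds would need tuning for the camera and lighting in use.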
Authorized licensed use limited to: Zhejiang University. Downloaded on September 12,2024 at 22:37:46 UTC from IEEE Xplore. Restrictions apply.
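
The hand-localisation steps named in the Introduction (contour, convex hull, centroid) can be sketched without any image library. The example below is an assumption-laden illustration, not the paper's OpenCV code: it takes contour points as plain (x, y) tuples, builds the hull with Andrew's monotone chain algorithm, and computes the centroid with the standard polygon (shoelace) formula. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def polygon_centroid(polygon: List[Point]) -> Point:
    """Area centroid of a simple polygon given by its vertices in order."""
    twice_area = 0.0
    cx = cy = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        w = x0 * y1 - x1 * y0
        twice_area += w
        cx += (x0 + x1) * w
        cy += (y0 + y1) * w
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
```

Tracking the centroid from frame to frame gives a single point that can drive the cursor position. Convexity defects, which the paper also uses, are omitted here to keep the sketch short.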
3) Gesture Recognition

A dataset containing 6 classes of hand gestures with 1200 images each, has been created. This is then accumulated and labelled into a csv (comma separated values) file. Using a Convolutional Neural Network (CNN) architecture as mentioned in Figure 2, a deep learning model has been trained [16]. The model consists of two pairs of convolution layers followed by max-pooling layers. The Dropout and some Fully Connected layers are added to the model. A ReLU activation function is used for its initial and hidden layers and a softmax activation function for its final output layer. The optimizer used is the Adam optimizer and the loss function used is categorical cross-entropy. As stated above, the hand region is detected using the endpoints that were obtained from contours. This region is preprocessed and passed as an input to the model to classify the hand gestures for performing mouse actions.

TABLE I. GESTURES AND THEIR CORRESPONDING MOUSE EVENTS

Gesture     Mouse Event
L           Cursor movement
Paper       Left Click, Double Click
Three       Right Click
Scissor     Scroll Down
Rock On     Scroll Up
Rock        Launch Notepad

The CNN model has achieved a testing accuracy of 98.47%. The gestures recognized are correctly converted to the corresponding mouse actions. It has also been observed that the gesture ‘Paper’ when held for long emulates the double click operation.
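
The gesture-to-event mapping of Table I can be expressed as a small dispatch table. The sketch below is illustrative only: the gesture labels and events come from Table I, but every identifier (GESTURE_EVENTS, RecordingBackend, PyAutoGuiBackend, dispatch), the scroll step, and the use of notepad.exe on Windows are assumptions and not the paper's implementation. PyAutoGUI is imported lazily so the mapping logic can be exercised without a desktop session.

```python
from typing import Dict, List, Optional, Tuple

Position = Optional[Tuple[int, int]]

# Gesture labels and events follow Table I. 'Paper' is mapped to a single
# left click here; the paper reports that holding it emulates a double click.
GESTURE_EVENTS: Dict[str, str] = {
    "L": "move",
    "Paper": "left_click",
    "Three": "right_click",
    "Scissor": "scroll_down",
    "Rock On": "scroll_up",
    "Rock": "launch_notepad",
}


class RecordingBackend:
    """Test double that records events instead of driving a real cursor."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Position]] = []

    def handle(self, event: str, position: Position) -> None:
        self.calls.append((event, position))


class PyAutoGuiBackend:
    """Forwards events to PyAutoGUI; needs the package and a desktop session."""

    def __init__(self, scroll_step: int = 100) -> None:
        import pyautogui  # lazy import so the rest of the sketch loads without it
        self._gui = pyautogui
        self._step = scroll_step  # assumed scroll amount per gesture

    def handle(self, event: str, position: Position) -> None:
        gui = self._gui
        if event == "move" and position is not None:
            gui.moveTo(*position)
        elif event == "left_click":
            gui.click()
        elif event == "right_click":
            gui.rightClick()
        elif event == "scroll_down":
            gui.scroll(-self._step)  # negative values scroll down
        elif event == "scroll_up":
            gui.scroll(self._step)
        elif event == "launch_notepad":
            import subprocess
            subprocess.Popen(["notepad.exe"])  # assumes a Windows machine


def dispatch(gesture: str, backend, centroid: Position = None) -> Optional[str]:
    """Look up the event for a recognised gesture and hand it to the backend."""
    event = GESTURE_EVENTS.get(gesture)
    if event is not None:
        backend.handle(event, centroid)
    return event
```

A call such as dispatch("L", RecordingBackend(), centroid=(120, 80)) returns "move" and records the target position. Unrecognised labels return None and trigger nothing, which is one way to avoid spurious clicks when the classifier output is not one of the six gestures.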
Fig. 7. ‘Three’ gesture for right-click

Fig. 8. ‘Rock’ gesture for opening the Notepad application

V. CONCLUSION

The virtual mouse based on hand tracking and gesture recognition has been successfully implemented. Conversion of colour space from RGB to HSV has provided greater accuracy in low light conditions. The presence of the user's face in the background does not affect the results since face detection and subsequent face subtraction have been executed. Hand contour and convex hull have been constructed which help in hand detection. The centroid identification has provided ease of tracking hand movements. The creation of a custom dataset with predefined gestures has significantly increased the accuracy of the CNN model in identifying them. Since factors such as differences in illuminance and skin colour have a negligible effect on the performance of the model, the proposed system can replace the existing hardware used for cursor control. Thus, the paper is a significant advancement in the field of Human-Computer Interaction. Although this system is currently employed for basic cursor control, it can be extended into various real-world applications such as Digital Art, Gaming, Virtual and Augmented Reality.

ACKNOWLEDGMENT

The authors are thankful to Dr Radha Shankarmani for mentoring us throughout the research and development of the Virtual Mouse based on Gesture Recognition. It’s owing to her help that the paper has reached its goal to serve the desired purpose. The authors would take this opportunity to thank Sardar Patel Institute of Technology for providing all the necessary resources which fueled the progress of this research.

REFERENCES

[1] K. H. Shibly, S. Kumar Dey, M. A. Islam and S. Iftekhar Showrav, "Design and Development of Hand Gesture Based Virtual Mouse," 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), Dhaka, Bangladesh, 2019, pp. 1-5, doi: 10.1109/ICASERT.2019.8934612.
[2] Preksha Pareek, R. P. (2017). Event Triggering Using Hand Gesture Using Open CV. International Journal of Engineering and Computer Science, 5(2). Retrieved from https://www.ijecs.in/index.php/ijecs/article/view/377
[3] Onkar Yadav, Sagar Makwana, Pandhari Yadav, Prof. Leena Raut, "Cursor Movement By Hand Gesture", International Journal Of Engineering Sciences & Research Technology, vol. , no. 3, pp. 243-247, 2017.
[4] Siam, Sayem & Sakel, Jahidul & Kabir, Md. (2016). Human Computer Interaction Using Marker Based Hand Gesture Recognition.
[5] Ozturk, O., Aksac, A., Ozyer, T. et al. "Boosting real-time recognition of hand posture and gesture for virtual mouse operations with segmentation". Appl Intell 43, 786-801 (2015). https://doi.org/10.1007/s10489-015-0680-z
[6] Abhilash S S, Lisho Thomas, Naveen Wilson, Chaithanya C, "Virtual Mouse Using Hand Gesture", International Research Journal of Engineering and Technology (IRJET), vol. 5, no. 4, pp. 3903-3906, 2018.
[7] Grif, Horatiu & Turc, Traian. (2018). Human hand gesture based system for mouse cursor control. Procedia Manufacturing. 22. 1038-1042. 10.1016/j.promfg.2018.03.147.
[8] Cohen, Ira & Sebe, Nicu & Garg, Ashutosh & Chen, Lawrence. (2003). Facial expression recognition from video sequences: Temporal and static modeling. Computer Vision and Image Understanding. 91. 160-187. 10.1016/S1077-3142(03)00081-X.
[9] Kakumanu, Praveen & Makrogiannis, Sokratis & Bourbakis, NG. (2007). A survey of skin-color modeling and detection methods. Pattern Recognition. 40. 1106-1122. 10.1016/j.patcog.2006.06.010.
[10] Qiong Liu and Guang-zheng Peng, "A robust skin color based face detection algorithm," 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010), Wuhan, China, 2010, pp. 525-528, doi: 10.1109/CAR.2010.5456614.
[11] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, Kauai, HI, USA, 2001, pp. I-I, doi: 10.1109/CVPR.2001.990517.
[12] S. Mitra and T. Acharya, "Gesture Recognition: A Survey," in IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 37, no. 3, pp. 311-324, May 2007, doi: 10.1109/TSMCC.2007.893280.
[13] Z. Liu, W. Chen, Y. Zou and C. Hu, "Regions of interest extraction based on HSV color space," IEEE 10th International Conference on Industrial Informatics, Beijing, China, 2012, pp. 481-485, doi: 10.1109/INDIN.2012.6301214.
[14] Padilla, Rafael & Filho, Cicero & Costa, Marly. (2012). Evaluation of Haar Cascade Classifiers for Face Detection.
[15] Khan, Faiz. (2020). Computer Vision Based Mouse Control Using Object Detection and Marker Motion Tracking. International Journal of Computer Science and Mobile Computing, Vol. 9, Issue 5, May 2020, pp. 35-45.
[16] Pinto, Raimundo & Braga Borges, Carlos & Almeida, Antonio & Paula Jr, Ialis. (2019). Static Hand Gesture Recognition Based on Convolutional Neural Networks. Journal of Electrical and Computer Engineering. 2019. 1-12. 10.1155/2019/4167890.
[17] J. Lee, R. Haralick and L. Shapiro, "Morphologic edge detection," in IEEE Journal on Robotics and Automation, vol. 3, no. 2, pp. 142-156, April 1987, doi: 10.1109/JRA.1987.1087088.
[18] Verma, Rachna. (2017). A Review of Object Detection and Tracking Methods. International Journal of Advance Engineering and Research Development. 4. 569-578.