Gesture Controlled Virtual Mouse Using AI
https://doi.org/10.22214/ijraset.2023.52100
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue V May 2023- Available at www.ijraset.com
Abstract: This project presents a cursor control system that navigates system controls quickly, using a voice assistant and a camera that records user motions. With the aid of MediaPipe, the user can control the computer cursor with hand gestures, performing actions such as left clicking and dragging through a variety of hand motions. The user can also adjust the brightness, volume, and a number of other settings. The system is built with advanced Python packages such as MediaPipe and OpenCV. All I/O activities are controlled physically by hand motions and a voice assistant. The work uses machine learning and computer vision techniques to recognize hand movements and spoken instructions, and it operates well without the use of any additional computer resources.
Keywords: MediaPipe, Machine Learning, Gesture Recognition, Virtual Mouse, Voice Assistant.
I. INTRODUCTION
Gestures are used to communicate nonverbally and to deliver a certain message. This message can be sent through movements of a person's body, hands, or face. When interacting with others, gestures can express information ranging from simple to highly complex hand motions. For example, we can employ the straightforward gestures that sign languages build into their syntax to point at something (a person or an object), or draw on a range of many other simple gestures and motions. As a result, by employing hand gestures as a tool, humans can engage with one another more efficiently with the aid of computers.
One mouse function that has been replaced by hand movements is moving a visual object. The system is designed to be cheap: it captures hand gestures via a webcam, one of many inexpensive input devices, and maps preset movements to commands. Several systems already exist. With a standard mouse (a hardware tool), one can move around the monitor, but the screen cannot be accessed with hand gestures. Another is the gesture system that recognizes gestures using colored tapes; its functions are static and simple in nature. Using the present technique, we can operate the mouse and perform basic tasks on a computer or laptop with only a web camera and a microphone, without the need for any other computer hardware. Other procedures can be handled by a voice assistant.
II. LITERATURE SURVEY
The mouse can be used to scroll, single-click on the left button, double-click on the left button, and perform other functions, with different arrangements of the coloured caps employed for the various operations. Depending on the user and the lighting environment, the application can adapt to a range of skin tones. After examining the program output for various hand motions, the system approximates the fraction of the convex-hull area that the hand does not occupy. At brightness levels from 500 to 600 lux, which is typical of offices and well-lit classrooms, the colour red has a detection accuracy of around 90%, and green and blue behave similarly. The limitations of this approach are addressed by adopting a hand gesture recognition technique that detects the contours of the hand, as sketched below.
An Introduction to Hidden Markov Models by L. R. Rabiner and B. H. Juang [4] shows that the Hidden Markov Model is a key tool for real-time, dynamic gesture identification. The HMM approach is practical and designed to function in static settings. The strategy uses the HMM's LRB (left-right banded) topology together with the Baum-Welch algorithm for training and the Forward and Viterbi algorithms for testing, producing the best pattern recognition; a worked Viterbi example is given below. Although the system in this study appears simpler to use than more recent or command-based systems, it is less effective at spotting and recognising patterns. Another study uses an Arduino Uno, ultrasonic sensors, and a laptop to carry out tasks such as managing media playback and volume with hand gestures, with serial connections linking Python, the Arduino, and the ultrasonic sensors. This kind of technology can be used in the classroom for more engaging and interactive learning, immersive gaming, and interacting with virtual objects on screen.
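To make the HMM decoding step concrete, here is a minimal Viterbi sketch in Python. The two-state model and all of its probabilities are toy numbers invented for illustration, not parameters from Rabiner and Juang.

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence.
    Probabilities are kept in linear space for brevity, which is
    adequate for the short sequences used in gesture spotting."""
    n_states, T = len(start_p), len(obs)
    prob = np.zeros((T, n_states))             # best score per state
    back = np.zeros((T, n_states), dtype=int)  # backpointers
    prob[0] = start_p * emit_p[:, obs[0]]
    for t in range(1, T):
        for s in range(n_states):
            scores = prob[t - 1] * trans_p[:, s]
            back[t, s] = np.argmax(scores)
            prob[t, s] = scores.max() * emit_p[s, obs[t]]
    path = [int(np.argmax(prob[-1]))]          # trace best path backwards
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return list(reversed(path))

# Toy two-state gesture model with invented probabilities.
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3], [0.4, 0.6]])
emit = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2], start, trans, emit))  # -> [0, 0, 1]
```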
Akshaya U Kulkarni and Amit M Potdar proposed a RADAR-style object detector using an ultrasonic sensor [5]. The publication describes an ultrasonic-sensor-based, RADAR-like system for object detection. Instead of employing genuine RADAR, which is expensive and difficult to handle, it offers a solution for simple object detection using ultrasonic technology that functions like RADAR; the work of other authors focuses primarily on only one of these subjects. The endeavour comprised IoT hardware and the connecting software. A Raspberry Pi 3 computer and an Arduino Uno board processed the data, and the boards were equipped with an ultrasonic sensor and a servo motor to identify objects. A SIM808 module was then used to send each object's distance, angle, and timestamp to a chosen number via SMS. Sample test cases were included in the results to verify the detector's detection range. The study provides a simple approach for object detection since, as stated in its introduction, ultrasonic detection has various benefits over RADAR.
Co-articulation Detection in Hand Gestures by D. Ghosh, P. K. Bora, and M. K. Bhuyan [6] suggests that co-articulation is one of the biggest problems in dynamic gesture recognition. For the class of gestures considered there, few vision-based methods for assessing co-articulation have been documented. The majority of the algorithms suggested so far succeed only for small gesture vocabularies and cannot be applied to all types of gestures in varied settings. Another significant issue in dynamic gesture identification is the self co-articulation of gestures within a sequence. The proposed method recognises co-articulation in the connected gesture sequences of a gesture vocabulary defined for particular applications, such as robotic control.
Deep Learning-Based Real-Time Artificial Intelligence Virtual Mouse by S. Shriram and B. Nagaraj [7] uses a built-in camera or webcam that recognises hand movements and fingertips in the captured frames to carry out particular mouse actions. The model's results demonstrate the suggested AI virtual mouse system's outstanding performance: it is more accurate than existing models and overcomes most of their shortcomings. Since the recommended system is more accurate, the model can easily be put into practice, and by avoiding the use of a physical mouse it can also reduce the spread of the coronavirus.
On-device Real-time Hand Tracking using MediaPipe by F. Zhang, V. Bazarevsky, et al. [8] recommends MediaPipe Hands, a complete hand tracking system that operates in real time on different platforms. The pipeline can easily be installed on standard devices and predicts 2.5D landmarks without the need for specialized hardware. The authors open-sourced the pipeline to encourage academics and engineers to create cutting-edge gesture control and AR/VR applications with it.
Real-time virtual mouse system using RGB-D images and fingertip detection by D.-S. Tran, N.-H. Ho, H.-J. Yang, S.-H. Kim, and G. S. Lee [9] introduces a novel virtual-mouse technique based on fingertip detection and RGB-D pictures. Using only fingertips in front of a webcam, the user can perform certain actions. The method demonstrated not only extremely accurate gesture estimates but also useful applications. The proposed approach gets around the drawbacks of the majority of existing virtual-mouse systems: it offers several benefits, including accurate fingertip tracking at a greater distance and against complicated backgrounds, and it also works well in shifting light conditions. The experimental results showed the method to be a strong candidate for real-time hand gesture interfaces.
Hand gesture recognition for human computer interaction by A. Haria, A. Subramanian, N. Asokkumar, S. Poddar, and J. S. Nayak [10] describes a robust gesture recognition system that is affordable and simple to use without any markers. With their gesture detection technology, the authors aim to offer gestures for almost all HCI-related operations, such as operating the system, activating applications, and opening a number of well-known websites. Increasing accuracy and adding movements to incorporate more features are their future objectives. They also intend to incorporate the tracking system into a variety of hardware, including digital TVs and mobile devices, broadening the possibilities of the domain, and wish to make the mechanism accessible to a wide range of users, including those with impairments.
III. METHODOLOGY
For hand tracking and gesture recognition, the MediaPipe framework is employed, and for computer vision the OpenCV library is used. The technique tracks and recognises hand gestures and fingertips using machine-learning concepts.
1) OpenCV: OpenCV offers object detection techniques for images. Using the OpenCV module, the best applications of computer vision can be developed. The library is utilized for face and object identification, picture and video data processing, and video analysis.
2) MediaPipe: MediaPipe is an open-source framework from Google used to build machine learning pipelines. Because it was developed for time-series data, the framework can be employed for cross-platform computing, and its architecture supports multiple audio and video formats since it is multimodal. Developers use the MediaPipe framework to build systems for application-related objectives as well as to design and analyze systems that employ graphs.
3) The AI-Powered Virtual Mouse System's Camera: The AI virtual mouse system relies on the frames captured by a laptop or PC camera. The video-capture object is created with the OpenCV computer vision toolkit in Python, and the web camera is used for recording. The webcam provides frames to the virtual AI system, which processes them (see the first sketch after this list).
4) Video recording and analysing: The AI virtual mouse system uses the webcam to record each frame until the program terminates. The captured images are converted from BGR to RGB to allow frame-by-frame identification of the hands.
5) Virtual Screen Matching: To move the hand coordinates between the webcam frame and the computer's window and execute certain mouse functions, the AI virtual mouse technique applies a coordinate transformation. After the fingertips and hands are recognised and the system determines which fingers can perform a cursor movement, a rectangular box is drawn on the computer screen as the webcam's reference region; from there, the cursor can be moved around the window (see the second sketch after this list).
6) Recognizing which finger is up and performing the appropriate mouse operation: To transfer the hand coordinates from the webcam to the computer's full-screen window for mouse operation, the AI virtual mouse technique again applies the transformation above. Once the hands are detected and the system knows which finger is raised for a given mouse action, a rectangular box referenced to the computer window is formed in the camera region, and the mouse pointer can be moved around the window accordingly.
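A minimal sketch of steps 3 and 4, assuming the standard MediaPipe Hands Python API: OpenCV opens the default webcam as the video-capture object, each BGR frame is converted to RGB, and MediaPipe returns the 21 normalized hand landmarks per frame.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)  # default webcam as the video-capture object
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV captures BGR; MediaPipe expects RGB.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = hands.process(rgb)
        if result.multi_hand_landmarks:
            # 21 landmarks per hand; id 8 is the index fingertip,
            # with coordinates normalized to the range 0..1.
            tip = result.multi_hand_landmarks[0].landmark[8]
            print(f"index tip at ({tip.x:.2f}, {tip.y:.2f})")
        cv2.imshow("frame", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc quits
            break
cap.release()
cv2.destroyAllWindows()
```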
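Steps 5 and 6 then reduce to mapping the normalized landmark coordinates into screen pixels inside a reference rectangle and firing a mouse event for a recognised finger configuration. The sketch below uses pyautogui as one common choice of mouse-control library (the paper does not name one), and the "index up moves, index plus middle clicks" rule is an illustrative assumption, not the paper's exact gesture mapping.

```python
import numpy as np
import pyautogui  # one common mouse-control library (an assumption)

SCREEN_W, SCREEN_H = pyautogui.size()

def to_screen(tip, frame_margin=0.1):
    """Map a normalized landmark (0..1) to screen pixels, using an inner
    reference rectangle so the hand never has to reach the frame edges."""
    x = np.interp(tip.x, (frame_margin, 1 - frame_margin), (0, SCREEN_W - 1))
    y = np.interp(tip.y, (frame_margin, 1 - frame_margin), (0, SCREEN_H - 1))
    return int(x), int(y)

def fingers_up(landmarks):
    """Crude up/down test: a fingertip is 'up' when it lies above its PIP
    joint in image coordinates (y grows downward). Landmark ids follow
    the MediaPipe hand model (tips 8/12/16/20, PIPs 6/10/14/18)."""
    tips, pips = (8, 12, 16, 20), (6, 10, 14, 18)
    return [landmarks[t].y < landmarks[p].y for t, p in zip(tips, pips)]

def handle_gesture(landmarks):
    """Illustrative gesture rules for cursor movement and clicking."""
    index_up, middle_up = fingers_up(landmarks)[:2]
    x, y = to_screen(landmarks[8])  # index fingertip drives the cursor
    if index_up and not middle_up:
        pyautogui.moveTo(x, y)   # cursor-move gesture
    elif index_up and middle_up:
        pyautogui.click(x, y)    # left-click gesture
```

Clamping the mapping to an inner rectangle (frame_margin) mirrors the reference box described in step 5, so the hand never has to reach the edges of the camera frame to cover the whole screen.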
IV. IMPLEMENTATION
VI. CONCLUSION
The development of effective human-machine interaction has been significantly advanced by hand gesture detection and voice assistant systems. Implementations employing hand gesture recognition promise wide-ranging applications in the technology sector. MediaPipe, a machine learning framework, has a significant impact on the creation of this hand gesture recognition application.
REFERENCES
[1] D. L. Quam, "Gesture recognition with a DataGlove," IEEE Conference on Aerospace and Electronics, vol. 2, pp. 755–760, 1990.
[2] D. H. Liou, D. Lee, and C. C. Hsieh, "A real time hand gesture recognition system using motion history image," in Proceedings of the 2010 2nd International Conference on Signal Processing Systems, July 2010.
[3] V. V. Reddy, T. Dhyanchand, G. V. Krishna, and S. Maheshwaram, "Virtual mouse control using colored finger tips and hand gesture recognition," 2020 IEEE-HYDCON, 2020, pp. 1–5, doi: 10.1109/HYDCON48903.2020.9242677.
[4] L. R. Rabiner and B. H. Juang, "An introduction to hidden Markov models," IEEE ASSP Magazine, vol. 3, January 1986.
[5] A. U. Kulkarni and A. M. Potdar, "RADAR based object detector using ultrasonic sensor," 2019 1st International Conference on Advances in Information Technology (ICAIT), 2019.
[6] M. K. Bhuyan, D. Ghosh, and P. K. Bora, "Co-articulation detection in hand gestures," TENCON 2005 – 2005 IEEE Region 10 Conference, 2005.
[7] S. Shriram and B. Nagaraj, "Deep learning-based real-time AI virtual mouse," vol. 2021, Article ID 8133076, 2021, https://doi.org/10.1155/2021/8133076.
[8] F. Zhang, V. Bazarevsky, et al., "MediaPipe Hands: On-device real-time hand tracking," 2020.
[9] D.-S. Tran, N.-H. Ho, H.-J. Yang, S.-H. Kim, and G. S. Lee, "Real-time virtual mouse system using RGB-D images and fingertip detection," Multimedia Tools and Applications, vol. 80, no. 7, pp. 10473–10490, 2021.
[10] A. Haria, A. Subramanian, N. Asokkumar, S. Poddar, and J. S. Nayak, "Hand gesture recognition for human computer interaction," Procedia Computer Science, vol. 115, pp. 367–374, 2017.