Final Book VM Fyp
By
Nusrat Razzaq
Arooj Riaz
Session 2018-2022
FACULTY OF SCIENCES
UNIVERSITY OF AZAD JAMMU & KASHMIR
MUZAFFARABAD
APPROVAL CERTIFICATE
It is certified that the project work presented in this report, entitled “VIRTUAL
MOUSE”, submitted by Nusrat Razzaq (Roll No. 16, Reg. No. 2018-UMDB-004843)
and Arooj Riaz (Roll No. 20, Reg. No. 2018-UMDB-004847) of Session 2018-22,
supervised by Dr. Wajid Arshad Abbasi, is in our opinion fully adequate in scope and
quality for the degree of Bachelor of Science in Computer Science (BSCS).
(Chairman)
Dr. Syed Ali Abbas
Department of CS & IT
University of Azad Jammu & Kashmir
Muzaffarabad
ACKNOWLEDGEMENT
Allah alone deserves praise and thanks for the perfection of His favor upon us. May He
bless and salute His slave and messenger Muhammad (SAW), the best teacher mankind ever had.
We would also like to express our gratitude to our esteemed supervisor, Dr. Wajid Arshad
Abbasi, for his appreciation of and dedication to our project. Our level coordinator, Dr. Syed
Ali Abbas, deserves special thanks for his cordial support and valuable suggestions.
In addition, we are very grateful to our parents for their unconditional support.
DECLARATION
We hereby declare that neither the system we designed nor any part of it is copied from
any other source. In addition, we declare that this project was completed entirely through
our own efforts and under the guidance of our supervisor. This thesis has not been
submitted to another institute or university for the award of a degree or diploma based on
the findings in it. Our report has been written under the university’s guidelines. In the
text of the thesis, whenever we have incorporated material (data, theoretical analysis,
or text) from any other source, we have given it due credit and provided its details in
the references.
Nusrat Razzaq
Arooj Riaz
ABSTRACT
The mouse is a wondrous human invention. Even today, a wireless or contactless
mouse still depends on hardware: it uses a device, possibly an external power source
such as a battery, and it occupies space and consumes electric power. Likewise,
during the COVID pandemic it is advisable to maintain social distancing and avoid
touching devices that are shared by many people. In the proposed AI virtual mouse
system, these limitations are overcome by employing a webcam or a built-in camera
for capturing hand gestures and detecting hand tips using computer vision. The
system makes use of machine learning algorithms. Based on the hand gestures, the
computer can be controlled virtually to perform left click, right click, scrolling,
and cursor movement without the use of a physical mouse. The algorithm is based
on deep learning for detecting the hands. Hence, the proposed system helps avoid
the spread of COVID-19 by eliminating human contact and the dependency on
physical devices.
TABLE OF CONTENTS
Acknowledgement..........................................................................................iii
Declaration.....................................................................................................iv
Abstract............................................................................................................v
Chapter I: Introduction...........................................................................1
1.1 : Problem Statement................................................................1
1.2 : Objective of Project..............................................................2
1.3 : Limitations of Project...........................................................2
1.4 : Future Scope of Project........................................................2
Chapter 4: Project Working....................................................................26
4 Introduction............................................................................26
4.1 : Camera Used in AI virtual mouse system.............................28
4.2 : Virtual Screen Matching........................................................29
4.3 : Detecting finger UP...............................................................30
4.4 : Stop Cursor............................................................................31
4.5 : Double click...........................................................................32
4.6 : Right button click..................................................................33
4.7 : Scroll UP function.................................................................34
4.8 : Scroll Down function.............................................................35
Chapter 6: Conclusions.............................................................................42
6.1: Conclusions............................................................................42
REFERENCES.............................................................................................44
ABBREVIATIONS......................................................................................48
LIST OF FIGURES
Number Page
Fig 4.6 Right button click.........................................................................33
CHAPTER 1
INTRODUCTION
The most efficient and expressive way of human communication is through hand
gestures; in this project, a gesture-based virtual mouse system is proposed. The setup
of this system uses a web camera that detects the fingertips of our hand by using
computer vision, in order to perform mouse cursor operations like right click, left
click, scrolling, and double clicking.
We have used the Python programming language for developing this virtual mouse;
OpenCV, the library for computer vision, is also used. The model makes use of the
MediaPipe package for tracking the hands and the fingertips.
1.1 Problem Statement
A physical mouse cannot be used in every setting: in conditions where a device
is not feasible, such as moist or wet conditions; to aid people who cannot handle a
device; and where there is no space for using a mouse. There is also a case for
eliminating the need for a mouse and cable altogether. In view of the pandemic,
there arises a need to avoid high-contact surfaces, of which the mouse is one. To
eliminate this contact, the system provides an intuitive way to interact with the
computer system using hand gestures and fingertips to emulate mouse-like functions.
1.2 Objective of Project
Our aim is to make a mouse model that can be controlled by finger gestures: a
mouse with which we can move the cursor, left click, right click, scroll, and point
at objects through hand gestures. In the future, the system could also handle the
keyboard functionalities along with the mouse functionalities virtually.
CHAPTER 2
BACKGROUND OVERVIEW
Computer vision is a field of artificial intelligence that enables computers to
interpret and understand the visual world. Computer vision is the process of teaching
computers to process images at the pixel level and then comprehend them. In this
process the machine retrieves visual information, handles it, and interprets the results
using special algorithms. In the same way that AI makes computers think, computer
vision gives them the ability to see, observe, and understand, as shown in figure 2.1.
Human vision and computer vision work similarly, except that humans have a
lifetime of context to train on in order to perform these functions. Computers,
however, accomplish this in much less time using cameras, data, and algorithms, as
opposed to retinas, optic nerves, and the visual cortex. A system that inspects
products or watches a production asset can therefore analyze many items per minute
and notice defects that humans might miss.
Computer vision requires a lot of data. Data is analyzed repeatedly until distinctions
are discerned. Deep learning and convolutional neural networks are two of the key
techniques that use machine learning to enable computers to learn from visual data.
The computer will learn to recognize one image from another if enough data is fed
through the model.
Figure 2.1 shows how computer vision works and how objects are detected.
2.1.1 Camera System Interaction
Digital Camera System Controlled from an LCD Touch Panel presents the
design and implementation of a digital camera system for image capturing and
real-time image processing. Images captured with a web camera are initially stored
in the system’s memory and then displayed on an LCD touch panel, after which the
system can apply advanced image processing algorithms to them. Apart from this,
the system supports control of the image sensor through the LCD touch panel. In
addition, it has the ability to communicate with a PC through an interface for storing
the images on it. The digital camera system is used for its flexibility: it is mainly
targeted as an open and low-cost platform for implementing and testing real-time
image processing algorithms. In addition, the exploitation of the LCD touch panel
can effectively assist in the control of more camera parameters. Image processing
algorithms can take place before or after the data storing, as the system can be easily
modified. Future plans are to embed and test more advanced image processing
algorithms and to create an extended menu for the LCD touch panel; with such a
menu, the user can control the camera’s functionality in a friendly manner and easily
select the execution of the desired image processing algorithm.
2.1.2 Histogram of Oriented Gradients (HOG)
The histogram of oriented gradients (HOG) is a feature descriptor used for detecting
objects in computer vision and image processing. The HOG descriptor technique
works as follows:
1. Divide the image into small connected regions called cells, and for each cell
compute a histogram of gradient directions for the pixels within it.
2. Discretize each cell into angular bins according to the gradient orientation.
3. Each pixel of a cell contributes a gradient, weighted by its magnitude, to its
angular bin.
4. Groups of adjacent cells are considered as spatial regions called blocks. The
grouping of cells into a block is the basis for grouping and normalization of
histograms.
The HOG descriptor depends on the following parameters:
1. Geometry of splitting an image into cells and grouping cells into a block
2. Block overlapping
3. Normalization parameters
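As a rough sketch of steps 1–3 above (the cell size and bin count here are typical values, not taken from the text), the per-cell orientation histograms can be computed with plain NumPy:

```python
import numpy as np

def cell_orientation_histogram(cell_mag, cell_ang, n_bins=9):
    """Accumulate gradient magnitudes of one cell into angular bins (0-180 deg)."""
    bin_width = 180.0 / n_bins
    hist = np.zeros(n_bins)
    idx = np.minimum((cell_ang // bin_width).astype(int), n_bins - 1)
    for i, m in zip(idx.ravel(), cell_mag.ravel()):
        hist[i] += m
    return hist

def hog_cells(image, cell_size=8, n_bins=9):
    """Compute per-cell orientation histograms for a grayscale image."""
    image = image.astype(float)
    gx = np.zeros_like(image)
    gy = np.zeros_like(image)
    gx[:, 1:-1] = image[:, 2:] - image[:, :-2]   # horizontal gradient
    gy[1:-1, :] = image[2:, :] - image[:-2, :]   # vertical gradient
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180   # unsigned orientation
    h, w = image.shape
    cells = []
    for r in range(0, h - cell_size + 1, cell_size):
        row = []
        for c in range(0, w - cell_size + 1, cell_size):
            row.append(cell_orientation_histogram(
                mag[r:r + cell_size, c:c + cell_size],
                ang[r:r + cell_size, c:c + cell_size], n_bins))
        cells.append(row)
    return np.array(cells)  # shape: (rows, cols, n_bins)
```

Step 4 (block grouping and normalization) would then concatenate and L2-normalize neighboring cell histograms; production code would normally use a ready-made implementation such as OpenCV's HOGDescriptor.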
Object detection is the process of detecting objects such as persons, cars, or animals
in an image. Object detection recognizes each object with a bounding box that marks
where it lies in the image. Several techniques are used for this task: the histogram of
oriented gradients (HOG), color-based features, and deep learning algorithms such as
CNNs (convolutional neural networks) and autoencoders, which extract features from
the image such as edges and shapes [13].
o We can also view object detection as two types of problems in one: a multi-label
classification problem (which objects are present) and a localization problem
(where they are).
Image acquisition is the first step in this system. Images are captured by an off-the-shelf
HD webcam with a certain resolution. This webcam is interfaced with the system
to detect the hand coordinates. Fourteen hand gesture coordinate points are used; they
are obtained from one hand, and the gesture is shown in front of the camera to capture
the image. Then the coordinate points in the image are resized for convenience.
2.5 Object Localization
Object localization is the process of predicting where an object lies in an image by
means of a bounding box. A bounding box can be initialized using the parameters
shown in figure 2.5(a), which are:
bx, by: coordinates of the center of the bounding box with respect to the image.
bw: width of the bounding box with respect to the image width.
bh: height of the bounding box with respect to the image height.
We predict these values for calculating the mean IOU and for predicting the box
which localizes the object in an image.
IOU (intersection over union) is the area of overlap divided by the area of union, as
the measurement given in figure 2.5(b).
Fig 2.5(b): IOU Measurement
The figure illustrates how IOU is measured.
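The IOU measurement above translates directly into code. In this sketch the boxes are assumed to be given as pixel corner coordinates (x1, y1, x2, y2):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2) corners."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (empty when the boxes do not intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two unit-overlap boxes of area 4 each give IOU = 1/7.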
Object detection flow chart:
2.6 Algorithms for Object detection
There are many different algorithms that we can use for object detection. Some
of them are:
1. Fast R-CNN
2. Faster R-CNN
3. SSD (single-shot detector)
4. YOLO (you only look once)
2.7 SSD
The SSD architecture consists of a base network followed by several convolution
layers. The CNN architecture for SSD is standard for high-quality image
classification, but it lacks the final classification layer. Features are extracted by
successive convolution layers in which the feature map size decreases while the
depth increases; the deeper layers construct more abstract representations and cover
larger receptive fields, which makes them suitable for detecting larger objects.
2.8 YOLO
YOLO is an algorithm that utilizes neural networks for the detection of real-time
objects. This algorithm is popular due to its speed and accuracy. In addition to
detecting traffic signals, parking meters, and animals, it has also been used in many
other real-time applications.
The term YOLO means ‘you only look once’. Real-time detection and recognition of
various objects within a picture is done with the help of this algorithm. In YOLO,
object detection is framed as a regression problem, and the class probabilities of the
detected objects are computed. Convolutional neural networks are the technology
behind the YOLO algorithm [8].
Residual Blocks
In the first step, the input image is divided into an S x S grid of cells, as shown in
figure 2.8. When an object is detected, these residual (grid) boxes are formed first
to separate the objects [12].
Fig 2.8: Residual Block
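As a small illustration of the grid step (the image size and grid size here are assumed example values), finding the grid cell responsible for an object center can be written as:

```python
def grid_cell(cx, cy, img_w, img_h, S=7):
    """Return the (row, col) of the S x S grid cell containing an object
    whose center is at pixel (cx, cy)."""
    col = min(int(cx * S / img_w), S - 1)
    row = min(int(cy * S / img_h), S - 1)
    return row, col
```

In YOLO, the cell found this way is the one responsible for predicting that object's bounding box and class probabilities.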
Bounding Box Regression
In bounding box regression, the object we want to point at is differentiated from the
other objects by the box drawn around it, as shown in the figure [12].
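Tying this back to the bounding box parameters of section 2.5, a normalized (bx, by, bw, bh) prediction can be converted to pixel corners for drawing; this is a sketch, not code from the project:

```python
def box_to_corners(bx, by, bw, bh, img_w, img_h):
    """Convert a normalized box (center bx, by; size bw, bh, all relative to
    the image dimensions) into pixel corner coordinates (x1, y1, x2, y2)."""
    x1 = (bx - bw / 2) * img_w
    y1 = (by - bh / 2) * img_h
    x2 = (bx + bw / 2) * img_w
    y2 = (by + bh / 2) * img_h
    return x1, y1, x2, y2
```

A box centered in a 100 x 200 image and covering half of each dimension, for instance, maps to corners (25, 50) and (75, 150).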
CHAPTER 3
TOOLS AND TECHNOLOGIES
Virtual mouse is a project that is built using several software tools. In this project
we use different modules and libraries for performing the tasks and for hand gesture
recognition. The tools and libraries that we use include OpenCV, NumPy, MediaPipe,
and Flask, and the development environment we used is Anaconda.
For detecting hand gestures and hand tracking, the MediaPipe framework is used,
and the OpenCV library is used for computer vision. The algorithm makes use of
machine learning concepts to track and recognize the hand gestures and the hand tip.
NumPy and Flask are also used in this process for making the virtual mouse [9].
3.1. MediaPipe
MediaPipe is an open-source framework by Google for building machine learning
pipelines; it is well suited for cross-platform development since the framework is
built around time series data. The framework is multimodal and can be applied to
various audios and videos. The MediaPipe framework is used by developers for
building and analyzing systems through graphs, and it has also been used for
developing systems for application purposes, as shown in figure 3.1. The steps
involved in a system that uses MediaPipe are carried out in a pipeline configuration.
The pipeline created can run on various platforms, allowing scalability on mobile
and desktop. The framework consists of three main parts: performance evaluation,
a framework for retrieving sensor data, and a collection of reusable components
called calculators. A graph is made of calculator nodes connected by streams through
which the data flow. Developers are able to replace or define custom calculators
anywhere in the graph to create their own applications. The calculators and streams
combined create a data-flow graph, where each node is a calculator and the nodes
are connected by streams.
Figure 3.1 shows the MediaPipe hand recognition graph.
A single-shot detector model is used for detecting and recognizing a hand or palm
in real time; MediaPipe uses this single-shot detector model, as shown in figure
3.1(a). In the hand detection module, the model is first trained for palm detection.
After a palm is detected, the hand landmark coordinates are obtained; these
coordinates are the same in every hand gesture recognition algorithm, as shown in
the figure below.
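A common way to decide which fingers are raised from such landmark coordinates is to compare each fingertip with the joint below it. This sketch assumes MediaPipe's 21-point hand layout (fingertip indices 4, 8, 12, 16, 20) and image coordinates where y grows downward:

```python
# Landmark indices follow MediaPipe Hands: tip ids 4 (thumb), 8 (index),
# 12 (middle), 16 (ring), 20 (little).
TIP_IDS = [4, 8, 12, 16, 20]

def fingers_up(landmarks):
    """Given 21 (x, y) landmark points, return five 0/1 flags:
    [thumb, index, middle, ring, little]."""
    fingers = []
    # Thumb: compare tip x with the joint below it (works for a right hand
    # facing the camera; a real system would also check handedness).
    fingers.append(1 if landmarks[4][0] > landmarks[3][0] else 0)
    # Other fingers: a tip is "up" when it lies above its pip joint
    # (smaller y in image coordinates).
    for tip in TIP_IDS[1:]:
        fingers.append(1 if landmarks[tip][1] < landmarks[tip - 2][1] else 0)
    return fingers
```

This per-finger state is what chapter 4 builds on to decide which mouse function to perform.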
3.2 OpenCV
OpenCV is a computer vision library that contains image-processing algorithms for
object detection. OpenCV is available as a Python library and is aimed at real-time
applications, which are built using it as shown in figure 3.2. The OpenCV library is
used in image and video processing and analysis.
Fig 3.2: OpenCV pointing objects
In this figure we see people being counted with OpenCV and Python.
3.3 NumPy
NumPy is the fundamental Python library for numerical array computing; in this
project it is used for handling coordinate arrays.
3.4 Flask
Flask is a lightweight Python web framework, used alongside the libraries above in
building the virtual mouse.
3.5 Anaconda Software
We used the Anaconda distribution for building this project, called Virtual Mouse.
When working on a data science project, we find that we need many different
packages, and Anaconda's package manager, conda, or pip can be used to install
them; this is highly convenient. Briefly, the installation steps are:
1. Choose the version of the software on the Anaconda website and press the
download button.
2. Run the installer and click Next whenever the screen asks for it, accepting the
defaults.
3. The important part of the installation process is that when the installer
recommends it, the safest approach is to not check the box to add Anaconda to
your PATH. This means you will have to use Anaconda Navigator or the
Anaconda Command Prompt (located in the Start Menu under "Anaconda")
when you wish to use Anaconda (you can always add Anaconda to your PATH
later if you don't check the box). If you want to be able to use Anaconda
directly in your command prompt, check the box instead.
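Once Anaconda is installed, an environment for this project can be set up from the Anaconda Command Prompt. This is only a setup sketch: the environment name and Python version are assumptions, and mediapipe is installed with pip because it is distributed on PyPI.

```shell
# Create and activate a dedicated environment for the virtual mouse project.
conda create -n virtual-mouse python=3.9 -y
conda activate virtual-mouse

# Install the libraries used in this project (chapter 3).
conda install -y numpy flask
pip install opencv-python mediapipe
```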
Fig 3.5: Accuracy graph of detection
CHAPTER 4
PROJECT WORKING
4. Introduction
This project allows a user to interact with the computer with a minimal amount of
physical peripherals. For the various cursor movements, one design approach is a
glove. This glove, which is powered by a standard battery pack, is fitted with an
accelerometer and a microcontroller that can wirelessly send data to a nearby
wireless receiver interfaced directly with a PC. Software is then incorporated to
collect and analyze the user movement data for precise cursor readjustment [3].
An example would be a customer who has limited physical mobility in a part of their
arm and wishes to minimize the work effort. An application example would be
changing the interaction experience between a user and CAD modeling software,
where the user can obtain a virtual experience of positioning an object without the
use of a peripheral mouse [4]. The intent of this project is to serve various
applications and to become a marketable product in a wide variety of markets (such
as TV, mobile, gaming, health care, and so forth) in the near future when virtual
experience becomes more prevalent.
4.1 Camera Used in AI Virtual Mouse System
The proposed AI virtual mouse system is based on frames captured by the web
camera of the laptop or PC. Using the Python computer vision library OpenCV, a
video capture object is created and the web camera starts capturing video of the
hand, as shown in figure 4.1. The camera captures the frames and passes them to the
AI virtual mouse system [2].
Our project works in the following steps.
4.2 Virtual Screen Matching
The AI virtual mouse system makes use of a transformational algorithm that
converts the coordinates of the fingertip from the webcam frame to the full computer
screen for controlling the mouse [14]. When the hands are detected, and when we
find which finger is up for performing a specific mouse function, a rectangular
region is drawn on the webcam frame within which the fingertip is tracked and
mapped to the screen.
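A minimal sketch of this coordinate transformation (the frame size, screen size, and margin are assumed example values, not taken from the text):

```python
import numpy as np

def frame_to_screen(x, y, frame_w=640, frame_h=480,
                    screen_w=1920, screen_h=1080, margin=100):
    """Map a fingertip position inside the webcam frame's inner box
    (margin px from each edge) to full-screen coordinates."""
    sx = np.interp(x, (margin, frame_w - margin), (0, screen_w))
    sy = np.interp(y, (margin, frame_h - margin), (0, screen_h))
    return sx, sy
```

Using an inner margin box means the fingertip does not need to reach the very edge of the camera frame to move the cursor to the edge of the screen; np.interp also clamps positions outside the box.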
4.3 Detecting Which Finger Is Up and Performing the Particular Mouse Function
When the index finger is up and all the other fingers (thumb, middle finger, ring
finger, and little finger) are down, the mouse cursor is ready to move around the
window, as shown in figure 4.3.
4.4 For Stopping the Movement of the Cursor
When both the index and middle fingers are up and the distance between them is
greater than 40 px, the cursor stops moving, as shown in figure 4.4.
4.5 For the Mouse to Perform the Double Click Function
When the index finger and the middle finger are up and the distance between them
is less than 40 px, the double click function is performed, as shown in figure 4.5.
4.6 For the Mouse to Perform the Right Button Click
When the index finger, middle finger, and little finger are up and the ring finger is
down, the cursor of the virtual mouse performs the right button click, as shown in
figure 4.6.
4.7 For the Mouse to Perform the Scroll Up Function
When the little finger is up and the remaining fingers are closed, the cursor will
perform the scroll up function, as shown in figure 4.7.
4.8 For the Mouse to Perform the Scroll Down Function
When the index finger and the little finger are up and all the remaining fingers of
the hand are down, the cursor of the mouse will perform the scroll down function,
as shown in figure 4.8.
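The finger-state rules of sections 4.3 through 4.8 can be collected into one hypothetical dispatch function (the finger order [thumb, index, middle, ring, little] and the 40 px threshold follow the text; the function itself is a sketch, not code from the project):

```python
def classify_gesture(fingers, index_middle_dist=None):
    """Map a five-flag finger state [thumb, index, middle, ring, little]
    (1 = up, 0 = down) to the mouse action described in chapter 4.
    The thumb flag is ignored in this simplified sketch."""
    _, index, middle, ring, little = fingers
    if index and not middle and not ring and not little:
        return "move cursor"        # section 4.3
    if index and middle and little and not ring:
        return "right click"        # section 4.6
    if index and middle and not ring and not little:
        if index_middle_dist is not None and index_middle_dist < 40:
            return "double click"   # section 4.5
        return "stop cursor"        # section 4.4
    if little and not index and not middle and not ring:
        return "scroll up"          # section 4.7
    if index and little and not middle and not ring:
        return "scroll down"        # section 4.8
    return "no action"
```

A real implementation would feed this classification into a GUI-automation library to move the cursor and issue clicks.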
CHAPTER 5
RESULTS
The figure shows the accuracy of our project, according to which our project is
about 99% accurate. Cross comparison of the testing of the AI virtual mouse system
is difficult because only a limited number of datasets is available. The hand gestures
and fingertip detection have been tested in various illumination conditions and at
different distances from the webcam for tracking of the hand gesture and hand tip
detection.
The AI virtual mouse has performed very well in terms of accuracy when compared
to other virtual mouse models. The novelty of the proposed model is that it can
perform most of the mouse functions, such as left click, right click, scroll up, scroll
down, and mouse cursor movement, using fingertip detection; the model is thus
helpful in controlling the PC like a physical mouse, but in virtual mode.
The results shown after checking and experiments are good and accurate.
5.1 Future Scope
The proposed AI virtual mouse has some limitations, such as a small decrease in
accuracy of the right click mouse function, and the model has some difficulty in
executing clicking and dragging to select text. These limitations will be overcome
in our future work. Furthermore, the proposed method can be developed to handle
the keyboard functionalities along with the mouse functionalities virtually, which is
another future scope of human-computer interaction.
5.2 Applications
The AI virtual mouse system is useful for many applications: it can be used to
reduce the space needed for using the physical mouse, and it can be used in
situations where we cannot use a physical mouse. The system eliminates the usage
of extra devices, since it relies only on camera-captured hand gestures.
1) The proposed model has a greater accuracy of 99%, which is far greater than
that of other proposed virtual mouse models, and it has many applications.
2) Amidst the COVID-19 situation, it is not safe to control devices by touching
them; the proposed AI virtual mouse avoids this risk, since hand gestures are
detected without any contact.
3) The system can be used to control robots and automation systems without the
usage of devices.
4) 2D and 3D images can be drawn using the AI virtual system using the hand
gestures.
5) The AI virtual mouse can be used to play virtual reality- and augmented
reality-based games.
6) Persons with problems in their hands can use this system to control the mouse
functions.
7) In the field of robotics, the proposed HCI-like system can be used for
controlling robots.
8) In designing and architecture, the proposed system can be used for designing
virtually.
CHAPTER 6
CONCLUSIONS
6.1 Conclusions
The main objective of the AI virtual mouse system is to control the mouse cursor
functions by using hand gestures instead of using a physical mouse. The system
captures frames using a webcam or built-in camera, detects the hand gestures and
hand tip, and processes these frames to perform the particular mouse function.
From the results of the model, we can conclude that the proposed AI virtual mouse
system has performed very well and has a greater accuracy compared to the existing
models, and the model also overcomes most of the limitations of the existing
systems. Since the proposed model has greater accuracy, the AI virtual mouse can be
used for real-world applications, and it can also be used to reduce the spread of
COVID-19, since the mouse can be operated virtually using hand gestures without
touching any device.
The model has some limitations, such as a small decrease in accuracy in the right
click mouse function and some difficulty in clicking and dragging to select text.
Hence, we will next work to overcome these limitations by improving the fingertip
detection algorithm.
REFERENCES
[1] https://xd.adobe.com/ideas/principles/emerging-technology/what-is-computer-vision-how-does-it-work/
Mar 2014.
[4] “… Complex Background,” Proceedings of the 2008 International Conference on Embedded Software and Systems, Sichuan, 29-31 July 2008, pp. 338-343.
[5] https://www.alliedvision.com/en/ai-starter-guide/camera-technology/
[7] https://www.frontiersin.org/research-topics/28766/object-localization
[8] https://www.geeksforgeeks.org/difference-between-yolo-and-ssd/
[9] https://www.section.io/engineering-education/creating-a-hand-tracking-module/
[10] https://en.wikipedia.org/wiki/OpenCV
[11] https://pyimagesearch.com/2020/10/05/object-detection-bounding-box-regression-with-keras-tensorflow-and-deep-learning/
[12] https://towardsdatascience.com/hog-histogram-of-oriented-gradients-67ecd887675f
[13] https://www.javatpoint.com/machine-learning-vs-deep-learning
[15] Akshay L Chandra, “Mouse Cursor Control Using Facial Movements.”
ABBREVIATIONS