
Proceedings of the Third International Conference on Electronics Communication and Aerospace Technology [ICECA 2019]

IEEE Conference Record # 45616; IEEE Xplore ISBN: 978-1-7281-0167-5

Human computer interaction based eye controlled mouse


Vinay S Vasisht 1, Swaroop Joshi 2, Shashidhar 3, Shreedhar 4, C Gururaj 5
1,2,3,4 Student, Telecommunication Department, BMSCE, Bangalore
5 Senior Member IEEE, Assistant Professor, Telecommunication Department, BMSCE, Bangalore
1 [email protected]   2 [email protected]   3 [email protected]   5 [email protected]

Abstract— With advanced technologies in this digital era, there is always scope for development in the field of computing. Hands-free computing is in demand today as it addresses the needs of quadriplegics. This paper presents a Human Computer Interaction (HCI) system that is of great importance to amputees and those who have issues with using their hands. The system built is an eye-based interface that acts as a computer mouse by translating eye movements such as blinking, gazing and squinting into mouse cursor actions. The system in discussion makes use of a simple webcam, and its software requirements are Python (3.6), OpenCV, NumPy and a few other packages necessary for face recognition. The face detector can be built using the HOG (Histogram of Oriented Gradients) feature along with a linear classifier and the sliding window technique. The system is hands-free, and no external hardware or sensors are required.

Keywords: Python (3.6), OpenCV, Human Computer Interaction, NumPy, face recognition, Histogram of Oriented Gradients (HOG)

I. INTRODUCTION

The computer mouse, or moving a finger on a touchpad, has been the most common approach to moving the cursor along the screen with current technology. The system detects any movement of the mouse or the finger and maps it to the movement of the cursor. Some people whose arms are not operational, called 'amputees', are unable to make use of this technology. Hence, if the movement of their eyeball can be tracked and the direction in which the eye is looking can be determined, the movement of the eyeball can be mapped to the cursor, and the amputee will be able to move the cursor at will. An 'eye tracking mouse' will therefore be of great use to an amputee. Currently, the eye tracking mouse is not available at a large scale; only a few companies have developed this technology and made it available. We intend to build an eye tracking mouse in which most of the functions of the mouse are available, so that the user can move the cursor using his eyes. We try to estimate the 'gaze' direction of the user and move the cursor along the direction in which his eye is trying to focus.

The pointing and clicking action of the mouse has remained a standard for quite some time. However, for those who find it uncomfortable for some reason, or for those who are unable to move their hands, there arises a need for a hands-free mouse. Usually, eye movements and facial movements are the basis for a hands-free mouse.

II. RELATED WORK

There are various methods by which this can be achieved. The camera mouse was proposed by Margrit Betke et al. [1] for people who are quadriplegic and nonverbal. The movements of the user are tracked using a camera, and these can be mapped to the movements of the mouse pointer visible on the screen. Yet another method was proposed by Robert Gabriel Lupu et al. [2] for human computer interaction that made use of a head-mounted device to track eye movement and translate it on screen. Another technique by Prof. Prashant Salunkhe et al. [3] presents a technique of eye tracking using the Hough transform.

A lot of work is being done to improve the characteristics of HCI. A paper by Muhammad Usman Ghani et al. [4] suggests that the movements of the eye can be read as an input and used to help the user access interfaces without any other hardware device such as a mouse or a keyboard [5]. This can be achieved by using image processing algorithms and computer vision. One way to detect the eyes is by using the Haar cascade feature. The eyes can also be detected by matching them against templates that are already stored, as suggested by Vaibhav Nangare et al. [6]. To get an accurate image of the iris, an IR sensor can be used. A gyroscope can be used for the orientation of the head, as suggested by Anamika Mali et al. [7]. The click operation can be implemented by 'gazing', or staring, at the screen. Also, by gazing at a portion of the screen (upper or lower), the scroll function can be implemented, as proposed by Zhang et al. [8].

Along with eye movements, it becomes easier if we incorporate some subtle movements of the face and its parts as well. A real-time eye blink detection method using facial landmarks, as suggested by Tereza Soukupová and Jan Čech [9], brings out how the blink action can be detected from facial landmarks. This is a major aspect, as blinking actions are necessary for translation into clicking actions. Detecting eyes and other facial parts can be done using OpenCV and Python with dlib [10]; similarly, blinks can be detected [11]. The paper by Christos Sagonas et al. [12] discusses the challenges of facial landmark localisation. Akshay Chandra [13] proposes the same by controlling the mouse cursor using facial movements.

III. METHODOLOGY

The system proposed in this paper works based on the following actions: a) squinting the eyes, b) winking, c) moving the head around (pitch and yaw), and d) opening the mouth.

Requirements:

The requirements for this project are listed, discussed and analyzed below. Since the core of this project lies in the software, only the software requirements are mentioned.

Software requirements:

1) Python: Python is the language used, with Jupyter Notebook as the interface. It is a very common tool for basic computations and one of the most user-friendly. Some of the libraries used are:

• NumPy - 1.13.3
• OpenCV - 3.2.0
• PyAutoGUI - 0.9.36
• Dlib - 19.4.0
• Imutils - 0.4.6

The methodology is as follows:

1) Since the project is based on detecting the features of the face and mapping them to the cursor, the webcam needs to be accessed first, which means that the webcam will be opened. Once the webcam is opened, the program needs to extract every frame from the video. The frame rate of the video is generally around 30 frames per second, so a frame captured every 1/30th of a second is processed. Each frame undergoes a set of processes before its features are detected and mapped to the cursor, and this happens continuously for every frame as part of a loop.
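
As an illustration of this capture loop, the following is a minimal sketch using OpenCV; the process_frame() hook and the Esc-key exit are placeholders for this illustration, not the exact code used in the project.

import cv2

def run_capture_loop(process_frame):
    """Open the default webcam and hand every frame to process_frame()."""
    cap = cv2.VideoCapture(0)              # 0 selects the default webcam
    if not cap.isOpened():
        raise RuntimeError("Could not open the webcam")
    try:
        while True:
            ret, frame = cap.read()        # one frame roughly every 1/30th of a second at 30 fps
            if not ret:
                break
            process_frame(frame)           # detection and cursor mapping happen here
            if (cv2.waitKey(1) & 0xFF) == 27:   # Esc key exits the loop
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()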

2) Once the frame is extracted, the regions of the face need to be detected. Hence, the frames undergo a set of image-processing functions that put each frame into a suitable form, so that it is easy for the program to detect features such as the eyes, mouth and nose.

The processing techniques are:

i) Resizing: The image is first flipped about the y-axis. Next, the image needs to be resized. Resizing refers to setting the resolution of the image to any value as per the requirement. In this project, the new resolution is 640 x 480.

ii) BGR to grayscale: The model used to detect the different parts of the face requires an image in grayscale format to give more accurate results. Hence, the image, i.e. the frame of the video from the webcam, needs to be converted from BGR to grayscale. Once the image is converted to grayscale, it can be used to locate the face and identify its features.
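
A compact sketch of these two preprocessing steps, assuming OpenCV's flip, resize and cvtColor calls; the 640 x 480 target resolution comes from the text above.

import cv2

def preprocess(frame):
    """Flip, resize and convert one webcam frame as described above."""
    frame = cv2.flip(frame, 1)                       # mirror about the y-axis
    frame = cv2.resize(frame, (640, 480))            # fixed 640 x 480 resolution
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # BGR -> grayscale for the detector
    return frame, gray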

iii) Detection and prediction of facial features: To detect the face and its features, a prebuilt model is used in the project; it provides values that can be interpreted by Python to confirm that a face has been located in the image. There is a function called 'detector()', made available by the model, which helps us to detect the face. After the face is detected, its features can be located using the 'predictor' function, which finds 68 points on any 2D image. These points correspond to different locations on the face near the required parts such as the eyes, mouth, etc.

The values obtained from the function are 2D coordinates. Each of the 68 points is essentially an (x, y) pair which, when the points are connected, roughly forms an identifiable face. They are stored as an array of values so that they can be arranged and used in the next step to connect the coordinates and draw a boundary representing the required regions of the face. Four subsets of these stored values are taken and kept separately as the coordinates to be connected for the required regions: the left eye, the right eye, the nose and the mouth.

Once the four arrays are prepared, boundaries, or 'contours', are drawn by connecting the points of three of these arrays using the 'drawContours' function; the shapes formed surround the two eyes and the mouth.
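
A minimal sketch of this detection and prediction step, assuming dlib's frontal face detector, the standard 68-point shape predictor file shape_predictor_68_face_landmarks.dat, and the landmark index table from imutils; the model file name and the use of the imutils helpers are assumptions, not the paper's verbatim code.

import dlib
from imutils import face_utils

detector = dlib.get_frontal_face_detector()   # HOG + linear-classifier face detector
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # 68-point model (assumed file)

def get_face_regions(gray):
    """Return the landmark coordinates of the left eye, right eye, nose and mouth."""
    rects = detector(gray, 0)
    if len(rects) == 0:
        return None                              # no face in this frame
    shape = predictor(gray, rects[0])
    pts = face_utils.shape_to_np(shape)          # all 68 (x, y) landmark coordinates
    regions = {}
    for name in ("left_eye", "right_eye", "nose", "mouth"):
        start, end = face_utils.FACIAL_LANDMARKS_IDXS[name]
        regions[name] = pts[start:end]
    return regions

The eye and mouth arrays returned here can then be passed to cv2.drawContours (or cv2.polylines) to draw the on-screen outlines described above.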

iv) Mouth and eye aspect ratios: Once the contours are drawn, it is necessary to have a reference for the shapes which, when compared against, gives the program information about any action made by these regions, such as blinking, yawning, etc.

These references are understood as ratios between the 2D coordinates; a change in the coordinates essentially tells us that the parts of that region of the face have moved from their regular position and an action has been performed.

The system is built on predicting facial landmarks of the face. The dlib prebuilt model helps in fast and accurate face detection along with the 68 2D facial landmarks explained already. Here, the Eye Aspect Ratio (EAR) and the Mouth Aspect Ratio (MAR) are used to detect blinking/winking and yawning respectively. These actions are translated into mouse actions.

Eye Aspect Ratio (EAR):

EAR = (||p2 - p6|| + ||p3 - p5||) / (2 ||p1 - p4||)

where p1...p6 are the six landmarks of one eye. The EAR value drops drastically when the eye is closed, and this trigger can be used for the clicking action.

Mouth Aspect Ratio (MAR):

MAR = (||p2 - p8|| + ||p3 - p7|| + ||p4 - p6||) / (2 ||p1 - p5||)

where p1...p8 are the landmarks of the mouth. Similarly, the MAR goes up when the mouth opens; this is used as the action to start and switch off the mouse control. If the ratio has increased, it means that the distances between the points representing that region of the face have changed and an action has been performed by the person. This action is understood as the person trying to perform an operation using the mouse.

Hence, for these functionalities to be made operational, there need to be some defined aspect-ratio thresholds which, when crossed, are interpreted as an action being performed.
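
These two ratios translate directly into short helper functions; a sketch assuming the landmark arrays returned earlier, with the points taken in the order written in the formulas above (exactly which eight mouth landmarks are used is an assumption).

import numpy as np

def eye_aspect_ratio(eye):
    """eye: array of the six eye landmarks p1..p6 in order."""
    A = np.linalg.norm(eye[1] - eye[5])      # ||p2 - p6||
    B = np.linalg.norm(eye[2] - eye[4])      # ||p3 - p5||
    C = np.linalg.norm(eye[0] - eye[3])      # ||p1 - p4||
    return (A + B) / (2.0 * C)

def mouth_aspect_ratio(mouth):
    """mouth: array of the eight mouth landmarks p1..p8 in order."""
    A = np.linalg.norm(mouth[1] - mouth[7])  # ||p2 - p8||
    B = np.linalg.norm(mouth[2] - mouth[6])  # ||p3 - p7||
    C = np.linalg.norm(mouth[3] - mouth[5])  # ||p4 - p6||
    D = np.linalg.norm(mouth[0] - mouth[4])  # ||p1 - p5||
    return (A + B + C) / (2.0 * D)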

v) Detection of actions performed by the face: After the ratios are defined, the ratios computed from the current frame can be compared with the limits defined for the different actions; this is done using 'if' statements. The actions which the program identifies are:

1) Activating the mouse: The user needs to yawn, i.e. open the mouth vertically, which increases the distance between the corresponding 2D points of the mouth. The algorithm detects the change in distance by computing the ratio, and when this ratio crosses a specified threshold, the system is activated and the cursor can be moved. The user then needs to place his nose towards the top, bottom, left or right of a rectangle that appears on the screen to move the cursor in the corresponding direction. The further the nose is from the rectangle, the faster the cursor moves.
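
A condensed sketch of this activation and nose-driven movement, with hypothetical threshold and rectangle values (MAR_THRESHOLD, RECT_HALF) and PyAutoGUI's relative move call; the real values would be tuned by hand.

import pyautogui

MAR_THRESHOLD = 0.6     # assumed mouth-open (yawn) threshold for toggling input mode
RECT_HALF = 35          # assumed half-size in pixels of the on-screen dead-zone rectangle
input_mode = False

def update_cursor(mar, nose, anchor):
    """Toggle input mode on a yawn; move the cursor from the nose offset."""
    global input_mode
    if mar > MAR_THRESHOLD:
        input_mode = not input_mode              # yawning switches mouse control on/off
        return
    if not input_mode:
        return
    dx, dy = nose[0] - anchor[0], nose[1] - anchor[1]
    if abs(dx) > RECT_HALF or abs(dy) > RECT_HALF:
        pyautogui.moveRel(int(dx) // 4, int(dy) // 4)   # farther from the rectangle -> bigger step

In practice the yawn toggle would be debounced over several consecutive frames so that a single noisy MAR reading does not flip the mode.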

2) Left/Right clicking: For clicking, the user needs to close one eye and keep the other open. Using the difference between the aspect ratios of the two eyes, the program first checks whether the magnitude of that difference is greater than a prescribed threshold, to make sure that the user wants to perform a left or right click and does not want to scroll (for which both eyes need to squint).
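
A sketch of this wink test under assumed EAR limits; the threshold values and the PyAutoGUI click calls are illustrative, not the paper's exact ones.

import pyautogui

EAR_CLOSED = 0.19   # assumed EAR below which an eye counts as closed
WINK_DIFF = 0.04    # assumed minimum left/right EAR difference for a wink

def handle_wink(left_ear, right_ear):
    """Left-click on a left wink, right-click on a right wink."""
    if abs(left_ear - right_ear) <= WINK_DIFF:
        return False                          # both eyes similar: squint/scroll, not a wink
    if left_ear < EAR_CLOSED <= right_ear:
        pyautogui.click(button="left")        # left eye closed -> left click
    elif right_ear < EAR_CLOSED <= left_ear:
        pyautogui.click(button="right")       # right eye closed -> right click
    return True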

3) Scrolling: The user can scroll either upwards or downwards. He needs to squint his eyes in such a way that the aspect ratio of both eyes falls below the prescribed value. In this case, when the user places his nose outside the rectangle, the mouse performs the scroll function rather than moving the cursor: moving the nose above the rectangle scrolls upwards, and moving it below the rectangle scrolls downwards.
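
A matching sketch for scroll mode, again with assumed thresholds; pyautogui.scroll() takes a positive argument to scroll up and a negative one to scroll down.

import pyautogui

SQUINT_EAR = 0.21    # assumed EAR below which both eyes count as squinting
scroll_mode = False

def handle_scroll(left_ear, right_ear, nose, anchor, rect_half=35):
    """Toggle scroll mode on a squint; scroll from the vertical nose offset."""
    global scroll_mode
    if left_ear < SQUINT_EAR and right_ear < SQUINT_EAR:
        scroll_mode = not scroll_mode        # squinting again switches scroll mode off
        return
    if scroll_mode:
        dy = nose[1] - anchor[1]
        if dy < -rect_half:
            pyautogui.scroll(40)             # nose above the rectangle: scroll up
        elif dy > rect_half:
            pyautogui.scroll(-40)            # nose below the rectangle: scroll down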

IV. RESULT

The mouse control program is operational, and the user can move the cursor, scroll, or click at will. The amount by which the position of the cursor changes along any axis can be adjusted as per the needs of the user. Mouse control is activated by opening the mouth, i.e. when the MAR value crosses a certain threshold.

Fig. 1 - Opening the mouth to activate 'input mode'

The cursor is moved by moving the eye right, left, up and down as per the requirement. Scroll mode is activated by squinting. Scrolling can then be done by moving the head up and down, which is called pitching, or sideways, which is called yawing. Scroll mode is deactivated by squinting again. The clicking action takes place by winking: a right click corresponds to a right wink and a left click corresponds to a left wink.

Fig. 2 - Squinting to go to scroll mode

Fig. 3 - Winking the right eye to right-click

The sensitivity of the mouse can be changed as per the needs of the user. Overall, the project works as required; though the comfort is not the same as with a hand-controlled mouse, the system can be used with some ease after a little practice.

V. CONCLUSION

This work can be extended to improve the speed of the system by using better trained models. Also, the system can be made more dynamic by making the change in the position of the cursor proportional to the amount of rotation of the user's head, i.e., the user can decide at what rate he wants the position of the cursor to change. Further, future research can be done on making the ratios more accurate, since the range of values that the aspect ratios produce is usually small. Hence, to make the algorithm detect the actions more accurately, the formulae for the aspect ratios can be modified. Also, to make the detection of the face easier, some image processing techniques can be applied before the model detects the face and its features.

ACKNOWLEDGMENT

The work reported in this paper is supported by the BMS College of Engineering through the Technical Education Quality Improvement Programme [TEQIP-III] of the MHRD, Government of India.

VI. REFERENCES

[1] Margrit Betke, James Gips, "The Camera Mouse: Visual Tracking of Body Features to Provide Computer Access for People With Severe Disabilities", in IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 10, no. 1, March 2002.
[2] Robert Gabriel Lupu, Florina Ungureanu, Valentin Siriteanu, "Eye Tracking Mouse for Human Computer Interaction", in The 4th IEEE International Conference on E-Health and Bioengineering - EHB 2013.
[3] Prof. Prashant Salunkhe, Miss. Ashwini R. Patil, "A Device Controlled Using Eye Movement", in International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) - 2016.
[4] Muhammad Usman Ghani, Sarah Chaudhry, Maryam Sohail, Muhammad Nafees Geelani, "GazePointer: A Real Time Mouse Pointer Control Implementation Based On Eye Gaze Tracking", in INMIC, 19-20 Dec. 2013.
[5] Alex Poole and Linden J. Ball, "Eye Tracking in Human-Computer Interaction and Usability Research: Current Status and Future Prospects", in Encyclopedia of Human Computer Interaction (30 December 2005), 2006, pp. 211-219.
[6] Vaibhav Nangare, Utkarsha Samant, Sonit Rabha, "Controlling Mouse Motions Using Eye Blinks, Head Movements and Voice Recognition", in International Journal of Scientific and Research Publications, Volume 6, Issue 3, March 2016.
[7] Anamika Mali, Ritisha Chavan, Priyanka Dhanawade, "Optimal System for Manipulating Mouse Pointer through Eyes", in International Research Journal of Engineering and Technology (IRJET), March 2016.
[8] Xuebai Zhang, Xiaolong Liu, Shyan-Ming Yuan, Shu-Fan Lin, "Eye Tracking Based Control System for Natural Human-Computer Interaction", in Hindawi Computational Intelligence and Neuroscience, Volume 2017, Article ID 5739301, 9 pages.
[9] Tereza Soukupová and Jan Čech, "Real-Time Eye Blink Detection using Facial Landmarks", in 21st Computer Vision Winter Workshop, February 2016.
[10] Adrian Rosebrock, "Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python".
[11] Adrian Rosebrock, "Eye blink detection with OpenCV, Python, and dlib".
[12] C. Sagonas, G. Tzimiropoulos, S. Zafeiriou, M. Pantic, "300 Faces in-the-Wild Challenge: The first facial landmark localization Challenge", in Proceedings of IEEE Int'l Conf. on Computer Vision (ICCV-W), 300 Faces in-the-Wild Challenge (300-W), Sydney, Australia, December 2013.
[13] Akshay Chandra Lagandula, "Mouse Cursor Control Using Facial Movements", https://towardsdatascience.com/c16b0494a971.

ABOUT THE AUTHORS

[1] Vinay S Vasisht, at the time of writing this paper, is pursuing his graduate degree in Telecommunication Engineering from BMS College of Engineering, Bengaluru, India. He would graduate in 2019. His areas of interest are wireless communication and image processing.

[2] Swaroop Joshi, at the time of writing this paper, is pursuing his graduate degree in Telecommunication Engineering from BMS College of Engineering, Bengaluru, India. He would graduate in 2019. His areas of interest are wireless communication and image processing.

[3] Shreedhar Jolad, at the time of writing this paper, is pursuing his graduate degree in Telecommunication Engineering from BMS College of Engineering, Bengaluru, India. He would graduate in 2019. His area of interest is image processing.

[4] Shashidhar Haravi, at the time of writing this paper, is pursuing his graduate degree in Telecommunication Engineering from BMS College of Engineering, Bengaluru, India. He would graduate in 2019. His area of interest is image processing.

[5] C. Gururaj received his B.E. degree from Visvesvaraya Technological University, Belagavi, his M.Tech degree in Electronics from Visvesvaraya Technological University, and his PhD from Jain University, Bengaluru. He is currently an Assistant Professor in the Department of Telecommunication Engineering at BMS College of Engineering, Bengaluru. His research interests include image processing and VLSI design. He is a Senior Member of IEEE, Life Member of ISTE and Life Member of IAENG professional bodies.