AI VIRTUAL PEN

A project report submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
by
RAHUL SHARMA (Roll no. 171190101023)
NITISH PANDEY (Roll no. 171190101022)
NAVNEET NEGI (Roll no. 171190101019)
KAMLESH PANDEY (Roll no. 681190101003)
to the
Department of Computer Science and Engineering
Nanhi Pari Seemant Engineering Institute, Pithoragarh

CERTIFICATE
Certified that Rahul Sharma (Roll no. 171190101023), Nitish Pandey (Roll no.
171190101022), Navneet Negi (Roll no. 171190101019), and Kamlesh Pandey (Roll no.
681190101003) have carried out the research work presented in this project entitled "AI
VIRTUAL PEN" for the award of Bachelor of Technology from Nanhi Pari Seemant
Engineering Institute, Pithoragarh. The work presented is original, the studies were
carried out by the students themselves, and the contents of the project report do not form
the basis for the award of any other degree to the candidates or to anybody else.
Signature
MOHD. MURSLEEN
(Head of Department)
NPSEI, PITHORAGARH
Date:
SELF DECLARATION
We hereby declare that the work presented in the major project entitled "AI
VIRTUAL PEN", submitted to the Department of Computer Science and Engineering,
Nanhi Pari Seemant Engineering Institute, Pithoragarh, is an authentic record of our own
work carried out during the project period under the guidance of Mohd. Mursleen,
Department of Computer Science and Engineering, in fulfillment of the requirements for
the degree of Bachelor of Technology, Nanhi Pari Seemant Engineering Institute,
Pithoragarh.
Rahul Sharma (171190101023)
Nitish Pandey (171190101022)
Navneet Negi (171190101019)
Kamlesh Pandey (681190101003)
ABSTRACT
This project presents an AI virtual pen that can draw anything on the screen by capturing
the motion of a colored marker with a camera. Here, a colored object at the tip of the finger
is used as the marker. We use the computer vision techniques of OpenCV to build this
project. The preferred language is Python due to its exhaustive libraries and easy-to-use
syntax, but once the basics are understood, the project can be implemented in any language
supported by OpenCV.

Writing in the air has been one of the most fascinating and challenging research areas in
the field of image processing and pattern recognition in recent years. It contributes
immensely to the advancement of automation and can improve the interface between man
and machine in numerous applications. Several research works have focused on new
techniques and methods that reduce processing time while providing higher recognition
accuracy. Object tracking is considered an important task within the field of computer
vision. The invention of faster computers, the availability of inexpensive, good-quality
video cameras, and the demand for automated video analysis have made object tracking
techniques popular. Generally, video analysis has three major steps: first, detecting the
object; second, tracking its movement from frame to frame; and lastly, analysing the
behaviour of that object. For object tracking, four different issues are taken into account:
selection of a suitable object representation, feature selection for tracking, object
detection, and object tracking. In the real world, object tracking algorithms are the primary
part of applications such as automatic surveillance, video indexing, and vehicle navigation.
This project takes advantage of this gap and focuses on developing a motion-to-text
converter that can potentially serve as software for intelligent wearable devices for writing
in the air. The system recognizes hand gestures and uses computer vision to trace the path
of the finger. The generated text can also be used for various purposes, such as sending
messages and emails. It can be a powerful means of communication for the deaf, and an
effective communication method that reduces mobile and laptop usage by eliminating the
need to write by hand.
ACKNOWLEDGEMENTS
We would like to express our gratitude to Mohd. Mursleen, Faculty of the Computer
Science and Engineering Department, for his guidance and support throughout this project
work. He has been a constant source of inspiration to us throughout the period of this
work. We consider ourselves extremely fortunate to have had the opportunity to learn and
work under his guidance over the entire period.

We also express our sincere thanks to all the teachers of the Computer Science and
Engineering Department, Nanhi Pari Seemant Engineering Institute, Pithoragarh, who gave
us the golden opportunity to work on this wonderful project on the topic "AI VIRTUAL
PEN", which also helped us do a great deal of research and learn about many new things.
We are really thankful to them.

Last but not least, we would like to thank all our friends and family members who were
involved directly or indirectly in our project work.
Rahul Sharma
Nitish Pandey
Navneet Negi
Kamlesh Pandey
TABLE OF CONTENTS
CERTIFICATE
SELF DECLARATION
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
TABLE OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 INTRODUCTION
CHAPTER 2 LITERATURE REVIEW
2.1 Robust Hand Recognition
2.2 Blue colored fitted finger movements
2.3 Augmented Desk Interface
CHAPTER 3 CHALLENGES IDENTIFIED
3.1 Fingertip detection
3.2 Lack of pen up and pen down motion
3.3 Controlling the real-time system
CHAPTER 4 METHODOLOGY
4.1 Fingertip Detection Model
4.2 Techniques of Fingertip Recognition Dataset Creation
4.2.1 Video to Images
4.2.2 Take Pictures in Distinct Backgrounds
4.2.3 Fingertip Recognition Model Training
CHAPTER 5 ALGORITHM OF WORKFLOW
CONCLUSION & FUTURE WORK
REFERENCES
TABLE OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 INTRODUCTION
In the digital era, the traditional art of writing is being replaced by digital art. Digital
art refers to forms of expression and transmission of art in digital form, and reliance on
modern science and technology is its distinctive characteristic. Traditional art refers to the
art forms created before digital art. From the recipient's point of view, art can simply be
divided into visual art, audio art, audio-visual art, and audio-visual imaginary art, which
include literature, painting, sculpture, architecture, music, dance, drama, and other works
of art. Digital art and traditional art are interrelated and interdependent. Social
development is driven not by people's will but by the needs of human life, and the same
holds in art. In the present circumstances, digital art and traditional art coexist in a
symbiotic state, so we need to systematically understand the basic knowledge of both
forms. The traditional way of writing includes the pen-and-paper and chalk-and-board
methods. The essential aim of this work is to build a hand-gesture recognition system for
writing digitally. Digital art includes many ways of writing, such as using a keyboard, a
touch-screen surface, a digital pen, a stylus, or electronic hand gloves. In this system,
however, we use hand-gesture recognition with a machine learning algorithm implemented
in Python, which creates natural interaction between man and machine. With the
advancement of technology, the need for natural 'human-computer interaction (HCI)' [10]
systems to replace traditional systems is increasing rapidly.
The remainder of this report is organized as follows: Chapter 2 presents the literature we
referred to before working on this project. Chapter 3 describes the challenges we faced
while building this system. Chapter 4 provides the methodology we followed; its
subsections cover Fingertip Recognition Dataset Creation and Fingertip Recognition
Model Training. Chapter 5 explains the algorithm of the workflow.
CHAPTER 2 LITERATURE REVIEW
CHAPTER 3 CHALLENGES IDENTIFIED
CHAPTER 4 METHODOLOGY
This system needs a dataset for the Fingertip Detection Model. The Fingertip Detection
Model's primary purpose is to record the motion, i.e., the air character.
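The abstract proposes using a colored object at the fingertip as the marker. As a lightweight baseline for detecting that marker, a minimal OpenCV colour-masking sketch is given below; the HSV range for a blue marker and the helper name detect_marker are illustrative assumptions, not the exact implementation used in this project.

import cv2
import numpy as np

# Assumed HSV range for a blue marker cap; tune for the actual marker colour.
LOWER_BLUE = np.array([100, 150, 50])
UPPER_BLUE = np.array([130, 255, 255])

def detect_marker(frame):
    """Return the (x, y) centre of the largest blue blob in a BGR frame, or None."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_BLUE, UPPER_BLUE)
    mask = cv2.erode(mask, None, iterations=2)   # suppress speckle noise
    mask = cv2.dilate(mask, None, iterations=2)  # restore the blob's size
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    c = max(contours, key=cv2.contourArea)       # assume the marker is the largest blob
    M = cv2.moments(c)
    if M["m00"] == 0:
        return None
    return (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))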
Figure 3 Video to Images
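Section 4.2.1 converts the recorded videos into individual frames (Figure 3) so they can be labelled. A minimal sketch of that step, assuming OpenCV's VideoCapture; the file paths and the every_nth subsampling parameter are hypothetical.

import cv2
import os

def video_to_images(video_path, out_dir, every_nth=5):
    """Save every nth frame of a video as a JPEG for dataset labelling."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    frame_idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                      # end of the video
            break
        if frame_idx % every_nth == 0:  # subsample to avoid near-duplicate frames
            cv2.imwrite(os.path.join(out_dir, "frame_%05d.jpg" % saved), frame)
            saved += 1
        frame_idx += 1
    cap.release()
    return saved

# Example usage (paths are hypothetical):
# video_to_images("fingertip_clip.mp4", "dataset/images")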
4.2.3 Fingertip Recognition Model Training:
Once the dataset was ready and labelled, it was divided into train and dev sets (85%-15%).
We used Single Shot Detector (SSD) and Faster RCNN pre-trained models to train on our
dataset. Faster RCNN was much better in terms of accuracy as compared to SSD. Please
refer to the Results section for more information. SSD collapses the two standard object
detection stages - proposing regions and classifying them - into a single network pass.
This speeds up performance, as objects are detected in a single shot, so SSD is commonly
used for real-time object detection. Faster RCNN, in contrast, computes region proposals
with a Region Proposal Network that operates on the output feature map of the
convolutional backbone shared with the detection head. The proposals are passed to a
Region of Interest pooling layer, and the result is finally given to two fully connected
layers for classification and bounding box regression [15]. We tuned the last fully
connected layer of Faster RCNN to recognize the fingertip in the image.
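The report does not name the training framework, so the following is only a sketch of the head-swap described above, using torchvision's pre-trained Faster RCNN as an illustrative assumption; the two-class head covers background plus fingertip.

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a Faster R-CNN pre-trained on COCO (torchvision is an assumption;
# the framework actually used for this project is not specified here).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the final box-predictor head: 2 classes = background + fingertip.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# Freeze everything except the new head, mirroring "tuned the last
# fully connected layer of Faster RCNN".
for p in model.parameters():
    p.requires_grad = False
for p in model.roi_heads.box_predictor.parameters():
    p.requires_grad = True

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.005, momentum=0.9)

# One training step (sketch): images is a list of CHW float tensors and
# targets a list of dicts with "boxes" (N x 4) and "labels" (N,) per image.
# loss_dict = model(images, targets)
# loss = sum(loss_dict.values()); loss.backward(); optimizer.step()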
CHAPTER 5 ALGORITHM OF WORKFLOW
This is the most exciting part of our system. Writing involves several functionalities, so
the number of gestures used for controlling the system equals the number of actions
involved. The basic functionalities we included in our system are listed below; a minimal
control-loop sketch follows the list.
1. Writing Mode - In this state, the system traces the fingertip coordinates and stores
them.
2. Colour Mode - The user can change the colour of the text among the various available
colours.
3. Backspace - If the user goes wrong, we need a gesture to add a quick backspace.
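A minimal sketch of how the control loop might dispatch between these modes, assuming a hypothetical detect_fingertips(frame) helper that returns the fingertip positions found in a frame, with the number of raised fingertips acting as the mode switch:

import cv2
import numpy as np

COLOURS = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # selectable ink colours (BGR)

def run(detect_fingertips):
    """detect_fingertips(frame) -> list of (x, y) points; hypothetical helper."""
    cap = cv2.VideoCapture(0)
    strokes, current, colour_idx = [], [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        tips = detect_fingertips(frame)
        if len(tips) == 1:                    # Writing Mode: trace and store
            current.append(tips[0])
        elif len(tips) == 2:                  # Colour Mode: cycle the ink colour
            if current:
                strokes.append((current, colour_idx))
                current = []
            colour_idx = (colour_idx + 1) % len(COLOURS)
        elif len(tips) == 3:                  # Backspace: drop the last stroke
            if current:
                current = []
            elif strokes:
                strokes.pop()
        # Redraw the stored strokes plus the stroke in progress.
        canvas = np.zeros_like(frame)
        for pts, ci in strokes + ([(current, colour_idx)] if current else []):
            for a, b in zip(pts, pts[1:]):
                cv2.line(canvas, a, b, COLOURS[ci], 3)
        cv2.imshow("air canvas", cv2.add(frame, canvas))
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

In practice, the two- and three-finger gestures would need debouncing so that holding a gesture for several frames triggers its action only once.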
CONCLUSION & FUTURE WORK
The system has the potential to challenge traditional writing methods. It eradicates the need
to carry a mobile phone in hand to jot down notes, providing a simple on-the-go way to do
the same. It will also serve a great purpose in helping specially abled people communicate
easily. Even senior citizens, or people who find it difficult to use keyboards, will be able to
use the system effortlessly. Extending the functionality, the system could also be used to
control IoT devices in the near future. Drawing in the air can also be made possible. The
system would be excellent software for smart wearables, using which people could better
interact with the digital world, and Augmented Reality can make the text come alive. There
are some limitations of the system which can be improved in the future. Firstly, using a
handwriting recognizer in place of a character recognizer would allow the user to write
word by word, making writing faster. Secondly, hand gestures with a pause could be used
to control the real-time system, as done by [1], instead of using the number of fingertips.
Thirdly, our system sometimes recognizes fingertips in the background and changes their
state; air-writing systems should only obey their master's control gestures and should not
be misled by people around. Also, we used the EMNIST dataset, which is not a proper
air-character dataset. Upcoming object detection algorithms such as YOLO v3 can improve
fingertip recognition accuracy and speed. In the future, advances in Artificial Intelligence
will further enhance the efficiency of air-writing.
REFERENCES
[1] Y. Huang, X. Liu, X. Zhang, and L. Jin, "A Pointing Gesture Based Egocentric
Interaction System: Dataset, Approach, and Application," 2016 IEEE Conference on
Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, pp. 370-
377, 2016.
[2] P. Ramasamy, G. Prabhu, and R. Srinivasan, "An economical air writing system is
converting finger movements to text using a web camera," 2016 International Conference
on Recent Trends in Information Technology (ICRTIT), Chennai, pp. 1-6, 2016.
[3] Saira Beg, M. Fahad Khan and Faisal Baig, "Text Writing in Air," Journal of
Information Display, Vol. 14, Issue 4, 2013.
[4] Alper Yilmaz, Omar Javed, Mubarak Shah, "Object Tracking: A Survey," ACM
Computing Surveys, Vol. 38, Issue 4, Article 13, pp. 1-45, 2006.
[5] Yuan-Hsiang Chang, Chen-Ming Chang, "Automatic Hand-Pose Trajectory Tracking
System Using Video Sequences", INTECH, pp. 132- 152, Croatia, 2010
[6] Erik B. Sudderth, Michael I. Mandel, William T. Freeman, Alan S. Willsky, "Visual
Hand Tracking Using Nonparametric Belief Propagation," MIT Laboratory for
Information & Decision Systems Technical Report P2603, presented at IEEE CVPR
Workshop on Generative Model-Based Vision, pp. 1-9, 2004.
[7] T. Grossman, R. Balakrishnan, G. Kurtenbach, G. Fitzmaurice, A. Khan, and B.
Buxton, "Creating Principal 3D Curves with Digital Tape Drawing," Proc. Conf. Human
Factors Computing Systems (CHI' 02), pp. 121- 128, 2002.
[8] T. A. C. Bragatto, G. I. S. Ruas, M. V. Lamar, "Real-time Video-Based Finger Spelling
Recognition System Using Low Computational Complexity Artificial Neural Networks",
IEEE ITS, pp. 393-397, 2006
[9] Yusuke Araga, Makoto Shirabayashi, Keishi Kaida, Hiroomi Hikawa, "Real Time
Gesture Recognition System Using Posture Classifier and Jordan Recurrent Neural
Network", IEEE World Congress on Computational Intelligence, Brisbane, Australia, 2012
[10] Ruiduo Yang, Sudeep Sarkar, "Coupled grouping and matching for sign and gesture
recognition," Computer Vision and Image Understanding, Elsevier, 2008.
[11] R. Wang, S. Paris, and J. Popovic, "6D hands: markerless hand-tracking for computer-
aided design," in Proc. 24th Ann. ACM Symp. User Interface Softw. Technol., 2011, pp.
549–558.
[12] Maryam Khosravi Nahouji, "2D Finger Motion Tracking, Implementation for
Android Based Smartphones," Master's Thesis, CHALMERS Applied Information
Technology, 2012, pp. 1-48.
[13] Eshed Ohn-Bar, Mohan Manubhai Trivedi, "Hand Gesture Recognition in Real-Time
for Automotive Interfaces," IEEE Transactions on Intelligent Transportation Systems,
Vol. 15, No. 6, December 2014, pp. 2368-2377.
[14] P. Ramasamy, G. Prabhu, and R. Srinivasan, "An economical air writing system is
converting finger movements to text using a web camera," 2016 International Conference
on Recent Trends in Information Technology (ICRTIT), Chennai, 2016, pp. 1- 6.
[15] Kenji Oka, Yoichi Sato, and Hideki Koike, "Real-Time Fingertip Tracking and
Gesture Recognition," IEEE Computer Graphics and Applications, 2002, pp.64-71.
[16] H.M. Cooper, "Sign Language Recognition: Generalising to More Complex Corpora",
Ph.D. Thesis, Centre for Vision, Speech and Signal Processing Faculty of Engineering and
Physical Sciences, University of Surrey, UK, 2012
[17] Y. Huang, X. Liu, X. Zhang, and L. Jin, "A Pointing Gesture Based Egocentric
Interaction System: Dataset, Approach, and Application," 2016 IEEE Conference on
Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, pp. 370-
377, 2016
[18] Vladimir I. Pavlovic, Rajeev Sharma, and Thomas S. Huang, "Visual Interpretation of
Hand Gestures for Human-Computer Interaction: A Review," IEEE Transactions on
Pattern Analysis and Machine Intelligence, VOL. 19, NO. 7, JULY 1997, pp.677-695
[19] Guo-Zhen Wang, Yi-Pai Huang, Tian-Sheeran Chang, and Tsu-Han Chen, "Bare
Finger 3D Air-Touch System Using an Embedded Optical Sensor Array for Mobile
Displays", Journal Of Display Technology, VOL. 10, NO. 1, JANUARY 2014, pp.13-18
[20] Napa Sae-Bae, Kowsar Ahmed, Katherine Isbister, Nasir Memon, "Biometric-rich
gestures: a novel approach to authentication on multi-touch devices," Proc. SIGCHI
Conference on Human Factors in Computing Systems, 2012, pp. 977-986.
[21] W. Makela, "Working 3D Meshes and Particles with Finger Tips, towards an
Immersive Artists' Interface," Proc. IEEE Virtual Reality Workshop, pp. 77-80, 2005.
[22] A.D. Gregory, S.A. Ehmann, and M.C. Lin, "inTouch: Interactive Multiresolution
Modeling and 3D Painting with a Haptic Interface," Proc. IEEE Virtual Reality (VR' 02),
pp. 45-52, 2000.
[23] W. C. Westerman, H. Lamiraux, and M. E. Dreisbach, “Swipe gestures for touch
screen keyboards,” Nov. 15 2011, US Patent 8,059,101
[24] S. Vikram, L. Li, and S. Russell, "Handwriting and gestures in the air, recognizing on
the fly," in Proceedings of the CHI, vol. 13, 2013, pp. 1179–1184.
[25] X. Liu, Y. Huang, X. Zhang, and L. Jin. "Fingertip in the eye: A cascaded CNN
pipeline for the real-time fingertip detection in egocentric videos," CoRR, abs/1511.02282,
2015.