Project Phase 2
• Researchers have found interesting results experimenting in these two fields, which has led to the creation of many other interesting fields such as object detection, object tracking, image processing and more.
• Here, in our project, keeping "Vision-Based Human Detection Techniques: A Descriptive Review" by Shahriar Shakir Sumit, Dayang Rohaya Amang Rambli and Seyedali Mirjalili as the base paper, we are implementing a hand-tracking and gesture-recognition project that combines AI with computer vision to achieve complete control over a computer without the use of a keyboard or mouse.
• The project involves simple, predefined gestures which are recognised by the AI via the computer's camera, and the corresponding tasks are performed.
• This project can be a wide step towards overcoming the increasing problem of e-waste, as it completely eliminates the use of keyboards and mice, resulting in less production and hence less e-waste. Further, this project is also helpful for patients who are bedridden and have no way of communicating other than gestures: it helps them communicate with others, as they can type and point to icons on the system.
LITERATURE SURVEY
SL No. | Author(s) | Title | Major Focus | Year of Publication
1 | C. Papageorgiou, T. Poggio | A Trainable System for Object Detection | Example-based learning approach towards a model of an object class by training a support vector machine classifier using a large set of positive and negative examples. | 2000
2 | G. Shu, G. Fu, P. Li and H. Geng | Violent Behavior Detection Based on SVM in the Elevator | To avoid fighting and violence occurring in the elevator, this paper proposed an abnormal-behavior detection method based on SVM to achieve real-time monitoring. | 2014
3 | S. Satheesh, S. Ma, O. Russakovsky, J. Deng, H. Su, J. Krause, Z. Huang, A. Khosla, A. Karpathy, M. Bernstein, A. C. Berg and L. Fei-Fei | ImageNet Large Scale Visual Recognition Challenge | Challenges of collecting large-scale annotations, and comparing computer-vision accuracy with human accuracy. | 2014
4 | Shahriar Shakir Sumit, Dayang Rohaya Amang Rambli, Seyedali Mirjalili | Vision-Based Human Detection Techniques: A Descriptive Review | To discuss the pros and cons of different real-time object detection techniques such as Viola-Jones, SIFT and more. | 2020
PROBLEM IDENTIFICATION
As new technology advances it also brings drawbacks, and e-waste is one such drawback, growing at an alarming rate.
Annually, 53.6 million tonnes of e-waste is produced globally, i.e. 7.3 kg per person; around 7.71 million tonnes of e-waste is produced by India alone, increasing at the rate of 8.86% per year.
A major part of this e-waste consists of damaged and obsolete electronic devices. By eliminating the keyboard and mouse, this project reduces that waste; it also helps bedridden patients communicate with others, as they can type and point to icons on the screen.
METHODOLOGY
• The idea is to use the camera integrated into the system and start capturing video of the user, from which the background is not processed: the user's hands are detected from the video and the rest is neglected. To perform this task we use MediaPipe, an open-source framework developed by Google in 2019 using machine learning, trained on thirty thousand manually annotated hand images.
• MediaPipe Hands utilizes an ML pipeline consisting of multiple models working together: a palm detection model that operates on the full image and returns an oriented hand bounding box, and a hand landmark model that operates on the cropped image region defined by the palm detector and returns high-fidelity 3D hand key points.
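The two-stage control flow of this pipeline can be sketched as below. This is a minimal illustration, not MediaPipe's actual implementation: `palm_detector` and `landmark_model` are placeholder stand-ins for the real trained models, and the oriented bounding box is simplified to an axis-aligned one.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class BBox:
    x: int
    y: int
    w: int
    h: int

def palm_detector(frame: List[List[int]]) -> BBox:
    # Placeholder: the real palm detector runs on the full image.
    # Here we pretend the palm occupies the centre quarter of the frame.
    h, w = len(frame), len(frame[0])
    return BBox(w // 4, h // 4, w // 2, h // 2)

def crop(frame: List[List[int]], box: BBox) -> List[List[int]]:
    return [row[box.x:box.x + box.w] for row in frame[box.y:box.y + box.h]]

def landmark_model(region: List[List[int]]) -> List[Tuple[float, float]]:
    # Placeholder: the real model returns 21 high-fidelity 3D key points.
    # Here we just return the four corners of the crop, normalised to 0..1.
    return [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def hands_pipeline(frame: List[List[int]]):
    box = palm_detector(frame)           # stage 1: full image -> bounding box
    region = crop(frame, box)            # crop to the detected hand region
    return box, landmark_model(region)   # stage 2: cropped region -> key points

frame = [[0] * 640 for _ in range(480)]  # dummy 640x480 frame
box, points = hands_pipeline(frame)      # box = BBox(160, 120, 320, 240)
```

The point of the two stages is efficiency: the expensive landmark model only ever sees the small cropped region, not the whole frame.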
• Initially, the data/result obtained from tracking is tough to understand, as the values are not in a proper format: the framework returns the X, Y and Z position values of the hand landmarks. So we use the "list" concept to arrange the obtained values in an understandable format, where each node of the list contains the associated landmark number and its X and Y positions (we neglect the Z values).
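This conversion step can be sketched as follows. It assumes, as MediaPipe does, that landmark coordinates arrive normalised to the 0..1 range, so they are scaled by the frame size to get pixel positions; the function name and the mock input are illustrative, not from the report.

```python
def landmarks_to_list(normalized, frame_width, frame_height):
    """Convert normalized (x, y, z) landmark tuples into [id, x_px, y_px] nodes.

    The Z value is read but neglected, matching the approach in the report.
    """
    landmark_list = []
    for lm_id, (x, y, _z) in enumerate(normalized):
        x_px = int(x * frame_width)   # scale normalised X to pixel column
        y_px = int(y * frame_height)  # scale normalised Y to pixel row
        landmark_list.append([lm_id, x_px, y_px])
    return landmark_list

# Two mock landmarks for a 640x480 frame:
mock = [(0.5, 0.5, 0.0), (0.25, 0.75, -0.1)]
result = landmarks_to_list(mock, 640, 480)
print(result)  # [[0, 320, 240], [1, 160, 360]]
```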
• The X and Y values are, respectively, the horizontal position of the hand and the height at which the hand is positioned, keeping the camera as the origin point.
• We further use these positions and feed them to the cursor, so that the X and Y positions become the input for the cursor's next location on the screen. Initially there may be many abrupt position changes due to fast hand movements, but this problem can be tackled by smoothing the X and Y values before they are given as input, thereby achieving gradual movement of the cursor.
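One way to realise both steps above is sketched below: a linear mapping from camera-frame coordinates to screen coordinates, followed by a simple smoothing step that moves the cursor only a fraction of the way towards the new target each frame. The smoothing factor of 5 is an illustrative assumption, and actually moving the OS cursor would need a library such as PyAutoGUI, which is omitted here.

```python
def map_to_screen(x, y, frame_size, screen_size):
    """Linearly map a camera-frame position to a screen position."""
    fw, fh = frame_size
    sw, sh = screen_size
    return x * sw / fw, y * sh / fh

def smooth(prev, target, factor=5):
    """Move only 1/factor of the way towards the target each frame,
    turning abrupt hand jumps into gradual cursor movement."""
    px, py = prev
    tx, ty = target
    return px + (tx - px) / factor, py + (ty - py) / factor

# Frame centre of a 640x480 camera maps to the centre of a 1920x1080 screen:
target = map_to_screen(320, 240, (640, 480), (1920, 1080))  # (960.0, 540.0)
cursor = smooth((0.0, 0.0), target)                         # (192.0, 108.0)
```

Over successive frames the cursor converges on the target, so fast hand movements produce a gradual glide rather than a jump.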
• A virtual keyboard needs to be displayed on the screen so that users can use it instead of the traditional keyboard. To perform the typing operation we use the same method discussed above, i.e. use the X and Y values to hover over the keys and then perform the click operation.
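The hover-and-click logic can be sketched as below. The key layout, sizes, and the pinch-based click gesture (index and thumb tips closer than a threshold) are illustrative assumptions; the report does not specify which gesture triggers the click.

```python
import math

# Each key: (label, x, y, width, height) in screen pixels. Illustrative layout.
KEYS = [("Q", 0, 0, 60, 60), ("W", 70, 0, 60, 60), ("E", 140, 0, 60, 60)]

def hovered_key(x, y, keys=KEYS):
    """Return the label of the key under the fingertip at (x, y), or None."""
    for label, kx, ky, kw, kh in keys:
        if kx <= x <= kx + kw and ky <= y <= ky + kh:
            return label
    return None

def is_click(tip_a, tip_b, threshold=30):
    """Assumed click gesture: treat a pinch (two fingertips closer than
    `threshold` pixels) as a click on the currently hovered key."""
    return math.dist(tip_a, tip_b) < threshold

print(hovered_key(85, 30))               # "W"
print(is_click((100, 100), (110, 105)))  # True (distance ~11.2 px)
```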
RESULTS
• The results obtained after developing and executing both modules, i.e. "HandTracking" and "Mouse", have been as expected. From the "HandTracking" module we are able to get the X and Y position values of the landmarks of the hand with respect to the camera, represented in the format [landmark_number, X_position_value, Y_position_value].
• From the "Mouse" module we are able to move the cursor through gestures and perform a click function.
• Though working, the system still has a few positioning issues which need to be fixed during further development of the project.
SNAPSHOTS