Hussain 2017
Soeb Hussain and Rupal Saxena
Department of Chemistry
Indian Institute of Technology, Guwahati
Guwahati, India
[email protected] and [email protected]

Xie Han, Jameel Ahmed Khan, Prof. Hyunchul Shin
Dept. of Electronics and Communication Engineering
Hanyang University
Sangnok-gu, Korea
[email protected] and [email protected]

Abstract—To offer new possibilities for interacting with machines and to design more natural and intuitive interactions with computing machines, our research aims at the automatic interpretation of gestures based on computer vision. In this paper, we propose a technique that commands a computer using six static and eight dynamic hand gestures. The three main steps are hand shape recognition, tracing of the detected hand (if dynamic), and converting the data into the required command. Experiments show 93.09% accuracy.

I. INTRODUCTION

Gesture recognition is the mathematical interpretation of a human motion by a computing device. Modern research on the control of computers is moving from standard peripheral devices to remotely commanding computers through speech, emotions, and body gestures [1]. Our application belongs to the domain of hand gesture recognition, which is generally divided into two categories: contact-based and vision-based approaches. The second type is simpler and more intuitive, as it employs video image processing and pattern recognition.

The aim is to recognize six static and eight dynamic gestures while maintaining the accuracy and speed of the system. The recognized gestures are used to command the computer.

The division of hand gestures is explained in the block diagram shown in Fig. 1.

Each frame, after resizing and padding, is fed to the classifier. If the classified hand shape is a static gesture, it passes immediately to the commanding phase. Otherwise, it passes to the hand tracing phase. The block diagram of our proposed method is shown in Fig. 2.

Fig. 2. Workflow

II. HAND SHAPE RECOGNITION USING TRANSFER LEARNING

For hand shape recognition, the classifier is trained through the process of transfer learning [3] over a pretrained CNN that was initially trained on a large dataset.

Transfer learning is the transfer of the learned features of a pretrained network to a new problem. The initial layers of the pretrained network can be fixed, while the last few layers must be fine-tuned to learn the specific features of the new dataset.
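
As an illustration only, the idea of keeping the initial layers fixed while training the output layer can be sketched in plain Python. This is a toy stand-in and not the VGG16 pipeline used in this work. The two-input frozen layer and its weights, the three training samples and their labels, the learning rate, and the names FROZEN_WEIGHTS, frozen_features, head_probs, and train_head are all assumptions made for the example. Only the count of eleven output classes comes from the text. For brevity the sketch trains a single new output layer rather than fine-tuning several.

```python
import math

N_CLASSES = 11  # eleven hand shapes, as stated in the text

# Hypothetical "pretrained" layer: fixed weights that are never updated.
FROZEN_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]
N_FEATURES = len(FROZEN_WEIGHTS)


def frozen_features(x):
    """Fixed dense layer + ReLU, standing in for the frozen early layers."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def head_probs(head, features):
    """Class probabilities from the trainable output layer."""
    return softmax([sum(w * f for w, f in zip(row, features)) for row in head])


def train_head(samples, labels, epochs=200, lr=0.5):
    """Full-batch gradient descent on cross-entropy; only `head` changes."""
    feats = [frozen_features(x) for x in samples]  # frozen: computed once
    head = [[0.0] * N_FEATURES for _ in range(N_CLASSES)]
    losses = []
    for _ in range(epochs):
        grad = [[0.0] * N_FEATURES for _ in range(N_CLASSES)]
        loss = 0.0
        for f, y in zip(feats, labels):
            p = head_probs(head, f)
            loss -= math.log(p[y]) / len(feats)
            for c in range(N_CLASSES):
                err = p[c] - (1.0 if c == y else 0.0)
                for j, fj in enumerate(f):
                    grad[c][j] += err * fj / len(feats)
        for c in range(N_CLASSES):
            for j in range(N_FEATURES):
                head[c][j] -= lr * grad[c][j]
        losses.append(loss)  # loss measured before this epoch's update
    return head, losses


def predict(head, x):
    """Index of the most probable class for one input."""
    probs = head_probs(head, frozen_features(x))
    return max(range(N_CLASSES), key=lambda c: probs[c])


# Toy data: three 2-D inputs with made-up labels 0, 1, and 2.
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = [0, 1, 2]
frozen_before = [row[:] for row in FROZEN_WEIGHTS]

head, losses = train_head(samples, labels)
predictions = [predict(head, x) for x in samples]
```

Because the frozen layer never changes, its outputs are computed once and reused in every epoch, which is one practical reason for fixing the early layers.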

[Figure: block diagram of the gesture hierarchy. 14 gestures (11 shapes) divide into Static, 6 gestures (6 shapes), and Dynamic, 8 gestures (5 shapes). Dynamic gestures divide into Multidirectional, 2 gestures (2 shapes): Pointer, Cursor; and Unidirectional, 6 gestures (3 shapes): Swap Left, Swap Right, Zoom In, Zoom Out, Scroll Up, Scroll Down.]
Fig. 1. Division of eleven shapes into fourteen gestures

For hand shape recognition, a CNN-based classifier is trained through the process of transfer learning over a pretrained convolutional neural network that was initially trained on a large dataset. We use VGG16 [2] as the pretrained model.

In our work, VGG16, a CNN architecture, is used as the pretrained model. It consists of 13 convolutional layers followed by 3 fully connected layers. A convolutional neural network (CNN) is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex. We need to recognize eleven hand shapes, hence the CNN is trained as a classifier using the transfer learning method. To reach the desired output, the network model needs to be altered. Therefore, two layers of the model were replaced with a set of layers that can classify 11 classes. All other layers remained unaltered. To avoid overfitting, regularization was introduced along with a more diverse dataset. Regularization involves modifying the performance function, which is normally chosen to be the sum of the squares of the network errors on the training set. The classifier used over 55 thousand self-created image