What Is Gesture1

The document discusses gesture recognition for human-computer interaction. It describes gesture recognition as a complex task involving motion modeling, analysis, and pattern recognition. It discusses using vision-based methods and glove-based methods for gesture input. The document then covers preprocessing techniques for gesture recognition, including detecting skin regions and motion. It also discusses training a neural network to recognize gestures for applications like sign language translation.

Uploaded by

vijaykumar950
Copyright
© Attribution Non-Commercial (BY-NC)

What is Gesture:-

A movement of a limb or the body as an expression of thought or feeling.

Gesture Recognition = Complex Task:-

• Motion modeling
• Motion analysis
• Pattern recognition
• Machine learning
• Psycholinguistic studies
• …

Introduction:-

• Interaction with computers is not a comfortable experience
• Computers should communicate with people through body language
• Hand gesture recognition becomes important
  • Interactive human-machine interfaces and virtual environments
• Two common technologies for hand gesture recognition
  • Glove-based method
    • Uses a special glove-based device to extract hand posture
    • Annoying to wear
  • Vision-based method
    • 3D hand/arm modeling
    • Appearance modeling
Background and Trends:-

• In today's world:
  • Many devices have integrated cameras
  • Many personal webcams
• Our goal:
  • To understand how to take advantage of these single-camera systems

Mood, emotion:-

• Mood and emotion are expressed by body language
  • Facial expressions
  • Tone of voice
• Allows computers to interact with human beings in a more natural way

Tasks to be performed:-

• Making gestures in front of the camera
• Gesture detection at a suitable frame rate
• Capturing the gestures and storing them in a .jpg file
• System training to recognize the gestures with a low error rate
• Execution of events on the webpage upon successful gesture recognition

Human Gesture Representation:-

• Psycholinguistics research by Stokoe:
  • Hand shape
  • Position
  • Orientation
  • Movement
• Application scenarios of gestures
  • Conversational
  • Controlling
    • e.g. vision-based interfaces
  • Manipulation
    • e.g. interacting with virtual objects
  • Communication
    • e.g. sign language → highly structured

CSL and Pre-processing:-

• Sign language
  • Relied upon for communication with the hearing society
  • Two main elements:
    • A low, simple level: the signed alphabet, which mimics the letters of the native spoken language
    • A higher level: signed language proper, using actions to mimic the meaning or description of the sign
• CSL is the abbreviation for Chinese Sign Language
• The 30 letters of the CSL alphabet ↔ the objects to be recognized

Human Computer Interface using Gesture:-

• Replace mouse and keyboard

• Pointing gestures

• Navigate in a virtual environment

• Pick up and manipulate virtual objects

• Interact with a 3D world

• No physical contact with computer

• Communicate at a distance

Gesture Making:-

• Use of a small set of gestures (fingers)
• Each finger raised performs some predefined navigation of the webpage
• System capabilities can be programmed to accommodate other human gestures as well
• Errors in detection can be reduced by training

Pre-processing of Hand Gesture Recognition:-

• Detection of hand gesture regions
  • Aim: fix on the valid frames and locate the hand region within the rest of the image
  • Low time consumption → fast processing rate → real-time speed
• Detect the skin region within the image by using color
  • Each color has three components: hue, saturation, and value
  • Chroma, consisting of hue and saturation, is separated from value
  • Under different lighting conditions, chroma is invariant
• Color is represented in RGB space, and also in YUV and YIQ space

• In YUV space
  • Saturation → displacement: C = √(U² + V²)
  • Hue → angle: θ = tan⁻¹(V / U)

• In YIQ space
  • The color saturation cue I is combined with θ to reinforce the segmentation effect
  • Skin tones lie between red and yellow
• Transform each color pixel P from RGB to YUV and YIQ space
• The skin region is:
  • 105° ≤ θ ≤ 150°
  • 30 ≤ I ≤ 100
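The per-pixel skin test above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the RGB→YUV/YIQ conversions use the standard NTSC coefficients, the thresholds come straight from the text, and the function name `is_skin` plus 8-bit RGB input are assumptions.

```python
import math

# Sketch of the chroma-based skin test described above.
# theta = hue angle from the YUV chroma components; I = YIQ in-phase
# (saturation) cue. Standard NTSC conversion coefficients are assumed.
def is_skin(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # YUV chroma components
    v = 0.877 * (r - y)
    theta = math.degrees(math.atan2(v, u)) # hue angle (atan2 keeps the quadrant)
    i = 0.596 * r - 0.274 * g - 0.322 * b  # YIQ saturation cue I
    return bool(105.0 <= theta <= 150.0 and 30.0 <= i <= 100.0)

print(is_skin(200, 120, 80))   # a reddish, skin-like tone
print(is_skin(0, 0, 255))      # pure blue
```

Using atan2 rather than a plain tan⁻¹(V/U) keeps the hue angle in the correct quadrant when U is negative, which is where the 105°–150° skin band lies.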

• Hands and faces
  • An on-line video stream containing hand gestures can be considered as a signal S(x, y, t)
    • (x, y) denotes the image coordinate
    • t denotes time
  • Convert the image from RGB to HSI to extract the intensity signal I(x, y, t)
  • Based on the YUV and YIQ representation, skin pixels are detected and form a binary image sequence M′(x, y, t) – the region mask
  • Another binary image sequence M″(x, y, t), reflecting the motion information, is produced from every consecutive pair of intensity images – the motion mask
  • M(x, y, t), delineating the moving skin region, is obtained by a logical AND between the corresponding region mask and motion mask sequences
• Normalization
  • The detection results are transformed into gray-scale images of 36×36 pixels
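The mask combination and normalization steps can be sketched as follows. This is an assumed, minimal rendering of the pipeline described above, not the authors' implementation: `skin_mask` stands in for the chroma-threshold region mask M′, the motion threshold `tau` is a hypothetical parameter, and the resize is a crude nearest-neighbour resample.

```python
import numpy as np

def moving_skin_mask(intensity_prev, intensity_cur, skin_mask, tau=15):
    """M = M' AND M'': skin region mask ANDed with a frame-difference motion mask."""
    motion_mask = np.abs(intensity_cur - intensity_prev) > tau  # M''(x, y, t)
    return skin_mask & motion_mask                              # M(x, y, t)

def normalize_patch(gray, mask, size=36):
    """Crop the detected region and resample it to a size x size grey patch."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return np.zeros((size, size))
    crop = gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    ri = np.linspace(0, crop.shape[0] - 1, size).astype(int)  # row indices
    ci = np.linspace(0, crop.shape[1] - 1, size).astype(int)  # column indices
    return crop[np.ix_(ri, ci)]

# Toy frames: a bright square appears inside a larger skin-coloured area.
prev = np.zeros((100, 100))
cur = np.zeros((100, 100)); cur[20:60, 20:60] = 100.0
skin = np.zeros((100, 100), bool); skin[10:70, 10:70] = True
m = moving_skin_mask(prev, cur, skin)
patch = normalize_patch(cur, m)
```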

Gesture Detection:-

• Gestures are detected at a suitable frame rate
• The camera captures the hand gesture, and we apply the Canny edge detection algorithm to store the gestures in the following format
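In practice, full Canny edge detection (Gaussian smoothing, non-maximum suppression, hysteresis thresholding) is usually a library call, e.g. OpenCV's `cv2.Canny`. As a dependency-free illustration of what the stored edge map contains, here is only the Sobel-gradient core of the method; the kernel and threshold values are assumptions, and this simplified version omits the suppression and hysteresis stages.

```python
import numpy as np

# Sobel kernels for horizontal and vertical intensity gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
SOBEL_Y = SOBEL_X.T

def edge_map(gray, thresh=100.0):
    """Binary edge image: gradient magnitude above a threshold (Canny's core)."""
    h, w = gray.shape
    gx = np.zeros((h, w)); gy = np.zeros((h, w))
    for dy in range(3):                      # correlate with the 3x3 kernels
        for dx in range(3):
            patch = gray[dy:h - 2 + dy, dx:w - 2 + dx]
            gx[1:h - 1, 1:w - 1] += SOBEL_X[dy, dx] * patch
            gy[1:h - 1, 1:w - 1] += SOBEL_Y[dy, dx] * patch
    return np.hypot(gx, gy) > thresh

img = np.zeros((20, 20)); img[:, 10:] = 255.0   # vertical step edge
edges = edge_map(img)
```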

Locally Linear Embedding:-

• Sparse data vs. high-dimensional space
  • 30 different gestures, 120 samples/gesture
  • 36×36 pixels
  • 3600 training samples vs. d = 1296 dimensions
  • Difficult to describe the data distribution
• Reduce the dimensionality of the hand gesture images
• Locally Linear Embedding (LLE) maps the high-dimensional data to a single global coordinate system while preserving neighbouring relations
• Given n input vectors {x1, x2, …, xn} in d dimensions, the LLE algorithm produces {y1, y2, …, yn} in m dimensions (m << d)
  • Find the k nearest neighbours of each point xi
  • Measure the reconstruction error from approximating each point by its neighbours, and compute the reconstruction weights that minimize this error
  • Compute the low-dimensional embedding by minimizing an embedding cost function with the reconstruction weights
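The three LLE steps above can be sketched compactly. This is a minimal textbook-style implementation for illustration, not the authors' code; the regularization term `reg`, neighbour count `k`, and target dimension `m` are assumed parameters (in production, `sklearn.manifold.LocallyLinearEmbedding` is the usual choice).

```python
import numpy as np

def lle(X, k=5, m=2, reg=1e-3):
    """Locally Linear Embedding of n points X (n x d) into m dimensions."""
    n = X.shape[0]
    # Step 1: k nearest neighbours of each point (brute-force distances).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]
    # Step 2: reconstruction weights minimizing ||x_i - sum_j w_ij x_j||^2.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                    # neighbours shifted to origin
        C = Z @ Z.T                              # local covariance
        C += reg * np.trace(C) * np.eye(k)       # regularize for stability
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()              # weights sum to 1
    # Step 3: embedding minimizes the same cost over the low-dim points y_i:
    # eigenvectors of (I - W)^T (I - W), skipping the bottom constant one.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:m + 1]

Y = lle(np.random.RandomState(0).randn(40, 10), k=6, m=2)
```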

Sign Language:-

• 5000 gestures in the vocabulary
• Each gesture consists of a hand shape, a hand motion and a location in 3D space
• Facial expressions are important
• Full grammar and syntax
• Each country has its own sign language
  • Irish Sign Language is different from British Sign Language or American Sign Language

[Images: hand shapes for the letters A, B and C]
System Training:-

• System training is done using "Neuroph", an open-source image recognition tool that takes images as input and produces a neural network
• This neural network can be trained to recognize the gestures
• It can be integrated into our application through Java classes, using the plug-in provided with the tool

Datagloves:-

• Datagloves provide very accurate measurements of hand shape
• But they are cumbersome to wear
• Expensive
• Connected by wires, which restricts freedom of movement

Datagloves - the future:-

• Will get lighter and more flexible
• Will get cheaper (~ $100)
• Wireless?

Our vision-based system:-

• Wireless & flexible
• No specialised hardware
• Single camera
• Real-time

Coloured Gloves:-

• User must wear coloured gloves

• Very cheap

• Easy to put on

• BUT get dirty

• Eventually we wish to use natural skin

Feature Space:-

• Each point represents a different image
• Clusters of points represent different hand shapes
• The distance between points depends on how similar the images are
• A continuous gesture creates a trajectory in feature space
• We can project a new image onto the trajectory
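One simple way to project a new image onto a gesture trajectory is nearest-point matching in feature space, which also tells us how far through the gesture the new frame is. This is an assumed illustration of the idea, not the system's actual projection method; the function name and the toy 2-D trajectory are hypothetical.

```python
import numpy as np

def project_onto_trajectory(trajectory, point):
    """Return (index, distance) of the trajectory point closest to `point`."""
    d = np.linalg.norm(trajectory - point, axis=1)  # Euclidean distances
    i = int(np.argmin(d))
    return i, float(d[i])

# Toy trajectory of feature-space points traced by a continuous gesture.
traj = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0]])
idx, dist = project_onto_trajectory(traj, np.array([1.9, 1.2]))
```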

Experiments:-

• 4125 images including all 30 hand gestures
• 60% for training, 40% for testing
• For each image:
  • 320×240 image, 24-bit color depth
  • Taken from the camera at different distances and orientations
Experiment Results:-

Data      # of Samples   Recognized Samples   Recognition Rate (%)
Training  2475           2309                 93.3
Testing   1650           1495                 90.6
Total     4125           3804                 92.2
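The recognition rates in the table follow directly from recognized / samples, which a quick check confirms:

```python
# rate = recognized / samples * 100, rounded to one decimal place.
results = {"Training": (2475, 2309), "Testing": (1650, 1495), "Total": (4125, 3804)}
for name, (samples, recognized) in results.items():
    print(name, round(100 * recognized / samples, 1))
```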

Conclusion:-

• Robust against similar postures under different lighting conditions and backgrounds
• Fast detection process, allowing real-time video applications with low-cost sensors such as a PC and a USB camera
