ISL Hand Gesture Recognition
Sabeshini.M, Prashanti.M, Priyanka.K
Information Technology, SRM Valliammai Engineering College, SRM University, Chengalpattu, India
Abstract
With the most recent developments, hand gesture recognition is becoming a popular and
efficient means of communication for the deaf and non-verbal community. Individuals with
hearing impairments require assistance globally, yet only a small percentage of people
worldwide are able to read sign language. Sign languages, sometimes referred to as signed
languages, are languages in which meaning is expressed visually rather than verbally. Extracting gestures from camera-captured images is difficult for a number of reasons.
Intensity changes pose challenges, noise and other spurious inputs slow down computation, and complicated backgrounds make gesture extraction even harder. Directional images are used to represent the pre-processed region of interest, and convex hulls are used to extract landmarks after manual segmentation. A Convolutional Neural Network (CNN) classifier then uses the extracted features for gesture detection and recognition.
As a result, a hand gesture recognition system that recognizes gestures irrespective of background clutter and noise was built using a CNN classifier.
Keywords: CNN, ISL
1. Introduction
To address the communication barrier described above, we present a design based on hand gesture recognition that recognizes hand gestures using several algorithms and produces its output in text format. The main aim of this design is to make communication trouble-free between people from all communities using advanced machine-learning techniques. The main difference from other work is that it addresses challenges such as generalization, accuracy, and efficiency.
2. Related Work
3. Proposed Work
The proposed model uses a CNN and the Convex Hull algorithm to translate ISL signs into text format.
Firstly, the proposed work mainly focuses on the CNN, with minimal use of the Convex Hull algorithm. A CNN, short for Convolutional Neural Network, is quite useful for computer vision, which is concerned with extracting information from images. Static images were used, and the data was collected from them.
Further steps were carried out to make the model work efficiently; these are explained in the following subsections.
Image Collection
Image Pre-Processing
In this phase, the CNN algorithm helps by recognizing patterns in images and videos. It consists of multiple layers: an input layer, convolutional layers, max pooling layers, dense layers, and an output layer.
Figure 2. Pictorial representation of CNN
The input layer receives the given image and passes it through the subsequent layers to produce the final output. In a CNN, the input is generally an image or a sequence of images.
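For concreteness, a minimal sketch of such a layer stack, written with the Keras API, is given below. The 64x64 grayscale input size, the two convolution/pooling stages, and the 26 output classes are illustrative assumptions rather than values fixed by this work.

    # Minimal CNN sketch matching the layer types listed above (assumed sizes).
    from tensorflow.keras import layers, models

    def build_cnn(input_shape=(64, 64, 1), num_classes=26):
        model = models.Sequential([
            layers.Input(shape=input_shape),                  # input layer
            layers.Conv2D(32, (3, 3), activation="relu"),     # convolutional layer
            layers.MaxPooling2D((2, 2)),                      # max pooling layer
            layers.Conv2D(64, (3, 3), activation="relu"),
            layers.MaxPooling2D((2, 2)),
            layers.Flatten(),
            layers.Dense(128, activation="relu"),             # dense layer
            layers.Dense(num_classes, activation="softmax"),  # output layer
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model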
The input image of a hand signing a letter is taken and passed through the program. The system is trained to recognize the distinct features of the images and stores them under a common label. The images are converted to grayscale so that the distinct features and the outline of the hand can be viewed without interference from factors such as skin texture and background objects.
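A minimal sketch of this conversion step, using OpenCV, might look as follows; the file path argument and the 64x64 target size are assumptions made for illustration.

    # Load an image, convert it to grayscale, and normalize it (assumed sizes).
    import cv2

    def preprocess(path, size=(64, 64)):
        image = cv2.imread(path)                        # BGR image from disk
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # drop colour and texture cues
        gray = cv2.resize(gray, size)                   # standardize the input size
        return gray.astype("float32") / 255.0           # scale pixels to [0, 1]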
Feature Extraction
Unlike the image in Figure 3, which has a clear outline of the hand, the image in Figure 4 contains noise. The noise arises from a non-blank background or a background with obstacles. Thus, making sure that the image is captured against a blank background is an essential step.
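As a sketch of how the convex hull mentioned earlier can be obtained from such an image, the OpenCV-based snippet below thresholds the pre-processed grayscale image, takes the largest contour as the hand, and computes its hull; the use of Otsu thresholding here is an assumption for illustration.

    # Extract the hand outline and its convex hull from a pre-processed image.
    import cv2

    def hand_hull(gray):
        # Otsu threshold to separate the hand from a (near-)blank background
        _, mask = cv2.threshold((gray * 255).astype("uint8"), 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        hand = max(contours, key=cv2.contourArea)  # largest contour = hand
        return cv2.convexHull(hand)                # hull points serve as landmarks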
Classification
After the image has been identified using the algorithm, the subsequent task involves
categorizing the images. This categorization process is achieved through machine learning
techniques applied to the trained datasets. Consequently, it enables the detection of gestures
performed within a video.
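A minimal sketch of this classification step is given below: a single pre-processed frame is passed to the trained network and mapped to a gesture label. The label list and the 64x64 input size are assumptions for illustration; for video, the same call can be applied frame by frame.

    # Classify one pre-processed grayscale frame with the trained model.
    import numpy as np

    def classify(model, gray, labels):
        batch = gray.reshape(1, 64, 64, 1)       # add batch and channel axes
        probs = model.predict(batch, verbose=0)[0]
        return labels[int(np.argmax(probs))]     # most probable gesture label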
The proposed work has been implemented in a series of steps as given below and the output is
obtained from the same.
1. Data collection - Gather a diverse dataset of hand gestures commonly used in ISL. This dataset should represent a wide range of signs and expressions within the language.
2. Preprocessing - Clean and preprocess the collected data. This may involve techniques such as noise reduction, normalization, and resizing to standardize the input data for the recognition system.
3. Feature extraction - Identify and extract relevant features from the pre-processed data. In the context of ISL hand gesture recognition, features could include the position of fingers, hand shape, and motion patterns.
4. Training the model - Train the selected model using the labeled dataset. The model learns to associate the extracted features with specific ISL hand gestures during the training process (a sketch of this step appears after the list).
5. Validation & testing - Evaluate the model's performance on separate datasets not used during training. This helps assess the model's ability to generalize to new data and ensures its reliability in real-world scenarios.
6. User interface - Develop a user-friendly interface that allows users to interact using recognized ISL hand gestures. The interface may include visual feedback or text translations of the gestures to enhance communication.
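As an illustration of steps 4 and 5, the sketch below trains and evaluates the CNN from the pre-processing subsection. The random placeholder arrays, the 80/20 split, and the epoch and batch-size settings are assumptions made purely for illustration; in practice, X and y would come from the data collection and pre-processing steps.

    # Train on one split of the data and test on the held-out remainder.
    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.random.rand(260, 64, 64, 1).astype("float32")  # placeholder images
    y = np.repeat(np.arange(26), 10)                      # placeholder labels, 10 per class

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)

    model = build_cnn()                                   # CNN sketch from earlier
    model.fit(X_train, y_train, validation_split=0.1,     # held-out validation data
              epochs=20, batch_size=32)

    loss, acc = model.evaluate(X_test, y_test)            # test-set generalization
    print(f"Test accuracy: {acc:.3f}")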
4. Conclusion
In summary, the use of Convolutional Neural Networks for ISL hand gesture
recognition holds great promise, with the potential to facilitate communication for
individuals with hearing impairments. As technology advances and datasets expand,
this approach can contribute significantly to the development of accessible and
inclusive applications in the field of sign language recognition.