Real-Time Sign Language Recognition and Translation Using Deep Learning Techniques
Abstract
Sign Language Recognition (SLR) identifies hand gestures and produces the corresponding text or speech.
Despite advances in deep learning, SLR still faces challenges in accuracy and visual quality. Sign
Language Translation (SLT) aims to translate sign language images or videos into spoken language, a task
hampered by the limited availability of language-comprehension datasets. This paper presents an approach
for sign language recognition and conversion to text using a custom dataset containing 15 classes, each
with 70-75 images. The proposed solution uses the YOLOv5 architecture, a state-of-the-art Convolutional
Neural Network (CNN), to achieve robust and accurate sign language recognition. With careful training and
optimization, the model achieves mean Average Precision (mAP) values of 92% to 99% across the 15 classes.
The extensive dataset combined with the YOLOv5 model enables effective real-time sign language
interpretation, showing the potential to improve accessibility and communication for the hearing
impaired. This work lays the groundwork for further advances in sign language recognition systems, with
implications for inclusive technology.
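For context, the mAP reported above is the mean Average Precision standard in object-detection evaluation, not a raw classification accuracy. A common formulation, with $p_c(r)$ denoting the precision of class $c$ at recall $r$, is:

\[
\mathrm{AP}_c = \int_0^1 p_c(r)\,dr, \qquad \mathrm{mAP} = \frac{1}{N}\sum_{c=1}^{N}\mathrm{AP}_c, \quad N = 15 \text{ classes here.}
\]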
Keywords: Sign Language Recognition (SLR), Sign Language Translation (SLT), YOLOv5 architecture,
Convolutional Neural Network (CNN), mAP values
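To make the pipeline concrete, the sketch below shows how a custom-trained YOLOv5 model could drive real-time recognition from a webcam and emit each detected sign as text. It is a minimal illustration under stated assumptions, not the authors' released code: the weights file "best.pt", the 0.5 confidence threshold, and the camera index are hypothetical choices for the example.

# Minimal sketch: real-time sign recognition with a custom-trained YOLOv5
# model. "best.pt" is a hypothetical weights file assumed to come from
# training on the 15-class sign dataset described in the abstract.
import cv2
import torch

# Load custom weights through the official YOLOv5 hub entry point.
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
model.conf = 0.5  # minimum confidence for a reported detection (example value)

cap = cv2.VideoCapture(0)  # default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # YOLOv5 expects RGB input
    results = model(rgb)
    # Each detection row: x1, y1, x2, y2, confidence, class index.
    for *box, conf, cls in results.xyxy[0].tolist():
        print(f"{model.names[int(cls)]}: {conf:.2f}")  # sign label as text
    annotated = results.render()[0]  # frame with boxes drawn (RGB)
    cv2.imshow("SLR", cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()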
1. Introduction
The main form of communication for the deaf and mute is sign language (SL), which differs from spoken or
written language in terms of vocabulary, meaning, and grammar. There are between 138 and 300 distinct
forms of sign language used worldwide; in India, which has 7 million deaf people, there are only around
250 licensed interpreters. This shortage makes it difficult to teach sign language to the community. To
overcome communication hurdles, sign language recognition uses computer vision and deep learning to
identify hand motions and transform them into text or voice [1]. The importance of a Sign Language
Recognition (SLR) system for hard-of-hearing and speech-impaired people is emphasised in the study.
Current SLR systems frequently depend on several depth-sensor cameras or pricey wearable sensors. The
suggested method presents a vision-based framework for multilingual sign language recognition that tracks
and extracts multi-semantic manual co-articulations, such as one- and two-handed signals, in addition to
non-manual components like body language and facial emotions. The objective is to isolate and extract
different signals and non-manual motions to create a