
Turkish Journal of Physiotherapy and Rehabilitation; 32(2)

ISSN 2651-4451 | e-ISSN 2651-446X

A TOOL TO CONVERT AUDIO/TEXT TO SIGN LANGUAGE USING PYTHON LIBRARIES

DR. M. MADHUBALA1, N. V. KRISHNA RAO2, DR. Y. MOHANA ROOPA3, D. KALYANI4, M. PRANAY5, CH. KRISHNAM RAJU6
1,2,3,4,5,6 Department of Computer Science and Engineering, Institute of Aeronautical Engineering, Dundigal, Hyderabad-500043.

ABSTRACT:

Deaf people's mother tongue is sign language, a visual language. Unlike acoustically transmitted sound patterns, sign language employs body language and manual communication to express a person's thoughts, integrating hand shapes, hand orientation and movement, and facial expressions at the same time. It can be used by people who have difficulty hearing, people who can hear but cannot speak, and hearing people who wish to communicate with people who are deaf or hard of hearing. Access to a sign language is important for a deaf person's psychological, mental, and linguistic development. Deaf people's first language should be accepted, and their schooling should be conducted bilingually in sign language and written or spoken language. Deaf and hard-of-hearing people in India use Indian Sign Language to communicate by making various body gestures. There are various communities of deaf people all over the world, and their languages differ as well: American Sign Language (ASL) is used in the United States, British Sign Language (BSL) in the United Kingdom, and Indian Sign Language (ISL) in India for conveying feelings and communicating. Manual signs and body language (non-manual communication) are used to express emotions, concepts, and feelings in Indian Sign Language (ISL). ISL signals can be grouped into three categories: one-handed, two-handed, and non-manual. Manual signs, such as one-handed and two-handed signs, are made with the signer's hands to communicate information. Non-manual signs are produced by altering body posture and facial expressions. The website we are going to create focuses mainly on manual signs and converts English text into sign language, enabling hearing-impaired people in India to communicate with others.

Keywords: Sign Language, Natural Language Processing, Django, Asynchronous Server Gateway Interface (ASGI), Web Server Gateway Interface (WSGI).

I. INTRODUCTION
To communicate with each other and with other deaf people, deaf people require sign language. Furthermore, some ethnic groups with very distinct phonologies (such as Plains Indian Sign Language and Plateau Sign Language) have used sign languages to communicate with other ethnic groups. The study of the physical sounds of human speech is referred to as phonology, and a phonology can likewise be established for sign language: its phonemes are distinct signs in a sequence of hand signs, rather than sounds. The following variables are taken into consideration:

1. Hand shape: the configuration of the hand when rendering the sign.

2. Hand orientation: the direction in which the palm of the hand faces.

3. Placement: the location where the sign is made.

4. Movement: the motion of the hand when making the sign.

5. Line of contact: the part of the hand that makes contact with the body.

6. Plane: the distance between the hand and the body.

www.turkjphysiotherrehabil.org                  1767 

The fundamentals of sign languages have been presented in a concise manner. The Phonology section contains the core linguistic features that the scheme uses. The conditions taken into consideration are as follows:

1. Location: where the sign appears.

2. Movement: how the hand moves when making the sign (straight, swaying, circular).

3. Plane: the distance between the hand and the body.

The parameters applied to these characteristics in relation to position, motion, and plane have been taken from the various simple signs in the ASL dictionary.
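As an illustration only (the parameter values below are invented, not taken from any ASL dictionary), one sign's parameters under the scheme above can be modelled as a simple record:

```python
# Hypothetical record of one sign's phonological parameters,
# following the handshape/orientation/location/movement/plane
# scheme described above.
sign = {
    "handshape": "flat",         # configuration of the hand
    "orientation": "palm-down",  # direction the palm faces
    "location": "chest",         # where the sign is made
    "movement": "circular",      # straight, swaying, or circular
    "plane": "near-body",        # distance from the body
}
```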

Fig.1: Predefined gestures


Deafness and deaf people are as old as mankind, but the education of deaf people was first recorded in the 16th century. In Spain, deaf children from wealthy families were put in the care of a monk who taught them how to write; being able to speak was a legal requirement for inheriting property. The argument about oral vs. sign language, which has raged for decades, started with this case. The oral approach involves instructing or communicating with deaf people using spoken words; in Germany, this method was refined to the point that it became known as the "German Method." The "French Method" gets its name from the fact that sign language was developed and used extensively in French deaf schools. In 1880, an effort was made to scrub sign language from the face of the earth: hearing teachers and educationalists attended a conference in Milan (Italy) that issued a statement prohibiting further use of sign language in deaf schools. Sign language thus became a hidden language; deaf students continued to use it outside school, and it survived as a living language.

Four critical things define the creation of a sign: hand shape, position, movement, and orientation. In sign language, finger spelling refers to how the alphabet's 26 letters are formed on the fingers. Finger spelling is used to spell people's names, locations, ideas, and terms for which there are no signs or for which one has forgotten the sign. Finger spelling is not the same as sign language; it is a code-switching strategy in which you represent the written English word in space as you finger spell a term in English. Finger spelling is therefore restricted to individuals (deaf or hearing) who have been introduced to written English or some other spoken language.

Software Details:
1. Programming Language: Python 3, SQL
2. Packages Used: NLP libraries, Python web libraries (such as Django)
3. Tools: Blender
Hardware Details:
1. Device (Mobile/Laptop)
2. Mic
3. Monitor to view output

4. Internet for voice recognition

5. Web browser

Fig.2: Block diagram for Text/Audio to Sign Language Conversion

The existing solutions: Humans are making clever inventions every year to assist themselves and others who are affected by any disability, as technology advances at a breakneck pace. We want to make it easier for deaf people to communicate, so we've created a sign interpreter that converts audio to sign language automatically. Sign language is the only means of communication for the deaf. Physically handicapped people use sign language to communicate their feelings to others. Because common people struggle to learn the particular sign language, communication becomes impossible. Because sign language consists of a variety of hand movements and gestures, achieving the right precision at a low cost has become a mammoth task. We now have physical devices and software that can transform audio to sign language, so we're improving the tool with Natural Language Processing. The word library can be enlarged to include the vast majority of frequently used English words. Using various NLP algorithms, speech-to-text conversion can be improved and text processing can be optimized.

Our objective is to comprehend the challenges that specially abled people face on a daily basis and to devise a solution that is (a) cost-effective, (b) adoptable by people, and (c) simple to implement. Understanding the needs of the disabled community and finding a solution to them is crucial to making a difference, as is improving the physical and mental health of people with disabilities and their overall quality of life.

II. RELATED WORK

Deploying with ASGI and WSGI

Fig.3: Asynchronous server gateway interface

ASGI stands for Asynchronous Server Gateway Interface. It enhances the functionality of WSGI (Web Server Gateway Interface), which is a standardized method of communication between the server and a web application in most Python frameworks, including Django. Both ASGI and WSGI are specifications for providing a standard interface between Python web servers, applications, and frameworks. The only concern designers had when developing WSGI was to create a protocol that provides a common ground for web development, so that users can easily switch between multiple web frameworks without worrying about the specifics of how the new framework communicates with the server. And while WSGI did a good job of dealing with these concerns, when relatively new protocols other than HTTP (HyperText Transfer Protocol), particularly WebSocket, began to gain popularity among web developers, WSGI failed to provide a method of creating applications that could handle them.

Because WSGI applications can only accept requests from the server and return replies to the client, they are intrinsically suited to handling only the HTTP protocol. A WSGI application is a single, synchronous callable that takes a request as input and returns a response, which means:

 Connections are short-lived. This is ideal for HTTP but not for long-polling HTTP or WebSocket, which have long-lived connections.

 Requests in an application have only one route to follow, so protocols with multiple incoming events (such as receiving WebSocket frames) cannot be handled by a WSGI application.

ASGI is separated into two sections, each with its own set of responsibilities:

1. A protocol server that terminates sockets and maps them into connections and event messages for each connection.
2. An application that runs within the protocol server, is created once per connection, and handles event messages as they occur.
The server, like WSGI, hosts and runs the application within it, as well as passing incoming requests to it in a standardized format. Unlike WSGI, applications are objects that accept events instead of simple callables, and they must run as coroutines capable of handling asynchronous I/O operations. In contrast to WSGI, a connection consists of two parts:

1. A connection scope: a representation of a protocol connection with a user that lasts for the duration of the connection.
2. Events: messages sent to the application whenever anything occurs on that connection.

Applications are started by passing a connection scope, and then they run in an event loop, handling events and sending data back to the client in the form of events. A single incoming socket/connection is mapped to an application instance that lasts for the duration of the connection, or maybe a little longer if any cleanup is required. A single asynchronous callable defines an ASGI application. It accepts scope, which contains data about the user request; send, which enables you to send events to the client; and receive, which allows you to receive events from the client.
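A minimal sketch of such a callable, following the scope/receive/send shape defined by the ASGI specification (the response content here is invented):

```python
# A minimal ASGI HTTP application: an async callable taking the
# connection scope plus the receive and send awaitables.
async def app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",   # every event is a dict with a "type" key
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({
        "type": "http.response.body",
        "body": b"Hello, ASGI!",
    })
```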

The ASGI application can now accept both incoming and outgoing events thanks to this reordering of the design of the model, removing the WSGI restriction of a single path for incoming requests. Not only that, but an ASGI application can also run a background coroutine, allowing it to do more than just handle requests.

In an ASGI application, every event you send or receive is a Python dict with a predefined format. ASGI applications can be easily swapped between different web servers thanks to these predefined event formats. Every event has a type key at the root level, which can be helpful in determining the structure of the event.


Fig.4: ASGI application

Here application is a WSGI application that takes two parameters: environ, which holds information about the incoming request, and start_response, a callable that returns the HTTP header. To make ASGI backward compatible with WSGI applications, we need to allow it to run synchronous WSGI applications within an asynchronous coroutine. Additionally, ASGI receives incoming request information via scope, while WSGI accepts environ. As a result, we must map environ to scope.
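One way to sketch that backward-compatibility wrapper (heavily simplified; production code would use an adapter such as asgiref's WsgiToAsgi, and all function names here are assumptions):

```python
# Sketch: running a synchronous WSGI app inside an ASGI coroutine,
# mapping the ASGI scope onto a WSGI environ dict.
import asyncio
import io

def wsgi_app(environ, start_response):
    # A placeholder WSGI application to wrap.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from WSGI"]

def build_environ(scope, body=b""):
    # Map a subset of the ASGI scope onto a WSGI environ dict.
    return {
        "REQUEST_METHOD": scope.get("method", "GET"),
        "PATH_INFO": scope.get("path", "/"),
        "SERVER_PROTOCOL": "HTTP/1.1",
        "wsgi.input": io.BytesIO(body),
        "wsgi.url_scheme": scope.get("scheme", "http"),
    }

async def asgi_wrapper(scope, receive, send):
    environ = build_environ(scope)
    captured = {}
    def start_response(status, headers):
        captured["status"] = int(status.split()[0])
        captured["headers"] = headers
    # Run the synchronous WSGI app in a worker thread so the event
    # loop is not blocked.
    body_iter = await asyncio.to_thread(wsgi_app, environ, start_response)
    await send({
        "type": "http.response.start",
        "status": captured["status"],
        "headers": [(k.encode(), v.encode()) for k, v in captured["headers"]],
    })
    for chunk in body_iter:
        await send({"type": "http.response.body", "body": chunk, "more_body": True})
    await send({"type": "http.response.body", "body": b""})
```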

Fig.5: WSGI with ASGI Application

III. PROCEDURE AND METHODOLOGY

Fig.6: Procedure for sign language website creation

A. A2SL FOLDER:

1. ASGI.py and WSGI.py:
To make ASGI backward compatible with WSGI applications, we need to allow it to run synchronous WSGI applications within an asynchronous coroutine. Additionally, ASGI receives incoming request information via scope, while WSGI accepts environ. As a result, we must map environ to scope. ASGI stands for Asynchronous Server Gateway Interface. It enhances the functionality of WSGI (Web Server Gateway Interface), which is a standardized method of communication between the server and a web application in most Python frameworks, including Django. Both ASGI and WSGI are specifications for providing a standard interface between Python web servers, applications, and frameworks.

2. Homepage view.py, URLs and Settings:
The website view consists of “Home”, “About”, “Login”, “Signup”, “Contact” and “Convertor”. To make these views possible, all the HTML files related to those pages are programmed in view.py; when you click on “Login” on the website it should open the login page, and the settings are done accordingly.

B. ASSETS FOLDER:

All the recorded and created sign language animations are stored in this folder. If the input given by the user matches the name of a sign language animation, that output is displayed to the user.
u

C. TEMPLATES FOLDER:
Website designing is done here. Every HTML file added in view.py has its own description. For example, home.html contains information related to the homepage of the website and about.html contains information related to the people who created the website. Similarly, login.html contains information related to the login page.
D. SQLite3 file:
The data related to usernames and passwords on the login page is stored in a database file in SQLite3 format.

E. Main.py:
This is the main program. We have to import execute_from_command_line from django.core.management and run this main.py executable Python file in the Anaconda prompt. The prompt gives server information through a URL as output; paste that URL in a browser and the website is shown.

After website creation is done, the process goes this way:

Fig.7: Schematic diagram for sign language conversion

The methodology is as follows: input is given to the tool in the form of text/audio. If input is given as audio, it gets converted into text using speech recognition tools in Python based on Natural Language Processing.

A. Audio to Text Conversion:
Audio input is taken using the Python PyAudio module. Conversion of audio to text is done using the microphone. A dependency parser is used for analyzing the grammar of the sentence and obtaining relationships between words.

Fig.8: Audio to text conversion

B. Splitting:
If the word is found, send its video to the user as output; otherwise break the sentence into words, fetch the videos of those words, combine them into a single video, and show that video to the user as output.
Fig.9: Sentence splits into found words. Fig.10: Words not found in the database split into letters.

C. Searching:
Animations for the datasets are created using the Blender tool. The word library has been expanded to include most of the commonly used words in the English language, as well as technical words and others that are available. Speech-to-text conversion can be made more accurate and text processing can be optimized using various available NLP algorithms. The entered text/audio is searched for in the created database. Another database file is created through SQLite to store the usernames of users who sign up or log in to the website. Client and server communicate through Django modules (ASGI & WSGI).
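The user database step can be sketched with the standard library's sqlite3 module (the table and column names below are assumptions, and a real site should store password hashes, never plain text):

```python
# Hypothetical users table for the signup/login page, using SQLite3.
import sqlite3

conn = sqlite3.connect(":memory:")  # the real site would use a file
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT)")
# Store a user at signup time (password shown as a placeholder hash).
conn.execute("INSERT INTO users VALUES (?, ?)", ("demo_user", "demo_hash"))
conn.commit()
# Look the user up again at login time.
row = conn.execute(
    "SELECT password FROM users WHERE username = ?", ("demo_user",)
).fetchone()
```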

D. Fetching Sign Language Animation:

After clicking on the submit button, the sign language animation is displayed.

IV. IMPLEMENTATION AND RESULTS


Output generation

Output is generated as the equivalent sign language representation of the given English text. The system's output is a clip of ISL vocabulary. Every distinct word has a video in the predefined database, and the output video is a fused video of those words.

Fig.11: Execution

Fig.12: Front End Fig.13: Login Page


Fig.14: Speech/Text Input

It is not easy for a machine to learn our language, but it is possible with the assistance of NLP.

Here's how it works in practice:


The machine receives audio as input. That audio input is recorded by the machine. The audio is then converted to text and displayed on the screen. The NLP system breaks the text down into components, determines the context of the conversation and the person's purpose, and then chooses which command to execute based on the NLP results. In essence, NLP is the process of developing an algorithm that converts text into words and labels them based on their position and function in sentences.
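As a toy illustration of that last step (a real system would use a proper part-of-speech tagger such as NLTK's or spaCy's; the stopword list here is invented):

```python
# Toy sketch: split text into words and label each with its position
# and a crude function-word/content-word classification.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of"}

def label_tokens(text):
    labels = []
    for position, word in enumerate(text.lower().split()):
        role = "function-word" if word in STOPWORDS else "content-word"
        labels.append((position, word, role))
    return labels
```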

When you speak “Hello” and “Thank You” into the microphone, the following output appears:

Fig.15: “Hello” in sign Language Fig.16: Speech/Text Input


Fig.17: “Thank” in sign Language Fig.18: “You” in sign Language

V. CONCLUSION AND FUTURE SCOPE


A sign language translator comes in handy in a variety of situations. Anyone can use this system to learn and communicate in sign language in schools, colleges, hospitals, universities, airports, and courts. It facilitates contact between those who have normal hearing and those who have difficulty hearing. Future work is to improve the website UI and add new functionality. Various front-end options are available, such as .NET or an Android app, that can be used to make the system cross-platform and increase its availability. Although it is well recognized that facial expressions communicate an essential part of sign language, this project did not focus on them. We are excited to continue the project by incorporating facial expressions into the system.
