
REAL-TIME SIGN LANGUAGE TRANSLATOR FOR

SPECIALLY ABLED
A PROJECT WORK PHASE II REPORT

Submitted by

DEVANAND M (113119UG03019)
DHINESH M (113119UG03020)
KOMESH S (113119UG03051)
VELURU BALAJI (113119UG03112)
In partial fulfilment for the award of the degree of

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
VEL TECH MULTI TECH Dr. RANGARAJAN Dr. SAKUNTHALA
ENGINEERING COLLEGE, ALAMATHI ROAD, AVADI, CHENNAI-62
ANNA UNIVERSITY, CHENNAI 600 025.
MAY 2023
ANNA UNIVERSITY, CHENNAI 600 025
BONAFIDE CERTIFICATE

Certified that this project report titled “REAL-TIME SIGN LANGUAGE TRANSLATOR FOR SPECIALLY ABLED” is the bonafide work of DEVANAND M (113119UG03019), DHINESH M (113119UG03020), KOMESH S (113119UG03051) and VELURU BALAJI (113119UG03112), who carried out the project work under my supervision.

SIGNATURE SIGNATURE
HEAD OF THE DEPARTMENT SUPERVISOR
Dr.R.Saravanan, B.E, M.E(CSE)., Ph.D. Dr.R.Saravanan, B.E, M.E(CSE)., Ph.D.
PROFESSOR, PROFESSOR,
Department of Computer Science and Department of Computer Science and
Engineering, Engineering,
Vel Tech Multi Tech Dr. Rangarajan Vel Tech Multi Tech Dr. Rangarajan
Dr. Sakunthala Engineering College, Dr. Sakunthala Engineering College,
Avadi, Chennai-600 062 Avadi, Chennai-600 062
CERTIFICATE FOR EVALUATION

This is to certify that the project entitled “REAL-TIME SIGN LANGUAGE TRANSLATOR FOR SPECIALLY ABLED” is the bonafide record of work done by the following students, who carried out the project work under our guidance during the year 2022-2023, in partial fulfilment for the award of the Bachelor of Engineering degree in Computer Science and Engineering conducted by Anna University, Chennai.

DEVANAND M (113119UG03019)

DHINESH M (113119UG03020)

KOMESH S (113119UG03051)

VELURU BALAJI (113119UG03112)

This project report was submitted for the viva voce held on __________
at Vel Tech Multi Tech Dr. Rangarajan Dr. Sakunthala Engineering College.

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT
We wish to express our sincere thanks to the Almighty and to the people who extended their help during the course of our work. We are greatly and profoundly thankful to our honourable Chairman, Col. Prof. Vel. Shri Dr. R. Rangarajan B.E.(ELEC), B.E.(MECH), M.S.(AUTO), D.Sc., and Vice Chairman, Dr. Mrs. Sakunthala Rangarajan M.B.B.S., for facilitating us with this opportunity. We also record our sincere thanks to our honourable Principal, Dr. V. Rajamani M.E., Ph.D., for his kind support in taking up this project and completing it successfully. We would like to express our special thanks to our Head of the Department, Dr. R. Saravanan, B.E., M.E.(CSE), Ph.D., Department of Computer Science and Engineering, and to our project supervisor, Dr. R. Saravanan, B.E., M.E.(CSE), Ph.D., for their moral support, for taking a keen interest in our project work and guiding us all along until its completion, and for providing all the necessary information required for developing a good system. Further, this acknowledgement would be incomplete without a word of thanks to our most beloved parents for their continuous support and encouragement throughout the course, which has led us to pursue the degree and confidently complete the project work.

(DEVANAND.M) (DHINESH.M) (KOMESH.S) (VELURU BALAJI)


ABSTRACT
"Real-Time Sign Language Translator for specially abled" is a project
that aims to develop a system for accurately translating American Sign
Language (ASL) into text or speech in real-time. The system utilizes a
combination of machine learning frameworks and libraries, including
Tensorflow, Mediapipe, Pytorch, YOLO v5, and OpenCV, to process video
input of ASL signs and convert them into output that can be easily understood
by hearing individuals.To achieve a high level of accuracy, the project team
utilized a variety of techniques and technologies. First, they collected a large
dataset of ASL signs and corresponding translations, which was used to train
the machine learning models that power the translation system. This dataset
was augmented with additional data and carefully preprocessed to ensure that it
was suitable for training the models.Once the dataset was prepared, the team
used Tensorflow and Mediapipe to build and train machine learning models for
recognizing and interpreting ASL signs. These models were then integrated into
a pipeline that could process video input in real-time and generate translations
in the form of text or speech.To improve the accuracy of the translation system,
the team also used YOLOv5 and PyTorch to fine-tune the models and optimize
their performance. These tools allowed them to refine the models based on their
performance on the test dataset and make adjustments as needed to improve
their accuracy.Overall, the project was able to achieve an accuracy of 90.20%
on a test dataset, demonstrating the effectiveness of the machine learning
models and the overall system. This level of accuracy represents a significant
improvement over previous approaches to real-time ASL translation and has the
potential to significantly improve communication accessibility for deaf and
hard of hearing individuals. In the future, the project team hopes to continue
refining and improving the system, with the goal of reaching an even higher
level of accuracy and enabling even more seamless communication between
individuals who use ASL and those who do not.

KEYWORDS: American Sign Language, TensorFlow, MediaPipe, OpenCV, NVIDIA cuDNN, NVIDIA CUDA, PyTorch, YOLOv5.

I
TABLE OF CONTENTS

CHAPTER TITLE PAGE NO.


ABSTRACT I
1. INTRODUCTION 1
1.1 OVERVIEW 2
1.2. OBJECTIVE 2
1.3. EXISTING SYSTEM 3
1.4 .PROPOSED SYSTEM 3
2. LITERATURE REVIEW 5
2.1 Introduction 6
2.2 Merits and Demerits 6
3. SYSTEM DESIGN 9
3.1 ARCHITECTURE DIAGRAM 10
3.2 USECASE DIAGRAM 11
3.3 SEQUENCE DIAGRAM 12
3.4 COLLABORATION DIAGRAM 13
3.5 ARCHITECTURE FLOW DIAGRAM 14
3.6 DATA FLOW DIAGRAM 15
4. MODULES 17
4.1 MODULES LIST 18
4.2 MODULES DESCRIPTION 18
4.2.1 Creating CNN Model 18
4.2.2 Testing CNN model 20
4.2.3 Creating Flask app 21
5. SYSTEM SPECIFICATION 22
5.1 SOFTWARE REQUIREMENT SPECIFICATION 23
5.2 SYSTEM REQUIREMENTS 23
5.2.1 HARDWARE REQUIREMENTS 23
5.2.2 SOFTWARE REQUIREMENTS 23

II
6. SOFTWARE DESCRIPTION 24
6.1 Design and Implementation Constraints 25
6.1.1 Constraints in Analysis 25
6.1.2 Constraints in Design 25
6.2 System Features 25
6.2.1 User Interfaces 25
6.2.2 Hardware Interfaces 25
6.2.3 Software Interfaces 26
6.2.4 Communications Interfaces 26
6.3 User Documentation: 26
6.4 Software Quality Attributes: 26
6.4.1 User-friendliness 26
6.4.2 Reliability 26
6.4.3 Maintainability 26
6.5 Other Non-functional Requirements 27
6.5.1 Performance Requirements 27
6.5.2 Safety Requirements 29
6.5.3 Product Features: 29
6.5.4 Test Cases 29
7. CONCLUSION 31
7.1 CONCLUSION 32
7.2 FUTURE ENHANCEMENT 32
APPENDICES 33
APPENDIX-1 33
SCREENSHOTS 33
APPENDIX-2 38
IMPLEMENTATION CODE 42
REFERENCES 59

III
LIST OF FIGURES

FIGURE NO. NAME PAGE NO.

1.1 Yolo v5 workflow 4

3.1 Architecture Diagram 10

3.2 Use case Diagram 11

3.3 Sequence Diagram 12

3.4 Collaboration Diagram 13

3.5 Architecture flow diagram 14

3.6 User Diagram 15

3.7 Level 0 Diagram 15

3.8 Level 1 Diagram 16

3.9 Level 2 Diagram 16

3.10 Level 3 Diagram 16

4.1 Train Model 19

4.2 Dataset Collection 20

4.3 Sample Code for dataset collection 21

IV
FIGURE NO. NAME PAGE NO.

A.1 Landing Page 33

A.2 Choice Page 33

A.3 Translation Module-1 34

A.4 Translation Module-2 34

A.5 Login Page 35

A.6 Sign Up Page 35

A.7 Profile Page 36

A.8 About Page 36

A.9 Sign Language Translator used as a virtual cam in video conference (Google Meet) 37

A.10 Variation of Accuracy 38

A.11 Variation of Losses 38

A.12 CNN Model Summary 39

A.13 Accuracy Per Class 39

A.14 Confusion Matrix Scores 40

A.15 Precision and Recall Confidence Score 40

A.16 Validation - Batches (Testing) 41

A.17 Train - Batches (Testing) 41

V
LIST OF TABLES

TABLE NO. NAME PAGE NO.

6.1 Functional Requirements 27

6.2 Non-Functional Requirements 28

6.3 Test Cases 29

VI
CHAPTER 1
INTRODUCTION

1
1.1 OVERVIEW
Sign language is a gesture-based language which involves hand movements, hand orientation and facial expressions instead of acoustic sound patterns. This type of language has varying patterns according to the people who use it and is not universal; Nepali sign language differs from Indian and American sign language, and even varies within different parts of Nepal. However, since most people have no prior knowledge of sign language of any sort, it becomes harder and harder for deaf-mute people to communicate without a translator, and thus they feel ostracized. Sign language recognition has become a widely recognized communication model between deaf-mute people and hearing people. Recognition models are categorized into computer-vision-based and sensor-based systems. In computer-vision-based gesture recognition, a camera is used for input, and image processing of the input gestures is performed before recognition. The processed gestures are then recognized using algorithms such as Hidden Markov Models and neural network techniques. The main drawback of vision-based sign language recognition systems is that the image acquisition process is subject to many environmental concerns, such as the placement of the camera, background conditions and lighting sensitivity. However, it is easier and more economical than using sensors and trackers for data. Neural network techniques and Hidden Markov Models are also used together with sensor data for greater accuracy. The major dataset available so far covers the American Sign Language alphabet. The gesture datasets are preprocessed using Python libraries and packages such as OpenCV and skimage, and then trained using a CNN VGG-16 model. The recognized input is converted into speech. This provides one-way communication, as a person who does not understand sign language will get the meaning of the hand signs shown to them. Furthermore, to make two-way communication possible, this work also presents text-to-sign-language conversion, which allows a person who does not understand sign language to convert text into sign language finger spelling that the signer can understand.
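The preprocessing-and-training step mentioned above can be illustrated with a short, hedged sketch. The snippet below is not the project's actual training script: the directory layout, image size, layer widths and class count are assumptions made only for illustration, and it simply shows how a VGG-16 backbone could be fine-tuned on gesture images with TensorFlow/Keras.

# Hedged sketch (not the project's exact code): fine-tuning a VGG-16
# classifier on ASL gesture images with tf.keras. Paths, image size and
# the number of classes are illustrative assumptions.
import tensorflow as tf

IMG_SIZE = (224, 224)   # VGG-16's expected input resolution
NUM_CLASSES = 26        # e.g. one class per ASL alphabet sign

# Load images from a directory laid out as data/train/<class_name>/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32)

# VGG-16 backbone pretrained on ImageNet, with its classifier head removed
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=IMG_SIZE + (3,))
base.trainable = False  # freeze convolutional features for transfer learning

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),          # normalise pixel values
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)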

1.2. OBJECTIVE
The recognition of sign language gestures from real-time video and their successful classification into one of a list of categories has been a popular and challenging field of research. Many researchers have been working in this field for a long time, so we have also decided to contribute to it by working on it in our final-year major project. Liang et al. [6] have also published research on this concept, which has guided us throughout the implementation. Recognizing a sign language gesture and classifying it is the one-line definition of the task performed by the proposed system. Along with this, a text-to-ASL finger-spelling feature is also available, which makes two-way communication, from sign to text and from text to sign, possible. The following sections describe the steps taken while working on this project. Many vision-based and sensor-based techniques have been used for sign language recognition. Pavlovic et al., in a survey published in 1997, emphasize the advantages, shortcomings and important differences between gesture interpretation approaches, depending on whether a 3D model of the human hand or an image appearance model of the human hand is used. At the time that survey was done, 3D hand models offered a way of modelling hand gestures more elaborately, but led to computational hurdles that had not been overcome given the real-time requirements of HCI.

1.3. EXISTING SYSTEM


ASL recognition is not a new computer vision problem. Over the past 20 years, researchers have used classifiers from a variety of categories that can be grouped roughly into linear classifiers, neural networks and Bayesian networks. A real-time sign language translator is an important milestone in facilitating communication between the deaf community and the general public. Brandon Garcia and Sigberto Alarcon Viesca [1] were able to implement a robust model for the letters a-e and a modest one for the letters a-k. The Surrey and Massey ASL datasets, along with the GoogLeNet architecture, were used to train the system.

1.4 .PROPOSED SYSTEM


The objective of this project is to develop a real-time sign language translator using a webcam for communication applications such as Google Meet, Microsoft Teams and others. Our approach is to enhance the accuracy of our Convolutional Neural Network (CNN) model by utilizing YOLOv5 and PyTorch, and by reducing the feature extraction time. We aim to increase the program's responsiveness by implementing GPU-based neural network processing using PyTorch and NVIDIA cuDNN with the aid of the NVIDIA CUDA toolkit. Our model has a smaller size and better accuracy than the previous method thanks to the advantages of YOLOv5's Darknet-based architecture.

3
We will implement this program as a virtual camera on a device, allowing us to use the sign language translator to translate hand gestures in real-time feeds from the primary camera and to output the translated video with subtitles through the virtual camera of OBS software. By doing so, our proposed system will enable effective communication between hearing-impaired individuals and those who do not understand sign language. To evaluate the performance of our proposed system, we will conduct several experiments to compare its accuracy and response time with other state-of-the-art sign language translation systems. The results of these experiments will help us further improve our system and achieve better performance. In conclusion, our proposed work aims to develop a real-time sign language translator using YOLOv5 and PyTorch with the aid of the NVIDIA CUDA toolkit, implemented as a virtual camera on a device. The system will enable effective communication between hearing-impaired individuals and others who do not understand sign language. The proposed system's performance will be evaluated through experiments, and the results will be used to further improve the system.
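As a rough illustration of the GPU-accelerated inference described above, the sketch below shows one way the custom YOLOv5 weights could be loaded through torch.hub and moved onto a CUDA device. The weights path mirrors the appendix code; the device selection and confidence threshold are illustrative assumptions, not the exact production setup.

# Hedged sketch: loading custom YOLOv5 weights and running them on the GPU.
# The path 'signLang/weights/best.pt' mirrors body.py in Appendix II; the
# device handling and threshold here are illustrative only.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU

model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="signLang/weights/best.pt")
model.to(device)       # run inference on the CUDA-enabled GPU when available
model.conf = 0.65      # example confidence threshold for detections

# 'frame' would be a BGR image captured from the webcam with OpenCV:
# results = model(frame)
# print(results.pandas().xyxy[0])   # bounding boxes, confidences, class names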

FIG 1.1 Yolo v5 workflow

4
CHAPTER 2
LITERATURE REVIEW

5
2.1 Introduction
Disability impacts human life negatively, and each disability presents its own specific barriers. These barriers exclude people with disabilities from appropriate services that would facilitate their specific tasks using interactive systems, as they find it difficult to communicate with the user interfaces of digital applications (web, mobile, desktop, TV, etc.). Different solutions have been proposed, but they are still insufficient and inefficient considering the pervasive environment and the amount of contextual information it contains. Artificial Intelligence (AI), meanwhile, is an emerging technology that imitates the way the human brain thinks by integrating the computational power and speed of computing systems with human perception and intelligence. AI is growing and possesses the necessary tools that could help users with disabilities in accessing information. In fact, users with disabilities have to use interactive systems just as able-bodied users do, but they are unable to, because the user interfaces of interactive systems are not adapted to their capabilities. Therefore, we need to improve adaptive interactive systems in order to make them accessible to disabled users. Accessibility of user interfaces (UIs) is also an emerging and important domain that needs more and more investment. The solutions given so far are insufficient, superficial and limited to elementary disabilities. Therefore, to overcome these difficulties and challenges, we need to propose solutions that cover most users with disabilities from different cultural environments, considering most of the platforms used for interaction. This chapter consolidates research findings at the intersection of accessibility, user interfaces and artificial intelligence, and in the end presents a solution integrating the three. The transformative impact of artificial intelligence on our society will have far-reaching economic, legal, political and regulatory implications that we need to discuss and prepare for. Determining who is at fault if an autonomous vehicle hurts a pedestrian, or how to manage a global autonomous arms race, are just a couple of examples of the challenges to be faced.

2.2 J.J. DUDLEY AND P.O. KRISTENSSON, A REVIEW OF USER INTERFACE DESIGN FOR INTERACTIVE MACHINE LEARNING, ACM TRANS. INTERACT. INTELL. SYST. 1, 1, 2018

2.2.1 MERITS:

Make task goals and constraints explicit


Support user understanding of model uncertainty and confidence
Capture intent rather than input

6
2.2.2 DEMERITS:
Users can be imprecise and inconsistent.
There is typically a degree of uncertainty in the relation between user input and
user intent.

2.3 GUANGLIANG LI, RANDY GOMEZ, KEISUKE NAKAMURA, BO HE, HUMAN-CENTERED REINFORCEMENT LEARNING: A SURVEY, IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, VOL. 49, NO. 4, AUGUST 2019
2.3.1 MERITS
It is used to send the data packets securely from source to destination, without any interruption.

2.3.2 DEMERITS
Cryptography is not enough to defend against adversaries and insiders; careful protocol design is needed.

2.4 N. MEZHOUDI, USER INTERFACE ADAPTATION BASED ON USER


FEEDBACK AND MACHINE LEARNING, PP 25-28, 2013.

2.4.1 MERITS
It is used to send the data packets securely from source to destination, without any interruption.

2.4.2 DEMERITS
Cryptography is not enough to defend against adversaries and insiders; careful protocol design is needed to protect the user information.

2.5 T. LAVIE AND J. MEYER, BENEFITS AND COSTS OF ADAPTIVE USER


INTERFACES, INTERNATIONAL JOURNAL OF HUMAN-COMPUTER
STUDIES VOL 68,PP 508-524, 2010
2.5.1 MERITS
The study considers four different levels of adaptivity (ranging from manual to fully adaptive, with intermediate levels), routine (familiar) and non-routine (unfamiliar) situations, and different user age groups.
2.5.2 DEMERITS
We may find one application's interface clogged with controls that hamper the user from using it properly, or another's interface designed without taking care of universal usability.
7
2.6 N.INDURKHYA, F.J. DAMERAU, HANDBOOK OF NATURAL LANGUAGE
PROCESSING, CHAPMAN AND HALL/CRC; 2 EDITION, 2010.
2.6.1 MERITS
It is used to send the data packets securely from source to destination, without any interruption.

2.6.2 DEMERITS

Although logistic regression and naive Bayes share the same conditional class probability model, a major advantage of the logistic regression method is that it does not make any assumption about how x is generated.

2.7 Y. BENDALY HLAOUI, L. ZOUHAIER AND L. JEMNI BEN AYED, MODEL DRIVEN APPROACH FOR ADAPTING USER INTERFACES TO THE CONTEXT OF ACCESSIBILITY: CASE OF VISUALLY IMPAIRED USERS, JOURNAL ON MULTIMODAL USER INTERFACES, 2018.

2.7.1 MERITS
This paper presents a generic approach for the adaptation of UIs to the accessibility
context based on meta-model transformations

2.7.2 DEMERITS
In fact, the adaptation process has to be automatic and dynamic in order to free users with disabilities from controlling UI changes.

8
CHAPTER 3
SYSTEM DESIGN

9
3.1 ARCHITECTURE DIAGRAM

FIG 3.1 Architecture diagram

A data-flow diagram is a way of representing a flow of data through a process or a system (usually an information system). The DFD also provides information about the outputs and inputs of each entity and of the process itself. A data-flow diagram has no control flow: there are no decision rules and no loops. Specific operations based on the data can be represented by a flowchart.

10
3.2 USECASE DIAGRAM
A use case is a set of scenarios describing an interaction between a user and a system. A use case diagram displays the relationship among actors and use cases. Here the source is the user and the destination is the receiver; the source sends the beacon signal to the receiver, and the receiver then sends a response to the source node.

FIG 3.2 Use case diagram

A use case diagram is a graphical depiction of a user's possible interactions with a system.
A use case diagram shows various use cases and different types of users the system has
and will often be accompanied by other types of diagrams as well. The use cases are
represented by either circles or ellipses. The actors are often shown as stick figures.

11
3.3 SEQUENCE DIAGRAM

Sequence diagrams show a detailed flow for a specific use case, or even just part of a specific use case. The vertical dimension shows the sequence of messages/calls in the time order in which they occur; the horizontal dimension shows the object instances to which the messages are sent. The constituent objects are localizability testing, structure analysis, network adjustment and localizability-aided localization.

Fig 3.3 sequence diagram

A sequence diagram, or system sequence diagram (SSD), shows process interactions arranged in time sequence in the field of software engineering. It depicts the processes involved and the sequence of messages exchanged between the processes needed to carry out the functionality.

12
3.4 COLLABORATION DIAGRAM

Collaboration diagrams are a technique for defining external object behaviour. The diagram consists of ad hoc sensor networks that deploy the network nodes, send acknowledgements for localizability and transmit information to neighbouring nodes. Collaboration diagrams show how objects collaborate by representing objects as icons.

Fig 3.4 Collaboration diagram

Sequence diagrams are typically associated with use case realizations in the 4+1 architectural view model of the system under development. Sequence diagrams are sometimes called event diagrams or event scenarios. A sequence diagram depicts multiple processes or objects that exist simultaneously as parallel vertical lines (lifelines), and the messages passed between them as horizontal arrows, in the order in which they occur. This enables the graphical specification of simple runtime scenarios.

State: a condition or situation in the life cycle of an object during which it satisfies some condition, performs some activity or waits for some event.

Transition: a relationship between two states indicating that an object in the first state performs some actions and enters the next state on some event.

13
3.5 ARCHITECTURE FLOW DIAGRAM

The wireless ad hoc and sensor networks deploy the localization and non-localization nodes. Localizability-aided localization consists of localizability testing, structure analysis and network adjustment using tree-prediction mobility. It is used to increase the mobility values and to find the measured ranging.

Fig 3.5 Architecture flow diagram

Class diagrams model class structure and contents using design elements such as classes, packages and objects. Class diagrams describe the different perspectives taken when designing a system: conceptual, specification and implementation. Classes are composed of three things: a name, attributes and operations. Class diagrams also display relationships such as containment, inheritance and association. The association relationship is the most common relationship in a class diagram; the association shows the relationship between instances of classes.

14
3.6 DATA FLOW DIAGRAM
The data flow diagram is a graphic tool used for expressing system requirements in graphical form. The data flow diagram describes a step-by-step process: it contains source nodes, channel splitting, and beacons used to receive the signal in narrow areas and to give acknowledgement to the sender. Localizability-aided localization carries out the process of modelling mobile network behaviour and achieving the threshold.

Fig 3.6 user diagram


LEVEL 0:

Fig 3.7 level 0 diagram


15
LEVEL 1:

Fig 3.8 level 1 diagram


LEVEL 2:

Fig 3.9 level 2 diagram


LEVEL 3:

Fig 3.10 level 3 diagram

16
CHAPTER 4
MODULES

17
4.1 MODULES LIST
Creating CNN Model
Testing CNN model
Creating Flask app

4.2 MODULES DESCRIPTION


4.2.1 CREATING CNN MODEL
1. Create a data collection program to collect data from the user, detect the hands in the image frames, crop the detected hands and add a background of size 300 x 300 px.
2. Save the collected images in a Data directory which contains Train and Test folders.
3. Do this for all the dataset labels.
4. With the help of the MediaPipe module, a hand-tracking pipeline is applied to the whole hand dataset, which helps to increase the detection accuracy.
5. Import the TensorFlow library into the Python (.py) program to use the Keras module to create the CNN.
6. Use the Sequential module to build a CNN model and add the required layers, such as Conv2D, MaxPool, Flatten and Dense layers (a hedged sketch follows below).
7. To increase the accuracy and reduce the size of our model, we use YOLOv5 to train our model.
8. YOLOv5 produces a high-accuracy model.
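The sketch below illustrates step 6. It is not the project's exact model definition (see Fig 4.1 for the actual training code): the input size, layer widths and class count are assumptions chosen only to show how a Sequential CNN with Conv2D, MaxPool, Flatten and Dense layers might be assembled with TensorFlow/Keras.

# Hedged sketch of a small Sequential CNN for gesture classification.
# The 300 x 300 input (matching the cropped background described above)
# and the number of classes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26  # e.g. one class per ASL alphabet sign

model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(300, 300, 3)),  # normalise pixels
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPool2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPool2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # prints a layer-by-layer summary (cf. Fig A.12)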

18
Fig 4.1 Train Model (sample code)

19
4.2.2 TESTING CNN MODEL

1. By using the YOLOv5 model we can predict the output of our project.
2. We use the Opencv and Pytorch library to predict the Hand Gestures and we also use
this module to detect the hand in our image frame.
3. Using this, our image predicting model can detect the intended output with higher
accuracy.
4. Our CNN Model is trained to localize the hand of the person who is trying to
communicate, even in a messy background.
5. It is trained to get the most desired aspect to detect the hand tracking with higher
precision.
6. The '0' in the VideoCapture() method in the below program denotes the external
camera connected to the device.
7. The maxHands=1 in the HandDetector() method in the above program denotes that
we are detecting just the one hand of the person/user
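The sketch below illustrates steps 6 and 7. It is not the project's exact data collection script (Fig 4.2); it assumes the cvzone HandTrackingModule as the source of the HandDetector class, and the crop size, offset and save path are illustrative assumptions.

# Hedged sketch of a webcam data-collection loop using OpenCV and a
# cvzone HandDetector limited to one hand (maxHands=1), as in step 7.
# The 300 x 300 crop and the Data/Train/A save path are assumptions.
import cv2
from cvzone.HandTrackingModule import HandDetector

cap = cv2.VideoCapture(0)            # 0 = camera index (step 6)
detector = HandDetector(maxHands=1)  # track a single hand (step 7)
sample = None
counter = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hands, frame = detector.findHands(frame)      # detect and draw the hand
    if hands:
        x, y, w, h = hands[0]["bbox"]             # hand bounding box
        crop = frame[max(y - 20, 0):y + h + 20, max(x - 20, 0):x + w + 20]
        if crop.size:
            sample = cv2.resize(crop, (300, 300)) # 300 x 300 training image
            cv2.imshow("sample", sample)
    cv2.imshow("frame", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("s") and sample is not None:    # press 's' to save a sample
        counter += 1
        cv2.imwrite(f"Data/Train/A/img_{counter}.jpg", sample)
    elif key == ord("q"):                         # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()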

Fig 4.2 Dataset Collection

20
4.2.3 CREATING FLASK APP

1. Create a Flask app. Import line: from flask import Flask, render_template, Response, jsonify.
2. Add the required functions to render the templates.
3. Import the prediction program into the Flask app (import body, import cv2).
4. Import the train list into the Flask app: import trainlist.
5. Add the required routes to stream the video and the predicted label.
6. Run app.py (the Flask app). A minimal hedged sketch follows below; the full application is listed in Appendix II.
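The following stripped-down sketch shows the Flask wiring described above. The generator, route names and the body.collectData() call mirror the fuller app.py in Appendix II, but templates, authentication and error handling are omitted here.

# Hedged sketch: a minimal Flask app that streams annotated frames and
# exposes the current prediction as JSON. Based on the structure of app.py
# (Appendix II); 'body.collectData()' is that module's prediction function.
import cv2
from flask import Flask, Response, jsonify
import body  # prediction module (see body.py in Appendix II)

app = Flask(__name__)
label = "--"

def gen_frames():
    global label
    while True:
        image, pred_img, original, crop_bg, name = body.collectData()
        label = name if name else "--"
        frame = cv2.imencode(".jpg", image)[1].tobytes()
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + frame + b"\r\n\r\n")

@app.route("/video")
def video():
    # Multipart MJPEG stream that an <img> tag in the template can display
    return Response(gen_frames(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

@app.route("/label")
def label_text():
    return jsonify(label)  # polled by the front end to show the translation

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)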

Fig 4.3 Sample Code for data collection


21
CHAPTER 5
SYSTEM SPECIFICATION

22
5.1 SOFTWARE REQUIREMENT SPECIFICATION
The software requirements specification is produced at the culmination of the analysis task. The function and performance allocated to software as part of system engineering are refined by establishing a complete information description, a functional representation of system behaviour, an indication of performance requirements and design constraints, and appropriate validation criteria.

5.2 SYSTEM REQUIREMENTS


5.2.1 HARDWARE REQUIREMENTS:

Processor : Intel Core i3-9100F
Video card : NVIDIA GeForce GTX 1650 Super
Memory : 8 GB RAM
Resolution : 1024 x 768 minimum display resolution
Webcam : Minimum 720p

5.2.2 SOFTWARE REQUIREMENTS:

Software Tool : VS Code
Operating System : Windows 10
Processors : Any Intel or AMD x86-64 processor
RAM : 8 GB
Graphics Card : NVIDIA GTX 1050 Ti graphics card required
Packages Used : PyTorch, YOLOv5, Flask, OpenCV, MediaPipe, MongoDB, NVIDIA cuDNN, NVIDIA CUDA (a possible requirements file is sketched below).
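The package list above could be pinned in a requirements file. The sketch below is only an assumption about suitable PyPI package names, not an official dependency list from the project; cvzone is included only because it is assumed in the data-collection sketch of Section 4.2.2.

# Hedged sketch of a requirements.txt covering the packages listed above.
# Versions are omitted; a CUDA-enabled torch build (for CUDA/cuDNN support)
# should be installed per the instructions on pytorch.org.
torch
torchvision
yolov5          # or clone ultralytics/yolov5 and use torch.hub as in Appendix II
flask
opencv-python
mediapipe
pymongo         # MongoDB driver used by mongo.py
pyvirtualcam    # virtual camera used by SignCam.py
cvzone          # assumed helper for the HandDetector data-collection sketch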

23
CHAPTER 6
SOFTWARE DESCRIPTION

24
6.1 DESIGN AND IMPLEMENTATION CONSTRAINTS
6.1.1 CONSTRAINTS IN ANALYSIS

Constraints as Informal Text.
Constraints as Operational Restrictions.
Constraints Integrated in Existing Model Concepts.
Constraints as a Separate Concept.
Constraints Implied by the Model Structure.

6.1.2 CONSTRAINTS IN DESIGN

Determination of the Involved functions.


Determination of the Involved Objects.
Determination of the Require Clauses.
Global actions and Constraint Realization.

6.2 SYSTEM FEATURES

6.2.1 USER INTERFACES

The product provides a web-based graphical user interface rendered through Flask templates (landing, login, sign-up, profile, choice and translation pages, as shown in Appendix I).

Users interact with the system through the buttons and forms on these pages.

6.2.2 HARDWARE INTERFACES


The system can run in a Linux or Unix environment and has basic hardware needs such as random access memory. The Free Software Foundation views Linux distributions that use GNU software as GNU variants, and they ask that such operating systems be referred to as GNU/Linux or a Linux-based GNU system. The media and common usage, however, refer to this family of operating systems simply as Linux, as do many large Linux distributions. Some distributions, notably , use GNU/Linux. The naming issue remains
controversial. A router is an internetworking device that forwards packets between
networks by processing information found in the datagram or packet (Internet protocol
information from Layer 3 of the OSI Model). In many situations, this information is
processed in conjunction with the routing table (also known as forwarding table).

25
Routers use routing tables to determine what interface to forward packets (this can
include the "null" also known as the "black hole" interface because data can go into it,
however, no further processing is done for said data).

6.2.3 SOFTWARE INTERFACES


The application will be developed using NS 2.34 as the front end. The operating system used will be Linux Ubuntu 11.04, as the supporting environment for NS 2.34.
This software is interacted with the TCP/IP protocol.
• This product is interacted with the Linux
• This product is interacted with the Server Socket
• This product is interacted with TCL

6.2.4 COMMUNICATIONS INTERFACES

The TCP/IP protocol will be used to facilitate communications between the nodes.

6.3 USER DOCUMENTATION:

The application will have a user manual to help and guide users on how to interact with the system and perform various functions. The core components and their usage will be explained in detail.

6.4 SOFTWARE QUALITY ATTRIBUTES:

6.4.1 USER-FRIENDLINESS
The proposed system will be user-friendly and designed to be easy to use through a simple interface. The software can be used by anyone with the necessary computer knowledge. The software is built around an easy look-and-feel concept.

6.4.2 RELIABILITY

The system is designed not to crash or fail; in case of a system failure, recovery can be carried out using advanced backup features.

6.4.3 MAINTAINABILITY

All code shall be fully documented. Each function shall be commented with pre- and post-conditions. All program files shall include comments concerning the date of the last change. The code should be modular, to permit future modifications. For defects, the system maintains a solution database.

26
6.5 OTHER NON-FUNCTIONAL REQUIREMENTS

6.5.1 PERFORMANCE REQUIREMENTS

FUNCTIONAL REQUIREMENTS

FR.NO   FUNCTIONAL REQUIREMENT   SUB REQUIREMENTS (STORY / SUB TASK)

FR-1    User Registration        Registration through form, Gmail or LinkedIn
FR-2    User Authentication      Confirmation via email, OTP or voice recognition
FR-3    Reporting                Any problem faced by the customer should be reported automatically
FR-4    Audit Tracking           Streamline audit processes and comply with regulations or internal policy
FR-5    Historical Data          Data collected from past events must be used to improve further transactions

Table 6.1: Functional Requirements

27
NON-FUNCTIONAL REQUIREMENTS
The non-functional requirements are:

NFR.NO   NON-FUNCTIONAL REQUIREMENT   DESCRIPTION

NFR-1    Usability       The user interface should have sufficient contrast for partially blind people and should also be colour-blind friendly.
NFR-2    Security        The system should be resistant to cyberattacks, as the information shared is very confidential.
NFR-3    Reliability     Support should be provided for in-house or remote accessibility for external resources if required.
NFR-4    Performance     The site should load within 5 seconds when the number of simultaneous users is greater than 50,000.
NFR-5    Availability    Continuous availability of our service must be provided at all times.
NFR-6    Scalability     The application should run seamlessly with more than 50,000 users at the same time.

Table 6.2: Non-Functional Requirements

28
6.5.2 Safety Requirements

The software may be safety-critical. If so, there are issues associated with its integrity level. The software may not be safety-critical although it forms part of a safety-critical system; for example, software may simply log transactions. If a system must be of a high integrity level and the software is shown to be of that integrity level, then the hardware must be of at least the same integrity level. There is little point in producing 'perfect' code in some language if the hardware and system software (in the widest sense) are not reliable. If a computer system is to run software of a high integrity level, then that system should not at the same time accommodate software of a lower integrity level. Systems with different requirements for safety levels must be separated; otherwise, the highest level of integrity required must be applied to all systems in the same environment.

6.5.3 Product Features:

The product features are listed below:

Highly secure -> provides high throughput in a secured manner.

Communication -> communication occurs and data is sent along a data-forwarding path.

User friendly -> the architecture is simple and allows users to access the project easily.

6.5.4 Test Cases

29
Table 6.3 Test Cases

30
CHAPTER 7
CONCLUSION AND
FUTURE ENHANCEMENT

31
7.1 CONCLUSION

In conclusion, the integration of artificial intelligence techniques, such as machine


learning and deep learning, has enabled the development of adaptive user interfaces that
can empower users with disabilities, particularly those with hearing impairments. Our
proposed project leverages the power of TensorFlow and Mediapipe to create a feasible
communication channel between hearing-impaired and normal individuals. With an
accuracy of 90.2% under good lighting conditions, we believe that our approach has the
potential to revolutionize the way hearing-impaired individuals interact with the world.
As a next step, we plan to extend our system to real-time data to make it more practical
and useful in real-life scenarios. This project is just the beginning of what could be a
transformative technology for the disabled community, and we are excited to see where it
leads in the future.

7.2 FUTURE ENHANCEMENT

The future work for this project will involve several stages, starting with the
development of a robust sign language recognition system. This will require the
collection of a large dataset of sign language gestures and the training of a machine
learning model to recognize these gestures accurately. The system will also need to be
able to distinguish between different dialects of sign language and adapt to the user's
individual signing style.

Once the sign language recognition system is developed, the next stage will involve the
integration of the system with a speech or text translation engine. This will allow the
system to translate sign language gestures into spoken or written language in real-time,
enabling seamless communication between people who use sign language and those who
do not.

The final stage of the project will involve testing and refining the system to ensure its
accuracy and usability. This will involve user testing with individuals who use sign
language and those who do not to ensure that the system is effective in facilitating
communication between the two groups.

Overall, the development of a real-time sign language translator has the potential to
transform the lives of people with hearing and speech disabilities, enabling them to
communicate more effectively with others and breaking down barriers to communication
and social interaction.
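As one possible building block for the speech-output stage described above, the hedged sketch below uses the pyttsx3 library, which is an assumption of ours; the report does not name a specific text-to-speech engine.

# Hedged sketch: speaking a recognized gesture label with offline TTS.
# pyttsx3 is an assumed choice of library; the project report does not
# specify which speech engine would be integrated.
import pyttsx3

def speak_label(label: str) -> None:
    """Convert a recognized sign label (e.g. 'Hello') into audible speech."""
    engine = pyttsx3.init()
    engine.setProperty("rate", 150)  # speaking rate in words per minute
    engine.say(label)
    engine.runAndWait()

if __name__ == "__main__":
    speak_label("Hello")  # would be called with the predicted YOLOv5 class name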

32
APPENDIX I
SCREEN SHOTS 1
LANDING PAGE:

Fig A.1 Landing page


CHOICE PAGE :
In this page user can choose the translation module

Fig A.2 Choice page

33
SCREEN SHOT 2

Video Translate Page:


This shows the video output and the translated text in this screen

Fig A.3 Translation Module-1

AUDIO TRANSLATE PAGE:


In this page, user can translate his voice to text

Fig A.4 Translation Module-2


34
SCREEN SHOT 3

LOGIN PAGE :

Fig A.5 Login Page

SIGN UP PAGE :

Fig A.6 Signup Page

35
SCREEN SHOT 4

PROFILE PAGE :

Fig A.7 Profile page

ABOUT PAGE :

Fig A.8 About page

36
SCREEN SHOT 5

VIRTUAL CAMERA (GOOGLE MEET) :

Fig A.9 sign language translator used as a virtual cam in


Video conference (Google meet)

37
APPENDIX II

GRAPH:
The graphs below show the variation of training accuracy and loss on the y-axis, with the corresponding training progress on the x-axis.

RESULT – VARIATION OF ACCURACY

Fig A.10 Result – Variation of Accuracy

VARIATION OF LOSSES

Fig A.11 Result – Variation of Losses

38
CNN MODEL SUMMARY

Fig A.12 CNN Model Summary

Table A.13 Accuracy per class

39
Table A.14 Confusion Matrix Scores

Table A.15 Precision and Recall Confidence Score

40
TESTING CNN MODEL ACCURACY USING DIFFERENT TEST IMAGES

Fig A.16 Train - Batches (Testing)

Fig A.17 Validation - Batches (Testing)

41
IMPLEMENTATION CODE :

App.py (Main file)


This is the main program containing all the required functions for accessing
the web application (interface) of our Real-time Sign language translator.

CODE :

from flask import Flask, render_template, request,jsonify,Response


import backend.mongo as mongo
import backend.fpwd as fpwd
import body
import cv2
import webbrowser

username, email, password, disability, role, dob, slink, llink, glink, bio, gender = "", "", "", "", "", "", "", "", "", "", ""
video_camera = None
global_frame = None
image, pred_img,original,crop_bg,label = None, None, None, None, None

body.cap.release()

app = Flask(__name__)

#Index Page
@app.route('/home')
def index():
body.cap.release()
return render_template('index.html')

@app.route("/label") #for label


def label_text():
return jsonify(label)

42
@app.route("/translate") #for translation
def translate():
body.cap = body.cv2.VideoCapture(0, cv2.CAP_DSHOW)
txt=label_text()
return render_template('video_out.html',txt=txt.json)

def gen_vid(): # Video Stream


global video_camera
global global_frame
global label
while True:
global image, pred_img,original,crop_bg,name
image, pred_img,original,crop_bg,name= body.collectData()
if name != None:
label = name
else:
label = "--"
frame = cv2.imencode('.jpg', image)[1].tobytes()
if frame != None:
global_frame = frame
yield (b'--frame\r\n'
b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')
else:
yield (b'--frame\r\n'
b'Content-Type: image/jpeg\r\n\r\n' + global_frame + b'\r\n\r\n')

@app.route("/video") # Video page


def video():
return Response(gen_vid(),
mimetype='multipart/x-mixed-replace; boundary=frame')

@app.route("/about") # About page


def about():
body.cap.release()
return render_template('about_us.html')

43
@app.route("/signup") # Sign up page
def sign_up():
body.cap.release()
return render_template('sign_up.html')

@app.route("/") #for login page


def login():
body.cap.release()
return render_template('login.html',userinfo=None,accept=None)

@app.route("/profile") # Profile page


def profile():
body.cap.release()
userinfo=mongo.show(email)
return render_template('profile.html',userinfo=userinfo)

@app.route("/choice") # Choice page


def choice():
return render_template('choice.html')

@app.route("/audio") # Audio page


def audio():
body.cap.release()
return render_template('audio_out.html')

@app.route('/validate', methods=['POST'])
def validate_sign():
email = request.form['email']
password = request.form['password']
if mongo.validate(email,password):
userinfo=mongo.show(email)
return render_template('login.html',accept="success",userinfo=userinfo,email=email,password=password)
else:
return render_template('login.html',accept="failed",userinfo=None,email=email,password=None)
44
@app.route('/signup', methods=['POST'])
def getvalue():
global username, email, password, disability, role,gender
username = request.form['name']
email = request.form['email']
password = request.form['password']
disability = request.form['inputDisability']
role = request.form['inputRole']
gender = request.form['gender']
print(username, email, password, disability, role,gender)
if username and email and password and disability and role:
if not mongo.check(email):
print("Email does exist")
accept="exist"
return render_template('sign_up.html',accept=accept,email=email,
username=username)
else:
fpwd.otp(email)
print("Email does not exist")
return render_template('sign_up.html',accept="otp",email=email,
username=username)
else:
accept="failed"
return render_template('sign_up.html',accept=accept,email=email,
username=username)

@app.route('/otp', methods=['POST'])
def otp():
print("OTP: ")
print(fpwd.otp)
otp = request.form['otp']
print(otp)
print(username, email, password, disability, role,gender)
slink = "https://www.facebook.com/"

45
llink = "https://www.linkedin.com/"
glink = "https://www.github.com/"
if otp == str(fpwd.otp):
if mongo.insert(username,email,password,disability,role,dob, slink, llink,
glink, bio,gender):
accept="success"
return render_template('sign_up.html',accept=accept,email=email,
username=username)
else:
accept="otp-failed"
print("OTP failed")
return render_template('sign_up.html',accept=accept,email=email,
username=username)

@app.route('/forgot')
def forgot():
return render_template('login.html',accept="forgot",userinfo=None)

@app.route('/forgotpass', methods=['POST'])
def forgotpass():
email = request.form['f_email']
if not mongo.check(email):
fpwd.sendmail(email)
print("Email sent")
accept="sent"
return render_template('login.html',accept=accept,userinfo=None,email=email,password=None)
else:
print("Email not sent")
accept="not"
return render_template('login.html',accept=accept,userinfo=None,email=email,password=None)

46
@app.route('/changepass', methods=['POST'])
def changepass():
password = request.form['new_password']
email = request.form['chg_email']
print(password,email)
mongo.updatepwd(email,password)
userinfo=mongo.show(email)
return render_template('profile.html',userinfo=userinfo)

@app.route('/update', methods=['POST'])
def update():
name = request.form['name']
email = request.form['up_email']
role = request.form['role']
disability = request.form['disability']
dob = request.form['dob']
bio = request.form['bio']
slink = request.form['slink']
llink = request.form['llink']
glink = request.form['glink']
gender = request.form['gender']
print(name,email,role,disability,dob,bio,slink,llink,glink,gender)
mongo.update(name,email,disability,role,dob,slink,llink,glink,bio,gender)
userinfo=mongo.show(email)
return render_template('profile.html',userinfo=userinfo)

webbrowser.open('http://127.0.0.1:5000/')

if __name__ == '__main__':
app.run(host='0.0.0.0', threaded=True ,port=5000,debug=False)

47
body.py (prediction file)
The "body.py" program utilizes a trained machine learning model to make
predictions about the hand signs being made by a user. This program uses
advanced algorithms to analyze the input data and accurately identify the
specific hand signs being performed.

CODE :

import cv2
import mediapipe as mp
import numpy as np
import torch

#create mediapipe instance


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_holistic = mp.solutions.holistic

#initialize mediapipe instance


holistic = mp_holistic.Holistic(
min_detection_confidence=0.3,
min_tracking_confidence=0.3)

# #initialize webcam

cap = cv2.VideoCapture(0,cv2.CAP_DSHOW)

model = torch.hub.load('ultralytics/yolov5',
'custom',
path='signLang/weights/best.pt' ,force_reload=True)

48
# Draw landmarks on the image
def draw_landmarks(image, results):
# Drawing landmarks for left hand
mp_drawing.draw_landmarks(
image,
results.left_hand_landmarks,
mp_holistic.HAND_CONNECTIONS,
mp_drawing_styles.get_default_hand_landmarks_style())

# Drawing landmarks for right hand


mp_drawing.draw_landmarks(
image,
results.right_hand_landmarks,
mp_holistic.HAND_CONNECTIONS,
mp_drawing_styles.get_default_hand_landmarks_style())

#Drawing landmarks for pose


mp_drawing.draw_landmarks(
image,
results.pose_landmarks,
mp_holistic.POSE_CONNECTIONS,
mp_drawing_styles.get_default_pose_landmarks_style())

return image

letter = ""
offset = 1

#collect data
def collectData():
ret, frame = cap.read()
global letter

# Make a copy of the original image and empty black images


original = frame.copy()
crop_bg = np.zeros(frame.shape, dtype=np.uint8)
black = np.zeros(frame.shape, dtype=np.uint8)

49
image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
image.flags.writeable = False
# Holistic of the given image is stored in result variable
results = holistic.process(image)
image.flags.writeable = True
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
# Draw landmarks on the image
draw_landmarks(image, results)
draw_landmarks(black, results)
# Show output of the image with landmarks
results = model(black)
pred_img = np.squeeze(results.render())
#crop the predicted image bbox from the pytorch result
#print the class when 80% confidence is reached in pytorch
for result in results.pandas().xyxy[0].iterrows():
if result[1]['confidence'] > 0.8:

#To get the coordinates of the bounding box


x1 = int(result[1]['xmin'])-offset
y1 = int(result[1]['ymin'])-offset
x2 = int(result[1]['xmax'])+offset
y2 = int(result[1]['ymax'])+offset
#To prevent the coordinates from going out of bounds
if x1 < 0 or y1 < 0 or x2 < 0 or y2 < 0:
x1 = int(result[1]['xmin'])
y1 = int(result[1]['ymin'])
x2 = int(result[1]['xmax'])
y2 = int(result[1]['ymax'])
#To crop the image
crop_img = pred_img[y1:y2, x1:x2]
crop_bg[y1:y2, x1:x2] = crop_img

#To print the class


name=result[1]['name']
ret = True
return image, pred_img,original,crop_bg,name
return image, pred_img,original,crop_bg,letter

50
mongo.py (Database file)
This program contains all the necessary functions for database connectivity and for CRUD operations.
CODE :
from urllib.parse import quote_plus
from pymongo import MongoClient

# MongoDB connection string


user=quote_plus("USER_PASSWORD")

client=MongoClient("mongodb+srv://[USER_NAME]:"+user+"@[DBNAME].mi8y86o.mongodb.net/?retryWrites=true&w=majority")
db=client["login_info"]
col=db["login0"]

#Validate email and password


def validate(email,password):
if email and password:
data=col.find_one({"email":email,"password":password})
if data:
return True
else:
return False
else:
return False

#Insert data into database


def insert(name,email,password,disability,role,dob,slink,glink,llink,bio,gender):
if name and check(email) and password and disability and role:
query={"name":name,"email":email,
"password":password,"disability":disability,"role":role,"dob":dob,
"slink":slink,"glink":glink,"llink":llink,"bio":bio,"gender":gender}
col.insert_one(query)
return True
else:
return False

51
#Update data in database
def update(name,email,disability,role,dob,slink,llink,glink,bio,gender):
if not check(email):
if name and email :
col.update_many({"email":email}, {"$set":
{"name":name,"disability":disability,"role":role,"dob":dob,"slink":slink,"glink":
glink,"llink":llink,"bio":bio,"gender":gender}})
return True
else:
return False
else:
return False

def updatepwd(email,password):
if not check(email):
if password and email :
col.update_many({"email":email}, {"$set":{"password":password}})
return True
else:
return False
else:
return False

#Check if email already exists


def check(email):
if email:
data=col.find_one({"email":email})
if data:
return False
else:
return True
#show all data in database
def show(email):
if email:
data=col.find_one({"email":email},{"_id":0})
return data
else:
return False
52
fpwd.py (OTP and password functions)
This program provides access to security features, such as One-Time
Password (OTP) generation, password change, and password reset
functionality.
CODE :
from email.message import EmailMessage
import smtplib
import backend.mongo as mongo
import random

sender = "[email protected]"
pwd = "mchsweaeifjwqxxr"
receiver = ""
otp=""
user_otp =""

def sendmail(email):
global receiver
receiver = email
details = mongo.show(email)
print(email)

subject = "Password Reset Request"


msg = "Hello "+details["name"]+", \n\nWe received a request of forget
password for your account. \n\nThis is your password ' "+details["password"]+"
' of the requested Account registered to "+details["email"]+" .\n\nPlease note
that this is a confidential matter and we advise you to keep your password safe.
\n\nThank You,\nTeam RTSLT"
message = EmailMessage()
message['From'] = "RTSLT Server <[email protected]>"
message['To'] = receiver
message['Subject'] = subject
message.set_content(msg)
try:
server = smtplib.SMTP('smtp.gmail.com', 587)
server.ehlo()
server.starttls()
server.login(sender, pwd)
53
server.sendmail(sender, receiver, message.as_string())
print("Email sent successfully")
except Exception as e:
print("Error: unable to send email")
print(e)
finally:
server.quit()
def otp(email):
global receiver,otp
receiver = email
print(email)
otp = random.randint(100000,999999) # always generate a 6-digit OTP
subject = "OTP for Account Creation"
msg="Hello, \n\nThis is your OTP for account creation "+str(otp)+" of the
created Account registered to "+email+" .\n\nPlease note that this is a
confidential matter and we advise you to keep your password safe. \n\nThank
You,\nTeam RTSLT"
message = EmailMessage()
message['From'] = sender
message['To'] = receiver
message['Subject'] = subject
message.set_content(msg)
try:
server = smtplib.SMTP('smtp.gmail.com', 587)
server.ehlo()
server.starttls()
server.login(sender, pwd)
server.sendmail(sender, receiver, message.as_string())
print("Email sent successfully")
except Exception as e:
print("Error: unable to send email")
print(e)

finally:
server.quit()
print("OTP sent successfully")

54
SignCam.py (Virtual Camera for sign language translator)
Our real-time sign language translator was integrated into a virtual camera
using the pyvirtualcam library. This virtual camera was then connected to the
OBS (Open Broadcasting Software) platform to enable its usage in various
meeting and communication applications. This integration allows our translator
to work seamlessly with other software applications, making it more accessible
and convenient for users.

CODE :
import cv2
import mediapipe as mp
import numpy as np
import torch
import pyvirtualcam as pvc

#create mediapipe instance


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_holistic = mp.solutions.holistic

#initialize mediapipe instance


holistic = mp_holistic.Holistic(
min_detection_confidence=0.3,
min_tracking_confidence=0.3)

# #initialize webcam
cap = cv2.VideoCapture(0,cv2.CAP_DSHOW)
cap.set(cv2.CAP_PROP_FPS, 30)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

#model loading
# model = torch.hub.load('ultralytics/yolov5',
# 'custom',
# path='modelN800/weights/best.pt' ,force_reload=True)
model = torch.hub.load('ultralytics/yolov5', 'custom',
path='modelS/weights/best.pt' ,force_reload=True)

55
# Draw landmarks on the image
def draw_landmarks(image, results):
# Drawing landmarks for left hand
mp_drawing.draw_landmarks(
image,
results.left_hand_landmarks,
mp_holistic.HAND_CONNECTIONS,
mp_drawing_styles.get_default_hand_landmarks_style())

# Drawing landmarks for right hand


mp_drawing.draw_landmarks(
image,
results.right_hand_landmarks,
mp_holistic.HAND_CONNECTIONS,
mp_drawing_styles.get_default_hand_landmarks_style())

#Drawing landmarks for pose


mp_drawing.draw_landmarks(
image,
results.pose_landmarks,
mp_holistic.POSE_CONNECTIONS,
mp_drawing_styles.get_default_pose_landmarks_style())

return image

letter = ""
offset = 1
#collect data
def collectData():
with pvc.Camera(width=1280, height=720, fps=30) as cam:
while True:
ret, frame = cap.read()
global letter

# Make a copy of the original image and empty black images


original = frame.copy()
black = np.zeros(frame.shape, dtype=np.uint8)

56
image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
image.flags.writeable = False

# Holistic of the given image is stored in result variable


results = holistic.process(image)

image.flags.writeable = True
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

# Draw landmarks on the image


draw_landmarks(image, results)
draw_landmarks(black, results)

# Show output of the image with landmarks


results = model(black)
pred_img = np.squeeze(results.render())

image = original
start_point = (0, image.shape[0])
end_point = (image.shape[1], image.shape[0])
color = (0, 0, 0)
thickness = 150
line_type = cv2.LINE_AA
cv2.line(image, start_point, end_point, color, thickness, line_type)

#print the class when 65% confidence is reached in pytorch
text = "" # default label so the text overlay below always has a value, even when no gesture is detected
for result in results.pandas().xyxy[0].iterrows():
if result[1]['confidence'] > 0.65:
#To print the class
name=result[1]['name']
ret = True
text = name
print(text)

57
start_point = (0, image.shape[0])
end_point = (image.shape[1], image.shape[0])
color = (0, 0, 0)
thickness = 150
line_type = cv2.LINE_AA
cv2.line(image, start_point, end_point, color, thickness, line_type)

font = cv2.FONT_HERSHEY_DUPLEX
fontScale = 1.5
color = (255,255,255)
thickness = 2
lineType = cv2.LINE_AA

# Calculate the starting point of the text


(text_width, text_height), _ = cv2.getTextSize(text, font, fontScale,
thickness)
width, height, _ = image.shape
x = int((width - text_width) / 2)
y = int((height + text_height) / 2)
cv2.putText(image, text, (x+250,y+40), font, fontScale, color, thickness,
lineType)

image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)


cam.send(image)
cam.sleep_until_next_frame()

collectData()

cap.release()
cv2.destroyAllWindows()

58
REFERENCES

[1] "Real-Time Sign Language Translation Using Hand Tracking and Deep

Learning" by C. Lee et al., published in the IEEE Transactions on Multimedia

in 2020.

[2] "Real-Time Sign Language Translation using Deep Learning and

Convolutional Neural Networks" by A. K. Ghosh et al., published in the

Journal of Computer Science and Technology in 2019.

[3] "Real-Time Sign Language Translation using Machine Learning and Deep

Learning Techniques" by M. A. Ali et al., published in the Journal of

Multimedia Tools and Applications in 2019.

[4] "Real-Time Sign Language Translation using Deep Learning and

Computer Vision Techniques" by N. A. Khan et al., published in the

International Journal of Computer Science and Information Security in 2019.

[5] "Real-Time Sign Language Translation using Machine Learning and Deep

Learning Approaches" by M. I. Ahmed et al., published in the International

Journal of Advanced Computer Science and Applications in 2019.

[6] "Real-Time Sign Language Translation Using Machine Learning and Deep

Learning Techniques" by S. K. Saha et al., published in the International

Journal of Computer Applications in 2019.

[7] "Real-Time Sign Language Translation using Machine Learning and Deep

Learning Algorithms" by S. M. Islam et al., published in the International

Journal of Advanced Computer Science and Applications in 2019.

59
[8] "Real-Time Sign Language Translation using Deep Learning and Computer

Vision Techniques" by M. A. Hossain et al., published in the International Journal of

Computer Science and Information Security in 2019.

[9] "Real-Time Sign Language Translation using Machine Learning and Deep

Learning Approaches" by A. H. M. Hassan et al., published in the International

Journal of Advanced Computer Science and Applications in 2019.

[10] "Real-Time Sign Language Translation using Machine Learning and Deep

Learning Techniques" by M. A. Hossain et al., published in the International Journal

of Computer Science and Information Security in 2019

60
