
Premier University

Department of Computer Science & Engineering

Course Code : CSE 451

Course Title : Neural Networks and Fuzzy Logic

Assignment Name : Deep learning-based model for real-time sign language recognition

Date of Submission: 08 / 04 / 2025

Submitted By

Name : Rahat Imroze Ahmed
ID : 0222220005101128
Department : CSE
Semester : 7th
Section : E
Introduction & Problem Statement:
Sign language recognition has gained significant attention in the fields of computer vision
and human-computer interaction. It is a crucial step in bridging communication gaps for
the hearing impaired and fostering a more inclusive society. One of the primary challenges in
sign language recognition is developing a model that can accurately interpret hand
gestures in real time while also handling factors such as variations in lighting,
hand position, and background clutter.

The problem of recognizing sign language using computer vision is a significant one due
to the need for high accuracy and adaptability in real-world scenarios. Traditional
methods require manual feature extraction, which can be complex and error-prone. With
deep learning, particularly convolutional neural networks (CNNs), we can automate
feature extraction and classification, leading to more efficient and accurate models for
recognizing American Sign Language (ASL) gestures.

This project focuses on building a CNN-based model to recognize ASL alphabets, aiming
for real-time performance in sign language interpretation and contributing to the
development of assistive technologies for the hearing impaired.


Data Preprocessing:
The dataset used for training the model consists of labeled images of ASL signs, where
each image represents a particular sign in the ASL alphabet. The preprocessing steps
undertaken are essential for standardizing the images and enhancing model performance.
1. Image Augmentation:
Although image augmentation is typically useful for preventing overfitting, the code does not include explicit augmentation steps such as rotation, flipping, or zooming. These can be added to improve the model's robustness to varied inputs (an example appears in the sketch after this list).

2. Resizing and Normalization:
In this setup, the images are resized to a uniform size of (224, 224), ensuring consistency in input dimensions. This helps the model learn better, as it processes images of the same size; pixel values can also be rescaled (e.g., to the [0, 1] range) for more stable training.

3. Train-Validation Split:
The dataset is split into two parts, 80% for training and 20% for validation, using the validation_split argument of the image_dataset_from_directory method. This ensures the model has a dedicated dataset for evaluating its generalization ability.

4. Batching:
A batch size of 32 is used, enabling efficient training by processing multiple images in parallel, which speeds up the learning process while maintaining accuracy.
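A minimal sketch of the loading and splitting steps described above, assuming a directory of class-labeled ASL images (the dataset path and seed are illustrative; the augmentation block is the optional addition mentioned in point 1):

import tensorflow as tf

IMG_SIZE = (224, 224)
BATCH_SIZE = 32

# Load labeled images from class subdirectories and split 80/20.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "asl_alphabet/",  # hypothetical dataset path
    validation_split=0.2,
    subset="training",
    seed=42,  # using the same seed on both calls keeps the split consistent
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "asl_alphabet/",
    validation_split=0.2,
    subset="validation",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
)

# Optional augmentation (not in the original code) to improve robustness.
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
    tf.keras.layers.RandomBrightness(0.2),
])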

Neural Network Architecture:


The architecture of the neural network used in this project is a simple yet effective
Convolutional Neural Network (CNN), which is suitable for image classification tasks.

1. Layers:

○ Convolutional Layers (Conv2D): These layers are responsible for learning spatial features from the input images. The network uses three convolutional layers with 32, 64, and 128 filters, respectively. Each convolutional layer is followed by a ReLU (Rectified Linear Unit) activation function, which introduces non-linearity and enables the model to learn complex patterns.

○ MaxPooling Layers (MaxPooling2D): After each convolutional layer, a max-pooling operation is applied to reduce the spatial dimensions of the feature maps. This decreases the number of parameters and computations in the model while preserving the important features.

○ Flatten Layer (Flatten): After feature extraction, the output of the last pooling stage is flattened into a one-dimensional array, which is then passed to a dense layer.

○ Fully Connected Layer (Dense): A dense layer with 128 units and ReLU activation is used to learn high-level representations of the features.

○ Output Layer (Dense): The output layer uses a softmax activation function to classify the input image into one of the ASL alphabet classes, with one neuron per class.

2. Activation Functions:

○ ReLU: Used in the convolutional and dense layers to introduce non-linearity, enabling the network to learn complex patterns in the data.

○ Softmax: Used in the output layer to generate a probability distribution over the possible classes.

3. Optimizer:

○ The Adam optimizer is used; it is popular for its efficiency in handling sparse gradients and its adaptive learning rate.
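A minimal Keras sketch of the architecture described above, using the (224, 224) input size from preprocessing and assuming 26 ASL alphabet classes and 3x3 convolution kernels (neither is stated in the report):

import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 26  # assumption: one class per ASL alphabet letter

model = tf.keras.Sequential([
    # Optional pixel normalization (see the preprocessing notes).
    layers.Rescaling(1.0 / 255, input_shape=(224, 224, 3)),
    # Three Conv2D blocks with 32, 64, and 128 filters, each followed by max pooling.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    # Flatten the feature maps and classify with dense layers.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # integer labels from image_dataset_from_directory
    metrics=["accuracy"],
)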
Fig: CNN Model Details Diagram
Training and Evaluation Results:
The model was trained for 10 epochs, and the training process was monitored using both
accuracy and loss metrics.
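A minimal sketch of the training call implied here, assuming the model, train_ds, and val_ds objects from the earlier sketches:

# Train for 10 epochs, tracking accuracy and loss on both splits.
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10,
)

The results obtained are as follows: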

● Training Accuracy: The model reached a training accuracy of 100% by the end of the 10th epoch, indicating that it fit the training data completely; as discussed under the challenges below, this can also be an early sign of overfitting.

● Validation Accuracy: The validation accuracy was 95.22%, demonstrating that the model generalizes well to unseen data, though further improvement could be achieved in real-world scenarios with more diverse data.

Precision, Recall, and F1 Score:

● Precision: 95.98%
● Recall: 95.22%
● F1 Score: 95.25%

These metrics highlight the model's overall good performance in classifying ASL images, with a balanced trade-off between precision and recall. The high F1 score confirms that the model classifies consistently across the ASL classes rather than favoring precision at the expense of recall, or vice versa.
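A minimal sketch of how these aggregate metrics can be computed with scikit-learn, assuming the model and validation set from the earlier sketches (the weighted averaging is an assumption, since the report does not state how per-class scores were combined):

import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Gather labels and predictions batch by batch to keep them aligned.
y_true, y_pred = [], []
for images, labels in val_ds:
    preds = model.predict(images, verbose=0)
    y_true.extend(labels.numpy())
    y_pred.extend(np.argmax(preds, axis=1))

print("Precision:", precision_score(y_true, y_pred, average="weighted"))
print("Recall:", recall_score(y_true, y_pred, average="weighted"))
print("F1 Score:", f1_score(y_true, y_pred, average="weighted"))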

Fig: Testing the model

Confusion Matrix:
The confusion matrix is plotted to visualize the model's performance across different
ASL classes. It helps identify which classes the model struggles with, guiding future
improvements such as data augmentation or model refinement.
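A minimal sketch of the visualization step, assuming the y_true and y_pred lists gathered above (ConfusionMatrixDisplay is an assumed plotting choice; the original code appears only as a screenshot):

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(y_true, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=val_ds.class_names)
disp.plot(cmap="Blues", xticks_rotation="vertical")
plt.title("ASL Confusion Matrix")
plt.tight_layout()
plt.show()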
Fig: Confusion Matrix

Challenges in Real-Time Sign Language Recognition:


While the model demonstrates promising results in a controlled validation setting,
real-time sign language recognition presents several challenges:

1. Lighting and Background Variability:
In real-world applications, changes in lighting conditions or complex backgrounds can negatively affect model performance. To address this, additional data augmentation techniques such as random brightness or contrast adjustments can be implemented.

2. Hand Gestures and Variations:
Hand gestures may vary depending on the person performing them, including the size, speed, and orientation of the hand. These variations introduce noise and can lower accuracy. More diverse data and more complex models (e.g., incorporating pose estimation) could help mitigate this issue.

3. Real-Time Processing:
Real-time recognition requires the model to process images quickly, which can be a challenge for deep learning models, particularly on edge devices. Optimizing the model for faster inference (e.g., using model quantization or lightweight architectures like MobileNet) would be crucial for real-time use.

4. Overfitting:
Given the relatively small dataset, there is a risk of overfitting the model to the training data. Using techniques like dropout, data augmentation, or increasing the dataset size would help improve generalization.

5. Model Deployment:
Deploying the trained model in real-world applications such as mobile or embedded systems would require model optimization for memory and computation efficiency. Edge AI frameworks like TensorFlow Lite can be considered for deployment in resource-constrained environments (a conversion sketch follows this list).
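A minimal sketch of such a conversion with TensorFlow Lite, assuming the trained model from the earlier sketches (the output filename is illustrative):

import tensorflow as tf

# Convert the trained Keras model to the TensorFlow Lite format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Post-training quantization shrinks the model and speeds up inference.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Save the converted model for a mobile or embedded deployment.
with open("asl_model.tflite", "wb") as f:
    f.write(tflite_model)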

Conclusion and Future Work:


The model performs well in classifying ASL alphabets with high accuracy, precision,
recall, and F1 score. However, to achieve robust real-time sign language recognition,
further enhancements are necessary, such as incorporating hand tracking, leveraging more
diverse datasets, and optimizing the model for edge deployment.

Future work may include:

● Expanding the dataset to cover a wider range of gestures.
● Experimenting with more complex models (e.g., using Recurrent Neural Networks for sequential gestures).
● Enhancing the model with data augmentation techniques to improve generalization.
● Exploring lightweight models for real-time deployment on mobile devices.

Code Screenshots:

Figure 1: Dataset split into train and test data

Figure 2: CNN Model Layers & Activation Function

Figure 3: Training the model and displaying its accuracy

Figure 4: Testing the model

Figure 5: Code for visualizing the confusion matrix

Link of the code: https://www.kaggle.com/code/rahatimroze/nnflassignment


Reflection on Complex Problem-Solving Aspects:
● Does the solution require in-depth engineering knowledge?

Yes. Developing a real-time sign language recognition system involves a solid understanding of computer vision, machine learning, and deep learning architectures (such as CNNs). Additionally, it requires knowledge of preprocessing techniques like normalization and of handling image data effectively, which are crucial for building a robust model. Understanding metrics like precision, recall, and F1 score is also essential for evaluating performance.

● Are there conflicting technical and engineering challenges?

Yes. One major conflict lies between accuracy and real-time performance. While deep and complex models like CNNs can achieve high accuracy, they may not be fast enough for real-time deployment without optimization. Another challenge is balancing generalization (performing well for new users or under varying lighting and background conditions) against overfitting to the training data. Optimizing models for performance while keeping them lightweight is a significant engineering dilemma.

● Does it require abstract thinking and novel problem-solving techniques?
Absolutely. The project requires abstract thinking to design a neural network that
can interpret complex spatial patterns of hand gestures. It also involves thinking
creatively about data preprocessing, handling diverse gesture styles, and
considering future scalability to real-time systems. Using techniques like CNNs,
softmax activation, and confusion matrices to evaluate performance demonstrates a
high level of abstract reasoning.
● Are the issues encountered infrequent in standard engineering practice?

Yes. Real-time sign language recognition involves challenges that are not typically encountered in traditional software or hardware engineering. These include hand gesture ambiguity, variation in human physiology, dynamic backgrounds, and adapting AI systems to human motion, all of which are specialized and complex problems in the AI and computer vision domain.

● Does the solution involve adherence to specific standards (e.g., real-time AI processing)?

Yes. If this solution is to be used in real-world applications, it must meet standards for real-time inference, low-latency processing, and potentially accessibility or assistive-technology standards. Ensuring consistent performance across diverse users and environments is essential, as is optimizing the model for deployment using frameworks like TensorFlow Lite.

● Are there multiple stakeholders with different needs?

Yes. The primary stakeholders include:

● Hearing-impaired individuals, who need accurate and reliable sign recognition.
● Developers/engineers, aiming to optimize the system for real-time use.
● End-users, who might integrate the system into educational tools or translation services.
● Researchers, who may want to build on the system for gesture-based interaction.

These stakeholders have varying priorities: accuracy, speed, ease of integration, and reliability.

● How does this problem involve interdependence between AI, vision, and human-computer interaction?

This project is a perfect example of the interdisciplinary nature of modern problem-solving.

● AI (deep learning) is used for learning from image data.
● Computer vision handles preprocessing, interpreting pixel-level data, and extracting features.
● Human-computer interaction (HCI) is at the core, as the system's success depends on how intuitively and accurately it interprets user gestures.
