Review of Related Literature - SignConnect
Tanzer (2024) conducted a study focusing on translating American Sign Language (ASL) to
English, with a specific emphasis on improving the recognition of fingerspelling within sentences. The
study compared byte-level tokenization (ByT5) with subword-level tokenization (T5). Findings revealed that ByT5 significantly
outperformed T5 in terms of accuracy, particularly in the translation of fingerspelled terms. The study also
highlighted key challenges in fingerspelling recognition, such as the difficulty caused by rapid, small, and
coarticulated hand movements, as well as the presence of out-of-vocabulary terms. These challenges
impacted overall translation quality. The research underscores the importance of accurate fingerspelling
recognition in enhancing the translation of ASL to English, especially when dealing with proper nouns and other out-of-vocabulary terms.
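A minimal sketch of the tokenization contrast at the heart of this comparison, assuming the Hugging Face transformers library and its public byt5-small and t5-small checkpoints (this is an illustration, not code from the study), shows how each tokenizer splits a proper noun:

```python
# A minimal sketch, assuming the Hugging Face transformers library and its
# public byt5-small / t5-small checkpoints; not code from Tanzer (2024).
from transformers import AutoTokenizer

byt5 = AutoTokenizer.from_pretrained("google/byt5-small")
t5 = AutoTokenizer.from_pretrained("t5-small")

# Fingerspelled proper nouns are often out-of-vocabulary for subword models.
word = "Quiapo"

# ByT5 tokenizes raw UTF-8 bytes, so any spelling is covered uniformly.
print(byt5.tokenize(word))  # one token per byte of the word
# T5's SentencePiece vocabulary may fragment a rare name unpredictably.
print(t5.tokenize(word))    # a handful of subword pieces
```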
In their study, Rivera and Ong (2018) explored the integration of both manual and non-manual
signals for recognizing Filipino Sign Language (FSL) using machine learning techniques such as Artificial
Neural Network (ANN) and Support Vector Machine (SVM). They employed the Microsoft Kinect sensor
to capture non-manual signals, focusing on facial expressions and head movements crucial to FSL
communication. The research utilized feature extraction techniques, including face orientation, Shape Units
(SU), and Animation Units (AU), with Genetic Algorithm applied for feature selection. However, the study
faced challenges in representing the intensities and co-occurrences of facial expressions, along with issues
of hand occlusions that hindered motion representation. Despite these challenges, the work aimed to
facilitate communication between the Deaf community and non-signers, particularly medical professionals.
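A minimal sketch of the classification stage, assuming pre-extracted Kinect-style facial features (face orientation, Shape Units, Animation Units) and using scikit-learn's SVM in place of the authors' exact setup, with synthetic stand-in data rather than their dataset, might look as follows:

```python
# A minimal sketch of SVM classification over pre-extracted facial features;
# the feature layout and class count are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))    # 20 facial-feature values per sample (synthetic)
y = rng.integers(0, 4, size=200)  # four hypothetical non-manual signal classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```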
Tolentino et al. (2019) developed a static sign language recognition system utilizing a vision-based approach with a web camera and Convolutional Neural Networks (CNN).
The system was designed to recognize static gestures, specifically targeting American Sign Language
(ASL) alphabets, numbers, and other common static signs. Key to their approach was the use of feature
extraction techniques, including skin-color detection and image processing methods, to enhance the clarity
and recognition accuracy of the captured gestures. While the system proved to be a valuable tool for those
learning basic sign language, the study also acknowledged several challenges, such as the need for proper
lighting and uniform backgrounds to optimize performance. Moreover, the system's limitation to static
gestures restricts its applicability in real-world dynamic communication contexts. Despite these limitations,
the research highlights the system’s potential as a practical learning tool to aid non-signers in interacting
with the hearing-impaired community without the need for gloves or sensors.
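A minimal sketch of such a pipeline, with OpenCV skin-color segmentation feeding a small Keras CNN (the HSV thresholds and network shape are illustrative assumptions, not the authors' settings), might look as follows:

```python
# A minimal sketch: HSV skin-color segmentation feeding a small CNN classifier.
# Thresholds, input size, and architecture are assumptions for illustration.
import cv2
import tensorflow as tf

def segment_skin(frame_bgr):
    """Keep only roughly skin-colored pixels via an HSV threshold."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 40, 60), (25, 255, 255))
    return cv2.bitwise_and(frame_bgr, frame_bgr, mask=mask)

# Small CNN over 64x64 segmented crops; 26 outputs for the ASL alphabet.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(26, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```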
Cayme et al. (2024) explored various approaches to Filipino Sign Language (FSL) recognition,
focusing on the combined use of Convolutional Neural Networks (CNN) and Long Short-Term Memory
(LSTM) models to improve dynamic gesture recognition. Unlike earlier systems that predominantly
addressed static signs, their work emphasized real-time gesture recognition for increased accuracy. The
study also identified feature extraction techniques, such as image pre-processing through cropping,
grayscale conversion, and normalization, which helped enhance the system’s performance. However, the
authors highlighted certain challenges, including the limited research available on FSL, the need for high
computational power, and environmental factors like lighting and skin tone that could influence accuracy.
Despite these challenges, the study underscored the potential applications of their system, particularly in
assistive technologies and educational tools, offering a lightweight model deployable on resource-constrained devices.
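A minimal sketch of a CNN-LSTM arrangement in this spirit, where a CNN encodes each frame and an LSTM models the temporal dynamics across a clip (the frame size, sequence length, and layer sizes are illustrative assumptions), might look as follows:

```python
# A minimal CNN-LSTM sketch for dynamic gesture clips; all dimensions are
# illustrative assumptions, not the authors' configuration.
import tensorflow as tf

SEQ_LEN, H, W = 16, 64, 64  # 16 grayscale frames per gesture clip

frame_encoder = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(H, W, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
])

model = tf.keras.Sequential([
    # Apply the CNN to every frame in the clip independently...
    tf.keras.layers.TimeDistributed(frame_encoder, input_shape=(SEQ_LEN, H, W, 1)),
    # ...then let the LSTM model the temporal dynamics across frames.
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 hypothetical gestures
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```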
Murillo et al. (2021) developed a real-time application for recognizing Filipino Sign Language (FSL) and converting it into text. The system leverages machine
learning and Python programming, particularly utilizing Python OpenCV for image capture to recognize
sign gestures. A dataset of approximately 3,000 images per sign was collected to ensure accurate gesture
prediction. However, the researchers encountered challenges, particularly with the application being
functional only on trained computer devices, which limited its presentation capabilities to participants.
Despite these limitations, the application demonstrates potential to bridge communication gaps for Special
Education Students, teachers, and non-disabled individuals by providing translations of basic Filipino words.
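A minimal sketch of gathering such a per-sign image set with OpenCV (the sign label, target count, and paths are illustrative assumptions, not the authors' setup) might look as follows:

```python
# A minimal sketch of per-sign dataset capture with OpenCV; the label, target
# count, and directory layout are hypothetical.
import os
import cv2

SIGN, TARGET = "salamat", 3000  # hypothetical sign label and image count
os.makedirs(f"dataset/{SIGN}", exist_ok=True)

cap = cv2.VideoCapture(0)       # default webcam
count = 0
while count < TARGET:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(f"dataset/{SIGN}/{count:04d}.jpg", frame)
    count += 1
cap.release()
```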
Pascua et al. (n.d.) focused on developing a Filipino Sign Language (FSL) thesaurus management
system using Ren'py, a visual novel engine, to enhance communication for the deaf and mute community
in the Philippines. The study emphasizes that FSL, while still evolving, primarily draws from American
Sign Language (ASL) as its foundational base. This system is designed to address the educational and
communication needs of both deaf individuals and their families by offering a structured tool to learn proper
FSL, particularly for daily interactions and academic purposes. Despite its promising approach, the study
highlights several challenges, including illiteracy, poverty, and the lack of educational support, which hinder
the effective dissemination and adoption of FSL. Additionally, the shift from ASL to FSL in educational
institutions is seen as another significant obstacle in fully realizing the potential of FSL in the Philippines.
Samonte et al. (2022) explored the use of deep learning approaches for sign language translation
into text, focusing on methods like Convolutional Neural Networks (CNN), Connectionist Temporal
Classification (CTC), and Deep Belief Networks (DBN) for recognizing sign language. CNN emerged as
the most frequently used method, particularly for feature extraction from images, with other techniques
such as 3DCNN, 2DCNN, and Bidirectional Long Short-Term Memory (BiLSTM) also being employed to
model sequences. Despite the high accuracy achieved in sign language recognition, the study highlighted
challenges in real-world applicability, particularly with limited use of Natural Language Processing (NLP)
techniques. The proposed systems are often conceptual and require further development to be applicable in
real-time scenarios, with additional research needed to enhance usability for individuals with speech and hearing impairments.
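Of the surveyed techniques, CTC is the one whose mechanism most benefits from a concrete example: it aligns an unsegmented sequence of per-frame predictions with a shorter target sequence without frame-level labels. A minimal sketch using PyTorch's nn.CTCLoss (the shapes and class counts are illustrative assumptions):

```python
# A minimal CTC sketch: align T per-frame predictions with a shorter target
# sign sequence; dimensions are illustrative assumptions.
import torch
import torch.nn as nn

T, N, C = 50, 4, 20  # frames per clip, batch size, classes (incl. blank=0)
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(2)
targets = torch.randint(1, C, (N, 8))  # 8-sign target sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 8, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow without any frame-level alignment labels
```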
Modi and More (2013) presented a vision-based approach for translating American Sign Language
(ASL) finger-spelling to text, utilizing image processing techniques to capture and analyze hand gestures.
The system extracts hand features from video frames by converting them into grayscale and binary formats,
followed by BLOB analysis for feature extraction. Their study outlines several challenges, including the
need for static backgrounds, limitations on recognizing signs with movement (e.g., "J" and "Z"), and
constraints related to the clarity of the captured image. Despite these challenges, the authors demonstrate
that their method effectively translates finger-spelling into text using standard computer webcams without
the need for specialized hardware, offering practical applications for assisting communication between the deaf community and non-signers.
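A minimal sketch of this grayscale-to-binary-to-BLOB stage, using OpenCV contour analysis as the blob detector (the input filename and Otsu thresholding are assumptions, not the authors' exact method), might look as follows:

```python
# A minimal sketch: grayscale conversion, binarization, and blob (contour)
# analysis on a single frame; "hand.jpg" is a hypothetical input image.
import cv2

frame = cv2.imread("hand.jpg")  # one captured video frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Treat the largest connected blob as the hand and extract simple shape features.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
hand = max(contours, key=cv2.contourArea)
x, y, w, h = cv2.boundingRect(hand)
print("hand blob area:", cv2.contourArea(hand), "aspect ratio:", w / h)
```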
In their research, Shirbhate et al. (2020) explore Indian Sign Language (ISL) recognition using
machine learning algorithms to address the complexity posed by ISL's use of both hands, a challenge absent
in American Sign Language (ASL). The study utilizes techniques like Support Vector Machines (SVM) and
Random Forests for classification and employs hierarchical classification to enhance recognition accuracy.
Feature extraction methods include SIFT for detecting key points and HU’s moments for shape
representation, following image preprocessing techniques like skin segmentation. However, the absence of
standardized datasets and the local variations within ISL contribute to occlusion challenges and feature
overlap. This complexity, along with shared signs between alphabets and numbers, complicates accurate
classification. Although primarily aimed at improving communication for the deaf community in India, the
current system is limited to recognizing static ISL numeral signs, with future potential to include words and sentences.
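A minimal sketch pairing Hu moment shape features with an SVM classifier, two of the techniques named above (the input is assumed to already be a skin-segmented binary image, and the training data below is synthetic), might look as follows:

```python
# A minimal sketch: Hu moment shape features feeding an SVM; training data is
# synthetic, not the authors' ISL dataset.
import cv2
import numpy as np
from sklearn.svm import SVC

def hu_features(binary_img):
    """Seven rotation/scale-invariant Hu moments describing the hand shape."""
    m = cv2.moments(binary_img)
    return cv2.HuMoments(m).ravel()

# Synthetic stand-in: 7-D Hu feature vectors for 100 images, 10 numeral classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 7))
y = rng.integers(0, 10, size=100)
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```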
In their study on Filipino Sign Language (FSL) recognition, Cabalfin et al. (2012) utilized a non-
linear manifold learning algorithm, Isomap, to effectively recognize visual signs in FSL. This approach was
instrumental in transforming video data into low-dimensional trajectories for easier classification,
especially for isolated signs. However, the system encountered challenges in distinguishing between signs
that had similar hand shapes or movements, such as minimal pairs, leading to reduced recognition accuracy.
The researchers incorporated Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS)
for comparing these projections, enhancing the system's ability to handle time series variations in signing.
Using a dataset of 72 signs recorded by native Deaf signers, the study highlighted the potential of machine
learning techniques in gesture recognition, though it struggled with recognizing gestures with subtle differences in hand shape or movement.
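A minimal sketch of the manifold-projection idea, using scikit-learn's Isomap to map per-frame features to a low-dimensional trajectory and a small dynamic time warping (DTW) routine to compare trajectories (the data here is synthetic, and this is not the authors' implementation), might look as follows:

```python
# A minimal sketch: Isomap projection of frame features plus a classic DTW
# comparison; all data is synthetic stand-in material.
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)
frames = rng.normal(size=(60, 100))  # 60 frames of 100-D visual features
trajectory = Isomap(n_components=2).fit_transform(frames)

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW between two 2-D trajectories."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw_distance(trajectory[:30], trajectory[30:]))
```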
References

Tanzer, G. (2024). Fingerspelling within sign language machine translation. arXiv. https://doi.org/10.48550/arXiv.2408.07065
Rivera, J. P., & Ong, C. (2018). Recognizing non-manual signals in Filipino Sign Language. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 1–8).
Tolentino, L. K., Juan, R. O., Thio-ac, A. C., Pamahoy, M. A., Forteza, J. R., & Garcia, X. J. (2019). Static sign language recognition using deep learning. International Journal of Machine Learning and Computing.
Cayme, K. J., Retutal, V. A., Salubre, M. E., Astillo, P. V., Cañete, L. G., & Choudhary, G. (2024). Gesture recognition of Filipino sign language using convolutional and long short-term memory deep neural networks.
Murillo, S. C. M., Villanueva, M. C. A. E., Tamayo, K. I. M., Apolinario, M. J. V., & Lopez, M. J. D. (2021). Speak the Sign: A real-time sign language to text converter application for basic Filipino words. Central Asian Journal of Mathematical Theory and Computer Sciences. https://cajmtcs.centralasianstudies.org/index.php/CAJMTCS/article/view/92
Pascua, S. M., Espina, P. L. C., Talag, R. P. E., Villegas, L. N., & Aquino de Guzman, L. (n.d.). Words in Vision: A Filipino Sign Language thesaurus management system using Ren-py. IFLA Library. https://library.ifla.org/id/eprint/1732
Samonte, M. J. C., Guingab, C. J. M., Relayo, R. A., Sheng, M. J. C., & Tamayo, J. R. D. (2022, March).
Using Deep Learning in Sign Language Translation to Text. In Proceedings of the International Conference
Modi, K., & More, A. (2013). Translation of sign language finger-spelling to text using image processing.
Bhavadharshini, M., Josephine Racheal, J., Kamali, M., & Sankar, S. (2021). Sign language
Shirbhate, R. S., Shinde, V. D., Metkari, S. A., Borkar, P. U., & Khandge, M. A. (2020). Sign language recognition using machine learning algorithm. International Research Journal of Engineering and Technology.
Cabalfin, E. P., Martinez, L. B., Guevara, R. C., & Naval, P. C. (2012). Filipino sign language recognition
using manifold projection learning. TENCON 2012 IEEE Region 10 Conference, 290, 1–5.
https://doi.org/10.1109/tencon.2012.6412231