Hexatalk Using ANN and DNNS
Abstract: Speaker recognition is an essential aspect of human-computer interaction, with applications in security,
personalized services, and more. This project proposes an end-to-end speaker recognition system leveraging Long
Short-Term Memory (LSTM) neural networks. Mel-Frequency Cepstral Coefficients (MFCCs) are used as audio features,
processed by an LSTM model to classify speakers with high accuracy. The proposed system demonstrates the efficacy of
LSTM for temporal feature analysis, achieving robust performance in noisy environments.
How to Cite: M Ravi; Dr. A Obulesu; Ch Vinod Vara Prasad; N Abhishek; N Rithish Reddy; V Anil Chary (2025). Hexatalk using
ANN and DNNS. International Journal of Innovative Science and Research Technology, 10(4), 1789-1792.
Https://Doi.Org/10.38124/Ijisrt/25apr1252
Speaker recognition involves identifying or verifying the identity of a speaker based on audio signals.
With the increasing adoption of smart devices and voice assistants, robust speaker recognition systems
have become essential for applications like biometric authentication, personalized services, and secure
communication.

Traditional methods such as Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) rely on
handcrafted features and struggle to model the complex and dynamic nature of speech signals. They also
face challenges in adapting to noise, speaker variability, and environmental changes, limiting their
effectiveness in real-world scenarios.

Traditional speaker recognition systems primarily rely on statistical approaches like Gaussian Mixture
Models (GMMs) and Hidden Markov Models (HMMs) for feature extraction and classification. These methods
have been the foundation of speaker recognition for decades due to their ability to model speech dynamics
and variations effectively. However, they exhibit several critical limitations:

Dependency on Handcrafted Features
Traditional systems rely heavily on manually designed features, such as spectral or prosodic attributes.
These handcrafted features often fail to capture the full complexity of speech signals, especially under
varying conditions.
The simulation yielded impressive results, demonstrating the efficacy of the LSTM-based speaker
recognition system. The key findings include:

High Accuracy on Clean Data:
The model achieved an accuracy exceeding 95% on clean datasets, indicating its effectiveness in speaker
identification.

Robustness in Noisy Conditions:
Even under high noise levels, the model maintained robust performance with minimal degradation in
accuracy, outperforming traditional methods.

This paper proposed an advanced speaker recognition system leveraging Long Short-Term Memory (LSTM)
neural networks, which effectively addressed the challenges associated with traditional methods. By
utilizing Mel-Frequency Cepstral Coefficients (MFCCs) as input features, the system captured critical
spectral and temporal speech characteristics, enabling precise speaker classification. The LSTM
architecture excelled in modeling temporal dependencies, achieving robust performance across diverse
conditions, including noisy and dynamic environments. Compared to conventional approaches like Gaussian
Mixture Models (GMMs) and Hidden Markov Models (HMMs), the proposed system demonstrated significant
improvements in accuracy, scalability, and noise resilience.
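
As an illustration of how an LSTM can consume a sequence of MFCC frames and produce a speaker decision, the following is a minimal sketch in plain Python. It is not the implementation used in this work: the paper does not state the network dimensions, so the 13 coefficients per frame, the hidden size of 8, the 4 speakers, and all function names (lstm_step, init_params, classify) are assumptions chosen for illustration. The weights are random and untrained, and the input frames are random stand-ins for real MFCCs, so the predicted label carries no meaning; the sketch only shows the data flow from frames to class probabilities.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function expressed through tanh.
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h, c, params):
    """Run one LSTM time step on frame x with previous state (h, c).

    params maps each gate name ("i", "f", "o", "g") to (W, b), where W
    acts on the concatenation of the input frame and the hidden state.
    """
    xh = x + h  # list concatenation: [x; h]
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [p + bi for p, bi in zip(matvec(W, xh), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # Cell state: forget part of the old memory, add gated candidate values.
    c_new = [f * ci + i * g
             for f, ci, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    # Hidden state: output gate applied to the squashed cell state.
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init_params(n_in, n_hidden, n_speakers, seed=0):
    # Small random weights; a real system would learn these by training.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    lstm = {name: (mat(n_hidden, n_in + n_hidden), [0.0] * n_hidden)
            for name in ("i", "f", "o", "g")}
    out = (mat(n_speakers, n_hidden), [0.0] * n_speakers)
    return lstm, out


def classify(frames, lstm, out):
    """Feed all frames through the LSTM, then classify the final hidden state."""
    W_out, b_out = out
    n_hidden = len(W_out[0])
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, lstm)
    logits = [p + bi for p, bi in zip(matvec(W_out, h), b_out)]
    return softmax(logits)


# Hypothetical utterance: 50 frames of 13 MFCC-like values (random placeholders).
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(50)]

lstm, out = init_params(n_in=13, n_hidden=8, n_speakers=4)
probs = classify(frames, lstm, out)
predicted = max(range(len(probs)), key=probs.__getitem__)
print("speaker probabilities:", [round(p, 3) for p in probs])
print("predicted speaker index:", predicted)
```

Only the final hidden state is classified here, which is one common way to summarize a variable-length utterance; averaging the hidden states over time is another option. In practice the MFCC frames would come from an audio feature extractor, and the weights would be learned from labeled recordings.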