Voice Recognition Based Security System Using
Sarang Chauhvan
Department of Electronics & Telecommunication Engineering
G H Raisoni College of Engineering, Nagpur, India
[email protected]
2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS)
ISBN: 978-1-7281-8529-3/21/$31.00 ©2021 IEEE

Abstract: This review describes a speech recognition technique based on the planned analysis and use of a neural network and the Google API, drawing on the characteristics of speech. A multifactor security system is developed for the authentication and identification of vocal modalities. The project adopts a strategy of independent convolutional layers whose inputs include the spectrum and Mel-frequency cepstral coefficients. The review covers the statistical analysis of sound using scaled-up and scaled-down spectrograms. In addition, the Google Speech-to-Text API converts speech into a passcode, which is cross-verified for extended security. Our study shows that the incorporated methodology and its results illustrate the direction of research in this area, and they encouraged us to advance in this field.

Keywords: MFCC (Mel-Frequency Cepstrum Coefficients), CNN (Convolutional Neural Network), ANN (Artificial Neural Network), ASR (Automatic Speech Recognition), STT (Speech-to-Text), GUI (Graphical User Interface).

I. INTRODUCTION

Preventing security breaches is of the utmost importance. Standard systems rely on security measures such as passcodes, fingerprint scanning, palm scanning, and identity verification, all of which can be broken easily. A passcode-based system can be breached by learning the password of that system. Likewise, palm or fingerprint scanning can be defeated by force or by other techniques, and breaching the whole system can cause the loss of confidential data. The proposed system is designed to address these problems with security factors that work hand in hand to safeguard documents.

Voice recognition is well suited to security purposes compared with other techniques, since every person has a unique voice with distinguishing features such as frequency, pitch, and amplitude. Voice recognition is an important task in the life sciences and is divided into two groups: text-dependent and text-independent. In text-dependent systems, the user must speak the same words in both the recording and the recognition sessions. In text-independent systems, the words spoken in the recording and recognition sessions can be entirely different [1, 2]. Individual voices are easily distinguishable; people can even recognize one another over the phone. In voice recognition, extracting the features of the voice is vital.

Based on the literature reviewed, the system under consideration is designed around a speech algorithm and a cryptographic approach in the form of RGB (red, green, and blue) spectrograms. The present strategy works as an alternative approach to voice recognition, based on the cryptography of the system and on the speaker's speech characteristics. The system is designed to use Convolutional Neural Networks (CNNs). Implementing a CNN is a somewhat difficult task, but it has shown surprisingly good results and rejects unknown frequency components that are not authorized in the system's database.

The CNN undergoes a rigorous training phase, since it must be fed multiple voice modalities, and each layer has its own designated task. The planned architecture uses two convolutional layers and one fully connected layer. The entire software is written in the Python programming environment and runs on a Raspberry Pi, and the system uses a dedicated neural network structure.

II. SPEECH RECOGNITION PROCESS

Speech recognition [3] is a complicated and cumbersome task. The following figure shows the steps involved in speech recognition. The voice recognition technique consists of two main parts: audio classification (prediction), and feature extraction and storage. The feature extraction process extracts the essential information from a particular voice, which is then used to represent the designated authorized speaker. MFCC is the feature extraction technique [4] used in this project. Audio classification prediction is the procedure in which machine learning
is used to identify the unknown speaker. The system calculates the information loss; the lower the information loss, the more accurate the system. Every speaker has a speaker ID and a data set loaded in the backend of the system. The neural network compares the input with this data set, using the knowledge it developed during training, and produces the corresponding output.

Fig. 1. Training and testing phases of the system.

III. SPEECH RECOGNITION PROCESS

[...] basis of individual details found in audio files, i.e., speaker wave files. This technique makes it possible to use a speaker's voice to verify their identity and to control access to services such as voice dialing, mobile phone banking, database access, voice mail, information services, security management for restricted areas, and remote access to machines such as computers. These automatic identification [7] and verification techniques are typically considered among the most readily available and economical ways to prevent unauthorized access to physical locations or machines. Moreover, the capabilities already available in such devices make speaker recognition suitable for security purposes. The problem of speaker recognition is rooted in the study of the speech signal. A particularly interesting problem is the analysis of the speech signal itself: which characteristics make it distinctive among other signals, and what makes one speech signal different from another.

A. Mel-Frequency Cepstrum Coefficients

Feature extraction is the extraction of the best parametric representation of voice signals in order to produce better recognition performance. The principal objective of feature extraction is to extract characteristics from the speech signal that are distinctive to each individual and can then be used to differentiate one speaker from another. Because the next stage depends on it, this stage must perform well, as it affects the behavior of the system. The characteristics of the vocal tract are unique to each speaker, so the impulse response of the vocal tract can be used to differentiate speakers. It can be obtained by applying the Mel-frequency cepstrum coefficients (MFCC) algorithm [5, 6]. MFCC is based on the known variation of the critical bandwidths of the human ear with frequency, i.e., it reflects the fact that human hearing does not perceive frequencies above 1 kHz on a linear scale. MFCC uses two types of filter spacing: linear at frequencies below 1000 Hz and logarithmic above 1000 Hz. The overall MFCC procedure is shown in the figure below.
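The filter spacing described above can be illustrated with a short Python sketch. It is not taken from the paper: it assumes the widely used mel mapping m = 2595 log10(1 + f/700), which the text does not state, and the function names, the 20 filters, and the 0 to 8000 Hz range are illustrative choices only. Under this mapping, filters placed uniformly on the mel scale are nearly evenly spaced in Hz at low frequencies and spread out progressively above roughly 1 kHz.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (assumed formula: 2595*log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(num_filters, low_hz, high_hz):
    """Return center frequencies (Hz) of filters spaced uniformly on the mel scale."""
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    # num_filters centers lie strictly between the two band edges.
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]


if __name__ == "__main__":
    # Hypothetical settings: 20 filters covering 0 Hz to 8000 Hz.
    centers = mel_filter_centers(num_filters=20, low_hz=0.0, high_hz=8000.0)
    previous = 0.0
    for index, center in enumerate(centers, start=1):
        # The gap to the previous center grows as frequency increases.
        print(f"filter {index:2d}: center {center:8.1f} Hz, gap {center - previous:7.1f} Hz")
        previous = center
```

Running the script prints the center frequencies together with the gap between neighbouring filters, which makes the widening spacing at higher frequencies visible. A complete MFCC pipeline would additionally frame and window the signal, take the spectrum, apply triangular filters at these centers, and finish with a logarithm and a discrete cosine transform; those steps are omitted here.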