
Voice Syn - NN

This research proposal outlines the development of an improved voice recognition system utilizing Mel Frequency Cepstral Coefficients (MFCC) for feature extraction and K-Means algorithm for feature matching. The system aims to enhance security applications by accurately identifying speakers based on their voice characteristics. The study will also evaluate existing voice recognition systems and propose an efficient methodology for implementation using MATLAB.

Uploaded by

usha kumari


Research Proposal on

An improved Voice Recognition System using

Feature Extraction of MFCC Technique and
Feature Matching of K-Means Algorithm
Submitted in partial fulfillment of the requirement for the award of the degree

MASTER OF TECHNOLOGY

in
COMPUTER SCIENCE AND ENGINEERING

USHA KUMARI
(020215001)

under the supervision of

GAURAV AGGARWAL
(Associate Professor & HOD, Dept. of CSE)

DEPARTMENT OF COMPUTER SCIENCE AND

ENGINEERING

Jagannath University NCR, Haryana, India

2016-2017

CONSENT LETTER

I hereby give my consent to supervise USHA KUMARI, roll no. 020215001, a student of
M.Tech. (2015-17), during the session 2016-17, for the thesis work to be carried out on the
topic “An improved Voice Recognition System using Feature Extraction of MFCC
Technique and Feature Matching of K-Means Algorithm”, in partial fulfillment of the
requirement for the award of the degree of Master of Technology in Computer Science and
Engineering.

Signature of the Supervisor

Place:
Date:

1. Introduction

1.1 Introduction

Biometrics refers to the automatic recognition of individuals based on their physiological and
behavioral characteristics [1]. There is growing demand for simpler access control in
personal authentication systems, and biometrics may be the answer. Instead of carrying a
bunch of keys, access cards, or passwords, a person's body can be used to identify them
uniquely. Furthermore, when biometric measures are applied in combination with other
controls, such as access cards or passwords, the reliability of authentication takes a giant
step forward. Applications of biometrics include passports, driving licenses, banking, and
preventing impostors from hacking into networks or stealing mail. Traditional security
systems are token based: impostors are prevented from accessing protected resources by
means of ID cards, smart cards, and similar tokens. The main disadvantage of token-based
systems is that ID cards can be lost, forged, or misplaced.

The main advantage of a biometric system over the conventional approach is reliability: a
biometric trait cannot be stolen or misplaced. In a biometric system, features are extracted
from the captured biometric sample of the user, and the individual is authenticated by
checking these features against templates previously stored in a database. How an individual
is authenticated depends on the application in which the biometric system is used. A
biometric system operates in one of two modes: verification or identification. One important
biometric is the human voice.

Voice or speaker recognition [2] is the process of automatically recognizing who is speaking
on the basis of individual information included in speech waves. This technique makes it
possible to use the speaker's voice to verify their identity and control access to services such
as voice dialing, banking by telephone, telephone shopping, database access services,
information services, voice mail, security control for confidential information areas, and
remote access to computers.

1.2 Classification of Voice Recognition

Voice recognition is a popular topic today, and its applications can be found everywhere,
making everyday tasks more efficient. On a mobile phone, for example, instead of typing the
name of the person they want to call, a user can simply speak the name and the phone will
place the call automatically. Similarly, a user who wants to send a text message can dictate
it to the phone instead of typing [3].

Voice recognition can be classified into a number of categories. Figure 1.1 below provides
the various classifications of speaker voice recognition [1].

Figure 1.1: Classification of Speaker Recognition

1.2.1 OPEN SET vs CLOSED SET

This classification is based on the set of trained speakers available in a system, as
described below.

1. Open Set: An open-set system can have any number of trained speakers; the number of
speakers can be anything greater than one.

2. Closed Set: A closed-set system has only a specified (fixed) number of users registered to
the system.

In our thesis, we use an open set of trained speakers.

1.2.2 IDENTIFICATION vs VERIFICATION

Automatic speaker identification and verification are often considered to be the most natural
and economical methods for preventing unauthorized access to physical locations or computer
systems. The two tasks are described below.

1. Speaker identification: the process of determining which registered speaker provides a
given utterance.

2. Speaker verification: the process of accepting or rejecting the identity claim of a
speaker.

Figures 1.2 and 1.3 illustrate the basic differences between speaker identification and
verification systems.

[Figure 1.2, panel (a) Speaker identification: input speech → feature extraction →
similarity against each reference model (Speaker #1 ... Speaker #N) → maximum selection →
identification result (speaker ID).]

[Figure 1.2, panel (b) Speaker verification: input speech → feature extraction → similarity
against the reference model of the claimed speaker (Speaker #M, selected by speaker ID #M) →
decision against a threshold → verification result (accept/reject).]

Figure 1.2: Block Diagrams of Identification and Verification systems [1]
Figure 1.3: Practical examples of Identification and Verification Systems [1]

Both figures depict the differences between Automatic Speaker Identification (ASI) and
Automatic Speaker Verification (ASV) systems. Figure 1.2 gives the theoretical block
diagrams of both processes, whereas Figure 1.3 gives a practical implementation of the
systems. In our thesis we focus only on ASI systems.
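
The difference between the two decision rules in Figure 1.2 can be made concrete with a short sketch. The example below is illustrative only and is not taken from the proposal: it assumes that a similarity score (higher means more similar) has already been computed between the input utterance and each reference model, and the function names, speaker labels and threshold value are hypothetical.

```python
from typing import Dict


def identify(scores: Dict[str, float]) -> str:
    """Speaker identification: pick the enrolled speaker whose reference
    model is most similar to the input utterance (maximum selection)."""
    if not scores:
        raise ValueError("no enrolled speakers")
    return max(scores, key=scores.get)


def verify(scores: Dict[str, float], claimed_id: str, threshold: float) -> bool:
    """Speaker verification: accept the identity claim only if the similarity
    to the claimed speaker's model reaches the threshold."""
    return scores[claimed_id] >= threshold


if __name__ == "__main__":
    # Hypothetical similarity scores for one input utterance.
    scores = {"speaker_1": 0.42, "speaker_2": 0.91, "speaker_3": 0.67}
    print("identified as:", identify(scores))
    print("claim 'speaker_3' accepted:", verify(scores, "speaker_3", 0.80))
```

Identification compares the input against all N reference models and always returns one of them, whereas verification compares against a single claimed model and can reject.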

1.2.3 TEXT-DEPENDENT vs TEXT-INDEPENDENT

This is another category of classification of speaker recognition systems. It is based on
the text uttered by the speaker during the identification process, as described below.

1. Text-Dependent: In this case, the test utterance is identical to the text used in the training
phase. The test speaker has prior knowledge of the system.

2. Text-Independent: In this case, the test speaker doesn’t have any prior knowledge about
the contents of the training phase and can speak anything.

In our thesis, we use the text-independent model. Thus, we design an open-set,
text-independent ASI (Automatic Speaker Identification) system in our thesis work.

2. Literature Survey

Various types of voice and speaker recognition techniques are available. In this section, we
review the work done in this field.

Anusuya M. A. et al. (2009) [5] presented a brief survey of automatic speech recognition and
discussed the major themes and advances of the past 60 years of research, providing a
technological perspective on, and an appreciation of, the fundamental progress accomplished
in this important area of speech communication.

Zue V. et al. (2011) [6] defined an approach that combines audio dialogues, text, icons and
graphics for speech recognition and understanding. The authors produced a word-pair language
that helps in searching and navigation.

Fook C. Y. et al. (2012) [7] reviewed speech recognition. The main aim of their research was
to compare and summarize the well-known speech recognition methods used by various
researchers.

Singh P. P. et al. (2012) [8] described speech recognition as an emerging technology in the
field of computing and artificial intelligence. It has changed the way we communicate with
computers and other intelligent devices of similar calibre, such as smartphones, and it is a
major area of research interest related to artificial intelligence. Their paper gave an
overview of the technology and listed its current implementations.

Choudhary A. et al. (2012) [9] described the speech recognition process using an AI
approach. The recognition method uses a language model, a trigram model and an acoustic
model. No GUI is used; the acoustic model interfaces with the telephony system to manage
spoken dialogues with the speaker.

Mathur S. et al. (2013) [10] outlined the basic concepts of speaker recognition along with
its diverse applications. The paper also presented the idea of selecting a robust parameter
for identification in order to attain accurate results, the limitations faced, and recent
advances in identification, providing a technological perspective on this important area of
speaker recognition.

Chandra E. et al. (2014) [11] described speaker recognition as the process of identifying a
person through his or her voice signals or speech waves. Pattern classification, the process
of grouping patterns that share the same set of properties, plays a vital role in speaker
recognition. The paper deals with speaker recognition systems and gives an overview of the
pattern classification techniques DTW, GMM and SVM.

Nereveettil C. J. et al. (2014) [12] presented the viability of the Mel Frequency Cepstral
Coefficient algorithm for feature extraction and of a Fuzzy Inference System model for
feature selection, which reduces the dimensionality of the extracted features. There is an
increasing need for new feature selection methods that increase the processing rate and
recognition accuracy of the classifier by selecting discriminative features. Hence a Fuzzy
Inference System model is used to select the optimal features from the speech vectors
extracted using MFCC. The work was done in MATLAB13a, and the experimental results show
that the system reduces the word error rate while maintaining sufficiently high accuracy.

Nandyala S. P. et al. (2014) [13] described a new hybrid HMM/DTW approach that uses kernel
adaptive filters for speech analysis and recognition. Noise removal, or filtering, of
conversations such as those held over the telephone is very important in speech recognition.
Their approach gave better experimental results than traditional methods.

Xiang-Lilan et al. (2014) [14] introduced a new merge-weighted dynamic time warping
algorithm (MWDTW). The method defines a template confidence index for measuring the
similarity between training and testing data using the DTW approach. Applying the merge
approach to SD speech recognition datasets, with HMM and DTW run on the merged data sets,
gave results six times better than DTW overall.

Doye D. D. et al. (2015) [15] worked on a new nonlinear time alignment model as an
alternative to the DTW algorithm, seeking a suitable time alignment algorithm for the
Marathi language. They used 46 confusable monosyllabic alphabets and 46 confusable names in
their work. The main features used in this research were Mel Frequency Cepstral Coefficients
(MFCC), Linear Frequency Cepstral Coefficients (LFCC) and Linear Prediction Coefficients
(LPC).

Padmanabhan J. et al. (2015) [16] reviewed automatic speech recognition together with
Gaussian mixture models, machine learning and HMMs. The scanning, preprocessing, extraction
and classification of the input are done using acoustic, bottleneck and MLP features.

Chaudhary P. J. et al. (2015) [17] discussed speaker recognition as a general process,
whereas speaker identification and speaker verification refer to specific tasks or
assessment modes associated with this process. Here, speaker recognition is the computing
task of validating a person's claimed identity using features extracted from a database of
voices. In areas where security is a foremost concern, speaker recognition is one of the
most useful and popular biometric recognition techniques. Various feature extraction
techniques, such as MFCC, RCC, LPC, LPCC and PLPC, are discussed in their paper.

Chadha N. et al. (2015) [18] described various applications of speech recognition systems,
all of which involve research challenges. They presented a critical machine-learning-based
review that addresses the challenging tasks of speech recognition systems in NLP. In
existing systems, the recognition rate is very low and the noise ratio during the
recognition process creates a problem.

Karpagavalli S. et al. (2016) [19] described speech as the most natural mode of
communication for human beings. The task of speech recognition is to convert speech into a
sequence of words by means of a computer program. Speech recognition applications enable
people to use speech as another input mode and to interact with applications easily and
effectively. Speech recognition interfaces in native languages will enable illiterate and
semi-literate people to use the technology to a greater extent without knowing how to
operate a computer keyboard or stylus. The paper presents a detailed study of automatic
speech recognition that covers the architecture, speech parameterization, methodologies,
characteristics, issues, databases, tools and applications.

3. Problem Formulation & Research Motivation

Speech recognition is the process of converting a speech signal into a sequence of words by
means of algorithms implemented as a computer program. Speech is the most natural form of
human communication. Speech recognition technology has made it possible for computers to
follow human voice commands and understand human languages. The primary function of the
speech recognition engine is to process spoken input and translate it into text that an
application understands. The application can then do one of two things [4]:

▪ The application can interpret the result of the recognition as a command. In this case, the
application is a command and control application. An example of a command and control
application is one in which the caller says “check balance”, and the application returns the
current balance of the caller’s account.
▪ If an application handles the recognized text simply as text, then it is considered a dictation
application. In a dictation application, if you said “check balance,” the application would not
interpret the result, but simply return the text “check balance”.
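
The distinction between the two application types can be illustrated with a small sketch. It is not part of the cited white paper: it assumes the recognition engine has already returned a text string, and the command table, the `check_balance` action and its reply are hypothetical placeholders.

```python
from typing import Callable, Dict


def check_balance() -> str:
    # Hypothetical action; a real system would look up the caller's account.
    return "Your balance is 100.00"


# Hypothetical table mapping recognized phrases to actions.
COMMANDS: Dict[str, Callable[[], str]] = {"check balance": check_balance}


def command_and_control(recognized_text: str) -> str:
    """Interpret the recognized text as a command and run the matching action."""
    action = COMMANDS.get(recognized_text.strip().lower())
    if action is None:
        return "Unknown command"
    return action()


def dictation(recognized_text: str) -> str:
    """Handle the recognized text simply as text and return it unchanged."""
    return recognized_text


if __name__ == "__main__":
    print(command_and_control("check balance"))
    print(dictation("check balance"))
```

The same recognized phrase produces an account lookup in the first case and plain text in the second; only the handling of the result differs, not the recognition step.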

For reasons ranging from technological curiosity about the mechanisms for mechanical
realization of human speech capabilities to the desire to automate simple tasks that require
human-machine interaction, research in automatic speech recognition by machines has
attracted a great deal of attention for sixty years. Based on major advances in statistical
modeling of speech, automatic speech recognition systems today find widespread application
in tasks that require a human-machine interface, such as automatic call processing in
telephone networks and query-based information systems that provide updated travel
information, stock price quotations, weather reports, data entry, voice dictation and access
to information.

4. Objectives

Voice recognition is the process of automatically recognizing who is speaking on the basis of
individual information included in speech waves. This thesis describes how to build a simple,
yet complete and representative automatic voice recognition system. Such a voice recognition
system has potential in many security applications. For example, users have to speak a PIN
(Personal Identification Number) in order to gain access to the laboratory door, or users have
to speak their credit card number over the telephone line to verify their identity. By checking
the speech characteristics of the input utterance, using an automatic voice recognition system
similar to the one that we will describe, the system is able to add an extra level of security.

The overall objectives of this work are summarized below:

1. Perform speaker identification, the main aim of this thesis, by comparing a speech signal
from an unknown speaker against a database of known speakers. Once trained with a number of
speakers, the system can recognize which of them is speaking.

2. Study the existing voice recognition systems.

3. Develop a new, efficient technique for voice recognition by applying Mel Frequency
Cepstral Coefficients (MFCC), K-means and the Euclidean distance.

4. Build a system that delivers optimal performance in terms of both speed and accuracy.

5. Proposed Methodology

This thesis describes how to build a simple, yet complete and representative automatic voice
recognition system. Such a voice recognition system has potential in many security
applications. For example, users have to speak a PIN (Personal Identification Number) in
order to gain access to the laboratory door, or users have to speak their credit card number
over the telephone line to verify their identity. By checking the speech characteristics of the
input utterance, using an automatic voice recognition system similar to the one that we will
describe, the system is able to add an extra level of security.

The main aim of this thesis is speaker identification, which consists of comparing a speech
signal from an unknown speaker against a database of known speakers. Once trained with a
number of speakers, the system can recognize which of them is speaking. In this work, the
Mel Frequency Cepstral Coefficients (MFCC) technique is used to extract features from the
speech signal so that the unknown speaker can be compared with the speakers in the database.

We then use the K-means algorithm to cluster the training vectors into a set of
representative feature vectors. The algorithm partitions the vectors into k clusters based
on their attributes. Finally, to identify the unknown speaker, we use the Euclidean
distance, which measures the distortion between two sets of vectors. The speaker with the
lowest distortion distance is identified as the unknown person.
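
The clustering step can be sketched as follows. This is a minimal illustration of standard K-means (Lloyd's algorithm) written in plain Python rather than MATLAB, not the implementation proposed here; the function names, the random initialisation from the data, the iteration cap and the toy two-dimensional vectors are all assumptions made for the example. In practice the input vectors would be MFCC feature vectors.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors: List[Vector], k: int, iterations: int = 50,
           seed: int = 0) -> List[List[float]]:
    """Partition the vectors into k clusters and return the k centroids."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    rng = random.Random(seed)
    # Assumed initialisation: k distinct training vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: euclidean(v, centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids


if __name__ == "__main__":
    # Toy training vectors forming two well-separated groups.
    training = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(kmeans(training, k=2))
```

The returned centroids act as a compact summary of one speaker's training data, so matching only needs to compare test vectors against k centroids instead of every training vector.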

Proposed Tool - MATLAB

MATLAB is a high-performance language for technical computing. It integrates computation,
visualization, and programming in an easy-to-use environment where problems and solutions
are expressed in familiar mathematical notation. Typical uses include:

▪ Mathematical computation
▪ Algorithm development
▪ Data acquisition
▪ Modeling, simulation, and prototyping
▪ Data analysis, exploration, and visualization
▪ Scientific and engineering graphics
▪ Application development, including graphical user interface building

MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning. This allows you to solve many technical computing problems, especially those
with matrix and vector formulations, in a fraction of the time it would take to write a program
in a scalar non interactive language such as C or FORTRAN. The name MATLAB stands for
matrix laboratory. MATLAB was originally written to provide easy access to matrix software
developed by the LINPACK and EISPACK projects. Today, MATLAB engines incorporate
the LAPACK and BLAS libraries, embedding the state of the art in software for matrix
computation. MATLAB has evolved over a period of years with input from many users. In
university environments, it is the standard instructional tool for introductory and advanced
courses in mathematics, engineering, and science.

In industry, MATLAB is the tool of choice for high-productivity research, development, and
analysis. MATLAB features a family of add-on application-specific solutions called
toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and
apply specialized technology. Toolboxes are comprehensive collections of MATLAB
functions (M-files) that extend the MATLAB environment to solve particular classes of
problems. Areas in which toolboxes are available include signal processing, control systems,
neural networks, fuzzy logic, wavelets, simulation, and many others.

6. Facilities required for proposed work

Speaker recognition [20] is the process of automatically recognizing who is speaking on the
basis of individual information included in speech waves. Speaker recognition can be
classified into identification and verification. Speaker identification is the process of
determining which registered speaker provides a given utterance. Speaker verification, on the
other hand, is the process of accepting or rejecting the identity claim of a speaker.

All speaker recognition systems contain two main modules: feature extraction and feature
matching. Feature extraction is the process that extracts a small amount of data from the
voice signal that can later be used to represent each speaker. Feature matching involves the
actual procedure to identify the unknown speaker by comparing extracted features from
his/her voice input with the ones from a set of known speakers.

6.1 Voice Feature Extraction [21, 22]

The purpose of this module is to convert the speech waveform, using digital signal processing
(DSP) tools, into a set of features for further analysis. A wide range of techniques exists
for voice feature extraction, such as Linear Prediction Coding (LPC), Mel-Frequency
Cepstrum Coefficients (MFCC), and others. We use the MFCC technique in our thesis because it
is the best known and most popular.
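
As a rough illustration of what MFCC extraction involves, the sketch below computes cepstral coefficients for a single frame using the commonly cited mel mapping mel(f) = 2595 log10(1 + f/700). It is a simplified textbook pipeline (Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT), not the MATLAB implementation proposed here; pre-emphasis and frame blocking are omitted, and the function names, the filter count of 20 and the 12 output coefficients are assumptions chosen for the example.

```python
import cmath
import math
from typing import List


def hz_to_mel(f: float) -> float:
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m: float) -> float:
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame: List[float]) -> List[float]:
    """Apply a Hamming window, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):  # naive DFT, adequate for a short frame
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters: int, n_fft: int, sample_rate: float) -> List[List[float]]:
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(num_filters)]
    for m in range(1, num_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):        # rising slope
            bank[m - 1][k] = (k - left) / (centre - left)
        for k in range(centre, right):       # falling slope
            bank[m - 1][k] = (right - k) / (right - centre)
    return bank


def mfcc(frame: List[float], sample_rate: float,
         num_filters: int = 20, num_coeffs: int = 12) -> List[float]:
    """Return num_coeffs cepstral coefficients for one frame of samples."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Log filterbank energies, floored to avoid log(0) for empty filters.
    log_e = [math.log(max(sum(w * p for w, p in zip(f, spectrum)), 1e-10))
             for f in bank]
    # DCT-II of the log energies; coefficient 0 is skipped.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_e))
            for c in range(1, num_coeffs + 1)]


if __name__ == "__main__":
    # Hypothetical test frame: 256 samples of a 440 Hz tone sampled at 8 kHz.
    frame = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(256)]
    print(mfcc(frame, 8000))
```

Each frame of speech is thereby reduced to a short vector, and the sequence of such vectors for an utterance is what the feature matching stage operates on.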

6.2 Voice Feature Matching

The problem of speaker recognition belongs to a much broader topic in science and
engineering known as pattern recognition. The goal of pattern recognition is to classify
objects of interest into one of a number of categories or classes. The objects of interest are
generically called patterns and in our case are sequences of acoustic vectors that are extracted
from input speech using the MFCC technique. The classes here refer to individual
speakers. Since the classification procedure in our case is applied to extracted features, it
can also be referred to as feature matching [23]. Various techniques are used for voice feature
matching, such as Dynamic Time Warping (DTW), Hidden Markov Modeling (HMM), K-Means
and Vector Quantization (VQ). In this thesis, we use the K-Means approach [24] because it
is easy to implement and can achieve high accuracy.
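
Feature matching against per-speaker centroids can be sketched as below. The example assumes that each enrolled speaker is already represented by a small set of centroid vectors and that the test utterance has been converted to feature vectors; the average-distortion measure, the function names and the toy two-dimensional data are assumptions for illustration rather than the proposal's final design.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distortion(features: List[Vector], codebook: List[Vector]) -> float:
    """Mean distance from each test vector to its nearest centroid."""
    if not features or not codebook:
        raise ValueError("features and codebook must be non-empty")
    return sum(min(euclidean(v, c) for c in codebook) for v in features) / len(features)


def identify_speaker(features: List[Vector],
                     codebooks: Dict[str, List[Vector]]) -> str:
    """Return the enrolled speaker whose centroids give the lowest distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))


if __name__ == "__main__":
    # Hypothetical centroids for two enrolled speakers.
    codebooks = {
        "speaker_A": [(0.0, 0.0), (1.0, 1.0)],
        "speaker_B": [(5.0, 5.0), (6.0, 6.0)],
    }
    # Hypothetical feature vectors from an unknown utterance.
    test_features = [(5.2, 4.9), (5.9, 6.1)]
    print(identify_speaker(test_features, codebooks))
```

Because the decision is a minimum over all enrolled speakers, this sketch always names someone; rejecting speakers who are not enrolled would require an additional distortion threshold.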

References

[1] Ashish Kumar Panda, Amit Kumar Sahoo, “Study of Speaker Recognition Systems”,
National Institute of Technology, Rourkela, 2011.

[2] Campbell, J.P., “Speaker recognition: a tutorial”, Proceedings of the IEEE, Volume 85,
Issue 9, Sept. 1997, pp. 1437–1462.

[3] Rakesh Tiwari, “An improved algorithm for Speaker Recognition”, School of
Educational Technology, Jadavpur University, 2010.

[4] Kimberlee A. Kemble, “An Introduction to Speech Recognition”, [online] Available:
ftp://ftp.software.ibm.com/software/partners/comarketing/na/ss/we/WS_Voice_Server_White_Paper.pdf

[5] M. A. Anusuya, S. K. Katti, “Speech Recognition by Machine: A Review”, International
Journal of Computer Science and Information Security, Vol. 6, No. 3, 2009.

[6] Zue, V., Glass, J., Goodine, D., Leung, H., Phillips, M., Polifroni, J., Seneff, S.,
“Integration of speech recognition and natural language processing in the MIT voyager
system”, IEEE, 2011.

[7] Fook, C.Y., Hariharan, M., Yaacob, S., Adom, A., “A review: Malay speech recognition
and audio visual speech recognition”, in Biomedical Engineering (ICoBE), International
Conference, 2012.

[8] Parwinder Pal Singh, Er. Bhupinder Singh, “Speech Recognition as Emerging
Revolutionary Technology”, International Journal of Advanced Research in Computer
Science and Software Engineering, Volume 2, Issue 10, October 2012.

[9] Anupam Choudhary, Ravi Kshirsagar, “Process Speech Recognition System using
Artificial Intelligence Technique”, International Journal of Soft Computing and
Engineering (IJSCE), ISSN: 2231-2307, Volume 2, Issue 5, 2012.

[10] Surbhi Mathur, Choudhary S. K. and Vyas J. M., “Speaker Recognition System and its
Forensic Implications”, Open Access Scientific Reports, Volume 2, Issue 4, 2013.

[11] Dr E. Chandra, K. Manikandan, M. S. Kalaivani, “A Study on Speaker Recognition
System and Pattern classification Techniques”, International Journal of Innovative Research
in Electrical, Electronics, Instrumentation and Control Engineering, Vol. 2, Issue 2, February
2014.

[12] Catherine J Nereveettil, M. Kalamani, Dr. S. Valarmathy, “Feature Selection Algorithm
for Automatic Speech Recognition Based On Fuzzy Logic”, International Journal of
Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 3, Issue
1, January 2014.

[13] Siva Prasad Nandyala and T. Kishore Kumar, “Hybrid HMM/DTW based Speech
Recognition with Kernel Adaptive Filtering Method”, International Journal on
Computational Sciences & Applications (IJCSA), Vol. 4, No. 1, 2014.

[14] Xiang-Lilan, Zhang, Zhi-Gang, Luo, Ming Li, “Merge-Weighted Dynamic Time
Warping for Speech Recognition”, Journal of Computer Science and Technology, Volume
29, Issue 6, 2014, pp. 1072-1082.

[15] D D Doye, T R Sontakke & Smita Nagtode, “The Nonlinear Time Alignment Model for
Speech”, IETE Journal of Research, Taylor & Francis, 2015, pp. 1-6.

[16] Jayashree Padmanabhan and Melvin Jose Johnson Prem kumar, “Machine Learning in
Automatic Speech Recognition: A Survey”, IETE Technical Review, Taylor & Francis, 2015,
pp. 1-13.

[17] Parvati J. Chaudhary, Kinjal M. Vagadia, “A Review Article on Speaker Recognition
with Feature Extraction”, International Journal of Emerging Technology and Advanced
Engineering, Volume 5, Issue 2, February 2015.

[18] Neha Chadha, R.C. Gangwar, Rajeev Bedi, “Current Challenges and Application of
Speech Recognition Process using Natural Language Processing: A Survey”, International
Journal of Computer Applications (0975 – 8887), Volume 131, No. 11, December 2015.

[19] Karpagavalli S and Chandra E, “A Review on Automatic Speech Recognition
Architecture and Approaches”, International Journal of Signal Processing, Image Processing
and Pattern Recognition, Vol. 9, No. 4, 2016, pp. 393-404.

[20] Douglas A. Reynolds, “An Overview of Automatic Speaker Recognition Technology”,
IEEE, 2002.

[21] Bhupinder Singh, Rupinder Kaur, Nidhi Devgun, Ramandeep Kaur, “The process of
Feature Extraction in Automatic Speech Recognition System for Computer Machine
Interaction with Humans: A Review”, IJARCSSE, Volume 2, Issue 2, February 2012.

[22] Genevieve I. Sapijaszko, Wasfy B. Mikhael, “An Overview of Recent Window Based
Feature Extraction Algorithms for Speaker Recognition”, IEEE, pp. 880-883, 2012.

[23] Maider Zamalloa, Germán Bordel, Luis Javier Rodriguez, Mikel Penagarikano,
“Feature Selection Based on Genetic Algorithms for Speaker Recognition”, IEEE, 2006.

No ratings yet
Automated Speech Recognition Systems Applications in Industry
4 pages
Ijet V3i4p19
No ratings yet
Ijet V3i4p19
6 pages
Club 3D Geforce 6200 TC Pcie Turbocache Technology: WWF Panda JR
No ratings yet
Club 3D Geforce 6200 TC Pcie Turbocache Technology: WWF Panda JR
2 pages
IEEE BVi PDF
No ratings yet
IEEE BVi PDF
5 pages
A Voice Identification System Using Hidden Markov Model
No ratings yet
A Voice Identification System Using Hidden Markov Model
6 pages
Voice Recognition With Neural Networks, Type-2 Fuzzy Logic and Genetic Algorithms
No ratings yet
Voice Recognition With Neural Networks, Type-2 Fuzzy Logic and Genetic Algorithms
9 pages
MajorInterim Report1
No ratings yet
MajorInterim Report1
10 pages
Biometric Voice Recognition in Security System
No ratings yet
Biometric Voice Recognition in Security System
9 pages
Digital Signal Processing: The Final
No ratings yet
Digital Signal Processing: The Final
13 pages
Speaker Recognition System - v1
No ratings yet
Speaker Recognition System - v1
7 pages
Automatic Speaker Recognition System
No ratings yet
Automatic Speaker Recognition System
11 pages
DTP
No ratings yet
DTP
9 pages
Methodology For Speaker Identification and Recognition System
100% (1)
Methodology For Speaker Identification and Recognition System
13 pages
Principle and Applications of Speaker Recognition Security System
No ratings yet
Principle and Applications of Speaker Recognition Security System
5 pages
Ijcet: International Journal of Computer Engineering & Technology (Ijcet)
No ratings yet
Ijcet: International Journal of Computer Engineering & Technology (Ijcet)
10 pages
Project Report
100% (1)
Project Report
60 pages
Electric Heating
No ratings yet
Electric Heating
1 page
Erythropoietin Concentrated Solution (1316)
No ratings yet
Erythropoietin Concentrated Solution (1316)
5 pages
Limiting Reagents - Chemistry LibreTexts
No ratings yet
Limiting Reagents - Chemistry LibreTexts
5 pages
Generic Model For Text Dependent Automatic Gujarati Speaker Recognition
No ratings yet
Generic Model For Text Dependent Automatic Gujarati Speaker Recognition
4 pages
Advanced Signal Processing Using Matlab
No ratings yet
Advanced Signal Processing Using Matlab
20 pages
Speaker Recognition Publish
No ratings yet
Speaker Recognition Publish
6 pages
Advancements in Artificial Intelligence and Machine Learning
From Everand
Advancements in Artificial Intelligence and Machine Learning
Asif Khan
No ratings yet
Aimybox Voice Assistant Development: Definitive Reference for Developers and Engineers
From Everand
Aimybox Voice Assistant Development: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Voice Technologies and Systems: Definitive Reference for Developers and Engineers
From Everand
Voice Technologies and Systems: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Speaker Recognition: Fundamentals and Applications
From Everand
Speaker Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
Optical Character Recognition: Fundamentals and Applications
From Everand
Optical Character Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
