
Volume 2, Issue 10, October 2012    ISSN: 2277 128X
International Journal of Advanced Research in Computer Science and Software Engineering
Research Paper
Available online at: www.ijarcsse.com
Speech Recognition as Emerging Revolutionary Technology
Parwinder Pal Singh, Er. Bhupinder Singh
Computer Science & Engineering, IGCE, PTU Kapurthala

Abstract— Speech recognition is an emerging technology in the fields of computing and artificial intelligence. It has changed the way we communicate with computers and other intelligent devices of similar capability, such as smartphones, and it remains a major area of research interest in artificial intelligence. This paper introduces the technology, gives an overview of how it works, and lists its current implementations.

Keywords— Speech recognition, Phonetics, Acoustic, Utterance, DTW.


I. INTRODUCTION
Speech recognition is the translation of spoken words into text. It is also known as "automatic speech recognition" (ASR), "computer speech recognition", or "speech to text" (STT). Some SR systems use "training", in which an individual speaker reads sections of text into the SR system. These systems analyze the person's specific voice and use it to fine-tune the recognition of that person's speech, resulting in more accurate transcription.

II. TYPES OF SPEECH RECOGNITION SYSTEMS


A. Speaker dependent - A number of voice recognition systems are available on the market. The most powerful can recognize thousands of words. However, they generally require an extended training session during which the computer system becomes accustomed to a particular voice and accent. Such systems are said to be speaker dependent. A speaker-dependent system is developed to operate for a single speaker. These systems are usually easier to develop, cheaper to buy and more accurate, but not as flexible as speaker-adaptive or speaker-independent systems. Speaker-dependent software works by learning the unique characteristics of a single person's voice, in a way similar to voice recognition. New users must first "train" the software by speaking to it, so the computer can analyze how the person talks. This often means users have to read a few pages of text to the computer before they can use the speech recognition software.
B. Speaker independent - A speaker independent system is developed to operate for any speaker of a particular type
(e.g. American English). These systems are the most difficult to develop and the most expensive, and their accuracy is lower than that of speaker-dependent systems. However, they are more flexible. Speaker-independent software is designed to
recognize anyone's voice, so no training is involved. This means it is the only real option for applications such as
interactive voice response systems — where businesses can't ask callers to read pages of text before using the
system. The downside is that speaker–independent software is generally less accurate than speaker–dependent
software.
C. Speaker adaptive - A third variation of speaker models is now emerging, called speaker adaptive. Speaker-adaptive systems usually begin with a speaker-independent model and adjust it more closely to each individual speaker during a brief training period.

III. SPEECH RECOGNITION SYSTEMS DISTINGUISHED ACCORDING TO THE INPUTS

A. IWR: Isolated word recognition - Isolated word recognizers usually require each utterance to have quiet (a lack of audio signal) on BOTH sides of the sample window. This does not mean that the system accepts only single words, but it does require a single utterance at a time. Often, these systems have "Listen/Not-Listen" states, where they require the speaker to wait between utterances (usually doing processing during the pauses); a minimal endpoint-detection sketch is given after this list. "Isolated utterance" might be a better name for this class.
B. CWR: Connected word recognition - Connected word systems (or, more correctly, 'connected utterances') are similar to isolated word systems, but allow separate utterances to be 'run together' with a minimal pause between them.
C. CSR: Continuous speech recognition - Continuous recognition is the next step. Recognizers with continuous speech
capabilities are some of the most difficult to create because they must utilize special methods to determine utterance
boundaries. Continuous speech recognizers allow users to speak almost naturally, while the computer determines the
content. Basically, it's computer dictation.

D. SSR: Spontaneous speech recognition - There appears to be a variety of definitions for what spontaneous speech
actually is. At a basic level, it can be thought of as speech that is natural sounding and not rehearsed. An ASR
system with spontaneous speech ability should be able to handle a variety of natural speech features such as words
being run together, "ums" and "ahs", and even slight stutters.
E. Voice Verification/Identification - Some ASR systems have the ability to identify specific users. This paper does not cover verification or security systems. [4]
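The "Listen/Not-Listen" behaviour of isolated word recognizers is commonly implemented with an energy-based endpoint detector that looks for quiet on both sides of the utterance. The sketch below is a minimal illustration of that idea, not code from the paper; the sample rate, frame length and energy threshold are assumed values chosen only for readability.

```python
import numpy as np

def detect_utterance(samples, rate=16000, frame_ms=20, threshold=0.02):
    """Return (start, end) sample indices of the utterance, or None if only silence.

    A frame is treated as speech when its RMS energy exceeds `threshold`;
    the utterance is the span between the first and last speech frames,
    leaving quiet on both sides of the sample window.
    """
    frame_len = int(rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.sqrt((frames ** 2).mean(axis=1))   # per-frame RMS energy
    voiced = energy > threshold                     # boolean speech mask

    if not voiced.any():
        return None
    first = int(np.argmax(voiced))                       # first speech frame
    last = n_frames - 1 - int(np.argmax(voiced[::-1]))   # last speech frame
    return first * frame_len, (last + 1) * frame_len

# Toy usage: 0.3 s of silence, 0.5 s of "speech" (noise), 0.3 s of silence.
rate = 16000
sig = np.concatenate([np.zeros(int(0.3 * rate)),
                      0.1 * np.random.randn(int(0.5 * rate)),
                      np.zeros(int(0.3 * rate))])
print(detect_utterance(sig, rate))
```

A recognizer in the "Not-Listen" state would simply discard frames until such a detector reports a complete utterance, then pass the extracted span on for recognition.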

IV. WORKING OF SPEECH RECOGNITION SYSTEM

Basically, the microphone converts the voice to an analog signal. This is processed by the sound card in the computer, which converts the signal to digital form. The input from the user is also known as an utterance (spoken input from the user of a speech application; an utterance may be a single word, an entire phrase, a sentence, or even several sentences) [3]. The digitized signal is the binary form of "1s" and "0s" that computers work with; computers don't "hear" sounds in any other way.
Speech-recognition software uses acoustic models (an acoustic model is created by taking audio recordings of speech and their text transcriptions, and using software to create statistical representations of the sounds that make up each word; it is used by a speech recognition engine to recognize speech) [1] to convert the voice sounds into one of about four dozen basic speech elements, called phonemes. The latest versions of speech technology have been refined so that they eliminate noise and information that the computer does not need. The words we speak are thus transformed into digital forms of the basic speech elements (phonemes).
Once this is complete, a second part of the software begins to work. The recognized speech is compared to a digital "dictionary" stored in computer memory: a large collection of words, usually more than 100,000. When the software finds a match based on the digital form, it displays the word on the screen. This is the basic process for all speech recognition systems and software [2]; Fig. 1 shows the general speech recognition system.
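To make the pipeline above concrete (digitized samples, then acoustic features, then phoneme labels, then a dictionary lookup), here is a deliberately simplified sketch. It is not the paper's implementation: the two crude features, the phoneme templates and the one-word dictionary are all assumptions made for the example, standing in for real acoustic models and a 100,000-word vocabulary.

```python
import numpy as np

# Toy "acoustic model": one feature template per phoneme (assumed values).
PHONEME_TEMPLATES = {
    "k":  np.array([0.9, 0.1]),
    "ae": np.array([0.2, 0.8]),
    "t":  np.array([0.7, 0.3]),
}

# Toy pronunciation dictionary mapping phoneme sequences to words.
DICTIONARY = {("k", "ae", "t"): "cat"}

def frame_features(samples, rate=16000, frame_ms=20):
    """Split the digitized signal into frames and compute two crude features per
    frame (RMS energy and zero-crossing rate), standing in for real acoustic
    features such as cepstral coefficients."""
    n = int(rate * frame_ms / 1000)
    frames = samples[:len(samples) // n * n].reshape(-1, n)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    zcr = (np.abs(np.diff(np.sign(frames), axis=1)) > 0).mean(axis=1)
    return np.stack([rms, zcr], axis=1)

def decode(features):
    """Label each frame with the nearest phoneme template, collapse repeated
    labels, and look the resulting phoneme sequence up in the dictionary."""
    labels = []
    for f in features:
        best = min(PHONEME_TEMPLATES,
                   key=lambda p: np.linalg.norm(f - PHONEME_TEMPLATES[p]))
        if not labels or labels[-1] != best:
            labels.append(best)
    return DICTIONARY.get(tuple(labels), "<unknown>")

# Usage with a synthetic signal; a real system would read microphone samples.
signal = 0.05 * np.random.randn(16000)
print(decode(frame_features(signal)))
```

Real systems replace the nearest-template step with statistical acoustic models (the HMMs described in the next section) and search a much larger dictionary, but the overall flow is the one described above.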

Fig. 1 Flowchart of Simple Speech Recognition System

V. ALGORITHMS USED FOR SPEECH RECOGNITION SYSTEM

Both acoustic modelling and language modelling are important parts of modern statistically-based speech recognition
algorithms. Hidden Markov models (HMMs) are widely used in many systems. Language modelling has many other
applications, such as smart keyboards and document classification.
Hidden Markov models- Modern general-purpose speech recognition systems are based on Hidden Markov Models.
These are statistical models that output a sequence of symbols or quantities. HMMs are used in speech recognition
because a speech signal can be viewed as a piecewise stationary signal or a short-time stationary signal. On short time scales (e.g., 10 milliseconds), speech can be approximated as a stationary process, so speech can be thought of as a Markov model for many stochastic purposes. Another reason HMMs are popular is that they can be trained automatically
and are simple and computationally feasible to use. In speech recognition, the hidden Markov model would output a
sequence of n-dimensional real-valued vectors (with n being a small integer, such as 10), outputting one of these every
10 milliseconds. The vectors would consist of cepstral coefficients, which are obtained by taking a Fourier transform of a
short time window of speech and decorrelating the spectrum using a cosine transform, then taking the first (most
significant) coefficients. The hidden Markov model will tend to have in each state a statistical distribution that is a mixture of diagonal covariance Gaussians, which will give a likelihood for each observed vector. Each word, or (for more general speech recognition systems) each phoneme, will have a different output distribution; a hidden Markov model for a sequence of words or phonemes is made by concatenating the individually trained hidden Markov models for the separate words and phonemes.

Described above are the core elements of
the most common, HMM-based approach to speech recognition. Modern speech recognition systems use various
combinations of a number of standard techniques in order to improve results over the basic approach described above. A
typical large-vocabulary system would need context dependency for the phonemes (so phonemes with different left and
right context have different realizations as HMM states); it would use cepstral normalization to normalize for different
speaker and recording conditions; for further speaker normalization it might use vocal tract length normalization (VTLN)
for male-female normalization and maximum likelihood linear regression (MLLR) for more general speaker adaptation.
The features would have so-called delta and delta-delta coefficients to capture speech dynamics and in addition might
use heteroscedastic linear discriminant analysis (HLDA); or might skip the delta and delta-delta coefficients and
use splicing and an LDA-based projection followed perhaps by heteroscedastic linear discriminant analysis or a global
semitied covariance transform (also known as maximum likelihood linear transform, or MLLT). Many systems use so-
called discriminative training techniques that dispense with a purely statistical approach to HMM parameter estimation
and instead optimize some classification-related measure of the training data. Examples are maximum mutual
information (MMI), minimum classification error (MCE) and minimum phone error (MPE).

Decoding of the speech (the
term for what happens when the system is presented with a new utterance and must compute the most likely source
sentence) would probably use the Viterbi algorithm to find the best path, and here there is a choice between dynamically
creating a combination hidden Markov model, which includes both the acoustic and language model information, and
combining it statically beforehand (the finite state transducer, or FST, approach). A possible improvement to decoding is
to keep a set of good candidates instead of just keeping the best candidate, and to use a better scoring function (rescoring)
to rate these good candidates so that we may pick the best one according to this refined score. The set of candidates can
be kept either as a list (the N-best list approach) or as a subset of the models (a lattice). Rescoring is usually done by
trying to minimize the Bayes risk (or an approximation thereof): Instead of taking the source sentence with maximal
probability, we try to take the sentence that minimizes the expectation of a given loss function with regard to all possible
transcriptions (i.e., we take the sentence that minimizes the average distance to other possible sentences weighted by their
estimated probability). The loss function is usually the Levenshtein distance, though it can be different distances for
specific tasks; the set of possible transcriptions is, of course, pruned to maintain tractability. Efficient algorithms have
been devised to rescore lattices represented as weighted finite state transducers, with the edit distances themselves represented as finite state transducers, provided certain assumptions hold.
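The decoding step described above is typically carried out with the Viterbi algorithm. Below is a minimal, self-contained sketch of Viterbi decoding over a toy left-to-right phoneme HMM. The states, transition matrix and emission table are invented for illustration (and use discrete, vector-quantized observation symbols rather than the Gaussian mixture emissions discussed above); it is not code from the paper.

```python
import numpy as np

def viterbi(log_A, log_B, log_pi, observations):
    """Most likely HMM state sequence for a sequence of discrete observations.

    log_A[i, j] : log P(state j at t+1 | state i at t)
    log_B[i, o] : log P(observation o | state i)
    log_pi[i]   : log P(state i at t = 0)
    """
    n_states, T = log_A.shape[0], len(observations)
    delta = np.full((T, n_states), -np.inf)   # best log-probability ending in each state
    psi = np.zeros((T, n_states), dtype=int)  # back-pointers

    delta[0] = log_pi + log_B[:, observations[0]]
    for t in range(1, T):
        for j in range(n_states):
            scores = delta[t - 1] + log_A[:, j]
            psi[t, j] = int(np.argmax(scores))
            delta[t, j] = scores[psi[t, j]] + log_B[j, observations[t]]

    path = [int(np.argmax(delta[-1]))]        # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Toy model: 3 phoneme states with a left-to-right topology, 4 observation symbols.
eps = 1e-12
A = np.log(np.array([[0.6, 0.4, 0.0],
                     [0.0, 0.6, 0.4],
                     [0.0, 0.0, 1.0]]) + eps)
B = np.log(np.array([[0.7, 0.1, 0.1, 0.1],
                     [0.1, 0.7, 0.1, 0.1],
                     [0.1, 0.1, 0.1, 0.7]]) + eps)
pi = np.log(np.array([1.0, 0.0, 0.0]) + eps)

print(viterbi(A, B, pi, observations=[0, 0, 1, 1, 3, 3]))
```

A full recognizer applies the same dynamic-programming idea to a much larger composite model built from the concatenated word and phoneme HMMs, and may keep an N-best list or lattice of good paths for later rescoring, as noted above.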
Dynamic time warping (DTW)-based speech recognition- Dynamic time warping is an approach that was
historically used for speech recognition but has now largely been displaced by the more successful HMM-based approach.
Dynamic time warping is an algorithm for measuring similarity between two sequences that may vary in time or speed.
For instance, similarities in walking patterns would be detected, even if in one video the person was walking slowly and
if in another he or she were walking more quickly, or even if there were accelerations and decelerations during the course
of one observation. DTW has been applied to video, audio, and graphics – indeed, any data that can be turned into a
linear representation can be analysed with DTW. A well-known application has been automatic speech recognition, to
cope with different speaking speeds. In general, it is a method that allows a computer to find an optimal match between
two given sequences (e.g., time series) with certain restrictions. That is, the sequences are "warped" non-linearly to
match each other. This sequence alignment method is often used in the context of hidden Markov models.
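To make the DTW idea concrete, here is a minimal dynamic-programming implementation of the classic textbook recursion; it is not code from the paper, and the two example sequences are invented to mimic the same pattern produced at different speeds.

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic time warping distance between two 1-D sequences.

    D[i, j] holds the minimal cumulative cost of aligning x[:i] with y[:j];
    each step may advance in x, in y, or in both (the classic DTW recursion),
    which is what lets the sequences be non-linearly "warped" onto each other.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # advance in x only
                                 D[i, j - 1],      # advance in y only
                                 D[i - 1, j - 1])  # advance in both
    return D[n, m]

# The same "word" spoken slowly and quickly: DTW tolerates the speed difference.
slow = np.array([0, 0, 1, 2, 3, 3, 2, 1, 0, 0], dtype=float)
fast = np.array([0, 1, 2, 3, 2, 1, 0], dtype=float)
print(dtw_distance(slow, fast))   # small, despite the different lengths
```

In a DTW-based isolated word recognizer, the feature sequence of an input utterance would be compared in this way against a stored template for each vocabulary word, and the word with the smallest warped distance would be chosen.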
Neural networks-Neural networks emerged as an attractive acoustic modelling approach in ASR in the late 1980s.
Since then, neural networks have been used in many aspects of speech recognition such as phoneme classification,
isolated word recognition, and speaker adaptation. In contrast to HMMs, neural networks make no assumptions about
feature statistical properties and have several qualities making them attractive recognition models for speech recognition.
When used to estimate the probabilities of a speech feature segment, neural networks allow discriminative training in a
natural and efficient manner. Few assumptions on the statistics of input features are made with neural networks. However,
in spite of their effectiveness in classifying short-time units such as individual phones and isolated words, neural
networks are rarely successful for continuous recognition tasks, largely because of their lack of ability to model temporal
dependencies. Thus, one alternative approach is to use neural networks as a pre-processing step (e.g., for feature transformation or dimensionality reduction) for HMM-based recognition. [1]
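As a sketch of the hybrid use mentioned above, the following shows a tiny feed-forward network that maps one frame's feature vector to per-phoneme probabilities, which could then serve as emission scores for an HMM. The layer sizes, the random weights and the three-phoneme output are arbitrary assumptions for illustration; a real system would learn the weights from transcribed speech.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP: 13 input features (e.g. cepstral coefficients) -> 32 hidden units -> 3 phonemes.
W1, b1 = rng.normal(scale=0.1, size=(13, 32)), np.zeros(32)
W2, b2 = rng.normal(scale=0.1, size=(32, 3)), np.zeros(3)
PHONEMES = ["k", "ae", "t"]

def phoneme_posteriors(frame_features):
    """Forward pass: one acoustic feature frame in, one probability per phoneme out."""
    h = np.tanh(frame_features @ W1 + b1)   # hidden layer
    logits = h @ W2 + b2
    exp = np.exp(logits - logits.max())     # numerically stable softmax
    return exp / exp.sum()

# One random 13-dimensional frame; with untrained weights the output is near uniform.
frame = rng.normal(size=13)
print(dict(zip(PHONEMES, phoneme_posteriors(frame).round(3))))
```

Because the network classifies each frame independently, it handles short units such as individual phones well but cannot by itself model the temporal structure of continuous speech, which is why it is paired with an HMM in this arrangement.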

VI. CURRENT AND FUTURE USES OF SPEECH RECOGNITION SYSTEM


Speech recognition is currently used in many fields. A voice recognition system for the visually impaired [5] highlights the Mg Sys Visi system, which gives access to the World Wide Web: browsing the Internet, checking, sending and receiving email, searching the Internet, and listening to the content of the search results, all by giving voice commands to the system. In addition, the system is built with a translator that can convert HTML code to voice, and voice to Braille and then back to text. The system comprises five modules, namely Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Search engine, Print (Text-Braille), and Translator (Text-to-Braille and Braille-to-Text). It was originally designed and developed for visually impaired learners, but it can also be used by other users with special needs, such as the elderly and physically impaired learners. Speech recognition is also used in radiology information systems: the radiology report is the fundamental means by which radiologists communicate with clinicians and patients, the traditional method of generating reports is time consuming and expensive, and recent advances in computer hardware and software technology have improved the speech recognition systems used for radiology reporting [6]. Another example is the integration of robust voice recognition and navigation systems on mobile robots [7], and there are many other fields in which speech recognition can be used.

VII. CONCLUSIONS
This paper introduces the basics of speech recognition technology and highlights the differences between types of speech recognition systems. The most common algorithms used for speech recognition are also discussed, along with the technology's current and future uses.

ACKNOWLEDGMENT
The author expresses appreciation to Er. Bhupinder Singh for his extensive support.

REFERENCES
[1] en.wikipedia.org/wiki/Acoustic_model
[2] www.thegeminigeek.com/how-speech-recognition-works
[3] www.lumenvox.com/resources/tips/tipsGlossary.aspx
[4] L.R. Rabiner (AT&T Labs, Florham Park, New Jersey 07932), "Applications of Speech Recognition in the Area of Telecommunications", IEEE, 1997.
[5] Halimah B.Z., Azlina A., Behrang P. (Dept. of Information Science, UKM, Selangor) and Choo W.O. (UTAR, Kampar, Perak), "Voice Recognition System for the Visually Impaired: Virtual Cognitive Approach", IEEE, 2008.
[6] Xinxin Wang, Feiran Wu and Zhiqian Ye (College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China), "The Application of Speech Recognition in Radiology Information System", IEEE, 2010.
[7] Huu-Cong Nguyen, Shim-Byoung, Chang-Hak Kang, Dong-Jun Park and Sung-Hyun Han (Division of Mechanical System Eng., Graduate School, Kyungnam University, Masan, Korea), "Integration of Robust Voice Recognition and Navigation System on Mobile Robot", ICROS-SICE International Joint Conference, 2009.
[8] X. Huang, A. Acero and H.W. Hon, "Spoken Language Processing: A Guide to Theory, Algorithm, and System Development", Prentice Hall, Upper Saddle River, NJ, USA, 2001.
[9] M. Ursin, "Triphone Clustering in Finnish Continuous Speech Recognition", Master's Thesis, Department of Computer Science, Helsinki University of Technology, Finland, 2002.
[10] O. Khalifa, S. Khan, M.R. Islam, M. Faizal and D. Dol, "Text Independent Automatic Speaker Recognition", 3rd International Conference on Electrical & Computer Engineering, Dhaka, Bangladesh, 28-30 December 2004, pp. 561-564.
[11] C.R. Buchanan, "Informatics Research Proposal – Modeling the Semantics of Sound", School of Informatics, University of Edinburgh, United Kingdom, March 2005.
[12] http://ozanmut.sitemynet.com/asr.htm, retrieved November 2005.
[13] X. Huang, A. Acero and H.W. Hon, "Spoken Language Processing: A Guide to Theory, Algorithm, and System Development", Prentice Hall, Upper Saddle River, NJ, USA, 2001.
[14] Y. Linde, A. Buzo and R.M. Gray, "An Algorithm for Vector Quantizer Design", IEEE Transactions on Communications, Vol. COM-28, No. 1, pp. 84-95, January 1980.
[15] D. Jurafsky, "Speech Recognition and Synthesis: Acoustic Modeling", Winter 2005.
[16] S.K. Podder, "Segment-based Stochastic Modeling for Speech Recognition", PhD Thesis, Department of Electrical and Electronic Engineering, Ehime University, Matsuyama 790-77, Japan, 1997.
[17] S.M. Ahadi, H. Sheikhzadeh, R.L. Brennan and G.H. Freeman, "An Efficient Front-End for Automatic Speech Recognition", IEEE International Conference on Electronics, Circuits and Systems (ICECS 2003), Sharjah, United Arab Emirates, 2003.
[18] M. Jackson, "Automatic Speech Recognition: Human Computer Interface for Kinyarwanda Language", Master's Thesis, Faculty of Computing and Information Technology, Makerere University, 2005.
[19] M.R. Hasan, M. Jamil and M.G. Saifur Rahman, "Speaker Identification Using Mel Frequency Cepstral Coefficients", 3rd International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, 2004, pp. 565-568.
[20] M.Z. Bhotto and M.R. Amin, "Bangali Text Dependent Speaker Identification Using Mel Frequency Cepstrum Coefficient and Vector Quantization", 3rd International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, 2004, pp. 569-572.
