Speechrecognitionfinalpresentation 141124072610 Conversion Gate01

Speech recognition is helpful for everyone

Uploaded by

Vyshak K Thilakan


What is speech recognition?
• Speech recognition technology has recently reached a higher level of performance and robustness, allowing users to communicate by talking.
• Speech recognition is the process of decoding an acoustic speech signal, captured by a microphone or telephone, into a set of words.
• The whole speech is then recognized word by word.
• There are two main types of speaker model: speaker independent and speaker dependent.
• Speaker-independent models recognize the speech patterns of a large group of people.
• Speaker-dependent models recognize speech patterns from only one person. Both models use mathematical and statistical formulas to yield the best word match for speech. A third variation of speaker model is now emerging, called speaker adaptive.
• Speaker-adaptive systems usually begin with a speaker-independent model and adjust it more closely to each individual during a brief training period.
• Most natural form of communication
• Differently abled people
• Illiterate
• Helplines
• Cars
[Figure: block diagram with the components Voice Input, Analog to Digital, Acoustic Model, Language Model, Speech Engine, Display, and Feedback]

• Step 1: User input
The system captures the user's voice in the form of an analog acoustic signal.

• Step 2: Digitization
The analog acoustic signal is digitized.

• Step 3: Phonetic breakdown
The signal is broken into phonemes.

• Step 4: Statistical modeling
The phonemes are mapped to their phonetic representation using a statistical model.

• Step 5: Matching
Using the grammar, the phonetic representation, and the dictionary, the system returns an n-best list (i.e., candidate words, each with a confidence score).
o Grammar: the union of words or phrases that constrains the range of input or output in the voice application.
o Dictionary: the table that maps phonetic representations to words (e.g., "thu" and "thee" both map to "the").
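The matching step can be sketched in a few lines. Everything in this sketch is an assumption made for illustration: the phonetic spellings, the `DICTIONARY` and `GRAMMAR` tables, and the use of string similarity from `difflib.SequenceMatcher` as a stand-in for a confidence score. A real engine would score candidates with its statistical models instead.

```python
from difflib import SequenceMatcher

# Hypothetical dictionary: phonetic representation -> word.
DICTIONARY = {
    "thu": "the",
    "thee": "the",
    "kawl": "call",
    "hohm": "home",
}

# Hypothetical grammar: the only words this application accepts.
GRAMMAR = {"call", "home", "the"}


def n_best(phonetic_input, n=3):
    """Return up to n (word, confidence) pairs, best first."""
    best = {}
    for phonetic, word in DICTIONARY.items():
        if word not in GRAMMAR:
            continue  # the grammar constrains the range of results
        # Stand-in confidence: similarity of the two phonetic strings.
        score = SequenceMatcher(None, phonetic_input, phonetic).ratio()
        best[word] = max(score, best.get(word, 0.0))
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    print(n_best("thu"))
```

Keeping the grammar and the dictionary as separate tables mirrors the two roles described above: one limits what may be returned, the other maps sounds to words.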
Approaches to ASR
• Template-based
• Statistics-based
Template-based approach
• Store examples of units (words, phonemes), then find the example that most closely fits the input
• Extract features from the speech signal; it is then "just" a complex similarity matching problem, which can use solutions developed for all sorts of applications
• OK for discrete utterances and a single user

• Hard to distinguish very similar templates
• Accuracy degrades quickly when the input differs from the templates
• Therefore needs techniques to mitigate this degradation:
o More subtle matching techniques
o Multiple templates which are aggregated
• Taken together, these suggested …
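One widely used technique for comparing an input against stored templates is dynamic time warping (DTW), which tolerates differences in speaking rate. The slides do not name a specific technique, so DTW is chosen here only as an illustration, and the one-dimensional feature sequences in `TEMPLATES` are made up.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[len(a)][len(b)]


def recognize(features, templates):
    """Return the label of the stored template closest to the input."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


# Hypothetical templates: one made-up feature sequence per word.
TEMPLATES = {
    "yes": [1.0, 3.0, 4.0, 3.0, 1.0],
    "no": [4.0, 2.0, 1.0, 1.0, 2.0],
}

if __name__ == "__main__":
    # A slower utterance of "yes": same shape, stretched in time.
    print(recognize([1.0, 1.0, 3.0, 4.0, 4.0, 3.0, 1.0], TEMPLATES))
```

With one template per word this only suits a single speaker and isolated words, which matches the limitation noted on the slide.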

Statistics-based approach
• Collect a large corpus of transcribed speech recordings
• Train the computer to learn the correspondences ("machine learning")
• At run time, apply statistical processes to search through the space of all possible solutions, and pick the statistically most likely one
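Picking the statistically most likely solution is commonly framed as choosing the word sequence W that maximizes P(audio | W) × P(W). The sketch below assumes that framing. The candidate transcriptions and all probabilities are invented, and the names `CANDIDATES`, `log_score`, and `most_likely` are hypothetical.

```python
import math

# Hypothetical scores for one utterance. Each candidate has
# P(audio | words) from an acoustic model and P(words) from a language model.
CANDIDATES = {
    "recognize speech": {"acoustic": 0.20, "language": 0.010},
    "wreck a nice beach": {"acoustic": 0.25, "language": 0.0001},
}


def log_score(scores):
    """Log of P(audio | words) * P(words); logs avoid numeric underflow."""
    return math.log(scores["acoustic"]) + math.log(scores["language"])


def most_likely(candidates):
    """Pick the candidate with the highest combined log probability."""
    return max(candidates, key=lambda words: log_score(candidates[words]))


if __name__ == "__main__":
    print(most_likely(CANDIDATES))
```

In this toy example the second candidate fits the audio slightly better, but the language model makes the first candidate the more likely result overall.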

• Acoustic and lexical models
o Analyse training data in terms of relevant features
o Learn the different possibilities from a large amount of data:
– different phone sequences for a given word
– different combinations of elements of the speech signal for a given phone/phoneme
o Combine these into a Hidden Markov Model expressing the probabilities
• The real world has structures and processes which have (or produce) observable outputs:
o Usually sequential (the process unfolds over time)
o The event producing the output cannot be seen
• Example: speech signals
HMM Overview
• Machine learning method
• Makes use of state machines
• Based on a probabilistic model
• Can only observe the output from states, not the states themselves
• Example: speech recognition
o Observed: acoustic signals
o Hidden states: phonemes (distinctive sounds of a language)
HMM Components
• A set of states (x's)
• A set of possible output symbols (y's)
• A state transition matrix (a's): probability of making a transition from one state to the next
• An output emission matrix (b's): probability of emitting/observing a symbol at a particular state
• An initial probability vector:
o probability of starting at a particular state
o not shown in the figure; sometimes assumed to be 1 (that is, a single fixed start state)
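The components listed above are enough to decode the most likely hidden state sequence with the Viterbi algorithm. The slides do not name a decoding algorithm, so Viterbi is used here as a standard choice. The toy model (phoneme states "k", "ae", "t", the output symbols "burst" and "voiced", and every probability) is made up for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[s] = (probability, path) of the best path that ends in state s
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Choose the best predecessor for state s.
            prob, path = max(
                ((best[prev][0] * trans_p[prev][s], best[prev][1]) for prev in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit_p[s][obs], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


# Hypothetical toy model for the word "cat".
STATES = ("k", "ae", "t")                   # hidden states (x's)
START = {"k": 1.0, "ae": 0.0, "t": 0.0}     # initial probability vector
TRANS = {                                   # state transition matrix (a's)
    "k": {"k": 0.3, "ae": 0.7, "t": 0.0},
    "ae": {"k": 0.0, "ae": 0.6, "t": 0.4},
    "t": {"k": 0.0, "ae": 0.0, "t": 1.0},
}
EMIT = {                                    # output emission matrix (b's)
    "k": {"burst": 0.9, "voiced": 0.1},
    "ae": {"burst": 0.1, "voiced": 0.9},
    "t": {"burst": 0.8, "voiced": 0.2},
}

if __name__ == "__main__":
    print(viterbi(["burst", "voiced", "voiced", "burst"], STATES, START, TRANS, EMIT))
```

Here the start vector puts all of its probability on "k", which corresponds to the case where the initial probability is assumed to be 1.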
HMM Advantages
• Effective
• Can handle variations in record structure:
o Optional fields
o Varying field ordering
• Digitization
o Converting the analogue signal into a digital representation.
• Signal processing
o Separating speech from background noise.
• Phonetics
o Variability in human speech.
• Phonology
o Recognizing individual sound distinctions (similar phonemes).
• Lexicology and syntax
o Disambiguating homophones.
o Features of continuous speech.
• Syntax and pragmatics
o Interpreting features.
o Filtering of performance errors (disfluencies).
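The digitization item at the top of this list can be sketched as two operations: sampling the signal at fixed time intervals, and quantizing each sample to an integer. The 8000 Hz sample rate, the 16-bit depth, and the 440 Hz test tone are assumptions chosen for illustration, and `digitize` and `tone` are hypothetical names.

```python
import math


def digitize(signal, duration_s, sample_rate_hz=8000, bits=16):
    """Sample a continuous signal and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1  # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                        # sampling: discrete time points
        amplitude = max(-1.0, min(1.0, signal(t)))    # clip to [-1, 1]
        samples.append(round(amplitude * max_level))  # quantization
    return samples


def tone(t):
    """Hypothetical 'analog' input: a 440 Hz sine wave with amplitude in [-1, 1]."""
    return math.sin(2 * math.pi * 440 * t)


if __name__ == "__main__":
    print(digitize(tone, duration_s=0.001))
```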
Speech recognition is still a very difficult problem. The main problems are:

• Speaker variability: two speakers, or even the same speaker, will pronounce the same word differently.
• Channel variability: the quality and position of the microphone and the background environment affect the output.
Speech recognition applications include:
• Voice dialling (e.g., "Call home")
• Call routing (e.g., "I would like to make a collect call")
• Simple data entry (e.g., entering a credit card number)
• Preparation of structured documents (e.g., a radiology report)
• Speech-to-text processing (e.g., word processors or emails)
• Aircraft cockpits (usually termed Direct Voice Input)
• Medical transcription
• Military
• Telephony and other domains
• Serving the disabled
Further Applications
• Home automation
• Automobile audio systems
• Telematics
Advantages
• Faster than handwriting.
• Allows for better spelling, whether in text or documents.
• Helpful for people with a mental or physical disability.
• Hands-free capability.
Disadvantages
• No program is 100% accurate.
• Factors that affect the accuracy of speech recognition include slang, homonyms, signal-to-noise ratio, and overlapping speech.
• Can be expensive, depending on the program.
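Of the factors listed, signal-to-noise ratio is the one with a simple formula: ten times the base-10 logarithm of signal power divided by noise power, in decibels. The sketch below assumes the speech and the noise are available as separate sample sequences, which is rarely true in practice. The function name `snr_db` and the sample values are hypothetical.

```python
import math


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from two sample sequences."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(x * x for x in noise) / len(noise)
    return 10 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    speech = [1.0, -1.0, 1.0, -1.0]    # made-up speech samples
    hiss = [0.1, -0.1, 0.1, -0.1]      # made-up background noise
    print(round(snr_db(speech, hiss), 1))
```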
