
Team: Mr. Rahul Kr. Singh, Mr. Hitesh Kumar, IT VII Sem

This document discusses speech recognition technology. It provides an overview of what speech recognition is, how it works, challenges, applications, and key players in the market. Speech recognition involves converting speech to text using algorithms to analyze acoustic signals. It allows computers to understand and respond to spoken commands and questions. However, there are still weaknesses like environmental noise, determining word boundaries, and recognizing homonyms. The document also explores how speech recognition may enable applications like universal translation and hands-free computing in the future.

Uploaded by

Hitesh Kumar
Copyright
© Attribution Non-Commercial (BY-NC)

TEAM: Mr. RAHUL KR. SINGH
Mr. HITESH KUMAR
IT VII SEM
The Computer of the Future will TALK,
LISTEN, UNDERSTAND and RESPOND

One of them is the Apple Macintosh of today.


Apple’s Speech Recognition and Speech Synthesis
Technologies now give speech-savvy applications the
power to carry out your voice commands and even
speak back to you in plain English.
Speech recognition is the process of converting a
speech signal to a sequence of words, by means of an
algorithm implemented as a computer program.

Voice verification, or speaker recognition, is a related
process that attempts to identify the person speaking.
Speech recognition is the process of converting an
acoustic signal, captured by a microphone or a
telephone, to a set of words.

The recognized words can be the final result, as in
applications such as command and control, data entry,
and document preparation.

They can also serve as the input to further linguistic
processing in order to achieve speech understanding.
This process is even more complicated for phrases and sentences -- the
system has to figure out where each word stops and starts. The classic
example is the phrase "recognize speech," which sounds a lot like
"wreck a nice beach" when you say it very quickly. The program has to
analyze the phonemes using the phrase that came before it in
order to get it right. Here's a breakdown of the two phrases:

r eh k ao g n ay z s p iy ch
"recognize speech"

r eh k ay n ay s b iy ch
"wreck a nice beach"
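To make "sounds a lot like" concrete, the two phoneme sequences above can be compared with an edit distance. This is only an illustrative sketch: the source does not say how a recognizer compares phoneme strings, and the function name edit_distance and the unit cost per insertion, deletion or substitution are assumptions made here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


# Phoneme sequences copied from the breakdown above.
RECOGNIZE_SPEECH = "r eh k ao g n ay z s p iy ch".split()
WRECK_A_NICE_BEACH = "r eh k ay n ay s b iy ch".split()

if __name__ == "__main__":
    print(edit_distance(RECOGNIZE_SPEECH, WRECK_A_NICE_BEACH))
```

Only a few of the twelve phonemes differ, which is why the system needs the surrounding context to choose between the two phrases.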
Who Can Benefit from Speech Recognition?
Persons with mobility impairments or injuries
that prevent keyboard access
Persons who have, or who are seeking to prevent,
repetitive stress injuries
Persons with writing difficulties
Any person who wants hands-free access to the
computer
Any person who wants to increase their typing
speed (reportedly up to 160 wpm)
WHAT IS REQUIRED TO USE SPEECH RECOGNITION?

A Powerful Computer
Consistent Speech (not necessarily intelligible)
Fluid speech (i.e., not pausing between words),
desirable for use of continuous speech products
Patience
Basic knowledge of computers
Fairly high cognitive ability
Command recognition: voice user interface with the computer
Dictation
Interactive Voice Response
Automotive speech recognition
Medical Transcription
Pronunciation teaching in computer-aided language learning applications
Automatic Translation
Hands-free computing
Discrete
Slower dictation process; better for persons
with difficulty in language processing or in fluid
speech
Word-by-word style, rather than phrases,
reflects the way beginning writers form
sentences
Continuous
Processes speech by phrase
Takes context into account
Is less accurate if phrases are interrupted
Advantages: speed and accuracy (for most users)
[Figure: Speech Analysis. Questions: WHO? WHAT? HOW? (lie detector). Areas: Verification, Identification, Recognition, Understanding. Annotations: identification and recognition of the speaker; leads to speech-controlled typewriter, translation system, workplace for PC.]


[Figure: Components of a speech recognition system. SPEECH enters Speech Analysis (parameters, response, property extraction; special chip), followed by Problem Recognition (comparison with reference, decision; main program), which draws on Reference Storage (properties of learned material) and outputs Recognized Speech.]
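The components named in the figure (property extraction, comparison with stored references, decision) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method of any product mentioned here: the feature (mean absolute amplitude per slice), the stored vectors, the threshold, and the names REFERENCE_STORAGE, extract_properties and recognize are all hypothetical.

```python
from math import dist

# Hypothetical reference storage: one learned feature vector per word.
REFERENCE_STORAGE = {
    "yes": (0.9, 0.1, 0.3),
    "no": (0.2, 0.8, 0.5),
    "stop": (0.5, 0.5, 0.9),
}


def extract_properties(samples, n_bands=3):
    """Toy property extraction: mean absolute amplitude of n_bands equal slices.

    Assumes len(samples) >= n_bands; trailing samples that do not fill a
    slice are ignored.
    """
    size = len(samples) // n_bands
    return tuple(
        sum(abs(s) for s in samples[i * size:(i + 1) * size]) / size
        for i in range(n_bands)
    )


def recognize(samples, max_distance=0.5):
    """Compare extracted properties with every reference, then decide."""
    features = extract_properties(samples)
    word, best = min(
        ((w, dist(features, ref)) for w, ref in REFERENCE_STORAGE.items()),
        key=lambda pair: pair[1],
    )
    # Decision step: reject the input if even the closest reference is too far.
    return word if best <= max_distance else None
```

Separating extraction from comparison mirrors the split in the figure, where analysis runs on a special chip and the comparison and decision run in the main program.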
At the same reliability, a speaker-independent system can
recognize considerably fewer words than a speaker-dependent
system, because the latter is TRAINED IN ADVANCE. Training
in advance means that the speech recognition system goes
through a training phase, which takes about half an hour.

A speaker-dependent recognition system can recognize
around 25,000 words; a speaker-independent recognition
system can recognize around 500 words, and with a worse
recognition rate.
The major players in the speech recognition market are
Dragon Systems, Lernout & Hauspie (L&H), and IBM.
Dragon’s original product, Dragon Dictate, is currently
the only product that uses the discrete speech model.

The current L&H product line, called VoiceXpress,
includes a Standard, Advanced, and Professional edition.

IBM has been a major player in speech recognition for
many years. It has discontinued its discrete speech
product, IBM Voice Type, and is now focusing all
its efforts on developing continuous speech products. Its
current product line, IBM ViaVoice Millennium, includes a
Standard, Web and Professional edition. The Web edition
features natural language commands for Internet
Explorer, Netscape Communicator and America Online.
Room acoustics with existing environmental noise.
Overlapping of the primary sound wave.
Word boundaries must be determined.
Time normalization is necessary during comparison:
the same word can be spoken quickly or slowly.
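One common way to handle the time normalization mentioned above is dynamic time warping (DTW), which stretches or compresses one sequence so it lines up with another before comparing them. The source does not name a specific method, so DTW is an assumption here, as are the function name dtw_distance and the use of plain numbers as stand-ins for per-frame features.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local frame-to-frame difference
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b holds (b frame stretched)
                cost[i][j - 1],      # b advances, a holds (a frame stretched)
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    fast = [1, 2, 3]
    slow = [1, 1, 2, 2, 3, 3]  # same shape, spoken at half speed
    print(dtw_distance(fast, slow))
```

Because frames may be repeated at no extra cost, a word spoken slowly and the same word spoken quickly can still match a single stored reference.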
Speech Recognition: Weaknesses and Flaws

Low signal-to-noise ratio


Overlapping speech
Intensive use of computer power
Homonyms
The primary goal of speech analysis is to
correctly determine individual words with
probability ≤ 1.
Environmental noise, room acoustics and a speaker’s
physical and psychological condition play an
important role in this determination.
Example: let’s assume extremely bad individual word
recognition, with a probability of 0.95. This means that
5% of the words are incorrectly recognized. If we have a
sentence with 3 words, the probability of
recognizing the whole sentence correctly is 0.95 × 0.95
× 0.95 ≈ 0.857.
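The arithmetic above generalizes to sentences of any length: if each word is recognized correctly with probability p, an n-word sentence is fully correct with probability p to the power n. The sketch below assumes, as the multiplication in the example implicitly does, that word errors are independent; the function name sentence_accuracy is made up for illustration.

```python
def sentence_accuracy(word_accuracy, n_words):
    """Probability that all n_words are recognized correctly.

    Assumes each word is recognized independently with the same probability.
    """
    if not 0.0 <= word_accuracy <= 1.0:
        raise ValueError("word_accuracy must be a probability in [0, 1]")
    if n_words < 0:
        raise ValueError("n_words must be non-negative")
    return word_accuracy ** n_words


if __name__ == "__main__":
    for n in (1, 3, 10, 20):
        print(n, round(sentence_accuracy(0.95, n), 3))
```

The longer the sentence, the faster the chance of an error-free result falls, which is why small gains in per-word accuracy matter.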
A universal translator
At some point in the future, speech recognition may
become speech understanding.
The Application of Speech Recognition Techniques to
Radar Target Doppler Recognition
universal translator
Multilingual Speech Processing, edited by Tanja
Schultz and Katrin Kirchhoff, April 2006
Multimedia: Computing, Communications &
Applications, by Ralf Steinmetz and Klara
Nahrstedt
www.software.ibm.com/speech/
www.dragonsys.com
http://cslu.cse.ogi.edu/HLTsurvey/ch1node5.html
http://www.apple.com/macosx/developertools
Enjoy Speech Recognition
Technology
