PHP Voice

The document discusses voice recognition systems and how they work by converting speech to text or commands. It describes the technology used including VoiceXML, which is an XML language for building voice applications. Potential applications of web-based voice recognition systems are also outlined.

Uploaded by

Sidharth Choubey
Copyright
© Attribution Non-Commercial (BY-NC)

Problem Statement: Web-Based Voice Recognition System

Introduction & Working of a Voice Recognition System:


Today, when we call most large companies, a person doesn't usually answer the phone.
Instead, an automated voice recording answers and instructs you to press buttons to move
through option menus. Many companies have moved beyond requiring you to press buttons,
though. Often you can just speak certain words (again, as instructed by a recording) to get
what you need. The system that makes this possible is a type of speech recognition program:
an automated phone system.
You can also use speech recognition software in homes and businesses. A range of software
products allows users to dictate to their computer and have their words converted to text in a
word processing or e-mail document. You can access function commands, such as opening
files and accessing menus, with voice instructions. Some programs are for specific business
settings, such as medical or legal transcription.
People with disabilities that prevent them from typing have also adopted speech-recognition
systems. If a user has lost the use of their hands, or is visually impaired and cannot
conveniently use a Braille keyboard, these systems allow personal expression
through dictation as well as control of many computer tasks. Some programs save users'
speech data after every session, allowing people with progressive speech deterioration to
continue to dictate to their computers.
Current programs fall into two categories:
Small-vocabulary/many-users
These systems are ideal for automated telephone answering. The users can speak with a great
deal of variation in accent and speech patterns, and the system will still understand them most
of the time. However, usage is limited to a small number of predetermined commands and
inputs, such as basic menu options or numbers.
Large-vocabulary/limited-users
These systems work best in a business environment where a small number of users will work
with the program. While these systems work with a good degree of accuracy (85 percent or
higher with an expert user) and have vocabularies in the tens of thousands, you must train
them to work best with a small number of primary users. The accuracy rate will fall
drastically with any other user.
Speech recognition systems made more than 10 years ago also faced a choice between
discrete and continuous speech. It is much easier for the program to understand words when
we speak them separately, with a distinct pause between each one. However, most users
prefer to speak at a normal, conversational speed. Almost all modern systems are capable of
understanding continuous speech.
Speech to Data
To convert speech to on-screen text or a computer command, a computer has to go through
several complex steps. When you speak, you create vibrations in the air. The analog-to-digital
converter (ADC) translates this analog wave into digital data that the computer can
understand. To do this, it samples, or digitizes, the sound by taking precise measurements of
the wave at frequent intervals. The system filters the digitized sound to remove unwanted
noise, and sometimes to separate it into different bands of frequency (frequency is the
number of sound-wave cycles per second, heard by humans as differences in pitch). It also
normalizes the sound, or adjusts it to a constant volume level. It may also have to be
temporally aligned. People don't always speak at the same speed, so the sound must be
adjusted to match the speed of the template sound samples already stored in the system's memory.

An ADC translates the analog waves of your voice into digital data by sampling the
sound. The higher the sampling and precision rates, the higher the quality.
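
The sampling, normalization and digitizing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 440 Hz test tone, the 8000 Hz sampling rate, the 16-bit depth and the function names are all hypothetical choices, not values taken from any particular recognizer.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=0.5):
    """Measure an idealized analog sine wave at evenly spaced instants."""
    count = round(sample_rate_hz * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(count)
    ]

def normalize(samples, target_peak=1.0):
    """Scale the samples so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    return [s * target_peak / peak for s in samples]

def quantize(samples, bits=16):
    """Map each sample in [-1.0, 1.0] to a signed integer of the given bit depth."""
    max_level = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * max_level) for s in samples]

# Assumed values: a 440 Hz tone, sampled 8000 times per second for 10 ms.
wave = sample_sine(440, 8000, 0.01)
loud = normalize(wave)
digital = quantize(loud, bits=16)
```

A higher sampling rate yields more measurements per second, and a larger bit depth yields finer amplitude steps, which is what the caption means by higher quality.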

Next the signal is divided into small segments as short as a few hundredths of a second, or
even thousandths in the case of plosive consonant sounds (consonant stops produced by
obstructing airflow in the vocal tract) like "p" or "t." The program then matches these
segments to known phonemes in the appropriate language. A phoneme is the smallest
element of a language: a representation of the sounds we make and put together to form
meaningful expressions. There are roughly 40 phonemes in the English language (different
linguists have different opinions on the exact number), while other languages have more or
fewer phonemes.
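
As a rough sketch of the segmentation step, the digitized samples can be cut into short, overlapping frames. The 20 ms frame length, the 10 ms step between frames and the function names below are assumptions chosen for illustration; the text only says segments last a few hundredths of a second.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=20, hop_ms=10):
    """Cut a list of samples into fixed-length, overlapping frames."""
    frame_len = sample_rate_hz * frame_ms // 1000
    hop_len = sample_rate_hz * hop_ms // 1000
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must each cover at least one sample")
    frames = []
    # Stop once a full frame no longer fits in the remaining samples.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames

def frame_energy(frame):
    """Average squared amplitude, a simple per-frame loudness measure."""
    return sum(s * s for s in frame) / len(frame)

# 800 samples at an assumed 8000 Hz rate is a tenth of a second of audio.
frames = split_into_frames(list(range(800)), 8000)
```

Each frame would then be compared against stored phoneme templates; the energy helper is one simple feature such a comparison might use.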

The next step seems simple, but it is actually the most difficult to accomplish and is the
focus of most speech recognition research. The program examines phonemes in the context of
the other phonemes around them. It runs the contextual phoneme plot through a complex
statistical model and compares it to a large library of known words, phrases and
sentences. The program then determines what the user was probably saying and either outputs
it as text or issues a computer command.
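
The idea of comparing recognized phonemes against a library of known words can be illustrated with a deliberately simplified stand-in. The sketch below swaps the statistical model for a plain edit distance over phoneme sequences, and the four-word lexicon with its phoneme spellings is hypothetical; a real recognizer uses far larger libraries and probabilistic models.

```python
def edit_distance(a, b):
    """Count the insertions, deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

# Hypothetical small-vocabulary lexicon; the phoneme symbols are illustrative.
LEXICON = {
    "yes": ["Y", "EH", "S"],
    "no": ["N", "OW"],
    "menu": ["M", "EH", "N", "Y", "UW"],
    "help": ["HH", "EH", "L", "P"],
}

def best_word(phonemes, lexicon=LEXICON):
    """Return the lexicon word whose phoneme sequence is closest to the input."""
    return min(lexicon, key=lambda word: edit_distance(phonemes, lexicon[word]))

# A slightly misheard "yes" (final S heard as Z) still resolves to "yes".
guess = best_word(["Y", "EH", "Z"])
```

This nearest-match approach only suits a small, fixed vocabulary such as a phone menu; it ignores the surrounding-word context that the statistical model described above relies on.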


Technology Used:
PHP

MySQL


VoiceXML Introduction
VoiceXML (VXML) is the W3C's standard XML format for specifying interactive voice dialogues
between a human and a computer. It allows voice applications to be developed and deployed in an
analogous way to HTML for visual applications. Just as HTML documents are interpreted by a visual
web browser, VoiceXML documents are interpreted by a voice browser.
Many commercial VoiceXML applications have been deployed, processing millions of
telephone calls per day. These applications include: order inquiry, package tracking, driving
directions, emergency notification, wake-up, flight tracking, voice access to email, customer
relationship management, prescription refilling, audio news magazines, voice dialing,
real-estate information and national directory assistance applications.
VoiceXML has tags that instruct the voice browser to provide speech synthesis, automatic
speech recognition, dialog management, and audio playback. The following is an example of
a VoiceXML document:
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        Hello world!
      </prompt>
    </block>
  </form>
</vxml>
When interpreted by a VoiceXML interpreter this will output "Hello world" with synthesized
speech.
Typically, HTTP is used as the transport protocol for fetching VoiceXML pages. Some
applications may use static VoiceXML pages, while others rely on dynamic VoiceXML page
generation using an application server like Tomcat, WebLogic, IIS, or WebSphere.
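
Dynamic page generation can be sketched as a function that builds the same "Hello world" structure around a message chosen at request time. This is a minimal illustration in Python using the standard library's xml.etree.ElementTree, not the PHP or application-server setup the project names; the function name build_prompt_page is hypothetical.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

def build_prompt_page(message):
    """Return a VoiceXML 2.0 document that speaks one message."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form")
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    prompt.text = message  # special characters are escaped on serialization
    body = ET.tostring(vxml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# A server-side handler would return this string as the HTTP response body.
page = build_prompt_page("Your package will arrive on Tuesday.")
```

Building the tree with an XML library rather than concatenating strings keeps the page well-formed even when the message contains characters such as "&" or "<".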
Historically, VoiceXML platform vendors have implemented the standard in different ways,
and added proprietary features. But the VoiceXML 2.0 standard, adopted as a W3C
Recommendation on 16 March 2004, clarified most areas of difference. The VoiceXML
Forum, an industry group promoting the use of the standard, provides a conformance testing
process that certifies vendors' implementations as conformant.




The problem statement can be solved by the following strategy.




Applications include:
1. Web-based interactive voice response systems (IVRS).
2. Voice-based secure login systems.
3. Voice-based search engines.
4. Voice-based online home automation systems over IP networks.
