
1.1 Definition

This document discusses the key components and requirements of voice browsers. It defines a voice browser as a device that can generate voice output, interpret voice input, and interact with other modalities. The document then discusses the need for grammar representation, model architectures, natural language processing, and speech synthesis markup to enable voice browsers to function across different devices and access dynamic internet content. It aims to clarify the scope and requirements of these different components to help the working group make progress.

Uploaded by

kks

1. INTRODUCTION
1.1 DEFINITION
A Voice Browser is a "device which interprets a (voice) markup language and is
capable of generating voice output and/or interpreting voice input, and possibly other
input/output modalities." This definition is a broad one. The fact
that the system deals with speech is obvious given the first word of the name, but what makes
a software system that interacts with the user via speech a "browser"? The information that
the system uses (for either domain data or dialog flow) is dynamic and comes from somewhere
on the Internet. From an end-user's perspective, the impetus is to provide a service similar
to what graphical browsers of HTML and related technologies do today, but on devices that
are not equipped with full browsers or even the screens to support them. This situation is only
exacerbated by the fact that much of today's content depends on the ability to run scripting
languages and third-party plug-ins to work correctly.
Much of the effort concentrates on using the telephone as the first voice browsing
device. This is not to say that it is the preferred embodiment for a voice browser, only that the
number of access devices is huge, and that the telephone sits at the opposite end of the
graphical-browser continuum, which highlights the requirements that make a speech interface viable.
By the working group's first meeting it was clear that this scope-limiting was also needed in order to make
progress, given that there are significant challenges in designing a system that uses or
integrates with existing content, or that automatically scales to the features of various access
devices.

Grammar Representation Requirements


It defines a speech recognition grammar specification language that will be
generally useful across a variety of speech platforms used in the context of a dialog and
synthesis markup environment.
When the system or application needs to describe to the speech-recognizer what to listen for,
one way it can do so is via a format that is both human and machine-readable.
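
As a rough illustration of a format that is both human- and machine-readable, the sketch below encodes a toy grammar as a Python dictionary and checks whether an utterance is covered by it. The rule names, the `$rule` reference convention, and the `accepts` helper are all hypothetical; this is not the grammar language the working group specified, only a minimal stand-in for the idea.

```python
# Hypothetical toy grammar format (an assumption, not a W3C syntax):
# each rule maps to a list of alternatives, and each alternative is a
# sequence of literal words or "$rule" references to other rules.
GRAMMAR = {
    "command": [["call", "$contact"], ["dial", "$digit", "$digit", "$digit"]],
    "contact": [["alice"], ["bob"]],
    "digit": [["one"], ["two"], ["three"]],
}


def match(grammar, symbol, words, pos=0):
    """Return the set of positions reachable after matching `symbol` at `pos`.

    Assumes the grammar has no left-recursive rules.
    """
    if not symbol.startswith("$"):
        # Literal word: consume exactly one token if it matches.
        if pos < len(words) and words[pos] == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for alternative in grammar[symbol[1:]]:
        positions = {pos}
        for part in alternative:
            # Advance every surviving position through the next part.
            positions = {
                end for p in positions for end in match(grammar, part, words, p)
            }
        ends |= positions
    return ends


def accepts(grammar, rule, utterance):
    """True if the named rule covers the whole utterance, word for word."""
    words = utterance.lower().split()
    return len(words) in match(grammar, "$" + rule, words)
```

With this grammar, `accepts(GRAMMAR, "command", "call alice")` succeeds, while an utterance with a trailing or unknown word is rejected. A real recognizer would use the grammar to constrain what it listens for rather than to check a finished transcript.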

Model Architecture for Voice Browser Systems Representations


"To assist in clarifying the scope of charters of each of the several subgroups of
the W3C Voice Browser Working Group, a representative or model architecture for a typical
voice browser application has been developed. This architecture illustrates one possible
arrangement of the main components of a typical system, and should not be construed as a
recommendation."

Natural Language Processing Requirements


It establishes a prioritized list of requirements for natural language processing in
a voice browser environment. The data that a voice browser uses to create a dialog can vary
from a rigid set of instructions and state transitions, whether declaratively and/or procedurally
stated, to a dialog that is created dynamically from information and constraints about the
dialog itself. The NLP requirements document describes the requirements of a system that
takes the latter approach, using an example paradigm of a set of tasks operating on a frame-based model. Slots in the frame that are optionally filled guide the dialog and provide
contextual information used for task-selection.
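
The frame-based idea can be sketched as follows. This is a minimal illustration under stated assumptions: the `Task` class, the two example tasks, their slot names, and the scoring rule used for task selection are all invented here and do not come from the NLP requirements document.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    # Required slot -> question to ask; insertion order is the asking order.
    prompts: dict


# Hypothetical tasks and slots, chosen only for illustration.
TASKS = [
    Task("book_flight", {
        "origin": "Where are you leaving from?",
        "destination": "Where are you flying to?",
        "date": "What day do you want to travel?",
    }),
    Task("check_weather", {
        "city": "Which city?",
        "date": "For which day?",
    }),
]


def select_task(tasks, frame):
    """Pick the task whose slots overlap most with the filled frame slots.

    Ties go to the task listed first (an arbitrary choice in this sketch).
    """
    return max(tasks, key=lambda t: sum(slot in frame for slot in t.prompts))


def next_prompt(task, frame):
    """Return the question for the first unfilled slot, or None when done."""
    for slot, prompt in task.prompts.items():
        if slot not in frame:
            return prompt
    return None
```

Here the frame is a plain dictionary of filled slots. A frame such as `{"destination": "Paris"}` selects `book_flight`, and the dialog's next turn is the prompt for the first slot still missing, so the dialog is derived from the state of the frame instead of a fixed script.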

Speech Synthesis Markup Requirements


It establishes a prioritized list of requirements for speech synthesis markup which
any proposed markup language should address. A text-to-speech system, which is usually a
stand-alone module that does not actually "understand the meaning" of what is spoken, must
rely on hints to produce an utterance that is natural and easy to understand, and moreover,
evokes the desired meaning in the listener. In addition to these prosodic elements, the
document also describes issues such as multi-lingual capability, pronunciation issues for
words not in the lexicon, time-synchronization, and textual items that require special
preprocessing before they can be spoken properly.
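
To make the notion of "hints" concrete, the sketch below builds a small marked-up utterance with Python's standard `xml.etree.ElementTree` module. It assumes element names borrowed from the later W3C SSML Recommendation (`speak`, `say-as`, `break`, `emphasis`); the requirements document itself does not prescribe them, the `interpret-as` value is illustrative, and the fragment omits the namespace and version attributes a conforming document would carry.

```python
import xml.etree.ElementTree as ET


def build_utterance():
    """Build a marked-up sentence and return it as an XML string."""
    speak = ET.Element("speak")
    speak.text = "Your flight leaves at "

    # Hint that "7:45" is a time of day, not a ratio or a verse number.
    say_as = ET.SubElement(speak, "say-as", {"interpret-as": "time"})
    say_as.text = "7:45"
    say_as.tail = "."

    # Prosodic hint: insert a half-second pause between the sentences.
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = " Please arrive "

    # Prosodic hint: stress the word the listener should remember.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = "early"
    emphasis.tail = "."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_utterance())
```

The text-to-speech module does not need to understand the sentence; the markup tells it how to read the time expression, where to pause, and which word to stress.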
