Building VoiceXML Dialogs

Introduction

Until fairly recently, the web primarily delivered information and services through visual interfaces on computers equipped with displays, keyboards, and pointing devices. The web revolution largely bypassed the huge market of customers represented by the worldwide installed base of telephones, for which voice input and audio output provide the primary means of interaction.

VoiceXML 2.0 [VXML2], a standard recently released by the W3C [W3C], is helping to change that. Building on the market established in 1999 by the VoiceXML Forum's VoiceXML 1.0 specification [VXML1], VoiceXML 2.0 and several complementary standards are changing the way we interact with voice services and applications by simplifying the way these services and applications are built.

VoiceXML is an XML-based [XML] language, designed to be used on the Web. As such, it inherits several key features common to all XML languages:

It leverages existing Web protocols such as HTTP to access remote resources

Any tool that is able to read or write XML documents can read and write a VoiceXML document

VoiceXML documents and fragments can be embedded in other XML documents; similarly, VoiceXML documents can embed other XML documents and fragments. This is the case with SRGS and SSML, which are described later.

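As mentioned above, VoiceXML 2.0 is one of a number of standards the W3C Voice Browser Working Group is defining to enable the development of conversational voice applications. The specifications making up the Speech Interface Framework include VoiceXML itself, the Speech Recognition Grammar Specification (SRGS), the Speech Synthesis Markup Language (SSML), and Semantic Interpretation for Speech Recognition (SISR).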

This article is the first in a three-part series that introduces VoiceXML, as well as SRGS, SSML, and SISR, for building conversational web applications. This first installment focuses on building VoiceXML dialogs with the menu and form elements. The second part will outline how VoiceXML takes advantage of the distributed web-based application model, as well as advanced features including local validation and processing, audio playback and recording, support for context-specific and tapered help, and support for reusable subdialogs. Finally, the third article will discuss natural vs. directed dialogue and how VoiceXML enables both by allowing input grammars to be specified at the form level, not just at the field level.

The Menu Element

Most VoiceXML dialogs are built from one of two elements. The first of these is the <menu> element. A VoiceXML menu behaves much like a collection of HTML links.

A VoiceXML menu has a <prompt>, which contains SSML content, and one or more choices, each identified by a <choice> tag. Each choice consists of a phrase indicating what the user can say, as well as a link to the next VoiceXML document to be executed.
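A menu of this kind might be sketched as follows. This is an illustrative reconstruction, not a listing taken from the article: the sports URL matches the one discussed in the text, while the weather and news choices, their URLs, and the prompt wording are assumptions added for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu>
    <!-- The enumerate element reads out the text of each choice in turn -->
    <prompt>Say one of: <enumerate/></prompt>
    <!-- The sports entry follows the article text; the other two are assumed -->
    <choice next="http://www.example.com/sports.vxml">Sports scores</choice>
    <choice next="http://www.example.com/weather.vxml">Weather information</choice>
    <choice next="http://www.example.com/news.vxml">News headlines</choice>
  </menu>
</vxml>
```

Each <choice> plays the role of an HTML link: the element text is what the user can say, and the next attribute names the document to fetch when that phrase is recognized.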

When the VoiceXML Browser recognizes that the user has spoken "sports scores," it fetches the VoiceXML document identified by the corresponding choice (http://www.example.com/sports.vxml) and begins executing it, presumably providing the user with sports information.

The Form Element

The second dialog element in VoiceXML is the <form> element. A VoiceXML form is very similar to an HTML form in that it typically contains one or more input fields that a user must complete. Each input field in a form has a prompt and a specification of what a user can say to fill in the field.
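A login form with two fields might be sketched as follows. The field names phone_number and pin_code and the "login" servlet come from the discussion in the text; the full servlet URL, the prompt wording, and the use of the built-in "phone" and "digits" field types (which a given platform may or may not support) are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <!-- Each field pairs a prompt with a specification of valid input -->
    <field name="phone_number" type="phone">
      <prompt>Please say your complete phone number.</prompt>
    </field>
    <field name="pin_code" type="digits">
      <prompt>Please say your PIN code.</prompt>
    </field>
    <!-- Runs once both fields are filled; the URL is a hypothetical endpoint -->
    <block>
      <submit next="http://www.example.com/servlet/login"
              namelist="phone_number pin_code"/>
    </block>
  </form>
</vxml>
```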

As each <field> is executed, its <prompt> is played. Following the prompt, the user responds by speaking the requested information. When both fields in the form have been filled, the final <block> is executed. In this example, the <block> executes a <submit> tag, which sends the variables phone_number and pin_code to the "login" servlet, in much the same way as a "submit" button works on an HTML form. The servlet would then return a new document for the VoiceXML Browser to execute.
