VoiceXML 2.0 Grammars, Part I

This technical series gives programmers a complete introduction to
the VoiceXML 2.0 grammar format. In Part I, we discuss the XML and
ABNF formats, as well as the structure and elements included in a
VoiceXML 2.0 grammar document.

Overview

Grammars define the words and sentences (or touch-tone DTMF
input) that can be recognized by a VoiceXML application. One big
drawback of VoiceXML 1.0 was that it lacked a standard speech
recognition grammar format. To some degree, this reduced the
benefits of the specification because it left the burden on VoiceXML
browser developers to define the grammar language and format. For
example, application grammars written for Nuance Voice Web Server
would have to be re-written to work on IBM Voice Server. This
problem was rectified with the Speech Recognition Grammar
Specification (SRGS), introduced by the W3C Voice Browser Working
Group in conjunction with the VoiceXML 2.0 specification.

XML or ABNF?

The VoiceXML 2.0 grammar specification provides two text formats
for writing speech recognition grammars: XML or ABNF. XML is a Web
standard for representing structured data. Many programming and
editing tools incorporate XML editing and processing capabilities.
These XML tools can be used to write VoiceXML 2.0 grammars. ABNF
stands for Augmented Backus-Naur Form, a notation used to specify
languages, protocols and text formats. For example, HTTP, the
communications protocol used on the World Wide Web (and for
VoiceXML applications), is specified in an augmented BNF notation.

The ABNF grammar format uses special characters to define grammar
expressions in a text string while XML grammars are composed of text
strings enclosed in XML elements. Whether to use the ABNF or XML
format is up to you. However, VoiceXML 2.0 requires implementers to
support only the XML format, so you may want to write grammars in
XML if portability is important to you.
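
To make the contrast concrete, here is a minimal sketch of one small
grammar written in both forms. The rule name "answer" and the words
"yes" and "no" are illustrative assumptions, not taken from any
particular application. The ABNF form expresses the alternatives with
special characters ($ for a rule name, | for a choice):

```
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $answer;

// Hypothetical rule: matches either "yes" or "no"
public $answer = yes | no;
```

The XML form expresses the same rule with nested elements:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="answer">
  <!-- Hypothetical rule: matches either "yes" or "no" -->
  <rule id="answer" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
    </one-of>
  </rule>
</grammar>
```

Both sketches should describe the same set of utterances. The header
declarations in the ABNF form (language, mode, root) correspond to
attributes on the grammar element in the XML form.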

If you're already experienced with the GSL or JSGF grammar
formats, then you'll likely prefer the ABNF format because of its
similarity to them. If you decide to use the XML format, you will quickly
discover that it is extremely verbose compared to ABNF, making it
more difficult to read. On the other hand, using the DTD or XML
Schema for the XML grammar format in conjunction with an XML editor
makes the task less tedious and reduces syntax errors. The authors
of the VoiceXML 2.0 grammar format have also included an XSL style
sheet for converting XML grammars to ABNF format, which may aid
linguists who prefer to proof grammars in a less verbose text
format.