Get all the necessities of VoiceXML, beginning with a brief overview of the structure of the language. You will see several abstract element types that will be used as a shorthand as you examine each language element type individually.

This chapter is from the book

In Chapter 2, “VoiceXML essentials,” on page 24, our exploration of VoiceXML was motivated by how the language is generally used. This chapter covers all of VoiceXML from the perspective of its Document Type Definition (DTD). We begin with a brief overview of the structure of the language, defining several abstract element types that serve as shorthand throughout the remainder of the chapter as we examine each language element type individually.

This chapter describes attributes as they are defined in the VoiceXML 2.0 Working Draft. Attributes that are new in VoiceXML 2.0 (that is, not available in VoiceXML 1.0 implementations) carry the notation "(2.0)."

VoiceXML document structure

Each VoiceXML document must have a vxml element as its root node. Figure 3-1 lists the legal children of the vxml element type.
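As a minimal sketch of this rule, the document below has vxml as its root node and a single form as its only child. The version value and namespace URI are assumptions taken from the final VoiceXML 2.0 Recommendation; an implementation based on the Working Draft may expect different values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sketch: vxml is the root node; the form is one of its legal children. -->
<!-- Assumption: version and xmlns follow the final VoiceXML 2.0 Recommendation. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="hello">
    <block>Hello, world.</block>
  </form>
</vxml>
```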

All of these children are other VoiceXML element types except two: event.handler and dialog. These are abstract element types, defined as entities in the DTD. Each of them can represent a union of element types.

Figure 3-2 shows the event.handler element types. Any of these event.handler element types can legally be a child of the vxml element type.
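The figure is not reproduced here, so the sketch below assumes the usual set of handlers: catch, help, noinput, nomatch, and error. It places one of each directly under vxml, where they apply to every dialog in the document. The event name com.example.cancelled is hypothetical and stands in for any application-defined event.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-level handlers: inherited by every dialog in this document. -->
  <help>Please answer yes or no.</help>
  <noinput>I did not hear anything. <reprompt/></noinput>
  <nomatch>I did not understand that. <reprompt/></nomatch>
  <error>Sorry, something went wrong. <exit/></error>
  <!-- com.example.cancelled is a hypothetical application-defined event. -->
  <catch event="com.example.cancelled">Your request was cancelled. <exit/></catch>

  <form id="confirm">
    <field name="answer" type="boolean">
      <prompt>Do you want to continue?</prompt>
      <filled>
        <!-- A "no" answer raises the event handled by the catch above. -->
        <if cond="!answer">
          <throw event="com.example.cancelled"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```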

Figure 3-3 shows the definition of the abstract dialog element type which includes form and menu. These are the two types of dialogs defined in VoiceXML. Both of these element types may be children of a vxml element.

The menu element type can be the parent of choice, property, and prompt element types, of the event.handler element types defined in Figure 3-2, and of the audible element types, which are defined shortly. Similarly, the form element type can be the parent of script, property, link, filled, var, and grammar element types, of the event.handler element types, and of the form.item element types.
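To make the two dialog types concrete, here is a small sketch in which a menu routes the caller to one of two forms. The menu holds property, prompt, choice, and an event handler; each form holds a var and a form item. The dialog ids, prompt wording, and timeout value are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <!-- property, prompt, choice, and an event handler as children of menu -->
    <property name="timeout" value="5s"/>
    <prompt>Say sports or weather.</prompt>
    <choice next="#sports">sports</choice>
    <choice next="#weather">weather</choice>
    <noinput>Please say sports or weather. <reprompt/></noinput>
  </menu>

  <form id="sports">
    <!-- var and a form item (block) as children of form -->
    <var name="topic" expr="'sports'"/>
    <block>Here are the <value expr="topic"/> headlines.</block>
  </form>

  <form id="weather">
    <var name="topic" expr="'weather'"/>
    <block>Here is the <value expr="topic"/> report.</block>
  </form>
</vxml>
```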

The form.item elements, shown in Figure 3-4, play a particular role in the Form Interpretation Algorithm.

The element types that each form item can contain are covered in more detail in the respective element type sections below.
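The following sketch shows three form items in one form. It assumes the usual behavior of the Form Interpretation Algorithm: on each pass it selects the first form item, in document order, whose form item variable is still undefined. The first block therefore runs once, the field is visited until the caller fills it, and the final block runs last. The form id and prompt text are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <!-- Form item 1: a block, executed once on the first pass. -->
    <block>Welcome to the order line.</block>

    <!-- Form item 2: a field, revisited until its variable "ready" is filled. -->
    <field name="ready" type="boolean">
      <prompt>Would you like to place an order?</prompt>
    </field>

    <!-- Form item 3: a block, reached only after the field has a value. -->
    <block>
      You answered <value expr="ready ? 'yes' : 'no'"/>.
    </block>
  </form>
</vxml>
```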

Another abstract element type that appears in Figure 3-3 is executable.content. Executable content elements are typically found as children of the block, filled, if, or event.handler elements. Figure 3-5 shows the element types that can be interpreted as executable.content. The VoiceXML interpreter typically treats such elements as procedural statements.

The element types that can be children of the executable.content elements are discussed in their respective sections below. Note that the abstract element type variableAccessor shown in Figure 3-5 is not actually in the DTD; it is added here for clarity.
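The procedural flavor of executable content is easiest to see in a filled element. The sketch below is an illustration only: the variable names, the four-digit rule, and the dialog ids are assumptions, and the built-in digits type is assumed to return the spoken digits as a string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <var name="attempts" expr="0"/>
    <field name="pin" type="digits">
      <prompt>Please say your PIN.</prompt>
      <filled>
        <!-- Executable content runs top to bottom, like statements in a program. -->
        <assign name="attempts" expr="attempts + 1"/>
        <if cond="pin.length == 4">
          <prompt>Thank you.</prompt>
          <goto next="#done"/>
        <else/>
          <prompt>A PIN has four digits.</prompt>
          <!-- Clearing the field variable makes the interpreter visit the field again. -->
          <clear namelist="pin"/>
        </if>
      </filled>
    </field>
  </form>

  <form id="done">
    <block>Goodbye. <exit/></block>
  </form>
</vxml>
```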

The audible element types represent audio that can be played back. It is worth noting that audible is implemented in the DTD as an entity named audio. Since there is also a VoiceXML element type audio, we will refer to this entity as audible in this text for the sake of disambiguation. Figure 3-6 lists the audible element types. You will notice that PCDATA (parsed character data) can be interpreted as audible content, as is the case with plain text processed by a text-to-speech processor.
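A short sketch of audible content inside a prompt follows. It mixes plain text (PCDATA), an audio element, and a value element. The file name chime.wav and the variable balance are hypothetical; the text inside audio is assumed to be spoken only if the recording cannot be played.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="balance_report">
    <var name="balance" expr="42"/>
    <block>
      <prompt>
        <!-- PCDATA: rendered by the text-to-speech processor -->
        Welcome back.
        <!-- audio: a recording, with fallback text as its content -->
        <audio src="chime.wav">chime</audio>
        <!-- value: the result of an expression, spoken as text -->
        Your balance is <value expr="balance"/> dollars.
      </prompt>
    </block>
  </form>
</vxml>
```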

The tts element types are new to VoiceXML 2.0 and are used to either organize or mark up text to be processed by a text-to-speech
processor. As we can see from Figure 3-3, choice and prompt element types can be parents of tts element types.
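As a hedged illustration of text markup inside a prompt, the sketch below uses the emphasis, break, and prosody element names from the W3C speech synthesis markup. These names are an assumption on our part: they match the final VoiceXML 2.0 Recommendation, and the exact set of tts element types in a given Working Draft implementation may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="shipping">
    <block>
      <prompt>
        <!-- emphasis: stress a word -->
        Your order ships <emphasis>today</emphasis>.
        <!-- break: insert a pause -->
        <break time="500ms"/>
        <!-- prosody: slow the speaking rate for the enclosed text -->
        <prosody rate="slow">Please have your confirmation number ready.</prosody>
      </prompt>
    </block>
  </form>
</vxml>
```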

While the element types for controlling audio output, namely audible and tts, can appear as children of various VoiceXML element types, the element types for controlling audio input appear only as children of the grammar element type. Figure 3-8 shows the taxonomy of a grammar definition written in GRXML, the XML grammar definition format.
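To show where input-related element types sit, here is a sketch of an inline XML grammar inside a field. The rule, one-of, and item element names, the root and version attributes, and the media type are assumptions taken from the W3C Speech Recognition Grammar Specification 1.0; a Working Draft implementation may use different names or a different media type. The field name and word list are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="drinks">
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <!-- All input-related elements are nested inside grammar. -->
      <grammar type="application/srgs+xml" version="1.0"
               root="drink" mode="voice" xml:lang="en-US">
        <rule id="drink">
          <one-of>
            <item>coffee</item>
            <item>tea</item>
            <item>milk</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <prompt>One <value expr="drink"/>, coming up.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```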