Creating VoiceXML Applications With Perl

Introduction

VoiceXML is an XML-based language used to create Web content and
services that can be accessed over the phone. Not just those nifty
WAP-enabled "Web phones", mind you, but the plain old clunky home
models that you might use to order a pizza or talk to your Aunt
Mable. While HTML presumes a graphical user interface to access
information, VoiceXML presumes an audio interface where
speech and keypad tones take the place of the screen, keyboard, and
mouse. This month we will look at a few samples that demonstrate how
to create dynamic voice applications using VoiceXML, Perl, and CGI.

Reach Out and Surf Somewhere

To demonstrate how easy it can be to make existing Web content
available over the phone, we will create a simple Perl CGI script that
fetches an RSS channel file containing a list of the most recent
uploads to CPAN and converts parts of it to VoiceXML so that it may be
accessed over the phone via a VoiceXML gateway.

use strict;
use XML::XPath;
use LWP::UserAgent;

After loading the necessary modules, we begin our script by creating
new HTTP::Request and LWP::UserAgent objects. We then call
LWP::UserAgent's simple_request method to ask the remote server for
the RSS file.
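As a rough sketch of this step, the request and user agent might be set up as shown here. The feed URL, the agent string, the timeout, and the helper names build_request and fetch_feed are placeholders for illustration only; substitute the location of the real RSS channel file.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

# Hypothetical feed location; replace with the real RSS channel URL.
my $rss_url = 'http://example.com/cpan-uploads.rss';

# Build a plain GET request for the given URL.
sub build_request {
    my ($url) = @_;
    return HTTP::Request->new(GET => $url);
}

# Send the request once and return the HTTP::Response object.
# simple_request does not follow redirects, unlike request().
sub fetch_feed {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 30);   # timeout is an assumption
    $ua->agent('voicexml-rss-sketch/0.1');         # placeholder agent string
    return $ua->simple_request(build_request($url));
}
```

Calling fetch_feed($rss_url) returns a response object whether or not the request succeeded, so the caller still has to check it before using the content.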

Now that the request has been made, we will begin the VoiceXML
output. We start by creating the mandatory vxml root
element and a minimal form that contains a single
block element. Inside the block element we
put an audio element that asks the user to be patient
while the RSS file is processed and a goto element that
tells the VoiceXML browser to jump to the section of the current
document labeled "headlines".
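The opening of the VoiceXML document could be produced along these lines. This is only a sketch: the wording of the prompt, the version attribute, and the helper name vxml_preamble are assumptions. The vxml element is deliberately left open because the rest of the script appends further forms before closing it.

```perl
use strict;
use warnings;

# Return the start of the VoiceXML document: the root element and a
# minimal form whose single block plays a prompt and then jumps to
# the form labeled "headlines" later in the same document.
sub vxml_preamble {
    return <<'VXML';
<?xml version="1.0"?>
<vxml version="1.0">
<form>
<block>
<audio>
Please hold while the RSS file is fetched and processed.
</audio>
<goto next="#headlines"/>
</block>
</form>
VXML
}
```

In a CGI script the result would be printed after an HTTP header, for example a Content-type line of text/xml, though the exact type a given VoiceXML gateway expects may differ.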

Next we test the response object to ensure that we received the
remote RSS file. If the file was successfully fetched, we create a new
XML::XPath instance and pass it the content section of
the response object for parsing. If anything goes awry during the
request, or while parsing the returned content, we trap the error in
the scalar $error for later processing. Although the
eval block that wraps the initial call to
XML::XPath adds a little overhead to the script, it
gives us a way to fail gracefully in the event of a
parsing error. Without the surrounding eval, a parser
error would cause the script to die unexpectedly.
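One way to sketch the response check and the guarded parse is shown here. The helper name parse_response and the error messages are ours, not part of any module. Because XML::XPath appears to defer the actual parsing until the first query, the sketch runs a trivial query inside the eval so that a malformed document is caught there rather than later.

```perl
use strict;
use warnings;
use HTTP::Response;
use XML::XPath;

# Returns ($xp, $error); exactly one of the two is defined.
sub parse_response {
    my ($response) = @_;

    # A failed request never reaches the parser.
    return (undef, 'Request failed: ' . $response->status_line)
        unless $response->is_success;

    my $xp;
    my $ok = eval {
        $xp = XML::XPath->new(xml => $response->content);
        $xp->find('/*');    # force parsing inside the eval
        1;
    };
    return (undef, 'Parse failed: ' . ($@ || 'unknown error'))
        unless $ok;

    return ($xp, undef);
}
```

Returning the error as a value, rather than letting it propagate, lets the script decide later whether to play an apology to the caller or the list of headlines.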

If the RSS file has been fetched and parsed successfully we create
a new form element; then, using an audio
element inside a block wrapper, we tell the caller about
the success and prepare them to hear the list of modules.

else {
    print qq*
    <form id="headlines">
      <block>
        <audio>
          The RSS file has been fetched and processed successfully. The
          following modules have recently been up loaded to c pan.
        </audio>
      </block>
      <block>
    *;

Next we loop through all the item elements in the RSS
document. For each item element encountered we print a
corresponding audio element for our VoiceXML document
using the value of each item's title child element as the
text.
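The loop can be sketched as follows, assuming a simple RSS document without namespaces. The sample feed, the module titles in it, and the helper names xml_escape and items_to_audio are invented for illustration. Titles are escaped before printing so that a stray ampersand or angle bracket does not break the generated VoiceXML.

```perl
use strict;
use warnings;
use XML::XPath;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build one <audio> element per <item>, using the item's <title>.
sub items_to_audio {
    my ($xp) = @_;
    my $out = '';
    foreach my $item ($xp->findnodes('//item')) {
        # The second argument makes the path relative to this item.
        my $title = $xp->findvalue('title', $item);
        $out .= '<audio>' . xml_escape("$title") . "</audio>\n";
    }
    return $out;
}

# Hypothetical sample feed standing in for the real CPAN channel.
my $sample = <<'RSS';
<rss version="0.91">
<channel>
<title>Sample uploads</title>
<item><title>Foo-Bar 1.02</title></item>
<item><title>Text-This &amp; That 0.1</title></item>
</channel>
</rss>
RSS

print items_to_audio(XML::XPath->new(xml => $sample));
```

In the full script these audio elements would be printed inside the second block of the "headlines" form, followed by the closing block, form, and vxml tags.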

While this script is not terribly useful in and of itself, think
for a moment about exactly what we have done here. In a few
lines of code we have taken a resource from a distant part of the Web,
extracted the information that we care about, and made that
information available from any phone anywhere in the world.