GNU Bayonne Is for Telephony

Three years ago I came to realize that we
had a serious need in free software. Although free software had
expanded to fill almost every other void in the enterprise
infrastructure, we had not addressed the needs of
telecommunications. Telecommunications are not only a part of the
infrastructure of every business, but they are also an often
overlooked part of the desktop user's experience. At the same time,
the hardware required to create telephony services for the public
telephone network has become more widely available under commodity
PC platforms and operating systems, including GNU/Linux.

In choosing to address telecommunications with free software,
I and a few others decided to create a framework describing what
all these services might be, ranging from the needs of desktop
users and application programmers to the needs of the largest
commercial carriers. This project later became known as GNUCOMM
when it was officially folded into a GNU project working
group.

One area we chose to define was the idea of a telephony
application server. Such a server should make it both possible and
easy to create and deploy new telephony application services. These
would be applications specifically written to interact with real
people who call the server over regular telephone lines and
interact with the application using both voice and the telephone
keypad.

Applications of this nature typically include things like
voice-mail systems or prepaid (debit card) calling platforms. All
of these are complex, sometimes programmable, systems, and
specialized computer telephony hardware is needed to provide an
interface between the PC platform and the public telephone network.
This can be hardware that talks to individual analog telephone
lines or even hardware that provides multiport voice control over
ISDN and T1 digital voice circuits, which larger enterprises can
get directly from a local carrier's central office.

With full consideration that such systems in the past were
generally very expensive, always proprietary and often hard to
program, I chose to solve all of these problems at once by writing
a server under the best supported free software platform available
at the time: GNU/Linux.

When we started the project, few companies provided telephony
hardware under GNU/Linux, so we used what was available. Even now,
each telephony card is different from every other one and tends to
include its own API. Since neither the hardware nor the APIs are in
any manner standardized, most people who produce telephony
applications do so for only a single vendor's card family, and they
do so using exclusively the vendor's supplied API. This practice
also means that any vendor in the computer telephony business has
to provide a very broad family of hardware because one cannot
easily substitute other products to fill gaps in a product offering.
All these things have made it difficult for new telephony card
vendors to come into existence and easy for the limited vendors
that do exist to maintain their markets without much change.

This is not to say that no efforts were made to standardize
APIs. After all, there is the ECTF (Enterprise Computer Telephony
Forum). Being an industry consortium of
proprietary vendors, they would have to come up with, through
committees, a complicated set of standards and proposals for how
proprietary vendors could develop and maintain computer telephony
solutions. Furthermore, they would need to do so in ways that
expand the need for specialized knowledge, increasing the
stranglehold of their existing members on the computer telephony
marketplace.

Another popular organization is the ITU (International
Telecommunication Union), best known for the fact that appointments
to it often are handled by national governments. In the US, for
example, this is done as a political appointment by the State
Department, rather than from among the best and brightest minds.

Our goal was not only to produce a telephony server as free
software but also to make telephony application services
as readily and easily approachable as creating and administering a
web site. We also wanted to abstract the telephony driver and APIs
to the point that they were both irrelevant and invisible in the
development of application services. Doing so would mean anyone
could substitute hardware as they wished, rather than being locked
into the offering of a single vendor.

First Came ACS

Since we wanted to abstract everything within the server at a
low level, the first thing we needed was a portable class
foundation written in C++. I wanted to use C++ for several reasons.
First, it seemed natural to use class encapsulation for the driver
interfaces because of their abstract nature. Second, I found I
could write bug-free C++ code faster than I could write C code. In
fact, this would become my first large-scale C++ project.
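
To illustrate the class-encapsulation idea, here is a minimal sketch
of how a driver interface might be abstracted. It is not the server's
actual driver API: the names TelephonyDriver, AnalogCardDriver,
DigitalTrunkDriver and greetCaller, and the file name welcome.au, are
hypothetical and chosen only for this example.

```cpp
#include <string>

// Hypothetical abstract interface; the real server's driver classes
// are not shown in this article and will differ.
class TelephonyDriver {
public:
    virtual ~TelephonyDriver() = default;
    virtual int portCount() const = 0;
    // Returns a textual trace standing in for a card-specific API call.
    virtual std::string playAudio(int port, const std::string& file) = 0;
};

// Stand-in for a card that talks to individual analog lines.
class AnalogCardDriver : public TelephonyDriver {
public:
    int portCount() const override { return 4; }
    std::string playAudio(int port, const std::string& file) override {
        return "analog line " + std::to_string(port) + " plays " + file;
    }
};

// Stand-in for a T1 card; a T1 circuit carries 24 voice channels.
class DigitalTrunkDriver : public TelephonyDriver {
public:
    int portCount() const override { return 24; }
    std::string playAudio(int port, const std::string& file) override {
        return "t1 timeslot " + std::to_string(port) + " plays " + file;
    }
};

// Application logic sees only the abstract interface, so the card
// underneath can be swapped without touching this function.
std::string greetCaller(TelephonyDriver& driver, int port) {
    return driver.playAudio(port, "welcome.au");
}
```

Calling greetCaller with either driver object runs the same
application logic; only the object passed in decides which hardware
path is taken.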

Why we chose not to use an existing framework is also simple
to explain. We knew we needed threading, socket support and a few
other elements. No single existing framework did all these things
except a few that were larger and more complex than we needed. For
example, we wanted a small footprint for a telephony server. The
most adaptable framework at the time was ACE (Adaptive
Communication Environment), which typically added several megabytes
of core image for the runtime library. Since we were looking at
running on machines with as little as 8-12MB of memory, this
seemed an unacceptable overhead.

GNU Common C++ (originally APE) was created to provide an
easy-to-comprehend and portable class abstraction for threads,
sockets, semaphores, exceptions and so on. APE has since grown and
is now used as a foundation for a number of projects in addition to
being a part of GNU.

As to creating services themselves, we realized we needed a
new way to create telephony applications—one that would make the
process approachable for the average system administrator. For
simplicity we chose to use a common scripting language, which
later became known as GNU ccScript. By writing scripts and
recording audio samples to create telephony application services,
virtually anyone could participate without needing specialized
knowledge or deep understanding of fantastically complex APIs like
those promoted by the ECTF. Because the underlying telephony
hardware is both invisible and abstracted away from the application
scripting language, the cycle of dependence on using a single card
family is also broken.

But what form should this new scripting language take? Many
extension languages assume a separate execution instance (thread or
process) for each interpreter instance, making them unsuitable for
our project. Many extension languages assume expression parsing
with nondeterministic runtime. An expression could invoke recursive
functions or entire subprograms, for example. Again, we did not
want to have a separate execution instance for each interpreter
instance, and we did want each instance to respond to the leading
edge of an event callback from the telephony driver as it steps
through a state machine, so none of the existing common solutions
like Tcl, Perl, Guile, etc., would immediately work for us.
Instead, we created an entirely new nonblocking and
deterministic scripting engine for our first server.

Our scripting language is unique in several ways. First of
all, it is step executed and nonblocking. Statements can either
execute and return immediately or schedule their completion for a
later time with the executive. This allows a single thread to
invoke and manage multiple interpreter instances. While a telephony
server can potentially support interactions with hundreds of
simultaneous telephone callers on high-density carrier scale
hardware, we do not require hundreds of native “thread” instances
running in the server, and we have a very modest CPU load.
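
The step-executed model can be sketched in a few lines. This is an
illustration of the general technique only, not GNU ccScript itself:
the names StepResult, Statement, Session and runPass are assumptions
made for the example. Each statement either finishes at once or marks
the session as waiting, and a single loop gives every session at most
one step per pass.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the ccScript API.
enum class StepResult { Continue, Wait };

struct Session;
using Statement = std::function<StepResult(Session&)>;

struct Session {
    explicit Session(const std::vector<Statement>& s) : script(&s) {}

    const std::vector<Statement>* script;  // shared, read-only script
    std::size_t pc = 0;                    // index of the next statement
    bool waiting = false;                  // true while awaiting completion
    std::string log;                       // trace of executed statements

    bool finished() const { return pc >= script->size(); }

    // Execute at most one statement and return; never block.
    void step() {
        if (finished() || waiting) return;
        StepResult r = (*script)[pc](*this);
        ++pc;
        if (r == StepResult::Wait) waiting = true;
    }

    // Called when the driver reports the pending operation is done.
    void complete() { waiting = false; }
};

// One thread services every session, one step each per pass.
void runPass(std::vector<Session>& sessions) {
    for (Session& s : sessions) s.step();
}
```

A statement such as "play a prompt" would return StepResult::Wait and
the session would simply be skipped on later passes until complete()
is called for it, so a slow caller never holds up the others.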

Another way our scripting is unique is in support for
memory-loaded scripts. To avoid delay or blocking while loading
scripts, all scripts are loaded and parsed into a virtual machine
(VM) structure in memory. When we wish to change scripts, a brand
new VM instance is created to contain these scripts. Calls
currently in progress continue under the old VM, and new callers
are offered the new VM. When the last old call terminates, then the
entire old VM is disposed of. This allows for 100% uptime, even
while services are modified.
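
One way to picture this hand-over is with reference counting. The
sketch below uses std::shared_ptr as a stand-in for whatever
bookkeeping the server really does; ScriptImage, Call and Server are
hypothetical names, and the "parsed" scripts are plain strings to keep
the example short.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical: an immutable, fully parsed set of scripts in memory.
struct ScriptImage {
    int generation = 0;
    std::map<std::string, std::string> scripts;  // name -> parsed body
};

// Each call pins the image that was current when the call arrived.
struct Call {
    std::shared_ptr<const ScriptImage> image;
};

class Server {
public:
    // Build a brand new image and make it current. Calls already in
    // progress keep their own reference to the previous image.
    void load(std::map<std::string, std::string> scripts) {
        auto fresh = std::make_shared<ScriptImage>();
        fresh->generation = ++generation_;
        fresh->scripts = std::move(scripts);
        current_ = std::move(fresh);
    }

    // New callers are always offered the current image.
    Call accept() const { return Call{current_}; }

private:
    int generation_ = 0;
    std::shared_ptr<const ScriptImage> current_;
};
```

When the last Call holding an old image goes away, the reference
count drops to zero and that image is freed, which mirrors the
disposal of the old VM described above.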

Finally, since we were building a C++ scripting system, we
allowed direct class extensions of the script interpreter as a
means to add new script functionality. This allows one to create a
derived dialect specific to a given application or, if needed,
specific to a given telephony driver, simply by deriving it from
the core language through standard C++ class extension.
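
As a rough sketch of what deriving a dialect might look like, assume
a base interpreter that dispatches on the first word of each script
line. The class names ScriptInterpreter and PrepaidInterpreter and the
commands play, hangup and debit are invented for this example and are
not taken from the real class hierarchy.

```cpp
#include <sstream>
#include <string>

// Hypothetical core interpreter with a small built-in command set.
class ScriptInterpreter {
public:
    virtual ~ScriptInterpreter() = default;

    // Split a line into keyword and argument, then dispatch.
    std::string execute(const std::string& line) {
        std::istringstream in(line);
        std::string keyword, arg;
        in >> keyword;
        std::getline(in >> std::ws, arg);  // rest of line, may be empty
        return dispatch(keyword, arg);
    }

protected:
    virtual std::string dispatch(const std::string& keyword,
                                 const std::string& arg) {
        if (keyword == "play") return "playing " + arg;
        if (keyword == "hangup") return "hanging up";
        return "unknown command: " + keyword;
    }
};

// A derived dialect for a prepaid calling application: it adds one
// command and defers everything else to the core language.
class PrepaidInterpreter : public ScriptInterpreter {
protected:
    std::string dispatch(const std::string& keyword,
                         const std::string& arg) override {
        if (keyword == "debit") return "debiting account by " + arg;
        return ScriptInterpreter::dispatch(keyword, arg);
    }
};
```

The core interpreter rejects "debit 5", while the derived one accepts
it and still understands every core command.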

While the server scripting language can support the creation
of complete telephony applications, it was not designed to be a
general-purpose programming language or to integrate with external
libraries the way traditional languages do. Nonblocking requires
that any module extensions created for the server be highly
customized. Instead, we wanted a general-purpose way to create
script extensions that could interact with databases or other
system resources. To that end we chose a model essentially similar
to the one web servers used at the time our ACS (Adjunct
Communication Server) Project was created.

The TGI (Telephony Gateway Interface) model for our server is
similar to how CGI works for a web server. In TGI, a separate
process is started and then passed
information on the phone caller through environment variables.
Environment variables rather than command-line arguments are used
to prevent snooping of transactions that might include things such
as credit-card information that could be visible with a simple ps
command.
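
From the gateway program's side, picking up caller information then
amounts to reading its own environment. The sketch below assumes two
made-up variable names, EXAMPLE_CALLER_ID and EXAMPLE_DIGITS; the
names the server actually exports are not given in this article, so
treat these purely as placeholders.

```cpp
#include <cstdlib>
#include <string>

// Read one environment variable, with a fallback when it is unset.
std::string envOr(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : fallback;
}

// Hypothetical per-call data a gateway might receive.
struct CallerInfo {
    std::string callerId;  // who is calling
    std::string digits;    // keypad digits collected by the script
};

// The variable names here are placeholders, not the server's own.
CallerInfo readCallerInfo() {
    return CallerInfo{envOr("EXAMPLE_CALLER_ID", "unknown"),
                      envOr("EXAMPLE_DIGITS", "")};
}
```

Nothing sensitive appears in the argument list this way, so a process
listing shows only the gateway's name and not the data passed to it.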

The TGI process is tethered to the server through stdout, and any
output the TGI application generates is used to invoke server
commands. These
commands can do things like set return values, such as the result
of a database lookup, or they can do things like invoke new
sessions to perform outbound dialing. Rather than creating a
gateway for each concurrent call session, a pool of available
processes is maintained for TGI gateways so that they can be treated
as a restricted resource. It is assumed that gateway execution time
represents a small percentage of the total call time, so
maintaining a small process pool always available for quick TGI
startup is efficient. This helps to prevent stampeding if, say, all
the callers hit a TGI at the same moment.
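
The effect of treating gateways as a restricted resource can be shown
with a small bookkeeping sketch. It tracks only slots and a waiting
queue and does not start real processes; GatewayPool and its member
names are hypothetical and say nothing about how the server
implements its pool.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical fixed-size pool of gateway slots.
class GatewayPool {
public:
    explicit GatewayPool(std::size_t size) : idle_(size) {}

    // Ask for a gateway on behalf of a call. Returns true if one was
    // idle and starts now, false if the request had to be queued.
    bool request(const std::string& callId) {
        if (idle_ > 0) {
            --idle_;
            running_.push_back(callId);
            return true;
        }
        waiting_.push_back(callId);
        return false;
    }

    // A gateway finished: pass its slot to the oldest waiting call,
    // or mark the slot idle if nobody is waiting.
    void finish(const std::string& callId) {
        auto it = std::find(running_.begin(), running_.end(), callId);
        if (it == running_.end()) return;  // not running: ignore
        running_.erase(it);
        if (!waiting_.empty()) {
            running_.push_back(waiting_.front());
            waiting_.pop_front();
        } else {
            ++idle_;
        }
    }

    std::size_t runningCount() const { return running_.size(); }
    std::size_t waitingCount() const { return waiting_.size(); }

private:
    std::size_t idle_;
    std::vector<std::string> running_;
    std::deque<std::string> waiting_;
};
```

With a pool of two, a third simultaneous request waits in the queue
instead of spawning a third process, so a burst of callers cannot
drive the number of gateway processes past the configured limit.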

With these basic tools, it was possible to create interactive
voice response applications. As soon as it was functional, our
first telephony server was used commercially by Open Source Telecom
and other companies. This wide adoption resulted in part from how
simple it is to create new application services and to integrate
telephony applications under this server with other aspects of a
commercial enterprise. As noted, the only requirements are some
skill in constructing a server-side script, the ability to play and
record audio and some knowledge of common tools like Perl.

A typical application for our server might look like the one
shown in Listing 1 [available at
ftp.linuxjournal.com/pub/lj/listings/issue100/6077.tgz],
the playrec script. This script demonstrates the different concepts
in the current scripting language, including symbol scope and event
trapping, which, used under named script references, form a chain
of logic for processing an interactive telephony application. In
Listing 2 [available at
ftp.linuxjournal.com/pub/lj/listings/issue100/6077.tgz],
we have an example of the server's use of Perl with the TGI.pm
module and the tgigetdbval.pl Perl script.