Introducing SOAP

SOAP is something you may find a use for, even if you're not interested in three-tier web applications.

In the January and February installments
of “At the Forge”, I demonstrated a simple three-tier web
application using a database, web server and the Mason templating
system for mod_perl. We were able to see some of the advantages and
disadvantages of a three-tier web application, particularly when
compared with its two-tier counterpart.

But as I pointed out last month, our three-tier architecture
was incomplete and wasn't necessarily a fair demonstration. That's
because our Perl middleware object layer had to reside on the same
computer as the components we wrote for HTML::Mason, a templating
system built on mod_perl. Depending on how you count things, this
might be considered a two-tier application, albeit one with an
object-oriented abstraction layer between the tiers.

In order to put the Mason components and Perl objects on
separate computers, we somehow need the ability to call an object
method across a network. That is, we want the following line of
Perl to work regardless of whether $object resides on the same
computer as our Apache server or somewhere else on the Internet:

$object->method($arg1, $arg2);

Distributed-object technology and remote-procedure calls have
existed for many years on a variety of platforms. In almost every
case, this technology was restricted to a particular language or
platform. DCOM (Distributed Component Object Model) allows objects
written in any language to communicate but only under Windows.
Java's RMI (Remote Method Invocation) can only communicate with
other Java objects. CORBA is an exception to this, allowing objects
to communicate across platforms and languages, but CORBA is
complex, has taken a while to get off the ground and isn't yet a
part of most programmers' knowledge base.

In response to these proprietary and complex protocols, a
number of people in the Internet community have created SOAP, the
Simple Object Access Protocol, which makes it extremely easy to
create distributed applications. Two of the biggest proponents of
SOAP have been Dave Winer (famous for his Scripting News
“weblog”) and Microsoft, which is not usually associated with
open standards and cross-platform protocols. Regardless of what we
in the Linux community might think, Microsoft has publicly embraced
SOAP, making it a cornerstone of its .NET effort.

SOAP History and Concepts

SOAP depends on the idea that any two computers on the
Internet can communicate using HTTP, the protocol that powers the
Web. (Actually, SOAP can be transmitted over nearly any high-level
protocol, including SMTP and POP3, but HTTP is by far the most
common.) It then transmits information using XML, the markup
language that allows us to create tags and document standards. The
server turns the incoming XML into an object method call, and then
turns the object's response into an XML document that is returned
as the HTTP response. Since both HTTP and XML are open standards
(HTTP is published by the IETF, XML by the World Wide Web
Consortium), they can be (and are) implemented on a variety of
platforms, so clients and servers running on different systems can
interoperate without any trouble.
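
To make the XML half of this exchange concrete, here is a minimal
sketch of how a client might wrap a method call in a SOAP 1.1
envelope using nothing but core Perl. The helper names
(xml_escape, soap_request), the urn:Foo/Bar namespace and the
arg1, arg2 element names are assumptions made for illustration; a
real SOAP library would also encode data types and handle faults.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                  '"' => '&quot;', "'" => '&apos;');
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Build a SOAP 1.1 envelope asking the server to call $method.
# $namespace is a hypothetical URI identifying the remote object;
# $method is assumed to be a valid XML element name.
sub soap_request {
    my ($namespace, $method, @args) = @_;
    my $params = '';
    my $n = 0;
    for my $arg (@args) {
        $n++;
        $params .= sprintf "      <arg%d>%s</arg%d>\n",
                           $n, xml_escape($arg), $n;
    }
    return <<"XML";
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <m:$method xmlns:m="@{[ xml_escape($namespace) ]}">
$params    </m:$method>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
XML
}

print soap_request('urn:Foo/Bar', 'method', 'first', 'a < b');
```

The server's reply would be a similar envelope whose body carries
the method's return value instead of its arguments.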

The predecessor to SOAP, known simply as XML-RPC, provided a
simple mechanism for remote procedure calls (RPC) using data
formatted in XML and transmitted over HTTP. For a variety of
reasons, including the fact that XML-RPC supports only a small,
fixed set of data types, SOAP was developed as a more general
protocol and submitted to the W3C, which published it as a Note
in 2000.

A number of languages and platforms continue to support
XML-RPC, and it's possible that some situations might call for its
use because it has a smaller overhead. Practically speaking,
however, SOAP has received so much attention that its libraries
have been developed, used and debugged far more extensively than
those for XML-RPC. As of this writing, there are also more
implementations of SOAP than of XML-RPC, so your choice of
platform or language might force your hand toward one protocol or
the other.

SOAP, as its name implies, expects to work with objects
rather than simple procedure calls. Thus, a SOAP client invokes a
method on a particular object on the server. The method is
specified in the body of the XML document itself, while the object
with which it is associated is named in an HTTP “SOAPAction”
header. Of course, we also need to specify a computer name and port
to which the SOAP request can be directed.
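
The division of labor just described, with the method in the XML
body and the object in the "SOAPAction" header, can be sketched
as the raw text of an HTTP POST. This is only an illustration in
core Perl: the function name soap_http_request, the host
soap.example.com, the path /soap and the object name Foo/Bar are
all made up, and real toolkits may compose the header value
differently.

```perl
use strict;
use warnings;

# Assemble the raw text of an HTTP POST carrying a SOAP call.
# Content-Length is computed with length(), which is only correct
# here because the envelope is assumed to be plain ASCII.
sub soap_http_request {
    my ($host, $port, $path, $object, $envelope) = @_;
    my @headers = (
        "POST $path HTTP/1.0",
        "Host: $host:$port",
        'Content-Type: text/xml; charset=utf-8',
        'Content-Length: ' . length($envelope),
        qq{SOAPAction: "$object"},   # names the object
    );
    # A blank line ends the headers; the XML envelope is the body.
    return join("\r\n", @headers) . "\r\n\r\n" . $envelope;
}

# The body names the method to invoke (here, "method").
my $envelope =
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV='
  . '"http://schemas.xmlsoap.org/soap/envelope/">'
  . '<SOAP-ENV:Body><method/></SOAP-ENV:Body>'
  . '</SOAP-ENV:Envelope>';

print soap_http_request('soap.example.com', 80, '/soap',
                        'Foo/Bar', $envelope), "\n";
```

Sending this text over a socket to the named host and port would
constitute the request; the sketch stops short of that so it can
run without a network.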

The server itself, including its name and the port number on
which the SOAP request is transmitted, is known as the SOAP proxy.
This makes sense when you consider that the HTTP server is simply
relaying an object method invocation and isn't doing any of this
work by itself. Do not confuse the SOAP proxy with an HTTP proxy.
An HTTP proxy relays requests from an HTTP client to an HTTP server
and often performs security checks and caching. A SOAP proxy, by
contrast, relays messages between a SOAP client and an object on
the proxy's computer.

The object for which the SOAP server acts as a proxy is
sometimes known as the endpoint and is specified in a
“SOAPAction” HTTP header. The name of the endpoint can be
virtually any text string, including hierarchy separators such as
:: and /. In practice, the endpoint has a direct connection to the
object hierarchy associated with the language in which the SOAP
proxy is written. In Perl, the endpoint might be something like
“Foo/Bar”, which refers to the Foo::Bar class defined in the
file Foo/Bar.pm.
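
As a rough sketch of that mapping, the following core-Perl code
turns an endpoint name into a package name and a file path, then
dispatches a method call to the package. The function name
endpoint_to_module, the stand-in Foo::Bar package and its greet
method are hypothetical, and an actual SOAP server may resolve
endpoints quite differently.

```perl
use strict;
use warnings;

# Turn an endpoint such as "Foo/Bar" or "Foo::Bar" into a Perl
# package name and the relative path of the file defining it.
sub endpoint_to_module {
    my ($endpoint) = @_;
    my @parts = grep { length } split m{::|/}, $endpoint;
    # Accept only plain identifiers, so that a request cannot
    # name arbitrary files such as "../etc/passwd".
    for my $part (@parts) {
        die "Bad endpoint: $endpoint\n"
            unless $part =~ /\A[A-Za-z_]\w*\z/;
    }
    die "Empty endpoint\n" unless @parts;
    return (join('::', @parts), join('/', @parts) . '.pm');
}

# Stand-in for a real Foo/Bar.pm so the sketch runs on its own.
package Foo::Bar;
sub greet { my ($class, $name) = @_; return "Hello, $name" }

package main;

my ($package, $file) = endpoint_to_module('Foo/Bar');
print "$package lives in $file\n";   # Foo::Bar lives in Foo/Bar.pm

# A server would normally "require $file" at this point.
my $method = 'greet';
if (my $code = $package->can($method)) {
    print $package->$code('world'), "\n";   # Hello, world
}
```

Checking the method with can() before calling it means a request
for an unknown method can be reported as an error rather than
crashing the server.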