SOAP

SOAP is the new standard for network communication between software services.
It is a general-purpose technology for sending messages between endpoints, and
may be used for RPC or straightforward document transfer. SOAP messages are
represented using XML and can be sent over any transport layer. HTTP is the most
common transport layer, with implementations also available for Simple Mail
Transfer Protocol (SMTP), Java Message Service (JMS), and IBM MQSeries (see
Figure 1.9).

The easiest way to publish a software component as a web service is to use a
SOAP container, which accepts incoming requests and dispatches them to published
components, automatically translating between SOAP and the component's
native language interface. SOAP containers are available for most programming
languages, including Java, C++, Perl, and C# (see Figure 1.10).

Once a component has been published as a web service, any SOAP-enabled client
that knows the network address of the service and the messages that it
understands can send a SOAP request and get back a SOAP response. To get the
address and message information, SOAP clients read a WSDL file that describes
the web service. Fortunately, most SOAP containers will automatically generate
WSDL for the web services that they host, so developers don't have to write
WSDL manually unless they really want to. Once the WSDL file is read, the client
can start sending SOAP messages to the web service (see Figure 1.11).

Publishing a Web Service

Before delving into the details of the SOAP protocol, I'll show you how
easy it is to create and invoke a web service using a modern language like Java.
The main thing to note is that no knowledge of SOAP or WSDL is necessary to
immediately become a productive web services developer.

The following example shows the steps that are necessary to publish an object
as a web service and then invoke it from a SOAP client. Although most examples
in this book are written in Java, it is important to note that SOAP is language
neutral and can support any combination of languages on the client and server.
Some examples of Java programs talking to C# programs using SOAP are presented
in the .NET chapter.

The object in this example is a simple stock trading service that defines a
single method for buying stock. The buy() method returns the cost of purchasing
a specified quantity of a particular stock. Here is the source code for the
ITrader interface.
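As a minimal sketch of what such an interface and its implementation might
look like: the argument order (quantity first, then symbol) follows the request
described later in this section, while the double return type, the
TradeException class, and the price table are assumptions made purely for
illustration.

```java
import java.util.Map;

/** Hypothetical service interface with a single method for buying stock. */
interface ITrader {
    /** Returns the cost of buying quantity shares of the given symbol. */
    double buy(int quantity, String symbol);
}

/** Hypothetical exception for an unrecognized ticker symbol. */
class TradeException extends RuntimeException {
    TradeException(String message) {
        super(message);
    }
}

/** Plain domain object: nothing here refers to SOAP, XML, or HTTP. */
class Trader implements ITrader {
    // Illustrative prices only; a real service would consult a market feed.
    private final Map<String, Double> prices = Map.of("IBM", 117.5, "MSFT", 69.25);

    @Override
    public double buy(int quantity, String symbol) {
        Double price = prices.get(symbol);
        if (price == null) {
            throw new TradeException("unknown symbol: " + symbol);
        }
        return quantity * price;
    }
}
```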

Notice that neither the interface nor the source code for the trader service
contains any code related to SOAP or web services. Most SOAP containers are able
to publish unmodified software components, which is good because domain objects
should not be coupled to details of distributed computing.

Each SOAP container has different Application Programming Interfaces (APIs)
for starting up an in-process HTTP server and for publishing objects as web
services. Here is the way that you would start an HTTP server on
http://localhost:8003/soap and export an instance of Trader using GLUE,
the web services platform included with this book. GLUE is described in more
detail in the next chapter.
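To give a feel for what a container does at this step without depending on
any particular product, here is a rough sketch built on the JDK's own
com.sun.net.httpserver.HttpServer. It is not GLUE's API: the MiniContainer
class and its startup(), publish(), and shutdown() methods are hypothetical
names, and the handler only reports whether a component is published at the
requested path rather than decoding SOAP.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a SOAP container's startup and publish calls. */
class MiniContainer {
    private final Map<String, Object> published = new ConcurrentHashMap<>();
    private final String basePath;
    private HttpServer server;

    MiniContainer(String basePath) {
        this.basePath = basePath;
    }

    /** Starts an in-process HTTP server; port 0 lets the OS pick a free port. */
    void startup(int port) {
        try {
            server = HttpServer.create(new InetSocketAddress("localhost", port), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext(basePath, exchange -> {
            String path = exchange.getRequestURI().getPath();
            boolean known = published.containsKey(path);
            // A real container would decode the SOAP request here; this sketch
            // only reports whether a component is published at the path.
            byte[] reply = (known ? "published" : "not found").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(known ? 200 : 404, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
    }

    /** Publishes a component under basePath + "/" + name and returns that path. */
    String publish(String name, Object component) {
        String path = basePath + "/" + name;
        published.put(path, component);
        return path;
    }

    boolean isPublished(String path) {
        return published.containsKey(path);
    }

    int port() {
        return server.getAddress().getPort();
    }

    void shutdown() {
        server.stop(0);
    }
}
```

With this sketch, creating a MiniContainer for "/soap", calling startup(8003),
and then publishing a trader object under the name "trader" would make it
reachable at http://localhost:8003/soap/trader.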

Binding to a Web Service

Once an object is published as a web service, a SOAP client can bind to it
and invoke it. For example, here's what a SOAP client written using GLUE
looks like. Fortunately, from a Java developer's viewpoint, a web service
can be invoked as if it were a local object, with all the details of SOAP and
WSDL hidden by the underlying infrastructure. Microsoft .NET provides a similar
mechanism for C# and Visual Basic developers.

The binding process returns a proxy that implements a Java interface whose
methods mirror those of the remote service. A message sent to the proxy is
automatically converted into a SOAP request, delivered across the network, and
the SOAP response is converted back into a regular Java result.

FIGURE 1.12 A client proxy hides the communication details from the application
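One way such a proxy could be built in Java is with java.lang.reflect.Proxy.
The sketch below is an assumption about the general technique, not a
description of how GLUE or .NET implement it. SoapProxyFactory, the arg0/arg1
element names, and the http://tempuri.org/Trader namespace are hypothetical,
the transport is a plain function so that no network is needed, and decoding
the reply is reduced to parsing a number.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

/** Hypothetical service interface mirroring the remote buy() method. */
interface ITrader {
    double buy(int quantity, String symbol);
}

class SoapProxyFactory {
    /**
     * Returns a proxy implementing iface. Each call is turned into a
     * SOAP-style XML request and handed to transport; the reply text is
     * parsed back into a double.
     */
    @SuppressWarnings("unchecked")
    static <T> T bind(Class<T> iface, String namespace, Function<String, String> transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            StringBuilder xml = new StringBuilder();
            xml.append("<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">");
            xml.append("<soap:Body>");
            xml.append("<n:").append(method.getName())
               .append(" xmlns:n=\"").append(namespace).append("\">");
            // Arguments are written as-is; a real proxy would escape XML
            // characters and add xsi:type information.
            for (int i = 0; args != null && i < args.length; i++) {
                xml.append("<arg").append(i).append('>')
                   .append(args[i])
                   .append("</arg").append(i).append('>');
            }
            xml.append("</n:").append(method.getName()).append('>');
            xml.append("</soap:Body></soap:Envelope>");
            String reply = transport.apply(xml.toString());
            return Double.parseDouble(reply);
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

The caller simply writes trader.buy(10, "IBM") against the returned object;
everything between the method call and the returned double happens inside the
invocation handler.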

When the TraderClient is executed, SOAP messages fly back and forth between
the client and server, translated automatically between XML and native calls by
the SOAP container. The first call succeeds and returns a value, whereas the
second call throws an exception because the symbol TME is not recognized.

Even without an explanation of the SOAP format, you can probably figure out
what most of it means. Contrast this with the CORBA and DCOM protocols, which
are binary, not self-describing, and tough to trace. I know this firsthand,
having written a CORBA ORB in a previous lifetime.

The first part of the SOAP request is a standard HTTP header that indicates
that the request is an HTTP POST operation whose Uniform Resource Identifier
(URI) is /soap/trader. The Content-Type field shows that the HTTP payload is
XML, and the SOAPAction field tells the remote host that the content is a SOAP
message. SOAPAction is often set to the name of the method to invoke so that the
host web server or firewall can perform some high-level message filtering.
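For illustration, the same header fields can be assembled with the standard
java.net.http.HttpRequest builder (Java 11 or later). This is only a sketch:
the SoapHttpRequests class is a hypothetical helper, the request is built but
never sent, and the charset parameter and the quoting of the SOAPAction value
follow common SOAP 1.1 practice rather than anything shown in this example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class SoapHttpRequests {
    /** Builds (but does not send) an HTTP POST that carries a SOAP payload. */
    static HttpRequest post(String url, String soapAction, String xml) {
        return HttpRequest.newBuilder(URI.create(url))
                // The payload is XML.
                .header("Content-Type", "text/xml; charset=utf-8")
                // The SOAPAction value is conventionally a quoted string.
                .header("SOAPAction", "\"" + soapAction + "\"")
                .POST(HttpRequest.BodyPublishers.ofString(xml))
                .build();
    }
}
```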

The second part of the SOAP request is an XML document that consists of three
main portions:

Envelope

The envelope declares the XML namespaces that are used by the rest of the SOAP
message, which typically include xmlns:soap (SOAP envelope namespace), xmlns:xsi
(XML Schema for instances), xmlns:xsd (XML Schema for data types), and
xmlns:soapenc (SOAP encoding namespace). More information about these namespaces
is presented later in this book.

Header

The header is an optional element for carrying auxiliary information for
authentication, transactions, routing, and payments. Any node in a SOAP
processing chain can add or delete items from the header; a node can also
ignore items that it does not recognize, unless they are marked mustUnderstand.
If a header is present, it must be the first child of the envelope. Because our
example is simple and does not involve routers, the header is absent.

Body

The body is the main payload of the message. When SOAP is used to perform an
RPC, the body contains a single element that carries the method name and
arguments. The namespace of the method name is specified by the web service, and
in this case is equal to http://tempuri.org/ followed by the type of the target
web service. The type of each argument can optionally be supplied using the
xsi:type attribute; in this example, the first argument is flagged as an xsd:int
and the second argument as an xsd:string. If a header is present, the body must
be its immediate sibling; otherwise it must be the first child of the envelope.
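The structure described above can be checked mechanically. The following
sketch uses the JDK's DOM parser to locate the call element and read the
xsi:type of each argument. The SAMPLE message is a hand-written approximation
of a buy() request: the arg0/arg1 element names, the 2001 XML Schema
namespaces, and the http://tempuri.org/Trader method namespace are
assumptions, and SoapRequests is a hypothetical helper class.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SoapRequests {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    // Hand-written approximation of a buy() request (no header).
    static final String SAMPLE =
          "<soap:Envelope xmlns:soap='" + SOAP_NS + "'"
        + " xmlns:xsi='" + XSI_NS + "'"
        + " xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<soap:Body>"
        + "<n:buy xmlns:n='http://tempuri.org/Trader'>"
        + "<arg0 xsi:type='xsd:int'>10</arg0>"
        + "<arg1 xsi:type='xsd:string'>IBM</arg1>"
        + "</n:buy>"
        + "</soap:Body>"
        + "</soap:Envelope>";

    /** Parses the payload with namespaces enabled and returns the root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
    }

    /** Returns the child elements of parent, skipping text and comments. */
    static List<Element> children(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                out.add((Element) n);
            }
        }
        return out;
    }

    static boolean isSoap(Element e, String localName) {
        return SOAP_NS.equals(e.getNamespaceURI()) && localName.equals(e.getLocalName());
    }

    /** Returns the single call element in the Body, enforcing the ordering rules. */
    static Element call(Element envelope) {
        if (!isSoap(envelope, "Envelope")) {
            throw new IllegalArgumentException("root is not a SOAP Envelope");
        }
        List<Element> parts = children(envelope);
        // The Body is first, unless a Header precedes it.
        int bodyIndex = (!parts.isEmpty() && isSoap(parts.get(0), "Header")) ? 1 : 0;
        if (parts.size() <= bodyIndex || !isSoap(parts.get(bodyIndex), "Body")) {
            throw new IllegalArgumentException("Body is missing or misplaced");
        }
        List<Element> calls = children(parts.get(bodyIndex));
        if (calls.size() != 1) {
            throw new IllegalArgumentException("expected exactly one call element");
        }
        return calls.get(0);
    }

    /** Returns the xsi:type of each argument ("" when the attribute is absent). */
    static List<String> argTypes(Element call) {
        List<String> types = new ArrayList<>();
        for (Element arg : children(call)) {
            types.add(arg.getAttributeNS(XSI_NS, "type"));
        }
        return types;
    }
}
```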

A SOAP request is typically accepted by a servlet, CGI or
standalone daemon running on the remote web server. In this example, the GLUE
SOAP container started a servlet running on localhost:8003/soap. When the
servlet gets a request, it checks that the request has a SOAPAction field, and
if it does, forwards it to the SOAP container. The container uses the POST URI
to look up the target web service, parses the XML payload, and then invokes the
method on the component.
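The dispatch steps just described might be sketched as follows. This is a
simplified illustration, not GLUE's actual implementation: MiniDispatcher and
SampleTrader are hypothetical, the sample request is hand-written, only
xsd:int and string arguments are decoded, and methods are matched by name and
argument count alone.

```java
import java.io.StringReader;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical published component with an illustrative fixed price. */
class SampleTrader {
    public double buy(int quantity, String symbol) {
        return quantity * 117.5;
    }
}

class MiniDispatcher {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";
    static final String SAMPLE_REQUEST =
          "<soap:Envelope xmlns:soap='" + SOAP_NS + "' xmlns:xsi='" + XSI_NS + "'"
        + " xmlns:xsd='http://www.w3.org/2001/XMLSchema'><soap:Body>"
        + "<n:buy xmlns:n='http://tempuri.org/Trader'>"
        + "<arg0 xsi:type='xsd:int'>10</arg0>"
        + "<arg1 xsi:type='xsd:string'>IBM</arg1>"
        + "</n:buy></soap:Body></soap:Envelope>";

    private final Map<String, Object> services = new HashMap<>();

    void publish(String uriPath, Object component) {
        services.put(uriPath, component);
    }

    /** Mirrors the steps in the text: check SOAPAction, look up, parse, invoke. */
    Object dispatch(String uriPath, String soapAction, String xml) {
        if (soapAction == null) {
            throw new IllegalArgumentException("missing SOAPAction header");
        }
        Object target = services.get(uriPath);
        if (target == null) {
            throw new IllegalArgumentException("no service published at " + uriPath);
        }
        Element call = elements(bodyOf(xml)).get(0);
        List<Object> args = new ArrayList<>();
        for (Element arg : elements(call)) {
            args.add(decode(arg));
        }
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(call.getLocalName())
                    && m.getParameterCount() == args.size()) {
                try {
                    return m.invoke(target, args.toArray());
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException("invocation failed", e);
                }
            }
        }
        throw new IllegalArgumentException("no method named " + call.getLocalName());
    }

    private static Element bodyOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element envelope = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return (Element) envelope.getElementsByTagNameNS(SOAP_NS, "Body").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
    }

    /** Decodes xsd:int to Integer; everything else is passed as a String. */
    private static Object decode(Element arg) {
        String text = arg.getTextContent().trim();
        if (arg.getAttributeNS(XSI_NS, "type").endsWith(":int")) {
            return Integer.valueOf(text);
        }
        return text;
    }

    private static List<Element> elements(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                out.add((Element) n);
            }
        }
        return out;
    }
}
```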

Anatomy of a SOAP Response

The result of the invocation is translated by the SOAP container into a SOAP
response and returned to the sender within the HTTP reply. Here's the
SOAP response from the buy() message sent to the Trader service, with the result
name and value highlighted for clarity.

The XML document is structured just like the request except that the body
contains the encoded method result. By convention, the name of the result is
equal to the name of the method followed by "Response", and the
namespace of the result is the same as the namespace of the original method.
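The naming convention is easy to express in code. In this sketch, the
SoapResponses class, the inner Result element name, and the namespace passed
in by the caller are assumptions chosen for illustration; only the "method
name plus Response, in the method's namespace" rule comes from the
description above.

```java
class SoapResponses {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Wraps a method result in a SOAP response envelope. */
    static String encode(String methodNamespace, String methodName,
                         String xsdType, Object result) {
        // By convention, the result element is named <method>Response and
        // lives in the same namespace as the original method.
        String resultName = methodName + "Response";
        return "<soap:Envelope xmlns:soap='" + SOAP_NS + "'"
             + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
             + " xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
             + "<soap:Body>"
             + "<n:" + resultName + " xmlns:n='" + methodNamespace + "'>"
             + "<Result xsi:type='xsd:" + xsdType + "'>" + result + "</Result>"
             + "</n:" + resultName + ">"
             + "</soap:Body>"
             + "</soap:Envelope>";
    }
}
```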

SOAP Exceptions

If an exception occurs at any time during the processing of a message, a SOAP
fault is generated and encoded in a manner similar to a regular SOAP response.
Here is the SOAP response that is returned when our example client attempts to
buy stock for a ticker symbol that is not recognized.

The standard HTTP reply header indicates an exception by using status code
500. The XML payload contains an envelope and body just like a regular response,
except that the content of the body is a soap:Fault structure whose fields are
defined as follows:

faultcode

A code that indicates the type of the fault. The valid values are soap:Client
(the message was incorrectly formed or lacked required information), soap:Server
(the message could not be processed for reasons unrelated to its content),
soap:VersionMismatch (invalid namespace for the Envelope element), and
soap:MustUnderstand (a header item marked mustUnderstand was not understood).

faultstring

A human-readable description of the fault.

faultactor

An optional field that indicates the URI of the source of the fault.

detail

An application-specific XML document that contains detailed information about
the fault.

Some SOAP implementations add an extra element to encode information about
remote exceptions, such as their type, data, and stack trace, so that they can
be rethrown automatically on the client.
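To make the fault structure concrete, here is a sketch that encodes a fault
on the server side and decodes it into an exception on the client side.
SoapFaults and RemoteFault are hypothetical names, only the faultcode and
faultstring fields are handled, and the choice of which faultcode to use for
an application error is left to the caller.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical client-side exception carrying the decoded fault fields. */
class RemoteFault extends RuntimeException {
    final String faultcode;

    RemoteFault(String faultcode, String faultstring) {
        super(faultstring);
        this.faultcode = faultcode;
    }
}

class SoapFaults {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // A fault travels in an HTTP reply with status code 500.
    static final int HTTP_STATUS = 500;

    /** Server side: wraps a fault code and description in a SOAP envelope. */
    static String encode(String faultcode, String faultstring) {
        return "<soap:Envelope xmlns:soap='" + SOAP_NS + "'>"
             + "<soap:Body><soap:Fault>"
             + "<faultcode>" + escape(faultcode) + "</faultcode>"
             + "<faultstring>" + escape(faultstring) + "</faultstring>"
             + "</soap:Fault></soap:Body></soap:Envelope>";
    }

    /** Client side: returns the fault as an exception, or null if there is none. */
    static RemoteFault decode(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            if (doc.getElementsByTagNameNS(SOAP_NS, "Fault").getLength() == 0) {
                return null;
            }
            String code = doc.getElementsByTagName("faultcode").item(0).getTextContent();
            String text = doc.getElementsByTagName("faultstring").item(0).getTextContent();
            return new RemoteFault(code, text);
        } catch (Exception e) {
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

A client proxy could call decode() on every reply and throw the returned
RemoteFault when it is not null, which gives the caller an ordinary Java
exception.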

Performance

Now that you've seen how SOAP messages are passed back and forth using
HTTP and XML, it is time to contemplate performance issues.

CORBA and DCOM use binary encoding for arguments and return values. In
addition, they assume that both the sender and the receiver have full knowledge
of the message context and do not encode any meta-information such as the names
or types of the arguments. This approach results in good performance, but makes
it hard for intermediaries to process messages. And since each system uses a
different binary encoding, it's hard to build systems that interoperate.

Because SOAP uses XML to encode messages, it's very easy to process
messages at every step of the invocation process. In addition, the ease of
debugging SOAP messages is leading to a quick convergence of the various SOAP
implementations, which is important because large-scale interoperability is what
SOAP is all about.

On the surface, it seems that an XML-based scheme would be intrinsically
slower than a binary one, but it's not as straightforward as that.

First, when SOAP is used for sending messages across the Internet, the time
to encode/decode the messages at each endpoint is tiny compared with the time to
transfer bytes between endpoints, so the overhead of XML is not significant in
this case.

Second, when SOAP is used to send messages between endpoints in a closed
environment, such as between departments within the same company, it's
likely that the endpoints will be running the same implementation of SOAP. In
this case, there are opportunities for optimizations that are unique to that
particular implementation. For example, a SOAP client could add an HTTP header
field to a SOAP request that indicates that it supports a particular
optimization. If the SOAP server also supports that optimization, it could
return an HTTP header field in the first SOAP response that tells the client
that it's okay to use that optimization in subsequent communications. At that
point, both the client and the server could start using the optimization.
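The handshake could look something like the sketch below, which models the
HTTP headers as plain maps so that it runs without a network. The header name
X-Soap-Optimization, the optimization name used in the usage note, and the
OptimizationHandshake class are all invented for illustration; neither SOAP
nor HTTP defines such a header.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class OptimizationHandshake {
    // Hypothetical header name; not defined by SOAP or HTTP.
    static final String HEADER = "X-Soap-Optimization";

    /** Client side: advertises an optimization on a request. */
    static Map<String, String> requestHeaders(String optimization) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "text/xml");
        headers.put(HEADER, optimization);
        return headers;
    }

    /** Server side: echoes the header only if the optimization is supported. */
    static Map<String, String> responseHeaders(Map<String, String> request,
                                               Set<String> supported) {
        Map<String, String> headers = new HashMap<>();
        String offered = request.get(HEADER);
        if (offered != null && supported.contains(offered)) {
            headers.put(HEADER, offered);
        }
        return headers;
    }

    /** Client side: true only once the server has confirmed the optimization. */
    static boolean agreed(Map<String, String> response, String optimization) {
        return optimization.equals(response.get(HEADER));
    }
}
```

A client that offers, say, "compact-encoding" keeps sending ordinary SOAP
until agreed() returns true for a response, so a server that does not
recognize the header is unaffected.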

The fastest SOAP implementations typically get at least 500 messages/second
on a 600MHz desktop PC when the client and the server are in different processes
on the same machine, and around 300 messages/second on a fast local area network
(LAN).

Other SOAP Features

The example in this section was very simple and demonstrated only a subset of
SOAP functionality. Additional features, many of which are covered later in this
book, include:

Arrays, objects, and other complex data structures may be sent across the
network in a platform- and language-neutral way.