Virtual Token Descriptor-XML can accelerate parsing for applications based on Web Services Security

Web Services Security (WSS) refers to a set of XML message-level standards designed to ensure the security of various aspects of SOA (service-oriented architecture). Yet, due largely to the inherent issues of DOM (Document Object Model) and SAX (Simple API for XML), the real-world implementations of WSS generally have poor performance characteristics that often fail to meet the requirements of production SOA deployment. With the advent of VTD-XML (Virtual Token Descriptor-XML), this is about to change fundamentally. Still, many problems with WSS are deeper than they appear, and overcoming them would inevitably require changes to the problems themselves.

The objectives of this article are:

To analyze the performance issues of DOM for WSS applications and look at how VTD-XML solves those issues.

To introduce XMLModifier, a new feature added in version 1.8 of VTD-XML, and show some of the latest VTD-XML benchmark numbers most relevant to WSS.

To identify some of the technical issues in WS signing and encryption and propose possible fixes.

The new clothes of WSS

If you are one of those enterprise developers spending considerable time tuning for better application performance, you probably are aware of the strategy involving the following steps:

Identify the performance bottleneck in the application.

Rewrite/optimize the corresponding code to eliminate the bottleneck.

While this tuning strategy is usually effective, it depends on an important underlying assumption: in the second step, you must have reasonable control of the code at the performance hot spot. Otherwise, if the most significant bottleneck lies in a well-known class library of the JDK itself, you are likely to get stuck in a quagmire.

Good examples of this problem are the real-world implementations of WSS. At the conceptual level, WSS is a set of message-level specifications designed to ensure the authenticity, confidentiality, and integrity of SOAP messages. A WSS endpoint takes an incoming SOAP message and computes security tokens (essentially XML fragments), which are then inserted into the original SOAP message. Unfortunately, most existing WSS implementations have poor performance characteristics. Certain WSS operations, such as WS signing and encryption, even have a reputation for being deadly slow.

While the computation of security tokens varies in complexity, a WSS application generally has to parse the entire SOAP message for the following two reasons:

The values of the security tokens are computed from SOAP data, which can be anywhere in the SOAP message.

The computed security tokens need to be inserted back into locations that can be anywhere in the SOAP message.

For those reasons, SAX is, generally speaking, not well-suited for WSS implementations because SAX parsers force developers to buffer the events or create their custom object models, both of which require undue implementation effort. DOM, on the other hand, provides much-needed power and flexibility, since a DOM tree resides in memory.

Unfortunately, DOM parsing is known to be memory- and CPU-intensive. But this is not the only problem. Inserting a security token, no matter how simple, requires that the entire incoming XML message be reserialized. And reserialization doesn't come cheap: It involves memory copying, buffer allocation, and character encoding.
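To make the cost concrete, here is a minimal sketch of the DOM round trip using only the JDK's built-in XML APIs. The class name DomRoundTrip, the Token element, and the sample SOAP message are made up for illustration; a real WSS token would be a far richer fragment. The point is only that a one-element change forces a full parse and a full reserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: not part of any WSS toolkit.
class DomRoundTrip {
    static String insertToken(String soap, String tokenValue) {
        try {
            // Step 1: parse the whole message into an in-memory node tree.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(soap.getBytes(StandardCharsets.UTF_8)));

            // Step 2: the change itself is tiny: one new child under Header.
            Element header = (Element) doc.getElementsByTagNameNS("*", "Header").item(0);
            Element token = doc.createElementNS(null, "Token"); // made-up element name
            token.setTextContent(tokenValue);
            header.appendChild(token);

            // Step 3: write every node back out, including all the untouched ones.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("DOM round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String soap = "<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                + "<s:Header/><s:Body><order>42</order></s:Body></s:Envelope>";
        System.out.println(insertToken(soap, "abc123"));
    }
}
```

Steps 1 and 3 both touch every byte of the message, so their cost grows with message size even though step 2 does not.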

Why are WSS implementations slow? DOM parsing is slow, and reserialization makes them a lot slower. Worse, both parsing and reserialization are inevitable with DOM — there doesn't seem to be an easy way out.

Facing a problem this obvious, I must ask: Do you see the same problem that I see?

Change is coming

My last two JavaWorld articles focused on two key benefits of VTD-XML: high-performance parsing and incremental update. Both are quite essential for a high-performance WSS implementation. Why? First, VTD-XML parses XML messages five to 10 times faster than DOM parsers, consumes just one-third to one-fifth of the memory, and, more importantly, exports a hierarchical view of XML Information Set (Infoset) that one can navigate back, forth, and sideways. Second, VTD-XML internally keeps an XML message intact and undecoded, meaning reserializing the parts of the SOAP message irrelevant to the security token computation is no longer necessary. When the security tokens are generated, just stick them anyplace you want in the message.
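The idea of leaving the message undecoded can be illustrated without VTD-XML at all. The sketch below is not VTD-XML code: ByteSplice and insertAt are hypothetical names, and the byte offset is found with a plain string search, which works only because the sample message is pure ASCII. In VTD-XML, that offset would come from the parsed token records instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a byte-level insert; not the VTD-XML API.
class ByteSplice {
    // Copies the message verbatim and drops the token in at the given byte offset.
    // Nothing is decoded, re-encoded, or rebuilt from a node tree.
    static byte[] insertAt(byte[] message, int offset, byte[] token) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(message.length + token.length);
        out.write(message, 0, offset);
        out.write(token, 0, token.length);
        out.write(message, offset, message.length - offset);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String xml = "<Envelope><Header></Header><Body>data</Body></Envelope>";
        byte[] message = xml.getBytes(StandardCharsets.UTF_8);
        // ASCII-only sample, so the character index equals the byte offset.
        int offset = xml.indexOf("</Header>");
        byte[] token = "<Token>abc123</Token>".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(insertAt(message, offset, token), StandardCharsets.UTF_8));
    }
}
```

The bytes before and after the insertion point are copied as-is, which is why the parts of the message irrelevant to the token never need to be reserialized.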

From a technical perspective, VTD-XML has raised the baseline WSS performance to a level that is close to VTD-XML's parsing performance. On a 3-year-old 1.7-GHz Pentium M processor, that means between 50 MB/second and 70 MB/second, roughly 10 to 15 times DOM's throughput when doing both parsing and reserialization. In other words, now is the time to raise expectations on WSS performance.

Incrementally update XML with XMLModifier

Before version 1.8, VTD-XML had three main classes that performed parsing, navigation, and XPath evaluation. The latest version of VTD-XML introduces XMLModifier, a new class that simplifies the incremental updates of XML content. It does three things:

Inserts bytes or strings into an XML file.

Deletes portions of XML.

Updates the original XML with newer content.

Sharing the basic concepts of the other classes, XMLModifier operates directly at the byte level, instead of at the node level as DOM does. To use XMLModifier, developers usually follow these steps:

Instantiate XMLModifier: Two constructors are available. One takes an instance of VTDNav; the other takes no arguments. If you use the second constructor, call bind() to attach an instance of VTDNav to XMLModifier.

Record various types of modification operations: As the code navigates to different parts of the document, call XMLModifier's various methods to insert, delete, or update various parts of the XML document. Some of those methods take as input an integer corresponding to the VTD token index. Other methods operate on the token at VTDNav's cursor.

Our simple application inserts an attribute into the root element purchaseOrder, changes the text of shipTo's name child from "Alice Smith" to "Janice Smith," inserts a new element before shipTo, and deletes billTo entirely.

When output() is called, the XMLModifier instance performs a few internal checks. For instance, after deleting the billTo element in test.xml, our sample application could go on to delete that element's attributes or child elements. But this creates a semantic ambiguity: If the billTo element is removed, its attributes and children are all gone. What does it mean to delete the attributes, or children, of a "nonexistent" element? In this case, calling output() will throw an XMLModifyException.

Note that while XMLModifier's methods are designed to avoid introducing errors into XML, the well-formedness of the output's byte content is not guaranteed. The current implementation of XMLModifier also forbids calling insertBeforeElement() or insertAfterElement() twice in a row at the same cursor location. The reason is that the result is, again, ambiguous. Say you insert <test/> the first time and <test2/> the second time. What should the output look like? <test/><test2/>? Or <test2/><test/>? If you want the output to look like the former, why not just insert <test/><test2/> all at once? It is up to developers to decide how to use XMLModifier's methods to produce well-formed XML output.
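The record-then-output model, and the two ambiguity checks just described, can be mimicked in a few dozen lines of plain Java. What follows is a rough sketch and not XMLModifier itself: the EditLog class, its method names, the orderDate attribute, the note element, and the contents of billTo are all invented for illustration, byte offsets are found by string search on an ASCII-only document, and IllegalStateException stands in for the library's own exception.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a byte-level modifier; not the VTD-XML API.
class EditLog {
    // One recorded change: replace bytes [start, end) with content.
    // An insert is the special case start == end.
    private static final class Edit {
        final int start;
        final int end;
        final byte[] content;
        Edit(int start, int end, byte[] content) {
            this.start = start; this.end = end; this.content = content;
        }
        boolean isInsert() { return start == end; }
    }

    private final byte[] doc;
    private final List<Edit> edits = new ArrayList<>();

    EditLog(byte[] doc) { this.doc = doc; }

    private static byte[] bytes(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    // Recording methods only remember the change; the document is not touched yet.
    void insert(int offset, String s) { edits.add(new Edit(offset, offset, bytes(s))); }
    void delete(int start, int end) { edits.add(new Edit(start, end, new byte[0])); }
    void update(int start, int end, String s) { edits.add(new Edit(start, end, bytes(s))); }

    // Validates the recorded edits, then copies the document once, applying them in order.
    byte[] output() {
        List<Edit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt((Edit e) -> e.start).thenComparingInt(e -> e.end));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        Edit prev = null;
        for (Edit e : sorted) {
            if (prev != null) {
                // Editing inside an already edited range (e.g. a child of a deleted element).
                boolean overlaps = e.start < prev.end;
                // Two inserts at one spot: the resulting order would be ambiguous.
                boolean doubleInsert = e.isInsert() && prev.isInsert() && e.start == prev.start;
                if (overlaps || doubleInsert) {
                    throw new IllegalStateException("conflicting edits at byte " + e.start);
                }
            }
            out.write(doc, pos, e.start - pos);          // untouched bytes, copied verbatim
            out.write(e.content, 0, e.content.length);   // new content, if any
            pos = e.end;                                 // skip replaced or deleted bytes
            prev = e;
        }
        out.write(doc, pos, doc.length - pos);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String xml = "<purchaseOrder><shipTo><name>Alice Smith</name></shipTo>"
                + "<billTo><name>Bob</name></billTo></purchaseOrder>";
        EditLog log = new EditLog(bytes(xml));
        log.insert("<purchaseOrder".length(), " orderDate=\"1999-10-21\"");
        int alice = xml.indexOf("Alice Smith");
        log.update(alice, alice + "Alice Smith".length(), "Janice Smith");
        log.insert(xml.indexOf("<shipTo>"), "<note/>");
        log.delete(xml.indexOf("<billTo>"), xml.indexOf("</billTo>") + "</billTo>".length());
        System.out.println(new String(log.output(), StandardCharsets.UTF_8));
    }
}
```

Deferring all validation to output() means conflicts surface only after every edit has been recorded, which matches the behavior described above for deleting the children of an already deleted element.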