The elusive XPath nodeset serialization

I have been involved in various capacity with five different specifications that define a GET (or GET-like) operation that takes as input an XPath expression used to pinpoint the subset of the XML document that should be retrieved (here is a quick history as of a couple of years ago, more has happened since). And I must shamefully admit that all but one are simply impossible to implement in an interoperable way.

That’s because they instruct implementers to return an XPath nodeset in the response SOAP message but say nothing about how to serialize the nodeset. While an XPath nodeset contains the kind of things that make up an XML document, it is not an XML document by itself. There is an infinite number of possible ways to serialized an XPath nodeset into XML. To have any hope of interoperability on this, a serialization algorithm has to be clearly described by the specification. Which hasn’t happened.

Let’s start with WS-ResourceProperties (WS-RP). It has a QueryResourceProperties operation that takes an XPath expression as input. The specification says that “the response MUST contain an XML serialization of the results of evaluating the QueryExpression against the resource properties document“. Great, thanks. The example provided happens to return a nodeset with only one node (a boolean), which is implicitly serialized into the text representation of that boolean. What if there is more than one node in the nodeset? What about other types of nodes?

Moving on to WS-Management, which defines a SOAP header that uses XPath to qualify a WS-Transfer GET request such that it only retrieves a subset of the target XML document. While it does a better job than WS-RP at describing the input (e.g. it specifies the context node and what namespace declarations are in scope for the XPath evaluation) it is even more cavalier than WS-RP in describing the output: “the output (lines 53-55) is like that supplied by a typical XPath processor and might or might not contain XML namespace information or attributes“. By “a typical XPath processor” we should understand MSXML I suppose. But as far as I know a “typical XML processor” doesn’t return XML, it returns language-specific data structures (e.g. a C# or Java object, like a nu.xom.Nodes instance). And here too, the examples only use single-node nodesets.

WS-ResourceTransfer (WS-RT) was supposed to be the convergence of these two efforts, so presumably it would have learned from their mistakes. While it is better written in general than its predecessors, it fails just as badly with regards to specifying the nodeset serialization. And once again, the example provided uses a nodeset with just one node.

And then came the CMDBf query operation which, for some unclear reason, was deemed in need of a built-in XPath transformation of records. As I pointed out in my review of CMDBf 1.0 at the time, this feature was added without taking the pain to define the XML serialization of the resulting nodeset. And there isn’t even an example of the XPath serialization.

It is sad in a way, but the only specification that acknowledges the problem and addresses it came before any of the four above even got started. It is the WSMF (Web Services Management Framework) work that we did at HP, and more specifically the “note on dynamic attributes and meta information” (not available at HP anymore but available from archive.org) . This specification was the first one to define a GET operation that is qualified by an XPath expression. Unlike its successors it also explicitly narrowed down the types of nodes that could be selected (“The manager MUST NOT send as input an XPath statement that returns a nodeset containing nodes other than element, attribute and namespace nodes“). And for those valid types it described how to serialized them in XML (“When a node in the result nodeset is an attribute node, for the sake of the response it is serialized as an element node which has the same name as the name of the original attribute (see example 4 for an illustration). The element is in the same namespace as the namespace the attribute it represents is in. This applies to namespace nodes as well, they are serialized like an attributes in the xmlns namespace“). Turning an attribute into an element of the same QName might not be the smartest thing in retrospect (after all there may be an element by that QName already) but at least we recognized and addressed the problem.

Not so. Anyone wanting to use XPath for a SOAP-based query language still would have to specify a serialization.

The first problem with the W3C serialization is that the XML output method doesn’t work for all nodesets. Try to use it on a nodeset that contains a top-level attribute node and you get error err:SENR0001. And even for the nodesets it accepts, it sometimes returns less-than-useful results. For example, if your XPath is of the form /employee/name/text() and you have four employees, the result will look something like this:

“Joe SmithKathy O’ConnorHelen MartinBrian Jones”

Concatenated text values without separators. I guess W3C is like a department store, they don’t offer complimentary wrapping anymore…

That’s why the nux.xom.xquery.ResultSequenceSerializer class had to define its own wrapping mechanims to produce a useful XML serialization. The API gives you the choice between the W3C_ALGORITHM and the WRAP_ALGORITHM.

Bottom line, and however much some would like to think of it that way, XPath (1 or 2) is not an XML subsetting/transformation mechanism. It could be used to create one (as XSLT does), but you have to do your own plumbing.

In addition to the technical aspects of this discussion, what else can be learned from this sad state of things? The fact that all these specifications define an XPath-driven query mechanism that is simply broken (beyond the simplest use cases) withouth anyone even noticing tells me that there isn’t a real need for full XPath query over SOAP (and I am talking about XPath 1.0, the introduction of XPath 2.0 in CMDBf is even more out there). A way to retrieve individual elements (and maybe text values) is all that is needed for 99% of the use cases addressed by these specifications. Users would be better served (especially in a version 1.0) by specifications that cover the simple case correctly than by overly generic, complex and poorly documented features. There is always time to add features later if the initial specification is successful enough that users encounter its limitations.