Florent Georges

XSLT 2.0 extensions for Saxon

Introduction

This is a set of extensions for XSLT 2.0, developed for the
Saxon processor, version 9. For now, there is a URIResolver
that can go through a proxy (in particular a proxy that
requires authentication) and a function to send HTTP requests.

Each extension is implemented in its own Java class, with a
single dispatching class for all extensions. That way it is
easy to see which functions are intended to be used from XSLT,
and easy to reference them from the XSLT code. This class is
org.fgeorges.exslt2.saxon.Exslt2. It contains only public
static methods. So to use the extensions, just declare the
correct namespace and use the functions:

Note that the Proxy URI Resolver is not an extension, strictly
speaking. You don't use it from within the stylesheet; rather,
you configure Saxon to use it to resolve HTTP accesses
(for instance by doc() or document()). See below.

Download

Everything you need is in the following archive: fgeorges-0.1.zip.
It contains the JAR file, the documentation (this page), the Java
source files, the complete XSLT samples and the Javadoc. To
install, you just have to put the JAR in the classpath; how to do
that depends on how you actually invoke Saxon (you can have a
look at the shell script I wrote for myself to launch Saxon from
the command line).

Proxy URI Resolver

The class org.fgeorges.xslt.HttpProxyUriResolver is an
implementation of the JAXP interface URIResolver. A lot of
resources can be retrieved from a stylesheet via the HTTP
protocol. Unfortunately, many networks sit behind a proxy, so
the connection used to retrieve those resources has to be
configured properly.

Java provides a standard way to configure the proxy host name
and port number, by setting the properties http.proxyHost and
http.proxyPort respectively. But there is no standard property
to set the credentials for a proxy (credentials are not always
needed, but they are more and more common, especially within
large organizations).

This class provides you with this ability, by replacing
Saxon's standard resolver. When it encounters an HTTP
request, it configures the connection with the right
credentials. If the resource is not an HTTP request, the
resolver falls back to Saxon's standard mechanism. The
credentials are set up via the properties
fgeorges.httpProxyUser and
fgeorges.httpProxyPwd.
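
To make the mechanism concrete, here is a minimal sketch of a resolver of this kind. It is not the source of HttpProxyUriResolver: the class name ProxyAuthResolverSketch and the choice to send the credentials as a Basic Proxy-Authorization header are assumptions made for illustration. Only the two property names come from the text above. Returning null from resolve() is the JAXP way to tell the processor to resolve the URI itself.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch, not the actual HttpProxyUriResolver.
class ProxyAuthResolverSketch implements URIResolver {

    // Builds a Basic credentials value: "Basic " + base64(user ":" password).
    static String basicCredentials(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Resolve href against the base URI when one is supplied.
            URI uri = (base == null || base.isEmpty())
                    ? URI.create(href)
                    : URI.create(base).resolve(href);
            if (!"http".equalsIgnoreCase(uri.getScheme())) {
                // Not an HTTP request: let the processor use its own mechanism.
                return null;
            }
            // The proxy host and port come from http.proxyHost / http.proxyPort,
            // which the JDK applies by itself when opening the connection.
            URLConnection conn = uri.toURL().openConnection();
            String user = System.getProperty("fgeorges.httpProxyUser");
            String pwd = System.getProperty("fgeorges.httpProxyPwd");
            if (user != null && pwd != null) {
                // Assumption: the proxy accepts Basic authentication.
                conn.setRequestProperty("Proxy-Authorization", basicCredentials(user, pwd));
            }
            return new StreamSource(conn.getInputStream(), uri.toString());
        } catch (IOException | IllegalArgumentException e) {
            throw new TransformerException("Cannot resolve " + href, e);
        }
    }
}
```

A resolver like this would be registered through the standard JAXP call TransformerFactory.setURIResolver(), or through whatever option your way of launching Saxon offers for that purpose.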

So you have to adapt the way you launch Saxon to add the JAR
to the classpath, and set both properties (besides the two
standard properties for the proxy's host name and port
number). For instance:

My shell script for Saxon supports setting those options more
easily (both the standard proxy settings and the
authentication extension). You are then able to use the
following (equivalent to the above command):

HTTP & HTTPS

Warning: Although this extension has been useful for
a few years, it is no longer maintained. It has now evolved
into the similar http:send-request() function, part of the
EXPath project. If you are considering using it, you are
strongly advised to have a look at the EXPath HTTP Client
instead.

This extension allows you to make an HTTP request from an
XPath expression within your XSLT stylesheet. You just have
to call ex:http-send() with the right parameters, and the
function returns the result of the request.

ex:http-send()

ex:http-send($request as node(),
$uri as xs:string) as element()

$uri is the target URI the HTTP request
will be sent to. $request must be an
element node or a document node with a single element.
The name of the element is not relevant. It represents
the HTTP request, and looks like:

<http-request method="post" mime-type="text/xml" charset="utf-8">
   <header name="Header-Name">...</header>
   <header name="Header2-Name">...</header>
   <body>
      The textual value of body will be the payload of the HTTP request...
   </body>
</http-request>

The attribute method is the HTTP method (for instance get,
post or delete); the default is get. The attribute mime-type
is the MIME type of the request; the default is text/xml.
The attribute charset is the encoding of the request; the
default is utf-8.

You can also set the credentials for the target server, with
the attributes user and password. This sets credentials
conforming to HTTP Basic Authentication. If you set the
properties for the proxy credentials (as explained in the
previous section), they will be used as well to go through
the proxy.
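
The defaults described above can be sketched in a few lines of Java. This is an illustration only: the class RequestDefaultsSketch and its methods are hypothetical, and combining mime-type and charset into a single Content-Type header is an assumption about how such a request would typically be put on the wire, not a statement about what the extension does internally.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: applies the documented defaults to the attributes
// of the request element (given here as a simple name/value map).
class RequestDefaultsSketch {

    // "get" is the documented default; HTTP method names are upper-case
    // on the wire, so "post" becomes "POST".
    static String method(Map<String, String> attrs) {
        return attrs.getOrDefault("method", "get").toUpperCase(Locale.ROOT);
    }

    // Assumption: mime-type and charset end up in one Content-Type header.
    static String contentType(Map<String, String> attrs) {
        return attrs.getOrDefault("mime-type", "text/xml")
                + "; charset=" + attrs.getOrDefault("charset", "utf-8");
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("method", "post");
        System.out.println(method(attrs) + " / " + contentType(attrs));
    }
}
```

With an empty attribute map, the two helpers yield the defaults listed above: a GET request with a text/xml body encoded in utf-8.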

The result of the function is an element with the following format:

<http-response code="200">
   <message>OK</message>
   <header name="Header-Name">...</header>
   <header name="Header-x-Name">...</header>
   <body>
      The textual value of body was the payload of the HTTP response...
   </body>
</http-response>

It is important to understand that HTTP carries text in
the body of both requests and responses. So if you want
to send and/or receive XML, for instance to query a Web
service via SOAP, you will have to serialize or parse the
XML. This is shown in the examples below.
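
To illustrate what "serialize or parse" means here, the following sketch does both in plain Java with the standard JAXP classes. It is only an illustration of the round trip between XML text and an XML tree; the class name XmlTextSketch is hypothetical, and inside a stylesheet you would rely on whatever parsing and serializing facility your processor provides rather than on this code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper showing the text <-> tree round trip.
class XmlTextSketch {

    // Turns the text of a response body into an XML tree.
    static Document parse(String text) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(text)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Turns an XML tree into text suitable for a request body.
    static String serialize(Document doc) {
        try {
            // A Transformer created without a stylesheet copies its input as-is.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot serialize document", e);
        }
    }
}
```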

Simple, complete eXist samples

Here are two complete samples sending requests to the REST
interface of a running eXist database (eXist is a native
XML database, see http://exist-db.org/).
You do not need to know eXist in order to understand the
samples. All you need to know is: 1/ authentication to
eXist is done by HTTP Basic Authentication (which is
supported by this extension), 2/ to remove a document from
the database, one has to send an HTTP DELETE request to
eXist, and 3/ to upload a document, one has to send an
HTTP PUT request.

The above stylesheet sends an HTTP PUT to an eXist database
running on the same machine. Then it checks that
everything was ok (HTTP codes starting with '2' mean 'Ok').
Note that it uses unparsed-text() instead of doc() to access
the document, to avoid having it parsed, and to get the raw
text of the file instead.

Complete Google contacts sample

The Google APIs provide a simple REST API: you just need
to send an HTTP POST request with parameters encoded in
application/x-www-form-urlencoded (that
means the request body looks like:
param1=value1&param2=value2, with a
bit of escaping). You first need to use the Authentication
API to get an authentication token, which you then pass to
every call to the other APIs. Then you can use the Contact
API to get the data of all your contacts, followed by a
second call to get the data of all the groups your
contacts belong to.
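
The "bit of escaping" mentioned above is the percent-encoding of names and values. As a minimal sketch, here is how such a body can be built in Java with the standard URLEncoder class; the class name FormBodySketch and the parameter names in the usage note are made up for the example and are not taken from the Google APIs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds an application/x-www-form-urlencoded body.
class FormBodySketch {

    // Joins the encoded name=value pairs with '&', in map iteration order.
    static String encode(Map<String, String> params) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            body.add(escape(p.getKey()) + "=" + escape(p.getValue()));
        }
        return body.toString();
    }

    // Spaces become '+', reserved characters become %XX sequences.
    private static String escape(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("param1", "value 1");
        params.put("param2", "a&b=c");
        System.out.println(encode(params)); // param1=value+1&param2=a%26b%3Dc
    }
}
```

The resulting string is what goes into the body of the request, with the MIME type of the request set to application/x-www-form-urlencoded.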

Before showing the whole stylesheet, here is what the
three requests should look like (more exactly, what the
elements representing the three HTTP requests should look
like). Here is the authentication call (indented for
readability, but there shouldn't be any carriage return):