README.md

Version: 1.1

Author: Norman Walsh

Date: 28 July 2011

The REST endpoint library

(Are you anxious and impatient? Do you want to just see it work?
Then you might prefer to skip down to the slides example).

The REST endpoint library is a set of XQuery modules designed to
make the development and deployment of RESTful web services on
MarkLogic server easier. It consists of two parts: an XML vocabulary
for describing web service endpoints and a library module.

The XML vocabulary is used to write declarative descriptions of the
endpoints. These descriptions include the mapping of URI parts to
parameters, additional parameters, and conditions that must be met in
order for the incoming request to match.

The XML vocabulary has been designed so that the same description can be
used for both the rewriter and the endpoint. One motivation for this
approach was to ensure that the same code is used for both choosing
a rewrite and validating an endpoint; this minimizes the possibility
of semantic drift between the two halves of the task.

REST endpoint markup

The REST endpoint markup is designed to facilitate writing
services in MarkLogic server. To that end, it provides a single,
declarative style for describing the URI, parameters, and other
aspects of an "endpoint". The URI rewriter uses a list of these
endpoint descriptions to dispatch to the right main module. The main
module in turn uses the same description to validate its parameters.

In other words, by writing a single description you get a rewriter for
free and validated, typed parameters in a single function call.

The simplest example just maps the URI to a module:

<request uri="/" endpoint="/default.xqy"/>

If the incoming request is for "/", it is rewritten to "/default.xqy".
The uri is a regular expression, so you could also support requests
for index.html like this:

<request uri="^/(index\.html)?$" endpoint="/default.xqy"/>

Extracting parameters from the URI

If all your rewrites were that simple, you'd hardly need a library
to simplify them. A more common sort of example is one that involves
translating parts of a URI into parameters. Suppose, for example, that
we have an endpoint for displaying slides from a presentation. The endpoint
needs the name of the slide deck to display and the number of the slide.
We could just slap those on as parameters, but getting URIs right is
an important part of building a REST interface.

Instead, we want to expose them through URIs of this form:
/slides/deck/number. So the third slide of the "uc11" deck would
be referenced with the URI /slides/uc11/3. That endpoint can be described
as follows:
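A sketch of such a description, reconstructed from the regular expression and replacement strings quoted in the walk-through that follows (putting the replacement string in the content of each uri-param is an assumption):

```xml
<request uri="^/slides/(.+?)/(\d+)$" endpoint="/slides.xqy">
  <!-- $1 is the deck name captured by the first group -->
  <uri-param name="deck">$1.xml</uri-param>
  <!-- $2 is the slide number captured by the second group -->
  <uri-param name="num">$2</uri-param>
</request>
```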

As before, the rewriter starts by comparing the request URI with the
specified uri regex. If it doesn't match, we don't have to go
further. But if the request URI is /slides/uc11/3, then it does match.

For each uri-param, the rewriter will construct a parameter with
the specified name using fn:replace() to compute the value. So the
deck parameter will have the value uc11.xml because
fn:replace('/slides/uc11/3', '^/slides/(.+?)/(\d+)$', '$1.xml') is
uc11.xml and the num parameter will have the value 3.
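If you want to convince yourself of those values, the two fn:replace() calls can be run on their own. This is only a sketch of the computation described above, not the rewriter's actual code:

```xquery
xquery version "1.0";

(: The request URI and the uri regex from the slides example. :)
let $uri   := "/slides/uc11/3"
let $regex := "^/slides/(.+?)/(\d+)$"
return (
  fn:replace($uri, $regex, "$1.xml"),  (: the deck value: "uc11.xml" :)
  fn:replace($uri, $regex, "$2")       (: the num value: "3" :)
)
```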

These parameters are passed to the specified endpoint. You can think of
this as a rewrite of the request URI to

/slides.xqy?deck=uc11.xml&num=3

although that isn't exactly what happens, for subtle technical reasons
related to POST requests that aren't important right now.

Decoding parameters in the endpoint

So far so good. But inside the slides.xqy module, we've got to decode
all those parameters. In the simplest cases, like this one, it's not too
hard to call xdmp:get-request-field() for each parameter. But as endpoints
become more complicated with optional parameters and repeated parameters it
quickly becomes quite tedious.

Since we have a description of the endpoint, we can let a single library
function deal with all the complexity for us. Assuming we have the relevant
request element in $request, this simple call

let $params := rest:parse-options($request)

will return a map of the parameters. That still may not look too compelling,
but we can immediately begin to expose additional benefits "for free".

The first benefit you get is error detection. If someone manages to call this
endpoint with the wrong options, the rest:parse-options() function will
automatically raise an error. This means you (a) don't have to write
the error checking code and (b) don't have to worry about bugs that might be
introduced by failing to check for those errors.

Parameter types

Another problem you may already have encountered in dealing with endpoint
parameters is their type. Consider this simple implementation of the slides
endpoint:
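Something along these lines would do as an illustration. It is a fragment, not a complete module: it assumes the rest prefix is already bound to the library, that $request holds the untyped slides description, and that a deck is a slides element containing slide elements (that document structure is made up for the example):

```xquery
(: Assumes: rest prefix bound to the library, $request holds the description. :)
let $params  := rest:parse-options($request)
let $deck    := fn:doc(map:get($params, "deck"))
let $slideno := map:get($params, "num")
(: Hypothetical document structure: slides/slide. :)
(: fn:position() is an integer, so comparing it with a string value fails. :)
return $deck/slides/slide[fn:position() eq $slideno]
```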

Even if you get the right document with $deck, you're still going to
get a type error because $slideno is a string. Rather than trying to
remember to do all of the type casting in your endpoint, and
addressing the errors potentially raised by those casts, we can
augment the description:
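A sketch of the augmented description, assuming the type is declared with the as attribute on the uri-param:

```xml
<request uri="^/slides/(.+?)/(\d+)$" endpoint="/slides.xqy">
  <uri-param name="deck">$1.xml</uri-param>
  <!-- the value is now parsed and validated as a decimal -->
  <uri-param name="num" as="decimal">$2</uri-param>
</request>
```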

This code will work because $slideno will be a decimal. As before, you
get error detection for free. If someone manages to call this endpoint with a
num parameter that isn't a decimal, the rest:parse-options() call will
raise the error for you.

Supporting additional parameters

You aren't limited to just parameters parsed from the URI. Suppose we augment
the slides.xqy module to support a "theme" parameter. We can simply add that
to the description:
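For instance, the description might grow a param element like this (a sketch; the rest of the request is carried over from the earlier slides example):

```xml
<request uri="^/slides/(.+?)/(\d+)$" endpoint="/slides.xqy">
  <uri-param name="deck">$1.xml</uri-param>
  <uri-param name="num" as="decimal">$2</uri-param>
  <!-- an ordinary query parameter, optional by default -->
  <param name="theme"/>
</request>
```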

Required parameters

You can mark a parameter as required, in which case a request URI that doesn't
have a theme will not match in the rewriter, and an attempt to call the endpoint
without a theme will raise an error.
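A required theme parameter might be declared as follows. The required attribute is assumed here by analogy with the repeatable attribute used elsewhere in this document:

```xml
<!-- assumption: required="true" mirrors repeatable="true" -->
<param name="theme" required="true"/>
```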

Default values

Alternatively, you can provide a default value:

<param name="theme" default="hires"/>

Repeatable parameters

You can also mark a parameter as repeatable. Without stretching our running
"slides" example too far, let's say you wanted to allow a css parameter to
specify additional stylesheets for a particular slide. You might want to allow
more than one, so you could add a css parameter like this:

<param name="css" repeatable="true"/>

In the rewriter, this would allow any number of css parameters. In the endpoint,
there'd be a single css key in the parameters map but its value would be a list.

To recap: if you don't make any declarations about a parameter beyond its name, then
it's optional but may occur at most once; if you say it's required it must occur at
least once; and if you say it's repeatable, it may occur more than once.

Choosing from a list of values instead of an atomic type

For parameters that must be numbers, booleans, or other atomic types, the
as attribute is all you need. But suppose you want to limit the value to
a discrete set of possibilities?

A parameter like theme, for example, might be a string but might accept only
a few possible values. You can do that with the values attribute:

<param name="theme" values="hires|lowres|mobile|bw" default="hires"/>

With this declaration, the theme must be exactly one of "hires", "lowres",
"mobile", or "bw".

Aliases

Finally, to round out the options on parameter handling, you can also specify
aliases. This option is useful if you decide to change the names of parameters
on the server but don't want to break existing clients. Suppose, for example,
that you extend your slides framework so that it accepts stylesheets in languages
other than CSS. Then you might want to rename the css parameter to style.
If you specify the parameter this way:

<param name="style" alias="css" repeatable="true"/>

then the endpoint will see a parameter named style even if the user specifies
css in the request.

The real motivating factor for this feature, however, is to support a, uh, feature
of jQuery. If, to continue the above example, you use jQuery's
AJAX features to send an array of style parameters to the endpoint, even if you specify
the key "style" in your JSON object, jQuery will send the parameter with the
name "style[]". It automatically adds square brackets to indicate that an array
is being passed. You can work around that by making style[] an alias:

<param name="style" alias="css|style[]" repeatable="true"/>

Like a list of values, a list of aliases can be separated by vertical bars.

Matching in parameters

We have one more trick up our sleeves with respect to parameters. Sometimes it's
useful to be able to perform the sort of match and replace operations on parameter
values that we can perform on parts of the URI. Suppose, for example, that you've
got a parameter that will contain an internet media type. You can extract part of
that value using match:
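The underlying match-and-replace can be illustrated with plain fn:replace(). This sketch only shows the kind of extraction involved; the media type and the regex are made up, and it does not show how the match is spelled in a description:

```xquery
xquery version "1.0";

(: Hypothetical parameter value: an internet media type. :)
let $media-type := "application/json"
(: Keep only the subtype, i.e. the part after "application/". :)
return fn:replace($media-type, "^application/(.*)$", "$1")  (: "json" :)
```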

Matching multiple URIs

There's no rule in the URI rewriter that says only a single request can be specified
for a given endpoint. In the rules we've looked at so far for the "slides" example,
we're only matching a request for a single slide. Suppose we wanted another rule for
presenting the "title page" slide. One reasonable format for that URI would be to simply
leave off the trailing slide number. So /slides/uc11 would display the title page
slide. We can support that with two rules:
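A sketch of the two rules, grouped in an options element (namespace declarations are omitted, as in the other examples here, and the second regex is just one way to match a deck name with no slide number):

```xml
<options>
  <!-- a specific slide: /slides/uc11/3 -->
  <request uri="^/slides/(.+?)/(\d+)$" endpoint="/slides.xqy">
    <uri-param name="deck">$1.xml</uri-param>
    <uri-param name="num" as="decimal">$2</uri-param>
  </request>
  <!-- the title page: /slides/uc11 -->
  <request uri="^/slides/([^/]+)/?$" endpoint="/slides.xqy">
    <uri-param name="deck">$1.xml</uri-param>
  </request>
</options>
```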

If the request doesn't have a number, it won't match the first request, but will
match the second. In this case, you simply have to make sure that the /slides.xqy endpoint
uses a request that will validate all of the possible rewrites.

To simplify earlier examples, I omitted the <options> element. In practice the rewriter
will have many requests to consider and they're grouped in an options element.

Other HTTP Verbs

A request that doesn't specify any verbs only matches HTTP GET requests. If you
want to match other verbs, simply list them:
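For example, a request that accepts both GET and POST, with a privilege check on the POST, might look like the sketch below. The privilege URI is hypothetical, and the privilege and kind child elements of auth are assumptions about the vocabulary:

```xml
<request uri="^/slides/(.+?)/(\d+)$" endpoint="/slides.xqy">
  <uri-param name="deck">$1.xml</uri-param>
  <uri-param name="num" as="decimal">$2</uri-param>
  <!-- anyone may GET -->
  <http method="GET"/>
  <!-- POST requires the (hypothetical) editor execute privilege -->
  <http method="POST">
    <auth>
      <privilege>http://example.com/privs/editor</privilege>
      <kind>execute</kind>
    </auth>
  </http>
</request>
```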

With this request, only users with the specified execute privilege can POST to
that URI. If a user without that privilege attempts to POST, this request won't match
and control will fall through to the next request. In this way, you can provide fallbacks
if you wish.

In a rewriter, failing to match a condition causes the request not to match. In an endpoint,
failing to match a condition raises an error.

Authentication

As shown in the example above, the auth condition tests for the specified privilege:
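In isolation, an auth condition might look like the following sketch. The privilege URI is hypothetical, and the privilege and kind child elements are assumptions about the vocabulary:

```xml
<auth>
  <!-- hypothetical privilege URI -->
  <privilege>http://example.com/privs/editor</privilege>
  <kind>execute</kind>
</auth>
```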

Accept headers

When a user agent requests a URI, it can also specify the kinds of responses that it is
able to accept. These are specified in terms of media types. You can specify the
media type(s) that are acceptable with the accept header:

<accept>application/json</accept>

A request that specifies the accept shown above will only match user agent
requests that can accept JSON responses.

A request that specifies a function condition will only match requests
for which the specified function returns true. The function will be passed
the URI string and the function condition element.
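A function condition might be written as in the sketch below. The ns, apply, and at attribute names are assumptions about the vocabulary, and the namespace, function name, and module path are hypothetical:

```xml
<!-- calls my-function from the (hypothetical) module utils.xqy -->
<function ns="http://example.com/slides/conditions"
          apply="my-function"
          at="utils.xqy"/>
```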

And

An and condition must contain only conditions. It returns true if
and only if all of its child conditions return true.

<and>
...conditions...
</and>

There is no guarantee that conditions will be evaluated in any
particular order or that all conditions will be evaluated.

If more than one condition is present at the top level in a request, they
are treated as if they occurred in an and.

Or

An or condition must contain only conditions. It returns true if and
only if at least one of its child conditions returns true.

<or>
...conditions...
</or>

There is no guarantee that conditions will be evaluated in any
particular order or that all conditions will be evaluated.
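As a small illustration, an or built from two of the accept conditions described earlier would match user agents that can accept either JSON or XML (the choice of media types is just for the example):

```xml
<or>
  <accept>application/json</accept>
  <accept>application/xml</accept>
</or>
```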

User extensibility

By default, the rewriter and endpoint parser will reject any request that
includes additional, user-specified parameters. Sometimes you may want to
write an endpoint that allows arbitrary parameters specified by the user.
This behavior is controlled by the user-params attribute. It can be specified
on the http method, the request, or even the options element.

The default, one-argument form of rest:rewrite() takes all of the relevant
parameters (URI, HTTP request method, accept headers, and user parameters) from
the environment. There are other entry points that provide more control if that's
desirable.
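A minimal rewriter module built on the one-argument form might look like this sketch. The namespace and module location in the import are assumptions (adjust them to wherever the library is installed), as is the convention that rest:rewrite() returns an empty sequence when nothing matches:

```xquery
xquery version "1.0-ml";

(: Assumption: namespace and module location; adjust to your installation. :)
import module namespace rest = "http://marklogic.com/appservices/rest"
    at "/MarkLogic/appservices/utils/rest.xqy";

(: A tiny set of endpoint descriptions; real applications usually
   keep these in a module shared by the rewriter and the endpoints. :)
declare variable $options :=
  <options xmlns="http://marklogic.com/appservices/rest">
    <request uri="^/(index\.html)?$" endpoint="/default.xqy"/>
  </options>;

(: The URI, method, headers, and parameters come from the environment. :)
let $rewrite := rest:rewrite($options)
return
  (: Falling back to the original URL is a design choice, not a requirement. :)
  if (fn:empty($rewrite))
  then xdmp:get-request-url()
  else $rewrite
```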

rest:process-request()

The rest:process-request() function is used in the endpoint main
module to parse the incoming request against the options. It returns a
map that contains all of the parameters as typed values. Processing
the request also checks all of the associated conditions and will
raise an error if any condition is not met.

If the request is processed successfully, you know that all of the conditions have
been met and the $params map contains all of the parameters. If not, an error
will occur which you can catch and process. See Handling errors.

rest:check-options()

Proper functioning of the REST endpoint library depends on the correctness of the
descriptions used. The rest:check-options() function takes an options node and returns
a report of the problems found. If this function does not return an empty sequence,
you have made a mistake and the library will not perform reliably.

rest:check-request()

Proper functioning of the REST endpoint library depends on the correctness of the
descriptions used. The rest:check-request() function takes a request node and returns
a report of the problems found. If this function does not return an empty sequence,
you have made a mistake and the library will not perform reliably.

rest:report-error()

The rest:report-error() function is a convenience function for transforming
error:error nodes into HTML. See Handling errors.

Low level functionality

The REST endpoint library exposes a few additional, low-level functions. These may
be useful in more complex applications that need to perform sophisticated processing
on requests or implement their own rewriting strategies.

rest:matching-request()

Note that the rewrite URI is composed from both the request element and the request
environment, including additional parameters, so it's not possible to construct the rewrite
URI solely from the request.

rest:test-request-method()

The rest:test-request-method() function tests a request against
the xdmp:get-request-method(). It returns an empty sequence if the
test passes and raises an error otherwise.

rest:test-conditions()

The rest:test-conditions() function tests all of the
conditions of a request. It returns an empty sequence
if the test passes and raises an error otherwise.

rest:get-acceptable-types()

The rest:get-acceptable-types() function returns a list of media types.
These are the media types that are the intersection of what the endpoint description
claims it can produce and what the user agent claimed it could accept.
They're returned in preference order.

rest:get-raw-query-params()

The rest:get-raw-query-params() function returns a map of all the
query parameters. This does not include the parameters that would be
derived from matching the URI string. The parameters returned by this
function are all strings; they have not been type checked.

Handling errors

The REST endpoint library includes a rest:report-error() function
that performs a simple translation of MarkLogic Server error markup to
HTML. You can invoke it in a module to report errors:
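One way to do that is to wrap the request processing in a try/catch, as in this sketch. The import location is an assumption, and the body of the try simply echoes a parameter where a real endpoint would do its work:

```xquery
xquery version "1.0-ml";

(: Assumption: namespace and module location; adjust to your installation. :)
import module namespace rest = "http://marklogic.com/appservices/rest"
    at "/MarkLogic/appservices/utils/rest.xqy";

(: The slides description from the earlier examples. :)
declare variable $request :=
  <request xmlns="http://marklogic.com/appservices/rest"
           uri="^/slides/(.+?)/(\d+)$" endpoint="/slides.xqy">
    <uri-param name="deck">$1.xml</uri-param>
    <uri-param name="num" as="decimal">$2</uri-param>
  </request>;

try {
  let $params := rest:process-request($request)
  return
    (: Normal processing goes here; this sketch just returns the deck name. :)
    map:get($params, "deck")
} catch ($e) {
  (: Turn the error markup into a response for the user agent. :)
  rest:report-error($e)
}
```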

If the user agent making the request accepts text/html, a simple HTML-formatted
response is returned. Otherwise, the raw error XML is returned.

You can also use this function in an error handler to process all of the
errors for a particular application.

Handling redirects

The URL rewriter translates the requested URI into a new URI for
dispatching within the server. The user agent making the request is totally
unaware of this translation.
As REST APIs mature and expand, it's sometimes useful to respond to a
request by telling the user agent to reissue the request at a new URI.
This is called redirection.

For example, suppose we decide to change the URI pattern for slides from
/slides/deck/number to /presentations/deck/number. If someone
makes a request for a /slides/ URI, we want to use redirection to tell the
user agent they're using to reissue the request to the equivalent /presentations/ URI.
(Browser users can tell this has happened because the URI in their address bar
will change; this means if they bookmark the URI, they'll be bookmarking the new,
correct URI, not the old, incorrect one.)

You can support redirects by adding a redirect.xqy module like this one to
your application:
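A bare-bones version could look like the sketch below. The import location is an assumption, the request is declared inline only to keep the example self-contained, and any extra query parameters are simply dropped rather than carried over to the new URI:

```xquery
xquery version "1.0-ml";

(: Assumption: namespace and module location; adjust to your installation. :)
import module namespace rest = "http://marklogic.com/appservices/rest"
    at "/MarkLogic/appservices/utils/rest.xqy";

(: Maps /slides/... to /presentations/... via the __ml_redirect__ parameter. :)
declare variable $request :=
  <request xmlns="http://marklogic.com/appservices/rest"
           uri="^/slides/(.*)$" endpoint="/redirect.xqy">
    <uri-param name="__ml_redirect__">/presentations/$1</uri-param>
  </request>;

let $params := rest:process-request($request)
let $target := map:get($params, "__ml_redirect__")
return (
  (: Tell the user agent the resource has moved for good. :)
  xdmp:set-response-code(301, "Moved Permanently"),
  xdmp:redirect-response($target)
)
```

The same request element would also need to appear in the options used by the rewriter, so that /slides/ URIs are dispatched to /redirect.xqy in the first place.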

You can employ as many redirects as you want through the same redirect.xqy module
by changing the value of the __ml_redirect__ parameter. (In the unlikely event that
your application needs a parameter with that name, simply change it to something else
in both the module and each request.)

Handling OPTIONS

One of the nice things about having a declarative syntax for endpoint
descriptions is the ability to interrogate those definitions for other
purposes. For example, one could imagine automating some aspects of
unit testing based on the ability to find the description for an
endpoint.

You can implement this using the REST endpoint library by supporting the
OPTIONS method. Here's a very simple options.xqy module that will return
the request associated with a particular URI.

Note that if the request URI is /, this module will return the entire
options element, exposing the complete set of endpoints. For consistency, even
a single request is therefore wrapped in an options node.

Because a single description can match different HTTP methods, possibly with different
parameters, when the URI is not /, the request is treated as a GET for the purposes
of finding the request.

You can use it by adding the following request to the end of your options:
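A sketch of such a request follows. The __ml_options__ parameter name is hypothetical (chosen by analogy with __ml_redirect__), and "allow" as a value for user-params is an assumption:

```xml
<!-- catch-all: matches any URI, but only for the OPTIONS method -->
<request uri="^(.+)$" endpoint="/options.xqy" user-params="allow">
  <uri-param name="__ml_options__">$1</uri-param>
  <http method="OPTIONS"/>
</request>
```

Putting it last means the more specific requests get the first chance to match.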