This is an Apache Working Draft for review by all interested
parties. It is a draft document and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Working Drafts as reference material or
to cite them as other than "work in progress". This work is part of the Apache Cocoon Project.

Abstract

This document specifies an XML namespace that addresses a complete
region of web publishing, that of logic-based, dynamic content generation. This language
is introduced to fill an existing gap between the W3C specifications and working drafts and
the increasing demand for a flexible server side approach based on the new XML paradigm.

Introduction

This document specifies both an XML document type definition and a development
methodology to generate dynamic XML by server side processing of client's requests. Such a
specification is useful to define an open and standard way to develop and maintain dynamic
XML server pages. The technology described in this document was designed to complete the
XML-based publishing framework defined by the Cocoon Project and is mainly targeted
at this project, even if the final goal of this effort is to submit a request to a standards
body (such as W3C) for final recommendation.

Origins

The need for an open language to standardize server side programmatic XML generation
was observed when XML-based web publishing frameworks emerged and no available technology
was detailed, stable, useful and open enough to be used. XSP, by mixing Turing-complete
programming logic with page content, provide a flexible yet fully portable and extensible
way to develop dynamic XML content. Moreover, being completely XML-based, XSP are fully
integrated with XML-based web architectures that allow XSL-transformation to obtain the
context separation that is needed for complex sites to increase their management
parallelism.

Being based on an XML paradigm from the beginning, XSP do not suffer the limitations that
other server pages technologies do: the ability to XSL-transform XSP directly and recursively
allows a more compact and precise DTD to be designed since content/logic/style separation
is performed by the architecture and not by the language itself. For this reason, XSP are
completely transparent to the namespaces/document-types used.

Layer Separation

Being a rather complex technology, the XSP specification will be separated into layers.
These layers will have different goals and restrictions and will allow faster development
cycles and a better defined development model. Every layer will define its own document
type definition which may extend the one of the previous layer or completely change it,
depending on layer goals. Layers should be seen as levels of abstraction, much like
programming languages range from higher-levels to lower-levels.

General Goals

Following is a summary of the design principles governing the general XSP specification:

should integrate completely with existing W3C recommendations and working drafts

should be programming language independent

should be aimed at programmers but should be relatively easy to understand

should allow pages to be compiled (into Java servlets or other equivalent technology)

should define the relations to the programming languages (object models, variable scopes)

Layer 2 Goals

Following is a summary of the design principles governing the Layer 2 of the XSP
specification:

should define a human oriented element set

should be aimed at human generation (documents written by hand), so:

reducing the number of elements to a minimum is of minimal importance

reducing verbosity of the documents is of maximal importance

should be aimed at programmers with low to medium expertise:

automation of complex operations is of maximal importance

tendency to hide page logic is of maximal importance

should, where possible, be transformable by XSLT into XSP Layer 1 documents

Final Goals

The XSP specification will eventually evolve into a single specification with a single
document type definition. This will happen when the working draft phase is terminated
and all involved parties agree on the stability of the specification. Layer 1 will be
the first to be developed and tested in a working implementation. Subsequent layers will
probably need several evolution stages to reach their final shape.

Relationship to Existing Standards

Three standards have been especially influential:

JSP -

defines a way to embed programmatic logic into web documents.

XSLT -

defines a way to transform XML documents.

XML -

defines a flexible yet highly structured paradigm for web
content generation and distribution.

Many server side dynamic web content generators have been evaluated and compared,
especially WebMacro and GSP.

Terminology

The following basic terms apply in this document:

document -

a document is the final result of the client request phase. It can be obtained from a
single file that is read from disk/cache or by processing several files. Documents are said
to be static if their content changes with neither user request parameters nor time.
Documents are said to be dynamic if it does.

page -

a page is the entity that is requested by the client and drives the document creation
process. In the simplest case, a document is created by reading the page and sending it
directly without further processing. In the case of compiled pages, a binary object is
executed and its output is used as the page content. Pages are said to be compiled if
they are translated into binary code. Note that compiled pages may be created from normal
pages the first time the page is requested and executed as binary code in further requests
for performance reasons.

sheet -

a sheet is the processing unit of the document creation chain. Each sheet is a file that
contains the instructions to transform the requested page into the document sent to
the requesting client. Sheets are said to be style sheets if they are the last of the
chain and no further processing is performed, and logic sheets if they contain XSP
elements. Both types are said to be transformation sheets since they contain XSLT
elements.

document type -

a document type is a unique name that identifies the type of the document being
generated. This term has the same meaning as in the XML specification. Note that a document
has only one document type at any given time, but this may change during processing since
transformation sheets allow the transformation from one document type into another.
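
As an illustration of the last point, the following is a minimal sketch in Java of a transformation sheet changing a document from one document type into another. It uses the standard JAXP transformation API; the page and html element names, the sheet content and the SheetSketch class are hypothetical and are not taken from this specification.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SheetSketch {
    // Hypothetical requested page, using an illustrative "page" document type.
    static final String PAGE = "<page><title>Hello</title></page>";

    // Hypothetical style sheet that turns a "page" document into an "html" one.
    static final String SHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/page'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies one sheet to one page and returns the resulting document as text.
    static String apply(String sheet, String page) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(sheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(page)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The root element of the result is html, no longer page.
        System.out.println(apply(SHEET, PAGE));
    }
}
```

The document entering the sheet has page as its root element and the document leaving it has html, which is the change of document type described above. A chain of sheets would feed the output of one such step into the next.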

XSP Syntax and DTD

Defined External Entities

The XSP specification defines some external entities that may be used to reduce the
verbosity of XSP documents, allowing the inclusion of the default DTD via entity mapping. The
standard way to include the XSP DTD into XSP documents is:

The XSP DTD was designed with simplicity in mind. The number of elements and attributes
was reduced to a minimum to allow a fast and easy learning process. On the other hand, no
special helper elements were defined in Layer 1 to reduce the spec development time and to
favor early feedback from both implementers and users.

The following is the complete DTD. It must be noted that this DTD can hardly be used
(alone) to validate any XSP due to the fact that XSP are namespace orthogonal and are
designed to include as content mark-up elements that belong to other namespaces.
The XSchema effort will allow multi-namespace validation.

This simple example shows the power of content/logic/style separation. While the <title>
tag has a very special meaning in the page document type, indicating the page
title, the <counter> element needs to be dynamically substituted by
the number of times the document has been requested. The logic that performs such behavior
is implied by the tag itself but, unlike other existing server side technologies, the
behavior is not defined in the page itself, but in the logic sheet that is applied to
evaluate it. In fact, the same page may have a totally different behavior
depending on the logic sheet that is applied to the page. Note that it is beyond the scope
of this specification to define a way to associate transformation sheets to pages. The
associated logic sheet that uses the Java language as logic definition may look like:

At this point it is worth noting that, from an XSP point of view, there is
no difference in how the XSP page was created, either directly written or
created with n levels of transformation. So, independently of whether an
XSL stylesheet or a special algorithm was used to generate the final
source code, it may look like this [Note: many key issues regarding
servlets were omitted for simplicity and this example must not be
considered as mandating a way to format XSP into servlet source code]

Note that in this example the XML document is being generated as a stream
but a DOM tree is used to create it. The DOM tree can't be passed directly to
the servlet engine for further processing because the current servlet specification
(2.2) does not allow for content generation in a format other than a stream. A rather undesirable consequence
of this is that the resulting XML document would need to be re-parsed in case a final XSL
stylesheet or other post-transformation must be applied.
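
The cost described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the page and title elements, the hit counter value and the DomToStreamSketch class are hypothetical, a byte array stands in for the servlet output stream, and the standard JAXP and DOM APIs are used for building, serializing and parsing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DomToStreamSketch {
    // Builds a hypothetical page as a DOM tree, then flattens it to bytes,
    // because a stream is the only output form available to the generator.
    static byte[] generate(int hits) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element page = doc.createElement("page");
            doc.appendChild(page);
            Element title = doc.createElement("title");
            title.setTextContent("Hit count: " + hits);
            page.appendChild(title);

            // An identity transformation serializes the tree to the stream.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("generation failed", e);
        }
    }

    // The next processing stage only receives bytes, so it has to parse
    // them again to recover a tree that already existed in generate().
    static Document reparse(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalStateException("re-parsing failed", e);
        }
    }

    public static void main(String[] args) {
        Document again = reparse(generate(3));
        System.out.println(again.getDocumentElement().getTextContent());
    }
}
```

The tree built in generate() and the tree returned by reparse() carry the same content; the serialization and the second parse exist only because the two stages can exchange nothing but a stream.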

To solve this problem and speed up the execution on server side XML
processing, the XSP can be compiled into something like this:

The above shows one of the best features of XSP: output independence.
Since the output objects are not accessible directly from the internal page
logic (unlike other similar technologies, such as JSP), the page compiler can
choose between a great variety of possible ways to generate and forward the
page content. In fact, while the first example uses DOM as a construction set
and a stream as output method, the exact same page is compiled in the second
example to use a SAX event-based model and a document handler as output.
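
A minimal sketch of the event-based alternative follows, assuming a hypothetical page made of a title element carrying a hit counter. It uses the SAX2 ContentHandler interface where the text above speaks of a document handler; that substitution, the element names and the SaxOutputSketch class are assumptions made for illustration only.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

final class SaxOutputSketch {
    // Emits a hypothetical page as SAX events. No stream is written, so the
    // receiving stage does not need to parse anything.
    static void generate(int hits, ContentHandler handler) throws SAXException {
        AttributesImpl noAttributes = new AttributesImpl();
        char[] text = ("Hit count: " + hits).toCharArray();
        handler.startDocument();
        handler.startElement("", "page", "page", noAttributes);
        handler.startElement("", "title", "title", noAttributes);
        handler.characters(text, 0, text.length);
        handler.endElement("", "title", "title");
        handler.endElement("", "page", "page");
        handler.endDocument();
    }

    // A trivial consumer standing in for the next processing stage: it
    // records the events it receives as text so they can be inspected.
    static String trace(int hits) {
        final StringBuilder seen = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes attributes) {
                seen.append('<').append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                seen.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                seen.append("</").append(qName).append('>');
            }
        };
        try {
            generate(hits, recorder);
        } catch (SAXException e) {
            throw new IllegalStateException("event generation failed", e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(3));
    }
}
```

Since the page logic only supplies the counter value and never touches the handler, a page compiler remains free to swap this output method for another without any change to the page.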

Finally, it is beyond the scope of this specification to define how XSP are translated
into binary code and how these interact with the publishing frameworks that handle
them, but it is mandated that this should be completely transparent to the
page programmer and an XSP page should behave exactly the same (modulo
performance) in every XSP engine.

XSP and JSP

XSP and JSP might appear to overlap at first glance since they both:

While these are very important points where the two specifications do
overlap, there are significant differences described hereafter.

Output Exposure

In all server pages technologies, some data regarding the status
of the resource are available to page logic. Since JSP follow the Servlet API
model, expecting JSP pages to be compiled into servlets, the same data
available to servlets are available to page logic. This allows the page logic to
obtain access to the output channel (being either an OutputStream or a Writer
for servlets).

While this is not a problem for normal web operation when no further server
side processing is performed, for XML generation (where further server side
processing may be needed, depending on client capabilities) the Servlet/JSP
limitations impose on the server pages engine a parsing stage that is
completely avoided in XSP.

In fact, in XSP, page logic has no direct access to the output
channel and it is the page compiler's responsibility to choose the preferred
method to compile the page, depending on processing needs and server
requirements.

It should be noted that the XSP spec provides three different contexts: content,
logic and eval. These three contexts never overlap
since content is used to create static markup content, logic
to indicate programming logic and eval to bridge the two domains,
allowing a logic component to be evaluated without exposing the output channel
to the logic context.

This is a very significant difference since it allows the XSP page compiler to hardcode
pre-parsed XML content, thus removing the request time parsing overhead that
JSP always require.

Page Readability

For these reasons, XSP, unlike JSP, use the XML feature of syntax
orthogonality that allows almost any programming language code to be
easily distinguished from markup elements, while JSP need to enclose
programming code in scriptlet tags. The following is an example to show
the different results based on the same logic and code.