XProc: An XML Pipeline Language
2007-07-06
Norman Walsh, Sun Microsystems, Inc., Norman.Walsh@Sun.COM
Alex Milowski, Invited expert, alex@milowski.org

This specification describes the syntax and semantics of
XProc: An XML Pipeline Language, a language
for describing operations to be performed on XML documents.
An XML Pipeline specifies a sequence of operations to be
performed on one or more XML documents. Pipelines generally accept one
or more XML documents as input and produce one or more XML documents
as output, though they are not required to do so. Some pipelines are
entirely self-contained, starting with input derived inside the
pipeline and producing no XML output.

This section describes the status of this document at
the time of its publication. Other documents may supersede this
document. A list of current W3C publications and the latest revision
of this technical report can be found in the W3C technical reports index
at http://www.w3.org/TR/.

This document was produced by the
XML Processing Model Working Group,
which is part of the
XML Activity.
Publication as a Working Draft does not imply endorsement
by the W3C Membership. This is a draft document and may be updated,
replaced or obsoleted by other documents at any time. It is
inappropriate to cite this document as other than work in progress.
This is a public Working Draft. This draft addresses many, but
not all, of the design questions that were incomplete in previous
drafts. The library of standard steps, both required and optional, is
still being reviewed and considered. The Working Group continues to
encourage feedback from potential users. A version of this draft with
revision marks, relative to the 5 April 2007 specification, has been
provided, though it may be of limited value because the material has
been editorially reorganized.

The most significant changes in this draft are: a new mechanism
for dealing with parameters, new defaulting rules for primary input and
output ports, and revisions to the standard step library.

Please send comments about this document to
public-xml-processing-model-comments@w3.org (public
archives are available).

This document was produced by a group operating under the
5 February 2004 W3C Patent Policy.
W3C maintains a
public list of any patent disclosures
made in connection with the deliverables of the group; that page also
includes instructions for disclosing a patent. An individual who has
actual knowledge of a patent which the individual believes contains
Essential Claim(s)
must disclose the information in accordance with
section 6 of the W3C Patent Policy.
Introduction

An XML Pipeline specifies a sequence of operations to be
performed on a collection of XML input documents. Pipelines take zero or more
XML documents as their input and produce zero or more XML documents as
their output.

A pipeline consists of steps. Like
pipelines, steps take zero or more XML documents as their input
and produce zero or more XML documents as their output. The inputs to
a step come from the web, from the pipeline document, from the inputs to the
pipeline itself, or from the outputs of other steps in the pipeline. The outputs
from a step are consumed by other steps, are outputs of the pipeline as
a whole, or are discarded.

There are two kinds of steps: atomic steps and compound
steps. Atomic steps carry out single operations and have no
substructure as far as the pipeline is concerned, whereas compound steps
include a subpipeline of steps within themselves.

This specification defines a standard library of steps.
Pipeline implementations may support additional types of
steps as well. The following figure is a graphical representation of a
simple pipeline that performs XInclude processing and validation on a
document.

[Figure: A simple, linear XInclude/Validate pipeline]

This is a pipeline that consists of two atomic steps, XInclude and
Validate. The pipeline itself has two inputs, “Document” and “Schema”.
How these inputs are connected to XML documents outside the pipeline
is implementation-defined.
The XInclude step reads the pipeline input “Document” and produces a result
document. The Validate step reads the pipeline input “Schema” and
the output from the XInclude step and produces a
result document. The result of the validation, “Result Document”,
is the result of the pipeline.
How pipeline outputs are connected to XML documents outside the pipeline is
implementation-defined.

The pipeline document for this pipeline is shown in
the following example.

Example: A simple, linear XInclude/Validate pipeline

<p:pipeline name="fig1" xmlns:p="http://www.w3.org/2007/03/xproc">
<p:input port="source" primary="yes"/>
<p:input port="schemaDoc" sequence="yes" primary="no"/>
<p:output port="result"/>
<p:xinclude/>
<p:validate-xml-schema>
<p:input port="schema">
<p:pipe step="fig1" port="schemaDoc"/>
</p:input>
</p:validate-xml-schema>
</p:pipeline>
The next example is more complex: it
performs schema validation with an appropriate schema and then styles
the validated document.

[Figure: A validate and transform pipeline]

The heart of this example is the conditional. The “choose”
step evaluates an XPath expression over a test document. Based on the
result of that expression, one or another branch is run. In this example,
each branch consists of a single validate step.

Example: A validate and transform pipeline

<p:pipeline xmlns:p="http://www.w3.org/2007/03/xproc">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<div>
<p>This is documentation</p>
</div>
</p:documentation>
<p:choose>
<p:when test="/*[@version &lt; 2.0]">
<p:validate-xml-schema name="val1">
<p:input port="schema">
<p:document href="v1schema.xsd"/>
</p:input>
</p:validate-xml-schema>
</p:when>
<p:otherwise>
<p:validate-xml-schema name="val2">
<p:input port="schema">
<p:document href="v2schema.xsd"/>
</p:input>
</p:validate-xml-schema>
</p:otherwise>
</p:choose>
<p:xslt name="xform">
<p:input port="stylesheet">
<p:document href="stylesheet.xsl"/>
</p:input>
</p:xslt>
</p:pipeline>
Pipeline Concepts

A pipeline
is a set of connected steps, outputs flowing into inputs, without any loops (no step can
read its own output, directly or indirectly).
A pipeline is itself a
step and must satisfy the constraints on
steps.

The result of evaluating a pipeline is the result of evaluating
the steps that it contains, in the order determined by the connections
between them. A pipeline must behave as if it
evaluated each step each time it occurs. Unless otherwise
indicated, implementations must not assume that
steps are functional (that is, that their outputs depend only on
their explicit inputs, options, and parameters) or side-effect free.

Steps

A step is the
basic computational unit of a pipeline. Steps are either atomic or
compound.

An
atomic step is a step that performs a unit of
XML processing, such as XInclude or transformation, and has no
internal subpipeline. Atomic steps carry out fundamental XML
operations and can perform arbitrary amounts of computation, but they
are indivisible. An XSLT step, for example, performs XSLT processing;
an XML Schema Validation step validates one input with respect to some
set of XML Schemas, etc.

There are many types of atomic steps. The standard library of
atomic steps is described in this specification, but
implementations may provide others as well. Each use, or instance, of
an atomic step invokes the processing defined by that type of step. A
pipeline may contain instances of many types of steps and many
instances of the same type of step.

Compound steps, on the other hand, control and organize the flow of
documents through a pipeline, reconstructing familiar programming
language functionality such as conditionals, iterators and exception
handling. They contain other steps, whose evaluation
they control.

A
compound step is a step that
contains additional steps. That is, a compound step differs from an
atomic step in that its semantics are at least partially determined by the
steps that it contains.

Every compound step contains one or more steps.
The steps that occur
directly inside a compound step
are called contained steps. A compound step which immediately
contains another step is called the container of that step.

The steps (and the
connections between them) within a compound step form a
subpipeline. The last step in a
subpipeline is the last step in document order within its container.
A compound step can contain one or more subpipelines and it
determines which of its subpipelines are evaluated, and how.

Steps have “ports” into which inputs and outputs are connected
or “bound”. Each step has a number of input ports and a number of
output ports, all with unique names. A step can have zero input ports
and/or zero output ports. (All steps have an implicit output port for
reporting errors; this port must not be declared.)
Steps have any number of options, all with unique names.
A step can have zero options.

Steps may have access to any number of parameters, all with
unique names. A step can have zero parameters.

Inputs and Outputs

Although some steps can read and write non-XML resources,
what flows between steps through
input ports and output ports is exclusively XML documents or sequences of XML
documents. Each XML document (or document in a sequence) must conceptually be
an XML Information Set with a Document
Information Item at its root. The inputs and outputs can be
implemented as sequences of characters, events, or object models, or any other
representation the implementation chooses.

It is a dynamic error if a non-XML
resource is produced on a step output or arrives on a step
input.

An implementation may make it possible for a
step to produce non-XML output (through channels other than a named
output port)—for example, writing a PDF
document to a URI—but that output cannot flow through the pipeline. Similarly,
one can imagine a step that takes no pipeline inputs, reads a non-XML file
from a URI, and produces an XML output. But the non-XML file cannot arrive
on an input port to a step.

The common case is that each step has one or more inputs and
one or more outputs. The following figure illustrates symbolically
an atomic step with two inputs and one output.

[Figure: An atomic step with two inputs and one output]

All atomic steps are defined by a p:declare-step. The
declaration of an atomic step defines the input ports, output ports,
and options of all steps of that type. For example, every
p:xslt step has two inputs, named
“source” and “stylesheet”, and
one output named “result” and the same set of options.
The situation is slightly more complicated for compound steps
because they don't have separate declarations; each instance of a
compound step serves as its own declaration. Compound steps don't have
declared inputs, but they do have declared outputs, and unlike atomic
steps, on compound steps, the number and names of the outputs can be
different on each instance of the step. The following figure illustrates symbolically
a compound step with one output. As you can see from the
diagram, the output from the compound step comes from one of the outputs
of the subpipeline within the step.

[Figure: A compound step with two inputs and one output]

The input ports declared on
a step are its declared inputs. The output ports declared on a
step are its declared outputs.
When a step is used in a pipeline, connections
are made to all of its inputs and outputs.
When a step is used, all of the declared
inputs of the step must be connected. Each input may be
connected to:

- The output port of some other step.
- A fixed, inline document or sequence of documents.
- A document read from a URI.
- One of the inputs declared on the top-level p:pipeline step.

When an input accepts a sequence of documents, it may have one
or more bindings to any of those locations.

All of the declared outputs of a step
must be connected. Outputs may be connected to:

- The input port of some other step.
- One of the outputs declared on the top-level p:pipeline step.
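
As an illustration of the binding kinds listed above, the following sketch attaches three bindings to a single input. It is not taken from this specification: the step name “check”, the port name “extraSchemas”, the file name base.xsd, and the assumption that the “schema” port of p:validate-xml-schema accepts a sequence of documents are all hypothetical.

```xml
<p:pipeline name="main" xmlns:p="http://www.w3.org/2007/03/xproc">
  <p:input port="source" primary="yes"/>
  <p:input port="extraSchemas" sequence="yes" primary="no"/>
  <p:output port="result"/>

  <p:validate-xml-schema name="check">
    <!-- Three bindings on one input (assumes the port accepts a sequence):
         a document read from a URI, a fixed inline document, and an
         input declared on the top-level pipeline. -->
    <p:input port="schema">
      <p:document href="base.xsd"/>
      <p:inline>
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
      </p:inline>
      <p:pipe step="main" port="extraSchemas"/>
    </p:input>
  </p:validate-xml-schema>
</p:pipeline>
```

The “source” input of the validate step and the “result” output of the pipeline are deliberately left without explicit bindings, in the same way as in the first example of the Introduction.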
Output ports on compound steps have a dual nature: from the
perspective of the compound step's siblings, its outputs are just
ordinary outputs and must be connected as described above. From the
perspective of the compound step itself, they are inputs into which
something must be connected.

Within a compound step, the declared outputs
of the step can be connected to:

- The output port of some contained step.
- A fixed, inline document or sequence of documents.
- A document read from a URI.

Each input and output is declared to accept or produce either a
single document or a sequence of documents. It is
not an error to connect a port that is declared to
produce a sequence of documents to a port that is declared to accept
only a single document. It is, however, an error if the former
port actually produces more than one document at run
time.

The
signature of a step is the set of inputs,
outputs, and options that it is declared to accept. Each
atomic step (e.g. XSLT or XInclude) has a fixed signature, declared
globally or built-in, which all its instances share, whereas each
compound step has its own implicit signature.

A step
matches its signature if and only if it specifies
an input for each declared input, it specifies no inputs that are not
declared, it specifies
an option for each option that is declared to be required, and it
specifies no options that are not declared.
In other words, every input and required option must be specified
and only inputs and options that are declared may be
specified. Options that aren't required do not have to be
specified.

Steps may also produce error, warning, and informative
messages. These messages appear on a special “error output” port that is
defined only
in the catch clause of a try/catch.
Outside of a try/catch, the disposition of error
messages is implementation-dependent.
Primary Inputs and Outputs

As a convenience for pipeline authors, each step may have one
input port designated as the primary input port and one output port
designated as the primary output port.

If a step has exactly
one input port, or if one of its input ports is explicitly designated
as the primary, then that input port is the primary input
port of the step. If a step has a single input
port and that port is explicitly designated as
not being the primary input port, or if a step has more than
one input port and none is explicitly designated the primary, then the
primary input port of that step is undefined.

If a step has exactly
one output port, or if one of its output ports is explicitly
designated as the primary, then that output port is the
primary output port of the step. If a
step has a single output port and that port is explicitly designated as
not being the primary, or if a step has more than
one output port and none is explicitly designated the primary, then the
primary output port of that step is undefined.

The special significance of primary input and output ports is
that they are connected automatically by the processor if no
explicit binding is given. Generally speaking, if two steps appear
sequentially in a subpipeline, then the primary output of the first
step will automatically be connected to the primary input of the
second.

Additionally, if a compound step has no
declared inputs and the first step in its subpipeline has an unbound
primary input, then an implicit (and unnamed) primary input port will
be added to the compound step. If a compound step has no declared
outputs and the last step in its subpipeline has an unbound primary
output, then an implicit (and also unnamed) primary output port will
be added to the compound step.

The practical consequence of these rules is that
straightforward, linear pipelines are much simpler to read, write, and
understand. The following pipeline has a single input which is
transformed by the XSLT step; the result of that XSLT step is the
result of the pipeline:

<p:pipeline xmlns:p="http://www.w3.org/2007/03/xproc">
<p:xslt>
<p:input port="stylesheet">
<p:document href="docbook.xsl"/>
</p:input>
</p:xslt>
</p:pipeline>
It is semantically equivalent to this pipeline:

<p:pipeline name="main" xmlns:p="http://www.w3.org/2007/03/xproc">
<p:input port="source"/>
<p:output port="result">
<p:pipe step="transform" port="result"/>
</p:output>
<p:xslt name="transform">
<p:input port="source">
<p:pipe step="main" port="source"/>
</p:input>
<p:input port="stylesheet">
<p:document href="docbook.xsl"/>
</p:input>
</p:xslt>
</p:pipeline>
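
The same defaulting applies between sibling steps. The sketch below spells out the binding that the processor would otherwise supply automatically when two steps appear in sequence. It assumes that p:xinclude has a primary input named “source” and a primary output named “result”; the step names “expand” and “transform” are illustrative only.

```xml
<p:pipeline xmlns:p="http://www.w3.org/2007/03/xproc">
  <p:xinclude name="expand"/>
  <p:xslt name="transform">
    <!-- This binding may be omitted: the primary output of the
         preceding step is connected to the primary input of this
         step automatically when no explicit binding is given. -->
    <p:input port="source">
      <p:pipe step="expand" port="result"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="docbook.xsl"/>
    </p:input>
  </p:xslt>
</p:pipeline>
```

The “stylesheet” input is not a primary input, so it has to be bound explicitly in either form.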
Options

Some steps accept options. Options are name/value pairs.

An option is
a name/value pair where the name is an
expanded name
and the value must be a string.
If a document, node, or other value is given, its
string value
is computed and that string is used.
The options declared on a
step are its declared options.
All of the options specified on an atomic step must have been declared.
Option names are always expressed as literal values; pipelines cannot
construct option names dynamically.
The options on a step which have
specified values, either because a p:option element specifies
a value or because the declaration included a default value,
are its specified options.

Parameters

Some steps accept parameters. Parameters are name/value pairs.

A parameter is
a name/value pair where the name is an
expanded name
and the value must be a string.
If a document, node, or other value is given, its
string value
is computed and that string is used.
Unlike options, which have names known in advance to the
pipeline, parameters are not declared and their names may be unknown
to the pipeline author. Pipelines can dynamically construct sets of
parameters. Steps can read dynamically constructed sets with
parameter inputs.

Connections

Steps are connected together by their input ports and output ports.
It is a static error if there are any loops in
the connections between steps: no step can be connected to itself
nor can there be any sequence of connections through other steps that
leads back to itself.

Environment

The
environment of a step is the static information
available to each instance of a step in a pipeline.

The environment consists of:

- A set of readable ports.
  The readable ports are the step name/output
  port name pairs that are visible to the step.
  Inputs and outputs can only be connected to
  readable ports.
- A set of in-scope options.
  The in-scope options are the set of options that
  are visible to a step. All of the in-scope options are
  available to the processor for computing option and parameter values.
  The actual options passed to a step are those that are declared for a
  step of its type and that have values either provided explicitly with
  p:option elements on the step or as defaults in the
  declaration of the step.
- A default readable port.
  The default readable port,
  which may be undefined, is a specific step name/port name pair from the set of readable
  ports.

The
empty environment contains no readable ports,
no in-scope options, and an undefined default readable port.
Unless otherwise specified, the environment of a contained step is its
inherited environment.
The inherited
environment of a
contained step is an environment that is the same
as the environment of its container with the
standard modifications.
The
standard modifications
made to an inherited environment are:

- All of the specified options of the
  container are added to the
  in-scope options. The value of any option
  in the environment with the same name as one of the options specified on
  the container is
  shadowed by the new value.
  In other words, steps can access the most recently specified value of
  all of the options specified on
  any ancestor step.
- The union of all the declared outputs of all of the step's sibling
  contained steps is added to the
  readable ports.
  In other words, sibling steps can see each other's outputs
  in addition to the outputs visible to their container.
- If there is a preceding sibling step element:
  - If that preceding sibling has a primary output port,
    then that output port becomes
    the default readable port.
  - Otherwise, the default readable port is undefined.
- If there is not a preceding sibling step element,
  the default readable port is unchanged.

A step with no parent inherits the empty environment.
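
To make the standard modifications concrete, the following sketch annotates each step of a small pipeline with the default readable port it would see. The step names, the file name style.xsl, and the assumption that p:xinclude and p:validate-xml-schema each have a primary output named “result” are illustrative and are not defined in this section.

```xml
<p:pipeline name="main" xmlns:p="http://www.w3.org/2007/03/xproc">
  <p:input port="source" primary="yes"/>
  <p:input port="schemaDoc" sequence="yes" primary="no"/>
  <p:output port="result"/>

  <!-- No preceding sibling step: the default readable port is
       unchanged from the inherited environment. -->
  <p:xinclude name="expand"/>

  <!-- The preceding sibling "expand" has a primary output, so the
       default readable port here is the "result" port of "expand".
       The schema is bound explicitly to another readable port. -->
  <p:validate-xml-schema name="check">
    <p:input port="schema">
      <p:pipe step="main" port="schemaDoc"/>
    </p:input>
  </p:validate-xml-schema>

  <!-- The default readable port is now the "result" port of "check".
       The outputs of all sibling steps are readable, so a p:pipe
       could still name the "result" port of "expand" here. -->
  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="style.xsl"/>
    </p:input>
  </p:xslt>
</p:pipeline>
```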
XPath Extension Functions for XProc

The XProc processor must support a few additional functions in XPath
expressions evaluated by the processor.

System Properties

XPath expressions within a pipeline document can interrogate the processor
for information about the current state of the pipeline. Several aspects of the processor
are exposed through the p:system-property function
in the pipeline namespace:

Function: String p:system-property(String property)

The property string must have the form of a QName; the
QName is expanded into a name using the namespace declarations in scope for the
expression. The p:system-property function returns the string
representing the value of the system property identified by the QName.
If there is no such property, the empty string must be returned.

Implementations must provide the following system
properties, which are all in the XProc namespace:

p:episode
    Returns a string which should be
    unique for each invocation of the pipeline processor.
    The unique identifier must consist of ASCII alphanumeric characters and must
    start with an alphabetic character. Thus, the string is syntactically an XML
    name.
p:product-name
    Returns a string containing the name of the implementation,
    as defined by the implementer. This should normally remain constant from one
    release of the product to the next. It should also be constant across
    platforms in cases where the same source code is used to produce compatible
    products for multiple execution platforms.
p:product-version
    Returns a string identifying the version of the
    implementation, as defined by the implementer. This should normally vary
    from one release of the product to the next, and at the discretion of the
    implementer it may also vary across different execution platforms.
p:vendor
    Returns a string which identifies the vendor of the processor.
p:vendor-uri
    Returns a URI which identifies the vendor of the processor. Often, this is
    the URI of the vendor's web site.
p:version
    Returns the version of XProc implemented by the processor; for processors
    implementing the version of XProc specified by this document, the number is “1.0”.
    The value will always be a string in the lexical space of the decimal data
    type defined in XML Schema Part 2: Datatypes. This allows the value to be
    converted to a number for the purpose of magnitude comparisons.

Step Available

The p:step-available function reports whether or not
a particular type of step is understood by the processor.

Function: Boolean p:step-available(String step-type)

The step-type string must have the form of a QName; the
QName is expanded into a name using the namespace declarations in scope for the
expression. The p:step-available function returns true if and
only if the processor knows how to evaluate steps of the specified type.

Iteration Count

In the context of a p:for-each or a p:viewport,
the p:iteration-count function reports the number of
iterations that have occurred. In the context of other standard XProc
compound steps, it returns 1.

Function: Integer p:iteration-count()

In the context of an extension compound step, the value is
implementation-defined.

Syntax Overview

This section describes the normative XML syntax of XProc. This
syntax is sufficient to represent all the aspects of a pipeline, as
set out in the preceding sections.
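
As a small illustration of this syntax, the sketch below uses the p:step-available and p:system-property extension functions in the test of a p:when. The stylesheet file names are hypothetical, and whether a processor tolerates an unknown step type in a branch that is not taken is not settled by this sketch.

```xml
<p:pipeline xmlns:p="http://www.w3.org/2007/03/xproc">
  <p:choose>
    <!-- Run the XSLT 2.0 step only if the processor reports that it
         can evaluate p:xslt2 and implements XProc version 1.0 or later.
         The version string is converted to a number for comparison. -->
    <p:when test="p:step-available('p:xslt2')
                  and number(p:system-property('p:version')) &gt;= 1.0">
      <p:xslt2>
        <p:input port="stylesheet">
          <p:document href="style-v2.xsl"/>
        </p:input>
      </p:xslt2>
    </p:when>
    <p:otherwise>
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="style-v1.xsl"/>
        </p:input>
      </p:xslt>
    </p:otherwise>
  </p:choose>
</p:pipeline>
```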
Elements in a pipeline document represent the pipeline,
the steps it contains, the connections between those steps, the steps
and connections contained within them, and so on. Each step is represented
by an element; a combination of elements and attributes specifies
how the inputs and outputs of each step are connected and how options and parameters
are passed.

Conceptually, we can speak of steps as objects that have
inputs and outputs, that are connected together and which may
contain additional steps. Syntactically, we need a mechanism
for specifying these relationships.

Containment is
represented naturally using nesting of XML elements. If a particular element
identifies a compound step then the step elements that
are its immediate children form its subpipeline.

The connections
between steps are expressed using names and references to those
names.

Six kinds of things are named in XProc:

- Step types,
- Steps,
- Input ports,
- Output ports,
- Options, and
- Parameters

XProc Namespaces

The XML syntax for XProc uses three namespaces:

http://www.w3.org/2007/03/xproc
    The namespace of the XProc XML vocabulary described by this
    specification; by convention, the namespace prefix
    “p:” is used for this namespace.
http://www.w3.org/2007/03/xproc-step
    The namespace used for documents that are inputs to and outputs
    from several standard and optional steps described in this
    specification. Some steps, such as p:http-request and
    p:store, have defined input or output vocabularies. We use
    this namespace for all of those documents. The conventional prefix
    “c:” is used for this namespace.
http://www.w3.org/2007/03/xproc-error
    The namespace used for error reporting. When a step fails inside a
    p:try, it may produce error messages that can be inspected in the
    p:catch. The error namespace is used for those messages.
    The conventional
    prefix “err:” is used for this namespace.

Scoping of Names

The scope of the names of step types is the pipeline. Each pipeline
processor has some number of built in step types and may declare
(directly, or by reference to an external library) additional step types.
The scope of the names of the steps themselves is determined
by the environment of each step. In general,
the name of a step, the names of its sibling steps, the
names of any steps that it contains directly, the names of its
ancestors, and the names of its ancestors' siblings are all in the same scope.
All in-scope steps must have unique names: it is
a static error if two steps with the same
name appear in the same scope.

The scope of an input or output port name is the step on
which it is defined. The names of all the ports on any step
must be unique.

Taken together, these uniqueness constraints guarantee that the
combination of a step name and a port name uniquely identifies
exactly one port on exactly one in-scope step.

The scope of option names is essentially the same as the scope
of step names, with the following caveat: whereas step names
must be unique, option names may be repeated. An option specified on a
step shadows any specification that may already be in-scope.

Parameter names are not scoped; they are distinct on each step.

Global Attributes

The following attributes may appear on any element
in a pipeline:

- The attribute xml:id with the semantics outlined in
  the xml:id specification.
- The attribute xml:base with the semantics outlined in
  the XML Base specification.

Associating Documents with Ports

A binding associates an input
or output port with some data source.
A document or a sequence of documents can be bound to a port in
four ways: by source, by URI, by providing it inline, or by making it explicitly empty.
Each of these mechanisms is allowed on the
p:input, p:output, p:xpath-context,
p:iteration-source, and p:viewport-source
elements.

Specified by URI

A document is specified
by URI if it is referenced with a URI.
The href
attribute on the p:document element is used to refer
to documents by URI.

In this example, the input to the p:identity step named
“otherstep” comes from “http://example.com/input.xml”.
<p:identity name="otherstep">
<p:input port="source">
<p:document href="http://example.com/input.xml"/>
</p:input>
</p:identity>
It is a dynamic error if the processor
attempts to retrieve the URI specified on a p:document
and fails (for example, if the
resource does not exist or is not accessible with the user's
authentication credentials).

Specified by source

A document is specified
by source if it references a specific port
on another step. The
step and
port attributes on the p:pipe
element are used for this purpose.
In this example, the “source” input to the
p:xinclude step named
“expand” comes from the “result”
port of the step named “otherstep”.

<p:xinclude name="expand">
<p:input port="source">
<p:pipe step="otherstep" port="result"/>
</p:input>
</p:xinclude>
When a p:pipe is used, the specified port must be
in the readable ports of the current environment.
It is a static error if the port specified by a
p:pipe is not in the readable ports of the
environment.

Specified inline

An
inline document is specified directly in
the body of the element that binds it.
The content of the p:inline element is used for this purpose.
In this example, the “stylesheet”
input to the XSLT step named
“xform” comes from the content of the
p:inline element.

<p:xslt name="xform">
<p:input port="stylesheet">
<p:inline>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
...
</xsl:stylesheet>
</p:inline>
</p:input>
</p:xslt>
Inline documents are considered “quoted”; they are not
interpolated or available to the pipeline processor in any way except
as documents flowing through the pipeline.

Specified explicitly empty

An
empty sequence of documents is specified with the
p:empty element.

In this example, the “source”
input to the XSLT 2.0 step named
“generate” is explicitly empty:

<p:xslt2 name="generate">
<p:input port="source">
<p:empty/>
</p:input>
<p:input port="stylesheet">
<p:inline>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
...
</xsl:stylesheet>
</p:inline>
</p:input>
<p:option name="template-name" value="someName"/>
</p:xslt2>
If you omit the binding on a primary input port, a binding to the
default readable port will be assumed. Making the
binding explicitly empty guarantees that the binding will be to an empty
sequence of documents.

Note that a p:input or p:output element
may contain more than one p:pipe, p:document,
or p:inline
element. If more than one binding is provided, then the
specified sequence of documents is made available on that port in the same
order as the bindings.

Documentation

Pipeline authors may add documentation to their pipeline documents
with the p:documentation element. Except when it appears as a descendant of
p:inline, the p:documentation element is
completely ignored by pipeline processors; it exists simply for documentation
purposes. (If a p:documentation is provided as a descendant of p:inline,
it has no special semantics; it is treated literally as part of the document
to be provided on that port.)

Pipeline processors that inspect the contents of p:documentation elements
and behave differently on the basis of what they find are not
conformant. Processor extensions must be
specified with extension elements.
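
The two roles of p:documentation described above can be sketched as follows. The step name “emit” and the element name “doc” are hypothetical; the sketch only assumes, as in the Introduction example, that p:documentation may appear as a child of p:pipeline.

```xml
<p:pipeline xmlns:p="http://www.w3.org/2007/03/xproc">
  <!-- Outside p:inline: ignored by the processor. -->
  <p:documentation>Emits a fixed document.</p:documentation>

  <p:identity name="emit">
    <p:input port="source">
      <!-- Everything inside p:inline is quoted, so the nested
           p:documentation element is ordinary content of the
           document provided on the "source" port. -->
      <p:inline>
        <doc>
          <p:documentation>Part of the inline document.</p:documentation>
        </doc>
      </p:inline>
    </p:input>
  </p:identity>
</p:pipeline>
```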
Extension attributes

An element from the
XProc namespace may have any attribute not from the
XProc namespace, provided that the expanded-QName of the attribute has
a non-null namespace URI. Such an attribute is called an
extension attribute. The presence of
an extension attribute must not cause the connections between steps
to differ from the connections that any other
conformant XProc processor would produce. It must not cause the
processor to fail to signal an error that a conformant processor is
required to signal. This means that an extension attribute must not
change the effect of any XProc element except to the extent that the
effect is implementation-defined or implementation-dependent.

A processor which encounters an extension attribute that it does not
recognize must behave as if the attribute were not present.

Extension elements

The presence of an extension element must not cause the
connections between steps to differ from the connections that any
other conformant XProc processor would produce. It must not cause
the processor to fail to signal an error that a conformant processor
is required to signal. This means that an extension element must not
change the effect of any XProc element except to the extent that the
effect is implementation-defined or implementation-dependent.

There are three contexts in which an extension element might
occur:

In an inline document. All elements in an inline document are considered quoted; no extension element can occur.

In a subpipeline. In a subpipeline, any element in a namespace that is in the set of ignored namespaces is an extension element. Every other element identifies a step.

In any other context, any element that is not in the pipeline namespace is an error.

Ignored namespaces

The element children of a p:pipeline can come from many different
namespaces. Some of the children identify steps in the subpipeline, others may
be extension elements.
In order to determine which elements are extension elements and which
are expected to identify steps, the pipeline may specify a set of
“ignored namespaces”.
The ignored namespaces are a set of namespaces which do not
identify steps. They are ignored by the processor unless the processor
happens to recognize one or more of them as
extension elements.

Syntactically, a pipeline author can specify the set
of ignored namespaces with the
ignore-prefixes attribute. This attribute
can appear on the p:pipeline and p:pipeline-library
elements.
It is a static error if the
ignore-prefixes
attribute appears on any other element in the pipeline namespace.

The value of the ignore-prefixes attribute
is a sequence of tokens, each of which must be the prefix of
an in-scope namespace.
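
A minimal sketch of the attribute in use follows. The ex prefix, its namespace URI, and the ex:note element are hypothetical; the XProc namespace URI is the one used by the sample pipelines in this specification.

```xml
<p:pipeline name="annotated"
            xmlns:p="http://www.w3.org/2007/03/xproc"
            xmlns:ex="http://example.com/ns/annotations"
            ignore-prefixes="ex">
  <p:input port="source" primary="yes"/>
  <p:output port="result" primary="yes"/>

  <!-- In an ignored namespace, so it does not identify a step -->
  <ex:note>Reviewed by the documentation team.</ex:note>

  <p:xinclude/>
</p:pipeline>
```

Because ex is listed in ignore-prefixes, the ex:note child is not taken to identify a step; a processor that does not recognize it as an extension element ignores it.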
It is a static error if any token
specified in the ignore-prefixes attribute is not
the prefix of an in-scope namespace.

Elements in an ignored namespace are only ignored when they appear
as the direct children of the p:pipeline or
p:pipeline-library which specifies the ignored namespaces.

Any ignored namespaces that are specified in a pipeline library
are not inherited by pipelines either within that library or that
import that library; they only apply to the elements that appear as
children of the p:pipeline-library element on which they
are specified.

Syntax Summaries

The description of each element in the pipeline namespace is accompanied
by a syntactic summary that provides a quick overview of the element's
syntax:

For clarity of exposition, some attributes and elements are elided from
the summaries:

An xml:id attribute is allowed on any element. It has the semantics of .

An xml:base attribute is allowed on any element. It has the semantics of .

The p:documentation element is not shown, but it is allowed anywhere.

Attributes that are syntactic shortcuts for option values are not shown.

Steps

This section describes the core steps of XProc.
Every compound step in a pipeline has several parts: a set of inputs, a
set of outputs, a set of options, a set of contained
steps, and an environment.

In previous drafts, inputs, outputs, and options occurred in
a fixed order. In this draft, they may appear in any order (but before the
contained steps). Is that problematic?

Except where otherwise noted, a compound step can have an arbitrary
number of outputs, options, and contained steps.
It is a static error if a
compound step has no contained steps.

p:pipeline

A pipeline is specified by the p:pipeline element. It
encapsulates the behavior of a subpipeline. Its
children declare the inputs, outputs, and options that the pipeline
exposes and identify the steps in its subpipeline.
A pipeline can declare additional steps (e.g., ones that are
provided by a particular implementation or in some
implementation-defined way) and import other pipelines. If a pipeline
has been imported, it may be invoked as a step within the pipeline
that imported it.

Viewed from the outside, a p:pipeline is a
black box which performs some calculation on its inputs and produces
its outputs. From the pipeline author's perspective, the computation
performed by the pipeline is described in terms of contained
steps which read the pipeline's inputs and produce
the pipeline's outputs.

The environment inherited by the contained
steps of a p:pipeline is the empty
environment with these modifications:

All of the declared inputs of the pipeline are added to the readable ports in the environment.

If the pipeline has a primary input port, that input is the default readable port; otherwise the default readable port is undefined.

All of the declared options of the pipeline are added to the in-scope options in the environment.

If the p:pipeline has a primary output port
and that port has no binding, then it is
bound to the primary output port of
the last step in the subpipeline.
It is a static error if the
primary output port has no binding and the last step in the subpipeline does
not have a primary output port.

There are two additional constraints on pipelines:

A p:pipeline must not itself be a contained step.

If a p:pipeline is part of a p:pipeline-library or if it is imported directly with p:import, then it must have a name or a type or both.

If the pipeline initially invoked by the processor has inputs or
outputs, those ports are bound to documents outside of the pipeline in
an implementation-defined manner.

If a pipeline has a type, then that
type may be used as the name of a step to invoke the pipeline. This
most often occurs when it has been imported into another pipeline,
but pipelines may also invoke themselves recursively.
If it does not have a type, then
its name is used to invoke it as a step.

For pipelines that are part of a p:pipeline-library, see
for more details on how
p:pipeline names are used to compute step names.

Examples

A pipeline might accept a document and a stylesheet as input;
perform XInclude, validation, and transformation; and produce the
formatted document as its output.

A Sample Pipeline Document

<p:pipeline name="pipeline" xmlns:p="http://www.w3.org/2007/03/xproc">
  <p:input port="document" primary="yes"/>
  <p:input port="stylesheet"/>
  <p:output port="result" primary="yes"/>
  <p:xinclude/>
  <p:validate-xml-schema>
    <p:input port="schema">
      <p:document href="http://example.com/path/to/schema.xsd"/>
    </p:input>
  </p:validate-xml-schema>
  <p:xslt>
    <p:input port="stylesheet">
      <p:pipe step="pipeline" port="stylesheet"/>
    </p:input>
  </p:xslt>
</p:pipeline>
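
The defaulting rule for the primary output port can also be written out by hand. The following sketch is a variant of the sample pipeline with the validation step omitted for brevity. The step name transform is hypothetical, and the sketch assumes (as the p:for-each example in this section does) that p:xslt writes to a port named result and that this is its primary output.

```xml
<p:pipeline name="pipeline" xmlns:p="http://www.w3.org/2007/03/xproc">
  <p:input port="document" primary="yes"/>
  <p:input port="stylesheet"/>
  <!-- Explicit binding. Without the p:pipe, "result" would be bound to the
       primary output of the last step in the subpipeline. -->
  <p:output port="result" primary="yes">
    <p:pipe step="transform" port="result"/>
  </p:output>
  <p:xinclude/>
  <p:xslt name="transform">
    <p:input port="stylesheet">
      <p:pipe step="pipeline" port="stylesheet"/>
    </p:input>
  </p:xslt>
</p:pipeline>
```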
p:for-each

A for-each is specified by the p:for-each element. It
processes a sequence of documents, applying its
subpipeline to each document in turn.

When a pipeline needs to process a sequence of documents using
a step that only accepts a single document, the p:for-each
construct can be used as a wrapper around the step that accepts
only a single document. The p:for-each will apply that step to
each document in the sequence in turn.

The result of the p:for-each is a sequence of
documents produced by processing each individual document in the input
sequence. If the subpipeline is connected to one or more output ports
on the p:for-each, what appears on each of those ports is
the sequence of documents that is the concatenation of the sequence
produced by each iteration of the loop.

The p:iteration-source is an anonymous input:
its binding
provides a sequence of documents to the p:for-each
step. If no iteration sequence is explicitly provided, then the
iteration source is read from the
default readable port.

A portion of each input document can be selected using the select attribute. If no selection is
specified, the document node of each document is
selected.

Each subtree selected by the p:for-each from each
of the inputs that appear on the iteration source is wrapped in a document
node and provided to the subpipeline.
The processor provides each document, one at a time, to the
subpipeline represented by the children of the
p:for-each on a port named
current.

For each declared output, the processor collects all the
documents that are produced for that output from all the iterations,
in order, into a sequence. The result of the p:for-each on
that output is that sequence of documents.

The environment inherited by the contained steps
of a p:for-each is the inherited environment
with these modifications:

The port named “current” on the p:for-each is added to the readable ports.

The port named “current” on the p:for-each is made the default readable port.

If the p:for-each has a primary output port
and that port has no binding, then it is
bound to the primary output port of
the last step in the subpipeline.
It is a static error if the
primary output port has no binding and the last step in the subpipeline does
not have a primary output port.

Examples

A p:for-each might accept a sequence of chapters as its input,
process each chapter in turn with XSLT, a step that accepts only a
single input document, and produce a sequence of formatted chapters as
its output.

A Sample For-Each

<p:for-each name="chapters">
  <p:iteration-source select="//chapter"/>
  <p:output port="html-results">
    <p:pipe step="make-html" port="result"/>
  </p:output>
  <p:output port="fo-results">
    <p:pipe step="make-fo" port="result"/>
  </p:output>
  <p:xslt name="make-html">
    <p:input port="stylesheet">
      <p:document href="http://example.com/xsl/html.xsl"/>
    </p:input>
  </p:xslt>
  <p:xslt name="make-fo">
    <p:input port="source">
      <p:pipe step="chapters" port="current"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="http://example.com/xsl/fo.xsl"/>
    </p:input>
  </p:xslt>
</p:for-each>
The //chapter elements of the document are
selected. Each chapter is transformed into HTML and into XSL Formatting Objects by
two XSLT steps. The resulting HTML and FO documents are aggregated
and appear on the html-results and fo-results
ports, respectively, of the chapters step itself.

p:viewport

A viewport is specified by the p:viewport element. It
processes a single document, applying its
subpipeline to one or more subsections of the
document.
The result of the p:viewport is a copy of the original
document with the selected subsections replaced by the results of
applying the subpipeline to them.

The p:viewport-source is an anonymous input:
its binding
provides a single document to the p:viewport
step. If no document is explicitly provided, then the
viewport source is read from the
default readable port.

The match attribute specifies an
expression that is a
Pattern
in . Each matching node in the source document
is wrapped in a document node and provided to the viewport's
subpipeline.

The processor provides each document, one at a time, to the
subpipeline represented by the children of the
p:viewport on a port named
current.

What appears on the output from the p:viewport will
be a copy of the input document where each matching node is
replaced by the result of applying the subpipeline to the subtree
rooted at that node.

It is a dynamic error
if the viewport source does not provide exactly one document.
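
As a sketch of an explicit viewport source, the following p:viewport reads its single document from a named port rather than from the default readable port. The step name pipeline, the port name document, and the stylesheet URI are hypothetical. Placing p:viewport-source as the first child mirrors the p:iteration-source in the p:for-each example and is an assumption.

```xml
<p:viewport match="h:div[@class='chapter']"
            xmlns:h="http://www.w3.org/1999/xhtml">
  <!-- Must deliver exactly one document; anything else is a dynamic error -->
  <p:viewport-source>
    <p:pipe step="pipeline" port="document"/>
  </p:viewport-source>
  <!-- Applied to each matching h:div, which arrives on the "current" port -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="http://example.com/xsl/chapter.xsl"/>
    </p:input>
  </p:xslt>
</p:viewport>
```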
The environment inherited by the contained steps
of a p:viewport is the inherited environment
with these modifications:

The port named “current” on the p:viewport is added to the readable ports.

The port named “current” on the p:viewport is made the default readable port.

If the p:viewport has a primary output port
and that port has no binding, then it is
bound to the primary output port of
the last step in the subpipeline.
It is a static error if the
primary output port has no binding and the last step in the subpipeline does
not have a primary output port.

Examples

A p:viewport might accept an XHTML document as its input,
add an hr element at the start of every div element that
has the class value “chapter”,
and return an XHTML document that is the same as the original except
for that change.

A Sample Viewport

<p:viewport match="h:div[@class='chapter']"
            xmlns:h="http://www.w3.org/1999/xhtml">
  <p:insert at-start="true">
    <p:input port="insertion">
      <p:inline>
        <hr xmlns="http://www.w3.org/1999/xhtml"/>
      </p:inline>
    </p:input>
  </p:insert>
</p:viewport>
The nodes which match h:div[@class='chapter'] (according to the
rules of ) in the input document are selected.
An hr is inserted as the first child of each h:div
and the resulting version replaces the original h:div.
The result of the whole step is
a copy of the input document with a horizontal rule as the first child of each
selected h:div.

p:choose

A choose is specified by the p:choose element. It
selects exactly one of a list of alternative
subpipelines based on the
evaluation of expressions.

A p:choose has no inputs. It contains an arbitrary number of
alternative subpipelines,
exactly one of which
will be evaluated.

The list of alternative subpipelines consists of zero or more
subpipelines guarded by an XPath expression, followed optionally by a
single default subpipeline.

The p:choose considers each subpipeline in turn and selects
the first (and only the first) subpipeline for which the guard
expression evaluates to true in its context.
If there are no subpipelines for which the expression evaluates to
true, the default subpipeline, if it was specified, is
selected.

After a subpipeline is selected, it is
evaluated as if only it had been present.

The result of the p:choose
is the result of the selected subpipeline.

In order to ensure that the result of the p:choose is
consistent irrespective of the subpipeline chosen,
each subpipeline must
declare the same number of outputs with the same names. If any of the subpipelines
specifies a primary output port, each subpipeline must
specify exactly the same output as primary.
It is a
static error if two
subpipelines
in a p:choose declare different outputs.

It is a dynamic error
if no subpipeline is selected by the p:choose
and no default is provided.

The p:choose can specify the context node against which
the
expressions that occur on each branch are evaluated. The context
node is specified as a
binding for the
xpath-context. If no binding is provided, the default
xpath-context is the document on the
default readable port.
It is a static error if no binding
is provided and the default readable port is undefined.

It is a dynamic error
if the xpath-context is bound to a sequence of
documents.

Each conditional subpipeline is represented by a
p:when element.

Each p:when branch of the p:choose has a
test attribute which must
contain an
expression. That XPath expression's effective boolean value is the
guard expression for the subpipeline contained
within that p:when.

The p:when can specify a context node against which
its test expression is to be evaluated.
That context node is specified as a binding
for the xpath-context.
If no context is specified on the p:when, the context
of the p:choose is used. It is a
static error if no context is specified in
either the p:choose or the p:when
and the default readable port is
undefined.

The default branch is represented by a
p:otherwise element.

Examples

A p:choose might test the version attribute of the document
element and validate with an appropriate schema.

A Sample Choose

<p:choose name="version">
  <p:when test="/*[@version = 2]">
    <p:validate-xml-schema>
      <p:input port="schema">
        <p:document href="v2schema.xsd"/>
      </p:input>
    </p:validate-xml-schema>
  </p:when>
  <p:when test="/*[@version = 1]">
    <p:validate-xml-schema>
      <p:input port="schema">
        <p:document href="v1schema.xsd"/>
      </p:input>
    </p:validate-xml-schema>
  </p:when>
  <p:when test="/*[@version]">
    <p:identity/>
  </p:when>
  <p:otherwise>
    <p:error code="NOVERSION"
             description="Required version attribute missing."/>
  </p:otherwise>
</p:choose>
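
The sample above evaluates its tests against the default readable port. As a sketch of an explicit context, the following variant binds the xpath-context to a separate document. The step name pipeline and the port name config are hypothetical, and placing p:xpath-context as the first child of p:choose is an assumption based on the description of that element.

```xml
<p:choose name="by-format">
  <!-- The tests below are evaluated against this document,
       not against the default readable port -->
  <p:xpath-context>
    <p:pipe step="pipeline" port="config"/>
  </p:xpath-context>
  <p:when test="/config/output = 'fo'">
    <p:xslt>
      <p:input port="stylesheet">
        <p:document href="fo.xsl"/>
      </p:input>
    </p:xslt>
  </p:when>
  <p:otherwise>
    <p:xslt>
      <p:input port="stylesheet">
        <p:document href="html.xsl"/>
      </p:input>
    </p:xslt>
  </p:otherwise>
</p:choose>
```

Neither branch declares outputs, so both satisfy the requirement that every subpipeline declare the same outputs.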
p:group

A group is specified by the p:group element. It
encapsulates the behavior of its subpipeline.

A p:group is a convenience wrapper for a collection of steps.
The result of a p:group is the result of its subpipeline.

Examples

An Example Group

<p:group>
  <p:option name="db-key" value="some-long-string-of-nearly-random-characters"/>
  <p:choose>
    <p:when test="/config/output = 'fo'">
      <p:xslt>
        <p:parameter name="key" select="$db-key"/>
        <p:input port="stylesheet">
          <p:document href="fo.xsl"/>
        </p:input>
      </p:xslt>
    </p:when>
    <p:when test="/config/output = 'svg'">
      <p:xslt>
        <p:parameter name="key" select="$db-key"/>
        <p:input port="stylesheet">
          <p:document href="svg.xsl"/>
        </p:input>
      </p:xslt>
    </p:when>
    <p:otherwise>
      <p:xslt>
        <p:parameter name="key" select="$db-key"/>
        <p:input port="stylesheet">
          <p:document href="html.xsl"/>
        </p:input>
      </p:xslt>
    </p:otherwise>
  </p:choose>
</p:group>
p:try/p:catch

A try/catch is specified by the p:try element. It
isolates a subpipeline, preventing any errors
that arise within it from being exposed to the rest of the
pipeline.

The p:group represents the initial subpipeline and
the recovery (or “catch”) pipeline is identified with a
p:catch element.

The p:try step
evaluates the initial subpipeline and, if no errors occur, the results of that
pipeline are the results of the step. However, if any errors
occur, it abandons the first subpipeline, discarding any output that
it might have generated, and evaluates the recovery subpipeline.

If the recovery subpipeline is evaluated, the results of the
recovery subpipeline are the results of the p:try step. If
the recovery subpipeline is evaluated and a step within that
subpipeline fails, the p:try fails.

In order to ensure that the result of the p:try is consistent
irrespective of whether the initial subpipeline provides its output or
the recovery subpipeline does, both subpipelines must declare the same
number of outputs with the same names.
If either of the subpipelines
specifies a primary output port, both subpipelines must
specify exactly the same output as primary.
It is a static
error if the p:group and p:catch subpipelines
declare different outputs.

The recovery subpipeline of a p:try
is identified with a p:catch:

The environment inherited by the contained
steps of the p:catch is the inherited
environment with
this modification:

The port named “error” on the p:catch is added to the readable ports.

Should the error port be made the default readable port?

Examples

A pipeline might attempt to process a document by dispatching it
to some web service. If the web service succeeds, then those results
are passed to the rest of the pipeline. However, if the web service
cannot be contacted or reports an error, the p:catch step can
provide some sort of default for the rest of the pipeline.

An Example Try/Catch

<p:try>
  <p:group>
    <p:http-request>
      <p:input port="source">
        <p:inline>
          <c:http-request method="post" href="http://example.com/form-action">
            <c:entity-body content-type="application/x-www-form-urlencoded">
              <c:body>name=W3C&amp;spec=XProc</c:body>
            </c:entity-body>
          </c:http-request>
        </p:inline>
      </p:input>
    </p:http-request>
  </p:group>
  <p:catch>
    <p:identity>
      <p:input port="source">
        <p:inline>
          <c:error>HTTP Request Failed</c:error>
        </p:inline>
      </p:input>
    </p:identity>
  </p:catch>
</p:try>
Other Steps

Other steps are specified by elements that occur as
contained steps and are not in any of the ignored
namespaces.

Other steps can be atomic:

Or compound:

Each atomic step must be the name of a p:pipeline
type or must have been declared with a p:declare-step that
appears in the pipeline, or an imported library, before it is used.
Pipelines can refer to themselves (recursion is allowed),
to pipelines defined in imported libraries, and to other pipelines in the
same library if they are in a library.

If the step element name is the same as the type of a step
declared with p:declare-step, then that step invokes the
declared step.

If the step element name is the same as the type or name of a
p:pipeline, then that step runs the pipeline identified by
that type or name.

It is a static error
if a pipeline contains a step whose specified inputs, outputs, and
options do not match the
signature for steps of that
type.

It is a dynamic error
if the running pipeline attempts to invoke a step which the processor
does not know how to perform.

A pipeline author can make the set of parameters passed to a step explicit
with a parameter input. If the step does not make
an explicit binding for a parameter input, the default could be either to pass
no parameters to the step or to behave as if the parameter input was bound
to the pipeline parameters.

The working group is divided on this issue and this draft does not
provide an answer to that question. Reader feedback is encouraged.

Syntactic Shortcut for Option Values

Namespace qualified attributes on a step are
extension attributes.
Attributes, other than name, that are not
namespace qualified are treated as a syntactic shortcut for
specifying the value of an option. In other words, the following two
steps are equivalent:

The first step uses the standard p:option syntax:

The second step uses the syntactic shortcut:

Note that there are significant limitations to this shortcut syntax:

It only applies to option names that are not in a namespace.

It only applies to option names that are not otherwise used on the step, such as “name”.

It can only be used to specify a constant value. Options that are computed with a select expression must be written using the longer form.

It is a static error if an
option is specified with both the shortcut form and the long form.

It is a static error to use an
option on an atomic step that is not declared on steps of that type.

Other pipeline elements

p:input Element

A p:input identifies an input port for a step, declaring it
if necessary. There are two kinds of inputs that may be declared, ordinary
“document” inputs and “parameter” inputs.
It is a static error to specify
any kind of input other than “document” or “parameter”.

Document Inputs

The port attribute defines the name
of the port. It is a static error to identify
two ports with the same name on the same step.

It is a static error
if the port given does not match the name
of an input port specified in the step's declaration.

On compound steps and p:declare-step, an input
declaration can indicate if a sequence of documents is allowed to
appear on the port and if the port is a primary input port.
If sequence
is specified with the value “yes”, then a sequence is allowed.
If
sequence is not specified,
or has the value “no”, then it is a
dynamic error for a sequence of more than one
document to appear on the declared port.

An input port is a primary input port if
primary is specified with the value “yes”
or if the step has only a single input port and primary is not specified. It is a static error to specify
that more than one input port is the primary.

On p:declare-step, the p:input simply declares the
input port. In all other contexts, the declaration may be
accompanied by a binding
for the input:

If no binding is provided, the input will be bound to the
default readable port.
It is a static error if no binding
is provided and the default readable port is undefined.

A select expression may also be provided. If it is specified, the
select expression is applied to the document(s) that are read.
Each node that matches is wrapped in a document and provided to the
input port. In other words,

<p:input port="source">
  <p:document href="http://example.org/input.html"/>
</p:input>

provides a single document, but

<p:input port="source" select="//html:div">
  <p:document href="http://example.org/input.html"/>
</p:input>

provides a sequence of zero or more documents, one for each matching
html:div in http://example.org/input.html.

A select expression can equally be applied to input read from
another step. This input:

<p:input port="source" select="//html:div">
  <p:pipe step="origin" port="result"/>
</p:input>

provides a sequence of zero or more documents, one for each matching
html:div in the document (or each of the documents)
that is read from the result
port of the step named origin.

It is a
dynamic error if the select
expression on a p:input returns anything other than a possibly empty
set of nodes.

Parameter Inputs

A parameter input port is a distinguished kind of input port. It
exists only to receive computed parameters; if a step does not have a
parameter input port then it cannot receive computed parameters. A
parameter input port must satisfy all the constraints of a normal,
document input.

The port attribute defines the name
of the port. It is a static error to identify
two ports with the same name on the same step.

It is a static error
if the port given does not match the name
of an input port specified in the step's declaration.

When used on a step, parameter input ports always accept a
sequence of documents. If no binding is provided, the parameter input
will be bound to @@TBD.

All of the documents that appear on a parameter input must
either be c:parameter documents or c:parameter-list
documents.

A step which accepts a parameter input reads all of the documents presented
on that port, using each c:parameter (either at the root or inside
the c:parameter-list) to establish the value of the
named parameter. If the same name appears more than once, the last value specified
is used. If the step also has literal p:parameter elements, they
are also considered in document order. In other words, p:parameter elements
that appear before the parameter input may be overridden by the
computed parameters; p:parameter elements that appear after may override the
computed values.

The c:parameter element

A c:parameter represents a parameter on a parameter input.

The name attribute of the c:parameter
must have the lexical form of a QName. If it contains a colon, then its expanded
name is constructed using the namespace declarations in-scope on the c:parameter
element. If it does not contain a colon and the namespace
attribute is specified, then it is an expanded name in the specified namespace.
If the namespace attribute is not specified, its expanded name has no namespace.
It is a
dynamic error if the name attribute
of a c:parameter element contains a colon and a
namespace attribute is specified.

Any extension attributes that appear on the c:parameter element
are ignored.

The c:parameter-list element

A c:parameter-list represents a list of parameters on a
parameter input.

The c:parameter-list contains zero or more c:parameter
elements.
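
For illustration, a parameter list document might look like the following sketch. The parameter names, values, and the ex namespace are hypothetical. Two details are assumptions that this section does not state: that the c prefix is bound to http://www.w3.org/2007/03/xproc-step, and that each c:parameter carries its value in a value attribute.

```xml
<c:parameter-list xmlns:c="http://www.w3.org/2007/03/xproc-step"
                  xmlns:ex="http://example.com/ns/params">
  <!-- No colon and no namespace attribute: the expanded name has no namespace -->
  <c:parameter name="title" value="Annual Report"/>
  <!-- No colon, namespace attribute given: the name is in that namespace -->
  <c:parameter name="theme" namespace="http://example.com/ns/params" value="dark"/>
  <!-- Colon present: expanded using the in-scope declaration for ex -->
  <c:parameter name="ex:depth" value="2"/>
  <!-- Same name as the first entry; the last value specified is used -->
  <c:parameter name="title" value="Annual Report (Draft)"/>
</c:parameter-list>
```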
It is a dynamic error
if the parameter list contains any elements other than
c:parameter.

Any extension attributes that appear on the c:parameter-list element
are ignored.

p:iteration-source Element

A p:iteration-source identifies input to a p:for-each.

The select attribute and
binding
of a p:iteration-source work the same way that they do in a
p:input.

p:viewport-source Element

A p:viewport-source identifies input to a p:viewport.

Only one binding is allowed and it works
the same way that bindings work on a p:input.
It is a
dynamic error for a sequence of more than one
document to appear on the p:viewport-source.
No select expression is allowed.

p:xpath-context Element

A p:xpath-context identifies a context against which
an
expression will be evaluated for a p:when.
Only one binding is allowed and it works
the same way that bindings work on a p:input.
It is a
dynamic error for a sequence of more than one
document to appear on the p:xpath-context.
No select expression is allowed.

It is a dynamic error
if the context is bound to p:empty and the test expression
refers to the context node.

p:output Element

A p:output identifies an output port,
declaring it if necessary.

The port attribute defines the name
of the port. It is a static error to identify
two ports with the same name on the same step.

It is a static error
if the port given does not match the name
of an output port specified in the step's declaration.

An output declaration can indicate if a sequence of documents is
allowed to appear on the declared port. If
sequence is specified with the value “yes”,
then a sequence is allowed.
If sequence is not
specified on p:output, or has the value “no”, then it is a
dynamic error if the step produces a
sequence of more than one document on the declared port.

An output declaration can indicate if it is to be considered the
primary output for the step.
If primary is specified with the value “yes”,
then the named port will be treated as the primary output port.
It is a static error to identify
more than one output port as primary.

On compound steps,
the declaration may be accompanied by a
binding for the output.

It is a static error to
specify a binding for a p:output inside a
p:declare-step.

If a binding is provided for a p:output, documents are
read from that binding and those documents form the
output that is written to the output port. In other
words, placing a p:document inside a p:output causes
the processor to read that document and provide it on
the output port. It does not cause the processor to
write the output to that document.

p:log Element

A p:log element is a debugging aid. It associates a
URI with a specific output port on a step:

The semantics of p:log are that it writes to the specified
URI whatever document or documents appear on the specified port. How a sequence
of documents is represented is implementation-defined.

It is a static error if the
port specified on the p:log is not the name of an output port
on the step in which it appears or if more than one p:log
element is applied to the same port.

Implementations may, at user option, ignore all p:log
elements.

Options and Parameters

p:option Element

The p:option element is used both to declare
options and to establish values for them. It can occur in three
contexts:

On p:declare-step, it declares that the step accepts the named option. It may also provide a default value for the option.

On a compound step, it provides a value for the option, simultaneously declaring it.

On an atomic step, it provides a value for the option, overriding any default specified in the declaration.

Declaring Options

Options are declared on p:declare-step
with p:option:

The name of the option must be a QName. If it does not contain a prefix
then it is in no namespace.
It is a static error to declare
an option in the XProc namespace.

An option may be declared as required
or it may be given a default
value.
It is a static error to specify
that the option is both required and has a default value.

If an option is required, it is a static
error to invoke the step without specifying a value for
that option.

Using Options

Options are used on a step with p:option.
The name of the option must be a QName. If it does not contain a prefix
then it is in no namespace.
It is a static error to declare
an option in the XProc namespace.

The option must be
given a value when it is used.
It is a static error to use an
option on an atomic step that is not declared on steps of that type.

Assigning Values to Options

When an option is declared, it may be given
a default value. When it is used, it must be given
a value.

The value can be specified in two ways: with a
select or
value attribute.

If a select expression is given, it
is evaluated against the document specified
in the
binding
and the string value of the expression becomes the
value of the option. It is a dynamic error
if a document sequence is specified in the binding for a
p:option.

The select expression may refer to
the values of other in-scope options by variable reference.
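
A sketch of both mechanisms follows: one option is computed from an explicitly bound document, and its value is then used by variable reference in the way the p:group example does. The step name pipeline, the port name document, the option name lang, and the stylesheet localize.xsl are hypothetical; writing the binding as a child of p:option is an assumption drawn from the description above.

```xml
<p:group>
  <!-- The string value of the expression, evaluated against the bound
       document, becomes the value of the option -->
  <p:option name="lang" select="/*/@xml:lang">
    <p:pipe step="pipeline" port="document"/>
  </p:option>
  <p:xslt>
    <!-- Variable reference to the in-scope option declared above -->
    <p:parameter name="lang" select="$lang"/>
    <p:input port="stylesheet">
      <p:document href="localize.xsl"/>
    </p:input>
  </p:xslt>
</p:group>
```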
It is a
static error if the variable reference uses a
QName that is not the name of an in-scope option.

If a select expression is used but
no binding is provided, the implicit binding is to the
default readable port.
It is a static error if no binding
is provided and the default readable port is undefined.

If a value attribute is specified,
its content becomes the value of the option.

In the case where the value of an option is a constant, its
value may also be specified on the parent step as specified in
.

It is a static error if the
value is not specified with either select or
value, or if both are specified.

p:parameter Element

The p:parameter element is used to establish the
value of a parameter. The parameter must be given a
value when it is used.

The value can be specified in two ways: with a
select or
value attribute.

If a select expression is given, it
is evaluated against the document specified
in the
binding
and the string value of the expression becomes the
value of the parameter.
It is a dynamic error
if a document sequence is specified in the binding for a
p:parameter.

The select expression may refer to
the values of in-scope options by variable reference.
It is a
static error if the variable reference uses a
QName that is not the name of an in-scope option.

If a select expression is used but
no binding is provided, the implicit binding is to the
default readable port.
It is a static error if no binding
is provided and the default readable port is undefined.

If a value attribute is specified,
its content becomes the value of the parameter.

It is a static error if the
value is not specified with either select or
value, or if both are specified.

Option and Parameter Namespace Bindings

Option and parameter values carry with them not only their literal
or computed string value but also the set of namespaces that were in-scope
on the element which defined them.This is necessary because QName values in options or parameters
may subsequently need to be expanded by the steps which accept them.When values are computed from several sources, the union of the
sets of namespaces must be carried with the new value. The results of computing
the union in the presence of conflicting declarations for a particular prefix
are implementation-dependent.p:declare-step ElementA p:declare-step provides the type and
signature of an atomic step. It declares the
inputs, outputs, and options for all steps of that type.Implementations may use
extension attributes to
provide implementation-dependent information about a declared step.
For example, such an attribute might identify the code which
implements steps of this type.The value of the type can be from
any namespace provided that the expanded-QName of the value has
a non-null namespace URI. It is a
static error
if the type attribute is in no namespace.
If the namespace URI of the type is the
XProc namespace, then the declaration must be exactly as defined in
this specification. Neither users nor implementors may define additional
steps in the XProc namespace.p:pipeline-library ElementA p:pipeline-library is a collection of step
declarations and/or pipeline definitions.
Libraries can import pipelines and/or other libraries.
It is a static error if the import
references in a pipeline or pipeline library are
circular.If the p:pipeline-library specifies a namespace with
the namespace attribute, then all of the
untyped pipelines that occur in the library are in that
namespace.For example, given the following pipeline library:<p:pipeline-library xmlns:p="http://www.w3.org/2007/03/xproc"
namespace="http://example.com/ns/pipelines">
<p:import href="ancillary-library.xml"/>
<p:import href="other-pipeline.xml"/>
<p:pipeline name="validate">
<!-- definition of validate pipeline -->
</p:pipeline>
<p:pipeline name="format" type="my:format"
xmlns:my="http://example.com/vanity/mine">
<!-- definition of format pipeline -->
</p:pipeline>
</p:pipeline-library>
The pipeline named “validate”
is in the namespace
http://example.com/ns/pipelines. That means
that it must be invoked from the importing pipeline with
a qualified name of the form:
…
(Assuming that the “ex” prefix is bound to
http://example.com/ns/pipelines.)The pipeline named “format” has an explicit type so
it must be invoked with a qualified name of the form:
…
(Assuming that the “my” prefix is bound to
http://example.com/vanity/mine.)The pipeline library namespace applies only to pipelines that are
defined directly in the library; it does not apply to pipeline libraries
that are imported or pipelines that are directly imported.p:import ElementA p:import loads a pipeline or pipeline library, making
it available in the pipeline or library which contains the p:import.An import statement loads the specified URI and makes any
pipelines declared within it available to the current pipeline.
An imported pipeline has an implicit signature that consists of the
inputs, outputs, and options declared on it.
It is a dynamic error if the URI
of a p:import cannot
be retrieved or if, once retrieved, it does not point to a
p:pipeline-library or p:pipeline.It is a dynamic error to import
a single pipeline if that pipeline does not have a
name or a type.It is a dynamic error to import
more than one pipeline with the same name or type (either directly or within a
library).p:pipe ElementA p:pipe reads from the output port of another step.The p:pipe element connects to the output port of another step.
It identifies the output port to which it connects with the name of the step in the
step attribute and the name of the port
on that step in the port attribute.In all cases except the p:output
of a compound step, it is a static
error if the port identified by a p:pipe is not
in the readable ports of the
environment of the step that contains the
p:pipe.It is a static error
if the port identified by a p:pipe in the
p:output of a compound step is not
in the readable ports of the environment
inherited by the contained steps of the
compound step.In other words, the output of a compound step must be bound to
the output of one of its contained steps. All
other bindings must be to ports that are already readable in the
current environment.p:inline ElementA p:inline provides a document inline.The content of the p:inline element is wrapped in a document
node and passed as input. The base URI of the document is the base URI of the
p:inline element.The nodes inside a p:inline element naturally inherit
the namespaces that are in-scope at the point where they occur in the
pipeline document. Implementations must ensure that those namespaces
remain in-scope in the resulting document.It is a static error
if the content of the p:inline element is not a well-formed
XML document.p:document ElementA p:document reads an XML document from a URI.The document identified by the URI in the href
attribute is loaded and returned.It is a dynamic error if the document
referenced by a p:document element
does not exist, cannot be accessed, or is not a well-formed XML
document.The parser which the p:document element employs
must be
conformant to Namespaces in XML.
It must not perform validation.
It must not perform any other processing, such as
expanding XIncludes.Use the p:load step if you need to perform DTD-based
validation or wish to perform other processing on the document before it is
used by a step.p:empty ElementA p:empty binds to an empty sequence of documents.p:documentation ElementA p:documentation contains human-readable documentation.There are no constraints on the content of the p:documentation element.
Documentation is ignored by pipeline processors.
ErrorsErrors in a pipeline can be divided into two classes: static
errors and dynamic errors.Static ErrorsA static
error is one which can be detected before pipeline evaluation
is even attempted. Examples of static errors include cycles,
incorrect specification of inputs and outputs, and reference to unknown
steps.Static errors are fatal and must be detected before any steps
are evaluated.Dynamic ErrorsA dynamic
error is one which occurs while a pipeline is being
evaluated. Examples of dynamic errors include references
to URIs that cannot be resolved, steps which fail, and pipelines
that exhaust the capacity of an implementation (such as memory or
disk space).If a step fails due to a dynamic error, failure propagates
upwards until either a p:try is encountered or the entire pipeline
fails. In other words, outside of a p:try, step failure causes
the entire pipeline to fail.Standard Step LibraryThis appendix describes the standard XProc steps. A machine-readable
description of these steps may be found in
pipeline-library.xml.Some steps in this appendix consume or produce an XML vocabulary
defined in this section. In all cases, the namespace for that
vocabulary is http://www.w3.org/2007/03/xproc-step and is
represented by the prefix 'c:' in this appendix.Also, in this section, several steps use this empty element for
result information:The steps described in this draft are intended mainly as a
starting point for discussion and to present a flavor for the sorts of
steps envisioned. The WG has not yet discussed them in detail.Required StepsThis section describes standard steps that must be supported
by any conforming processor.CountThe Count step counts the number of documents in the source input sequence and returns a single document on result containing that number. The generated document contains a single c:result element whose value attribute is the string representation of the number of documents in the sequence.DeleteThe Delete step deletes the matching items from the source input document and produces the resulting document with the deletions on the result port. The matching items are specified by the match pattern in the

match

option. The match pattern may match multiple items to be deleted but nested matches are not considered as their ancestors would already be deleted.
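The following non-normative Python sketch illustrates the Delete behaviour described above. It is an approximation under stated assumptions: the function name `delete_matching` is hypothetical, and the match pattern is expressed in the limited path syntax of Python's `xml.etree.ElementTree` rather than as a full XSLT match pattern.

```python
import copy
import xml.etree.ElementTree as ET


def delete_matching(doc: ET.Element, match: str) -> ET.Element:
    """Return a copy of `doc` with every element selected by `match` removed.

    `match` is an ElementTree path (an assumption of this sketch), evaluated
    relative to the document element.
    """
    result = copy.deepcopy(doc)
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in result.iter() for child in parent}
    for node in result.findall(match):
        parent = parent_of.get(node)
        if parent is None:
            continue  # the document element itself is left alone in this sketch
        siblings = list(parent)
        index = siblings.index(node)
        # Text that follows the deleted element is stored in its tail; keep it.
        if node.tail:
            if index == 0:
                parent.text = (parent.text or "") + node.tail
            else:
                previous = siblings[index - 1]
                previous.tail = (previous.tail or "") + node.tail
        parent.remove(node)
    return result
```

A nested match needs no special handling here: once an ancestor has been removed, removing a descendant from the detached subtree has no effect on the result.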

EqualThe Equal step accepts two, single documents and returns “1” if
they are fn:deep-equal (as defined in
) to each other, “0”
otherwise. The return value is expressed using a c:result
document.

ErrorThe Error step generates an error using the options specified on the step. The error generated can be caught
by a try/catch language construct like any other dynamic error.

Note that this step has no inputs and no outputs. The containing language construct (e.g. a choose construct)
controls whether the error is generated. Since the step generates an error upon invocation, there is no direct output. Instead, the error generates an instance of the err:errors element on the error port just like any other dynamic error.For example, given the following invocation:


The error vocabulary element (and document) generated on the error output port is:
The document element is unknown
Escape MarkupThe Escape Markup step applies XML serialization to the
children of the document element and replaces those children with their
serialization. The outcome is a single element with text content that
represents the "escaped" syntax of the children if they were
serialized.For example, the input:<description>
<div xmlns="http://www.w3.org/1999/xhtml">
<p>This is a chunk of XHTML.</p>
</div>
</description>
produces:<description>
&lt;div xmlns="http://www.w3.org/1999/xhtml">
&lt;p>This is a chunk of XHTML.&lt;/p>
&lt;/div>
</description>
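As a non-normative illustration, the Escape Markup behaviour can be sketched in Python. The function name `escape_markup` is hypothetical, and the sketch relies on the serializer in `xml.etree.ElementTree`, which may choose different namespace prefixes than the input used; it is meant only to show the shape of the operation.

```python
import xml.etree.ElementTree as ET


def escape_markup(doc: ET.Element) -> ET.Element:
    """Replace the children of `doc` with their serialization as text."""
    result = ET.Element(doc.tag, dict(doc.attrib))
    parts = [doc.text or ""]
    for child in doc:
        # tostring() serializes the child together with its trailing text.
        parts.append(ET.tostring(child, encoding="unicode"))
    # The markup is now ordinary character data; it is escaped
    # automatically when `result` itself is serialized.
    result.text = "".join(parts)
    return result
```

The returned element has no element children; serializing it yields the "escaped" form shown in the example above.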
IdentityThe Identity step makes a verbatim copy of its input
available on its output.InsertThe Insert step inserts the insertion port's document as a child of matching elements in the source port's document. The insertion copies the document element of the insertion
document into the position specified by the options on the step. In some cases, multiple insertions may be performed by this step.

The matching nodes in the source document are specified as a match pattern in the

match

option that must target an element. If no match pattern is supplied, the document element is the only match.If the

at-start

option has a value 'yes', the
insertion document will be inserted as the first child(ren) of the element,
otherwise it will be inserted as the last child. If the

at-start

option
is not specified, a value of 'yes' is assumed.As the inserted elements are part of the output of the step they are not considered in determining matching elements.Label ElementsThe Label Elements step labels each element with a unique
xml:id
value. If the element already has an
xml:id value, that value is
preserved. A user may specify the

prefix

and/or

suffix

options for prefixing or suffixing the
generated value of the
xml:id attribute. These prefixes or suffixes do
not affect existing xml:id values.If a

select

option is specified, only elements which match
that XPath 1.0 select expression are modified.If an existing xml:id value conflicts with a previously
generated value, the step fails.It is a dynamic error if
an existing xml:id value conflicts with a
previously generated value and the step fails.
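A non-normative Python sketch of the labelling rules follows. The function name `label_elements` and the generated form `id1`, `id2`, and so on are assumptions of this sketch; the text above does not prescribe how unique values are generated, and the select option is omitted for brevity.

```python
import copy
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def label_elements(doc: ET.Element, prefix: str = "", suffix: str = "") -> ET.Element:
    """Return a copy of `doc` in which every element carries an xml:id."""
    result = copy.deepcopy(doc)
    generated = set()
    counter = 0
    for elem in result.iter():  # document order
        existing = elem.get(XML_ID)
        if existing is not None:
            # Existing values are preserved, but must not clash with a
            # value this step has already generated.
            if existing in generated:
                raise ValueError(f"xml:id {existing!r} conflicts with a generated value")
            continue
        counter += 1
        value = f"{prefix}id{counter}{suffix}"  # hypothetical naming scheme
        elem.set(XML_ID, value)
        generated.add(value)
    return result
```

The `ValueError` stands in for the dynamic error that causes the step to fail.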

LoadThe Load step has no inputs but takes an option that
specifies a URI of an XML resource that should be loaded and provided as
the result.

Load attempts to read an XML document from the specified URI. If the
document does not exist, or is not well-formed, the step fails. Otherwise,
the document read is produced on the result port.If the value of the

validate

option is 'yes', the XML
processor is invoked as a validating XML processor and
DTD validation is performed. If the document is not valid or the step
doesn't support validating processors, the step fails.Namespace RenameThe Namespace Rename step renames any namespace declaration or
use of a namespace in a document to a new URI value. The source
namespace is identified by the

from

option and the
target namespace is identified by the

to

option.If the

from

option is the empty string, or is
not specified, then elements and attributes in no namespace are
renamed. If the

to

option is the empty string, or is
not specified, then elements and attributes in the specified

from

namespace are renamed into no namespace.If the XML namespace
(http://www.w3.org/XML/1998/namespace) or the XMLNS namespace
(http://www.w3.org/2000/xmlns/) is used in either the

from

or

to

options, the step fails.
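The renaming rules above can be sketched, non-normatively, in Python. The function names are hypothetical; the sketch works on the expanded names that `xml.etree.ElementTree` exposes (`{uri}local`) and therefore says nothing about which prefixes appear in the serialized result.

```python
import copy
import xml.etree.ElementTree as ET

RESERVED = {
    "http://www.w3.org/XML/1998/namespace",
    "http://www.w3.org/2000/xmlns/",
}


def _rename(name: str, from_ns: str, to_ns: str) -> str:
    """Move an expanded name from `from_ns` to `to_ns` ("" means no namespace)."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
    else:
        uri, local = "", name
    if uri != from_ns:
        return name
    return f"{{{to_ns}}}{local}" if to_ns else local


def namespace_rename(doc: ET.Element, from_ns: str = "", to_ns: str = "") -> ET.Element:
    """Return a copy of `doc` with names in `from_ns` moved to `to_ns`."""
    if from_ns in RESERVED or to_ns in RESERVED:
        raise ValueError("the XML and XMLNS namespaces cannot be renamed")
    result = copy.deepcopy(doc)
    for elem in result.iter():
        elem.tag = _rename(elem.tag, from_ns, to_ns)
        renamed = {_rename(k, from_ns, to_ns): v for k, v in elem.attrib.items()}
        elem.attrib.clear()
        elem.attrib.update(renamed)
    return result
```

The `ValueError` stands in for step failure. When the two arguments are equal, every name maps to itself and the copy is unchanged.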

It is not an error to specify the same namespace in the

from

and

to

options, but it will
have no observable effect.ParametersThe Parameters step exposes a set of parameters as a sequence of
c:parameter documents.Each parameter passed to the step is converted into a
c:parameter element and written to the
result port as a document. The step resolves
duplicate parameters in the normal way; the order in which the
parameters are written is implementation dependent.For consistency and user convenience, if any of the parameters
have names that are in a namespace, the
namespace attribute on the
c:parameter element must be used. Each
name will be an NCName.RenameThe Rename step renames elements, attributes, or
processing-instruction targets in a document based on option
values.

Each element, attribute, or processing-instruction matched by
the match pattern specified in the

match

option is renamed
to the name specified by the

name

option.The step fails if the specified name is not a valid name or if the
renaming would introduce a syntactic error into the document (i.e., if it would
create two attributes with the same name on the same element).ReplaceThe Replace step replaces a matching node in the
source input port's document with the
document element of the replacement port's document. The
result is a single document with all the matching nodes so replaced.
The elements to be replaced are specified by the match pattern
in the

match

option. If there are multiple matches,
only non-nested matches are replaced. That is, once an element is
replaced, its descendants cannot be matched.

Set AttributesThe Set Attributes step sets attribute values on the matching
elements using the attribute values provided on the
attributes port's document element. That is, it copies
the attributes on the document element on the attributes
port to the matching elements found in the source port's
document. If the same attribute exists on the matching element, the
value specified in the attribute port's document is used.
The result port of this step produces a copy of the
source port's document with the matching elements'
attributes modified.The matching elements are specified by the match pattern in the

match

option that must target an element. If there
are multiple matches, all elements are processed.
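A non-normative Python sketch of this behaviour follows. The function name `set_attributes` is hypothetical, and the match pattern is assumed to be written in the limited path syntax of `xml.etree.ElementTree`.

```python
import copy
import xml.etree.ElementTree as ET


def set_attributes(source: ET.Element, attributes: ET.Element, match: str) -> ET.Element:
    """Copy the attributes of `attributes` onto each element of `source` selected by `match`."""
    result = copy.deepcopy(source)
    for elem in result.findall(match):
        for name, value in attributes.attrib.items():
            # A value from the attributes document replaces any existing value.
            elem.set(name, value)
    return result
```

For example, with a source of `<doc><p class="old"/><p/></doc>`, an attributes document of `<attrs class="new" lang="en"/>`, and the path `.//p`, both `p` elements end up with `class="new"` and `lang="en"`.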

SinkThe Sink step accepts a sequence of documents and discards them. It has
no output.Split SequenceThe Split Sequence step accepts a sequence of documents and
produces two subsequences of that input by applying a test expression to
each document. For each document on the source input port, if the
XPath expression provided by the

test

option
evaluates to true, the document is reproduced on the “matched” output
port. If the expression evaluates to false, the document is reproduced
on the “notmatched” output port.
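The partitioning can be sketched, non-normatively, in Python. The function name `split_sequence` is hypothetical, and the XPath test expression is replaced by an ordinary predicate that receives the document together with its position and the sequence size, standing in for the XPath context position and context size.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Doc = TypeVar("Doc")


def split_sequence(
    source: Iterable[Doc],
    test: Callable[[Doc, int, int], bool],
) -> Tuple[List[Doc], List[Doc]]:
    """Partition `source` into (matched, notmatched), preserving order."""
    documents = list(source)  # buffered so that the size is known up front
    size = len(documents)
    matched: List[Doc] = []
    not_matched: List[Doc] = []
    for position, document in enumerate(documents, start=1):
        target = matched if test(document, position, size) else not_matched
        target.append(document)
    return matched, not_matched
```

A predicate such as `lambda doc, position, size: position < size` sends every document except the last to the matched list. A predicate that never consults `size` would not need the buffering step.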

When the test XPath expression is evaluated, the XPath context
position is always bound to the position of the source document in the
input sequence. The XPath context size is bound to the total number of
documents in the sequence.In principle, this component cannot stream because it must buffer all
of the input sequence in order to find the context size. In practice, if the
test expression does not use the last() function, the
implementation can stream and ignore the context size.String ReplaceThe String Replace step matches nodes in the document
provided on the source input port and replaces them with the string result of evaluating
an XPath expression. The matched nodes are specified by the match pattern
in the

match

option. For each matching node, the
XPath expression provided by the

replace

option is
applied and the string value that results is used in constructing the
node's replacement in the output.The output of this step is much like an identity transformation with exceptions for the set of matching nodes as follows:For any matching node, the string value S is the result from evaluating the

replace

XPath expression with the matching node as the context node.For a matching node that is not an attribute, the node is replaced by the string value S in the output document.For a matching node that is an attribute, the string value of the attribute is replaced by the string value S in the output document.

StoreThe Store step stores a serialized version of its input
to a URI. The URI is either specified explicitly by the 'href' option
or implicitly by the base URI of the document. This step outputs
a reference to the location of the stored document.

The step attempts to store the XML document to the specified URI. If that URI scheme is not supported or such storage is not allowed,
the step fails.The output of this step is a document containing a single c:result element whose href attribute contains the same value as the href option.The 'method' option controls the serialization method used by this component with standard values of 'html', 'xml', 'xhtml', and 'text'.A more direct “serialize-to-octet-stream” step may also be required.
One, for example, that supports the XSLT 2.0/XQuery 1.0 Serialization
specification.We need a reference to the serialization step and methods.Unescape MarkupThe Unescape Markup step takes the text value of the document
element and parses the content as if it were a Unicode character stream
containing XML. The outcome is a single element with children from the
parsing of the XML content. This is the reverse of the
p:escape-markup
step.When the text value is parsed, a document element wrapper must
be assumed so that element siblings can be parsed back into XML.
Further, if the 'namespace' option is specified, the default
namespace is declared on that wrapper element.If the 'content-type' option is specified, an implementation
can use a different parser to produce XML content. Such a behavior is
implementation defined. For example, for the mime type 'text/html', an
implementation might provide an HTML to XHTML parser (e.g. Tidy).

For example, with the 'namespace' option set to the XHTML namespace, the following input:<description>
&lt;p>This is a chunk.&lt;/p>
&lt;p>This is another chunk.&lt;/p>
</description>
would produce:<description>
<p xmlns="http://www.w3.org/1999/xhtml">This is a chunk.</p>
<p xmlns="http://www.w3.org/1999/xhtml">This is another chunk.</p>
</description>
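The wrapper-element technique described above can be sketched, non-normatively, in Python. The function name `unescape_markup` and the wrapper element name are hypothetical, and the content-type option is not modelled.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import quoteattr


def unescape_markup(doc: ET.Element, namespace: Optional[str] = None) -> ET.Element:
    """Parse the text value of `doc` as XML and return `doc` with the parsed children."""
    text = "".join(doc.itertext())
    # A temporary wrapper lets several sibling elements be parsed at once;
    # the default namespace, if any, is declared on that wrapper.
    ns_decl = f" xmlns={quoteattr(namespace)}" if namespace else ""
    wrapper = ET.fromstring(f"<wrapper{ns_decl}>{text}</wrapper>")
    result = ET.Element(doc.tag, dict(doc.attrib))
    result.text = wrapper.text
    result.extend(list(wrapper))
    return result
```

If the text is not well-formed, `ET.fromstring` raises `ParseError`, which this sketch leaves to propagate.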
UnwrapThe Unwrap step matches a certain number of elements and
replaces them with their children. The source port input
document is processed by applying the match pattern specified by the

match

option. If the match is an element, the element is
replaced with its children in the output document produced on the
result port. A single document is produced and if
unwrapping causes a non-well-formed document (e.g. more than one
document element), the step fails.
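A non-normative Python sketch of unwrapping follows, using `xml.dom.minidom`. The function name `unwrap` is hypothetical, and two simplifications are assumed: elements are matched by tag name rather than by a match pattern, and a match on the document element is always rejected instead of being checked for well-formedness.

```python
from xml.dom import minidom


def unwrap(xml_text: str, name: str) -> str:
    """Replace every element called `name` with its children."""
    dom = minidom.parseString(xml_text)
    for elem in list(dom.getElementsByTagName(name)):
        parent = elem.parentNode
        if parent.nodeType == parent.DOCUMENT_NODE:
            # Simplification: unwrapping the document element is not handled.
            raise ValueError("cannot unwrap the document element in this sketch")
        # Move each child in front of the matched element, then drop the
        # now-empty element; insertBefore detaches the child from `elem`.
        while elem.firstChild is not None:
            parent.insertBefore(elem.firstChild, elem)
        parent.removeChild(elem)
    return dom.documentElement.toxml()
```

For instance, unwrapping `b` in `<a>x<b>y<c/>z</b>w</a>` leaves the text and the `c` element in place as children of `a`.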

WrapThe Wrap step wraps matching items in the source port's input document with a new element. The document is processed by applying the match pattern specified by the

match

option. For each match, the match is wrapped with a new element in the output document. The

wrapper

option is used to specify the name of the element and takes a QName value whose prefixes are resolved as specified in . A single document is produced on the result output port.If the

group-by

option is specified, adjacent matches are grouped by evaluating the XPath specified by this option with the context node set to the matched node. If the XPath evaluates to the same string value, the matches are grouped into the same wrapper element. While processing the document for grouping, two matches are considered adjacent if they are adjacent siblings or if all intervening siblings are whitespace text, comment, or processing-instruction nodes.

Wrap SequenceThe Wrap Sequence step converts the sequence of documents on the
source port into a single document on the
result port. The document produced has a document
element whose name is specified via the wrapper option
and whose children are the contents of the documents received
on source port, in the order they were received. All of the
top-level nodes (white space, comments, processing-instructions,
and one element node) of each document are added to the children of the
new

wrapper

document. The

wrapper

option takes a QName value whose prefixes are resolved as specified in If the

group-by

option is specified, adjacent documents in the sequence are grouped by evaluating the XPath specified by this option with the context node set to the document node. If the XPath evaluates to the same string value, the documents are grouped into the same wrapper element.
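The ungrouped case can be sketched, non-normatively, in Python. The function name `wrap_sequence` is hypothetical. Because each input is represented here by its document element only, top-level comments and processing instructions are not carried over, and the group-by option is not modelled.

```python
import copy
import xml.etree.ElementTree as ET
from typing import Iterable


def wrap_sequence(documents: Iterable[ET.Element], wrapper: str) -> ET.Element:
    """Return a new `wrapper` element containing a copy of each document, in order."""
    root = ET.Element(wrapper)
    for document in documents:
        # Copy so that the input documents are left untouched.
        root.append(copy.deepcopy(document))
    return root
```

An empty input sequence yields an empty wrapper element.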

XIncludeThe XInclude step applies XInclude processing semantics
to the document. The referenced documents are calculated against the
base URI and are not provided as input to the step.It is a dynamic error if
an XInclude error occurs during processing and the step fails.XSLTThe XSLT step applies an XSLT 1.0 transformation to a document.
The transformation is supplied by a single document on the input port
named 'transform'. That transformation is applied to the primary
source document supplied on the input port named 'source'. The result
of the transformation is a sequence of documents on its 'result'
port.All of the specified parameters are made available to the XSLT
processor. If the XSLT processor signals a fatal error, the step
fails. Otherwise, the result of the transformation is produced on the
result port.Optional StepsHTTP RequestThe HTTP Request step provides interactions with resources identified by URIs over HTTP or HTTPS. The input document provided on the source port specifies the request by a single c:http-request element. This element specifies the method, resource, and other request properties as well as possibly including an entity body (content) for the request.When the request is formulated, the step and/or protocol implementation may add headers as necessary to either complete
the request or as appropriate for the content specified (e.g. transfer encodings). A user of this step is guaranteed that their requested headers and content will be sent with the exception of any conflicts with protocol-related headers. If the user of the step requests a header value (e.g. content-type) that conflicts with a value the step and/or protocol implementation must set, the step will fail.The response received after making the request is handled as follows:A single c:http-response element is produced on the output port result with the status attribute containing the status of the response received.Each response header whose name does not start with "Content-" is translated into a c:header element.Unless the status-only attribute has a value 'yes', the entity body of the response is converted into
a c:body or c:multipart element via the rules given in this section.A request or response may be multipart per RFC 1521. In those situations, the entity is represented by a c:multipart element that contains multiple c:body elements inside. In the case of a request, the media type of the c:multipart must be a multipart media type (i.e. have a main type of 'multipart').If the media type of the response is a text type with a charset parameter that is a Unicode character encoding, the content of the constructed c:body element is the translation of the text into a Unicode character sequence.If the media type of the response is an XML media type, the content of the constructed c:body element is the result of parsing the body with an XML parser. If the content is not well-formed, the step fails.For all other media types, the response is encoded as base64 and produced as text children of the c:body element.In the case of a multipart response, the same rules apply when constructing a c:body element for each body part encountered.The following sections describe the structure of the request and the response.c:http-requestAn HTTP request is represented by a c:http-request element.The method attribute specifies the method to be used against the URI specified by
the href attribute. If the href attribute is a relative URI, it will resolve
against the base URI of the element.Both the status-only and override-content-type attributes are not used in formulating
the response. The override-content-type attribute controls interpretation of the response's content-type. If
this attribute is specified, the response will be treated as if it returned that content-type. If
the override-content-type value cannot be used (e.g. text/plain to override image/png), the
step fails. The original media type value will still be provided in the response XML.Also, if the status-only attribute has the value yes, the entity of the response will not
be processed to produce a c:body or c:multipart element.c:headerA c:header is a name-value pair passed in the headers of the
request or its response.c:multipartThe c:multipart element holds a set of body parts for a message request or response.If the content-type attribute is not specified, a value of "multipart/mixed" will be assumed.c:bodyThe c:body element holds the body or body part of the message. Each of the attributes controls some aspect of the encoding of the body or body part when the request is formulated. These are specified as follows:The content-type attribute specifies the media type of the content. If the media type is neither an XML type nor text, the content must already be base64 encoded.The encoding attribute specifies the request's Content-Transfer-Encoding header. If the value of encoding is 'base64' but the content type does not require such an encoding, the c:body element is assumed to contain base64 encoded content of that media type.The id attribute specifies the value of the Content-ID header for body parts.The description attribute specifies the value of the Content-Description header for body parts.Content of any XML media type is assumed to appear as children of the c:body element in unescaped form. That is, the following is the request that sends a small XML document as the entity's body:

option is “yes”, then
the conventions of the RELAX NG DTD
Compatibility (@@ cite) are also applied.XML Schema ValidateThe XML Schema Validate step applies XML Schema's validity assessment
to an XML document input. The set of known available schemata are specified via a sequence of documents provided on the schema input port.If the

assert-valid

option has a value of yes, the step will fail if the XML Schema validation assessment reports any errors. When XML Schema validation assessment is performed, the processor is invoked in the mode specified by the

mode

option. The result of the assessment produces an infoset with the PSVI annotations if the pipeline implementation supports such annotations. Otherwise, the input document is reproduced with any defaulting of attributes and elements performed as specified by the XML Schema recommendation.

XSLT 2.0The XSLT 2.0 step applies an XSLT 2.0 transformation to a document. The transformation is supplied on the port named 'transform' and the primary input document is supplied on the port named 'document'. The application of
the transformation produces the primary result document on the port named 'result'.

If a sequence of documents is provided on the input port named 'source', the first document is assumed to be the primary input document. By default, this sequence is also the default collection unless the 'allow-collections' option is set to 'no'.All of the specified parameters are made available to the
XSLT processor. If the XSLT processor signals a fatal error, the step fails.
Otherwise, the result of the transformation is produced on the
result port.The invocation of the transformation is controlled by the 'initial-mode' and 'template-name' options that set the initial mode and/or named template in the XSLT transformation that should initiate processing. If these values do not match the transformation specified, a dynamic error must be thrown.The 'allow-version-mismatch' option indicates whether an XSLT 1.0 transformation should be allowed to be run through the XSLT 2.0 processor. A value of 'yes' means that it should be allowed.The 'output-base-uri' option sets the context's output base URI per the XSLT 2.0 specification.If more than one document is produced, the secondary result documents are produced on
the output port named 'secondary'. Otherwise, the 'secondary' port produces an empty sequence.XSL FormatterThe XSL Formatter step receives an XSL FO document and
renders the content. The result of rendering is stored to the
URI provided via the 'uri' option. A reference to that
result is produced on the output port.The output content type is controlled by the 'output'
option which contains the mime type of the output format. A formatter
may take any number of optional rendering parameters via the step's
parameters. Such parameters are defined by the XSL implementation used and
are implementation defined.

The output of this step is a document containing a single c:result whose href attribute points to the output of the XSL formatter.XQuery 1.0The XQuery 1.0 step applies an XQuery to a sequence of documents treated as the default collection. The 'source' input port allows a sequence of documents and specifies each document that should be in the default collection. The result of the XQuery is a sequence of documents constructed from an XPath 2.0 sequence of elements. Each element in the sequence is assumed to be the document element of a separate document. It is an error if the sequence contains items other than elements. The 'query' port must receive a single document whose document element is 'query' in the step vocabulary namespace. As XQuery is not necessarily well-formed XML, the text descendants of this element are considered the query.For example:
declare namespace atom="http://www.w3.org/2005/Atom";
/atom:feed/atom:entry
ConformanceConformant processors must implement all of the features
described in this specification except those that are explicitly identified
as optional.Some aspects of processor behavior are not completely specified; those
features are either implementation-dependent or
implementation-defined.An
implementation-dependent feature is one where the
implementation has discretion in how it is performed.
Implementations are not required to document or explain
how implementation-dependent features are performed.An
implementation-defined feature is one where the
implementation has discretion in how it is performed.
Conformant implementations must document
how implementation-defined features are performed.ReferencesXML Core ReqXML
Processing Model Requirements.
Dmitry Lenkov, Norman Walsh, editors. W3C
Working Group Note 05 April 2004
InfosetXML
Information Set (Second Edition). John Cowan,
Richard Tobin, editors. W3C Recommendation 04 February 2004.
XML 1.0Extensible
Markup Language (XML) 1.0 (Fourth Edition). Tim Bray,
Jean Paoli, C. M. Sperberg-McQueen, et al.,
editors. W3C Recommendation 16 August 2006.Namespaces 1.0Namespaces
in XML 1.0 (Second Edition). Tim Bray,
Dave Hollander, Andrew Layman, et al.,
editors. W3C Recommendation 16 August 2006.XML 1.1Extensible
Markup Language (XML) 1.1 (Second Edition). Tim Bray,
Jean Paoli, C. M. Sperberg-McQueen, et al.,
editors. W3C Recommendation 16 August 2006.Namespaces 1.1Namespaces
in XML 1.1 (Second Edition). Tim Bray,
Dave Hollander, Andrew Layman, et al.,
editors. W3C Recommendation 16 August 2006.XPath 1.0XML Path Language (XPath)
Version 1.0. James Clark and Steve DeRose, editors.
W3C Recommendation. 16 November 1999.XSLT 1.0XSL Transformations (XSLT)
Version 1.0. James Clark, editor.
W3C Recommendation. 16 November 1999.XPath 2.0XML Path Language (XPath)
2.0. Anders Berglund, Scott Boag, Don Chamberlin, et. al., editors.
W3C Recommendation. 23 January 2007.XPath 2.0 Functions and OperatorsXQuery 1.0 and
XPath 2.0 Functions and Operators.
Ashok Malhotra, Jim Melton, and Norman Walsh, editors.
W3C Recommendation. 23 January 2007.XSLT 2.0XSL Transformations (XSLT)
Version 2.0. Michael Kay, editor.
W3C Recommendation. 23 January 2007.XSL 1.1Extensible Stylesheet
Language (XSL) Version 1.1.
Anders Berglund, editor. W3C Recommendation. 5 December 2006.XQuery 1.0XQuery 1.0: An XML
Query Language. Scott Boag, Don Chamberlin, Mary Fernández, et. al.,
editors. W3C Recommendation. 23 January 2007.RELAX NGISO/IEC JTC 1/SC 34.
ISO/IEC FDIS 19757-2:2002(E) Document Schema Definition
Languages (DSDL) — Part 2: Grammar-based validation — RELAX NG
2002.
[Schematron] ISO/IEC JTC 1/SC 34.
ISO/IEC FDIS 19757-3:2002(E) Document Schema Definition
Languages (DSDL) — Part 3: Rule-based validation — Schematron
2004.
[W3C XML Schema: Part 1] XML Schema Part 1:
Structures Second Edition.
Henry S. Thompson, David Beech, Murray Maloney, et al., editors.
W3C Recommendation. 28 October 2004.
[W3C XML Schema: Part 2] XML Schema Part 2:
Datatypes Second Edition.
Paul V. Biron and Ashok Malhotra, editors.
W3C Recommendation. 28 October 2004.
[xml:id] xml:id
Version 1.0. Jonathan Marsh, Daniel Veillard, and Norman Walsh, editors.
W3C Recommendation. 9 September 2005.
[XInclude] XML Inclusions
(XInclude) Version 1.0 (Second Edition). Jonathan Marsh,
David Orchard, and Daniel Veillard, editors.
W3C Recommendation. 15 November 2006.
[XML Base] XML Base.
Jonathan Marsh, editor.
W3C Recommendation. 27 June 2001.
[RFC 1521] RFC 1521:
MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for
Specifying and Describing the Format of Internet Message
Bodies. N. Borenstein, N. Freed, editors. Internet
Engineering Task Force. September 1993.
[RFC 2616] RFC 2616:
Hypertext Transfer Protocol — HTTP/1.1.
R. Fielding, J. Gettys, J. Mogul, et al., editors. Internet
Engineering Task Force. June 1999.
[RFC 3548] RFC 3548:
The Base16, Base32, and Base64 Data Encodings.
S. Josefsson, editor. Internet
Engineering Task Force. July 2003.

Glossary

atomic step: An
atomic step is a step that performs a unit of
XML processing, such as XInclude or transformation, and has no
internal subpipeline.
binding: A binding associates an input
or output port with some data source.
by URI: A document is specified
by URI if it is referenced with a URI.
by source: A document is specified
by source if it references a specific port
on another step.
compound step: A compound step is a step that
contains additional steps. That is, a compound step differs from an
atomic step in that its semantics are at least partially determined by the
steps that it contains.
contained steps: The steps that occur
directly inside a compound step
are called contained steps.
container: A compound step which immediately
contains another step is called its container.
declared inputs: The input ports declared on
a step are its declared inputs.
declared options: The options declared on a
step are its declared options.
declared outputs: The output ports declared on a
step are its declared outputs.
default readable port: The default readable port,
which may be undefined, is a specific step name/port name pair from the set of readable
ports.
dynamic error: A dynamic
error is one which occurs while a pipeline is being
evaluated.
empty environment: The
empty environment contains no readable ports,
no in-scope options, and an undefined default readable port.
empty sequence: An
empty sequence of documents is specified with the
p:empty element.
environment: The
environment of a step is the static information
available to each instance of a step in a pipeline.
extension attribute: An element from the
XProc namespace may have any attribute not from the
XProc namespace, provided that the expanded-QName of the attribute has
a non-null namespace URI. Such an attribute is called an
extension attribute.
implementation-defined: An
implementation-defined feature is one where the
implementation has discretion in how it is performed.
Conformant implementations must document
how implementation-defined features are performed.
implementation-dependent: An
implementation-dependent feature is one where the
implementation has discretion in how it is performed.
Implementations are not required to document or explain
how implementation-dependent features are performed.
in-scope options: The
in-scope options are the set of options that
are visible to a step.
inherited environment: The inherited
environment of a
contained step is an environment that is the same
as the environment of its container with the
standard modifications.
inline document: An
inline document is specified directly in
the body of the element that binds it.
last step: The last step in a
subpipeline is the last step in document order within its container.
matches: A step
matches its signature if and only if it specifies
an input for each declared input, it specifies no inputs that are not
declared, it specifies
an option for each option that is declared to be required, and it
specifies no options that are not declared.
option: An option is
a name/value pair where the name is an
expanded name
and the value must be a string.
parameter: A parameter is
a name/value pair where the name is an
expanded name
and the value must be a string.
pipeline: A pipeline
is a set of connected steps, outputs flowing into inputs, without any loops (no step can
read its own output, directly or indirectly).
primary input port: If a step has exactly
one input port, or if one of its input ports is explicitly designated
as the primary, then that input port is the primary input
port of the step.
primary output port: If a step has exactly
one output port, or if one of its output ports is explicitly
designated as the primary, then that output port is the
primary output port of the step.
readable ports: The
readable ports are the step name/output
port name pairs that are visible to the step.
signature: The
signature of a step is the set of inputs,
outputs, and options that it is declared to accept.
specified options: The options on a step which have
specified values, either because a p:option element specifies
a value or because the declaration included a default value,
are its specified options.
static error: A static
error is one which can be detected before pipeline evaluation
is even attempted.
step: A step is the
basic computational unit of a pipeline. Steps are either atomic or
compound.
subpipeline: The steps (and the
connections between them) within a compound step form a
subpipeline.

Pipeline Language Summary

This appendix summarizes the XProc pipeline language. Machine-readable
descriptions of this language are available in
RELAX NG (and the
RELAX NG
compact syntax),
W3C XML Schema,
and
DTD syntaxes.

The core steps and the step vocabulary elements are also summarized here.

The Error Vocabulary

In general, it is very difficult to predict error behavior. Component failure
may be catastrophic (programmer error), or it may be the result of user error,
resource failures, etc. Steps may detect more than one error, and the failure of
one step may cause other steps to fail as well.

The p:try/p:catch mechanism gives pipeline authors
the opportunity to process the errors that caused the p:try to fail.
In order to facilitate some modicum of interoperability among processors, errors
that are reported on the #error port of a
p:catch should conform to the format described
in this appendix.

The elements in this vocabulary are in the http://www.w3.org/2007/03/xproc-error
namespace, represented by the prefix 'err:' in this appendix.

err:errors

The error vocabulary consists of a root element, err:errors,
which contains zero or more err:error elements.

err:error

Each specific error is represented by an err:error element.

The name and type
attributes identify the name and type, respectively, of the step which failed.

The code is a QName which identifies the error.
For steps which have defined error codes, this is an opportunity for the step
to identify the error in a machine-processable fashion. Many steps omit this
because they do not include the concept of errors identified by QNames.

If the error was caused by a specific document, or by the location of some
erroneous construction in a specific document, the
href, line,
column, and offset
attributes identify this location. Generally, the error location is identified
either with line and column numbers or with an offset from the beginning of
the document, but not usually both.

The content of the err:error element is any well-formed XML.
Specific steps, or specific implementations, may provide more detail about the
format of the content of an error message.

Error Example

Consider an XSLT stylesheet that reports the following message:

This stylesheet is pointless.

If it were used in a step named “xform” in a p:try,
an error document carrying the same message might be produced:

This stylesheet is pointless.

It is not an error for steps to generate non-standard error output as long
as it is well-formed.
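
As a rough sketch of the vocabulary described in this appendix, an error document for the “xform” step might look something like the following. The err: namespace and the attribute names come from the text. The ex: prefix, its namespace URI, the step type, the file name, and the line number are hypothetical values chosen purely for illustration.

```xml
<!-- Sketch only. The ex: prefix, its namespace URI, the step type
     ex:transform, the file name style.xsl, and the line number are
     hypothetical placeholders, not values defined by this specification. -->
<err:errors xmlns:err="http://www.w3.org/2007/03/xproc-error"
            xmlns:ex="http://example.com/ns/steps">
  <!-- name/type identify the failed step; href/line locate the problem.
       The optional code attribute is omitted, as many steps do. -->
  <err:error name="xform" type="ex:transform"
             href="style.xsl" line="6">This stylesheet is pointless.</err:error>
</err:errors>
```

Because err:errors may contain zero or more err:error elements, a consumer reading the #error port should be prepared for an empty root element as well as for several errors in a single document.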