This document defines the XML-binary Optimized Packaging (XOP)
convention, a means of more efficiently serializing XML Infosets that
have certain types of content.

Last Modified: Editors Copy $Date: 2007/07/11 20:12:02 $

Introduction

This specification defines the XML-binary Optimized Packaging (XOP)
convention, a means of more efficiently serializing XML Infosets (see
XML Information Set) that have certain types of content.

A XOP package is created by placing a serialization of the XML Infoset
inside an extensible packaging format (such as MIME
Multipart/Related, see RFC 2387). Then, selected
portions of its content that are base64-encoded binary data are extracted,
re-encoded (i.e., the data is decoded from base64), and placed into the
package. The locations of those selected portions are marked in the XML
with a special element that links to the packaged data using URIs.

In a number of important XOP applications, binary data need never be
encoded in base64 form. If the data to be included is already available
as a binary octet stream, then either an application or other software
acting on its behalf can directly copy that data into a XOP package, at
the same time preparing suitable linking elements for use in the root
part; when parsing a XOP package, the binary data can be made available
directly to applications, or, if appropriate, the base64 binary
character representation can be computed from the binary data.

However, at the conceptual level, this binary data can be thought of as
being base64-encoded in the XML Document. As this conceptual form might
be needed during some processing of the XML Document (e.g., for signing
the XML document), it is necessary to have a one-to-one correspondence
between XML Infosets and XOP Packages. Therefore, the conceptual
representation of such binary data is as if it were base64-encoded,
using the canonical lexical form of the XML Schema base64Binary
datatype (see XML Schema Part 2, section 3.2.16
base64Binary). In the reverse direction, XOP is capable of
optimizing only base64-encoded Infoset data that is in the canonical
lexical form.

Only element content can be optimized; attributes,
non-base64-compatible character data, and data not in the canonical
representation of the base64Binary datatype cannot be
successfully optimized by XOP.
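
As an illustration of the canonical-form restriction, the following minimal Python sketch tests whether the character content of an element could be optimized. It approximates "canonical lexical form" as "decoding and re-encoding reproduces the same characters"; the function name is hypothetical, and treating empty content as not worth optimizing is an assumption of this sketch, not a rule of this specification.

```python
import base64
import binascii


def is_optimizable_text(text: str) -> bool:
    """Return True if text looks like canonical base64 with no whitespace."""
    if not text:
        # Assumption of this sketch: empty content is simply left alone.
        return False
    try:
        # validate=True rejects whitespace and any non-alphabet character.
        octets = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    # Canonical form: re-encoding the octets yields the same characters.
    return base64.b64encode(octets).decode("ascii") == text
```

Content with embedded whitespace, or with non-zero padding bits in the final character group, fails this check and would be left in the XML as ordinary character data.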

The remainder of this specification is organized in the following
fashion:

Section 2 describes the XOP Infoset, which preserves the
non-optimized content and structure of the original XML Infoset.

Section 3 specifies the XOP processing model.

Section 4 of this specification describes the form of the XOP
Package.

Section 5 describes how XOP Documents are identified.

Section 6 explores the security considerations of using the XOP
convention.

Terminology

This specification uses terminology from the XML Infoset (see XML Information Set) when discussing XML content and structure. This
is only a convention for clear specification of XOP behavior.

The following terms are used in this specification:

Original XML Infoset - An XML Infoset to be optimized.
Optimized Content - Content which has been removed from
the XML Infoset.
XOP Infoset - The Original Infoset with
any Optimized Content removed and replaced by
xop:Include element information items.
XOP Document - A serialization of the XOP Infoset using
any W3C recommendation-level version of XML.
XOP Package - A package containing the XOP
Document and any Optimized Content. As a whole, the XOP Package
is an alternate serialization of the Original Infoset.
Reconstituted XML Infoset - An XML Infoset that has been
constructed from the parts of a XOP Package.

[Figure: Architecture of the XOP framework]

Example

Example 1 shows an XML Infoset prior to XOP
processing. Example 2 shows the same
Infoset, serialized using the XOP format in a MIME Multipart/Related
package. The base64-encoded content of each of the m:photo and
m:sig elements has been replaced by a
xop:Include element, while
the binary octets have been serialized in separate MIME parts. Note
that those examples use the xmime:contentType attribute (see Describing
Media Content of Binary Data in XML) to identify the
media type of the content of the m:photo and
m:sig elements. Note also that the sample base64 data is smaller
than would be typical and the binary octets are not shown; in practice, the
optimized form is likely to be much smaller than the original.

Example 3 shows an XML Infoset prior to XOP
processing. Example 4 shows the same
Infoset, serialized using the XOP format in a MIME Multipart/Related
package. The base64-encoded content of each of the m:photo and
m:sig elements has been replaced by a
xop:Include element, while
the binary octets have been serialized in separate MIME parts. Note also
that the sample base64 data is smaller than would be typical and the
binary octets are not shown; in practice, the optimized form is
likely to be much smaller than the original.

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

This specification uses a number of namespace prefixes throughout;
they are listed below. Note that the choice of any namespace prefix
is arbitrary and not semantically significant.

Prefixes and Namespaces used in this specification.

Prefix: xop
Namespace: http://www.w3.org/2004/08/xop/include
Notes: A non-normative XML Schema document for the
http://www.w3.org/2004/08/xop/include
namespace can be found at http://www.w3.org/2004/08/xop/include. Note that XML Schema currently provides only for validation of XML 1.0 Infosets; accordingly, the schema may not be usable with XOP Infosets corresponding to later versions of XML.

Prefix: xmime
Namespace: http://www.w3.org/2005/05/xmlmime
Notes: The namespace for the content type attribute.

Prefix: soap
Namespace: http://www.w3.org/2003/05/soap-envelope
Notes: The SOAP 1.2 namespace.

Prefix: xs
Namespace: http://www.w3.org/2001/XMLSchema
Notes: The namespace of XML Schema data types.

XOP Infoset Constructs

XOP operates by extracting the Optimized Content from the Original
Infoset to create the XOP Infoset. In particular, the character
information item children of element information
items to be optimized are removed and replaced with an
element information item named xop:Include.
The xop:Include element information item
contains an attribute information item with a link to
the part of the XOP Package that carries a binary representation of the
data removed from the original element information item.
Details of the construction and processing of XOP serializations are
provided in Section 3, XOP Processing Model.

The Infoset used as input to XOP processing MUST NOT contain any
element information item with a [namespace name]
property of http://www.w3.org/2004/08/xop/include
and a [local name] property of Include. Infosets
containing such element information items cannot be
serialized using XOP. This is because during infoset reconstruction
a processor is unable to differentiate between
xop:Include element information items inserted during XOP package
construction and those that were part of the original infoset.
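
A processor might enforce this precondition with a check along the following lines. This is a minimal sketch that uses Python's xml.etree.ElementTree as a stand-in for an Infoset; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XOP_NS = "http://www.w3.org/2004/08/xop/include"


def contains_xop_include(root: ET.Element) -> bool:
    """Return True if root or any descendant is a xop:Include element."""
    # ElementTree spells qualified names as "{namespace name}local name".
    return next(root.iter(f"{{{XOP_NS}}}Include"), None) is not None
```

An Infoset for which this returns True cannot be used as the Original XML Infoset.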

The following subsections provide formal definitions for allowable content in the
element information item and attribute information
items used to construct a XOP serialization; content not explicitly specified
is disallowed. A non-normative XML Schema for serializations of those
element information item and attribute
information items can be found at http://www.w3.org/2004/08/xop/include.

xop:Include element information item

The xop:Include element information item
has:

A [local name] of Include.

A [namespace name] of
http://www.w3.org/2004/08/xop/include.

One or more attribute information items
amongst its [attributes] property as follows:

  A mandatory href attribute information
  item (see the href attribute information item section).

  Zero or more additional namespace qualified attribute information
  items. Any such attribute information items MUST NOT have a [namespace name] of
  http://www.w3.org/2004/08/xop/include, MUST NOT change
  the semantics of processing the xop:Include element information item and
  MUST be ignored if not recognized.

Zero or more namespace qualified element
information items in its [children] property. Any such
element information items MUST NOT have a [namespace name]
of http://www.w3.org/2004/08/xop/include, MUST
NOT change the semantics of processing the xop:Include element information item and MUST be ignored if not recognized.

href attribute information item

The href attribute information item has:

A [local name] of href.

An empty [namespace name].

A [normalized value] which is a representation of a URI (see
RFC 3986)
referencing the part of the package containing the data logically
included by the [owner element] (i.e., the xop:Include element information item). The [normalized value] MUST be a
valid URI per the cid: URI scheme (see RFC 2392). In addition,
the [normalized value] MUST be a valid lexical form of the XML Schema xs:anyURI
datatype (see XML Schema Part 2, section 3.2.17
anyURI).

An [owner element] which is the xop:Include element information item containing the
attribute information item.

XOP Processing Model

This section describes the processing model for creating XOP Packages
and interpreting XOP Packages. Unless otherwise stated, the result of
such processing MUST be semantically equivalent to performing the
specified steps separately, and in the order given.

Creating XOP Packages

To create a XOP Package from an Original XML Infoset:

1. Ensure that the Original XML Infoset contains no element
information item with a [namespace name] of
http://www.w3.org/2004/08/xop/include and a [local
name] of Include. As discussed in Section 2, XOP Infoset Constructs, XML Infosets with such element
information items cannot be represented using XOP.

2. Create an empty package.

3. Identify within the Original XML Infoset the element
information items to be optimized. To be optimized, the
characters comprising the [children] of the element
information item MUST be in the canonical form of
xs:base64Binary (see XML Schema Part 2, section 3.2.16
base64Binary) and MUST NOT contain any whitespace
characters preceding, inline with, or following the non-whitespace
content.

Note that this rule requires that the [children] of the element
information item to be optimized contain only
character information items.

4. Create a XOP Infoset which is a copy of the Original XML Infoset,
but with the [children] of each element
information item
identified in the previous step replaced by a
xop:Include element
information item (see the xop:Include element information item section) constructed as follows:

   a. Transform the replaced characters into binary data by
   processing them as base64-encoded data.

   b. Serialize the binary data into a new part of the package, with
   appropriate metadata corresponding to the [normalized value] of
   the href attribute information item of
   the xop:Include element information
   item (see Section 4, XOP Packages).

   c. If the element information item being optimized
   (i.e., the [parent] of the newly inserted
   xop:Include element information item)
   has a xmime:contentType attribute
   information item, its value SHOULD be reflected
   appropriately in the metadata for the part.

5. Serialize the resulting XOP Infoset into the package using any W3C
recommendation-level version of XML (e.g., XML 1.0,
XML 1.1) and identify it as the root part according
to the packaging mechanism's convention, labeling it with the
application/xop+xml media type, as described in Section 5, Identifying XOP Documents.

Additional parts MAY be added to the package to satisfy application
specific requirements. Other content-specific metadata MAY be
reflected in the packaging metadata as appropriate.

If content cannot be successfully encoded into the XOP package,
implementations SHOULD behave as if that portion of the Original XML
Infoset was not nominated for optimization.
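
The creation steps can be sketched in Python as follows, using xml.etree.ElementTree as a stand-in for an Infoset. This is an illustration under stated assumptions rather than a reference implementation: the function name, the should_optimize callback (which nominates elements), the partN@example.org Content-ID scheme, and the use of a plain dict in place of a real package are all hypothetical. Serializing the root part and reflecting xmime:contentType in part metadata are omitted.

```python
import base64
import binascii
import copy
import xml.etree.ElementTree as ET

XOP_NS = "http://www.w3.org/2004/08/xop/include"
INCLUDE_TAG = f"{{{XOP_NS}}}Include"


def create_xop_infoset(original, should_optimize):
    """Return (xop_root, parts); parts maps a Content-ID to raw octets."""
    # Reject input that already contains xop:Include.
    if next(original.iter(INCLUDE_TAG), None) is not None:
        raise ValueError("Original Infoset already contains xop:Include")
    parts = {}  # stands in for the empty package
    xop_root = copy.deepcopy(original)  # the XOP Infoset starts as a copy
    # Snapshot the elements first, because the tree is modified below.
    for elem in list(xop_root.iter()):
        text = elem.text
        # Candidates must have character children only, and be nominated.
        if len(elem) or not text or not should_optimize(elem):
            continue
        try:
            octets = base64.b64decode(text, validate=True)
        except (binascii.Error, ValueError):
            continue  # not base64: behave as if it was not nominated
        if base64.b64encode(octets).decode("ascii") != text:
            continue  # not in canonical lexical form
        content_id = f"part{len(parts) + 1}@example.org"  # hypothetical scheme
        parts[content_id] = octets
        # Replace the characters with a xop:Include that links to the part.
        elem.text = None
        ET.SubElement(elem, INCLUDE_TAG, {"href": f"cid:{content_id}"})
    return xop_root, parts
```

Content that fails the base64 checks is silently left in place, which corresponds to behaving as if that portion had not been nominated for optimization.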

Interpreting XOP Packages

This section specifies the means by which the Original XML Infoset
can be reconstructed from a XOP Package that has been prepared
according to the rules of Creating XOP Packages.

Note: conventions or error reporting mechanisms to be used in
processing packages that incorrectly purport to be XOP Packages are
beyond the scope of this specification.

To create a Reconstituted XML Infoset from a XOP Package:

1. Construct an XML Infoset by parsing the root part of the package as
an XML document. The document MUST be parsed according to the level
of the XML Recommendation identified by the XML declaration of that
document. If no XML declaration is present, then the document MUST
be parsed per Extensible Markup Language (XML) 1.0.

2. Using that XML Infoset, for each element information
item, E, which has, as the sole member of its [children] property, a xop:Include element information item (as defined in the xop:Include element information item section):

   a. Locate the part of the package corresponding to the URI in the
   href attribute information item of
   the xop:Include element information item (i.e., corresponding to the URI
   encoded in the attribute information item's
   [normalized value]).

   b. Replace the xop:Include element information item that appears in the
   [children] property of E with character information items representing
   the canonical base64 encoding of the entity body of the
   identified package part (i.e., effectively replace the
   xop:Include element information item
   with the data reconstructed from the
   package part).
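
The reconstruction step can be sketched in Python as follows. The sketch assumes the root part has already been parsed into an xml.etree.ElementTree tree and that the other parts are available as a dict keyed by Content-ID without angle brackets; the function name and that dict are hypothetical. The tree is modified in place, and a missing part simply raises KeyError, since error handling is outside the scope of this specification.

```python
import base64
import xml.etree.ElementTree as ET
from urllib.parse import unquote

XOP_NS = "http://www.w3.org/2004/08/xop/include"
INCLUDE_TAG = f"{{{XOP_NS}}}Include"


def reconstitute(xop_root, parts):
    """Replace each sole-child xop:Include with canonical base64 text."""
    for elem in list(xop_root.iter()):
        # E must have xop:Include as the sole member of its [children]:
        # exactly one child element and no character data around it.
        if len(elem) != 1 or elem[0].tag != INCLUDE_TAG:
            continue
        include = elem[0]
        if elem.text or include.tail:
            continue
        href = include.get("href", "")
        if not href.startswith("cid:"):
            raise ValueError(f"unsupported href: {href!r}")
        # Locate the part named by the cid: URI.
        octets = parts[unquote(href[4:])]
        # Swap the xop:Include for the canonical base64 characters.
        elem.remove(include)
        elem.text = base64.b64encode(octets).decode("ascii")
    return xop_root
```
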
XOP Packages

XOP is capable of using a variety of underlying packaging mechanisms.
Such packaging mechanisms MUST be able to represent, with full fidelity,
all the parts created according to Creating XOP Packages (see Section 3, XOP Processing Model), and MUST be used in a manner that provides a
means of designating a distinguished root (main, primary, etc.) part.

The subsection below specifies normatively how a particular packaging
mechanism, MIME Multipart/Related, is used, but does not preclude the
use of other packaging mechanisms with the XOP convention.

MIME Multipart/Related XOP Packages

This section describes how MIME Multipart/Related packaging (as
specified in RFC 2387) is used with XOP.

The root MIME part is the root part of the XOP package, MUST be a
serialization of the XOP Infoset using any W3C recommendation-level
version of XML (e.g., XML 1.0, XML 1.1), and
MUST be identified with a media type of "application/xop+xml" (as defined below).
The "start-info" parameter of the package's media type MUST contain the content type
associated with the content's XML serialization (i.e., it will contain the same
value as the "type" parameter of the root part).

Except for purposes of determining the root MIME part, as specified
by RFC 2387, ordering of MIME parts MUST NOT be
considered significant to XOP processing or to the construction of
the XOP Infoset.
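
As a rough illustration of root-part selection, the sketch below parses a Multipart/Related package with Python's standard email package and picks the root part: the part whose Content-ID matches the "start" parameter, or the first part when "start" is absent, following RFC 2387. The function name is hypothetical.

```python
from email import message_from_bytes
from email.message import Message


def find_root_part(package: Message) -> Message:
    """Return the root part of a parsed multipart/related package."""
    if package.get_content_type() != "multipart/related":
        raise ValueError("not a multipart/related package")
    parts = package.get_payload()  # list of parts for a multipart message
    start = package.get_param("start")
    if start is None:
        return parts[0]  # RFC 2387: the root defaults to the first part

    def bare(content_id: str) -> str:
        # Compare Content-IDs without surrounding whitespace or brackets.
        return content_id.strip().strip("<>")

    for part in parts:
        if bare(part.get("Content-ID", "")) == bare(start):
            return part
    raise LookupError(f"no part with Content-ID {start!r}")
```

A package parsed with message_from_bytes can be passed directly to find_root_part; part order plays no role once the root has been identified.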

Part metadata is reflected in MIME header fields. Specifically,
the value of an href attribute
information item on a xop:Include element
information item is a URI that uses the 'cid:' scheme
(see RFC 2392), so the corresponding MIME
part MUST have a Content-ID header field
with a corresponding field-value.
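
The correspondence between an href value and a Content-ID field-value follows RFC 2392: the cid: URI carries the Content-ID without its angle brackets, with reserved characters percent-encoded. A minimal Python sketch of the mapping, with hypothetical function names, might look like this. Note that urllib.parse.quote may percent-encode more characters than strictly required; the result still decodes to the same Content-ID.

```python
from urllib.parse import quote, unquote


def content_id_to_cid(content_id: str) -> str:
    """Map a Content-ID field-value such as '<a@b>' to 'cid:a@b'."""
    addr = content_id.strip()
    if addr.startswith("<") and addr.endswith(">"):
        addr = addr[1:-1]
    # Keep '@' literal; other reserved characters are percent-encoded.
    return "cid:" + quote(addr, safe="@")


def cid_to_content_id(href: str) -> str:
    """Map an href such as 'cid:a@b' to the field-value '<a@b>'."""
    if not href.lower().startswith("cid:"):
        raise ValueError(f"not a cid: URI: {href!r}")
    return "<" + unquote(href[4:]) + ">"
```
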

Furthermore, if a xmime:contentType attribute
information item is found (as described in Creating XOP Packages), it SHOULD be reflected in the field value
of the MIME Content-Type header.

Identifying XOP Documents

XOP Documents, when used in MIME-like systems, are identified with
the "application/xop+xml" media type, with the required "type" parameter conveying
the original XML serialisation's associated content type. Note that when the type
parameter contains reserved characters, it needs to be appropriately quoted and escaped.

For example, a XOP package using MIME multipart/related packaging to serialize a
SOAP 1.2 message with an action parameter of "http://www.example.net/foo"
would label the package itself with the "multipart/related"
media type, and the root part with the "application/xop+xml" media type along with a type parameter
containing "application/soap+xml;action=\"http://www.example.net/foo\"".
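
The quoting and escaping of the type parameter can be sketched in Python as below. The helper names are hypothetical; the sketch always emits the parameter as a quoted-string, escaping backslashes and double quotes, and does not attempt header folding.

```python
def quote_param(value: str) -> str:
    """Return value as a MIME quoted-string."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def xop_root_content_type(original_type: str, charset: str = "") -> str:
    """Build a Content-Type field value for the root part of a XOP package."""
    header = "application/xop+xml"
    if charset:
        header += f"; charset={charset}"  # optional parameter
    header += f"; type={quote_param(original_type)}"  # required parameter
    return header


print(xop_root_content_type(
    'application/soap+xml;action="http://www.example.net/foo"'))
```

For the SOAP 1.2 message above, the printed value carries the type parameter as "application/soap+xml;action=\"http://www.example.net/foo\"".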

Registration

MIME media type name:

application

MIME subtype name:

xop+xml

Required parameters:

type

This parameter conveys the content type associated with the XML serialization of the XOP infoset, including parameters as appropriate.

Optional parameters:

charset

This parameter has identical semantics to the charset parameter
of the application/xml media type as specified in
RFC 3023.

Encoding considerations:

Identical to those of application/xml
as described in RFC 3023,
section 3.2.

Security considerations:

In addition to application-specific considerations, XOP has the same
security considerations described in RFC 3023,
section 10.

Interoperability considerations:

There are no known interoperability issues.

Published specification:

This document

Applications which use this media type:

No known applications currently use this media type.

Additional information:

File extension:

XOP

Fragment identifiers:

Identical to that of application/xml as described in
RFC 3023,
section 5.

Base URI:

As specified in RFC 3023, section 6.

Macintosh File Type code:

TEXT

Person and email address to contact for further information:

Mark Nottingham <mnot@pobox.com>

Intended usage:

COMMON

Author/Change controller:

The XOP specification is a work product of the World Wide Web Consortium's XML Protocol Working Group. The W3C has change control over this specification.

Security Considerations

XOP Package Integrity

The integrity of Infosets optimized using XOP may need to be ensured.
As XOP packages can be transformed to recover such Infosets (see
Interpreting XOP Packages), existing XML Digital
Signature techniques can be used to protect them. Note, however, that
a signature over the Infoset does not necessarily protect against
modifications of other aspects of the XOP packaging; for example, an
Infoset signature check might not protect against re-ordering of
non-root parts.

In the future a transform algorithm for use with XML Signature could
provide a more efficient processing model where the raw octets are
digested directly.

XOP Package Confidentiality

The confidentiality of XOP Packages may need to be ensured. As such
packages can be transformed to an XML Information Set, existing XML
Encryption (see XML Encryption Syntax and Processing) techniques can be used to
protect such packages. Any part of a package can be encrypted,
whether it includes base64 characters or not. The resulting
CipherData element information item can then
be optimized because the content of such an element information
item is base64 characters.

In the future a transform algorithm for use with XML Encryption could
provide a more efficient processing model where the raw octets are
encrypted directly.

Relationship to other specifications

This appendix summarizes the XOP dependencies upon underlying
specifications, the nature of appropriate payloads for XOP and the
means of extending XOP.

Dependencies

The XOP convention builds upon a number of underlying specifications.
They are:

XML (e.g., XML 1.0, XML 1.1) - The XOP
Document is encoded using any W3C recommendation-level version of
XML (see Creating XOP Packages). Formats that
use XOP MUST identify which versions of XML are permissible for
encoding the XOP Infoset. XOP does not constrain the use of any
mechanisms defined by XML, including those explicitly allowing
extensions, nor does it constrain the use of underlying
specifications.

Namespaces in XML (e.g., Namespaces in XML, Namespaces in XML 1.1) - The XOP Document uses any W3C
recommendation-level version of Namespaces in XML compatible with
the version(s) of XML used. Formats that use XOP MUST identify
which versions of Namespaces in XML are permissible for encoding
the XOP Infoset. XOP does not constrain the use of any mechanisms
defined by Namespaces in XML, including those explicitly
allowing extensions, nor does it constrain the use of underlying
specifications.

Uniform Resource Identifiers (see RFC 3986) - The
XOP Document uses URIs to locate parts in the XOP Package (see
the href attribute information item section). XOP does not constrain the use of any
mechanisms defined by URIs, including those explicitly allowing
extensions, nor does it constrain the use of underlying
specifications.

Packaging Mechanism - XOP requires the use of a packaging
mechanism that satisfies the requirements in Section 4, XOP Packages. One such mechanism MUST be in use, but
XOP does not require a specific mechanism. Formats using XOP MUST
identify at least one such mechanism permissible for creating the
XOP Package, and MUST specify how each allowed mechanism is to
be used for building the XOP Package.

The relationship of one such mechanism to XOP, the MIME
Multipart/Related Content-type, is specified in MIME Multipart/Related XOP Packages.

Payload

The payload of a XOP Package is an XML Infoset. XOP constrains the
range of admissible characters in the payload to those contained in
the "Char" production of a W3C recommendation-level version of XML.
Additionally, the Original XML Infoset cannot contain an
element information item with a [local name] of
Include and a [namespace name] of
http://www.w3.org/2004/08/xop/include. Finally,
portions of the payload which are nominated for optimization in XOP
MUST be base64-encoded data in the canonical lexical form of the XML
Schema base64Binary datatype (see XML Schema Part 2, section 3.2.16
base64Binary).

Extension

XOP Documents allow extensions to the xop:Include element
when they do not change its semantics. Changes to the semantics MUST be
identified by a new namespace URI (i.e., they MUST define a new
Include element information item in another
namespace).

The extensibility of the specifications underlying XOP is not
constrained by their use in XOP.

Requirements

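This document, along with SOAP Message Transmission Optimization Mechanism
and Resource Representation SOAP Header Block, has been produced in conjunction
with the development of requirements
embodied in the SOAP Optimized Serialization Use Cases and Requirements document.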

References

Normative References

Extensible Markup Language (XML) 1.0 (Fourth Edition), Jean Paoli,
Eve Maler, Tim Bray, et al., Editors.
World Wide Web Consortium, 16 August 2006.
This version is http://www.w3.org/TR/2006/REC-xml-20060816.
The latest version is available at http://www.w3.org/TR/REC-xml.

Extensible Markup Language (XML) 1.1 (Second Edition),
Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler,
François Yergeau, John Cowan, Editors.
World Wide Web Consortium, 16 August 2006, edited in place 29 September 2006.
This version is http://www.w3.org/TR/2006/REC-xml11-20060816.
The latest version is available at http://www.w3.org/TR/xml11/.

SOAP Message Transmission Optimization Mechanism, Hervé Ruellan,
Noah Mendelsohn, Martin Gudgin, and Mark Nottingham, Editors.
World Wide Web Consortium.
The latest version is available at http://www.w3.org/TR/soap12-mtom/.

Resource Representation SOAP Header Block, Martin Gudgin,
Yves Lafon, and Anish Karmarkar, Editors.
World Wide Web Consortium.
The latest version is available at http://www.w3.org/TR/soap12-rep/.

SOAP Optimized Serialization Use Cases and Requirements, Tony Graham,
Mark Jones, and Anish Karmarkar, Editors.
World Wide Web Consortium, 08 June 2004.
This version is http://www.w3.org/TR/2004/WD-soap12-os-ucr-20040608/.
The latest version is available at http://www.w3.org/TR/soap12-os-ucr/.

Namespaces in XML (Second Edition), Tim Bray, Dave Hollander,
Andrew Layman, and Richard Tobin, Editors.
World Wide Web Consortium, 16 August 2006.
This version is http://www.w3.org/TR/2006/REC-xml-names-20060816.
The latest version is available at http://www.w3.org/TR/REC-xml-names.

Namespaces in XML 1.1 (Second Edition), Richard Tobin, Andrew Layman, Tim Bray, and
Dave Hollander, Editors.
World Wide Web Consortium, 16 August 2006.
This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816.
The latest version is available at http://www.w3.org/TR/xml-names11/.

XML Information Set (Second Edition), Richard Tobin and
John Cowan, Editors.
World Wide Web Consortium, 04 February 2004.
This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204.
The latest version is available at http://www.w3.org/TR/xml-infoset.

XML Schema Part 1: Structures Second Edition, David Beech, Murray Maloney,
Henry S. Thompson, and Noah Mendelsohn, Editors.
World Wide Web Consortium, 28 October 2004.
This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/.
The latest version is available at http://www.w3.org/TR/xmlschema-1/.

XML Schema Part 2: Datatypes Second Edition,
Ashok Malhotra and Paul V. Biron, Editors.
World Wide Web Consortium, 28 October 2004.
This version is http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/.
The latest version is available at http://www.w3.org/TR/xmlschema-2/.

Describing Media Content of Binary Data in XML, Anish Karmarkar,
Ümit Yalçinalp, Editors.
World Wide Web Consortium, 04 May 2005.
The latest version is available at http://www.w3.org/TR/xml-media-types.

Key words for use in RFCs to Indicate Requirement Levels, S. Bradner,
Editor.
IETF, March 1997.
This RFC is available at http://www.ietf.org/rfc/rfc2119.txt.

The MIME Multipart/Related Content-type, E. Levinson,
Editor.
IETF, August 1998.
This RFC is available at http://www.ietf.org/rfc/rfc2387.txt.

MIME Encapsulation of Aggregate Documents, such as HTML (MHTML),
J. Palme, A. Hopmann and N. Shelness, Editors.
IETF, March 1999.
This RFC is available at http://www.ietf.org/rfc/rfc2557.txt.

Content-ID and Message-ID Uniform Resource Locators,
E. Levinson, Editor.
IETF, August 1998.
This RFC is available at http://www.ietf.org/rfc/rfc2392.txt.

XML Media Types, M. Murata, S. St.Laurent
and D. Kohn, Editors.
IETF, January 2001.
This RFC is available at http://www.ietf.org/rfc/rfc3023.txt.

Uniform Resource Identifiers (URI): Generic Syntax, T. Berners-Lee,
R. Fielding and L. Masinter, Editors.
IETF, January 2005.
Obsoletes: RFC 2396, RFC 2732.
This RFC is available at http://www.ietf.org/rfc/rfc3986.txt.

Informative References

Canonical XML Version 1.0, John Boyer, Editor.
World Wide Web Consortium, 15 March 2001.
This version is http://www.w3.org/TR/2001/REC-xml-c14n-20010315.
The latest version is available at http://www.w3.org/TR/xml-c14n.

Exclusive XML Canonicalization Version 1.0,
Joseph Reagle, Donald E. Eastlake 3rd, and John Boyer, Editors.
World Wide Web Consortium, 18 July 2002.
This version is http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/.
The latest version is available at http://www.w3.org/TR/xml-exc-c14n.

XML Encryption Syntax and Processing,
Joseph Reagle and Donald Eastlake, Editors.
World Wide Web Consortium, 10 December 2002.
This version is http://www.w3.org/TR/2002/REC-xmlenc-core-20021210/.
The latest version is available at http://www.w3.org/TR/xmlenc-core/.

SOAP Version 1.2 Part 1: Messaging Framework,
Martin Gudgin, Marc Hadley, Noah Mendelsohn, Jean-Jacques Moreau,
Henrik Frystyk Nielsen, Anish Karmarkar, Yves Lafon, Editors.
World Wide Web Consortium.
The latest version is available at http://www.w3.org/TR/soap12-part1/.

SOAP Version 1.2 Part 2: Adjuncts,
Martin Gudgin, Marc Hadley, Noah Mendelsohn, Jean-Jacques Moreau,
Henrik Frystyk Nielsen, Anish Karmarkar, Yves Lafon, Editors.
World Wide Web Consortium.
The latest version is available at http://www.w3.org/TR/soap12-part2/.

Acknowledgements

This specification is the work of the W3C XML Protocol Working Group.

Participants in the Working Group are (at the time of writing, and by
alphabetical order): &wgmb;

Previous participants were: &prevwgmb;

The people who have contributed to discussions on
xml-dist-app@w3.org
are also gratefully acknowledged.