Abstract

This document defines the meaning of the attribute
xml:id as an ID attribute in XML documents and defines
processing of this attribute to identify IDs in the absence of
validation, without fetching external resources, and without relying
on an internal subset.

Status of this Document

This section describes the status of this document at the
time of its publication. Other documents may supersede this document.
A list of current W3C publications and the latest revision of this
technical report can be found in the
W3C technical reports index at
http://www.w3.org/TR/.

This document is a Proposed Recommendation (PR) of the W3C.
This document has been developed by the
W3C XML Core Working Group as
part of the XML
Activity. Publication as a Proposed Recommendation does not
imply endorsement by the W3C Membership. This is a draft document and
may be updated, replaced or obsoleted by other documents at any time.
It is inappropriate to cite this document as other than work in progress.

W3C Advisory Committee Members are invited to send formal review
comments to the W3C Team until 26 August 2005. Advisory Committee
Representatives should consult their WBS questionnaires. The public
is invited to send comments to the public mailing list
public-xml-id@w3.org (archive). After the review the Director will
announce the document's disposition. This announcement should not be
expected sooner than 14 days after the end of the review.

This document is based upon the xml:id Candidate Recommendation of
8 February 2005. Feedback received during that review resulted in
clarifications but no major changes. A review version highlighting
differences between the CR draft and this PR draft is available.

1 Introduction

[XML 1.0] and [XML 1.1] provide a mechanism
for annotating elements with unique identifiers. This mechanism
consists of declaring the type of an attribute as "ID", after which
the parser will validate that

the ID value matches the allowed lexical
form,

the value is unique within the XML document, and that

each element has at most one unique identifier.

Declarations in either the
internal or external subset of an XML document can declare attributes
to be of type ID.
However, some specifications, notably
[SOAP], forbid an internal subset, and processing
the external subset is optional for conformant XML
processors, leaving no guarantee that all consumers of the XML
document will be able to successfully recognize the identifiers.

Identifiers can be declared through external mechanisms as well. Of
particular interest is [XML Schemas] which provides a
type "xs:ID" with the same uniqueness and validity constraints as XML.
However, there are no guarantees that consumers will have the
"correct" schema available, nor that they will process it if they
do.

A mechanism allowing unique element identifiers to be recognized by
all conformant XML
processors, whether they validate or not, is
desirable in making XML sub-resource linking robust.
This specification allows authors to identify elements with IDs
that can be recognized by any processor without regard to how, or if,
any internal or external declarations are available.

An additional problem is that DTD-based and XML Schema-based
identifiers are exposed through different conceptual mechanisms - the
attribute type infoset property,
and the type definition family of
properties respectively. A uniform mechanism for recognizing
identifiers is desirable.

This specification provides such a mechanism: it describes the
semantics of xml:id attributes. This specification has been
designed to be a separate layer in processing and to be compatible
with existing validation technologies. Implementors are encouraged to
support xml:id processing and to make
ID type assignment the default
behavior of their processors.

It has been a guiding principle in the design of this specification
that the result of xml:id processing should be the same
as if an appropriate declaration had been seen and used by the processor.

2 Terminology

[Definition: The key words
must, must not, required,
shall, shall not, should,
should not, recommended, may,
and optional in this specification are to be interpreted
as described in [IETF RFC 2119].]

[Definition: An
xml:id processor is a software module that works in
conjunction with an XML
processor to provide access to the IDs in an XML document.]

[Definition: An
xml:id error is a non-fatal error that occurs when an
xml:id processor finds that
a document has violated the constraints of this specification.]

Validation is the process of comparing an XML document (or part of
an XML document) against a grammar or set of rules to determine if the
actual structure of the document satisfies the constraints of the
grammar or the rules. Some validation technologies also perform type
assignment, determining not only if the document satisfies the
specified constraints but also determining, for example, which
(elements and/or) attributes are of type “ID”.

Although often performed together, validation and type assignment
are not the same process. A non-validating XML 1.0 processor, for
example, can perform type assignment using only declarations from the
internal subset, without ever having any information about the
structural validity of the document.

[Definition: The
process of ID type assignment causes an xml:id
attribute value to be an ID.] This is often achieved by changing the
type of the attribute to be "ID" in the infoset or PSVI, but that is
not the only possible mechanism.

Note:

Application-level processing of IDs, including which elements can
actually be addressed by which ID values, is beyond the scope of this
specification.

3 Syntax

Per [Namespaces in XML] (and [Namespaces in XML 1.1]), prefixes
beginning “xml” are reserved for use by XML and XML-related
specifications. This specification licenses the use of the attribute
“xml:id” for use as a common syntax for identifiers in XML with the
semantics specified herein.

Authors of XML documents are encouraged to name their ID attributes
"xml:id" to increase the interoperability of these identifiers on the
Web.

In namespace-aware XML processors, the "xml" prefix is bound to the
namespace name http://www.w3.org/XML/1998/namespace as
described in [Namespaces in XML] (and [Namespaces in XML 1.1]).
Note that xml:id can still be used by
non-namespace-aware XML processors.

4 Processing xml:id Attributes

An xml:id processor
must assure that the following
constraints hold for all xml:id attributes:

The normalized value of the attribute is an
NCName according to the
Namespaces in XML
Recommendation which has the same version as
the document in which this attribute occurs
(NCName
for XML 1.0, or
NCName
for XML 1.1).

The declared type of the attribute, if it has one, is “ID”.
All declarations for xml:id attributes
must
specify “ID” as the type of the attribute.

An xml:id processor
should assure that the following
constraints hold:

The values of all xml:id attributes and all attributes
of type “ID” within a document are unique.

An xml:id error occurs
for any xml:id attribute that does not satisfy the
constraints.

The xml:id processor performs
ID type assignment on all
xml:id attributes, even those that do not satisfy
the enumerated constraints.
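
The checks above can be sketched in a few lines. This is an illustrative sketch only, not a conformant xml:id processor: the function name check_xml_ids is hypothetical, the regular expression merely approximates the NCName production, and attributes declared as type ID in a DTD are not visible to the parser used here, so only xml:id values take part in the uniqueness check. Note that every xml:id value is still treated as an ID, even when it raises an error.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Approximation of NCName (an assumption of this sketch): a letter or
# underscore, then letters, digits, '.', '-' or '_'. The real production
# in Namespaces in XML differs in some Unicode details.
NCNAME_APPROX = re.compile(r"[^\W\d][\w.\-]*")


def check_xml_ids(root):
    """Return (ids, errors) for all xml:id attributes at or below root."""
    ids = []      # every xml:id value is assigned as an ID, valid or not
    errors = []   # non-fatal xml:id errors, collected as messages
    seen = set()
    for elem in root.iter():
        value = elem.get(XML_ID)
        if value is None:
            continue
        # ID-style normalization: collapse runs of spaces, trim both ends.
        value = re.sub(" +", " ", value).strip(" ")
        if NCNAME_APPROX.fullmatch(value) is None:
            errors.append(f"xml:id value {value!r} is not an NCName")
        if value in seen:
            errors.append(f"xml:id value {value!r} is not unique")
        seen.add(value)
        ids.append(value)
    return ids, errors
```

A caller would typically log or display the returned errors and continue processing, since xml:id errors are not fatal.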

An xml:id processor
should update the
references infoset property, as
described in Section 2.3 of [XML Information Set], and update any
implementation-dependent structures used for cross-referencing to
reflect the results of ID assignment.
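
One possible implementation-dependent cross-referencing structure is a table from ID value to element. The sketch below is an assumption about how an implementation might do this, not something this specification prescribes; build_id_index is a hypothetical name, and keeping the first element on duplicate values is a choice made only for this example.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def build_id_index(root):
    """Map each xml:id value to the first element that carries it."""
    index = {}
    for elem in root.iter():
        value = elem.get(XML_ID)
        if value is not None:
            # On duplicates, keep the first element in document order
            # (an arbitrary choice for this sketch).
            index.setdefault(value, elem)
    return index
```

An application could then resolve a reference such as "intro" with a plain dictionary lookup, index.get("intro").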

Many validation technologies impose the constraint that an XML
element can have at most one attribute of type ID. That constraint is
not imposed by xml:id processing.

This specification defines xml:id processing, but it is up to the
application to determine when such processing occurs. Users of
applications that provide facilities for modifying XML documents may
reasonably expect xml:id processing to occur whenever a change is
made to an ID value.

5 Informing the Application

ID type assignment may be
performed when xml:id attributes are processed.
If ID type assignment
occurs, then the xml:id processor must report the assigned
xml:id attributes
to the application. How this is reported is implementation
dependent.

For applications that operate conceptually on the Infoset, an
xml:id processor can use the
attribute type Infoset
property:

The xml:id processor may report the results of ID type assignment in a
DTD compatible manner by setting the attribute type infoset property of the
attribute to ID.

For applications that operate conceptually on the PSVI, an
xml:id processor can use the
type definition family
of PSVI properties:

The xml:id processor may
report the results of ID type assignment
in an XML Schema compatible manner by setting the PSVI
type definition property of the
attribute to xs:ID.

For applications that operate on data models defined in other ways,
the mechanisms are implementation dependent:

The xml:id processor may
report the results of ID type assignment in some other way.

The key requirement is that the application be made aware of the results
of ID type assignment.
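
As a rough illustration of the Infoset-style option, the sketch below models an attribute information item as a small record and sets its attribute type to ID for xml:id attributes. The AttributeInfo class and the report_attributes function are hypothetical names introduced for this example, and None is used here to stand in for an attribute type that is not known.

```python
from dataclasses import dataclass
from typing import Optional
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"


@dataclass
class AttributeInfo:
    namespace_name: str
    local_name: str
    normalized_value: str
    attribute_type: Optional[str]  # "ID" after ID type assignment


def report_attributes(elem):
    """Return one AttributeInfo per attribute, marking xml:id as type ID."""
    infos = []
    for qname, value in elem.attrib.items():
        # ElementTree writes namespaced names as "{namespace}local".
        if qname.startswith("{"):
            ns, local = qname[1:].split("}", 1)
        else:
            ns, local = "", qname
        is_xml_id = ns == XML_NS and local == "id"
        infos.append(AttributeInfo(ns, local, value, "ID" if is_xml_id else None))
    return infos
```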

6 Errors

A violation of the constraints in this specification results in an
xml:id error.
Such errors are not fatal, but should
be reported by the
xml:id processor.
In the interest of interoperability, it is strongly recommended that
xml:id errors not be silently ignored.

7 Conformance

7.1 Conformance to xml:id

Conformance to xml:id for applications that rely on
XML
processors using validation technologies consists in the
use of the xml:id construct as explained in
4 Processing xml:id Attributes and by conformance to both the constraints
of this specification and the rules of the validation technology.

Conformance to xml:id for applications that rely on
non-validating
XML
processors is defined by the recognition of xml:id
attributes as explained in
4 Processing xml:id Attributes and by conformance to the constraints
of this specification.

Conformance to constraints that
“must” be assured is mandatory.
It is recommended that applications assure the other constraints as well.
This specification defines no simply optional constraints.

A document is conformant to this specification if it generates no
xml:id errors.

7.2 XML Information Set Conformance

This specification conforms to the [XML Information Set].
The following information items must
be present in the input infosets to enable correct processing:

Element Information Items with
attributes property.

Attribute Information Items with
namespace name,
local name and
normalized value properties.

In addition, the following properties might be present in the output infoset:

attribute type properties on Attribute Information Items.

8 Extensibility

This specification is not extensible. There are no provisions for application
designers to alter the name of the xml:id attribute, the set of
attribute values that are considered IDs, the location(s) where they can
occur, or make any other extensions.

C Impact on Canonicalization and Other Standards (Non-Normative)

This appendix is informative for use during development of xml:id.
It will be removed before xml:id becomes a Recommendation.

XPath 1.0: The id() function only recognizes IDs
declared in the DTD. If xml:id processing is performed before
the document is provided to the XPath 1.0 processor, and if ID type
assignment is performed by setting the
attribute type infoset property,
XPath 1.0 will recognize xml:id attributes as IDs.
An erratum to XPath 1.0 would be required to allow
Schema-declared IDs to be included in the results of this function.

XPath 2.0: No change required. The id() function
recognizes both DTD- and Schema-declared identifiers, and as such
would also recognize xml:id attributes identified with a
minimally conforming schema processor.

Fragment Identifiers and the XPointer Framework:
No change required. Barename
fragment identifiers and the element() scheme recognize both DTD- and
Schema-declared identifiers, and as such would also recognize
xml:id attributes identified with a minimally conforming
schema processor.

DOM: To the extent that a DOM instance is constructed from
an Infoset or PSVI, no change is required. The xml:id processor's
ID type assignment of xml:id attributes will be visible to the
DOM construction process.

CSS: To the extent that CSS operates on a tree constructed from
an Infoset or PSVI, no change is required. The xml:id processor's
ID type assignment of xml:id attributes will be visible to the
construction process.

Canonicalization: The Canonical XML Version 1.0 specification
describes a process whereby attributes in the xml: namespace are
inherited in a canonicalized document. While this produces a
reasonable result with xml:lang or xml:space attributes, processing
xml:id attributes in this way is likely to produce
documents that contain xml:id errors,
specifically xml:id attribute values that are not unique.
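
The uniqueness problem can be seen with a deliberately simplified model of attribute inheritance. The sketch below is not an implementation of Canonical XML; the helper names are hypothetical, and the only behavior modeled is that a child element picks up its parent's xml: attributes when it does not set them itself.

```python
import xml.etree.ElementTree as ET

XML_PREFIX = "{http://www.w3.org/XML/1998/namespace}"


def inherit_xml_attributes(parent, child):
    """Return child's attributes plus any xml:* attributes it lacks."""
    merged = dict(child.attrib)
    for name, value in parent.attrib.items():
        if name.startswith(XML_PREFIX) and name not in merged:
            merged[name] = value
    return merged


def child_ids_after_inheritance(xml_text):
    """List the xml:id each child ends up with after inheritance."""
    parent = ET.fromstring(xml_text)
    return [
        inherit_xml_attributes(parent, child).get(XML_PREFIX + "id")
        for child in parent
    ]
```

With a parent carrying xml:id="s1" and two children that have no xml:id of their own, both children end up with the value "s1", which is exactly the kind of duplicate the uniqueness constraint is meant to catch.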

D.2 With XML Schema Validation

XML Schema authors are encouraged to use xml:id
attributes when providing identifiers for elements declared in their
schemas. Note that this can most easily be accomplished by importing
the schema for the XML
namespace and using the attribute declaration it contains.

The following XML Schema fragment for the XML namespace illustrates a
sample declaration for the xml:id attribute:
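
The schema fragment itself does not appear in this text. As a stand-in only, the sketch below embeds a minimal declaration of the kind described, an attribute named id of type xs:ID in a schema whose target namespace is the XML namespace, and reads the declared type back. The schema text and the declared_id_type helper are assumptions made for illustration, not the fragment from the specification.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical minimal schema for the XML namespace (illustration only).
SCHEMA_SKETCH = f"""\
<xs:schema xmlns:xs="{XS}" targetNamespace="{XML_NS}">
  <xs:attribute name="id" type="xs:ID"/>
</xs:schema>
"""


def declared_id_type(schema_text):
    """Return the declared type of the top-level 'id' attribute, if any."""
    root = ET.fromstring(schema_text)
    for attr in root.findall(f"{{{XS}}}attribute"):
        if attr.get("name") == "id":
            return attr.get("type")
    return None
```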

XML Schema authors are encouraged to declare attributes named
xml:id with the type xs:ID.
A document that uses xml:id attributes that have a declared
type other than xs:ID will always generate xml:id errors.

Consumers of documents validating the xml:id attributes
against an appropriate schema for the XML namespace can recognize IDs
through the type definition
family of PSVI properties.

Note that the effects of a Minimally Conforming Schema Processor,
processing the above schema, are approximated by simply looking for
attributes named xml:id, ensuring the value of such
attributes has the correct lexical form (NCName),
and the value is unique within the document.

D.3 With RELAX NG Validation

RELAX NG Grammar authors are encouraged to use xml:id
attributes when providing identifiers for elements declared in their
schemas.

The following RELAX NG fragment illustrates a
sample declaration for the xml:id attribute:

RELAX NG Grammar authors are encouraged to declare attributes named
xml:id with the type xs:ID.
A document that uses xml:id attributes that have a declared
type other than xs:ID will always generate xml:id errors.

E Attribute Value Normalization on IDs (Non-Normative)

Parsers are required to
normalize
all attribute values. Normalization expands character references, expands
entity references, and cleans up line end characters. Attributes of
type ID are subject to additional normalization rules: removing leading
and trailing space characters and replacing sequences of spaces with a single
space.

The xml:id processor has to assure that both kinds of normalization
are performed on all attributes named xml:id. In particular,
the parser may not have performed the additional normalization
required for attributes of type ID because the attribute may not be
declared or may not be declared as an ID.

The initial value of xml:id on doc will be
“one” because the parser knew that it was an ID. The initial value
on para will be “ two ”. Because the parser didn't know it
was an ID, it will not have performed the additional normalizations
required.

After xml:id processing, the value of the xml:id attributes
on doc and para will be “one” and “two”, respectively.
These properly normalized values will be stored in the
normalized value property in the
infoset. Performing xml:id processing changes the infoset if there
are incompletely normalized xml:id attributes.
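
The additional normalization step for ID-typed values is small enough to sketch directly. The function name normalize_id_value is hypothetical, and the sketch assumes the parser has already performed ordinary attribute value normalization, so only space characters (#x20) remain to be handled.

```python
import re


def normalize_id_value(value):
    """Apply the extra normalization used for attributes of type ID."""
    # Replace each run of space characters with a single space,
    # then remove leading and trailing spaces.
    return re.sub(" +", " ", value).strip(" ")
```

Applied to the value " two " discussed above, this yields "two"; a value that is already normalized, such as "one", is returned unchanged.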

Note:

For interoperability, document producers should use fully normalized values that are
legal NCNames in
xml:id attributes.