Abstract

This document specifies the XML Schema Definition Language,
which offers facilities for describing the structure and constraining the contents
of XML documents, including those which exploit the XML Namespace facility.
The schema language, which is itself represented in an XML vocabulary and
uses namespaces, substantially reconstructs and considerably extends the
capabilities found in XML document type definitions (DTDs). This
specification depends on XML Schema Definition Language 1.1 Part 2: Datatypes.

Status of this Document

This section describes the status of this document at the
time of its publication. Other documents may supersede this document.
A list of current W3C publications and the latest revision of this
technical report can be found in the
W3C technical reports index at
http://www.w3.org/TR/.

This is a Last Call Public Working Draft of W3C XML Schema Definition
Language (XSD) 1.1. It is here made available for review by W3C members
and the public. XSD 1.1 retains all the essential features of XSD 1.0,
but adds several new features to support functionality requested by users,
fixes many errors in XSD 1.0, and clarifies wording.

This draft was published on 20 June 2008.
The previous working draft of 30 August 2007 was a Last-Call Working Draft which
elicited numerous comments and suggestions for improvements. All
substantive issues have now been resolved, although some editorial
issues remain open.
The major revisions since the previous draft include the following:

The minimal subset of XPath which processors were required
to support for assertions has been eliminated; processors
must support all of XPath.

A new wildcard keyword ##definedSibling has been
added to allow a wildcard to match any element except
one mentioned explicitly elsewhere in the current content model.

The definitions of must and ·error· have been
revised to require that processors detect and report
errors (although the quality and level of detail of
the error messages are not constrained).

An <override> element has been defined
to allow the declarations or definitions of specified
components in other schema documents to be overridden.

XML Representation Constraints no longer refer to
the component level; they can now be checked for
schema documents in isolation.

Numerous editorial changes and clarifications have been
made and numerous small errors corrected.

For those primarily interested in the changes since version 1.0,
the appendix
Changes since version 1.0 (non-normative) (§H) is the recommended starting
point. It summarizes both changes made
since XSD 1.0 and some changes which were expected (and predicted
in earlier drafts of this specification) but have not been made
after all.
Accompanying versions of this document display in color
all changes to normative text since version 1.0 and since the
previous Working Draft.

Although feedback based on any
aspect of this specification is welcome, there are certain aspects of
the design presented herein for which the Working Group is
particularly interested in feedback. These are designated
"priority feedback" aspects of the design, and
identified as such in editorial notes at appropriate points in this
draft.
Any feature mentioned in a
priority feedback note should be considered a "feature
at risk": the feature may be retained as is, modified, or
dropped, depending on the feedback received from readers,
schema authors, schema users, and implementors.

Publication as a Working Draft does not imply endorsement by the
W3C Membership. This is a draft document and may be updated, replaced
or obsoleted by other documents at any time. It is inappropriate to
cite this document as other than work in progress.

1 Introduction

This document sets out the structural part of the XML Schema Definition Language.

Chapter 2 presents a Conceptual Framework (§2) for XSD, including an introduction to the
nature of XSD schemas and an introduction to the XSD
abstract data model, along with other terminology used throughout
this document.

Chapter 3, Schema Component Details (§3), specifies the precise
semantics of each component of the abstract model, the
representation of each component in XML, with reference to a DTD
and an XSD schema for
an XSD document type, along with a detailed mapping
between the elements and attribute vocabulary of this
representation and the components and properties of the abstract
model.

This document is primarily intended as a language definition
reference. As such, although it contains a few examples, it is
not primarily designed to serve as a motivating
introduction to the design and its features, or as a tutorial for
new users. Rather it presents a careful and fully explicit
definition of that design, suitable for guiding implementations.
For those in search of a step-by-step introduction to the design,
the non-normative [XML Schema: Primer]
is a much better starting point than this document.

1.1 Introduction to Version 1.1

The Working Group has three main goals for this version of W3C
XML Schema:

Significant improvements in simplicity of design and
clarity of exposition without loss of backward
or forward compatibility;

Provision of support for versioning of XML languages
defined using this
specification, including the XML vocabulary
specified here for use in schema documents.

Provision of support for
co-occurrence constraints, that is, constraints which make the
presence of an attribute or element, or the values allowable
for it, depend on the value or presence of other attributes or
elements.

These goals are
in tension with one another. The Working Group's strategic guidelines
for changes between versions 1.0 and 1.1 can be summarized as follows:

Support
for versioning (acknowledging that this may be
slightly disruptive to the XML transfer syntax at the
margins)

Support for co-occurrence
constraints (which will certainly involve additions to the XML
transfer syntax, which will not be understood by 1.0
processors)

Bug fixes (unless in specific
cases we decide that the fix is too disruptive for a point
release)

Editorial changes

Design cleanup (which may change behavior in edge
cases)

Non-disruptive changes to type hierarchy
(to better support current and forthcoming international
standards and W3C recommendations)

No changes
to XML transfer syntax except those required by version control
hooks, co-occurrence
constraints and bug fixes

The aim with regard
to compatibility is that

All schema documents conformant to version 1.0 of this
specification should also conform to version 1.1, and should
have the same validation behavior across 1.0 and 1.1 implementations
(except possibly in edge cases and in the details of the
resulting PSVI);

The vast majority of schema documents conformant to
version 1.1 of this specification should also conform to
version 1.0, leaving aside any incompatibilities arising from
support for versioning or
co-occurrence constraints, and when they are
conformant to version 1.0 (or are made conformant by the
removal of versioning information), should have the same
validation behavior
across 1.0 and 1.1 implementations (again except possibly in
edge cases and in the details of the resulting PSVI).

1.2 Purpose

The purpose of XML Schema Definition Language: Structures is to define the nature of
XSD schemas and their component parts,
provide an inventory of XML markup constructs with which to
represent schemas, and define the application of schemas to XML
documents.

The purpose of an XSD schema is to define and describe a
class of XML documents by using schema components to constrain
and document the meaning, usage and relationships of their
constituent parts: datatypes, elements and their content and
attributes and their values. Schemas can also provide for
the specification of additional document information, such as
normalization and defaulting of attribute and element values.
Schemas have facilities for self-documentation. Thus, XML Schema Definition Language: Structures can
be used to define, describe and catalogue XML vocabularies for
classes of XML documents.

Any application that consumes well-formed XML can use the
formalism defined here to express
syntactic, structural and value constraints applicable to its
document instances. The XSD formalism allows a useful level of
constraint checking to be described and implemented for a wide
spectrum of XML applications. However, the language defined by
this specification does not attempt to provide all
the facilities that might be needed by applications. Some applications
will require constraint capabilities not expressible in this
language, and so will need to perform their own additional
validations.

1.3.1 XSD Namespaces

1.3.1.1 The Schema Namespace (xs)

The XML representation of schema components uses a vocabulary
identified by the namespace name http://www.w3.org/2001/XMLSchema.
For brevity, the text and examples in this specification use
the prefix xs: to stand for this
namespace; in practice, any prefix can be used.
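
As a non-normative illustration of the last point, the following Python
sketch parses two small schema documents that differ only in the prefix
bound to the schema namespace, and compares the expanded names of their
elements. It relies only on the standard library module
xml.etree.ElementTree; the element name "note" and the prefix "xsd" are
made up for the example.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"

# Two schema documents that differ only in the prefix bound to the namespace.
WITH_XS = f'<xs:schema xmlns:xs="{XS_NS}"><xs:element name="note"/></xs:schema>'
WITH_XSD = f'<xsd:schema xmlns:xsd="{XS_NS}"><xsd:element name="note"/></xsd:schema>'


def expanded_names(document):
    """Return the expanded name ({namespace}local) of every element."""
    return [element.tag for element in ET.fromstring(document).iter()]


# Both documents yield the same expanded names, so the prefix is immaterial.
names_xs = expanded_names(WITH_XS)
names_xsd = expanded_names(WITH_XSD)
```

Both lists contain the same two expanded names, one for the schema
element and one for the element declaration, which is why only the
namespace name, and not the prefix, identifies the vocabulary.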

Note:
The namespace for schema documents is unchanged from version
1.0 of this specification, because any schema document valid
under the rules of version 1.0 has essentially the same
validation semantics under this specification as it did under
version 1.0 (Second Edition).
There are a few exceptions to this rule, involving errors in
version 1.0 of this specification which were not reparable by
errata and which have therefore been fixed only in this
version of this specification, not in version 1.0.

Note:
The data model used by [XPath 2.0] and other
specifications, namely [XDM], makes use of
type labels in the
XSD namespace (untyped,
untypedAtomic) which are not defined in this
specification; see the [XDM]
specification for details of those types.

Users of the namespaces defined here should be aware, as a
matter of namespace policy, that more names
in this namespace may be given
definitions in future versions of this or other
specifications.

1.3.1.2 The Schema Instance Namespace (xsi)

This specification defines
several attributes for direct use in any XML documents, as
described in Schema-Related Markup in Documents Being Validated (§2.6).
These attributes are in the namespace whose name is http://www.w3.org/2001/XMLSchema-instance.
For brevity, the text and examples in this specification use
the prefix xsi: to stand for this namespace; in
practice, any prefix can be used.
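
A non-normative sketch of how an application might pick out attributes in
the schema instance namespace follows. The instance document, the helper
name xsi_attributes and the prefix "inst" are hypothetical; xsi:nil is
one of the attributes described in Schema-Related Markup in Documents
Being Validated (§2.6).

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# A hypothetical instance document; the prefix "inst" is deliberately unusual.
INSTANCE = f'<order xmlns:inst="{XSI_NS}"><shipDate inst:nil="true"/></order>'


def xsi_attributes(element):
    """Map local name -> value for attributes in the schema instance namespace."""
    marker = "{" + XSI_NS + "}"
    return {
        name[len(marker):]: value
        for name, value in element.attrib.items()
        if name.startswith(marker)
    }


ship_date = ET.fromstring(INSTANCE).find("shipDate")
found = xsi_attributes(ship_date)
```

The lookup is keyed on the namespace name, so it works whatever prefix
the document author chose.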

Users of the namespaces defined here should be aware, as a
matter of namespace policy, that more names
in this namespace may be given
definitions in future versions of this or other
specifications.

1.3.1.3 The Schema Versioning Namespace (vc)

The pre-processing of schema documents described in
Conditional inclusion (§4.2.1) uses
attributes in the namespace
http://www.w3.org/2007/XMLSchema-versioning.
For brevity, the text and examples in this specification use
the prefix vc: to stand for this
namespace; in practice, any prefix can be used.
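
The following non-normative Python sketch shows one way a pre-processor
might read attributes in the versioning namespace. Only the namespace
name is taken from this section. The comparison rule used here (retain an
element when minVersion <= V < maxVersion, for a processor supporting
version V) is an assumption made for illustration and should be checked
against Conditional inclusion (§4.2.1); the element names in the sample
schema document are hypothetical.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XS_NS = "http://www.w3.org/2001/XMLSchema"
VC_NS = "http://www.w3.org/2007/XMLSchema-versioning"

SCHEMA = f"""<xs:schema xmlns:xs="{XS_NS}" xmlns:vc="{VC_NS}">
  <xs:element name="always"/>
  <xs:element name="newOnly" vc:minVersion="1.1"/>
  <xs:element name="oldOnly" vc:maxVersion="1.1"/>
</xs:schema>"""


def is_retained(element, version):
    """Assumed rule: keep the element when minVersion <= version < maxVersion."""
    low = element.get("{" + VC_NS + "}minVersion")
    high = element.get("{" + VC_NS + "}maxVersion")
    if low is not None and Decimal(low) > version:
        return False
    if high is not None and Decimal(high) <= version:
        return False
    return True


def retained_names(version):
    """Names of the top-level declarations kept for the given version."""
    root = ET.fromstring(SCHEMA)
    return [child.get("name") for child in root if is_retained(child, version)]


names_for_1_0 = retained_names(Decimal("1.0"))
names_for_1_1 = retained_names(Decimal("1.1"))
```

Under the assumed rule, a version 1.0 pre-processor keeps "always" and
"oldOnly", while a version 1.1 pre-processor keeps "always" and "newOnly".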

Users of the namespaces defined here should be aware, as a
matter of namespace policy, that more names in this namespace
may be given definitions in future versions of this or other
specifications.

1.3.2 Namespaces with Special Status

Except as otherwise specified elsewhere in this specification,
if components are ·present· in a schema, or source
declarations are included in an XSD schema document, for
components in any of the following namespaces, then the
components, or the declarations, should agree with the
descriptions given in the relevant specifications and with the
declarations given in any applicable XSD schema documents
maintained by the World Wide Web Consortium for these
namespaces. If they do not, the effect is ·implementation-dependent·
and not defined by this specification.

http://www.w3.org/XML/1998/namespace

http://www.w3.org/2001/XMLSchema

http://www.w3.org/2001/XMLSchema-instance

http://www.w3.org/2007/XMLSchema-versioning

Note: Depending on implementation details, some processors may
be able to process and use (for example) variant forms of the
schema for schema documents devised for specialized purposes;
if so, this specification does not forbid the use of such variant
components. Other processors, however, may find it
impossible to validate and use alternative components for
these namespaces; this specification does not require them
to do so. Users who have an interest in such specialized
processing should be aware of the attending interoperability
problems and should exercise caution.

This flexibility does not extend to the components described in
this specification or in [XML Schema: Datatypes] as being
included in every schema, such as those for the primitive and
other built-in datatypes. Since those components are by
definition part of every schema, it is not possible to have
different components with the same expanded names present in
the schema without violating constraints defined elsewhere
against multiple components with the same expanded names.

Components and source declarations must not specify
http://www.w3.org/2000/xmlns/ as their
target namespace. If they do, then the schema
and/or schema document is in ·error·.

Note: Any confusion in the use, structure, or meaning of this namespace
would have catastrophic effects on the interpretability of
this specification.

1.3.3 Conventional Namespace Bindings

Several namespace prefixes are conventionally used in this
document for notational convenience. The following bindings are
assumed.

xs bound to http://www.w3.org/2001/XMLSchema
(defined in this and related specifications)

xsi bound to
http://www.w3.org/2001/XMLSchema-instance (defined in this and
related specifications)

xsl bound to
http://www.w3.org/1999/XSL/Transform

In practice, any prefix bound to the appropriate namespace
name may be used (unless otherwise specified by the definition
of the namespace in question, as for xml and
xmlns).

1.3.4 Schema Language Identifiers

Sometimes other specifications or Application Programming
Interfaces (APIs) need to refer to the XML Schema Definition Language in
general; sometimes they need to refer to a specific version of
the language. To make such references easy and to enable consistent
identifiers to be used, we provide the following
URIs to identify these concepts.

http://www.w3.org/XML/XMLSchema

Identifies the XML Schema Definition Language in general, without referring
to a specific version of it.

http://www.w3.org/XML/XMLSchema/vX.Y

Identifies the language described in version X.Y of the XSD specification. URIs of this form refer to
a numbered version
of the language in general. They do not distinguish among different working drafts or
editions of that version. For example,
http://www.w3.org/XML/XMLSchema/v1.0 identifies
XSD version 1.0 and http://www.w3.org/XML/XMLSchema/v1.1 identifies
XSD version 1.1.

http://www.w3.org/XML/XMLSchema/vX.Y/Ne

Identifies the language described in the N-th edition of version X.Y of
the XSD specification. For example, http://www.w3.org/XML/XMLSchema/v1.0/2e
identifies the second edition of XSD version 1.0.

http://www.w3.org/XML/XMLSchema/vX.Y/Ne/yyyymmdd

Identifies the language described in the N-th edition of version
X.Y of
the XSD specification published on the particular date
yyyy-mm-dd. For example,
http://www.w3.org/XML/XMLSchema/v1.0/1e/20001024
identifies the language
defined in the XSD version 1.0 Candidate
Recommendation (CR) published on 24 October 2000, and
http://www.w3.org/XML/XMLSchema/v1.0/2e/20040318
identifies the language
defined in the XSD version 1.0 Second Edition Proposed
Edited Recommendation (PER)
published on 18 March 2004.
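
The four URI forms above nest: each adds one path segment to the previous
one. A non-normative Python sketch of a recognizer for them follows. The
function name parse_language_identifier and the shape of its result are
hypothetical, and the pattern assumes that X, Y and N are decimal digit
strings and that the date is written as eight digits, as in the examples
above.

```python
import re

BASE = "http://www.w3.org/XML/XMLSchema"

# Optional /vX.Y, then optional /Ne, then optional /yyyymmdd.
_PATTERN = re.compile(
    re.escape(BASE)
    + r"(?:/v(?P<version>\d+\.\d+)"
    + r"(?:/(?P<edition>\d+)e"
    + r"(?:/(?P<date>\d{8}))?)?)?"
)


def parse_language_identifier(uri):
    """Return the version, edition and date named by a URI, or None if no match."""
    match = _PATTERN.fullmatch(uri)
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}


general = parse_language_identifier("http://www.w3.org/XML/XMLSchema")
second_edition = parse_language_identifier("http://www.w3.org/XML/XMLSchema/v1.0/2e")
dated = parse_language_identifier("http://www.w3.org/XML/XMLSchema/v1.0/2e/20040318")
```

An empty result means the URI names the language in general; each further
key narrows the reference to a version, an edition, or a dated draft.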

Conforming implementations of this specification may provide
either the 1.1-based datatypes or the 1.0-based datatypes, or
both. If both are supported, the choice of which datatypes to
use in a particular assessment episode should be under user
control.

Note:
It is a consequence of the
rule just given that implementations
may provide the heuristic of using the 1.1
datatypes if the input is labeled as XML 1.1, and the 1.0
datatypes if the input is labeled 1.0. It should be noted
however that the XML version number is not required to be
present in the input to an assessment episode, and in any case
the heuristic should be subject to override by users, to
support cases where users wish to accept XML 1.1 input but
validate it using the 1.0 datatypes, or accept XML 1.0 input and
validate it using the 1.1 datatypes.
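
The heuristic and its override can be sketched as follows
(non-normative). The function name and parameters are hypothetical, and
the fallback to the 1.0 datatypes for unlabeled input is an assumption of
this sketch, not something this specification prescribes.

```python
def choose_datatypes(xml_version=None, user_choice=None):
    """Pick "1.0" or "1.1" datatypes for one assessment episode."""
    if user_choice is not None:
        return user_choice  # an explicit user setting always wins
    if xml_version == "1.1":
        return "1.1"        # heuristic: follow the label on the input
    return "1.0"            # assumed fallback for "1.0" or unlabeled input
```

For instance, choose_datatypes("1.1", user_choice="1.0") selects the 1.0
datatypes for XML 1.1 input, which is one of the cases the note says the
override should support.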

Note:
Some users will perhaps wish to accept only XML 1.1 input, or
only XML 1.0 input. The rules
just given ensure that conforming implementations of this
specification which accept XML input may accept XML 1.0, XML
1.1, or both and may provide user control over which versions
of XML to accept.

1.5 Documentation Conventions and Terminology

This section introduces the highlighting and typography used
in this document to present technical material.

Unless otherwise noted, the entire text of
this specification is normative. Exceptions include:

notes

sections explicitly marked non-normative

examples and their commentary

informal descriptions of the consequences of rules
formally and normatively stated elsewhere (such informal
descriptions are typically introduced by phrases like
"Informally, ..." or "It is a
consequence of ... that ...")

Explicit statements that some material is normative are not
to be taken as implying that material not so described
is non-normative
(other than that mentioned in the list just given).

Special terms are defined at their point of introduction in the
text. For example [Definition:] a term is something used with a
special meaning. The definition is labeled as such
and the term it defines is displayed in boldface. The end of the
definition is not specially marked in the displayed or printed
text. Uses of defined terms are links to their definitions, set
off with middle dots, for instance ·term·.

Non-normative examples are set off in boxes and accompanied by
a brief explanation:

References to properties of schema components are links to the
relevant definition as exemplified above, set off with curly
braces, for instance
{example property}.

For a given component C, an expression
of the form "C.{example property}"
denotes the (value of the) property
{example property} for component C.
The leading "C." (or more) is sometimes omitted,
if the identity of the component and any other omitted properties
is understood from the context.
This "dot operator" is left-associative, so
"C.{p1}.{p2}" means the same as "(C.{p1}) . {p2}"
and denotes the value of property {p2}
within the component or ·property record· which itself
is the value of C's {p1} property.
White space on either side of the dot operator has no significance
and is used (rarely) solely for legibility.
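
As a non-normative illustration, left-associativity of the dot operator
corresponds to a left fold over the property names. In the Python sketch
below, components and property records are modeled as plain dictionaries;
the helper name dot and the property names p1 and p2 are placeholders.

```python
from functools import reduce


def dot(component, *properties):
    """Evaluate C.{p1}.{p2}... left to right, i.e. ((C.{p1}).{p2})..."""
    return reduce(lambda value, name: value[name], properties, component)


# Hypothetical component whose {p1} property is a property record with {p2}.
C = {"p1": {"p2": "inner value"}}

chained = dot(C, "p1", "p2")         # C.{p1}.{p2}
stepwise = dot(dot(C, "p1"), "p2")   # (C.{p1}).{p2}
```

Both expressions denote the same value, the {p2} property of whatever
C's {p1} property holds.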

For components C1 and C2, an expression
of the form "C1 . {example property 1} = C2 . {example property 2}"
means that C1 and C2 have the same value for the
property (or properties) in question. Similarly,
"C1 = C2" means that C1 and C2 are
identical, and "C1.{example property}
= C2" that C2 is the value of
C1.{example property}.

The correspondence between an element information item which is
part of the XML representation of a schema and one or more schema
components is presented in a tableau which illustrates the
element information item(s) involved. This is followed by a
tabulation of the correspondence between properties of the
component and properties of the information item. Where context
determines which of several
different components corresponds to the
source declaration, several tabulations, one per
context, are given. The property correspondences are normative,
as are the illustrations of the XML representation element
information items.

In the XML representation, bold-face attribute names (e.g.
count below) indicate a required attribute
information item, and the rest are optional. Where an attribute
information item has an enumerated type definition, the values
are shown separated by vertical bars, as for size
below; if there is a default value, it is shown following a
colon. Where an attribute information item has a built-in simple
type definition defined in [XML Schema: Datatypes], a hyperlink
to
its definition therein is given.

The allowed content of the information item is shown as a
grammar fragment, using the Kleene operators ?,
* and +. Each element name therein is
a hyperlink to its own illustration.

Description of what
the property corresponds to, e.g. the value of the
size [attribute]

References to elements in the text are links to the relevant
illustration as exemplified above, set off with angle brackets,
for instance <example>.

Unless otherwise specified, references to attribute values
are references to the ·actual value· of the attribute information
item in question, not to its ·normalized value· or to other forms
or varieties of "value" associated with it.
For a given element information item E, expressions of the
form "E has att1 = V"
are short-hand for "there is an attribute information
item named att1 among the [attributes] of E and
its ·actual value·
is V."
If the identity of E is clear from context, expressions
of the form "att1 = V"
are sometimes used.
The form "att1 ≠ V" is also used
to specify that the ·actual value· of att1 is
not V.

References to properties of information items as defined in
[XML-Infoset] are notated as links to the relevant
section thereof, set off with square brackets, for example
[children].

Properties which this specification defines for information
items are introduced as follows:

References to properties of information items defined in this
specification are notated as links to their introduction as
exemplified above, set off with square brackets, for example
[new property].

The "dot operator" described above
for components and their properties is also used for information items
and their properties. For a given information item I, an expression
of the form "I . [new property]"
denotes the (value of the) property
[new property] for item I.

Lists of normative constraints are typically introduced with
phrases like
"all of the following are true" (or "... apply"),
"one of the following is true",
"at least one of the following is true",
"one or more of the following is true",
"the appropriate case among the following is true",
etc.
The phrase "one of the following is true"
is used in cases where the authors believe the items listed
to be mutually exclusive (so that the distinction between
"exactly one" and "one or more"
does not arise). If the items in such a list are not in fact
mutually exclusive, the phrase "one of the following"
should be interpreted as meaning "one or more of the
following".
The phrase "the appropriate case among the following"
is used only when the cases are thought by the authors to be
mutually exclusive; if the cases in such a list are not in fact
mutually exclusive, the first applicable case should be
taken. Once a case has been encountered with a true condition,
subsequent cases must not be tested.
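
The reading rules for these phrases can be sketched in Python as follows
(non-normative; the function names are hypothetical). Conditions in
appropriate_case are passed as zero-argument callables so that the sketch
can show that cases after the first applicable one are never tested.

```python
def one_of(conditions):
    """Fallback reading of 'one of the following is true': one or more hold."""
    return any(conditions)


def appropriate_case(cases):
    """Take the first applicable case; later conditions are never evaluated."""
    for condition, outcome in cases:
        if condition():
            return outcome
    return None


tested = []  # records which conditions were actually evaluated


def condition(label, result):
    def check():
        tested.append(label)
        return result
    return check


outcome = appropriate_case([
    (condition("first", False), "case 1"),
    (condition("second", True), "case 2"),
    (condition("third", True), "case 3"),
])
```

Here the second case is taken and the third condition is never
evaluated, even though it would also have been true.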

The following highlighting is used for non-normative commentary
in this document:

Note: General comments directed to all readers.

Within normative prose in this
specification, the words may,
should,
must and must not are
defined as follows:

may

Schemas,
schema documents, and processors are
permitted to but need not behave as described.

should

It is recommended that schemas,
schema documents,
and
processors behave as described, but there
can be valid reasons for them not to; it is important that the
full implications be understood and carefully weighed before
adopting behavior at variance with the recommendation.

must

(Of schemas and
schema documents:)
Schemas and documents are required to behave as
described; otherwise they are in ·error·.

(Of
processors:)
Processors are
required to behave as described.

must not

Schemas,
schema documents, and processors
are forbidden to behave as
described; schemas and documents which nevertheless
do so are in ·error·.

error

A failure of a schema
or schema
document to conform to the rules of this
specification.

Except as otherwise specified,
processors must distinguish error-free (conforming) schemas
and schema documents used in ·assessment· from those with errors;
if a schema used in ·assessment·
or a schema document used in constructing a schema
is in error,
processors must report the fact;
if more than one is in error, it is ·implementation-dependent·
whether more than one is reported as being in error.
If one or more of the constraint codes given
in Outcome Tabulations (normative) (§C) is applicable, it is
·implementation-dependent· how many of them, and which,
are reported.

Note: Failure of an XML document to be valid against a particular
schema is not (except for the special case of a schema
document consulted in the course of building a schema) in
itself a failure to conform to this specification and thus,
for purposes of this specification, not an error.

Note: Notwithstanding the fact that (as just noted) failure to be
schema-valid is not a violation of this specification and
thus not strictly speaking an error as defined here,
the names of the PSVI properties [schema error code] (for attributes) and [schema error code] (for elements) are retained for
compatibility with other versions of this specification, and
because in many applications of XSD, non-conforming
documents are "in error" for
purposes of those applications.

deprecated

A feature or construct defined in this specification
described as deprecated is retained in this
specification for compatibility with previous versions
of the specification, but its use is not advisable and
schema authors should avoid its use if possible.

Deprecation has no effect on the conformance of schemas
or schema documents which use deprecated features.
Since deprecated features are part of the specification,
processors must support them, although some processors
may choose to issue warning messages when deprecated
features are encountered.

Features deprecated in this version of this specification
may be removed entirely in future versions, if any.

These definitions describe in terms
specific to this document the meanings assigned to these terms by
[IETF RFC 2119]. The specific wording follows
that of [XML 1.1].

Where these terms appear without special highlighting,
they are used in their ordinary senses and do not express conformance
requirements. Where these terms appear highlighted within
non-normative material (e.g. notes), they are recapitulating
rules normatively stated elsewhere.

2 Conceptual Framework

This chapter gives an overview of XML Schema Definition Language: Structures at the level of its
abstract data model. Schema Component Details (§3) provides details
on this model, including a normative representation in XML for the
components of the model. Readers interested primarily in learning
to write schema documents will find it most
useful first to read [XML Schema: Primer] for a
tutorial introduction, and only then to consult the sub-sections of
Schema Component Details (§3) named XML Representation of
... for the details.

2.1 Overview of XSD

An XSD schema is
a set of components such as type definitions and
element declarations. These can be used to assess the validity of
well-formed element and attribute information items (as defined
in [XML-Infoset]), and furthermore
may
specify augmentations to those items and their descendants. This
augmentation makes explicit information implicit in the original
document, such as normalized and/or default values for attributes
and elements and the types of element and attribute information
items. The input information set
can also be augmented with information about the validity of the
item, or about other properties described in this
specification. [Definition:] We refer to the augmented infoset which
results from conformant processing as defined in this
specification as the post-schema-validation
infoset, or PSVI. Conforming processors may provide
access to some or
all of the PSVI, as described in Subset of the Post-schema-validation Infoset (§D.1). The mechanisms by which
processors provide such
access to the PSVI are neither defined nor constrained by this
specification.

As it is used in this specification, the
term schema-validity assessment has two aspects:

1 Determining local schema-validity, that is,
whether an element or attribute information item satisfies
the constraints embodied in the relevant components of an
XSD schema;

2 Synthesizing an overall validation
outcome for the item, combining local schema-validity with
the results of schema-validity assessments of its
descendants, if any, and adding appropriate augmentations to
the infoset to record this
outcome.

Throughout this specification, [Definition:] the word valid and its derivatives are
used to refer to clause 1 above, the
determination of local schema-validity.

Throughout this specification, [Definition:] the word assessment is used to
refer to the overall process of local validation,
schema-validity assessment and infoset
augmentation.
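
The relationship between the two aspects can be pictured with a
deliberately simplified, non-normative Python sketch. It reduces local
validity to a precomputed boolean and the infoset augmentation to a
single made-up "outcome" entry; the class Item and the function assess
are hypothetical, and the PSVI properties actually defined by this
specification are richer than a single boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    locally_valid: bool                        # aspect 1, decided elsewhere
    children: list = field(default_factory=list)
    psvi: dict = field(default_factory=dict)   # stand-in for infoset augmentation


def assess(item):
    """Aspect 2: combine local validity with the outcomes of all descendants."""
    child_outcomes = [assess(child) for child in item.children]
    outcome = item.locally_valid and all(child_outcomes)
    item.psvi["outcome"] = "valid" if outcome else "invalid"
    return outcome


leaf_ok = Item("a", True)
leaf_bad = Item("b", False)
root = Item("root", True, [leaf_ok, leaf_bad])
root_outcome = assess(root)
```

In this toy model the root is locally valid, yet its overall outcome is
invalid because one descendant fails, and every item visited carries a
record of its own outcome.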

During ·assessment·, some or
all of the element and attribute information items in the input
document are associated with declarations and/or type
definitions; these declarations and type definitions are then
used in the ·assessment· of those items, in a
recursive process. [Definition:] The declaration associated with an information
item, if any, and with respect to which its validity is ·assessed· in a given assessment episode
is said to govern the item, or to be its
governing element or attribute declaration.
Similarly the type definition with respect to which the
type-validity of an item is assessed is its
governing type definition.

Just as [XML 1.1] and
[XML-Namespaces 1.1] can be described in terms of
information items, XSD schemas can be described in terms of
an abstract data model. In defining schemas in terms of
an abstract data model, this specification rigorously specifies
the information which must be available to a conforming
XSD processor. The abstract model for schemas is
conceptual only, and does not mandate any particular
implementation or representation of this information. To
facilitate interoperation and sharing of schema information, a
normative XML interchange format for schemas is provided.

[Definition:] Schema
component is the generic term for the building blocks
that make up the abstract data model
of the schema. [Definition:] An XSD schema is a set of ·schema components·. There are
several kinds of schema component, falling
into three groups. The primary schema components, which may (type
definitions) or must (element and attribute declarations) have
names, are as follows:

Simple type definitions

Complex type definitions

Attribute declarations

Element declarations

The secondary schema components are as
follows:

Attribute group definitions

Identity-constraint definitions

Type alternatives

Assertions

Model group definitions

Notation declarations

Finally, the "helper" schema components provide small
parts of other schema components; they are dependent on their context:

Annotations

Model groups

Particles

Wildcards

Attribute Uses

The
name [Definition:] Component covers all the different kinds of
schema component defined in this specification.

Note: At the abstract level, there is no requirement that the
components of a schema share a ·target namespace·. Any schema for
use in ·assessment· of documents
containing names from more than one namespace will of necessity
include components with different ·target namespaces·. This contrasts
with the situation at the level of the XML representation of
components, in which each schema document contributes
definitions and declarations to a single target namespace.

·Validation·, defined in detail
in Schema Component Details (§3), is a relation between information
items and schema components. For example, an attribute
information item is ·validated·
with respect to an attribute declaration, a list of element
information items with respect to a
content model, and so on. The following sections briefly
introduce the kinds of components in the schema abstract data
model, other major features of the abstract model, and how they
contribute to ·validation·.

2.2.1 Type Definition Components

The abstract model provides two kinds of type definition
component: simple and complex.

[Definition:] This specification
uses the phrase type definition in cases where no
distinction need be made between simple and complex
types.

Type definitions form a hierarchy with a single root. The
subsections below first describe characteristics of that
hierarchy, then provide an introduction to simple and complex
type definitions themselves.

[Definition:] A
type defined with the same constraints as its ·base type definition·, or with more, is
said to be a restriction. The added constraints might include narrowed
ranges or reduced alternatives. Given two types A and B, if the definition of
A is a ·restriction· of the
definition of B, then members of type A are always locally
valid against type B as well.

[Definition:] A complex
type definition which allows element or attribute content in
addition to that allowed by another specified type definition
is said to be an extension.

[Definition:] A special complex type
definition (referred to in earlier versions of this
specification as 'the ur-type definition'), whose
name is anyType in the XSD namespace, is
present in each ·XSD schema·. The definition of
anyType serves as default
type definition for element declarations whose XML
representation does not specify one.

[Definition:] A special simple type
definition, whose name is error in the XSD
namespace, is also present in each ·XSD schema·. The
XSD error type
has no valid instances. It can be used in any place where
other types are normally used; in particular, it can be used
in conditional type assignment to cause elements which satisfy
certain conditions to be invalid.

2.2.1.2 Simple Type Definition

A simple type definition is a set of constraints on strings
and information about the values they encode, applicable to the
·normalized value· of an attribute information item or of an element
information item with no element children. Informally, it
applies to the values of attributes and the text-only content
of elements.

[Definition:] There is a further special datatype
called anyAtomicType, a
·restriction· of
·xs:anySimpleType·, which is the ·base type definition·
of all the primitive
datatypes. This type definition is often referred
to simply as "xs:anyAtomicType".
It too is
considered to have an unconstrained lexical space. Its value
space consists of the union of the value spaces of all the
primitive datatypes.

The mapping from lexical space to value space is unspecified
for items whose type definition is ·xs:anySimpleType· or ·xs:anyAtomicType·. Accordingly,
this specification does not constrain processors'
behavior in areas
where this mapping is implicated, for example checking such
items against enumerations, constructing default attributes or
elements whose declared type definition is ·xs:anySimpleType·
or ·xs:anyAtomicType·, or
checking identity constraints involving such items.

Note: The Working Group expects to return to this area in a future
version of this specification.

[XML Schema: Datatypes]
provides mechanisms for defining new simple type definitions
by ·restricting·
some primitive
or ordinary datatype. It also
provides mechanisms for constructing new simple type
definitions whose members are lists of items
themselves constrained by some other simple type definition, or
whose membership is the union of the memberships of some other
simple type definitions. Such list and union simple type
definitions are also ·restrictions· of
·xs:anySimpleType·.
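
As an informal illustration of the three mechanisms just described, the following sketch defines a restriction, a list and a union. The type names (percentage, percentageList, unknownToken, percentageOrUnknown) are hypothetical and are not part of this specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction: narrows the range of the built-in xs:integer. -->
  <xs:simpleType name="percentage">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: a white-space-separated list whose items are percentages. -->
  <xs:simpleType name="percentageList">
    <xs:list itemType="percentage"/>
  </xs:simpleType>

  <!-- Restriction with a single permitted token. -->
  <xs:simpleType name="unknownToken">
    <xs:restriction base="xs:token">
      <xs:enumeration value="unknown"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: a value is valid if it belongs to either member type. -->
  <xs:simpleType name="percentageOrUnknown">
    <xs:union memberTypes="percentage unknownToken"/>
  </xs:simpleType>

</xs:schema>
```

Under these assumed definitions, "42" and "unknown" are both acceptable lexical forms for percentageOrUnknown, while "150" is not.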

2.2.1.3 Complex Type Definition

A complex type definition is a set of attribute declarations
and a content type, applicable to the [attributes] and
[children] of an element information item respectively. The
content type may require the [children] to contain
neither element nor character information items (that is, to be
empty), or to be a
string which belongs to a particular simple type, or to contain a sequence of
element information items which conforms to a particular model
group, with or without character information items as well.

A complex type which extends another does so by having
additional content model particles at the end of the other
definition's content model, or by having additional attribute
declarations, or both.
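
A minimal sketch of extension by appending, assuming two hypothetical types addressType and internationalAddressType: the derived type adds one particle at the end of the base content model and one attribute declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: street followed by city. -->
  <xs:complexType name="addressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Extension: the effective content is street, city, country,
       plus an additional optional attribute. -->
  <xs:complexType name="internationalAddressType">
    <xs:complexContent>
      <xs:extension base="addressType">
        <xs:sequence>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="verified" type="xs:boolean"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```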

Note: For the most part, this
specification allows only appending, and not other kinds of
extensions. This decision simplifies application processing
required to cast instances from derived to base type.
A special case allows the
extension of all-groups in ways that do not
guarantee that the new material occurs only at the end of
the content. Future versions may allow more kinds
of extension, requiring more complex transformations to
effect casting.

2.2.2 Declaration Components

There are three kinds of declaration component: element, attribute,
and notation. Each is described in a section below. Also
included is a discussion of element substitution groups, which
is a feature provided in conjunction with element
declarations.

2.2.2.1 Element Declaration

An element declaration is an association of a name with a
type definition, either simple or complex, an (optional)
default value and a (possibly empty) set of identity-constraint
definitions. The association is either global or scoped to a
containing complex type definition. A top-level element
declaration with name 'A' is broadly comparable to a pair of
DTD declarations as follows, where the associated type
definition fills in the ellipses:

<!ELEMENT A . . .>
<!ATTLIST A . . .>

Element declarations contribute to ·validation· as part of model group
·validation·, when their defaults
and type components are checked against an element information
item with a matching name and namespace, and by triggering
identity-constraint definition ·validation·.
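
For illustration only, the sketch below shows three top-level element declarations and one local one. The element names are hypothetical; the declaration of A plays roughly the role of the DTD pair shown above, with the anonymous complex type supplying both the content model and the attribute list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global declaration: name, simple type and a default value. -->
  <xs:element name="status" type="xs:token" default="draft"/>

  <!-- Global declaration with no type specified: xs:anyType is assumed. -->
  <xs:element name="anything"/>

  <!-- Global declaration whose type fills in both the content model
       and the attributes of A. -->
  <xs:element name="A">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration, scoped to the containing complex type. -->
        <xs:element name="B" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```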

2.2.2.2 Element Substitution Group

In XML, the name
and content of an element must correspond exactly to the
element type referenced in the corresponding content model.

[Definition:] Through the new mechanism of element substitution
groups, XSD provides a more powerful model
supporting substitution of one named element for
another. Any top-level element declaration can serve
as the defining member, or head, for an element ·substitution group·.
Other top-level element declarations, regardless of target
namespace, can be designated as members of the ·substitution group·
headed by this element. In a suitably enabled content model, a
reference to the head ·validates·
not just the head itself, but elements corresponding to any
other member of the ·substitution group· as well.

All such members must have type definitions which are
either the same as the head's type definition or derived
from it. Therefore, although the names of elements
can vary widely as new namespaces and members of the
·substitution group· are defined, the content of member elements is
constrained by the type
definition of the
·substitution group· head.

Note that element substitution groups are not represented as
separate components. They are specified in the property values
for element declarations (see
Element Declarations (§3.3)).
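
The following sketch, using hypothetical element names, shows a head (comment) and two members of its substitution group. Membership is recorded on the member declarations through the substitutionGroup attribute, not in a separate component.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Head of the substitution group. -->
  <xs:element name="comment" type="xs:string"/>

  <!-- Member with the same type as the head. -->
  <xs:element name="shipComment" type="xs:string"
              substitutionGroup="comment"/>

  <!-- Member whose type is derived from the head's type by restriction. -->
  <xs:element name="customerComment" substitutionGroup="comment">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:maxLength value="200"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- A reference to the head also admits the members. -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With these assumed declarations, an order element may contain comment, shipComment and customerComment children in any mixture.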

2.2.2.3 Attribute Declaration

An attribute declaration is an association between a name and
a simple type definition, together with occurrence information
and (optionally) a default value. The association is either
global, or local to its containing complex type definition.
Attribute declarations contribute to ·validation· as part of complex type
definition ·validation·, when
their occurrence, defaults and type components are checked
against an attribute information item with a matching name and
namespace.

2.2.2.4 Notation Declaration

A notation declaration is an association between a name and
an identifier for a notation. For an attribute or element information item to
be ·valid· with respect to a
NOTATION simple type definition, its value must
have been declared with a notation declaration.
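
As a sketch, assuming hypothetical notation names and identifiers, the schema below declares two notations and an attribute whose value must name one of them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Notation declarations: a name associated with an identifier. -->
  <xs:notation name="jpeg" public="image/jpeg"/>
  <xs:notation name="png" public="image/png"/>

  <xs:element name="picture">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:hexBinary">
          <!-- The attribute value must be the name of a declared notation. -->
          <xs:attribute name="pictype">
            <xs:simpleType>
              <xs:restriction base="xs:NOTATION">
                <xs:enumeration value="jpeg"/>
                <xs:enumeration value="png"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
```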

2.2.3 Model Group Components

The model group, particle, and wildcard components
contribute to the portion of a complex type definition that
controls an element information item's content.

2.2.3.1 Model Group

A model group is a constraint in the form of a grammar
fragment that applies to lists of element information items. It
consists of a list of particles, i.e. element declarations,
wildcards and model groups. There are three varieties of model
group:

Sequence (the element information items match the
particles in sequential order);

Conjunction (the element information items match the
particles, in any order);

Disjunction (the element information items match one
of the particles).

Each model group denotes a set of
sequences of element information items. Regarding that set of
sequences as a language, the set of sequences recognized by a
group G may be written L(G). [Definition:] A model group G is said to accept
or recognize the members of L(G).
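
In the XML representation, model groups are written with the xs:sequence, xs:all and xs:choice elements. The sketch below uses hypothetical type and element names; the comments describe the language L(G) each group accepts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Sequence: accepts exactly (given, family). -->
  <xs:complexType name="nameType">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Conjunction: accepts (width, height) and (height, width). -->
  <xs:complexType name="dimensionsType">
    <xs:all>
      <xs:element name="width" type="xs:decimal"/>
      <xs:element name="height" type="xs:decimal"/>
    </xs:all>
  </xs:complexType>

  <!-- Disjunction: accepts (email) and (phone). -->
  <xs:complexType name="contactType">
    <xs:choice>
      <xs:element name="email" type="xs:string"/>
      <xs:element name="phone" type="xs:string"/>
    </xs:choice>
  </xs:complexType>

</xs:schema>
```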

2.2.3.2 Particle

A particle is a term in the grammar for element content,
consisting of either an element declaration, a wildcard or a
model group, together with occurrence constraints.
Particles contribute to ·validation· as part of complex type
definition ·validation·, when
they allow anywhere from zero to many element information items
or sequences thereof, depending on their contents and
occurrence constraints.

Each content model, indeed each
particle and each term,
denotes a set of sequences of element information items. Regarding
that set of sequences as a language, the set of sequences recognized
by a particle P may be written L(P).
[Definition:] A particle P is said to
accept or recognize the members of
L(P). Similarly, a term T accepts or recognizes the members
of L(T).
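
To make the notion of L(P) concrete, consider the following sketch with hypothetical names. The occurrence constraints are carried by the minOccurs and maxOccurs attributes; where they are omitted, both default to 1.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="shortListType">
    <!-- The sequence particle P (occurring exactly once) accepts:
           (title)
           (title, item)
           (title, item, item)
         and nothing else. -->
    <xs:sequence>
      <!-- Particle: element declaration occurring exactly once. -->
      <xs:element name="title" type="xs:string"/>
      <!-- Particle: element declaration occurring zero to two times. -->
      <xs:element name="item" type="xs:string"
                  minOccurs="0" maxOccurs="2"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

Only the names of the items matter for membership in L(P) here: a sequence (title, item) is accepted even if the content of item later turns out to be invalid against its own type.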

Note: The language accepted by a content model plays a role in determining
whether an element information item is locally valid or not: if the
appropriate content model does not accept the sequence of elements
among its children, then the element information item is not locally
valid. (Some additional constraints must
also be met: not every
sequence in L(P) is locally valid against P. See
Principles of Validation against Groups (§3.8.4.2).)

No assumption is made, in the definition above,
that the items in the sequence are themselves valid; only the
expanded names of the items in the sequence are relevant in
determining whether the sequence is accepted by a particle.
Their validity does affect whether their parent is (recursively)
valid as well as locally valid.

If a sequence S is a member of L(P),
then it is necessarily possible to trace a path through the
·basic particles·
within P, with each item within S corresponding to a matching particle
within P. The sequence of particles within P corresponding to S
is called the ·path· of S in P.

Note: This ·path· has nothing to do with
XPath expressions.
When there may otherwise be danger of confusion, the ·path·
described here may be referred to as the ·match path· of S
in P.

2.2.3.3 Attribute Use

An attribute use plays a role similar to that of a
particle, but for attribute declarations: an attribute
declaration within a complex type definition is embedded within
an attribute use, which specifies whether the declaration
requires or merely allows its attribute, and whether it has a
default or fixed value.

2.2.3.4 Wildcard

A wildcard is a special kind of particle which matches element
and attribute information items dependent on their namespace
names and optionally on their local names.
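
A brief sketch of element and attribute wildcards follows. The target namespace URI and the names are hypothetical; only namespace-based matching is shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/core">

  <xs:complexType name="extensibleType">
    <xs:sequence>
      <xs:element name="core" type="xs:string"/>
      <!-- Element wildcard: any number of elements from namespaces
           other than the target namespace; they are validated only
           if a declaration for them can be found. -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- Attribute wildcard: any attribute, not validated at all. -->
    <xs:anyAttribute namespace="##any" processContents="skip"/>
  </xs:complexType>

</xs:schema>
```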

2.2.4 Constraint Components

2.2.4.1 Identity-constraint Definition

An identity-constraint definition is an association between a name
and one of several varieties of identity-constraint related to
uniqueness and reference. All the varieties use [XPath 2.0] expressions to pick out sets of information
items relative to particular target element information items
which are unique, or a key, or a ·valid· reference, within a specified
scope. An element information item is only ·valid· with respect to an element
declaration with identity-constraint definitions if those definitions
are all satisfied for all the descendants of that element
information item which they pick out.
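
The sketch below, with hypothetical element and constraint names, shows a key and a reference to it. The selector picks out the target element information items relative to catalog, and the field picks out the value that must be unique or must match.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="part" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="uses" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="part" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Key: every part must have an id, unique within this catalog. -->
    <xs:key name="partKey">
      <xs:selector xpath="part"/>
      <xs:field xpath="@id"/>
    </xs:key>

    <!-- Reference: every uses/@part must match some part/@id. -->
    <xs:keyref name="usesPart" refer="partKey">
      <xs:selector xpath="uses"/>
      <xs:field xpath="@part"/>
    </xs:keyref>
  </xs:element>

</xs:schema>
```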

2.2.4.2 Type Alternative

A type-alternative component
(type alternative for short)
associates a type definition with a predicate.
Type alternatives are used in conditional
type assignment, in which the choice of ·governing type definition·
for elements governed by a particular element declaration
depends on properties of the document instance. An element
declaration may have a {type table} which contains a
sequence of type alternatives; the predicates on the alternatives
are tested, and when a predicate is satisfied, the type
definition paired with it is chosen as the element instance's
·governing type definition·.

Note: The provisions for conditional type assignment are inspired by,
but not identical to, those of [SchemaPath].
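
As an informal sketch of conditional type assignment, the declaration below selects a type for message according to its kind attribute. All names are hypothetical, and the syntax shown assumes the XML representation of type alternatives as an alternative element with test and type attributes; an XSD 1.1 processor is needed to process it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="messageType">
    <xs:sequence>
      <xs:element name="subject" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="kind" type="xs:token"/>
  </xs:complexType>

  <xs:complexType name="textMessageType">
    <xs:complexContent>
      <xs:extension base="messageType">
        <xs:sequence>
          <xs:element name="body" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="binaryMessageType">
    <xs:complexContent>
      <xs:extension base="messageType">
        <xs:sequence>
          <xs:element name="data" type="xs:base64Binary"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- The alternatives are tested in order; the first satisfied
       predicate determines the governing type definition. -->
  <xs:element name="message" type="messageType">
    <xs:alternative test="@kind = 'text'" type="textMessageType"/>
    <xs:alternative test="@kind = 'binary'" type="binaryMessageType"/>
    <!-- Any other case is assigned xs:error, which has no valid instances. -->
    <xs:alternative type="xs:error"/>
  </xs:element>

</xs:schema>
```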

2.2.4.3 Assertion

An assertion is a predicate associated with a type, which is
checked for each instance of the type. If an element or attribute information item
fails to satisfy an assertion associated with a given type,
then that information item is not locally ·valid·
with respect to that type.
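
A minimal sketch of an assertion on a complex type follows, using hypothetical names and assuming an XSD 1.1 processor. The explicit xs:integer casts are a cautious choice so that the comparison is numeric regardless of how the processor types the attribute values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="rangeType">
    <xs:attribute name="min" type="xs:integer" use="required"/>
    <xs:attribute name="max" type="xs:integer" use="required"/>
    <!-- Checked for every instance of rangeType. -->
    <xs:assert test="xs:integer(@min) le xs:integer(@max)"/>
  </xs:complexType>

  <xs:element name="range" type="rangeType"/>

</xs:schema>
```

Under this sketch, a range element with min="1" and max="10" satisfies the assertion, while one with min="10" and max="1" is not locally valid against rangeType.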

Assertions are currently only allowed to be specified in
complex types. It may be deemed useful also to include
assertions in named model group definitions and/or attribute
groups, or even simple types. The XML Schema Working Group solicits input from
implementors and users of this specification on this question.

2.2.5 Group Definition Components

There are two kinds of convenience definitions provided to
enable the re-use of pieces of complex type definitions: model
group definitions and attribute group definitions.

2.2.5.1 Model Group Definition

A model group definition is an association between a name and
a model group, enabling re-use of the same model group in
several complex type definitions.
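
For example, the following sketch (hypothetical names throughout) defines a named model group once and re-uses it in two complex type definitions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Model group definition: a name associated with a model group. -->
  <xs:group name="nameParts">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <!-- First re-use. -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:group ref="nameParts"/>
      <xs:element name="birthDate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Second re-use. -->
  <xs:complexType name="authorType">
    <xs:sequence>
      <xs:group ref="nameParts"/>
      <xs:element name="affiliation" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```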

2.3 Constraints and Validation Rules

The [XML 1.1] specification describes two kinds
of constraints on XML documents: well-formedness and
validity constraints. Informally, the
well-formedness constraints are those imposed by the definition
of XML itself (such as the rules for the use of the < and >
characters and the rules for proper nesting of elements), while
validity constraints are the further constraints on document
structure provided by a particular DTD.

The preceding section focused on ·validation·, that is, the constraints on
information items which schema components supply. In fact,
however, this specification provides four different kinds of
normative statements about schema components, their
representations in XML and their contribution to the ·validation· of information items:

Schema Component Constraints

Schema Representation Constraints

Validation Rules

Schema Information Set Contributions

The last of these, schema information set contributions, are
not as new as they might at first seem. XML validation augments the XML information set in similar
ways, for example by providing values for attributes not present
in instances, and by implicitly exploiting type information for
normalization or access. (As an example of the latter case,
consider the effect of NMTOKENS on attribute white
space, and the semantics of ID and
IDREF.) By including schema information set
contributions, this specification makes explicit some features
that XML leaves implicit.

Note: While conformance of schema documents is a precondition for
the mapping from schema documents to schema components described
in this specification, conformance of the schema documents does
not guarantee that the result of that mapping will be a schema
that conforms to this specification. Some constraints (e.g. the
rule that there must be at most one top-level element
declaration with a particular expanded name) can only be
checked in the context of the schema as a whole.
Because component correctness
depends in part upon the other components present, the
XML mapping rules defined in this specification do not always
map conforming schema documents into components that satisfy
all constraints. In some cases, the mapping will produce
components which violate constraints imposed at the component
level; in others, no component at all will be produced.

Note:
In this version of this specification, Schema Representation
Constraints concern only properties of the schema document which
can be checked in isolation. In version 1.0 of this
specification, some Schema Representation Constraints could not
be checked against the schema document in isolation, and so it
was not always possible to say, for a given schema document,
whether it satisfied the constraints or not.

This specification describes three levels of conformance for
schema aware processors. The first is required of all processors.
Support for the other two will depend on the application
environments for which the processor is intended.

Note: By separating the conformance requirements relating to the
concrete syntax of ·schema documents·, this specification
admits processors which use schemas stored in optimized binary
representations, dynamically created schemas represented as
programming language data structures, or implementations in
which particular schemas are compiled into executable code such
as C or Java. Such processors can be said to be ·minimally conforming·
but not necessarily ·schema-document aware·.

Note: In version 1.0 of this specification the class of ·schema-document aware· processors was termed "conformant
to the XML Representation of Schemas". Similarly, the
class of ·Web-aware· processors was
called "fully conforming".

Note: Although this specification provides just these three
standard levels of conformance, it is anticipated that other
conventions can be established in the future. For example, the
World Wide Web Consortium is considering conventions for
packaging on the Web a variety of resources relating to
individual documents and namespaces. Should such developments
lead to new conventions for representing schemas, or for
accessing them on the Web, new levels of conformance can be
established and named at that time. There is no need to modify
or republish this specification to define such additional levels
of conformance.

2.5 Names and Symbol Spaces

As discussed in XSD Abstract Data Model (§2.2), most
schema components (may) have ·names·. If all such names were
assigned from the same "pool", then it would be
impossible to have, for example, a simple type definition and an
element declaration both with the name "title" in a
given ·target namespace·.

Therefore [Definition:] this specification introduces the term symbol
space to denote a collection of names, each of which is
unique with respect to the others.
There is a single distinct symbol space within a given ·target namespace· for each kind of
definition and declaration component identified in XSD Abstract Data Model (§2.2), except that within a target namespace,
simple type definitions and complex type definitions share a
symbol space. Within a given symbol space, names
must be unique, but
the same name may appear in more than one symbol space without
conflict. For example, the same name can appear in both a type
definition and an element declaration, without conflict or
necessary relation between the two.

Locally scoped attribute and element declarations are special
with regard to symbol spaces. Every complex type definition
defines its own local attribute and element declaration symbol
spaces, where these symbol spaces are distinct from each other
and from any of the other symbol spaces. So, for example, two
complex type definitions having the same target namespace can
contain a local attribute declaration for the unqualified name
"priority", or contain a local element declaration
for the name "address", without conflict or
necessary relation between the two.
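
The sketch below illustrates both points with hypothetical names and a hypothetical target namespace: a type and an element share the name "title" without conflict, and two complex types each declare their own local "address" element and "priority" attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:b="http://example.com/books"
           targetNamespace="http://example.com/books">

  <!-- "title" in the type definition symbol space. -->
  <xs:simpleType name="title">
    <xs:restriction base="xs:string">
      <xs:maxLength value="80"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- "title" in the element declaration symbol space. -->
  <xs:element name="title" type="b:title"/>

  <!-- Each complex type has its own local symbol spaces, so the
       two "address" elements and the two "priority" attributes
       are unrelated and may have different types. -->
  <xs:complexType name="customer">
    <xs:sequence>
      <xs:element name="address" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="priority" type="xs:integer"/>
  </xs:complexType>

  <xs:complexType name="server">
    <xs:sequence>
      <xs:element name="address" type="xs:anyURI"/>
    </xs:sequence>
    <xs:attribute name="priority" type="xs:token"/>
  </xs:complexType>

</xs:schema>
```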

Note: As described above (Conventional Namespace Bindings (§1.3.3)), the
attributes described in this section are referred to in this
specification as "xsi:type",
"xsi:nil", etc. This is shorthand for
"an attribute information item whose [namespace
name] is
http://www.w3.org/2001/XMLSchema-instance and whose [local
name] is type" (or
nil, etc.).

2.6.2 xsi:nil

XML Schema Definition Language: Structures introduces a mechanism for signaling that an element
must be accepted as ·valid·
when it has no content despite a content type which does not
require or even necessarily allow empty content. An element
can be ·valid·
without content if it has the attribute xsi:nil
with the value true. An element so labeled must
be empty, but can carry attributes if permitted by the
corresponding complex type.
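
As a sketch with a hypothetical element name: the declaration below permits the mechanism by setting nillable to true, and the instance fragment after it is then acceptable even though the empty string is not a valid xs:date.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- nillable="true" allows instances to carry xsi:nil="true". -->
  <xs:element name="shipDate" type="xs:date" nillable="true"/>
</xs:schema>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Empty content, accepted because xsi:nil is true. -->
<shipDate xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:nil="true"/>
```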

3 Schema Component Details

3.1 Introduction

The following sections provide full details on the composition
of all schema components, together with their XML representations
and their contributions to ·assessment·. Each section is devoted to a
single component, with separate subsections for

The sub-sections immediately below introduce conventions
and terminology used throughout the component sections.

3.1.1 Components and Properties

Components are defined in terms of their properties, and each
property in turn is defined by giving its range, that is the
values it may have. This can be understood as defining a
schema as a labeled directed graph, where the root is a schema,
every other vertex is a schema component or a literal (string,
boolean, decimal) and every labeled edge is a property.
The graph is not acyclic: multiple copies of
components with the same name in the same ·symbol space· must not exist, so in some cases re-entrant
chains of properties will exist. Equality of components for
the purposes of this specification is always defined as equality
of names (including target namespaces) within symbol spaces.

Note: A schema and its components as defined in this chapter are an
idealization of the information a schema-aware processor
requires: implementations are not constrained in how they
provide it. In particular, no implications about literal
embedding versus indirection follow from the use below of
language such as "properties . . . having . . .
components as values".

Component properties are simply named
values. Most properties have either other components or
literals (that is, strings or booleans or enumerated keywords)
for values, but in a few cases, where more complex values are
involved, [Definition:] a property
value may itself be a collection of named values, which we call
a property record.

[Definition:] Throughout this
specification, the term absent is used as a
distinguished property value denoting absence. Again this should not be
interpreted as
constraining implementations, as for instance between using a
null value for such properties or not representing
them at all.
[Definition:]
A property value
which is not ·absent· is present.

Any property not defined as optional is always
present; optional properties which are not present are
taken to have ·absent· as their
value. Any property identified as having a set, subset or
list value might have an empty value unless this is explicitly
ruled out: this is not the same as ·absent·. Any property value identified
as a superset or subset of some set
might be equal to that set,
unless a proper superset or subset is explicitly called for. By
'string' in Part 1 of this specification is meant a sequence of
ISO 10646 characters identified as legal XML
characters in [XML 1.1].

3.1.2 XML Representations of Components

The principal purpose of XML Schema Definition Language: Structures is to define a set of schema
components that constrain the contents of instances and augment
the information sets thereof. Although no external
representation of schemas is required for this purpose, such
representations will obviously be widely used. To provide for
this in an appropriate and interoperable way, this specification
provides a normative XML representation for schemas which makes
provision for every kind of schema component. [Definition:] A document in this
form (i.e. a <schema> element information item)
is a schema document. For the schema
document as a whole, and its constituents, the sections below
define correspondences between element information items (with
declarations in
Schema for Schema Documents (Structures) (normative) (§A) and DTD for Schemas (non-normative) (§L)) and schema components. The key element information items in
the XML representation of a schema are in the XSD namespace, that
is their [namespace
name] is
http://www.w3.org/2001/XMLSchema. Although a common way of creating
the XML Infosets which are or contain ·schema documents· will be
using an XML parser, this is not required: any mechanism which
constructs conformant infosets as defined in [XML-Infoset] is a possible starting
point.

Two aspects of the XML representations of components presented
in the following sections are constant across them all:

All of them allow attributes qualified with namespace names
other than the XSD namespace itself: these appear as
annotations in the corresponding schema component;

All of them allow an <annotation> as their
first child, for human-readable documentation and/or
machine-targeted information.

A recurrent pattern in the XML
representation of schemas may also be mentioned here. In many
cases, the same element name (e.g. element or
attribute or attributeGroup), serves
both to define a particular schema component and to incorporate
it by reference. In the first case the name
attribute is required, in the second the ref
attribute is required. These
two usages are mutually exclusive, and sometimes also depend on
context.
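
The define-or-reference pattern can be sketched as follows, using hypothetical names. The element and attributeGroup elements appear once with a name attribute, to define a component, and once with a ref attribute, to incorporate it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Definitions: the name attribute is used. -->
  <xs:element name="note" type="xs:string"/>

  <xs:attributeGroup name="common">
    <xs:attribute name="id" type="xs:ID"/>
  </xs:attributeGroup>

  <xs:element name="memo">
    <xs:complexType>
      <xs:sequence>
        <!-- Incorporation by reference: the ref attribute is used. -->
        <xs:element ref="note" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attributeGroup ref="common"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```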

3.1.3 The Mapping between XML Representations and
Components

For each kind of schema component there is a corresponding
normative XML representation. The sections below describe the
correspondences between the properties of each kind of schema
component on the one hand and the properties of information
items in that XML representation on the other, together with
constraints on that representation above and beyond those
expressed in the
Schema for Schema Documents (Structures) (normative) (§A).

The language used is as if the correspondences were mappings
from XML representation to schema component, but the mapping in
the other direction, and therefore the correspondence in the
abstract, can always be constructed therefrom.

In discussing the mapping from XML representations to schema
components below, the value of a component property is often
determined by the value of an attribute information item, one of
the [attributes] of an element information item. Since schema
documents are constrained by the
Schema for Schema Documents (Structures) (normative) (§A), there is always a simple type
definition associated with any such attribute information item.
[Definition:] With reference to any
string, interpreted as denoting
an instance of a given datatype, the term
actual value denotes the value to which the
lexical mapping of that datatype maps the string.
In the case of attributes in
schema documents, the string used as the
lexical representation is normally the ·normalized value· of the
attribute. The associated datatype is, unless otherwise specified,
the one identified in the declaration of the attribute, in the
schema for schema documents; in some cases (e.g. the
enumeration
facet, or fixed and default values
for elements and attributes) the associated datatype will
be a more specific one,
as specified in the appropriate
XML mapping rules. The ·actual value·
will often be a string, but can also be an integer, a
boolean, a URI reference, etc. This term is also occasionally
used with respect to element or attribute information items in a
document being ·validated·.

Many properties are identified below as having other schema
components or sets of components as values. For the purposes of
exposition, the definitions in this section assume that (unless
the property is explicitly identified as optional) all such
values are in fact present. When schema components are
constructed from XML representations involving reference by name
to other components, this assumption will in some
cases be violated if one or more references cannot be
·resolved·. This specification addresses the matter of
missing components in a uniform manner, described in
Missing Sub-components (§5.3): no mention of handling missing
components will be found in the individual component
descriptions below.

Forward reference to named definitions and declarations
is allowed, both within and between
·schema documents·. By the time the component corresponding to
an XML representation which contains a forward reference is
actually needed for ·validation·,
it is possible that an appropriately-named component
will have become available to discharge the reference: see
Schemas and Namespaces: Access and Composition (§4) for details.

3.1.4 White Space Normalization during Validation

Throughout this specification, [Definition:] the
initial value of some
attribute information item is the value of the
[normalized
value] property of that item. Similarly, the initial value of an element information item is the string composed of, in order, the
[character code] of each character information item in the [children] of that
element information item.

The above definition means that comments and processing instructions,
even in the midst of text, are ignored for all ·validation· purposes.

Subsequent to the replacements specified above under
replace, contiguous sequences of
#x20s are collapsed to a single
#x20, and initial and/or final
#x20s are deleted.

Similarly, the
normalized value of any string with respect to a
given simple type definition is the string resulting from
normalization using the whiteSpace facet
and any other pre-lexical facets, associated with that simple type definition.
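
The effect of the whiteSpace facet can be sketched with two hypothetical simple types derived from xs:string, one using replace and one using collapse.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- replace: each tab, line feed and carriage return becomes a space. -->
  <xs:simpleType name="singleLine">
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="replace"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- collapse: as replace, then runs of spaces are reduced to one
       space and leading and trailing spaces are removed. -->
  <xs:simpleType name="partCode">
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="collapse"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

Under these assumed definitions, an initial value consisting of two spaces, "AB", three spaces, "12", a space and a line feed has the normalized value "AB 12" with respect to partCode. With respect to singleLine only the line feed changes, becoming a space.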

These three levels of normalization correspond to the processing mandated
in XML for element content, CDATA attribute
content and tokenized
attribute content, respectively. See
Attribute Value Normalization
in [XML 1.1] for the precedent for replace and
collapse for attributes. Extending this processing to element
content is necessary to ensure
consistent ·validation·
semantics for simple types, regardless of whether they are applied to attributes
or elements. Performing it twice in the case of attributes whose
[normalized
value] has already been subject to replacement or collapse on the basis of
information in a DTD is necessary to ensure consistent treatment of attributes
regardless of the extent to which DTD-based information has been made use of
during infoset construction.

Note: Even when DTD-based information has been appealed
to, and Attribute Value
Normalization has taken place, it
is possible that
further normalization will
take place, as for instance when character entity references
in attribute values result in white space characters other than spaces
in their ·initial value·s.

Note: The values replace and
collapse may appear to provide a
convenient way to "unwrap" text (i.e. undo the effects of
pretty-printing and word-wrapping). In some cases, especially
highly constrained data consisting of lists of artificial tokens
such as part numbers or other identifiers, this appearance is
correct. For natural-language data, however, the whitespace
processing prescribed for these values is not only unreliable but
will systematically remove the information needed to perform
unwrapping correctly. For Asian scripts, for example, a correct
unwrapping process will replace line boundaries not with blanks but
with zero-width separators or nothing. In consequence, it is
normally unwise to use these values for natural-language data, or
for any data other than lists of highly constrained tokens.

The
{value constraint} property reproduces the functions of
XML default and
#FIXED attribute values. A {variety} of
default specifies that the attribute is to
appear unconditionally in the ·post-schema-validation infoset·, with {value} and {lexical form} used whenever the attribute is not
actually present; fixed indicates that the attribute
value if present must be equal to {value}, and if absent receives {value} and {lexical form} as for default. Note that
it is values that are checked, not
strings,
and that the test is for equality, not identity.
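
The following non-normative sketch shows one way to read the {value constraint} behaviour just described. It assumes a decimal-typed attribute so that Python's Decimal can stand in for the value space; the function effective_attribute and its parameters are hypothetical names chosen for illustration.

```python
from decimal import Decimal


def effective_attribute(lexical, variety, constraint_lexical, parse=Decimal):
    """Return (value, lexical form, satisfies_constraint) for one attribute.

    `lexical` is None when the attribute is absent from the instance.
    `parse` maps a lexical form to a value (an assumption: Decimal here).
    """
    constraint_value = parse(constraint_lexical)
    if lexical is None:
        # Both default and fixed supply {value} and {lexical form}
        # when the attribute is not actually present.
        return constraint_value, constraint_lexical, True
    value = parse(lexical)
    if variety == "fixed":
        # The comparison is between values, not strings.
        return value, lexical, value == constraint_value
    # A default places no restriction on a value that is present.
    return value, lexical, True
```

Under these assumptions an instance attribute written as "1.00" satisfies a fixed constraint of "1.0", because the two strings denote equal values even though they differ as strings.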

[XML-Infoset] distinguishes attributes with names such as xmlns or xmlns:xsl from
ordinary attributes, identifying them as [namespace attributes]. Accordingly, it is unnecessary and in fact not possible for
schemas to contain attribute declarations corresponding to such
namespace declarations; see xmlns Not Allowed (§3.2.6.3). No means is provided in
this specification to supply a
default value for a namespace declaration.

3.2.2 XML Representation of Attribute Declaration Schema Components

The XML representation for an attribute declaration schema
component is an
<attribute> element information item. It specifies a
simple type definition for an attribute either by reference or
explicitly, and may provide default information. The
correspondences between the properties of the information item and
properties of the component are given in this section.

Attribute declarations can appear at the top level of a schema
document, or within complex type definitions, either as complete
(local) declarations, or by reference to top-level declarations,
or within attribute group definitions. For complete
declarations, top-level or local, the type
attribute is used when the declaration can use a built-in or
pre-declared simple type definition. Otherwise an anonymous
<simpleType> is provided inline. When no simple type definition is
referenced or provided, the default is ·xs:anySimpleType·, which
imposes no constraints at all.

Earlier versions of this specification did not
allow a targetNamespace attribute on attribute
declarations; it has been added in this version to make
restriction of complex types easier. The XML Schema Working Group
has designated the targetNamespace attribute
a ‘feature at risk’: it may be dropped from future
drafts of this specification if implementation or usage experience
shows that its costs outweigh its benefits.
The XML Schema Working Group solicits input from implementors and
users of this specification as to whether the addition of this
attribute is desirable and acceptable.

The names for top-level attribute declarations are in their
own ·symbol space·. The
names of locally-scoped attribute declarations reside in symbol
spaces local to the type definition which contains them.

The following sections specify several
sets of XML mapping rules which apply in different
circumstances.

3.2.2.2 Mapping Rules for Local Attribute Declarations

If
the <attribute> element information item has
<complexType> or <attributeGroup> as
an ancestor and the ref [attribute] is absent,
it maps both to an attribute
declaration (see below) and
to an attribute use with properties as follows
(unless use='prohibited', in which case the item
corresponds to nothing at all):

If
the
<attribute> element information item has
<complexType> or <attributeGroup> as an
ancestor and the ref [attribute] is
present, it
maps to an attribute use with properties as follows
(unless use='prohibited', in which case the item
corresponds to nothing at all):
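
As a non-normative illustration of the two cases above, the sketch below walks a schema document and reports which kinds of component each nested <attribute> element maps to. It covers only the three-way distinction stated in the text (declaration plus use, use only, or nothing); the property values themselves are not computed. The function name classify_local_attributes and the sample schema are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
CONTAINERS = {f"{{{XS}}}complexType", f"{{{XS}}}attributeGroup"}

SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:attribute name="top" type="xs:string"/>
  <xs:complexType name="t">
    <xs:attribute name="a" type="xs:string"/>
    <xs:attribute ref="top"/>
    <xs:attribute name="gone" use="prohibited"/>
  </xs:complexType>
</xs:schema>"""


def classify_local_attributes(schema_text):
    """List (name or ref, component kinds) for each nested <attribute>."""
    root = ET.fromstring(schema_text)
    parent = {child: p for p in root.iter() for child in p}
    out = []
    for attr in root.iter(f"{{{XS}}}attribute"):
        # Look for a <complexType> or <attributeGroup> ancestor.
        node = parent.get(attr)
        while node is not None and node.tag not in CONTAINERS:
            node = parent.get(node)
        if node is None:
            continue  # top-level declaration: other mapping rules apply
        label = attr.get("ref") or attr.get("name")
        if attr.get("use") == "prohibited":
            out.append((label, []))  # corresponds to nothing at all
        elif attr.get("ref") is None:
            out.append((label, ["attribute declaration", "attribute use"]))
        else:
            out.append((label, ["attribute use"]))
    return out
```

Applied to SAMPLE, the sketch skips the top-level declaration of top, reports a declaration and a use for a, a use only for the reference to top, and no component for the prohibited attribute.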

3.2.4 Attribute Declaration Validation Rules

3.2.4.1 Attribute Locally Valid

Informally, an attribute in an XML
instance is locally ·valid·
against an attribute declaration if and only if (a)
the name of the attribute matches
the name of the declaration, (b) after
whitespace normalization its ·normalized value· is locally valid
against the type declared for the attribute, and
(c) the
attribute obeys any relevant value constraint. Additionally,
for xsi:type, it is required that the type named
by the attribute be present in the schema.
A logical prerequisite for checking the local validity of an
attribute against an attribute declaration is that the attribute
declaration itself and the type definition it identifies
both be present in the schema.

[Definition:] For
attribute information items,
there is no difference between assessment and strict
assessment, so
the attribute information item has
been strictly assessed
if and only if its schema-validity has been assessed.

Note: The
[type definition type],
[type definition namespace],
[type definition name], and
[type definition anonymous] properties
are redundant with the
[type definition] property;
they are defined for the convenience of implementations
which wish to expose those specific properties
but not the entire type definition.

The first (·item isomorphic·)
alternative
above is provided for applications such as query
processors which need access to the full range of details about an
item's ·assessment·, for example the
type hierarchy; the second, for lighter-weight processors for whom
representing the significant parts of the type hierarchy as
information items might be a significant burden.

3.2.6.3 xmlns Not Allowed

Note: The {name} of an attribute is an ·NCName·, which implicitly
prohibits attribute declarations of the form xmlns:*.

3.2.6.4 xsi: Not Allowed

Schema Component Constraint: xsi: Not Allowed

The {target namespace} of an attribute declaration,
whether local or top-level, must not match http://www.w3.org/2001/XMLSchema-instance
(unless it is one of the four built-in declarations given in the next section).

Note: This reinforces the special status of these attributes, so that they not
only need not be declared to be allowed in instances, but
in consequence of the rule just given
must not be declared.

Note: It is legal for Attribute Uses that
refer to xsi: attributes to specify default or fixed value
constraints (e.g. in a component corresponding to a schema document construct
of the form <xs:attribute ref="xsi:type" default="xs:integer"/>),
but the practice is not recommended; including such attribute uses will tend
to mislead readers of the schema document, because the attribute uses would
have no effect; see Element Locally Valid (Complex Type) (§3.4.4.2) and
Attribute Default Value (§3.4.5.1) for details.

3.2.7 Built-in Attribute Declarations

There are four attribute declarations present in every
schema by definition:

3.2.7.1 xsi:type

The xsi:type attribute
is used to signal use of a type other than the declared type of
an element. See xsi:type (§2.6.1).

Note: The provision of defaults for elements goes beyond what is
possible in XML DTDs,
and does not exactly correspond to defaults for attributes. In
particular, an element with a non-empty {value constraint} whose simple type definition includes the empty
string in its lexical space will nonetheless never receive that
value, because the {value constraint} will override it.

3.3.2 XML Representation of Element Declaration Schema Components

The XML representation for an element declaration schema
component is an <element> element information
item. It specifies a type definition for an element either by
reference or explicitly, and may provide occurrence and
default information. The correspondences between the properties
of the information item and properties of the component(s) it
corresponds to are given in this section.

Earlier versions of this specification did not
allow a targetNamespace attribute on element
declarations; it has been added in this version to make
restriction of complex types easier. The XML Schema Working Group
has designated the targetNamespace attribute
a ‘feature at risk’: it may be dropped from future
drafts of this specification if implementation or usage experience
shows that its costs outweigh its benefits.
The XML Schema Working Group solicits input from implementors and
users of this specification as to whether the addition of this
attribute is desirable and acceptable.

<element> corresponds to an element declaration, and allows
the type definition of that declaration to be specified either by reference or
by explicit inclusion.

<element>s within <schema> produce
global element declarations; <element>s within <group> or <complexType> produce either particles which contain global element declarations (if there's a ref attribute) or local declarations (otherwise). For complete declarations, top-level or local, the type attribute is used when the declaration can use a
built-in or pre-declared type definition. Otherwise an
anonymous <simpleType> or <complexType> is provided inline.

As noted above the names for top-level element declarations are in a separate
·symbol space· from the symbol spaces for
the names of type definitions, so there can (but need
not) be a simple or complex type definition with the same name as a
top-level element. As with attribute names, the names of locally-scoped
element declarations with no {target namespace} reside in symbol spaces local to the type definition which contains
them.

Note that the above allows for two levels of defaulting for unspecified
type definitions. An <element> with no referenced or included type definition will
correspond to an element declaration which has
the
same type definition as the first
substitution-group head named in the
substitutionGroup [attribute], if present,
otherwise ·xs:anyType·.
This has the important consequence that the minimum valid element declaration,
that is, one with only a name attribute and no contents,
is also (nearly) the most general, validating any combination of text and
element content and allowing any attributes, and providing for recursive
validation where possible.
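
The two levels of defaulting can be pictured with the following non-normative sketch. Type definitions are represented simply by their names, and the mapping head_types from substitution-group head names to their type names is a hypothetical stand-in for a lookup in the schema.

```python
def declared_type(type_attr, inline_type, substitution_heads, head_types):
    """Pick the type definition for an <element>, following the text above.

    type_attr          -- value of the type attribute, or None
    inline_type        -- name given here to an inline anonymous type, or None
    substitution_heads -- names listed in substitutionGroup, in order
    head_types         -- assumed lookup: head name -> its type definition
    """
    if type_attr is not None:
        return type_attr          # referenced type definition
    if inline_type is not None:
        return inline_type        # included (anonymous) type definition
    if substitution_heads:
        # First level of defaulting: the first head's type definition.
        return head_types[substitution_heads[0]]
    # Second level of defaulting.
    return "xs:anyType"
```

A declaration with only a name attribute therefore falls through to the last line, which is why the minimum valid element declaration is also nearly the most general one.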

If the <element> element information item has
minOccurs=maxOccurs=0,
then it maps to no component at all.

Note: The minOccurs and maxOccurs
attributes are not allowed on top-level
<element> elements, so in valid schema
documents this will happen only when the <element> element information item has
<complexType> or <group> as an
ancestor.

A set
depending on the ·actual value· of the block [attribute], if present, otherwise on the ·actual value· of the
blockDefault [attribute] of the ancestor
<schema> element information item, if present,
otherwise on the empty string. Call this the
EBV (for effective block value). Then the
value of this property is
the appropriate case among the following:

1 If the EBV is the empty string, then the empty set;

2 If the EBV is #all, then {extension,
restriction,
substitution};

3 otherwise a set with members drawn from the set
above, each being present or absent depending on whether
the ·actual value· (which is a list) contains an equivalently
named item.
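
The three cases above amount to a small computation, sketched here non-normatively. The function name disallowed_substitutions and its keyword parameters are hypothetical; None stands for an attribute that is not present.

```python
ELEMENT_BLOCK_KEYWORDS = frozenset({"extension", "restriction", "substitution"})


def disallowed_substitutions(block=None, block_default=None,
                             allowed=ELEMENT_BLOCK_KEYWORDS):
    """Compute a blocking set from the effective block value (EBV)."""
    # The EBV is the block attribute if present, otherwise blockDefault on
    # the ancestor <schema> if present, otherwise the empty string.
    if block is not None:
        ebv = block
    elif block_default is not None:
        ebv = block_default
    else:
        ebv = ""
    if ebv == "":
        return set()                      # case 1
    if ebv.strip() == "#all":
        return set(allowed)               # case 2
    # Case 3: keep only the listed items that name a member of the set.
    return {item for item in ebv.split() if item in allowed}
```

Note that a block attribute which is present but empty takes precedence over blockDefault and yields the empty set. The same shape of computation applies to other blocking sets if a different value is passed for the allowed parameter.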

Note: Although the blockDefault [attribute] of
<schema> may include values other than
extension, restriction or
substitution, those values are ignored in the
determination of {disallowed substitutions} for element
declarations (they are used elsewhere).

The first example above declares an element whose type, by default, is
·xs:anyType·.
The second uses an embedded anonymous complex
type definition.

The last two examples illustrate the use of local element declarations. Instances of myLocalElement within
contextOne will be constrained by myFirstType,
while those within contextTwo will be constrained by
mySecondType.

Note: The possibility that differing attribute declarations and/or content models
would apply to elements with the same name in different contexts is an
extension beyond the expressive power of a DTD in XML.

An example from a previous version of the schema for datatypes. The
facet type is defined
and the facet element is declared to use it. The facet element is abstract -- it's
only defined to stand as the head for a ·substitution group·. Two further
elements are declared, each a member of the facet ·substitution group·. Finally a type is defined which refers to facet, thereby
allowing either period or encoding (or
any other member of the group).

Example

The following example illustrates conditional type assignment
to an element, based on the value of one of the element's attributes.
Each instance of the message element will be
assigned either to type messageType or to a more
specific type derived from it.

The type messageType accepts any well-formed XML
or character sequence as content, and carries a kind
attribute which can be used to describe the kind or format of
the message. The value of kind is either one of a
few well known keywords or, failing that, any string.

Three restrictions of messageType are defined, each
corresponding to one of the three well-known formats:
messageTypeString for kind="string",
messageTypeBase64 for kind="base64"
and kind="binary", and
messageTypeXML for kind="xml" or
kind="XML".

The message element itself uses
messageType both as its declared type and
as its default type, and uses test attributes on its
<alternative>[children] to assign the appropriate
specialized message type to messages with the well known
values for the kind attribute:

[Definition:] A type definition S is
validly substitutable for another type T,
subject to a
set of blocking keywords K (typically drawn from the set
{substitution, extension,
restriction, list, union} used in
the {disallowed substitutions} and
{prohibited substitutions} of
element declarations and type definitions), if and
only if either

[Definition:] If the set of keywords controlling whether
a type S is ·validly substitutable· for another type T is the
empty set, then S is said to be validly
substitutable for T without limitation
or absolutely. The phrase validly
substitutable, without mention of any set of blocking
keywords, means "validly substitutable without
limitation".

Sometimes one type S is
·validly substitutable· for another type T only if S is derived
from T by a chain of restrictions, or if T is a union type
and S a member type of the union. The concept of ·valid substitutability· is
appealed to often enough in such contexts that it is convenient
to define a term to cover this specific case. [Definition:] A type definition S is validly
substitutable as a restriction for another type T if
and only if S is ·validly substitutable· for T, subject to the
blocking keywords {extension, list,
union}.

Informally, an element is locally valid
against an element declaration when:

The declaration is present in the schema
and the name of the element matches the name of the declaration.

The element is declared concrete (i.e. not abstract).

Any xsi:nil attribute on the element obeys the
rules. The element is allowed to have an xsi:nil
attribute only if the element is declared nillable, and
xsi:nil = 'true' is allowed only if the element
itself is empty. If the element declaration specifies a
fixed value for the element, xsi:nil='true'
will make the element invalid.

The element's content satisfies the appropriate constraints:
If the element is empty and the declaration specifies a
default value, the default is checked against the
appropriate type definitions.
Otherwise, the content of the element is checked against
the ·governing type definition·; additionally, if the element
declaration specifies a fixed value, the content is
checked against that value.

The element satisfies all the identity constraints specified
on the element declaration.

Additionally, on the ·validation root·, document-level
ID and IDREF constraints are checked.
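
The informal conditions on xsi:nil given above can be summarized in a small non-normative sketch. It covers only the xsi:nil conditions, not the other requirements in the list, and the function name nil_is_acceptable and its boolean parameters are assumptions made for illustration.

```python
def nil_is_acceptable(nillable, xsi_nil, is_empty, has_fixed_value):
    """Check the informal xsi:nil conditions for one element.

    xsi_nil is None when the attribute is absent, otherwise True or False.
    """
    if xsi_nil is None:
        return True            # nothing to check
    if not nillable:
        return False           # xsi:nil allowed only on nillable elements
    if xsi_nil:
        if not is_empty:
            return False       # xsi:nil='true' requires an empty element
        if has_fixed_value:
            return False       # a fixed value makes xsi:nil='true' invalid
    return True
```

The normative statement of these conditions is the validation rule Element Locally Valid (Element); the sketch only mirrors the informal summary.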

The following validation rule gives
the normative formal definition of local validity of an element
against an element declaration.

Validation Rule: Element Locally Valid (Element)

For an element information item E to be locally ·valid· with respect to an element
declaration D all of the following must be true:

Informally, local validity against a type requires first
that the type definition be present in the schema and not declared abstract.
For a simple type definition, the element must lack attributes
(except for namespace declarations and the special attributes
in the xsi namespace) and child elements, and must
be type-valid against that simple type definition.
For a complex type definition, the element must
be locally valid against that complex type definition.
Also, if the element has an xsi:type attribute,
then it is not locally valid against any type other than the
one named by that attribute.

Validation Rule: Element Locally Valid (Type)

For an element information item E
to be locally ·valid· with respect to
a type definition T all of the following must be true:

3.3.4.5 Validation Root Valid (ID/IDREF)

The following validation rule
specifies document-level ID/IDREF constraints checked on the
·validation root· if it is an element; this rule is not checked on other
elements. Informally, the requirement is that each ID
identifies a single element within the ·validation root·,
and that each IDREF value matches one ID.

Note: The first clause above applies when there is a reference to an undefined
ID. The second applies when there is a multiply-defined ID. They
are separated out to ensure that distinct error codes (see
Outcome Tabulations (normative) (§C)) are associated with these two
cases.
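
The informal requirement can be sketched, non-normatively, as a check over the ID and IDREF values collected beneath the ·validation root·. How those values are collected is not shown; the function name check_id_idref and its list parameters are hypothetical.

```python
from collections import Counter


def check_id_idref(ids, idrefs):
    """Return (undefined references, multiply-defined IDs).

    ids    -- every ID value found within the validation root
    idrefs -- every IDREF value found within the validation root
    """
    counts = Counter(ids)
    # References that match no ID at all.
    undefined = sorted({ref for ref in idrefs if ref not in counts})
    # IDs that identify more than one element.
    duplicated = sorted(value for value, n in counts.items() if n > 1)
    return undefined, duplicated
```

The two results are kept apart so that a caller can report the two kinds of failure separately; both lists are empty when the constraints hold.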

Note: Although this rule applies at the ·validation root·, in
practice processors, particularly streaming processors,
will perhaps wish to detect and signal the
clause 2 case as it arises.

Note: This reconstruction of [XML 1.1]'s
ID/IDREF functionality is imperfect in that if
the ·validation root· is not the document element of an XML
document, the results will not necessarily be the same as
those a validating parser would give were the document to have
a DTD with equivalent declarations.

3.3.4.6 Schema-Validity Assessment (Element)

This section gives the top-level rule
for ·assessment· of an element information item. Informally:

The schema-validity assessment of an element information item
depends on its ·validation· and
the ·assessment· of its element
information item children and associated attribute information
items, if any.

[Definition:] For the
schema-validity of an
element information item E
to be strictly assessed all of the following must be true:

In version 1.0 of this specification, the fallback to lax
validation described in the preceding paragraph was optional,
not required. The XML Schema Working Group solicits input from implementors and
users of this specification as to whether this change is
desirable and acceptable.

Note:
If more than one identity
constraint fails to be satisfied, it is ·implementation-dependent·
which of the failed identity constraints are included
here. As a result, processors may produce different values for
this property.

Note:
If more than one assertion fails
to be satisfied, it is ·implementation-dependent· which of the failed
assertions are included here. As a result,
processors
may produce
different values for this property.

Note: The [type definition
type], [type definition
namespace], [type
definition name], and [type definition anonymous]
properties are redundant with the [type definition] property; they are
defined for the convenience of implementations which wish to
expose those specific properties but not the entire type
definition.

The first (·item isomorphic·)
alternative above is provided for applications such as query
processors which need access to the full range of details about
an item's ·assessment·, for
example the type hierarchy; the second, for lighter-weight
processors for whom representing the significant parts of the
type hierarchy as information items might be a significant
burden.

Constraining element information item [children] to be empty,
or to conform to a specified element-only or mixed content model, or else
constraining the character information item [children] to conform to a
specified simple type definition.

Constraining
elements and attributes to exist,
not to exist, or to have specified values, with Assertions (§2.2.4.3).

A complex type with an empty specification for {final} can be used as a
{base type definition} for other types derived by either of
extension or restriction; the explicit values extension and restriction prevent further
derivations by extension and restriction respectively. If all values are specified, then [Definition:] the complex type is said to be
final, because no
further derivations are possible. Finality is not
inherited, that is, a type definition derived by restriction from a type
definition which is final for extension is not itself, in the absence of any
explicit final attribute of its own, final for anything.
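
A non-normative sketch of how {final} might be consulted follows. The {final} property is modelled as a set of keywords, and the helper names may_derive and is_final are hypothetical.

```python
def may_derive(base_final, method):
    """True if a type with this {final} set can be a base for `method`."""
    return method not in base_final


def is_final(final):
    """True if no further derivations are possible at all."""
    return {"extension", "restriction"} <= set(final)


# Finality is not inherited: a type derived by restriction from a base
# that is final for extension starts with its own (here empty) {final}.
base_final = {"extension"}
derived_final = set()  # no explicit final attribute on the derived type
```

With these values the base rejects derivation by extension but permits restriction, while the derived type, having an empty {final} of its own, permits both.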

The {context} property is only relevant for anonymous type
definitions, for which its value is the component in which this type
definition appears as the value of a property, e.g.
{type definition}.

The {prohibited substitutions} property of a complex type definition T determines
whether type definitions derived from T are or are not
·validly substitutable· for T. Examples include (but are not limited
to) the substitution of another type definition:

{assertions} constrain
elements and attributes
to exist, not to exist, or to
have specified values.
Though specified as a sequence, the order
among the assertions is not significant during assessment.
See Assertions (§3.13).

3.4.2 XML Representation of Complex Type Definition Schema Components

The XML representation for a complex type definition schema component is a
<complexType> element information item.

The XML representation for complex type definitions with a
{content type} with {variety} simple is significantly different from that
of those with other {content type}s, and this is reflected in the presentation below,
which describes
the mappings for the two cases in separate subsections.
Common mapping rules are factored out and given in
separate sections.

Note:
It is
a consequence of the concrete syntax given above that
a top-level
type definition need consist of no more than a name, i.e. that
<complexType name="anyThing"/> is allowed.

Note:
Aside from the simple coherence requirements outlined below, the requirement that type
definitions identified as restrictions actually be
restrictions — that is, the requirement that they accept
as valid only a subset of the items which are accepted as valid
by their base type definition — is enforced in Constraints on Complex Type Definition Schema Components (§3.4.6).

The following sections describe
different sets of mapping rules for complex types; some
are common to all or many source declarations, others
apply only in specific circumstances.

Where convenient, the mapping rules are
described exclusively in terms of the schema document's
information set. The mappings, however, depend not only upon
the source declaration but also upon the schema context. Some
mappings, that is, depend on the properties of other components
in the schema. In particular, several of the mapping rules
given in the following sections depend upon the {base type definition} having
been identified before they apply.

3.4.2.1 Common Mapping Rules for Complex Type Definitions

Whichever
alternative for the content of <complexType> is
chosen, the following property mappings
apply.
Except where otherwise specified, attributes and child
elements are to be sought among the [attributes] and
[children] of the <complexType> element.

A set
corresponding to the ·actual value· of the block [attribute], if present, otherwise to the ·actual value· of the
blockDefault [attribute] of the ancestor
<schema> element information item, if present,
otherwise to the empty string. Call this the
EBV (for effective block value). Then the
value of this property is
the appropriate case among the following:

1 If the EBV is the empty string, then the empty set;

2 If the EBV is #all, then {extension,
restriction};

3 otherwise a set with members drawn from the set
above, each being present or absent depending on whether
the ·actual value· (which is a list) contains an equivalently
named item.

Note: Although the blockDefault [attribute] of
<schema> may include values other than
restriction or extension, those values
are ignored in the determination of {prohibited substitutions} for complex type
definitions (they are used elsewhere).

Note: The mapping rule below refers here and there to elements
not necessarily present within a <complexType>
source declaration. For purposes of evaluating tests like
"If the abc attribute is present
on the xyz element", if no xyz
element information item is present, then no
abc attribute is present on the
(non-existent) xyz element.

Note:
It is a consequence of clause 4.2 above that
when a type definition is extended, the same particles appear
in both the base type definition and the extension;
the particles are reused without being copied.

Note:
The only substantive function of the value
prohibited for the use attribute of an
<attribute> is in
establishing the correspondence between a complex type defined
by restriction and its XML representation. It serves to
prevent inheritance of an identically named attribute use from
the {base type definition}. Such an <attribute> does not correspond to any component, and
hence there is no interaction with either explicit or
inherited wildcards in the operation of Complex Type Definition Validation Rules (§3.4.4) or Constraints on Complex Type Definition Schema Components (§3.4.6).
It is pointless, though not an
error, for the use attribute to have the value
prohibited in other contexts (e.g. in complex type
extensions or named model group definitions), in which cases
the <attribute> element is simply ignored, provided that
it does not violate other constraints in this
specification.

3.4.2.6 Examples of Complex Type Definitions

Example: Three ways to define a type for length

The following declaration defines a type for specifications of length
by creating a complex type with simple content, with
xs:nonNegativeInteger as the type of the
content, and a unit attribute to give the
unit of measurement.

A simplified type definition
derived from the base type from the previous example by restriction, eliminating
one optional child and
fixing another to occur exactly once; an element declared by reference to it,
and a ·valid· instance thereof.

A complex type definition that
allows three explicitly declared child
elements, in the specified order (but not necessarily adjacent), and
furthermore allows additional elements of any name from any namespace other
than the target namespace to appear anywhere in the children.

Example

To restrict away a local element declaration that ·competes· with
a wildcard, use a wildcard in the derived type that explicitly
disallows the element's expanded name. For example:

The restriction type quietComputer has
a lax wildcard, which ·matches· any element but one with the name
speaker.

Without the specification of the notQName attribute,
the wildcard would ·match· elements named
speaker, as well. In that case, the restriction
would be valid only if there is a
top-level declaration for speaker that also has type
speakerType or a type derived from it.
Otherwise, there would be instances locally valid against the restriction
quietComputer that are not locally valid against the base type
computer.

For example, if there is no notQName attribute on the wildcard and
no top-level declaration for speaker, then the following is allowed
by quietComputer, but not by computer:

Note: When an {attribute wildcard} is
present, this does not introduce any ambiguity with
respect to how attribute information items for which an attribute use
is present amongst the {attribute uses} whose name and target namespace match are
·assessed·. In such cases the attribute
use always takes precedence, and the ·assessment· of such items stands or falls
entirely on the basis of the attribute use and its {attribute declaration}. This follows from the details of
clause 2.

3.4.4.4 Attribution of Elements to Particles

[Definition:]
During ·validation· of an element
information item against its (complex) ·governing type definition·,
associations between element and attribute information items among the
[children] and [attributes] on the one hand, and attribute uses,
attribute wildcards, particles and open contents on the other, are
established. The element or attribute information item is
attributed to the corresponding component.

Note:
The above definition
makes sure that
·attribution· happens even when the
sequence of element information items is not
·locally valid· with respect to a
Content Type. For example, if a complex type definition has the
following content model:

2.3 T has {derivation method} extension,
and ST is identical to SB,
and E and B together satisfy this constraint.

Note: This constraint has (by clause 2.2) the effect of ensuring
that if T is a restriction of B, then any type
conditionally assigned to E in the context of T is a
restriction of the type which would be assigned to E in the
context of B.

The constraint Conditional Type Substitutable in Restriction (§3.4.4.5) above is
intended to ensure that the use of Type Tables for
conditional type assignment does not violate the usual principles of
complex type restriction.
More specifically, if T is a complex type definition derived from
its base type B by restriction, then the rule seeks to ensure that
a type definition conditionally assigned by T to some child element
is always derived by restriction from that assigned by B to the same child.
The current design enforces this using a "run-time" rule: instead of
marking T as invalid if it could possibly assign types incompatible
with those assigned by B, the run-time rule accepts the schema as valid
if the usual constraints on the declared {type definition}s are satisfied,
without checking the details of the {type table}s. Element instances are
then checked as part of validation, and any instances that would cause
T (or any type in T's {base type definition} chain) to assign the incompatible
types are made invalid with respect to T.
This rule may prove hard to understand or implement. The Working Group is
uncertain whether the current design has made the right trade-off and
whether we should use a simpler but more restrictive rule. We solicit
input from implementors and users of this specification as to whether
the current run-time rule should be retained.

[Definition:] When
default values are supplied for attributes, namespace fixup
may be required, to ensure that the ·post-schema-validation infoset· includes
the namespace bindings needed and maintains the consistency
of the namespace information in the infoset. To perform
namespace fixup on an element information item E for
a namespace N:

1 If the [in-scope namespaces] of E contains a binding for N, no
namespace fixup is needed; the properties of E
are not changed.

1.5 It is in principle
possible to derive T in two steps, the first
an extension and the second a restriction (possibly
vacuous), from that type definition among its ancestors
whose {base type definition}
is ·xs:anyType·.

Note: This requirement ensures that
nothing removed by a restriction is subsequently added
back by an extension in an incompatible way (for example,
with a conflicting type assignment or value
constraint).

Constructing the intermediate type definition to
check this constraint is straightforward: simply
re-order the derivation to put all the extension
steps first, then collapse them into a single
extension. If the resulting definition can be the
basis for a valid restriction to the desired
definition, the constraint is satisfied.

Note: Valid
restriction involves both a subset relation on the set of
elements valid against T and those valid against B, and a derivation relation, explicit in the
type hierarchy, between the types assigned to attributes and
child elements by T and those assigned to the same
attributes and children by B.

The constraint just given,
like other constraints on schemas,
must be satisfied by every complex type T to which it
applies.

However, under certain conditions conforming processors
need not (although they may) detect some violations of this constraint.
If (1) the type definition being checked
has T . {content type} . {particle} . {term} . {compositor}
= all
and (2) an implementation is unable to determine
by examination of the schema in isolation
whether or not clause 2.4.2
is satisfied, then the implementation may
provisionally accept the derivation.
If any instance encountered in the ·assessment· episode
is valid against T but not against T.{base type definition},
then the derivation of T does not satisfy this
constraint, the schema does not conform to this
specification, and no ·assessment· can be performed
using that schema.

It is ·implementation-defined· whether a processor (a) always
detects violations of clause 2.4.2
by examination of the schema in isolation, (b)
detects them only when some element information item
in the input document is valid against T but not
against T.{base type definition}, or (c) sometimes detects
such violations by examination of the schema in isolation
and sometimes not. In the last case, the circumstances
in which the processor does one or the other are
·implementation-dependent·.

3.4.6.5 Type Derivation OK (Complex)

The following constraint defines a relation appealed to elsewhere
in this specification.

Schema Component Constraint: Type Derivation OK (Complex)

For a complex type definition (call it D, for
derived) to be validly derived from a type definition (call this
B, for base) subject to
the blocking keywords in
a subset of {extension,
restriction}
all of the following must be true:

1 If B and D are not the same type
definition, then the {derivation method} of
D is not
in the subset.

Note: This constraint is used to check that when someone uses a type in a
context where another type was expected (either via xsi:type or
·substitution groups·), the type used is actually derived from the expected
type, and that the derivation does not involve a form of derivation which was
ruled out by the expected type.

Note: The wording of clause 2.1 above appeals to a notion of component identity which
is only incompletely defined by this version of this specification.
In some cases, the wording of this specification does make clear the
rules for component identity. These cases include:

When they are both top-level components with the same component type,
namespace name, and local name;

When they are necessarily the same type definition (for example, when
the two type
definitions in question are the type definitions associated with
two attribute or element declarations, which are discovered to be the same
declaration);

When they are the same by construction (for example, when an element's
type definition defaults to being the same type definition as that of its
substitution-group head or when a complex type definition inherits an attribute
declaration from its base type definition).

In other cases
it is possible
that conforming implementations will
disagree as to whether components are identical.

Note: When a complex type definition S is said to be
"validly derived" from a type definition T,
without mention of any specific set of blocking keywords,
or with the explicit phrase "without limitation",
then what is meant is that S is validly derived from
T, subject to the empty set of blocking keywords,
i.e. without any particular limitations.

3.4.7 Built-in Complex Type Definition

There is a complex
type definition for ·xs:anyType· present in every schema
by definition. It has the following properties:

Note: This specification does not provide an inventory of built-in complex
type definitions for use in user schemas. A preliminary library of complex type
definitions is available which includes both mathematical (e.g.
rational) and utility (e.g. array) type definitions.
In particular, there is a text type definition which is recommended for use
as the type definition in element declarations intended for general text
content, as it makes sensible provision for various aspects of
internationalization. For more details, see the schema document for the type
library at its namespace name: http://www.w3.org/2001/03/XMLSchema/TypeLibrary.xsd.

An attribute use is a utility component which controls the occurrence and
defaulting behavior of attribute declarations. It plays the same role for
attribute declarations in complex types that particles play for element declarations.

A schema can name a group of attribute declarations so that they can be incorporated as a
group into complex type definitions.

Attribute group definitions do not participate in ·validation· as such, but the
{attribute uses} and {attribute wildcard}
of one or
more complex type definitions may be constructed in whole or part by reference
to an attribute group. Thus, attribute group definitions provide a
replacement for some uses of XML's
parameter entity facility.
Attribute group definitions are provided primarily for reference from the XML
representation of schema components
(see <complexType> and <attributeGroup>).

XML representations for attribute group definitions. The effect is as if the attribute
declarations in the group were present in the type definition.

The example above illustrates the pattern
mentioned in XML Representations of Components (§3.1.2): The same
element, in this case attributeGroup, serves both to
define and to incorporate by reference. In the first
attributeGroup element in the example, the
name attribute is required and the
ref attribute is forbidden; in the second the
ref attribute is required, the
name attribute is forbidden.

3.6.1 The Attribute Group Definition Schema Component

The attribute group definition schema component has the
following properties:

3.6.2.1 XML Mapping Rule for Named Attribute Groups

The XML representation for an attribute group definition
schema component is an <attributeGroup> element information item. It provides for naming a group of
attribute declarations and an attribute wildcard for use by
reference in the XML representation of complex type definitions
and other attribute group definitions. The correspondences between the
properties of the information item and properties of the
component it corresponds to are given in this section.

Circular reference is not disallowed. That is, it
is not an error if B, or some <attributeGroup>
element referred to by B (directly, or indirectly at some
remove) contains a reference to A. An <attributeGroup>
element involved in such a reference cycle maps to a
component whose {attribute uses}
and {attribute wildcard} properties
reflect all the <attribute> and <any>
elements contained in, or referred to (directly or indirectly)
by elements in the cycle.
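
Note: As a non-normative illustration, the following Python sketch shows one way a processor might gather the attributes for an attribute group involved in a reference cycle without looping forever. The dictionary layout (the "attributes" and "refs" keys), the group names A and B, and the function name are assumptions made for this example only; attribute wildcards are not handled.

```python
def collect_attribute_names(groups, start):
    """Gather attribute names reachable from `start`, tolerating cycles."""
    seen = set()        # groups already visited, so cycles terminate
    collected = set()   # attribute names found so far
    pending = [start]
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        collected.update(groups[name]["attributes"])
        pending.extend(groups[name]["refs"])
    return collected


# Hypothetical groups A and B that refer to each other.
groups = {
    "A": {"attributes": {"id"}, "refs": ["B"]},
    "B": {"attributes": {"lang"}, "refs": ["A"]},
}
```

In this sketch every group in the cycle ends up with the same collected set, which matches the statement that the resulting component reflects all the elements contained in, or referred to by, elements in the cycle.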

3.6.2.2 Common Rules for Attribute Wildcards

The following mapping for attribute-wildcards forms part of the
XML mapping rules for different kinds of source declaration
(most prominently <attributeGroup>). It can be
applied to any element which can have an <anyAttribute>
element as a child, and produces as a result either a
Wildcard or the special value ·absent·.
The mapping depends on the concept of the ·local wildcard·:

[Definition:]
The
local wildcard of
an element information item E
is the appropriate case among the following:

3.7.2 XML Representation of Model Group Definition Schema Components

The XML representation for a model group definition schema component is a
<group> element information item.
It provides for
naming a model group for use by reference in the XML representation of
complex type definitions and model groups. The correspondences between the
properties of the information item and
properties of the component it corresponds to are given in this section.

Otherwise, the <group>
has minOccurs=maxOccurs=0, in which
case it maps to no component at all.

The name of this section is slightly misleading, in that the
second, un-named, case above (with a ref and no
name) is not really a named model group at all, but
a reference to one. Also note that in the first (named) case
above no reference is made to minOccurs or
maxOccurs: this is because the schema for schema documents does not
allow them on the child of <group> when it is
named. This in
turn is because the {min occurs} and
{max occurs} of the particles which
refer to the definition are what count.

specifies a sequential (sequence),
disjunctive (choice) or conjunctive (all) interpretation of
the {particles}. This in turn
determines whether the element
information item [children] ·validated· by the model group must:

(all) correspond to the specified {particles}. The elements can occur in any
order.

When two or more particles contained directly or indirectly in the
{particles} of a model group have identically named
element declarations as their
{term}, the type definitions of those declarations must be the
same. By 'indirectly' is meant particles within the {particles}
of a group which is itself the {term} of a directly contained
particle, and so on recursively.
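
Note: The recursive check described above can be sketched in Python as follows. This is a non-normative illustration: the tuple layout for particles and the use of a type name as a stand-in for "the same type definition" are assumptions of the example, not part of this specification.

```python
def find_type_conflicts(particles, seen=None):
    """Return element names that are declared with more than one type.

    Each particle is either ("element", name, type_name) or
    ("group", [child particles]); this tuple layout is hypothetical.
    """
    if seen is None:
        seen = {}
    conflicts = set()
    for particle in particles:
        if particle[0] == "element":
            _, name, type_name = particle
            # Remember the first type seen for this name and compare later ones.
            if seen.setdefault(name, type_name) != type_name:
                conflicts.add(name)
        else:
            # Indirect containment: descend into the nested group.
            conflicts |= find_type_conflicts(particle[1], seen)
    return conflicts
```

An empty result means that no two identically named element declarations in the group, at any depth, were found with different types.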

3.8.2 XML Representation of Model Group Schema Components

The XML representation for a model group schema component is
either an
<all>, a <choice> or a <sequence>
element information item. The correspondences between the
properties of those information items and
properties of the component they correspond to are given in this section.

3.8.3 Constraints on XML Representations of Model Groups

None as such.

3.8.4 Model Group Validation Rules

In order to define the validation rules for model
groups clearly, it will be useful to define some basic terminology;
this is done in the next two sections, before the validation rules
themselves are formulated.

3.8.4.1 Language Recognition by Groups

Each model group M denotes a language
L(M), whose members are the sequences of element information items
·accepted· by M.

Within L(M) a smaller language V(M) can be
identified, which is of particular importance for schema-validity
assessment. The difference between the two languages is that
V(M) enforces some constraints which are ignored in the definition
of L(M).
Informally L(M) is the set of sequences which are accepted by a model
group if no account is taken of the schema component
constraint Unique Particle Attribution (§3.8.6.4) or the related provisions
in the validation rules which specify how to choose a unique ·path·
in a non-deterministic model group. By contrast, V(M) takes
account of those constraints and includes only the sequences which are
·locally valid· against M. For all model groups M, V(M) is a
subset of L(M). L(M) and related concepts are described in this
section; V(M) is described in the next section, Principles of Validation against Groups (§3.8.4.2).

[Definition:] When a sequence S of element information
items is checked against a model group M, the sequence of
·basic particles·
which the items of S match, in order, is a
path of S in M. For a given S and
M, the
path of S in
M is not necessarily unique.
Detailed rules for the matching, and thus for the construction of
paths, are given in Language Recognition by Groups (§3.8.4.1) and Principles of Validation against Particles (§3.9.4.1).
Not every sequence has a path in every model group, but every
sequence accepted by the model group does have a path.
[Definition:] For
a model group M and a sequence S in L(M), the path
of S in M is a complete path; prefixes of
complete paths which are themselves not complete paths
are incomplete paths.
For example, in the model group

3.8.4.1.1 Sequences

This section defines L(M), the set of
·paths· in M, and V(M), if M
is a sequence group.

If M is a Model Group,
and the {compositor} of M is sequence,
and the {particles} of M is the sequence P1, P2, ...,
Pn, then L(M) is the set of sequences S = S1 + S2 + ... +
Sn (taking "+" as the concatenation operator), where
Si is in L(Pi) for 0 < i ≤ n.
The sequence of sequences S1, S2, ..., Sn is a ·partition· of
S.
Less formally, when M is a sequence of P1, P2, ... Pn, then
L(M) is the set of sequences formed by taking one sequence which is
accepted by P1, then one accepted by P2, and so on, up through
Pn, and then concatenating them together in order.

[Definition:] A
partition of a sequence is a sequence of sub-sequences,
some or all of which may be empty, such that concatenating all
the sub-sequences yields the original sequence.
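
Note: As a non-normative illustration of ·partitions· and of L(M) for a sequence group, the following Python sketch enumerates every partition of a sequence into n sub-sequences and tests membership by brute force. The function names are hypothetical, and each L(Pi) is modelled as a caller-supplied predicate, which is an assumption of the example; a real processor would not enumerate partitions this way.

```python
from itertools import combinations_with_replacement


def partitions(seq, n):
    """Yield every partition of `seq` into n consecutive sub-sequences."""
    seq = tuple(seq)
    # Choose n - 1 cut points (repeats allowed, so sub-sequences may be empty).
    for cuts in combinations_with_replacement(range(len(seq) + 1), n - 1):
        bounds = (0,) + cuts + (len(seq),)
        yield tuple(seq[bounds[i]:bounds[i + 1]] for i in range(n))


def in_sequence_language(seq, particle_languages):
    """True if some partition S1..Sn has each Si accepted by particle i."""
    n = len(particle_languages)
    if n == 0:
        # With no particles, only the empty sequence is accepted.
        return len(seq) == 0
    return any(
        all(accepts(part) for accepts, part in zip(particle_languages, parts))
        for parts in partitions(seq, n)
    )
```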

When M is a sequence group
and S is a sequence of input items, the set of ·paths· of S
in M is the set of all
paths Q = Q1 + Q2 + ... + Qj, where

where n = 3, j = 2, then
S1 is (<a/>),
S2 is (<b/>),
and
S has a ·path· in M, even though S is not in
L(M). The ·path· has two items, first the Particle
for the a element, then the Particle for the
b element.

3.8.4.1.2 Choices

This section defines L(M), the set of
·paths· in M, and V(M), if M
is a choice group.

When the {compositor} of M is choice, and the {particles} of M is the sequence P1, P2, ..., Pn,
then
L(M) is
L(P1) ∪ L(P2) ∪ ... ∪ L(Pn),
and the set of ·paths· of S in M is the set
Q = Q1 ∪ Q2 ∪ ... ∪ Qn, where
Qi is the set of ·paths· of S in Pi, for
0 < i ≤ n.
Less formally, when M is a choice of P1, P2, ... Pn, then
L(M) contains any sequence accepted by any of the particles P1, P2, ... Pn,
and any ·path· of S in any of the particles P1, P2, ... Pn
is a ·path· of S in M.

3.8.4.1.3 All-groups

This section defines L(M), the set of
·paths· in M, and V(M), if M
is an all-group.

When the {compositor} of M is all, and the {particles} of M is the sequence P1, P2, ..., Pn,
then
L(M) is the set of sequences
S = S1 × S2 × ... × Sn
(taking "×" as the interleave operator),
where
for 0 < i ≤ n, Si is in L(Pi).
The set of sequences
{S1, S2, ..., Sn} is a ·grouping· of S.
The set of ·paths· of S in M is
the set of all ·paths· Q = Q1 × Q2 × ... × Qn,
where Qi is a ·path· of Si in Pi,
for 0 < i ≤ n.

Less formally, when M is an all-group of P1, P2, ... Pn, then
L(M) is the set of sequences formed by taking one sequence which
is accepted by P1,
then one accepted by P2,
and so on, up through Pn, and then interleaving them
together. Equivalently, L(M) is the set of sequences S
such that the set {S1, S2, ..., Sn} is a
·grouping· of S, and
for 0 < i ≤ n, Si is in L(Pi).

[Definition:] A
grouping of a sequence is a set of sub-sequences, some or
all of which may be empty, such that each member of the original
sequence appears once and only once in one of the sub-sequences and
all members of all sub-sequences are in the original
sequence.
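
Note: As a non-normative illustration of ·groupings· and of L(M) for an all-group, the following Python sketch assigns each item of a sequence to exactly one sub-sequence, preserving order, and tests membership by brute force. The function names are hypothetical, each L(Pi) is modelled as a caller-supplied predicate, and groupings are yielded as tuples aligned with the particles rather than as sets; all of these are assumptions of the example. The enumeration is exponential and is meant only to make the definition concrete.

```python
from itertools import product


def groupings(seq, n):
    """Yield every way to split `seq` into n order-preserving sub-sequences."""
    seq = tuple(seq)
    # Assign each position of seq to exactly one of the n sub-sequences.
    for assignment in product(range(n), repeat=len(seq)):
        yield tuple(
            tuple(item for item, g in zip(seq, assignment) if g == i)
            for i in range(n)
        )


def in_all_group_language(seq, particle_languages):
    """True if some grouping S1..Sn has each Si accepted by particle i."""
    n = len(particle_languages)
    if n == 0:
        # With no particles, only the empty sequence is accepted.
        return len(seq) == 0
    return any(
        all(accepts(sub) for accepts, sub in zip(particle_languages, subs))
        for subs in groupings(seq, n)
    )
```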

where n = 3, then
S1 is (<a/><a/>),
S2 is (<b/>),
and the ·path· of
S in M is the sequence containing first the Particle
for the a element, then the Particle for the
b element, then once more the
Particle for the a element.

which can match the sequence (<a/><b/>)
in more than one way.
It may also be the case with unambiguous model groups, if
they do not correspond to a deterministic
expression (as it is termed in [XML 1.1])
or a "1-unambiguous" expression, as it
is defined by [One-Unambiguous Regular Languages].
For example,

3.8.4.2 Principles of Validation against Groups

As noted above, each model group M denotes a
language L(M), whose members are sequences of element information
items. Each member of L(M) has one or more ·paths· in M, as do
other sequences of element information items.

[Definition:] Two
Particles P1 and P2 contained in some Particle P compete with each other if and only if some sequence S
of element information items has two ·paths· in P which are
identical except that one path has P1 as its last item and the other
has P2.

the sequence (<a/><b/>) has two paths,
one (Q1) consisting of the Particle whose {term} is
the declaration for a followed by the
Particle whose {term} is
the declaration for b, and
a second (Q2) consisting of the Particle whose {term} is
the declaration for a followed by the
Particle whose {term} is
the wildcard. The sequences Q1 and Q2 are
identical except for their last items, and so the
two Particles which are the last items of Q1 and
Q2 are said to ·compete· with each other.
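
Note: The shape of the two paths in this example can be checked mechanically. The following non-normative Python sketch tests only whether two given paths are identical except for their last items; it assumes the caller has already established that both are paths of the same sequence S in the same particle. The function name and the particle labels are hypothetical.

```python
def paths_differ_only_in_last_item(path1, path2):
    """True if the two paths are identical except for their last items."""
    return (
        len(path1) == len(path2) > 0
        and path1[:-1] == path2[:-1]
        and path1[-1] != path2[-1]
    )


# Hypothetical labels for the particles in the example above.
q1 = ("particle-for-a", "particle-for-b")
q2 = ("particle-for-a", "particle-for-wildcard")
```

When the function returns true for two paths of one sequence, the last items of those paths are the particles that ·compete·.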

[Definition:] A sequence S of
element information items is locally valid against
a particle P if and only if
S has a ·validation-path· in P. The set of all such
sequences is written V(P).

3.8.4.3 Element Sequence Valid

Validation Rule: Element Sequence Valid

For a sequence S (possibly empty) of element information items to be
locally ·valid· with respect to
a model group M, S must be in V(M).

Note: It is possible to define groups whose {particles}
is empty. When a choice-group M has an empty
{particles} property, then
L(M) is the empty set.
When M is a sequence- or all-group with an empty
{particles} property, then
L(M) is the set containing the empty (zero-length) sequence.

3.8.6.3 Element Declarations Consistent

Schema Component Constraint: Element Declarations Consistent

If the {particles}
property contains, either
directly, indirectly (that is, within the {particles}
property of a
contained model group, recursively),
or ·implicitly·, two or more
element
declarations with the same expanded name, then all their type
definitions must be the same top-level definition, that is,
all of the following must be true:

Since this constraint is expressed at the component level, it
applies to content models whose origins (e.g. via type derivation and
references to named model groups) are no longer evident. So particles at
different points in the content model are always distinct from one another,
even if they originated from the same named model group.

unbounded if the {max occurs} of any wildcard or element
declaration particle in G.{particles} or the maximum
part of the effective total range of any of the group particles in
G.{particles} is unbounded,
or if any of those is non-zero
and P.{max occurs}
= unbounded,
otherwise the product of P.{max occurs} and the
sum of the {max occurs} of every wildcard or element
declaration particle in G.{particles} and the maximum
part of the effective total range of each of the group particles in
G.{particles}
(or 0 if there are no
{particles}).

3.8.6.6 Effective Total Range (choice)

Schema Component Constraint: Effective Total Range (choice)

The effective total range of a particle P
whose {term} is a group G
whose {compositor} is
choice is a pair of minimum and maximum, as follows:

unbounded if the {max occurs} of any wildcard or element
declaration particle in G.{particles} or the maximum
part of the effective total range of any of the group particles in
G.{particles} is unbounded,
or if any of those is non-zero and
P.{max occurs} = unbounded,
otherwise the product of P.{max occurs} and the
maximum of the {max occurs} of every wildcard or element
declaration particle in G.{particles} and the maximum
part of the effective total range of each of the group particles in
G.{particles}
(or 0 if there are no {particles}).
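
Note: The maximum part of the effective total range can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: the class and function names are hypothetical, None stands in for unbounded, the sum-based rule given before this section is assumed to apply to sequence and all groups, and an unbounded repetition of a group whose maxima are all zero is treated as 0. The minimum part is not computed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

UNBOUNDED = None  # stand-in for the keyword "unbounded" (an assumption)


@dataclass
class Group:
    compositor: str                      # "sequence", "all" or "choice"
    particles: List["Particle"] = field(default_factory=list)


@dataclass
class Particle:
    max_occurs: Optional[int]            # None means unbounded
    term: Union[str, Group]              # str: element declaration or wildcard


def effective_max(p):
    """Maximum part of the effective total range of a group particle p."""
    g = p.term
    maxima = [
        effective_max(c) if isinstance(c.term, Group) else c.max_occurs
        for c in g.particles
    ]
    if any(m is UNBOUNDED for m in maxima):
        return UNBOUNDED
    if p.max_occurs is UNBOUNDED:
        # Unbounded repetition of something non-zero is unbounded;
        # repeating nothing is treated here as 0.
        return UNBOUNDED if any(m != 0 for m in maxima) else 0
    if not maxima:
        return 0
    # choice takes the largest child maximum; sequence and all add them up.
    combine = max if g.compositor == "choice" else sum
    return p.max_occurs * combine(maxima)
```

For instance, with child particles whose {max occurs} are 2 and 3 and an enclosing particle whose {max occurs} is 2, the sketch yields 10 for a sequence and 6 for a choice.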

In general, multiple element
information item [children], possibly with intervening character [children] if the content type
is mixed, can be ·validated· with
respect to a single particle. When the {term} is an element
declaration or wildcard, {min occurs} determines the minimum number of such element [children] that can occur. The number of such children must be greater than or equal to {min occurs}. If {min occurs} is 0, then occurrence of such children is optional.

Again, when the {term} is an element
declaration or wildcard, the number of such element [children] must be less than or equal to any numeric specification of
{max occurs}; if {max occurs} is unbounded, then there is no
upper bound on the number of such children.

3.9.4.1.1 Language Recognition for Repetitions

When P.{min occurs} = P.{max occurs} = n,
and P.{term} = T,
then L(P) is the set of sequences S = S1 + S2 + ... + Sn such that Si is in L(T) for 0 < i ≤ n.
Less formally: L(P) is
the
set of sequences which have ·partitions· into n sub-sequences
for which each of the n subsequences
is in the language accepted by the {term} of P.

When P.{min occurs} = j
and P.{max occurs} = k,
and P.{term} = T,
then L(P) is the set of sequences S = S1 + S2 + ... + Sn, i.e. the
set of sequences which have ·partitions· into n sub-sequences
such that n ≥ j and n ≤ k (or k is unbounded)
and Si is in L(T) for 0 < i ≤ n.
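
Note: As a non-normative illustration, the following Python sketch decides whether a sequence is in L(P) for a particle with {min occurs} j and {max occurs} k. The function name is hypothetical, L(T) is modelled as a caller-supplied predicate, None stands in for unbounded, and j ≤ k is assumed; none of these conventions comes from this specification. The sketch records, for each prefix, how many non-empty sub-sequences can cover it, and lets empty sub-sequences (when T accepts the empty sequence) pad the count up to j.

```python
def in_repetition_language(seq, min_occurs, max_occurs, in_term):
    """True if seq splits into n parts, each in L(T), with j <= n <= k."""
    seq = tuple(seq)
    size = len(seq)
    # counts[i]: numbers of non-empty sub-sequences that can cover seq[:i].
    counts = [set() for _ in range(size + 1)]
    counts[0].add(0)
    for end in range(1, size + 1):
        for start in range(end):
            if counts[start] and in_term(seq[start:end]):
                counts[end].update(c + 1 for c in counts[start])
    empty_ok = in_term(())  # empty sub-sequences can pad n up to min_occurs
    for c in counts[size]:
        if max_occurs is not None and c > max_occurs:
            continue
        if c >= min_occurs or empty_ok:
            return True
    return False
```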

If T is a ·basic term·, then the (sole) ·path· of S in P
is a sequence of n occurrences of P.

Note:
Informally: the path of an input sequence S in a
particle P may go through the ·basic particles· in
P as many times as is allowed by
P.{max occurs}.
If the path goes through P more than once, each
time before the last one must correspond to a sequence
accepted by
P.{term};
because the last
iteration in the path
may not be complete, it need not be accepted by the
{term}.

Note: The rule just given does not require that the
content model be deterministic. In practice, however,
most
non-determinism in content models is ruled out by the schema
component constraint Unique Particle Attribution (§3.8.6.4).
Non-determinism can occur despite that constraint for
several reasons.
In some such cases,
some particular element information item may be accepted by either a
Wildcard or an Element Declaration. In such situations,
the validation process defined in this specification matches the
element information item against the Element Declaration, both in
identifying the Element Declaration as the item's
·context-determined declaration·,
and in choosing alternative paths through a content model.
Other cases of non-determinism involve nested particles each of
which has {max occurs} greater than 1,
where the input sequence can be partitioned in multiple ways.
In those cases, there is no fixed rule for eliminating the
non-determinism.

Note: clause 1 and clause 2.3.2 do not
interact: an element information item validatable by a declaration
with a substitution group head is
not validatable by a wildcard which accepts the head's
(namespace, name) pair but not its own.

In order to exploit the full potential for extensibility offered by XML
plus namespaces, more provision is needed than DTDs allow for targeted flexibility in content
models and attribute declarations. A wildcard provides for ·validation· of
attribute and element information items dependent on their namespace
names
and optionally on their local names.

({variety} not and
{namespaces} a set whose members are either namespace names or
·absent·) have any namespace
other than the specified namespaces and/or, if ·absent· is included in the set, are
namespace-qualified;

({variety} enumeration and {namespaces} a set
whose members are either namespace names or ·absent·) have any of the specified
namespaces and/or, if ·absent· is included in the set, are
unqualified.
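
Note: The two cases just described can be sketched in Python as follows. This is a non-normative illustration: the function name is hypothetical, None is used as a stand-in for ·absent· (so an unqualified item has namespace None), and only the not and enumeration varieties described above are covered.

```python
ABSENT = None  # stand-in for the special value "absent" (an assumption)


def namespace_allowed(variety, namespaces, item_namespace):
    """Check an item's namespace name against {variety} and {namespaces}."""
    if variety == "not":
        # Any namespace other than the listed ones; if ABSENT is listed,
        # unqualified items are excluded too.
        return item_namespace not in namespaces
    if variety == "enumeration":
        # Only the listed namespaces; ABSENT in the set admits unqualified items.
        return item_namespace in namespaces
    raise ValueError("variety not covered by this sketch: %r" % (variety,))
```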

The keywords
defined and
sibling allow a kind of wildcard which matches only
elements not declared in the current schema or contained
within the current complex type,
respectively. They are
new in this version of this specification. The Working Group is
uncertain whether their value outweighs their
liabilities; we solicit input from implementors and users of
this specification as to whether they should be retained or not.

3.10.2 XML Representation of Wildcard Schema Components

The XML representation for a wildcard schema component is an
<any> or <anyAttribute> element
information item.

An <any> information item
corresponds both to a wildcard component and to
a particle containing that wildcard
(unless minOccurs=maxOccurs=0, in which case the
item corresponds to no component at
all).
The mapping rules are given in the following two subsections.

The mapping from an <any> information item to a wildcard component
is as follows. This mapping is also used for mapping
<anyAttribute> information items to wildcards,
although in some cases the result of the mapping is further
modified, as specified in the rules for
<attributeGroup> and
<complexType>.

1 If neither namespace nor
notNamespace is present, then the empty set;

2 If namespace = "##any", then the empty set;

3 If namespace
= "##other", then a set consisting of
·absent·
and, if the targetNamespace [attribute] of
the <schema> ancestor element
information item is present, its ·actual value·;

4 otherwise a set whose members are namespace
names corresponding to the space-delimited substrings of
the ·actual value· of the namespace or
notNamespace [attribute] (whichever is
present), except

4.1 if one such substring is
##targetNamespace, the corresponding
member is the ·actual value· of the
targetNamespace [attribute] of the
<schema> ancestor
element information item if present, otherwise
·absent·;

4.2 if one such substring is ##local, the
corresponding member is ·absent·.
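
Note: The clauses above can be sketched in Python as follows. This is a non-normative illustration: the function and parameter names are hypothetical, a missing attribute is passed as None, and None also stands in for ·absent· inside the resulting set.

```python
ABSENT = None  # stand-in for the special value "absent" (an assumption)


def namespaces_property(namespace=None, not_namespace=None, target_namespace=None):
    """Compute the set of namespace names following clauses 1 to 4 above."""
    if namespace is None and not_namespace is None:
        return set()                                   # clause 1
    if namespace == "##any":
        return set()                                   # clause 2
    if namespace == "##other":
        result = {ABSENT}                              # clause 3
        if target_namespace is not None:
            result.add(target_namespace)
        return result
    # clause 4: whichever of the two attributes is present
    value = namespace if namespace is not None else not_namespace
    members = set()
    for token in value.split():
        if token == "##targetNamespace":               # clause 4.1
            if target_namespace is not None:
                members.add(target_namespace)
            else:
                members.add(ABSENT)
        elif token == "##local":                       # clause 4.2
            members.add(ABSENT)
        else:
            members.add(token)
    return members
```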

Wildcards are subject to the same ambiguity constraints
(Unique Particle Attribution (§3.8.6.4)) as other content model
particles:
If an instance element could match one
of two wildcards, within the content model of a type, that model
is in error.

3.10.3 Constraints on XML Representations of Wildcards

Schema Representation Constraint: Wildcard Representation OK

In addition to the conditions imposed on <any>
element information items by the schema for schema documents, namespace and
notNamespace attributes must not both be
present.

[Definition:]
An element or attribute information item is skipped if it
is ·attributed· to a skip wildcard or if one of its ancestor
elements is.

3.10.4.2 Wildcard allows Expanded Name

Validation Rule: Wildcard allows Expanded Name

For an expanded name E, i.e. a
(namespace name, local name) pair,
to be ·valid· with respect to a
namespace constraint C
which appears as the value of
a {namespace constraint} belonging to a wildcard W,
all of the following must be true:

Note: If a wildcard
union is inexpressible, any rule requiring
that one Namespace Constraint be that union cannot be
satisfied.

In the case where there are more than two Namespace Constraints to be combined, the wildcard
union is determined by identifying the wildcard union of two
of them as above, then the wildcard union of the result with
the third (providing the first union was expressible), and so on as required.
If some of the Namespace Constraints contain
defined in their {disallowed names},
then the union operation must be applied first to those which do not
contain
that
keyword,
to maximize the chance of expressibility.
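
Note: The ordering rule for combining more than two Namespace Constraints can be sketched in Python as follows. This is a non-normative illustration: the function name, the dictionary key "disallowed_names", and the convention that the pairwise union returns None when the union is inexpressible are all assumptions of the example. The pairwise union itself is supplied by the caller and is not defined here.

```python
def combine_unions(constraints, pairwise_union):
    """Fold `pairwise_union` over constraints, 'defined'-free ones first.

    `pairwise_union(a, b)` is a caller-supplied function that returns the
    union of two constraints, or None when the union is inexpressible.
    """
    # sorted() is stable, so the original order is kept within each class.
    ordered = sorted(constraints, key=lambda c: "defined" in c["disallowed_names"])
    result = ordered[0]
    for constraint in ordered[1:]:
        result = pairwise_union(result, constraint)
        if result is None:       # inexpressible: no further union is possible
            return None
    return result
```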

In the case where there are more than two Namespace Constraints to be combined, the wildcard
intersection is determined by identifying the wildcard intersection of two
of them as above, then the wildcard intersection of the result with
the third, and so on as required.

(unique) the
identity-constraint
definition asserts uniqueness, with respect to the content
identified by {selector}, of the tuples resulting from
evaluation of the {fields} XPath expression(s).

(key) the
identity-constraint
definition asserts uniqueness as for
unique. key further asserts that all selected content
actually has such tuples.

(keyref) the
identity-constraint
definition asserts a correspondence, with respect to the content
identified by {selector}, of the tuples resulting from
evaluation of the {fields} XPath expression(s), with those of the {referenced key}.

These constraints are specified alongside the specification of types for the
attributes and elements involved, i.e. something declared as of type integer
can also serve as a key. Each constraint declaration has a name, which exists in a
single symbol space for constraints. The
equality and inequality
conditions
appealed to in checking these constraints apply to the
values of
the fields selected, not their
lexical representation, so that for example 3.0 and 3
would be conflicting keys if they were both
decimal, but non-conflicting if
they were both strings, or one was a string and one a decimal.
When equality and
identity differ for the simple types involved, all three
forms of identity-constraint test for equality, not identity,
of values.
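
Note: The difference between comparing values and comparing lexical representations can be illustrated with the following non-normative Python sketch. The function names are hypothetical, and Python's Decimal and str types are used only as stand-ins for the decimal and string value spaces.

```python
from decimal import Decimal


def typed_value(type_name, lexical):
    """Map a lexical form to a (hypothetical) value for two simple types."""
    if type_name == "decimal":
        return Decimal(lexical)
    if type_name == "string":
        return lexical
    raise ValueError("type not covered by this sketch: " + type_name)


def keys_conflict(a, b):
    """a and b are (type_name, lexical) pairs; compare their values."""
    return typed_value(*a) == typed_value(*b)
```

With this sketch, 3.0 and 3 conflict when both are decimal, and do not conflict when both are strings or when one is a string and the other a decimal.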

Overall the augmentations to XML's ID/IDREF mechanism are:

Functioning as a part of an identity-constraint is in addition to, not instead of,
having a type;

Not just attribute values, but also element content and combinations
of values and content can be declared to be unique;

Identity-constraints are specified to hold within the scope of particular elements;

(Combinations of) attribute values and/or element content can be
declared to be keys, that is, not only unique, but always present and non-nillable;

The comparison between keyref {fields} and
key or unique {fields} is by value equality,
not by string equality.

{selector} specifies a restricted XPath
([XPath 2.0]) expression relative to
instances of the element being declared. This must identify a
sequence of element nodes that are
contained within the declared element
to which the constraint applies.

{fields} specifies XPath expressions relative to each
element selected by a {selector}.
Each
XPath expression in the {fields} property
must identify
a single node (element or attribute),
whose content or value, which must be
of a simple type, is used in the constraint. It is possible to specify an
ordered list of {fields}s, to cater to multi-field keys,
keyrefs, and uniqueness constraints.

The XML representation for an identity-constraint definition schema
component is either a <key>, a <keyref> or a <unique> element information
item. The correspondences between the properties of those
information items and properties of the component they
correspond to are as follows:

A state element is defined, which
contains a code child and some vehicle and person
children. A vehicle in turn has a plateNumber attribute,
which is an integer, and a state attribute. State's
codes are a key for them within the document. Vehicle's
plateNumbers are a key for them within states, and
state and
plateNumber is asserted to be a key for
vehicle within the document as a whole. Furthermore, a person element has
an empty car child, with regState and
regPlate attributes, which are then asserted together to refer to
vehicles via the carRef constraint. The requirement
that a vehicle's state match its containing
state's code is not expressed here.

A list of state elements can appear as child elements
under stateList. A key constraint can be used to
ensure that there is no duplicate state code. We already
defined a key in the above example for the exact same purpose
(the key constraint is named "state"). We can reuse it
directly via the ref attribute on the key
element.

3.11.4 Identity-constraint Definition Validation Rules

Validation Rule: Identity-constraint Satisfied

For an element information item E
to be locally ·valid· with respect to an identity-constraint
all of the following must be true:

1 A data model instance is constructed
from the input information set, as described in [XDM].
The {selector}, with
the element node corresponding to E as the
context node, evaluates to a
sequence of element nodes, as defined in
XPath Evaluation (§3.13.4.2).
[Definition:]
The target node set is the
set of nodes in that sequence,
excluding
element nodes corresponding to element
information items that are
·skipped·.

2 Each node in the ·target node set· is
either the context node
or an
element node among its descendants.

3 For each node in the ·target node
set· all of the {fields}, with
that node as the context node, evaluates to
a
sequence of nodes (as defined in
XPath Evaluation (§3.13.4.2)) that only contains
·skipped· nodes and at most one node whose ·governing·
type definition is either a simple type definition or a complex type
definition with {variety} simple.
[Definition:] Call the
sequence of the
[schema actual value]s
of the element and/or attribute information items in
those node-sets in order the key-sequence of the
node.

Note: The use of [schema actual value] in the definition
of ·key sequence· above means that
default or fixed value constraints may play a part in ·key sequence·s.

4
[Definition:] Call the subset of the ·target node set· for
which all the {fields} evaluate to a node-set
one of whose members
is an element or attribute node with a
·non-absent· [schema actual value]
the qualified node set.
The appropriate case among the following is true:

Note:
For unique identity constraints, the ·qualified node set· is
allowed to be different from the ·target node set·. That is, a
selected unique node may have fields that do not have corresponding
[schema actual value]s.
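
Note: The relationship between the ·target node set·, the ·qualified node set· and the unique and key categories can be sketched in Python as follows. This is a simplified, non-normative illustration: the function name and the mapping from node labels to tuples of field values are hypothetical, None stands for a field without a [schema actual value], and keyref, nillability and node tables are not modelled.

```python
def check_constraint(category, target_nodes):
    """target_nodes maps a node label to its tuple of field values.

    None stands for a field without a [schema actual value].
    Returns True when the (simplified) constraint holds.
    """
    # Qualified nodes are those for which every field has a value.
    qualified = {
        node: values
        for node, values in target_nodes.items()
        if all(v is not None for v in values)
    }
    sequences = list(qualified.values())
    unique_ok = len(sequences) == len(set(sequences))
    if category == "unique":
        return unique_ok
    if category == "key":
        # key also requires every selected node to have all its fields.
        return unique_ok and len(qualified) == len(target_nodes)
    raise ValueError("category not covered by this sketch: " + category)
```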

Note: Because the validation of keyref (see clause 4.3) depends on finding appropriate entries in an
element information item's ·node
table·, and ·node
tables· are assembled strictly recursively from the
node tables of descendants, only element information items
within the sub-tree rooted at the element information item
being ·validated· can be referenced successfully.

For purposes of checking
identity-constraints, single atomic values are not distinguished
from lists with single items. An atomic value V and a list
L with a single item are treated as equal, for
purposes of this specification, if V is equal
to the atomic value which is the single item of L.
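
Note: This rule can be sketched in Python as follows. The sketch is non-normative; the function names are hypothetical, and Python lists and tuples are used as stand-ins for list values.

```python
def normalize(value):
    """Treat a list with a single item as that item."""
    if isinstance(value, (list, tuple)) and len(value) == 1:
        return value[0]
    return value


def equal_for_identity_constraints(v, l):
    """Compare two values, ignoring the atomic versus one-item-list distinction."""
    return normalize(v) == normalize(l)
```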

provided no two entries have the same ·key-sequence· but distinct nodes. Potential
conflicts are resolved by not including any conflicting entries which
would have owed their inclusion to clause 1 above. Note
that if all the conflicting entries arose under clause 1 above, this means no entry at all will appear for the
offending ·key-sequence·.

Note: The complexity of the above arises from the fact that
keyref identity-constraints can be defined on domains distinct from the
embedded domain of the identity-constraint they reference, or on domains which are the
same but self-embedding at some depth. In either case the ·node
table· for the referenced identity-constraint needs to propagate upwards, with
conflict resolution.

The Identity-constraint Binding information item, unlike
others in this specification, is essentially an internal bookkeeping
mechanism. It is introduced to support the definition of Identity-constraint Satisfied (§3.11.4)
above.