Hi Folks,
There has been considerable discussion (and confusion) about how an
XML instance document indicates the XML Schema(s) that it conforms to.
I am not sure that it is yet clear to everyone how to do it. I will
take a stab at explaining it, based upon the discussions. However, we
really need this to be verified by someone from the Schema WG.
[Henry, I haven't fully digested your most recent message. Hopefully
the following is consistent with what you said.]
[Also, thanks a lot to Henry Thompson, Andrew Layman, and Rick Jelliffe
for taking the time to answer my endless barrage of questions. I hope
that these questions and their answers are useful to all.]
Case 1. Entire instance document conforms to a single XML Schema
Let's use the example that Gabe Beged-Dov gave yesterday. Here's the
skeleton of the XML Schema:
<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "structures.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="urn:person-schema">
    ...
</schema>
Let's assume that the URI for this schema is:
urn:person-schema/person-schema.xsd
Thus the namespace for the elements and attributes that are declared in
person-schema.xsd is urn:person-schema.
An XML instance document that wishes to indicate that all or part of it
conforms to person-schema.xsd must use the schemaLocation attribute.
The value of schemaLocation must include a pair of values, separated by
whitespace: the namespace (urn:person-schema) and the URI of the schema
document (urn:person-schema/person-schema.xsd). Thus, for this case the
value of schemaLocation is:
schemaLocation="urn:person-schema
urn:person-schema/person-schema.xsd"
A Schema-validating parser will use the URI to fetch the schema
document, and then will verify that the targetNamespace value matches
the namespace in schemaLocation.
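To make that check concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not what any particular parser does: the function names are made up, and the schema text is an inline stand-in for the document that a parser would fetch from the URI.

```python
import xml.etree.ElementTree as ET

def parse_schema_location(value):
    """Split a schemaLocation value into (namespace, location) pairs."""
    # Any run of whitespace, including a line break, separates the tokens.
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold an even number of tokens")
    return list(zip(tokens[0::2], tokens[1::2]))

def target_namespace_matches(schema_text, expected_namespace):
    """Compare the schema's targetNamespace with the namespace from schemaLocation."""
    root = ET.fromstring(schema_text)
    return root.get("targetNamespace") == expected_namespace

# Assumption: this string stands in for the fetched person-schema.xsd.
SCHEMA_TEXT = """<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="urn:person-schema"/>"""

pairs = parse_schema_location("""urn:person-schema
    urn:person-schema/person-schema.xsd""")
namespace, location = pairs[0]
matches = target_namespace_matches(SCHEMA_TEXT, namespace)
```

Here pairs holds the single (namespace, location) pair, and matches is True because the stand-in schema declares urn:person-schema as its targetNamespace.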
The schemaLocation attribute is defined in the schema instance
namespace. So, to use it in our instance document we first need to
declare a prefix for the schema instance namespace and then qualify
schemaLocation with that prefix:
xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
xsi:schemaLocation="urn:person-schema
urn:person-schema/person-schema.xsd"
Now then, is that all that's needed in the XML instance document? Do we
simply add schemaLocation as an attribute to the root element, i.e.
<?xml version="1.0"?>
<Person xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>
Based upon Andrew Layman's messages yesterday, the answer is no. I
believe that I now understand why. In the above instance document we
have not declared a namespace for the elements Person, fname, and
lname. Thus, they are unqualified, i.e., they are in no namespace.
However, the schemaLocation attribute asserts that the elements
declared in the schema are in the urn:person-schema namespace. Thus, in
our instance document we must make a namespace declaration to indicate
that its elements are also in the urn:person-schema namespace. Since we
want all the instance document elements to come from the
urn:person-schema namespace, we can declare it as the default
namespace. Thus, our instance document looks like this:
<?xml version="1.0"?>
<Person xmlns="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>
Using the default namespace declaration, all the elements in the
instance document have the same namespace as the schema namespace.
Thus, the entire instance document will get schema-validated.
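One way to see the effect of the default namespace declaration is to ask a namespace-aware parser for the element names. The sketch below uses Python's xml.etree.ElementTree, which reports a qualified name as {namespace}local-name. It does no schema validation at all; it only shows which namespace each element ends up in. The helper name element_names is my own.

```python
import xml.etree.ElementTree as ET

# The instance document without a namespace declaration for its elements.
UNQUALIFIED = """<Person xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>"""

# The same document with urn:person-schema as the default namespace.
QUALIFIED = """<Person xmlns="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>"""

def element_names(xml_text):
    """Return every element name, in document order, as the parser sees it."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

unqualified_names = element_names(UNQUALIFIED)
qualified_names = element_names(QUALIFIED)
```

For the first document the names come back bare (Person, fname, lname). For the second each name carries the {urn:person-schema} prefix, so all three elements are in the namespace that the schema targets.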
Case 2. Part of the instance document conforms to a single XML Schema
Let's use the same schema as above and the same instance document.
However, in this case let's suppose that we just want to validate
"fname" against the schema. What would the instance document look like?
As usual we use the schemaLocation attribute to indicate the schema that
we are using. In the instance document we need to distinguish between
the elements that are in no namespace and the fname element, which is
in the urn:person-schema namespace. We can declare the namespace prefix
on fname itself or on any of its ancestors, but for simplicity let's do
it at the root element:
<?xml version="1.0"?>
<Person xmlns:p="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <p:fname>Helen</p:fname>
    <lname>Jones</lname>
</Person>
The only element in the instance document which has the same namespace
as the schema namespace is fname. Thus, it is the only element which
will get schema-validated.
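The selection described here can be sketched in Python as a filter on the namespace of each element. Again this is only an illustration: ElementTree does not validate anything, and local_names_in_namespace is a hypothetical helper of mine that lists the elements a validator would have a schema for.

```python
import xml.etree.ElementTree as ET

INSTANCE = """<Person xmlns:p="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <p:fname>Helen</p:fname>
    <lname>Jones</lname>
</Person>"""

def local_names_in_namespace(xml_text, namespace):
    """List the local names of the elements that are in the given namespace."""
    qualifier = "{" + namespace + "}"  # ElementTree writes names as {ns}local
    root = ET.fromstring(xml_text)
    return [el.tag[len(qualifier):]
            for el in root.iter()
            if el.tag.startswith(qualifier)]

candidates = local_names_in_namespace(INSTANCE, "urn:person-schema")
```

Only fname is returned. Person and lname carry no prefix and there is no default namespace, so they are not in urn:person-schema.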
Case 3. Instance document conforms to multiple XML Schemas
Let's suppose that we have a second schema. This second schema
specializes in defining last names (I know, it's silly):
<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "structures.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="urn:last-name-schema">
    ...
</schema>
Note that this second schema's namespace is:
urn:last-name-schema
Let's continue to use the same instance document. However, let's assume
that we want to validate fname against the first schema and lname
against the second schema. For the Person element, we don't want any
validation.
Our schemaLocation attribute now will have two pairs of values - the
first pair is for the first schema and the second pair is for the second
schema. We will declare the two different namespaces and prefix fname
and lname appropriately. Thus, the instance document is:
<?xml version="1.0"?>
<Person xmlns:p="urn:person-schema"
        xmlns:l="urn:last-name-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd
                            urn:last-name-schema
                            urn:last-name-schema/last-name-schema.xsd">
    <p:fname>Helen</p:fname>
    <l:lname>Jones</l:lname>
</Person>
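To show how the two pairs line up with the elements, here is a small Python sketch that reads xsi:schemaLocation from the root element and looks up, for each element, the schema location paired with its namespace. The function name schema_for_each_element is my own invention, and the code only does the lookup; it does not fetch or apply either schema.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema/instance"

INSTANCE = """<Person xmlns:p="urn:person-schema"
        xmlns:l="urn:last-name-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd
                            urn:last-name-schema
                            urn:last-name-schema/last-name-schema.xsd">
    <p:fname>Helen</p:fname>
    <l:lname>Jones</l:lname>
</Person>"""

def schema_for_each_element(xml_text):
    """Map each element's local name to the schema location for its namespace."""
    root = ET.fromstring(xml_text)
    # The value is a whitespace-separated list: namespace, location, namespace, ...
    tokens = root.get("{" + XSI + "}schemaLocation", "").split()
    locations = dict(zip(tokens[0::2], tokens[1::2]))
    result = {}
    for el in root.iter():
        if el.tag.startswith("{"):
            namespace, local = el.tag[1:].split("}", 1)
        else:
            namespace, local = None, el.tag  # element is in no namespace
        result[local] = locations.get(namespace)
    return result

assignments = schema_for_each_element(INSTANCE)
```

With this instance document, fname maps to urn:person-schema/person-schema.xsd, lname maps to urn:last-name-schema/last-name-schema.xsd, and Person maps to None because it is in no namespace.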
Well, I am getting tired of writing. Hopefully this makes sense. Even
more, hopefully it is correct. Comments? /Roger