
Using Schematron

If you use XML schema languages, you should consider Schematron. This powerful rule-based language lets you make distinctions which other languages find difficult or even impossible to handle. Best of all, you can use it in conjunction with other schema languages. This is the first part of a three-part series.

XML schema languages are important because they allow multiple parties to exchange information in a standardized way. Any party, by reading the schema definition, will know what to expect in an XML document validated against the schema. A number of XML schema languages currently exist, created with different objectives in mind, with the newer languages attempting to remedy the faults of the older languages.

XML Schema is the W3C-recommended schema language. In it, you can define exactly which elements and attributes can appear in an XML document, exactly what order they'll appear in, and exactly which types of data they can contain. This makes XML Schema a grammar-based schema language: it governs the structure of a document, much as grammar governs the structure of sentences in a spoken language.

Often, however, structure isn't enough, and other kinds of constraints are needed. For example, pieces of data in an XML document often have relationships to one another that should be validated. Consider an XML document that describes the residents of a given area. Each resident could be represented by a resident element, and each resident element could contain an age element and an eligible_to_vote element, the latter indicating whether or not the resident is eligible to vote in an election. Ordinarily, more data would be described, but let's keep it simple for the purpose of this example. The markup might look something like this:

<?xml version="1.0" encoding="utf-8"?>
<residents>
  <resident>
    <age>37</age>
    <eligible_to_vote>yes</eligible_to_vote>
  </resident>
  <resident>
    <age>14</age>
    <eligible_to_vote>yes</eligible_to_vote>
  </resident>
</residents>

Notice, however, that there's something wrong. The second resident is only fourteen years old yet is eligible to vote. I know of no area where this is legal. As a result, the markup really isn't "valid" even though its structure might be valid according to a grammar-based schema language.
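
As a rough preview of how a rule-based language can express this kind of relationship, here is a minimal sketch of a Schematron rule for the residents document. It rests on two assumptions that are not part of the example above: a voting age of 18, and the ISO Schematron namespace (earlier Schematron versions use a different one). Treat it as an illustration rather than a finished schema.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: the namespace assumes ISO Schematron. -->
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- The rule fires once for every resident element. -->
    <rule context="resident">
      <!-- Assumption: the minimum voting age is 18. -->
      <!-- The assertion fails when a resident is marked eligible
           but is younger than the assumed minimum age. -->
      <assert test="not(eligible_to_vote = 'yes') or age >= 18">
        A resident who is eligible to vote must be at least 18 years old.
      </assert>
    </rule>
  </pattern>
</schema>
```

Applied to the document above, the first resident (age 37) satisfies the assertion, while the second (age 14, marked eligible) does not, which is exactly the distinction a purely grammar-based schema misses.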