InDesign, InDesign Server and InCopy development

Validating IDML Based Files

IDML is designed to be generated and manipulated by XML tools and programmers. To support this, IDML can be validated against a RelaxNG schema.

When it comes to schemas and validation, there are two types of IDML files:

There are many single-file variants (snippets, assignments, ICML, etc.). These files need to be validated with the snippet schema.

Packages are multi-file ZIP archives that represent an entire InDesign document. Packages need to be validated with the package schema.
For more information on IDML files see the “IDML File Types” post.

Generating Schema Files

IDML can be extended by the scripting support in third-party plug-ins. For this reason, the schema used for validation must match a plug-in configuration. Rather than providing a static schema, InDesign provides a means to create a schema from your plug-in configuration. The following sample code demonstrates producing snippet and package schemas with JavaScript.

JavaScript

// Generate the snippet (non-package) schema
app.generateIDMLSchema(Folder("/idml-schema/snippet"), false);

// Generate the package schema
app.generateIDMLSchema(Folder("/idml-schema/package"), true);

The snippet schema is used to validate all single-file variants of IDML. It comprises two files:

File                              Purpose
datatype.rnc                      Shared data type file included by all schema files.
IDMarkupLanguage.rnc              Validates all single-file IDML variants (ICML, IDMS, ICMA, etc.).

The package schema comprises one shared file and one schema file for each type of XML file that can appear in an IDML package:

File                              Purpose
datatype.rnc                      Shared data type file included by all schema files.
designmap.rnc                     Validates designmap.xml.
MasterSpreads/MasterSpread.rnc    Validates all master spread files in the MasterSpreads directory.
Resources/Fonts.rnc               Validates Fonts.xml.
Resources/Graphic.rnc             Validates Graphic.xml.
Resources/Preferences.rnc         Validates Preferences.xml.
Resources/Styles.rnc              Validates Styles.xml.
Spreads/Spread.rnc                Validates all spread files in the Spreads directory.
Stories/Story.rnc                 Validates all story files in the Stories directory.
XML/BackingStory.rnc              Validates XML/BackingStory.xml.
XML/Mapping.rnc                   Validates XML/Mapping.xml.
XML/Tags.rnc                      Validates XML/Tags.xml.

Finding Errors in IDML

For demonstration purposes, we need files that contain errors. Imagine the following IDML fragment in both a snippet file (test.idms) and a package file (test.idml). The IDML contains four fairly obvious errors; try to spot all four.

XHTML

<Spread>
  <Rectangle foo="Test" Self="uec"…>
    <RectData>...</RectData>
    <Propertie>...</Propertie>
</Spread>

You may have found it difficult to spot all four errors. Imagine if this were buried in a huge XML file. Instead of trying to find errors ourselves, we can use schema validation.

Schema Validation Basics

A RelaxNG schema can be used to verify the structural correctness of a document. It checks to make sure all XML nodes (elements, attributes, text data, etc.) are used at the right places in the document. It detects any unknown or unexpected nodes and ensures that required nodes are present.

InDesign’s RelaxNG schemas can be used to check the structure of a document; however, they do not check the content of these nodes. For example, they don’t check that all IDML references exist. It’s possible to do some non-RelaxNG-based error detection. This is discussed in “Additional Error Detection” below.
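To make the distinction concrete, here is a minimal sketch of the kind of content check a schema cannot perform: confirming that values used as references match some Self attribute in the same text. The function name findDanglingReferences is hypothetical, the caller decides which attribute names count as references (ParentStory is used only as an illustration), and the regular-expression scan is a simplification rather than real XML parsing.

```javascript
// Hypothetical helper: report reference values that match no Self attribute.
// xmlText is the IDML text to scan; referenceAttributes is a caller-supplied
// list of plain attribute names that are assumed to hold references.
function findDanglingReferences(xmlText, referenceAttributes) {
  var selfIds = {};
  var selfPattern = /\sSelf="([^"]*)"/g;
  var m;
  // Pass 1: collect every Self value that is defined.
  while ((m = selfPattern.exec(xmlText)) !== null) {
    selfIds[m[1]] = true;
  }
  // Pass 2: check each reference attribute against the collected values.
  var dangling = [];
  for (var i = 0; i < referenceAttributes.length; i++) {
    var name = referenceAttributes[i];
    var refPattern = new RegExp("\\s" + name + "=\"([^\"]*)\"", "g");
    while ((m = refPattern.exec(xmlText)) !== null) {
      if (!Object.prototype.hasOwnProperty.call(selfIds, m[1])) {
        dangling.push(name + "=" + m[1]);
      }
    }
  }
  return dangling;
}
```

For a package, the Self values and the references can live in different XML files, so the text passed in would need to cover every file in the package.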

You can validate IDML files with any software that supports the compact form of RelaxNG. For snippet files, this is relatively straightforward: it amounts to pointing whatever validation engine you are using to the IDMarkupLanguage.rnc file (which includes the datatype.rnc file).

An IDML package comprises many XML files. The package schema comprises several schema files: there is one schema for each type of file that can appear in an IDML package. To validate a package, you need to match each XML file with its appropriate schema file.
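The matching step can be sketched in a few lines of JavaScript. This is a minimal illustration, not the IDMLTools implementation: the function name schemaForPackageEntry is hypothetical, and the Resources/ and XML/ locations of the individual package files are assumed to mirror the schema file paths listed earlier.

```javascript
// Hypothetical helper: map a path inside an IDML package to the schema file
// (relative to the package schema folder) that should validate it.
function schemaForPackageEntry(entryPath) {
  // Files that appear exactly once in a package.
  var exact = {
    "designmap.xml": "designmap.rnc",
    "Resources/Fonts.xml": "Resources/Fonts.rnc",
    "Resources/Graphic.xml": "Resources/Graphic.rnc",
    "Resources/Preferences.xml": "Resources/Preferences.rnc",
    "Resources/Styles.xml": "Resources/Styles.rnc",
    "XML/BackingStory.xml": "XML/BackingStory.rnc",
    "XML/Mapping.xml": "XML/Mapping.rnc",
    "XML/Tags.xml": "XML/Tags.rnc"
  };
  if (Object.prototype.hasOwnProperty.call(exact, entryPath)) {
    return exact[entryPath];
  }
  // Directories that hold any number of files sharing one schema.
  var byDirectory = [
    [/^MasterSpreads\/[^\/]+\.xml$/, "MasterSpreads/MasterSpread.rnc"],
    [/^Spreads\/[^\/]+\.xml$/, "Spreads/Spread.rnc"],
    [/^Stories\/[^\/]+\.xml$/, "Stories/Story.rnc"]
  ];
  for (var i = 0; i < byDirectory.length; i++) {
    if (byDirectory[i][0].test(entryPath)) {
      return byDirectory[i][1];
    }
  }
  return null; // no schema is listed for this entry
}
```

Returning null for unlisted entries leaves it to the caller to decide whether an unexpected file in the package should be skipped or reported.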

Validating with IDMLTools

The InDesign CS4 Products SDK includes a Java package called IDMLTools. This package contains a validation application based on the Jing RelaxNG Validator, which handles both snippets and package files. It’s especially handy for package files, because it unzips the files and matches XML files to the appropriate schema file.
For information about setting up IDMLTools, see the IDML ReadMe. This amounts to the following:

Add the IDMLTOOLS_HOME environment variable. This should contain the path to your IDMLTools folder. (Do not terminate with a trailing \ or /.)

Once set up, you can validate files by running the appropriate platform script, validate.bat on Windows and validate.sh on Mac OS. These scripts set the appropriate Java classpath and run the validation application. The validation application can be used to validate both types of IDML files (snippets and package files). Running the platform scripts with no arguments produces the following usage message:

Validator SchemaPath PackagePath [PackagePath…]

This means you validate by specifying a path to the schema folder, followed by paths to one or more package files that you want to validate.

Validating a Snippet

To validate the test.idms snippet, specify the path to the snippet schema, followed by the path to the actual snippet:

validate.bat "c:\idml-schema\snippet" test.idms

The validation application writes errors to standard error. Here are the results from validating test.idms:

Notice that the XML file containing the error is reported on the left. In this case, the error is in the Spread_ubd.xml file. Because package validation deals with multiple XML files in one pass, error results can come from several files.

Interpreting the Results

From the results above, we can deduce the four errors:

The Spread element is missing a required attribute. Unfortunately, Jing does not report which attribute is missing, but it is easy enough to look at the schema or the Adobe InDesign CS4 IDML File Format Specification and determine that it is the Self attribute that is missing.

There is no “foo” attribute on the Rectangle element.

“RectData” is not a child of the Rectangle element.

Rectangle does not have a child element called “Propertie.” Looking at the schema, the Adobe InDesign CS4 IDML File Format Specification, and numerous other examples, we can conclude that this element should be called “Properties” (with an ‘s’).

Additional Error Detection

The IDMLTools validation application includes some non-RelaxNG-based error detection. Currently, it checks for the following errors:

Missing designmap.xml file.

Missing or improper processing instruction at the top of the designmap.xml file.

Missing package files included in designmap.xml; e.g., Spreads/Spread_ubd.xml in the following output:

Because Adobe distributes the source for IDMLTools, you can add further error detection of your own. You’ll find the code for these items in the Validator.preVerifyPackage() method.
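To give a feel for what such checks involve, here is a rough JavaScript sketch of the three checks listed above. It is not the IDMLTools code, which is Java. The function name preVerifyPackageSketch is hypothetical, the package is modeled as a plain object mapping entry paths to file text, and two details are assumptions about the package format rather than anything stated here: that the processing instruction target is aid, and that designmap.xml points to its files through src attributes. The sketch only tests that the processing instruction is present, not that its contents are proper.

```javascript
// Hypothetical sketch of pre-validation checks on an unzipped package.
// entries: object mapping package paths (e.g. "designmap.xml") to file text.
function preVerifyPackageSketch(entries) {
  var errors = [];
  var has = Object.prototype.hasOwnProperty;

  // Check 1: designmap.xml must exist; nothing else can be checked without it.
  if (!has.call(entries, "designmap.xml")) {
    errors.push("Missing designmap.xml file.");
    return errors;
  }
  var designmap = entries["designmap.xml"];

  // Check 2 (assumption): an <?aid ...?> processing instruction is present.
  if (!/<\?aid\s[^?]*\?>/.test(designmap)) {
    errors.push("Missing aid processing instruction in designmap.xml.");
  }

  // Check 3 (assumption): every src="..." in designmap.xml names a package file.
  var srcPattern = /\ssrc="([^"]+)"/g;
  var match;
  while ((match = srcPattern.exec(designmap)) !== null) {
    if (!has.call(entries, match[1])) {
      errors.push("Missing package file: " + match[1]);
    }
  }
  return errors;
}
```

An empty result means only that these three checks passed; schema validation of each file is still a separate step.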