Converting SAE J2008 to an XML DTD Using Near & Far Designer 3.0

XML DTD Design/Conversion Tool

Considerable time and resources have been spent in every vertical industry
to develop industry standard SGML Document Type Definitions, or DTDs. The
aerospace, defense, automotive, telecommunications, semiconductor, railroad,
health, scholarly journal and newspaper industries each have their own SGML
DTDs. These DTDs set the rules by which data is coded for interchange. Now
there is a great interest in providing industry standard XML DTDs to facilitate
Web delivery of richly encoded data to the desktop. Because industry DTDs
encode complex data constructs for a vertical industry, these DTDs tend to be
quite complex. A semi-automated method of converting existing DTDs from SGML to
XML will not only save time and resources, but can also help each industry make
the transition to Web delivery as quickly as possible.

SAE J2008 DTD for the Automotive and Trucking Industries

In the automotive industry, the development of the SAE J2008 suite of
standards was a direct response to the requirements of the 1990 Clean Air
Act. Paragraph 202(m)(5) of the Act addresses the requirement for Information
Availability and tasks automotive manufacturers to "provide any and
all information needed to make use of emission control diagnostics systems
including instructions for making emissions related diagnostics and repair."

Availability of vehicle service information is the key element to effective
automotive diagnosis and repair. And effective diagnosis and repair is
believed to have a direct impact on air quality. Studies have shown that
automotive technicians will only use proper service procedures if information
access is fast and easy. If information is not readily available, alternate
service procedures (which may be less effective) will be employed. Rather than
develop information systems specific to emission defects, the mission of SAE
J2008 was broadened to accommodate all other vehicle service information as
well. In 1997, the mission further expanded to include service information for
all on-road vehicles including heavy trucks and construction equipment.

The SAE J2008 Task Force studied a variety of information modeling and
exchange methodologies. Because there is no standardization in the industry in
terms of automotive service document specifications, work began with the
development of a relational data model. The SGML definition was based on the
data model rather than being based on any particular type of service manual.

Converting SAE J2008 to XML

During 1997, the Draft DTD within SAE J2008 was updated for presentation as
a final SAE Standard. Massive changes were made to the DTD in order to support
the addition of heavy truck data and to make the DTD reflect the Data Model in
the most concise manner possible. Near & Far was used extensively to
create this new version of SAE J2008. All members of the SGML Working Group
were experienced SGML designers, so the tool was used in the SGML Symbology
mode. Using Near & Far Designer, the group could prototype alternate
models quickly and easily. The once-daunting task of recording hand-drawn
structure charts in SGML and parsing them became an automatic function using
the Microstar tool. SGML Working Group members agree that the DTD could not have
been so completely reworked in such a short amount of time without employing
Near & Far®.

It is important to note that the version of SAE J2008 that will be balloted
late in 1998 is not an XML DTD. Due to time constraints, this first formal
version of the standard contains an SGML-compliant DTD. The SGML Declaration,
however, was updated to be XML compatible because there is interest in being
able to directly deliver XML-compliant data to desktop browsers.

In order to develop an XML DTD for SAE J2008, the existing SGML DTD must be
converted. Microstar's Near & Far Designer® 3.0 was used to help
automate this activity. This new version of Near & Far incorporates XML
into the familiar Microstar SGML DTD design tool. Implementers can use the tool
to create a new XML DTD using a graphical user interface. But more
interesting to those working with SAE J2008 is the ability of the tool to
assist with the conversion from an SGML DTD to an XML DTD.

Automatic Conversions from SGML to XML

XML is a narrow profile of SGML. It has a concrete syntax prescribed by the
XML SGML Declaration which has become the standard syntax for Web delivery of
SGML data. In addition, the syntax of declarations within the DTD has been
limited to assure creation of well-formed, self-describing documents.

Near & Far Designer® 3.0 automates the conversion of one-for-one
differences between SGML and XML. These conversions are straightforward and are
performed when a user selects the "Convert to XML" option in the
Tools pull-down menu.

Figure 1. Convert to XML Pull-Down Menu

Automatic conversion routines include:

XML Declaration: An SGML DTD simply begins with a <!DOCTYPE
declaration. An XML DTD begins with the following declaration, which identifies
it as XML and specifies its version and encoding:

<?xml version="1.0" encoding="ISO-8859-1"?>

Omitted Tag Minimization Rules: In SGML you can omit start or end tags.
In XML, tag omission is not permitted. So SGML DTDs which indicate tag omission
must be edited to eliminate the minimization field altogether.

Grouped Element and Attribute Declarations: In SGML one can group element
or attribute type declarations. In XML, each element type and attribute list
must have its own separate declaration. SGML DTDs must be edited to create
individual declarations where groupings occurred in the SGML DTD.

In-Line Comments Not Allowed: In SGML you can add comments inside any of
the declarations. In XML DTDs, all comments must stand alone. So SGML DTDs
must be edited to create standalone comments in place of comments embedded
inside other declarations.

Quoted Default Attribute Values: In SGML there is an option for how a
default attribute value is specified. In XML, all default attribute values must
be surrounded by a pair of quotes.

Parameter Entity Specifications: In SGML the semicolon which follows the
name of a parameter entity is optional. In XML the semicolon is required. SGML
DTDs must be edited to assure the semicolon always follows the use of a
parameter entity.

Syntax for Conditional Statements: XML does not allow spaces on either
side of the keyword that introduces a conditional statement. These spaces must
be removed from SGML DTDs.
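
Several of the conversions listed above can be illustrated with a short
sketch. The element and attribute names below are hypothetical and are not
taken from SAE J2008, and emph and xref are assumed to be declared elsewhere.
The SGML fragment uses a grouped element declaration, a grouped attribute
declaration, tag minimization fields, an in-line comment, an unquoted
attribute default, and a parameter entity reference without a semicolon:

```dtd
<!ENTITY % inline "emph | xref">
<!ELEMENT (step | note) - O (#PCDATA | %inline)* -- end tag may be omitted -->
<!ATTLIST (step | note) status (draft | final) draft>
```

Assuming each rule is applied as described, an equivalent XML DTD fragment
might look like this:

```dtd
<?xml version="1.0" encoding="ISO-8859-1"?>
<!ENTITY % inline "emph | xref">
<!-- step and note: the SGML end tag could be omitted; XML requires it -->
<!ELEMENT step (#PCDATA | %inline;)*>
<!ELEMENT note (#PCDATA | %inline;)*>
<!ATTLIST step status (draft | final) "draft">
<!ATTLIST note status (draft | final) "draft">
```

Note that the in-line comment has become a standalone comment, and that the
single grouped declaration has become one declaration per element type.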

Assisted Conversion of Attribute Values and Defaults

In XML, a number of SGML attribute values (declared types) are not allowed.
These restrictions were implemented so that XML-coded data could be "self
describing and well-formed". In XML, only CDATA, NMTOKEN, NMTOKENS, ID,
IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, and enumerated lists of name tokens
are allowed as attribute values. For defaults, only declared defaults,
#REQUIRED, #FIXED, and #IMPLIED are allowed.

Specific SGML attribute values that are forbidden in XML include:

NAME and NAMES

NUMBER and NUMBERS

NUTOKEN and NUTOKENS

And these SGML attribute defaults are not allowed in XML DTDs:

#CURRENT

#CONREF

Clearly, to convert from SGML to XML DTDs, we need to review the attribute
values and defaults and change them to acceptable XML attribute values. This is
not an automatic one-to-one mapping as were the conversions discussed in the
previous section. However, this conversion can be automated once a mapping has
been established.

Near & Far Designer® 3.0 enables us to specify standard SGML-to-XML
mappings for attribute values and defaults using the "Tools"
pull-down menu. Simply select "Options" and then "XML".
At this point you can use check boxes to indicate replacements you wish to make
automatically. For example, you can select "Replace NUTOKENS with
NMTOKENS" or you can select "Replace NAMES with NMTOKENS."
For the conversion of SAE J2008 to XML, the standard replacements suggested by
check boxes in the XML menu were used.
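
As a minimal sketch of such a mapping, consider the hypothetical SGML
attribute list below (the names are illustrative and not taken from SAE
J2008):

```dtd
<!ATTLIST fastener
          sizes   NUTOKENS  #REQUIRED
          tools   NAMES     #IMPLIED
          torque  NUMBER    #IMPLIED
          grade   NAME      #CURRENT>
```

The NUTOKENS and NAMES replacements are the ones named by the check boxes
above. Mapping NUMBER and NAME to NMTOKEN, and #CURRENT to #IMPLIED, are
assumptions made for this example only; the choices actually offered by the
tool may differ.

```dtd
<!ATTLIST fastener
          sizes   NMTOKENS  #REQUIRED
          tools   NMTOKENS  #IMPLIED
          torque  NMTOKEN   #IMPLIED
          grade   NMTOKEN   #IMPLIED>
```

NMTOKEN is less restrictive than NUMBER or NAME, and #IMPLIED does not carry
forward the most recently specified value as #CURRENT does, so any constraint
or behavior that depended on the SGML declarations would have to be enforced
by the application.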

Assisted Conversion of Declared Content

In XML, the SGML declared content CDATA and RCDATA are not allowed within
content models in the DTD. Again, a one-to-one mapping from SGML to XML does
not exist. So conversion cannot proceed automatically. However, as was the case
with attribute values and defaults, this conversion can be automated once a
mapping has been established.

Near & Far Designer® 3.0 enables us to specify standard SGML-to-XML
mappings for declared content using the "Tools" pull-down menu.
Simply select "Options" and then "XML". At this point you
can use check boxes to indicate replacements you wish to make automatically.
For SAE J2008, any CDATA and RCDATA specifications were directly replaced with
#PCDATA.
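
A minimal sketch of this replacement, using hypothetical element names rather
than actual SAE J2008 declarations, starts from SGML declared content:

```dtd
<!ELEMENT partno  - - CDATA>
<!ELEMENT rawcode - - RCDATA>
```

After the replacement, both elements are declared with #PCDATA:

```dtd
<!ELEMENT partno  (#PCDATA)>
<!ELEMENT rawcode (#PCDATA)>
```

Because #PCDATA is parsed, any "<" or "&" characters in the content of these
elements must be escaped (for example as &lt; and &amp;) or enclosed in a
CDATA section in the XML data.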

Completing the Conversion

In addition to the conversion items which you can automate with Near &
Far Designer® 3.0, certain issues will remain which cannot be resolved
either with a one-for-one replacement or with user-defined mappings. In these
cases Near & Far Designer® 3.0 assists you by providing a list of
discrepancies:

Figure 6. Near & Far Designer XML Conversion Report

For SAE J2008, the remaining errors fell into several classes which will be
described in the following sections. Types of errors included:

The AND Connector Not Allowed: In XML, the AND connector (&) is not
allowed in content models. Some other way to model this content must be
developed in order to convert from SGML to XML.

Inclusions Not Allowed: In XML, inclusions (elements which can float
anywhere within another element) are not allowed. Some other way to model this
content must be developed in order to convert from SGML to XML.

Exclusions Not Allowed: In XML, exclusions (elements which are banned
from within certain elements) are not allowed. Some other way to model
this content must be developed in order to convert from SGML to XML.

System ID Required: In XML, a system ID is required whenever an external
entity is declared. To convert from an SGML DTD to an XML DTD, any missing
system ID must be added.

Not only does Near & Far Designer® 3.0 identify discrepancies
between your SGML DTD and an XML-compliant DTD, but by simply clicking on each
error message, you can quickly link to the exact location. At that point you
are ready to resolve the discrepancy and move on to the next work item.

Eliminating Inclusions

Inclusion exceptions are allowed in SGML DTDs, but not in XML DTDs. In XML
it is expected that each content model be precisely declared. Eliminating
inclusion exceptions is a relatively easy task if the inclusion falls in the
terminal node of a DTD (at the #PCDATA level). Because of the certainty of
white space handling, mixed content can be used to eliminate inclusions.
Rather than using an inclusion, the same effect can be achieved with mixed
content.

Figure 7. Eliminating Inclusions at Paragraph Level

Fortunately, in SAE J2008 inclusions do not occur at a high level. In
fact, they occur only at the #PCDATA level, so eliminating inclusions in SAE
J2008 was a relatively simple task. All inclusions were placed in a mixed
content OR group with #PCDATA, as is allowed by the XML standard.
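
The replacement can be sketched as follows. The element names are
hypothetical and do not reproduce the actual SAE J2008 declarations. In SGML,
an inclusion exception follows the content model:

```dtd
<!ELEMENT para - - (#PCDATA) +(ftnote | xref)>
```

In XML, the included elements are moved into a mixed content OR group with
#PCDATA:

```dtd
<!ELEMENT para (#PCDATA | ftnote | xref)*>
```

This direct rewrite is only equivalent when the inclusion sits at the #PCDATA
level. An SGML inclusion declared higher in the structure also applies to
every element nested inside, so each affected content model would have to be
revised individually.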

Eliminating AND Connectors

In XML, AND connectors (&) are not allowed. AND connectors are used to
specify that elements may occur in any order. When AND connectors are used to
connect two or three elements, the number of possible element combinations is
not significant. However, when the AND connector is used with a large group of
elements, the possible element combinations become staggering. AND was
eliminated from XML to promote simpler, more precise data models. To convert
from SGML to XML DTDs, AND connectors must be eliminated.

In SAE J2008, AND connectors were never used, so no conversion was required.
Figure 8 shows how content declared with AND connectors can be remodeled should
that be required for XML DTD conversion.

Figure 8. Eliminating the AND Connector in Content Models
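
As a minimal sketch (with hypothetical element names, not a reproduction of
the figure), a two-element AND group in SGML:

```dtd
<!ELEMENT titlepg - - (title & author)>
```

can be rewritten in XML by listing each permitted order explicitly:

```dtd
<!ELEMENT titlepg ((title, author) | (author, title))>
```

With three elements there are already six possible orders, which is why large
AND groups are usually replaced either by a single fixed sequence, such as
(title, author), or by a looser repeatable OR group, such as
(title | author)*. Both alternatives change what the DTD accepts, so the
choice is a design decision rather than a mechanical conversion.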

Eliminating Exclusions

Exclusion exceptions are allowed in SGML DTDs, but not in XML DTDs. In XML
it is expected that each content model be precisely declared. Eliminating
exclusions can involve making some hard design decisions. First let's look at
a valid exclusion. In this model a paragraph is either text or footnotes
(mixed content). A footnote is defined as being either text or paragraphs.
But in this model if we put a paragraph inside a footnote, we also allow a
footnote (which can be inside a paragraph) inside a footnote. So to prevent a
footnote from falling within a footnote, we use an exclusion to say that a
footnote cannot occur within a footnote. This sort of exclusion is not allowed
in XML.

Figure 9. Eliminating Exclusions

Handling exclusions is usually not straightforward. One solution would be
to simply ignore specification of the exclusion and to assume that good
authoring practice would prevent a footnote from happening within a footnote.

The second, more precise solution is to give the elements within the
structure where the exclusion is specified a unique (usually fully qualified)
name. With a unique name, <ftnote.para> can have a unique content model
which does not allow a footnote to occur. This solution is clearly
quite precise, but it is not upwardly compatible with the original SGML DTD.
It also adds new tags which users must learn to use. With this solution,
existing SGML data must be transformed to conform to the XML DTD before it can
be delivered as XML-coded data on the Web.
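
The paragraph and footnote model described above can be sketched as follows.
The declarations are illustrative and are not taken from SAE J2008. In SGML,
the exclusion exception follows the content model of the footnote:

```dtd
<!ELEMENT para   - - (#PCDATA | ftnote)*>
<!ELEMENT ftnote - - (#PCDATA | para)* -(ftnote)>
```

In XML, the second solution introduces a uniquely named paragraph element
whose content model simply never mentions the footnote:

```dtd
<!ELEMENT para        (#PCDATA | ftnote)*>
<!ELEMENT ftnote      (#PCDATA | ftnote.para)*>
<!ELEMENT ftnote.para (#PCDATA)>
```

Existing data that uses <para> inside <ftnote> would need those paragraphs
renamed to <ftnote.para> in order to be valid against the XML DTD.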

For SAE J2008, new elements with unique names were developed to eliminate
exclusions. For example, new elements were developed to prevent attentions from
occurring within other attentions and to prevent tables and figures from
occurring within tables. At times these models became quite complex. See
Figure 10.

Figure 10. Eliminating Exclusions in SAE J2008

Adding System File IDs

In XML, system file IDs are required for external entities. The system file
ID is usually a host-specific file name.

<!ENTITY x33445 SYSTEM "file://c:/graphics/x3345.tif">

To make this task easier, Near & Far Designer® 3.0 notifies us
whenever such system file IDs must be added.

Summary

Near & Far Designer® 3.0 was specially designed to help make the
transition from SGML to XML as smooth and straightforward as possible. Near &
Far Designer® 3.0 can evaluate any valid SGML DTD and interactively
convert all mappings that are one-for-one. It will also highlight any
remaining discrepancies, evaluate end user resolutions, and complete the
transformation from an SGML DTD to an XML DTD, taking all guesswork out of
this task. Near & Far Designer® 3.0 was designed to enable
organizations to make the transition from SGML to XML in a cost- and
resource-effective manner.

Following the transition from SGML to XML, the graphical interface of Near &
Far Designer® 3.0 makes the ongoing creation of XML DTDs an easy task in
the future. Designer now offers the document analyst the choice of creating
either new SGML DTDs or XML DTDs directly.