Minimal XML 1.0

Version: 2000-04-11

Editors:

Don Park (Docuverse) 
donpark@docuverse.com

[Ed: Add names of top N contributors to SML-DEV archive]

1. Introduction

Minimal XML is a subset of XML 1.0 that includes the features
essential for data interchange applications and excludes non-essential
features that are arcane, legacy-related, problematic for data interchange
applications, or redundant.

1.2 Goals

·A subset that allows easily implemented parsers that
are much faster and smaller than full XML parsers.

·A subset with a simpler information model that can easily
be mapped to other information models.

[Ed: [?] is a link to
an entry in the Minimal XML FAQ which explains the rationale for removing the
feature. We need to start collecting the
rationales scattered in the message archive. Volunteers?]

2. Character Encoding

Minimal XML documents must be encoded in either UTF-8 or
UTF-16. Minimal XML parsers must
support both UTF-8 and UTF-16 character encoding formats. [ ? ]
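
The requirement above does not say how a parser should tell the two encodings apart. The following is a minimal sketch in Python, assuming that UTF-16 input starts with a byte order mark and that anything else is UTF-8. That detection rule and the name decode_document are assumptions made for illustration; they are not part of this draft.

```python
import codecs


def decode_document(data):
    """Decode raw bytes as UTF-16 or UTF-8 (hypothetical helper).

    Assumption: UTF-16 input carries a byte order mark. Input without
    a UTF-16 byte order mark is treated as UTF-8.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the byte order mark to pick the
        # byte order, then drops it from the decoded text.
        return data.decode("utf-16")
    # "utf-8-sig" accepts UTF-8 with or without a leading byte order mark.
    return data.decode("utf-8-sig")
```

A parser that must also accept UTF-16 without a byte order mark would need a different rule, for example inspecting the bytes of the first '<' character.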

3. Syntax

3.1 Documents

[1] document ::= WS* element WS*

A Minimal XML document contains one or more elements. There is exactly one element, called the root,
or document element, which is not contained within another element. White space surrounding the root is not
reported. Minimal XML parsers must be
able to parse multiple documents in a single stream or file.

The following example shows a file containing two documents
whose document elements are <logentry>:

<logentry>

<timestamp><date>2000/03/26</date><time>10:10</time></timestamp>

<who>syslogd</who><what>startup</what>

</logentry>

<logentry>

<timestamp><date>2000/03/26</date><time>10:20</time></timestamp>

<who>syslogd</who><what>shutdown</what>
</logentry>
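
One way to separate the documents in such a stream is to track element nesting depth and cut whenever the depth returns to zero. The Python sketch below illustrates the idea. The name split_documents is hypothetical, and the function is deliberately loose: it does not check that start and end tag names match, nor that only white space appears between documents.

```python
import re

# Matches a start tag or an end tag, following productions [3], [4], and [6].
TAG = re.compile(r"<(/?)([^<>&/]+)>")


def split_documents(stream_text):
    """Yield the text of each top-level element found in stream_text."""
    depth = 0
    start = None
    for match in TAG.finditer(stream_text):
        if match.group(1) == "":
            # Start tag: remember where a new document begins.
            if depth == 0:
                start = match.start()
            depth += 1
        else:
            # End tag: a document is complete when depth returns to zero.
            depth -= 1
            if depth < 0:
                raise ValueError("end tag without matching start tag")
            if depth == 0:
                yield stream_text[start:match.end()]
    if depth != 0:
        raise ValueError("unclosed element at end of stream")
```

Each yielded string can then be handed to an element parser on its own.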

3.2 Elements

[2] element ::= STag content ETag

[3] STag ::= '<' Name '>'

[4] ETag ::= '</' Name '>'

[TBD]

The following example shows an element with its start tag (<who>), end tag (</who>), and content
(syslogd):

<who>syslogd</who>

3.3 Element Contents

[5] content ::= (element | WS)* | (CharData | CharRef)*

Element content must be either:

·a sequence of elements with optional white spaces
between elements

·a sequence of character data and character references

Mixed content is not supported. White spaces surrounding elements are not reported.

The following example shows an element (<timestamp>) with two
child elements (<date> and <time>), each of which contains
character data (2000/03/26 and 10:20):

<timestamp><date>2000/03/26</date><time>10:20</time></timestamp>
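
The content rule in production [5] can be sketched as a small recursive-descent parser. The Python code below is an illustration only, not a reference implementation. The function name parse_element and the (name, content) tuple representation are assumptions, and content consisting solely of white space is treated here as character data, a case the grammar leaves ambiguous.

```python
import re

NAME = re.compile(r"[^<>&/]+")        # production [6]
CHARREF = re.compile(r"&#([0-9]+);")  # production [8]
WS = " \t\r\n"                        # production [9]


def parse_element(text, pos=0):
    """Parse one element at text[pos]; return ((name, content), next_pos).

    content is a string (character data) or a list of child nodes.
    """
    if not text.startswith("<", pos):
        raise ValueError(f"expected '<' at offset {pos}")
    m = NAME.match(text, pos + 1)
    if m is None or not text.startswith(">", m.end()):
        raise ValueError(f"malformed start tag at offset {pos}")
    name, pos = m.group(), m.end() + 1
    etag = f"</{name}>"

    # Skip white space to see which branch of production [5] applies.
    probe = pos
    while probe < len(text) and text[probe] in WS:
        probe += 1
    if text.startswith("<", probe) and not text.startswith("</", probe):
        # Element content: child elements separated by optional white space.
        content, pos = [], probe
        while not text.startswith("</", pos):
            child, pos = parse_element(text, pos)
            content.append(child)
            while pos < len(text) and text[pos] in WS:
                pos += 1
    else:
        # Character content: character data and character references.
        chars = []
        while pos < len(text) and text[pos] != "<":
            ref = CHARREF.match(text, pos)
            if ref:
                chars.append(chr(int(ref.group(1))))
                pos = ref.end()
            elif text[pos] in "&>":
                raise ValueError(f"illegal character at offset {pos}")
            else:
                chars.append(text[pos])
                pos += 1
        content = "".join(chars)
    if not text.startswith(etag, pos):
        raise ValueError(f"expected {etag} at offset {pos}")
    return (name, content), pos + len(etag)
```

Because each branch accepts only its own kind of content, mixed content such as <a>text<b>x</b></a> is rejected with a ValueError instead of being silently accepted.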

3.4 Element Names

[6] Name ::= [^<>&/]+

In Minimal XML, element names that do not satisfy the XML
1.0 Name production are reserved. Element
names starting with the underscore character ('_')
are also reserved. Use of the
colon character (':') is reserved for namespace mechanisms.

[Add Example]

3.5 Character Data

[7] CharData ::= [^<>&]

Character data may not contain <, > or, & in
literal form.
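
A writer therefore has to replace these three characters before emitting character data. A minimal sketch in Python, using the decimal character references of production [8] (the name escape_char_data is hypothetical):

```python
def escape_char_data(text):
    """Replace <, >, and & with decimal character references."""
    replacements = {"<": "&#60;", ">": "&#62;", "&": "&#38;"}
    # Each character is handled once, so the '&' inside an inserted
    # reference is never escaped a second time.
    return "".join(replacements.get(ch, ch) for ch in text)
```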

[Add Example]

3.6 Character References

[8] CharRef ::= '&#' [0-9]+ ';'

[TBD]

The following example shows three character references
representing the three reserved characters (<, >, &):

Character data may not
contain &#60;, &#62;, or &#38; in literal form.

3.7 White Spaces

[9] WS ::= (#32 | #9 | #13 | #10)

Space, tab, carriage return, and newline characters are
considered to be white spaces in Minimal XML. White spaces surrounding elements are not reported.
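
Many languages use a broader notion of white space than production [9]. In Python, for instance, str.strip() with no argument also removes form feeds and other characters, so a parser should name the four characters explicitly. A small sketch (the helper names are hypothetical):

```python
# Exactly the four characters of production [9]: #32, #9, #13, #10.
WS_CHARS = " \t\r\n"


def is_white_space(text):
    """Return True if every character in text is Minimal XML white space.

    An empty string yields True, since it contains no other characters.
    """
    return all(ch in WS_CHARS for ch in text)


def trim_document(text):
    """Drop the unreported white space around a document element."""
    return text.strip(WS_CHARS)
```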

[Add Example]

4. Information Model

[Ed: Below is a Grove-like version of our information model
where name, value, and colors are unified into properties. Just experimenting to see if it comes out
clear. Unfortunately, I think it reads
more like Zen mumbo-jumbo.]

4.1 Nodes

A node has one or more properties.

4.2 Properties

A property is a node whose nodeName is its property name.

4.3 nodeName Property

The nodeName property is a node whose nodeName is nodeName and
whose nodeValue is the element name.

4.4 nodeValue Property

The nodeValue property is a node whose nodeName is nodeValue
and whose value is either a string or a list of nodes.
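
One possible reading of this model, sketched in Python, stores the two properties as plain fields and builds the property nodes on demand. The class name Node, its field names, and the properties method are assumptions made for illustration. The sketch flattens the self-referential definitions above and does not settle how the model should be implemented.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    node_name: str                         # element name or property name
    node_value: Union[str, List["Node"]]   # a string or a list of nodes

    def properties(self):
        """Return this node's two properties, each itself a Node."""
        return [
            Node("nodeName", self.node_name),
            Node("nodeValue", self.node_value),
        ]
```

Under this reading, the element <who>syslogd</who> becomes Node("who", "syslogd"), whose nodeName property has the value who and whose nodeValue property has the value syslogd.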