Summary
Wouldn't it be nice if there existed a standard for XML when you don't need the whole thing? I propose a strict subset of XML called XML minus minus (XML--).

Often in my work I find that XML would be very handy (e.g. for configuration files or data serialization), but that I need only a fraction of the specification. In these cases it doesn't make sense to embed a gargantuan, industrial-strength XML parser in my code, so I usually use a homegrown partial XML parser.

Inventing a brand-new markup language is one possibility I have explored (such as Labelled S-Expressions), but to be honest I don't think the idea will take off. Many markup languages exist, and people prefer things that are already well known (and well marketed), such as XML.

I want to propose a specification for a strict subset of XML called XML--. The specification is the same as that for XML 1.0 (third edition) but with the following restrictions:

no attributes

no CDATA sections

no processing instructions

no document type definitions

the standalone document declaration MUST have the value "yes"

the encoding must be UTF-8

only the predefined entities are supported: &amp; &gt; &lt; &quot; and &apos;
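To give a feel for how little code the restrictions above leave to write, here is a rough Python sketch of an XML-- parser. It is only an illustration under stated assumptions, not a reference implementation: the names `parse`, `unescape` and `TOKEN` are made up for this example, a missing standalone or encoding declaration is accepted, whitespace-only text is dropped, numeric character references are rejected along with other non-predefined references, and element names are limited to a simplified pattern.

```python
import re

# One alternative per token kind. Anything else (attributes, CDATA sections,
# processing instructions, DOCTYPE) fails to match and is rejected.
TOKEN = re.compile(
    r"<!--(?P<comment>.*?)-->"
    r"|<\?xml(?P<decl>\s[^?]*)\?>"
    r"|</(?P<close>[A-Za-z_][\w.\-]*)\s*>"
    r"|<(?P<empty>[A-Za-z_][\w.\-]*)\s*/>"
    r"|<(?P<open>[A-Za-z_][\w.\-]*)\s*>"
    r"|(?P<text>[^<]+)",
    re.DOTALL,
)
ENTITIES = {"amp": "&", "gt": ">", "lt": "<", "quot": '"', "apos": "'"}


def unescape(text):
    """Expand the five predefined entities; reject any other use of '&'."""
    def repl(match):
        ref = match.group(1)
        if not ref.endswith(";") or ref[:-1] not in ENTITIES:
            raise ValueError("unsupported entity reference: &" + ref)
        return ENTITIES[ref[:-1]]
    return re.sub(r"&([^;\s]*;?)", repl, text)


def parse(source):
    """Return the root as (name, children); children are str or (name, children)."""
    root, stack, pos = None, [], 0   # stack holds the currently open elements
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError("markup not allowed in XML-- at offset %d" % pos)
        kind, value = m.lastgroup, m.group(m.lastgroup)
        start, pos = pos, m.end()
        if kind == "comment":
            continue
        if kind == "decl":
            # Assumption: a missing standalone/encoding pseudo-attribute is fine.
            decl = dict(re.findall(r"(\w+)\s*=\s*[\"']([^\"']*)[\"']", value))
            if start != 0:
                raise ValueError("XML declaration must come first")
            if decl.get("standalone", "yes") != "yes":
                raise ValueError('standalone must be "yes"')
            if decl.get("encoding", "UTF-8").upper() != "UTF-8":
                raise ValueError("encoding must be UTF-8")
        elif kind == "text":
            if value.strip():        # assumption: ignore whitespace-only text
                if not stack:
                    raise ValueError("text outside the root element")
                stack[-1][1].append(unescape(value))
        elif kind == "close":
            if not stack or stack.pop()[0] != value:
                raise ValueError("mismatched closing tag: " + value)
        else:                        # "open" or "empty"
            node = (value, [])
            if stack:
                stack[-1][1].append(node)
            elif root is None:
                root = node
            else:
                raise ValueError("more than one root element")
            if kind == "open":
                stack.append(node)
    if stack or root is None:
        raise ValueError("unclosed element or empty document")
    return root
```

Because an element is just a name plus a list of children, the whole tree fits in nested tuples and lists, and an input such as `<a b="c"/>` is rejected simply because no token pattern matches it.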

This work is inspired by TinyXML. The main difference, though, is that XML-- does not support attributes. What more could be done to turn this idea into a viable specification with actual users?

Postscript: Justification for Dropping Attributes

I should add a quick justification. Attributes are dropped for several reasons:

speeds up parsing significantly

reduces complexity of the parser

speeds up tree-building phases

simplifies internal representations of the document considerably

No loss of information needs to occur because an XML element such as:

<mytag attribute_name="attribute_value"/>

can be rewritten trivially as:

<mytag>
<attribute_name>attribute_value</attribute_name>
</mytag>

About the Blogger

Christopher Diggins is a software developer and freelance writer. Christopher loves programming, but is eternally frustrated by the shortcomings of modern programming languages. As would any reasonable person in his shoes, he decided to quit his day job to write his own (www.heron-language.com). Christopher is the co-author of the C++ Cookbook from O'Reilly. Christopher can be reached through his home page at www.cdiggins.com.