On Namespaces


It so happens that my name is on the front page of
Namespaces in XML 1.0, a
technology which is pretty broadly disliked. Well, it seemed
like a good idea at the time. But I think we’ve learned some useful things
since then and can make some good consensus recommendations for people doing
this kind of thing, especially if they’re using JSON.

Problems of History ·
A good place to start brushing up on this would be with James Clark’s
recent
XML
Namespaces. James is authoritative on technology, but I’m going to
quibble with his take on history: “the argument for naming namespaces with
URIs is that
you can do a GET on the URI and get something human- or machine-readable back
that tells you about the semantics of the namespace.” I’m not sure. I recall
it more simply: namespaces needed to have names and
over at the Web consortium, it’s basic dogma that whenever you’re naming
things, you should name them using URIs unless there’s a good reason not
to. The benefits of using URIs are many and don’t necessarily include using
them to retrieve data.

In any case, the rest of James’ argument is well worth reading. I have
another approach I’d like to explore here, which is marginal for XML but real
interesting for JSON.

Do It Like Java ·
My Java classes have names like org.tbray.framer.Framer or
com.sun.cloud.VM, depending on the context. It’s very unlikely
they’ll collide with anything else. We really couldn’t adopt this approach
for XML, because the dot “.” had historically been allowed, and commonly used,
in SGML element types and attribute names. I bet that, if it hadn’t, we might
well have done that.

Especially for JSON ·
A lot of protocols these days format their messages in JSON, which makes
perfect sense if you’re not trying to interchange document-like things. These
messages tend to be dictionaries; here’s an example from the
Sun Cloud API:

In the Sun Cloud API, as in many others, the dictionary keys don’t use
any dots. And as in many others, there’s a
MustIgnore
policy. So, while I still have an action item to make this explicit, the
extensibility policy is obvious: use Java-style names with dots. So if I
wanted to add a Sun-specific proprietary extension to this particular resource
representation, it’d look like this:
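A minimal sketch of the idea in Python. The keys here (`name`, `run_status`, and the extension `com.sun.cloud.backup_policy`) are made up for illustration and are not taken from the real Sun Cloud API; the point is only that a MustIgnore client can skip dotted keys it doesn't recognize without breaking.

```python
import json

# Hypothetical resource representation; every key here is illustrative.
message = json.loads("""
{
  "name": "web01",
  "run_status": "RUNNING",
  "com.sun.cloud.backup_policy": "nightly"
}
""")

# Keys this (hypothetical) client understands.
KNOWN_KEYS = {"name", "run_status"}


def is_extension(key):
    """Under this convention, a dotted key marks a vendor extension."""
    return "." in key


def split_known(msg, known=KNOWN_KEYS):
    """MustIgnore: separate understood keys from ignored ones, never fail."""
    understood = {k: v for k, v in msg.items() if k in known}
    ignored = {k: v for k, v in msg.items() if k not in known}
    return understood, ignored


understood, ignored = split_known(message)
```

A client that has never heard of the extension carries on with `understood`; one that knows about `com.sun.cloud.backup_policy` just adds it to its known set.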

Contributions

This sounds good, but I'd guess there will be a need to easily port over namespaces from XML (e.g. a JSONified Atom) as well. Now all we need is a way to define link semantics in JSON and it'll be good to go. ...

I don't think this would completely solve the problem, because the biggest part of the problem is the definition of prefixes - as aliases for the longer actual namespace names - and their binding to the actual namespace. Or, somewhat equivalently, establishing the scope of each namespace alias.

In Java, as an example, you can import com.a.b.c.d.FancyThing, and then refer to it just as FancyThing, anywhere in the importing class file. Python has a similar ability, as do many other languages. It's this ease of establishing a readable, easily writable name whose scope is easy to understand that is missing in XML. Not saying there aren't good reasons for it, but still, there it is.
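To make the contrast concrete, here is a small Python sketch. The namespace URI and element names are invented for the example. The import alias `ET` is a short name with an obvious scope, whereas the XML prefix `f` is only a document-local alias: once parsed, the standard library's ElementTree reports each element under its expanded `{uri}local` name.

```python
# An import alias: short, readable, and scoped to this module.
from xml.etree import ElementTree as ET

# Hypothetical document; the prefix "f" is bound to an illustrative URI.
doc = ET.fromstring(
    '<f:thing xmlns:f="http://example.com/fancy"><f:part/></f:thing>'
)

# ElementTree drops the prefix and exposes the expanded name instead.
tags = [el.tag for el in doc.iter()]
```

Here `tags` holds `{http://example.com/fancy}thing` and `{http://example.com/fancy}part`; the prefix `f` is gone, so code working with the parsed tree has to deal with the long name.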

For the desktopcouch project, we define extensibility through an "application_annotations" dictionary in each JSON document, which is where apps should put data which is specific to them. So, a standard "contact" record might be:

{
  "name": "Stuart Langridge",
  "phone": "01 811 8055",
  "web": "http://www.kryogenix.org/"
}

and if Thunderbird, say, wanted to store thunderbird-specific data in that record, it'd be:
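A rough sketch of that layout in Python, under stated assumptions: the "Thunderbird" key and the `address_book` field inside it are hypothetical, chosen only to show app-specific data living under "application_annotations" while the standard fields stay untouched.

```python
contact = {
    "name": "Stuart Langridge",
    "phone": "01 811 8055",
    "web": "http://www.kryogenix.org/",
    # App-specific data lives here, keyed by application (hypothetical content).
    "application_annotations": {
        "Thunderbird": {"address_book": "personal"},
    },
}


def annotations_for(record, app):
    """Return one application's private data, or {} if it stored none."""
    return record.get("application_annotations", {}).get(app, {})


thunderbird_data = annotations_for(contact, "Thunderbird")
other_data = annotations_for(contact, "SomeOtherApp")
```

An application that doesn't know about Thunderbird never looks inside that sub-dictionary, so the two can't collide.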

On a technical level, your suggestion involves doing nothing, which I think is absolutely correct. You're admitting that everyone lives in a single global namespace, and proposing a convention for people to avoid stepping on each other's toes. Thank you for being in touch with reality.

From a practical standpoint, there are two approaches: either have names which unambiguously specify the vendor, thus ensuring no collisions but also possibly having lots of redundant keys with identical semantics and problems migrating to consistent ones, or simply trust everyone to use reasonably descriptive names and give them the responsibility of not stepping on each other's toes. Once there's a com.sun.java.lang.encoding.bitrate and a com.ibm.codec.bitrate, both of which mean the same thing, that's obviously a worse situation than if they'd both used the name 'bitrate' to begin with, but in the other case you have to worry about everybody getting the units the same for the bitrate key. In my experience, doing forward compatibility with a bitfield doesn't give enough room for people to avoid stepping on each other's toes, but having a dictionary with UTF-8 keys does. My preference is strongly towards simply expecting people to be grown-ups, because the alternative is trying to solve a problem which probably doesn't exist.

Trying to be too serious about these issues can result in some truly ludicrous results. For example, the mimetype for BitTorrent is application/x-bittorrent, because I was told that you're supposed to use x- for things which aren't 'real' standards, so I did, but from now until the end of time the Apache default config won't include it, because anything in x-* is by definition not a 'real' standard, and there's a strict policy that they only include 'real' standards in the default config. I'm going to switch mimetypes when pigs fly, because nobody on the planet seriously suggests that the mapping of .torrent to application/x-bittorrent isn't a clear de facto standard, and converting would be extremely painful, and at this point most real distributions have the Apache config file fixed, for that and a bunch of other reasons. The Apache people are very serious about this wankery though.

So I guess my point is that you should clearly make the statement 'This is a convention for avoiding stepping on each other's toes. It only applies to tags which are only for a single vendor. Please act like grown-ups and don't do anything stupid when coordinating tags which are shared between multiple vendors.'