DOM's Gotchas

Bill Venners: What are some of DOM's gotchas?

Elliotte Rusty Harold: Take namespaces, for example. There are two basic models for handling
namespaces in an XML API. In one model, you assign each element and
attribute a certain namespace, and you figure out where the namespace
declarations need to go when you serialize the document. In the other model,
you don't provide any special support for namespaces—you just treat the
namespaces as attributes. That also works, although it's harder on the end
user. DOM is the only API I know of that does both, simultaneously. DOM
requires client programmers to understand and use both models. Otherwise
they'll produce namespace-malformed documents, which is truly evil. DOM has
all the complexity of both approaches and the simplicity of neither.
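
To make the two models concrete, here is a minimal Java sketch using the JDK's bundled org.w3c.dom API. The class name NamespaceTwoModels and the SVG namespace are illustrative choices, not something from the interview. The element gets its namespace as a property through createElementNS, yet the matching xmlns:svg declaration still has to be added by hand as an ordinary attribute. Whether a given serializer would add a missing declaration on its own depends on the implementation, so treat this as an illustration of the double bookkeeping rather than a statement about any particular serializer.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NamespaceTwoModels {
    // Illustrative namespace; any URI would do.
    static final String SVG_NS = "http://www.w3.org/2000/svg";
    // Namespace URI reserved for xmlns declarations themselves.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    static Element buildRect() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Model 1: the element carries its namespace as a property.
        Element rect = doc.createElementNS(SVG_NS, "svg:rect");
        doc.appendChild(rect);

        // Model 2: the declaration is a plain attribute that the
        // client programmer adds explicitly.
        rect.setAttributeNS(XMLNS_NS, "xmlns:svg", SVG_NS);
        return rect;
    }

    public static void main(String[] args) throws Exception {
        Element rect = buildRect();
        System.out.println("namespace property: " + rect.getNamespaceURI());
        System.out.println("declaration attribute: " + rect.getAttribute("xmlns:svg"));
    }
}
```

Leaving out the setAttributeNS call still produces a tree whose element reports the right namespace URI, which is how the missing declaration can go unnoticed until the document is written out.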

There are a lot of other issues with DOM that stem from its cross-language
nature. For example, DOM defines exactly one exception,
DOMException, which has short type codes to
indicate which kind of exception it is. To a Java programmer, this is just plain
weird. Java programmers use many different exception classes, and never
use shorts for anything. When was the last time you used a
short in code? Have you ever? I don't think I've ever used
short, except when I was trying to demonstrate all the data
types. But using a short makes sense from a C or C++
programming perspective, where shorts are more common, and having many
exception types is not.
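
A short Java sketch shows what this looks like in practice. The class DomExceptionCodes and its describe helper are hypothetical names made up for the example; DOMException, its public short field code, and the constants are part of org.w3c.dom. The example assumes the JDK's default DOM implementation, which rejects an element name containing spaces.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

class DomExceptionCodes {
    // One exception class, so callers branch on the short code field
    // instead of writing separate catch blocks per exception type.
    static String describe(DOMException e) {
        switch (e.code) {
            case DOMException.INVALID_CHARACTER_ERR:
                return "invalid character in name";
            case DOMException.HIERARCHY_REQUEST_ERR:
                return "node not allowed here";
            case DOMException.WRONG_DOCUMENT_ERR:
                return "node belongs to another document";
            default:
                return "other DOM error, code " + e.code;
        }
    }

    static String tryBadName() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        try {
            // Spaces are not legal in an XML name.
            doc.createElement("not a name");
            return "no exception";
        } catch (DOMException e) {
            return describe(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tryBadName());
    }
}
```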

Some of the languages DOM needs to support, especially JavaScript, did not
support method overloading at the time DOM was invented. Therefore, DOM
could not have two methods such as createElement, one that
takes an element name and a namespace, and another that takes only a local
name. Instead, DOM has createElement, which takes
just the name, and createElementNS, which takes
both a namespace URI and a qualified name. There are many non-overloaded methods in
the DOM API that, to any Java or C++ programmer, should clearly be
overloaded.
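
The two methods are not interchangeable, which a small Java sketch can show. The class name CreateElementVariants and the example.com namespace are assumptions chosen for illustration. An element made with the namespace-unaware createElement reports null for its local name, while the one made with the namespace-aware method reports a local name, prefix, and namespace URI.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class CreateElementVariants {
    // Hypothetical namespace used only for this example.
    static final String EX_NS = "http://example.com/ns";

    static Document newDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().newDocument();
    }

    // Namespace-unaware creation: only a name is supplied.
    static Element plain() throws Exception {
        return newDocument().createElement("para");
    }

    // Namespace-aware creation: a namespace URI plus a qualified name.
    static Element namespaced() throws Exception {
        return newDocument().createElementNS(EX_NS, "ex:para");
    }

    public static void main(String[] args) throws Exception {
        Element a = plain();
        Element b = namespaced();
        System.out.println(a.getNodeName() + " localName=" + a.getLocalName());
        System.out.println(b.getNodeName() + " localName=" + b.getLocalName()
                + " prefix=" + b.getPrefix() + " ns=" + b.getNamespaceURI());
    }
}
```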

There are several other DOM design decisions that confuse people.
For example, DOM trees do not allow nodes to be
detached from their parent document. Only the document that created the
node is allowed to hold the node. Also, DOM's DocType
object is read-only. Why? I can't explain these design decisions. I just know that
they are painful when you're actually trying to get work done with DOM.
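
The ownership rule can be sketched in Java as follows. NodeOwnership and its helper methods are hypothetical names for this example, and the behavior shown assumes the JDK's default DOM implementation with error checking enabled. Appending a node created by one document to another document fails with WRONG_DOCUMENT_ERR; the usual workaround is importNode, which makes a copy owned by the target document.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeOwnership {
    static Document newDocument() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
    }

    // Returns the DOMException code raised by a cross-document append,
    // or 0 if no exception was thrown.
    static short appendForeignNode() throws Exception {
        Document source = newDocument();
        Document target = newDocument();
        Element item = source.createElement("item");
        try {
            target.appendChild(item);
            return 0;
        } catch (DOMException e) {
            return e.code;
        }
    }

    // Copies the node into the target document first, then appends the copy.
    static boolean appendImportedCopy() throws Exception {
        Document source = newDocument();
        Document target = newDocument();
        Element item = source.createElement("item");
        Node copy = target.importNode(item, true);
        target.appendChild(copy);
        return copy.getOwnerDocument() == target
                && target.getDocumentElement() == copy;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("direct append code: " + appendForeignNode());
        System.out.println("import then append ok: " + appendImportedCopy());
    }
}
```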