JDOM Uses Java Collections

Bill Venners: You asked, "Is JDOM too Java centric?"

Elliotte Rusty Harold: When JDOM was designed, Brett and Jason
said, we're going to go whole hog. We're not going to invent a separate
NodeList class, like DOM does. We're going to use the Java
Collections API. We're not going to have a cloneNode method
like DOM does. We're going to use the Java clone method.
We're going to implement Serializable, because good Java
classes implement Serializable. We're going to implement
Cloneable.
We're going to have equals and hashCode methods, all the nice, normal things Java
programmers have learned to love. The problem is, five or six years down the road, we've
learned that some of those things aren't so nice. The Cloneable interface is a disaster.
Joshua Bloch talks about this in Effective Java, and flat out recommends that people ignore it and
implement their own copy constructors instead, just because Cloneable is so poorly designed.
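
A copy constructor of the kind mentioned above might look like the following minimal sketch. The Attribute class here is a hypothetical stand-in written for illustration, not a JDOM class; it only shows the idea of an ordinary constructor that duplicates another instance without involving Cloneable.

```java
// Hypothetical class for illustration; not part of JDOM.
class CopyConstructorDemo {

    static final class Attribute {
        private final String name;
        private String value;

        Attribute(String name, String value) {
            this.name = name;
            this.value = value;
        }

        // Copy constructor: builds a new, independent object from an existing one.
        Attribute(Attribute other) {
            this(other.name, other.value);
        }

        String getName() { return name; }

        String getValue() { return value; }

        void setValue(String value) { this.value = value; }
    }

    public static void main(String[] args) {
        Attribute original = new Attribute("id", "p1");
        Attribute copy = new Attribute(original);

        // Changing the copy leaves the original untouched.
        copy.setValue("p2");
        System.out.println(original.getValue() + " " + copy.getValue());
    }
}
```

The caller gets back the exact type it asked for, with no cast and no CloneNotSupportedException to deal with.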

The Serializable interface is useful in some circumstances, but
I think in XML the serialization format should be XML, not binary object
serialization, so I'm not sure whether that's necessary. And when it comes to
the Collections API, that API suffers seriously from two things. One is Java's
lack of true generics, i.e., templates to C++ programmers. The other is that
Java has primitive data types, and the Collections API can't be used for
ints or doubles. I'm not so sure that one's
relevant, but the first one is. When you expose the children of an
Element as a java.util.List, what you're getting
back is a list of Objects. Every time you get something out of
that List, you have to cast it back to its type. We don't know what
it is, so we have to have a big switch block that says, if (o
instanceof Element) { e = (Element) o; }, and then you do the same
thing for Text, Comment, and
ProcessingInstruction, and it gets really messy. DOM, by
contrast, does have a different NodeList interface that contains
Nodes. When you get something out of that list, you know it's a
Node. And you've got certain operations you can use on a
Node, and often that's all you need. Sometimes you need
something more. Sometimes you do need to know whether it's an
Element node, an Attribute node, or a
Text node. But a lot of times, it's enough to know it's a
Node. It's not enough to know that it's an Object.
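
The contrast can be sketched in a few lines of Java. The Node, Element, Text, and Comment types below are hypothetical stand-ins, not the real JDOM or DOM classes, and the typed list uses modern generics purely for brevity; DOM got a similar effect with its own NodeList interface. A List<Object> behaves like the untyped java.util.List described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the real JDOM or DOM classes.
class ChildListDemo {

    // A common supertype, in the spirit of DOM's Node.
    interface Node {
        String getValue();
    }

    static class Element implements Node {
        private final String name;
        Element(String name) { this.name = name; }
        String getName() { return name; }
        public String getValue() { return ""; }
    }

    static class Text implements Node {
        private final String text;
        Text(String text) { this.text = text; }
        public String getValue() { return text; }
    }

    static class Comment implements Node {
        private final String text;
        Comment(String text) { this.text = text; }
        public String getValue() { return text; }
    }

    // Object-typed children: each item has to be tested and cast before use.
    static String describe(List<Object> children) {
        StringBuilder out = new StringBuilder();
        for (Object o : children) {
            if (o instanceof Element) {
                Element e = (Element) o;
                out.append("element:").append(e.getName());
            } else if (o instanceof Text) {
                Text t = (Text) o;
                out.append("text:").append(t.getValue());
            } else if (o instanceof Comment) {
                Comment c = (Comment) o;
                out.append("comment:").append(c.getValue());
            } else {
                out.append("unknown");
            }
            out.append(';');
        }
        return out.toString();
    }

    // Node-typed children: the shared operation needs no cast at all.
    static String concatValues(List<Node> children) {
        StringBuilder out = new StringBuilder();
        for (Node n : children) {
            out.append(n.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Object> objects = new ArrayList<>();
        objects.add(new Element("title"));
        objects.add(new Text("Hello"));
        objects.add(new Comment("draft"));
        System.out.println(describe(objects));

        List<Node> nodes = new ArrayList<>();
        nodes.add(new Text("Hello, "));
        nodes.add(new Text("world"));
        System.out.println(concatValues(nodes));
    }
}
```

When the caller only needs an operation that every node supports, the second method is all that is required; the instanceof chain is needed only when the specific node type matters.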

JDOM Uses Too Many Checked Exceptions

Bill Venners: You also suggested in your talk that JDOM had too many checked exceptions.

Elliotte Rusty Harold: JDOM does check many of the things that can
make an XML document malformed, not all of them, but many. For example,
you can't have an element name that contains white space. Generally
speaking, if JDOM detects a problem, then it throws a checked exception, a
JDOMException specifically. That means that when you're
writing JDOM code, you have a lot of try-catch blocks. Try such and such, catch
JDOMException, respond appropriately. As Bruce Eckel has
pointed out, a lot of people just write catch JDOMException
open close curly brace, and don't actually do anything to handle the failure
appropriately.

Perhaps the appropriate response is, instead of throwing a checked exception,
to throw RuntimeExceptions. That way it doesn't get in the way
of your code. It doesn't make your code any messier. But the signal of the
problem is still there if the problem arises. The way Joshua Bloch explains this
is that any problem that could possibly be caught in testing should be a
RuntimeException, for example, setting the name of an
element. That should throw a RuntimeException. Because if
you use a bad String for that, you'll catch it in testing, if you have
good testing. On the other hand, parsing an external document should not
throw a RuntimeException, it should throw a checked exception,
because that's going to depend on which document is being passed into your
program. Sometimes it is going to be well-formed and sometimes not. There's
no way to know that in general, so that's a legitimate checked exception. But I
just have come to learn, in a way I didn't understand a few years ago, that many
exceptions that are currently checked exceptions should really be
RuntimeExceptions.
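
The split described above can be illustrated with a minimal sketch. Everything here is hypothetical: the Element class, the MalformedDocumentException type, and the parse method are invented for the example and are not JDOM's or XOM's actual API, and the "parser" is a toy check that accepts only strings of the form <name/>. The point is only which kind of exception each situation raises.

```java
// Hypothetical sketch; these names are not JDOM's or XOM's actual API.
class ExceptionStyleDemo {

    // Checked: the compiler forces callers to handle it.
    static class MalformedDocumentException extends Exception {
        MalformedDocumentException(String message) { super(message); }
    }

    static class Element {
        private String name;

        Element(String name) { setName(name); }

        // A bad name is a programming error that testing can catch,
        // so it is reported with an unchecked exception.
        final void setName(String name) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("Element name is empty");
            }
            for (int i = 0; i < name.length(); i++) {
                if (Character.isWhitespace(name.charAt(i))) {
                    throw new IllegalArgumentException(
                            "Element name contains white space: " + name);
                }
            }
            this.name = name;
        }

        String getName() { return name; }
    }

    // External input may or may not be well-formed at run time,
    // so failure is reported with a checked exception.
    static Element parse(String text) throws MalformedDocumentException {
        // Toy check standing in for a real parser: accepts only "<name/>".
        if (text == null || !text.startsWith("<") || !text.endsWith("/>")) {
            throw new MalformedDocumentException("Not well-formed: " + text);
        }
        String name = text.substring(1, text.length() - 2);
        try {
            return new Element(name);
        } catch (IllegalArgumentException ex) {
            // Bad input data is the document's fault, not the programmer's.
            throw new MalformedDocumentException("Not well-formed: " + text);
        }
    }

    public static void main(String[] args) {
        // No try-catch needed for a name fixed in the source code.
        Element title = new Element("title");
        System.out.println(title.getName());

        // The compiler requires handling for input that arrives from outside.
        try {
            parse("not xml");
        } catch (MalformedDocumentException ex) {
            System.out.println("Rejected: " + ex.getMessage());
        }
    }
}
```

Code that builds elements from names written in the source stays free of try-catch blocks, while code that reads documents from outside still has to decide what to do when one is bad.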

Bill Venners: So you think JDOM goes a bit overboard with the checked
exceptions.

Elliotte Rusty Harold: Yes, and that's probably my fault. I was the one
who in the very early days of JDOM argued most strongly for putting in lots of
exceptions and checking everything. There were others who argued against
putting in any exceptions at all. I think what we were missing then was anybody
standing in the middle saying, "Hey, guys, RuntimeExceptions
would satisfy both of you at the same time." I just didn't know that then. I've
learned from Bruce Eckel and Joshua Bloch.

Will JDOM Remain Class-Based?

Bill Venners: In your talk you asked, "Are JDOM committers committed to classes?"
What did you mean by that?

Elliotte Rusty Harold: That's a completely separate issue. I had a
conversation with Jason Hunter, one of the two or three committers to the CVS
tree for JDOM. Jason said that if JDOM used interfaces rather than classes,
then it could be used, for example, as the API for a native XML database. And he
thought that was an important use case. And on further reflection, I think I agree
with him. There is, perhaps, a need for such an API. However, I also think
there's a need for a simple, concrete, class-based API. And I'm just not certain
at this point going forward that JDOM will always be a class-based API, that it
will be a class-based API when it gets to 1.0. So, I think it's useful to have my
little XOM API, which I know is going to be a class-based API.
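
To make the trade-off concrete, here is a rough sketch of what an interface-based design allows. All of the types are hypothetical and come from neither JDOM nor XOM; the Map merely stands in for a native XML database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types come from JDOM or XOM.
class ApiStyleDemo {

    // Interface-based style: callers depend only on this contract.
    interface ElementView {
        String getName();
        String getAttributeValue(String attributeName);
    }

    // One implementation keeps everything in memory.
    static final class InMemoryElement implements ElementView {
        private final String name;
        private final Map<String, String> attributes = new HashMap<>();

        InMemoryElement(String name) { this.name = name; }

        void setAttribute(String attributeName, String value) {
            attributes.put(attributeName, value);
        }

        public String getName() { return name; }

        public String getAttributeValue(String attributeName) {
            return attributes.get(attributeName);
        }
    }

    // Another looks values up on demand; the Map stands in for a native XML database.
    static final class StoreBackedElement implements ElementView {
        private final Map<String, String> store;
        private final String key;

        StoreBackedElement(Map<String, String> store, String key) {
            this.store = store;
            this.key = key;
        }

        public String getName() { return store.get(key + "/name"); }

        public String getAttributeValue(String attributeName) {
            return store.get(key + "/@" + attributeName);
        }
    }

    // Client code written against the interface works with either implementation.
    static String summarize(ElementView element) {
        return element.getName() + "[id=" + element.getAttributeValue("id") + "]";
    }

    public static void main(String[] args) {
        InMemoryElement inMemory = new InMemoryElement("title");
        inMemory.setAttribute("id", "t1");

        Map<String, String> store = new HashMap<>();
        store.put("doc/1/name", "title");
        store.put("doc/1/@id", "t1");
        ElementView stored = new StoreBackedElement(store, "doc/1");

        System.out.println(summarize(inMemory));
        System.out.println(summarize(stored));
    }
}
```

A class-based API would expose only something like InMemoryElement, created directly with new. That is simpler to learn and use, but the data cannot be re-homed behind the same types.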

Next Week

Come back Monday, July 7 for Part III of a conversation with
Bruce Eckel about why he loves Python. I am now staggering
the publication of several interviews at once, to give the reader
variety. The next installment of this interview with Elliotte Rusty
Harold will appear in the near future.