Organic Schema Design

Bill Venners: Do you have any general guidelines for designing an XML
schema, designing the data structure?

Elliotte Rusty Harold: The main thing I would say is: grow your
documents organically. Try and model the actual content for which you're
writing a schema, and see what sort of XML structures come out. Don't start by
writing schemas. Start by writing example instance documents, and see what
you get.

For example, if you're modeling invoices, pull out a few invoices. Ask yourself, "If
I wrote this invoice in XML, what would it look like? That invoice, what would it
look like?" If you have a large and representative enough collection of previous
documents—in whatever format: paper, electronic—you can get a good start.
Then you will gradually discover other documents coming into your system that
don't really fit your designs. They have a couple of extra fields. One document has
two shipping addresses instead of one, so you figure out how to handle that in
your schema. Another document has an address that's in the U.K. instead of in
the United States, and that has a very different format. So you adjust the
schema.
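
To make the invoice exercise concrete, here is a minimal sketch in Python. It is an illustration only: the three instance documents and every element name in them (invoice, customer, shipTo, postcode, and so on) are hypothetical, hand-written stand-ins for real invoices. The script lists each element path and the number of sample documents that use it, so the second shipping address and the U.K. address format show up in the data before any schema is written.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical instance documents written by hand from imagined invoices.
# The element names are illustrative assumptions, not a standard vocabulary.
INVOICES = [
    # A plain U.S. invoice with one shipping address.
    """<invoice>
         <customer>Acme Corp</customer>
         <shipTo>
           <street>12 Main St</street><city>Springfield</city>
           <state>IL</state><zip>62701</zip>
         </shipTo>
         <item><description>Widget</description><price>9.99</price></item>
       </invoice>""",
    # An invoice with two shipping addresses instead of one.
    """<invoice>
         <customer>Bolt Supply</customer>
         <shipTo>
           <street>4 Elm Ave</street><city>Dayton</city>
           <state>OH</state><zip>45402</zip>
         </shipTo>
         <shipTo>
           <street>88 Dock Rd</street><city>Toledo</city>
           <state>OH</state><zip>43604</zip>
         </shipTo>
         <item><description>Gear</description><price>4.50</price></item>
       </invoice>""",
    # An invoice with a U.K. address, which has a different shape.
    """<invoice>
         <customer>Cask and Co</customer>
         <shipTo>
           <street>1 High Street</street><city>Leeds</city>
           <postcode>LS1 4AP</postcode><country>UK</country>
         </shipTo>
         <item><description>Barrel</description><price>120.00</price></item>
       </invoice>""",
]


def element_paths(xml_text):
    """Return the set of slash-separated element paths used in one document."""
    paths = set()

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def survey(documents):
    """Count, for each element path, how many documents contain it."""
    counts = Counter()
    for doc in documents:
        counts.update(element_paths(doc))
    return counts


if __name__ == "__main__":
    counts = survey(INVOICES)
    for path, n in sorted(counts.items()):
        print(f"{n}/{len(INVOICES)}  {path}")
```

Paths that appear in only some of the samples, such as the hypothetical shipTo/postcode here, mark the places where a schema would need to stay flexible.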

If you grow your schemas organically, you gradually figure out how the
documents are likely to be structured. You don't set in stone up front
that the documents must be structured like this, that all these elements must
be present, that these attributes must not be present if something else is
present, and so on. You let the actual information drive the design, rather than
letting the design constrain what documents you're willing to accept.
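
One way to let the information drive the design is to read occurrence constraints off the sample documents instead of declaring them up front. The sketch below is a rough illustration under stated assumptions: the sample documents and their element names are hypothetical, and observed_cardinality is a helper invented for this example. It reports, for each child of the root element, the fewest and the most times it occurs in any one document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample documents; the element names are assumptions
# made for this illustration.
SAMPLES = [
    "<invoice><customer>Acme</customer><shipTo>12 Main St</shipTo>"
    "<item>Widget</item></invoice>",
    "<invoice><customer>Bolt Ltd</customer><shipTo>1 High St</shipTo>"
    "<shipTo>9 Dock Rd</shipTo><item>Gear</item><item>Cog</item></invoice>",
    "<invoice><customer>Cask Inc</customer><item>Barrel</item>"
    "<note>Customer collects</note></invoice>",
]


def observed_cardinality(documents):
    """Map each child element name of the root to (min, max) occurrences."""
    # One Counter per document: how often each child tag appears under the root.
    per_doc = [Counter(child.tag for child in ET.fromstring(doc))
               for doc in documents]
    names = set().union(*per_doc)
    # A Counter returns 0 for a missing tag, so absent elements get a min of 0.
    return {
        name: (min(c[name] for c in per_doc), max(c[name] for c in per_doc))
        for name in sorted(names)
    }


if __name__ == "__main__":
    for name, (low, high) in observed_cardinality(SAMPLES).items():
        print(f"{name}: at least {low}, at most {high} per document")
```

The observed ranges are only a starting point. A minimum of zero suggests an element should be optional, and a maximum above one suggests it should be repeatable, but the next batch of documents may widen either bound.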

Next Week

Come back Monday, October 13 for the first installment of a conversation with
C++ creator Bjarne Stroustrup. I know I promised this last week, but one must
always keep up some element of surprise. Nevertheless, look for Bjarne
next Monday. He will be here, really.