Mass-Market XML

We’ve known for many years that most of our vital information lives in documents, not databases. XML was supposed to help us capture the implicit structure of ordinary business documents (memos, expense reports) and make it explicit. Sets of such documents would then form a kind of virtual database. The cost to search, correlate, and recombine the XML-ized data would fall dramatically, and its value would soar. It was a great idea, but until the tools used to create memos and expense reports became deeply XML-aware, it was stillborn. XML did, of course, thrive in another and equally important way. It became the exchange format of enterprise databases and the lingua franca of Web services. Now Office 11 wants to erase the differences between XML documents written and read by people using desktop applications, and XML documents produced and consumed by databases and Web services. This is a really big deal.

The first beta of Office 11 doesn’t include any demonstrations of the new XML features, but the Office team put together some examples for us, and Jean Paoli talked us through them. We started with a résumé template written in Word 11. Today we use such templates mainly to control the appearance of documents. If we also want to control their content, we can ask developers to write macros that enforce business rules. In principle, a company could publish a résumé template that would, for example, require job seekers to describe past experience in terms of a controlled vocabulary. In practice, that rarely happens. Procedural code to enforce such constraints is hard to write and even harder to reuse. With Word 11, you can attack this problem by defining a schema and mapping its elements to a résumé template.

In the résumé example, we associated a schema with a sample résumé, using the Templates and Add-ins dialog. A new task pane called XML Structure then appeared, displaying a single root element named Résumé. We selected it, and chose the option Apply to Whole Document. Now subelements named Objective, Experience, and Education appeared in the task pane. Mapping these to regions of the sample résumé revealed deeper structure until the entire schema was finally mapped.
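To make the payoff of that mapping concrete, here is a minimal Python sketch of the kind of check a schema-mapped document makes possible. It is not how Word 11 validates anything, and the schema itself was not shown to us in text form. The sketch assumes only the three subelement names from the walkthrough, spells the root as Resume (an ASCII stand-in), and uses a hypothetical helper named missing_sections.

```python
import xml.etree.ElementTree as ET

# Section names come from the walkthrough; everything else here is an
# illustrative assumption, since the actual schema is not reproduced.
REQUIRED_SECTIONS = ["Objective", "Experience", "Education"]


def missing_sections(xml_text, root_name="Resume"):
    """Return the required sections absent from the document, in schema order."""
    root = ET.fromstring(xml_text)
    if root.tag != root_name:
        raise ValueError(f"expected root <{root_name}>, found <{root.tag}>")
    # Only direct children of the root count as top-level sections.
    present = {child.tag for child in root}
    return [name for name in REQUIRED_SECTIONS if name not in present]


# A sample document that has not yet been given an Education section.
sample = """<Resume>
  <Objective>Technical editor</Objective>
  <Experience>Five years at a trade magazine</Experience>
</Resume>"""

print(missing_sections(sample))
```

A real schema would also constrain ordering, data types, and the deeper structure inside each section. The point is only that once the regions of a document carry element names, a rule such as "every résumé must have an Education section" becomes a few lines of declarative checking rather than a macro.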

Another example illustrated the same scenario for Excel. Here, the fields defining an expense report were captured in a schema, then mapped to an expense report. Once we saw how it worked, we were able to apply the same concept to our existing InfoWorld spreadsheet. After writing a simple schema, we dragged elements from the XML Structure pane onto the spreadsheet to bind named schema elements to numbered cells.
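The binding of named schema elements to cells can be pictured as a simple lookup table. The Python sketch below is our own illustration, not Excel's mechanism: the element names (Employee, Date, Amount), the root name ExpenseReport, the cell references, and the function export_report are all hypothetical, chosen only to show how a binding turns cell contents into an XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical binding of schema element names to spreadsheet cell references.
BINDINGS = {"Employee": "B1", "Date": "B2", "Amount": "B3"}


def export_report(cells, bindings=BINDINGS, root_name="ExpenseReport"):
    """Build an XML string from cell values, one element per bound cell."""
    root = ET.Element(root_name)
    for element_name, cell_ref in bindings.items():
        child = ET.SubElement(root, element_name)
        # Unbound or empty cells become empty elements rather than errors.
        child.text = str(cells.get(cell_ref, ""))
    return ET.tostring(root, encoding="unicode")


# Cell contents as they might sit in a filled-in expense report.
cells = {"B1": "A. Writer", "B2": "2002-11-01", "B3": 42.5}
print(export_report(cells))
```

Because the binding lives in data rather than in code, the same spreadsheet layout can be rebound to a different schema by editing the table alone, which is roughly what dragging elements onto cells accomplishes interactively.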