Dear John (Column #3) - An XMLBeans Example: Address Book

Introduction to XMLBeans
XML has rapidly become the lingua franca of business conducted over networks. It is the foundation of the Web Services concept and is used widely in electronic document interchange. The first generation of tools for processing XML in Java applications, XML parsers based on the Document Object Model (DOM) and the Simple API for XML (SAX), fulfilled their roles respectably. Those first-generation technologies, however, didn’t take full advantage of the flexibility and power of the Java language.

BEA Systems has recently introduced a new technology, called XMLBeans, which provides a much more natural, intuitive and powerful mechanism for handling XML in Java. The name XMLBeans is derived from XML (obviously) plus JavaBeans, because XMLBeans borrows the property pattern made popular by the JavaBeans component architecture.

XMLBeans will be used throughout the WebLogic Platform when version 8.0 is released later in 2003. But the technology itself can operate in isolation, and BEA has published a web service that allows you to try out XMLBeans on the XML schema of your choice. You submit a schema (.xsd) file (or a ZIP file containing multiple related schemas), and the web service returns a JAR file containing the XMLBeans classes specific to your schema(s) plus the supporting XMLBeans API. I’ll demonstrate how to use the service later in this article. You can read about the XMLBeans Service in the XMLBeans Technical Track on BEA’s dev2dev developer site.

The example I present in this article demonstrates one particular aspect of XMLBeans: the strongly typed schema type system API that the XMLBeans schema compiler generates specifically for your schema(s). XMLBeans can also be used without schema. To learn more about all the capabilities of XMLBeans, please see the XMLBeans Overview Page.

What is an XML Schema?
While it is possible to create XML documents with arbitrary content, it is much more interesting (and necessary) to define specific types of documents. Electronic business wouldn’t get very far if every company could define its own electronic document format for a Purchase Order, for example. XML supports the definition of formal document types against which any particular document may be validated.

When XML was young, one defined a document type in a Document Type Definition (DTD) file. In a DTD you could specify the types of elements that could exist in an instance document and how those elements had to be arranged. But DTDs had many drawbacks. They offered almost no data typing (you could not, for example, restrict a numeric element to a range of values), and, biggest of all, they were expressed in a quirky language that wasn’t itself XML.

The modern way to define XML document types is with XML Schema. XML Schema serves the same purpose as DTDs but is expressed in XML. That means you can create, edit and manipulate XML schemas using standard XML tools.

The XML Schema specification (available at http://www.w3.org/XML/Schema) also defines a list of 46 specific data types that may be expressed in schema, including the types string, integer, float, date, and many more. So XML schemas can utilize a rich type system that mirrors what is available in modern programming languages.

This article describes an example that utilizes two types of XML documents: contacts and address books. Address books can contain contacts.

The contactUsa.xsd Schema
An XML schema is typically stored in a file with the .xsd extension, for XML Schema Definition. Here’s an example of an XML schema from the file contactUsa.xsd (all files referenced in this article are available in this ZIP file AddressBookApp.zip). This schema defines a format for entries ("contacts") that might be found in an address book. For simplicity, this schema only handles postal addresses as they occur in the United States.

There are two basic kinds of elements in XML Schema: simple types and complex types. Simple types define elements that have a value but no children. Complex types define elements that can have children. In the schema above, <contact> is defined as a complex element that can (in fact must, in this schema) have child elements <family-name> and <given-name>, and may have children of type <mailing-address> and <phone-number>.

I don't want to spend much space on XML Schema; it is a complex topic and there are many good books available on the subject. But here are some statements about the schema defined in contactUsa.xsd that may help you understand it even if you are new to XML Schema:

All of the elements defined in contactUsa.xsd are in the XML namespace "http://dearjohn/address-book". The targetNamespace declaration defines the namespace used for top-level elements in the schema (called "global" elements in schema-speak).

The specification of elementFormDefault="qualified" means all of the non-global elements in this schema are in the target namespace as well (as long as they aren't explicitly specified to be in some other namespace). It is almost always a good idea to specify elementFormDefault="qualified" in a schema file: you typically do not want global and non-global elements in the same schema to be in different namespaces.

A <contact> always includes name fields, and may optionally contain up to two <mailing-address> elements and up to three <phone-number> elements.

A <mailing-address> must always have <address-line-1>, <city>, <state> and <zipcode> child elements, in that order. A <mailing-address> may optionally have an <address-line-2> child element immediately following the <address-line-1> child element.

Both <mailing-address> and <phone-number> elements have a location attribute. While I did not express it in the schema, the intent is for location to hold values like "home" and "work".

The values of all <area-code>, <local-phone-number> and <zipcode> elements are constrained to particular text patterns.

The values of all <state> elements are constrained to lie within an enumeration of the abbreviations for the 50 states plus the District of Columbia.
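To make the namespace and pattern statements above concrete, here is a minimal sketch that validates small documents with the JDK's built-in javax.xml.validation API. The schema in it is a hypothetical fragment written for this illustration, not contactUsa.xsd: only the namespace URI and the element names <phone-number> and <area-code> come from the article, and the three-digit pattern is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class PhoneSchemaSketch {
    static final String NS = "http://dearjohn/address-book";

    // Hypothetical schema fragment (NOT contactUsa.xsd). The [0-9]{3} pattern is assumed.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xs:element name='phone-number'><xs:complexType><xs:sequence>"
        + "<xs:element name='area-code'><xs:simpleType>"
        + "<xs:restriction base='xs:string'><xs:pattern value='[0-9]{3}'/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the document is a valid instance of the sketch schema.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the sketch schema itself is broken", e);
        }
        try {
            // With no ErrorHandler installed, validation errors surface as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default namespace on the root: the child is in the target namespace too.
        System.out.println(isValid(
            "<phone-number xmlns='" + NS + "'><area-code>425</area-code></phone-number>"));
        // Violates the assumed pattern.
        System.out.println(isValid(
            "<phone-number xmlns='" + NS + "'><area-code>42A</area-code></phone-number>"));
        // Child is unqualified, which elementFormDefault="qualified" does not allow.
        System.out.println(isValid(
            "<p:phone-number xmlns:p='" + NS + "'><area-code>425</area-code></p:phone-number>"));
    }
}
```

The third document is the interesting one: its root element is in the right namespace, but the prefix does not apply to the child, so the child is in no namespace and validation fails.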

The addressBook.xsd schema is much simpler: it defines an <address-book> element that may contain zero or more <contact> elements from the contactUsa.xsd schema.

Generating XMLBeans with the XMLBeans Service
Now that I have these schemas defined, I can use the online XMLBeans Service to generate the XMLBeans type system (Java API) specific to my schemas.

Since I had two related schema files, I packaged them into a ZIP file to be uploaded. Then I specified the ZIP file in the "Schema or Zip file to upload:" field on the XMLBeans Service Schema Upload Page. After the service had compiled my schemas, the XMLBeans Service Download Page appeared, from which I could download the JAR file containing my XMLBeans types. The default filename is xsdTypes.jar, but I renamed it during download to bookTypes.jar. In addition, I need the xmlbeans.jar file, which contains the generic XMLBeans support code.

That’s all there is to it. You supply one or more schemas, and the XMLBeans Service returns a JAR file containing your types. Now I’m ready to build an Address Book application that manipulates XML address books and contacts from Java.

The AddressBookApp Application
The AddressBookApp Java application is a simple command line application. It presents a menu with the following options:

open address book

print all contact names

print all contacts

print contact

find contacts by name

find contacts by area code

find contacts by state

add contact from file

delete contact

save address book

exit

I’ll go through the highlights of these functions and explain, with code snippets, how XMLBeans are being used to accomplish the operations. I don’t cover some operations because the multiple "print" and "find" operations are very similar to each other.

Open Address Book
The heart of this operation is in AddressBookApp’s loadAddressBookFile method, some of which is shown here:

XmlLoader is one of the foundation XMLBeans classes. Its load method is one of the ways that you can read an XML document into the XMLBeans type system, from which you can manipulate the XML via the XMLBeans API. The code above shows a very typical pattern for loading XML documents with XMLBeans: call a method of XmlLoader and cast the result to the document type you are loading. In my case, that is AddressBookDocument, the type that represents an XML document that conforms to the addressBook.xsd schema.

When I first started using XMLBeans, I had problems with ClassCastExceptions being thrown at this point. This exception almost always means one thing: the specified document is not a valid instance of the schema from which the cast destination type (AddressBookDocument in my case) was generated. If you experience this error, pay close attention to the structure of the instance document you are attempting to load vis-à-vis the schema. Pay particular attention to the namespace declarations in both the schema and the instance document. Use of a third-party schema-aware XML tool such as XML Spy can be very helpful.
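If you want a quick check before reaching for a separate tool, a minimal sketch along the following lines inspects the root element's namespace and local name with the JDK's namespace-aware DOM parser. It uses no XMLBeans classes. The class and method names are hypothetical, and it assumes that <address-book> lives in the same "http://dearjohn/address-book" namespace as the contact elements; adjust the constants to match your own schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RootElementCheck {
    // Assumption: the address book schema shares the contact schema's target namespace.
    static final String EXPECTED_NS = "http://dearjohn/address-book";
    static final String EXPECTED_ROOT = "address-book";

    // Returns true if the document's root element has the expected namespace and name.
    static boolean hasExpectedRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, getNamespaceURI() returns null
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return EXPECTED_NS.equals(root.getNamespaceURI())
                    && EXPECTED_ROOT.equals(root.getLocalName());
        } catch (Exception e) {
            throw new IllegalArgumentException("document could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasExpectedRoot(
            "<address-book xmlns='http://dearjohn/address-book'/>"));
        // A missing namespace declaration is the classic cause of a failed cast.
        System.out.println(hasExpectedRoot("<address-book/>"));
    }
}
```

This only rules out the most common mismatch. A document can pass this check and still be invalid further down the tree.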

Once the document is loaded and cast successfully, I simply call AddressBookDocument’s getAddressBook to obtain the XMLBean representing the <address-book> element in the file. The AddressBook Java object I obtain is the object on which all other operations in the application are performed.

Note that while I used a file-based method to load XML documents, XMLBeans also provides methods to load XML documents from streams.

Print All Contact Names
This operation calls AddressBookApp’s printAddressBookNames method, the core of which is shown here:

Contact is the XMLBeans Java type that represents a <contact> element. I call the AddressBook class’ sizeOfContactArray method to find out how many <contact> elements exist as children of the <address-book> element represented by the book variable. I then loop over all the <contact> elements, printing the result of AddressBookApp’s getFormattedContactName method for each contact. getFormattedContactName contains the following code:

The important part is the first two lines, in which I call the Contact class’ getFamilyName and getGivenName methods to obtain the string values of the <family-name> and <given-name> child elements of the <contact> element.

The list of contacts is not sorted before being printed. I leave that as an exercise for the reader. I’ll give you a hint that the XQuery capabilities of XMLBeans would be useful for that purpose.

Hopefully you’re starting to get the idea that XMLBeans is very intuitive. The generated Java types have the same names as the XML elements defined in the schema, only "java-fied". This makes it very easy to walk the structure of an XML document in Java if you are familiar with the structure of the XML document.
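The looping pattern described in this section can be sketched without the generated JAR on the classpath. In the sketch below, Contact and AddressBook are small hand-written stand-ins for the generated types, not the real XMLBeans classes. The method names getFamilyName, getGivenName and sizeOfContactArray come from the text above; the indexed getter getContactArray(int), the "Family, Given" format, the numbering and the sample names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PrintNamesSketch {
    // Hand-written stand-ins for the generated XMLBeans types (hypothetical).
    interface Contact {
        String getFamilyName();
        String getGivenName();
    }

    interface AddressBook {
        int sizeOfContactArray();
        Contact getContactArray(int i); // assumed indexed getter
    }

    // Builds one display string from the two name elements (format is assumed).
    static String getFormattedContactName(Contact contact) {
        String family = contact.getFamilyName();
        String given = contact.getGivenName();
        return family + ", " + given;
    }

    // Walks every contact in document order and numbers the entries from 1.
    static List<String> formattedNames(AddressBook book) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < book.sizeOfContactArray(); i++) {
            names.add((i + 1) + ". " + getFormattedContactName(book.getContactArray(i)));
        }
        return names;
    }

    // Helpers that fake an in-memory address book for the demo.
    static Contact contact(final String family, final String given) {
        return new Contact() {
            public String getFamilyName() { return family; }
            public String getGivenName() { return given; }
        };
    }

    static AddressBook bookOf(final Contact... contacts) {
        return new AddressBook() {
            public int sizeOfContactArray() { return contacts.length; }
            public Contact getContactArray(int i) { return contacts[i]; }
        };
    }

    public static void main(String[] args) {
        AddressBook book = bookOf(contact("Doe", "Jane"), contact("Roe", "Richard"));
        for (String line : formattedNames(book)) {
            System.out.println(line);
        }
    }
}
```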

Find Contacts by Name
This operation prompts the user for a search string, then prints the names from all <contact>s in the <address-book> that contain that string (case-insensitive). Here’s the code:

It is pretty simple code. I loop over all the <contact>s in the <address-book> and check to see if the string returned by getFormattedContactName for that contact contains the search string. I use getFormattedContactName because it conveniently returns a string that contains the values of both the <family-name> and <given-name> elements. Note that XMLBeans includes XQuery functionality that would make quick work of all of the "find" operations. But I’ll save that for a future article.
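The matching step itself is plain Java string handling. Here is a minimal sketch of a case-insensitive "contains" search over already-formatted names. The class name, the method name and the sample data are hypothetical, and the list of strings stands in for the values that getFormattedContactName would return for each contact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class FindByNameSketch {
    // Returns every formatted name that contains the search string, ignoring case.
    static List<String> findByName(List<String> formattedNames, String search) {
        // Locale.ROOT keeps the comparison independent of the user's default locale.
        String needle = search.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String name : formattedNames) {
            if (name.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Doe, Jane", "Roe, Richard");
        System.out.println(findByName(names, "JANE"));
        System.out.println(findByName(names, "oe"));
    }
}
```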

Add Contact from File
This operation prompts the user for the filename of an XML file conforming to the contactUsa.xsd schema. It then invokes AddressBookApp’s loadContactFile method, which is very similar to the loadAddressBookFile method described in the "Open Address Book" section above.

Once the new <contact> element has been successfully loaded into a Contact XMLBeans object, it is added to the <address-book> in AddressBookApp’s addNewContact method, whose code is shown here:

This code substitutes for an XMLBeans feature that is not yet implemented as of this writing (XMLBeans is currently beta code). The code is simple, just walking the tree and transferring values from elements and their attributes as it goes, but it will eventually be possible to accomplish the same thing in a single line of code that looks something like:

book.setContact(newContact);

Delete Contact
This is the simplest of all the operations. After prompting the user for the index of the <contact> to delete, it calls:

_currentBook.removeContact(index-1);

which deletes the specified <contact> child element from the <address-book> element. I subtract one from the user-supplied index because indices in the application’s user interface are 1-based for convenience.
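Because the index comes straight from the user, it is worth checking it against the number of contacts before subtracting one. The following is a minimal sketch of that conversion. The class and method names are hypothetical, and a plain ArrayList stands in for the address book in the demo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeleteIndexSketch {
    // Converts a 1-based index typed by the user into a 0-based index,
    // rejecting anything outside 1..contactCount.
    static int toZeroBased(int userIndex, int contactCount) {
        if (userIndex < 1 || userIndex > contactCount) {
            throw new IllegalArgumentException(
                "index must be between 1 and " + contactCount + ", got " + userIndex);
        }
        return userIndex - 1;
    }

    public static void main(String[] args) {
        // Stand-in for the contacts in the address book (sample data is made up).
        List<String> contacts = new ArrayList<>(Arrays.asList("Doe, Jane", "Roe, Richard"));
        contacts.remove(toZeroBased(2, contacts.size())); // user asked to delete entry 2
        System.out.println(contacts);
    }
}
```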

Save Address Book
The final operation writes the current state of the <address-book> element to a file whose name is supplied by the user. Here is the relevant code:

I use normal Java IO classes to create an output stream to which my XML document will be written. Note that I specified the encoding ("UTF-8") with which this document should be written. Character encoding for XML documents is a complicated topic all its own and has mainly to do with internationalization. UTF-8 is a fairly safe encoding to use for general purposes.

Next I print the <?xml?> declaration that optionally goes at the start of every XML document.

The next step is to generate the actual XML text that will be written to the file. To do this, I call the xmlText method of XmlObject (the base class for all XMLBeans types). The xmlText method takes an optional Map of options to control its behavior. I’m using the SAVE_PRETTY_PRINT option to request human-readable formatting and specifying the SAVE_PRETTY_PRINT_INDENT option to specify that I’d like the various levels of the XML structure to be indented multiples of two spaces.

Finally, I write the XML text to the output file and close the file.
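The file-writing half of these steps uses only standard Java IO, so it can be sketched on its own. In the minimal sketch below, the xmlBody string stands in for whatever the xmlText method returns, and the class and method names are hypothetical. The point it illustrates is that the encoding named in the XML declaration should be the same encoding the writer actually uses.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SaveXmlSketch {
    // Writes the XML declaration followed by the document text, encoded as UTF-8.
    static void save(Path file, String xmlBody) {
        try (Writer out = new OutputStreamWriter(
                Files.newOutputStream(file), StandardCharsets.UTF_8)) {
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write(xmlBody);
            out.write("\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: saves to a temporary file and reads the bytes back as UTF-8.
    static String roundTrip(String xmlBody) {
        try {
            Path file = Files.createTempFile("book", ".xml");
            try {
                save(file, xmlBody);
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the text produced by xmlText (sample content is made up).
        System.out.print(roundTrip("<address-book xmlns=\"http://dearjohn/address-book\"/>"));
    }
}
```

The try-with-resources block closes the file even if a write fails, which covers the final "close the file" step.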

Compiling the AddressBookApp Application
Compiling the application is very simple:

Running the AddressBookApp Application
Running the app is also simple:

java -cp ".;./bookTypes.jar;./xmlbeans.jar" address.AddressBookApp

The ZIP file containing the example code also contains four sample contact XML files and two sample address books: testbook.xml and fullbook.xml. testbook.xml contains only the contact defined in contactSmithers.xml; fullbook.xml contains all four sample contacts. These files give you enough material to experiment with all of the operations available in the application.

Conclusions
Hopefully this example has demonstrated how easy XML processing is with XMLBeans. I encourage you to download the example code and try it out. Then try it with a schema of your own.

I deliberately kept this example relatively simple and focused on the schema-specific API that XMLBeans generates from your schema. There is much more to XMLBeans, including the ability to manipulate XML documents without associated schema, use XQuery to perform complex searches and transformations on XML, and other features.

About the Author

John Methot is a Senior Developer at BEA Systems Inc., working on the WebLogic Workshop team. He's responsible for WebLogic Workshop sample code and portions of the user documentation. John has been doing application development, consulting, and architecture for 19 years, and explanation of technical concepts has been a theme throughout his career. Lately, he finds he writes more prose than code. John can be contacted at jmethot@bea.com.