Modeling One-To-Many Relationships With XML

Introduction

Somehow, modeling and XML aren't often found together in the same sentence. In my experience, XML vocabularies are created using the "fly by the seat of your pants" methodology more often than any other. After all, because XML is the eXtensible Markup Language, it's easy to create your own markup, right?

If we were talking about storing data in a relational database, on the other hand, you would model the entities, relationships, and attributes the way they exist in the real world to provide the most flexible data access. You would have the rigor of third normal form to guide you. Performance considerations could also influence how you model the data.

As we begin to use XML to represent our portable, and even persistent, data, should we throw out all we've learned about how to model relational data? I don't think so.

In this article, we'll discuss some options for implementing one-to-many relationships in XML. We'll consider three different techniques:

For each technique, we'll produce the following artifacts to flesh out the idea:

DTD(s) to represent document structure

Sample XML stream(s)

An XSL stylesheet to demonstrate data access

We'll discuss when it would be appropriate to use each approach and then summarize what we've learned. Some additional resources about modeling XML are also listed at the end of the article.

Department and Employee Domain

To begin, let's define a business domain to model. We'll implement this model by using our three one-to-many XML modeling techniques.

In the relational database world, departments and employees are often used to illustrate concepts. Because this problem domain is so well known, and the reader may already be familiar with it, I'll use it as the example here as well. No need to reinvent the wheel.

The Entity-Relationship, or ER, diagram depicted here shows that we have two entities, Department and Employee. Departments are uniquely identified by department_id. Similarly, Employee uses emp_id as its unique identifier.

The line between Department and Employee indicates a relationship. The infinity symbol next to Employee indicates that there may be many Employees in a Department. In a one-to-many relationship, the key of the one side of the relationship, in this case Department, would become a foreign key on the many side of the relationship, in this case Employee.

Containment Relationship

A Containment Relationship defines a structure in which one element is nested inside another. In the strongest form of this relationship, the "contained" element ceases to exist when the "container" element is removed.

Let's take a look at a DTD we'll use in the containment relationship implementation of our domain model:
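A minimal sketch of such a DTD follows, written as an internal subset together with a small sample document. The element names Company, Department, Employee, and Name follow the ones used in this section. Typing department_id and emp_id as ID attributes, and all of the sample values, are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Company [
  <!-- A Company contains zero or more Departments. -->
  <!ELEMENT Company (Department*)>
  <!-- A Department has a Name, followed by the Employees it contains
       (the "many" side of the relationship). -->
  <!ELEMENT Department (Name, Employee*)>
  <!-- Assumption: the unique identifier is carried as an ID attribute. -->
  <!ATTLIST Department department_id ID #REQUIRED>
  <!ELEMENT Employee (Name)>
  <!ATTLIST Employee emp_id ID #REQUIRED>
  <!ELEMENT Name (#PCDATA)>
]>
<Company>
  <Department department_id="D10">
    <Name>Accounting</Name>
    <Employee emp_id="E100">
      <Name>Alice Example</Name>
    </Employee>
    <Employee emp_id="E101">
      <Name>Bob Example</Name>
    </Employee>
  </Department>
  <Department department_id="D20">
    <Name>Research</Name>
    <Employee emp_id="E200">
      <Name>Carol Example</Name>
    </Employee>
  </Department>
</Company>
```

Because each Employee element is nested inside its Department, no foreign key is needed on the Employee: the parent element identifies the department, and removing a Department element removes its Employees along with it.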

The Company template creates the shell of the HTML page, including the column headers of a table that lists employees and their departments. The Employee template is invoked for each Employee node in the document. It selects the employee name (Name) and the department name (../Name) into the appropriate table cells.
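A stylesheet along these lines might look like the following sketch. It assumes XSLT 1.0 and a document in which Company contains Department elements, each with a Name child and nested Employee elements. The page title and table border are illustrative choices.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Company template: builds the HTML shell and the table header row. -->
  <xsl:template match="Company">
    <html>
      <head><title>Employees by Department</title></head>
      <body>
        <table border="1">
          <tr>
            <th>Employee</th>
            <th>Department</th>
          </tr>
          <!-- Invoke the Employee template for every Employee node. -->
          <xsl:apply-templates select="Department/Employee"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Employee template: emits one table row per employee. -->
  <xsl:template match="Employee">
    <tr>
      <td><xsl:value-of select="Name"/></td>
      <!-- ../Name steps up to the containing Department and reads its Name. -->
      <td><xsl:value-of select="../Name"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

The path ../Name works only because of containment: the parent axis leads directly from an Employee to the Department that holds it, so no key lookup is required to find the department name.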