Mapping Object Hierarchies to Relational Tables

I use the term object hierarchy to refer to the parent-child
hierarchy of class instances where an object has one or more objects as data
members. In the ShippingLine example, the children of
ShippingLine objects
are instances of class Ship; Ship
instances, in turn, have PortOfCall child
objects as data members.
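A minimal sketch of such a hierarchy in Java might look like the following. Only the class names come from the example; the name fields and the accessor methods are my own assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Leaf of the hierarchy: a port that a ship visits.
class PortOfCall {
    private final String name; // hypothetical field
    PortOfCall(String name) { this.name = name; }
    String getName() { return name; }
}

// Middle level: a Ship holds its PortOfCall children as a data member.
class Ship {
    private final String name; // hypothetical field
    private final List<PortOfCall> portsOfCall = new ArrayList<>();
    Ship(String name) { this.name = name; }
    String getName() { return name; }
    List<PortOfCall> getPortsOfCall() { return portsOfCall; }
}

// Root: a ShippingLine holds its Ship children as a data member.
class ShippingLine {
    private final String name; // hypothetical field
    private final List<Ship> ships = new ArrayList<>();
    ShippingLine(String name) { this.name = name; }
    String getName() { return name; }
    List<Ship> getShips() { return ships; }
}
```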

There are several ways to map such hierarchies to relational database
tables. The choice of mapping is determined by factors such as the cardinality
of the parent-child relationship (one-to-one, one-to-many, many-to-many), and
whether the children represent strong or weak entities.

Entities Both Strong and Weak

One of the primary constraints to be determined in any business model is
the relative independence of data entities (represented as classes and their
instances in Java). Some entities come into and out of existence of their own
accord. For instance, new shipping lines are founded and others go out of
business as time passes. We want to be able to create new ShippingLine
objects at the drop of a hat. Shipping lines are strong entities in
our data model. When a new shipping line comes into existence, I can
construct a ShippingLine object to contain information about it.

There are some entities in our model that don't have such a carefree
existence. Such an entity is so dependent upon another that it cannot exist
unless the entity it refers to also exists. When an entity instance
depends upon another for its very existence, it is a weak entity.
Ships are weak entities and must be associated with a shipping line before we
can incorporate ship information in our model.

Note that I'm referring to business rules in the abstract, and not to
rules applicable to specific physical data models. There are two physical
models that this article focuses upon: one in Java and one in a SQL database.
If we tossed in XML documents, that would be a third. Each physical model has
different mechanisms for enforcement of weak entity dependencies. In Java, a
Ship's dependency upon a ShippingLine would
usually be enforced by requiring
a non-null ShippingLine object as a parameter to the Ship class
constructor. In the database, such a dependency would be enforced by having a
required foreign key column in the ship table that has a value that matches a
primary key in the ShippingLine table. In XML, the shippingLine
element might be the root element of a document, and ships are child elements.
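The Java side of that enforcement can be sketched as follows. This is only an illustration under my own assumptions: the name fields and accessors are hypothetical, and I use Objects.requireNonNull as one common way to reject a null argument.

```java
import java.util.Objects;

class ShippingLine {
    private final String name; // hypothetical field
    ShippingLine(String name) { this.name = name; }
    String getName() { return name; }
}

class Ship {
    private final String name; // hypothetical field
    private final ShippingLine shippingLine;

    // A null owner is rejected, so a Ship cannot be constructed
    // without the ShippingLine it depends upon.
    Ship(String name, ShippingLine shippingLine) {
        this.name = name;
        this.shippingLine =
            Objects.requireNonNull(shippingLine, "shippingLine must not be null");
    }

    String getName() { return name; }
    ShippingLine getShippingLine() { return shippingLine; }
}
```

With this constructor, new Ship("Aurora", null) fails with a NullPointerException before any Ship instance exists.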

In fact, there may be different constraints for any one of these physical
data models. For example, an XML document might report on all shipping lines
in our database, and therefore would require a different root element. That's
not a business constraint, however, so much as a constraint enforced by the physical
data model representation. Those types of dependencies are best
expressed locally; in this instance, the reporting module would be sure that
all shippingLine elements are within the document's root element (whatever
that may be).

I can go ahead and use the particular mechanisms that each physical model
provides to enforce strong-strong and strong-weak entity constraints. There
are drawbacks to this approach, however. The obvious one is that I must make
sure that each mechanism is in fact enforcing these constraints in a
manner compatible with the others. Also, changes in the strong-weak
relationship of entities require changes to schema and class definitions at
separate locations. A more subtle drawback is that constraint mechanisms of a
particular data model may be incompatible with the application's mechanisms
for data storage and class construction. In fact, Castor's method of data
binding often runs into trouble when columns in a database are defined with
sophisticated column constraints.

Fortunately, the Castor mapping file has all of the appropriate features to
ensure that entity relationships, both strong-strong and strong-weak, can be
defined and enforced.

Dependent Classes

Castor uses the term dependent class to denote a class
representing a weak entity. Instances of a dependent class have a parent
master object that they are dependent upon. Through the use of a
depends attribute in the class mapping element, the JDO
persistence engine is able to ensure that a dependent class instance is only
saved as part of its master object. In other words, you can't save a
dependent class by itself.

There are other restrictions noted in the Castor documentation:

A dependent class object "may not be created, removed, or updated
separately [from its master object]."

"Both dependent and master objects must have identities." This includes
identities generated by key-generators.

Before moving on, let's note two further advantages of enforcing
dependencies through the mapping file.

All dependencies are noted in one location. This makes maintenance
easier when dependency relationships change.

Dependent class objects do not require a separate save operation. When
the dependent object's parent is saved, all of its dependent objects will be
saved, too.

Related Classes

Not all parent-child object relationships are strong-weak. A class
representing a strong entity in the data model can refer to another strong
entity class. A shipping line might be owned by a shipping magnate, or by a
holding company. Both magnates and holding companies might be strong entities
in our data model: they're certainly not dependent upon shipping lines for
their existence (although, arguably, the magnate is). Conversely, a shipping
line may be owned by a magnate, or by a holding company, or it might be a
wholly independent corporation -- so its existence is not dependent upon
magnates or holding companies.

In such circumstances, even though one class might refer to the other,
there isn't a strong-weak dependency there; it's possible to create magnates
independently of shipping lines and vice versa. Both are strong entities in
our model. For these cases, Castor declares them to be related
classes. Related classes are not treated specially by Castor. This means
that they come and go out of existence without regard for each other, and
(most importantly) they must be saved independently.

One-to-many Relationship

A one-to-many relationship in Java is simply a class instance with a
collection of objects of the same type. In a database, the one-to-many
relationship is denoted by the appearance of identical foreign key
values in a table (each row with the same foreign key value points to a
single row in a different table).

In the mapping file, fields that are collections of same-type objects are
denoted by the collection attribute. The collection
attribute can take one of several values:

Mapping Name    java.util type
arraylist       ArrayList
hashtable       Hashtable
map             Map (HashMap)
set             Set (HashSet)
vector          Vector

(Note: The classes in parentheses are the default
implementations of the interface shown. Defaults are used where collection
instances are null.)

One consequence of a shipping line having one or more ships is that the
ShippingLine class must have a collection whose elements are Ships. Let's
say the collection is a Java ArrayList. The mapping file below highlights
the way collections are mapped in class elements:

Here, we have a dependent one-to-many relationship between ShippingLine
and Ship. This means that the collection of Ships will be saved automatically
at the time ShippingLine is saved.

When mapping the collection to a database table, the many-key attribute is
used in the sql element. The many-key value names the column in the ship
table that contains the foreign key to shipping_line. Let me restate that for
emphasis: the shipping_line mapping refers to a foreign key column in the
ship table. This value must match that of the name attribute in the sql
element for the Ship class mapping (highlighted).

The type of the foreign key column in the ship table must match the type
of the identifier (primary key) it refers to. Since the
shipping_line table has a primary key on the
id column (an integer), the foreign key
id_shipping_line must also be of that type. However, note
that this type is not shown in the field mapping for shippingLine. The type
is inferred from the field's declared type. That type is not integer, but
ShippingLine. This is the one case where the type of the field does not
correlate with the type of the table column.
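To make the key matching concrete, here is a small sketch. The mapping text inside it is my own reconstruction from the description above, not the original listing: the field names (id, ships, shippingLine), the unqualified class names, and the placement of depends on the Ship class are all assumptions, and the foreign key attribute is written many-key. The Java code only parses that text with the JDK's DOM parser and checks that the many-key value on the ShippingLine side equals the sql name on the Ship side; it does not invoke Castor.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class MappingCheck {
    // Hypothetical mapping reconstructed from the prose (an assumption,
    // not the article's listing).
    static final String MAPPING =
          "<mapping>\n"
        + "  <class name=\"ShippingLine\" identity=\"id\">\n"
        + "    <map-to table=\"shipping_line\"/>\n"
        + "    <field name=\"id\" type=\"integer\">\n"
        + "      <sql name=\"id\" type=\"integer\"/>\n"
        + "    </field>\n"
        + "    <field name=\"ships\" type=\"Ship\" collection=\"arraylist\">\n"
        + "      <sql many-key=\"id_shipping_line\"/>\n"
        + "    </field>\n"
        + "  </class>\n"
        + "  <class name=\"Ship\" identity=\"id\" depends=\"ShippingLine\">\n"
        + "    <map-to table=\"ship\"/>\n"
        + "    <field name=\"id\" type=\"integer\">\n"
        + "      <sql name=\"id\" type=\"integer\"/>\n"
        + "    </field>\n"
        + "    <field name=\"shippingLine\" type=\"ShippingLine\">\n"
        + "      <sql name=\"id_shipping_line\"/>\n"
        + "    </field>\n"
        + "  </class>\n"
        + "</mapping>\n";

    // Parses the mapping text; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse mapping", e);
        }
    }

    // Returns the given attribute of the sql element for one field of one
    // class, or null when the class, field, or sql element is not found.
    static String sqlAttribute(Document doc, String className,
                               String fieldName, String attribute) {
        NodeList classes = doc.getElementsByTagName("class");
        for (int i = 0; i < classes.getLength(); i++) {
            Element cls = (Element) classes.item(i);
            if (!cls.getAttribute("name").equals(className)) continue;
            NodeList fields = cls.getElementsByTagName("field");
            for (int j = 0; j < fields.getLength(); j++) {
                Element field = (Element) fields.item(j);
                if (!field.getAttribute("name").equals(fieldName)) continue;
                Element sql = (Element) field.getElementsByTagName("sql").item(0);
                return sql == null ? null : sql.getAttribute(attribute);
            }
        }
        return null;
    }

    // True when the collection's many-key names the same column as the
    // dependent class's back-reference field.
    static boolean keysMatch(String xml) {
        Document doc = parse(xml);
        String manyKey = sqlAttribute(doc, "ShippingLine", "ships", "many-key");
        String fkColumn = sqlAttribute(doc, "Ship", "shippingLine", "name");
        return manyKey != null && !manyKey.isEmpty() && manyKey.equals(fkColumn);
    }

    public static void main(String[] args) {
        System.out.println("keys match: " + keysMatch(MAPPING));
    }
}
```

Note that the shippingLine field in the sketch declares type ShippingLine and gives no sql type, which is the case described above where the field type and the column type differ.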

It's not enough, however, that I declare an ArrayList in ShippingLine to
store Ship objects. I have to maintain a backlink to the master object from
the dependent object. This is simple enough to do. First, I add a
ShippingLine field to the Ship class, with get/set methods:
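The original listing is not reproduced here, so what follows is a minimal sketch of what such a backlink might look like. The accessor names getShippingLine and setShippingLine, the name field, and the bodies of addShip and removeShip are my assumptions; only the method names addShip and removeShip appear in the text.

```java
import java.util.ArrayList;

class Ship {
    private String name;               // hypothetical field
    private ShippingLine shippingLine; // backlink to the master object

    public Ship() { }                  // no-argument constructor for the sketch

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public ShippingLine getShippingLine() { return shippingLine; }
    public void setShippingLine(ShippingLine shippingLine) {
        this.shippingLine = shippingLine;
    }
}

class ShippingLine {
    private ArrayList<Ship> ships = new ArrayList<>();

    public ArrayList<Ship> getShips() { return ships; }
    public void setShips(ArrayList<Ship> ships) { this.ships = ships; }

    // Sets the backlink before adding, so every Ship in the collection
    // refers to this ShippingLine as its master object.
    public void addShip(Ship ship) {
        ship.setShippingLine(this);
        ships.add(ship);
    }

    // Removes the ship from the collection; the backlink is left as it is
    // in this sketch, since whether to clear it depends on the application.
    public void removeShip(Ship ship) {
        ships.remove(ship);
    }
}
```

Routing every addition through addShip() is what keeps the backlink and the collection consistent with each other.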

What about the removeShip() method? Should it delete the master object
reference from Ship when an instance is removed from the ShippingLine? It
depends on the application, but as far as we're concerned, since Ship objects
can only be persisted in the database through a ShippingLine, none of the
Ship instances will ever be saved with an invalid master object reference
(all instances will have gone through the ShippingLine.addShip() method).