A growing number of enterprise projects
today call for a reliable method of binding Java objects to relational
data -- and doing so across a multitude of relational databases.
Unfortunately (as many of us have learned the hard way) in-house
solutions are painful to build and even harder to maintain and grow over
the long term. In this article, Bruce Snyder introduces you to the
basics of working with Castor JDO, an open source data-binding framework
that just happens to be based on 100 percent pure Java
technology.

Castor JDO (Java Data Objects) is an open source, 100 percent Java data
binding framework. Initially released in December 1999, Castor JDO was one
of the first open source data binding frameworks available. Since that
time, the technology has come a long way. Today, Castor JDO is used in
combination with many other technologies, both open source and commercial,
to bind Java object models to relational databases, XML documents, and
LDAP directories.

In this article, you'll learn the fundamentals of working with Castor
JDO. We'll start with a relational data model and a Java object model, and
discuss the basics of mapping between the two. From there, we'll talk
about some features of Castor JDO. Using a simple product-based example,
you'll learn about such essentials as inheritance (both relational and
object-oriented), dependent and related relationships, Castor's Object
Query Language implementation, and short versus long transactions in
Castor. Because this article is an introduction to Castor, we'll use very
simple examples here, and we won't go into depth on any one topic. At the
end of the article, you will have a good overview of the technology, and a
good foundation for future exploration.

Please note that this article will not go into the general topic of
object-relational mapping. To learn more about object-relational mapping,
see the Resources
section.

Getting started

Sometimes I
prefer to start a project by modeling the data; other times I like to
begin by modeling the objects. For the purposes of this article, we'll
start with the data model. Castor comes with some JDO examples that
present a data model and an object model surrounding the notion of a
product. We'll work with one of those examples, taking it through several
different stages throughout the article. Figure 1 is an
entity-relationship (ER) diagram of the data model for our example. For
the sake of simplicity, it contains no explicit foreign keys. Notice,
however, that the ID references do exist between tables.

Castor's Java objects look a lot like JavaBeans components. Therefore,
an object typically contains a pair of accessor and mutator
(getter/setter) methods for each property (unless the property is mapped
to be accessed directly). More complex relationships can contain
additional logic for other purposes. Object-relational mapping can become
extremely complex very fast; however, these examples are pretty
straightforward in nature.

Listing 1 shows the source code for the Product object.
Note that the code contains a property for each column in the product
table, including the identity. For those new to object-relational mapping,
a property for the identity may seem odd, but this construct is actually
quite common and is used in other object-relational frameworks as well.
Internally, Castor uses the object identity for tracking objects.
Additionally, there are two java.util.Vectors for related
ProductDetail and Category objects. Also note
that the object contains a toString() method that can be
useful for logging purposes when debugging your application.

Looking at Figure
2, note that Product has different relationships to the
ProductDetail and Category objects:

One Product can contain many
ProductDetails. This is a one-to-many relationship.

Many Products can contain many Category
objects. This is a many-to-many relationship.

The methods addDetail() and addCategory() are
used to add objects to each of the java.util.Vectors to
manage their respective relationships.
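
The listing itself is not reproduced in this text, so what follows is only a minimal sketch of what such a class might look like. The name and price properties and the two placeholder classes are assumptions made for illustration; only the identity, the two Vectors, addDetail(), addCategory(), and toString() come from the description above.

```java
import java.util.Vector;

/** Placeholder for the detail object; its real properties are not shown here. */
class ProductDetail {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    @Override public String toString() { return "ProductDetail[" + name + "]"; }
}

/** Placeholder for the category object. */
class Category {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    @Override public String toString() { return "Category[" + name + "]"; }
}

/** JavaBeans-style class: one property per column, identity included. */
class Product {
    private int id;        // object identity
    private String name;   // assumed column
    private float price;   // assumed column

    // One Vector per "to-many" relationship.
    private Vector<ProductDetail> details = new Vector<>();
    private Vector<Category> categories = new Vector<>();

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public float getPrice() { return price; }
    public void setPrice(float price) { this.price = price; }

    public Vector<ProductDetail> getDetails() { return details; }
    public void addDetail(ProductDetail detail) { details.add(detail); }

    public Vector<Category> getCategories() { return categories; }
    public void addCategory(Category category) { categories.add(category); }

    @Override
    public String toString() {
        return "Product[id=" + id + ", name=" + name + ", price=" + price
                + ", details=" + details + ", categories=" + categories + "]";
    }
}

class ProductDemo {
    public static void main(String[] args) {
        Product product = new Product();
        product.setId(1);
        product.setName("laptop");
        product.setPrice(999.0f);

        ProductDetail detail = new ProductDetail();
        detail.setName("15-inch display");
        product.addDetail(detail);

        Category category = new Category();
        category.setName("computers");
        product.addCategory(category);

        System.out.println(product);
    }
}
```

The classes are package-private only so that the sketch fits in one file; in a real project each would normally be a public class in its own source file.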

The karma of the mapping descriptor

Just as some people believe that the key to life
is a proper understanding and acceptance of one's personal karma, I've
found that the key to working with Castor JDO is a proper understanding
and implementation of the mapping descriptor. The mapping descriptor
provides the connection (the map) between relational database tables and
the Java objects. The format of the mapping descriptor is XML -- and for
good reason. The other half of the Castor Project is Castor XML (see Resources).
Castor XML provides a superior Java-to-XML and XML-to-Java data binding
framework. Castor JDO makes use of Castor XML's ability to unmarshal an
XML document into a Java object model for the purposes of reading the
mapping descriptor.

Mapping objects and properties to elements and attributes

Each Java object is represented by a
<class> element and each property in that object is
represented by a <field> element. Additionally, each
column from within a relational table is represented by an
<sql> element. Listing 2 shows the mapping for the
Product object seen above. Take a look at the listing and
then we'll discuss some of the finer points of the code.

The <class> element supports some important
attributes and elements. For example, the mapping for Product
uses the identity attribute to indicate which property of the
object serves as the object identity. The <class>
element also supports a <map-to> element that tells
Castor to what relational table each object maps. The
<field> element supports some attributes as well.

Notice that the mapping for all the <field> and
<sql> elements contains a type attribute. Type
attributes indicate to Castor what TypeConvertor should be
used internally to convert between object and relational data types.

Defining relationships

A special case of the type attribute exists for each "to-many" relationship.
As previously mentioned, the two java.util.Vectors in
Product handle the one-to-many and many-to-many
relationships. The one-to-many relationship exists in the
<field> element called details. The information
in the details element tells Castor that the property is a
Collection of type java.util.Vector; that the
Collection contains objects of type
ProductDetail; and that it is required (that is,
cannot be null). The <sql> element tells Castor that
the <field>'s object mapping contains an SQL column
called prod_id and that this column should be used to identify the
one-to-many relationship.

The many-to-many relationship exists in the <field>
element called categories. The categories element tells Castor that
the property is a Collection of type
java.util.Vector; that the Collection contains
objects of type Category; and that it is required
(that is, it cannot be null). The <sql> element tells
Castor that this relationship makes use of an additional table called
category_prod. It also states that this
<field>'s object mapping contains an SQL column called
prod_id, and that this column should be used along with
category_id to identify the many-to-many relationship.

One more relationship exists between ProductGroup and
Product: many Products can belong to the same
ProductGroup. This is the same kind of relationship as the
one-to-many relationship between Product and
ProductDetail, but Product sits at the opposite end of it
(many-to-one), so only the "to-one" side of the relationship appears in
the mapping for Product.
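
To make the pieces discussed in this section concrete, here is a rough sketch of what a mapping for Product might look like. It is an assumption, not the original listing: the table and column names prod, prod_id, category_id, and category_prod and the myapp package come from this article, while the remaining names and the attribute spellings (collection, many-key, many-table) reflect my understanding of Castor's mapping format and should be checked against the Castor documentation. The descriptor is held in a Java string and parsed with the JDK's DOM parser, purely to confirm that it is well-formed and to list the field-to-column pairs; Castor itself reads the descriptor through Castor XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ProductMappingSketch {

    // Assumed mapping for Product. Only prod, prod_id, category_id and
    // category_prod come from the article text; the rest is illustrative.
    static final String MAPPING =
          "<mapping>\n"
        + "  <class name=\"myapp.Product\" identity=\"id\">\n"
        + "    <map-to table=\"prod\"/>\n"
        + "    <field name=\"id\" type=\"integer\">\n"
        + "      <sql name=\"id\" type=\"integer\"/>\n"
        + "    </field>\n"
        + "    <field name=\"name\" type=\"string\">\n"
        + "      <sql name=\"name\" type=\"char\"/>\n"
        + "    </field>\n"
        + "    <!-- to-one side of the ProductGroup relationship -->\n"
        + "    <field name=\"group\" type=\"myapp.ProductGroup\">\n"
        + "      <sql name=\"group_id\"/>\n"
        + "    </field>\n"
        + "    <!-- one-to-many: ProductDetail rows point back via prod_id -->\n"
        + "    <field name=\"details\" type=\"myapp.ProductDetail\"\n"
        + "           required=\"true\" collection=\"vector\">\n"
        + "      <sql many-key=\"prod_id\"/>\n"
        + "    </field>\n"
        + "    <!-- many-to-many through the category_prod join table -->\n"
        + "    <field name=\"categories\" type=\"myapp.Category\"\n"
        + "           required=\"true\" collection=\"vector\">\n"
        + "      <sql name=\"category_id\" many-table=\"category_prod\"\n"
        + "           many-key=\"prod_id\"/>\n"
        + "    </field>\n"
        + "  </class>\n"
        + "</mapping>\n";

    /** Parses the descriptor with the JDK's DOM parser (not with Castor). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Mapping is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        NodeList fields = parse(MAPPING).getElementsByTagName("field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            Element sql = (Element) field.getElementsByTagName("sql").item(0);
            // Plain columns use name; the one-to-many field only has many-key.
            String column = sql.hasAttribute("name")
                    ? sql.getAttribute("name") : sql.getAttribute("many-key");
            System.out.println(field.getAttribute("name") + " -> " + column);
        }
    }
}
```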

Inheritance in Castor

Castor makes use of two types of inheritance: Java
inheritance and relational inheritance. Listing 3 contains the mapping for
Computer. As you may recall from Figure
1, Computer is an extension of Product.

When a Java object simply extends a generic base class or implements an
interface, the inheritance need not be reflected in the mapping
descriptor. Because both Product and Computer
are mapped classes, however, the inheritance must be noted for Castor to
reflect it properly. The extends attribute of the
<class> element is used to represent the relationship
between Computer and Product. Note in Figure
1 that the comp (computer) table does not contain the columns from the
prod (product) table. Rather, the prod table contains the base information
and the comp table extends it.
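
The split between the two tables can be pictured with a small plain-Java sketch. It does not use Castor and is not how Castor works internally; it only shows which part of a Computer's state would belong to which table. The cpu property and the idea that comp carries the shared ID alongside its own columns are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Base class, assumed to map to the prod table. */
class Product {
    private int id;
    private String name;

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

/** Subclass, assumed to map to the comp table and to extend Product. */
class Computer extends Product {
    private String cpu;   // hypothetical subclass-only property

    public String getCpu() { return cpu; }
    public void setCpu(String cpu) { this.cpu = cpu; }
}

class InheritanceSketch {

    /** Column values that would be stored in the base table (prod). */
    static Map<String, Object> prodRow(Product p) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", p.getId());
        row.put("name", p.getName());
        return row;
    }

    /** Column values that would be stored in the extending table (comp). */
    static Map<String, Object> compRow(Computer c) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", c.getId());    // assumed shared identity linking comp to prod
        row.put("cpu", c.getCpu());  // state that only the subclass has
        return row;
    }

    public static void main(String[] args) {
        Computer computer = new Computer();
        computer.setId(1234);
        computer.setName("laptop");
        computer.setCpu("example-cpu");

        System.out.println("prod: " + prodRow(computer));
        System.out.println("comp: " + compRow(computer));
    }
}
```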

Dependent versus related relationships

Castor distinguishes the relationship of two
objects as either dependent or related, and maintains
different life cycles for each of the two types of relationship. Up to
this point, we've only discussed independent (or related) objects.
Independent objects are those objects that do not have the depends
attribute specified in the <class> element of the
mapping descriptor. If you want to perform any CRUD (create, read, update,
delete) operation on an independent object, you can do so directly on the
object.

Listing 4 contains the mapping for the dependent object
ProductDetail. Notice that the <class> element contains
the depends attribute. This tells Castor that
ProductDetail is dependent upon Product for all
operations. In this case, Product is the master object and
ProductDetail is the dependent object. If you want to perform
any CRUD operations on the ProductDetail object, you can do
so only through its master object. That is, you must first fetch a
Product and navigate to ProductDetail using the
accessor method. Each dependent object may have only one master
object.

Performing queries on the object model

Castor provides an implementation of a subset of the
ODMG 3.0 specification for the Object Query Language (OQL). The syntax for
OQL is similar to that of SQL but it lets you query the object model
rather than directly querying the database. This can be a powerful asset
when you're supporting more than one database. Internally, Castor's OQL
implementation translates the OQL query into the appropriate SQL for the
database. Parameters are bound to a query using the bind()
method. Below are some simple examples of OQL queries.

Rather than continue to use a fully qualified object name throughout a
query, Castor's OQL implementation supports the use of aliases for
objects. In the queries below, c is one such alias.

If you wanted to query for all Computers, you would
execute the following query:

SELECT c FROM myapp.Computer c

If you wanted to query for a Computer whose ID equaled
1234, you would start with:

SELECT c FROM myapp.Computer c WHERE c.id= $1

followed by:

query.bind( 1234 )

To query for a Computer whose name is like a particular
string, you would execute the following query:

SELECT c FROM myapp.Computer c WHERE c.name LIKE $1

followed by:

query.bind( "%abcd%" )

To query for a Computer whose ID fell within a list of
IDs, you would execute the following query:

SELECT c FROM myapp.Computer c WHERE c.id IN LIST ( $1, $2, $3 )

followed by:

query.bind( 97 )
query.bind( 11 )
query.bind( 7 )
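
In the examples above, each bind() call supplies the value for the next numbered placeholder, in order. The toy class below makes that ordering visible. ToyQuery is a hypothetical stand-in written for this illustration, not Castor's query class, and it splices the values into the query text only so the result can be printed; a real query engine would typically pass the values to the database separately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy stand-in for a parameterized query; NOT part of Castor. */
class ToyQuery {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$(\\d+)");

    private final String oql;
    private final List<Object> values = new ArrayList<>();

    ToyQuery(String oql) {
        this.oql = oql;
    }

    /** Each call supplies the value for the next placeholder: $1, then $2, ... */
    ToyQuery bind(Object value) {
        values.add(value);
        return this;
    }

    /** Renders the query with the bound values inlined, for illustration only. */
    String render() {
        Matcher matcher = PLACEHOLDER.matcher(oql);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            int index = Integer.parseInt(matcher.group(1));
            if (index < 1 || index > values.size()) {
                throw new IllegalStateException("No value bound for $" + index);
            }
            Object value = values.get(index - 1);
            String text = (value instanceof String)
                    ? "\"" + value + "\"" : String.valueOf(value);
            matcher.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}

class ToyQueryDemo {
    public static void main(String[] args) {
        ToyQuery query = new ToyQuery(
                "SELECT c FROM myapp.Computer c WHERE c.id IN LIST ( $1, $2, $3 )");
        query.bind(97);   // becomes $1
        query.bind(11);   // becomes $2
        query.bind(7);    // becomes $3
        System.out.println(query.render());
    }
}
```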

Further details on the Castor OQL implementation are beyond the scope
of this article. To learn more, see the references in the Resources
section.

Transactions in Castor

Castor's persistence operations take place within the
context of a transaction. However, Castor is not a transaction manager.
Rather, it is much more of a cache manager, in that it utilizes the
atomicity of transactions to persist objects to a database. This setup
allows the application to more easily commit or roll back any changes to an
object graph. An application running in a non-managed environment must
explicitly commit or roll back transactions. When running in a managed
environment, the application server can manage the transactional
contexts.

Short transactions versus long transactions

Normal transactions in Castor are referred to
as short transactions. Castor also provides the notion of long
transactions, which are made up of two short transactions. We can use
a typical Web application to understand the difference between the two
types of transaction. In the case of an application that calls for data to
be read from a database, displayed to a user, and then committed to the
database, we're looking at a potentially lengthy transaction time (which
is typical for Web applications). But a write lock in the database can't
be held indefinitely. To work around this, Castor utilizes two short
transactions. For example, the first short transaction materializes
objects using a read-only query, which lets you display the objects
without leaving the transaction open. If the user makes changes, a second
short transaction is started and the update() method brings
the changed objects into the transaction's context. Listing 5 is an
example of a long transaction.

Because the time interval between the first short transaction and the
second short transaction is arbitrary, you'll have to perform a dirty
check to determine whether the object has been changed in the database.
Dirty checking is used to verify that an object has not been modified
during a long transaction. You can enable dirty checking by specifying the
dirty attribute in the <sql> element of the
mapping descriptor. For this to work properly, each object must hold a
timestamp. If you implement the TimeStampable callback
interface, Castor will set the timestamp in the first short transaction
and check it during the second short transaction.
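
The idea behind the timestamp check can be shown with a small simulation. It does not use Castor: ToyStore and StampedProduct are hypothetical classes invented for this sketch, with load() playing the role of the first short transaction and update() the role of the second. The point is only that an update is rejected when the stored timestamp no longer matches the one the object was read with.

```java
import java.util.HashMap;
import java.util.Map;

/** Detached object that remembers the timestamp it was loaded with. */
class StampedProduct {
    final int id;
    String name;
    long timestamp;   // recorded during the first short transaction

    StampedProduct(int id, String name, long timestamp) {
        this.id = id;
        this.name = name;
        this.timestamp = timestamp;
    }
}

/** Toy in-memory "database"; stands in for the real table and its bookkeeping. */
class ToyStore {
    private static final class Row {
        String name;
        long timestamp;
        Row(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }
    }

    private final Map<Integer, Row> rows = new HashMap<>();
    private long clock = 0;   // logical clock, bumped on every write

    void insert(int id, String name) {
        rows.put(id, new Row(name, ++clock));
    }

    /** First short transaction: hand out a copy together with its timestamp. */
    StampedProduct load(int id) {
        Row row = rows.get(id);
        return new StampedProduct(id, row.name, row.timestamp);
    }

    /** Second short transaction: accept the change only if nobody wrote in between. */
    void update(StampedProduct product) {
        Row row = rows.get(product.id);
        if (row.timestamp != product.timestamp) {
            throw new IllegalStateException(
                    "Product " + product.id + " was modified since it was read");
        }
        row.name = product.name;
        row.timestamp = ++clock;
        product.timestamp = row.timestamp;
    }

    String nameOf(int id) {
        return rows.get(id).name;
    }
}

class LongTransactionDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.insert(1, "laptop");

        // Two users read the same product, then edit it at their leisure.
        StampedProduct first = store.load(1);
        StampedProduct second = store.load(1);

        first.name = "notebook";
        store.update(first);          // timestamps match, so this succeeds

        second.name = "portable";
        try {
            store.update(second);     // stale timestamp, so this is rejected
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        System.out.println("Stored name: " + store.nameOf(1));
    }
}
```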

The main
feature of Castor JDO is that it provides an API for binding Java
objects to relational databases. Castor JDO supports the following
databases:

DB2

HypersonicSQL

Informix

InstantDB

Interbase

MySQL

Oracle

PostgreSQL

SAP DB

SQL Server

Sybase

Generic (for generic JDBC support)

Adding support for additional databases is fairly easy. See Resources
for further information.

Conclusion

In this
article, we've discussed the basics of mapping a relational data model to
a Java object model using Castor. The examples used in this article
closely follow the examples currently provided with the Castor source
code. By no means do these simple examples cover the breadth of Castor's
capabilities. In fact, Castor supports many other features, including key
generation, lazy loading, an LRU cache, different locking/access modes,
and much more. If you want to learn more about these features and many
others, consult the Resources
section.

Castor JDO is just one of many data-binding solutions available in the
open source world today. As this article has shown, Castor provides a
sophisticated alternative to in-house solutions, which generally require
you to maintain different SQL and JDBC code for each database you support.
Furthermore, because it is an open source, community-based project, Castor
is constantly improving. I encourage you to check out the technology, and
maybe even become a part of the Castor community.


About the author

Bruce Snyder lives in the Denver, Colorado,
metropolitan area, where he has worked at various startups
implementing J2EE and related solutions over the last few years.
Currently, Bruce is a senior software engineer at DigitalGlobe, a
satellite imagery and information company located in Longmont,
Colorado. Bruce's experience with Java technology and J2EE runs the
gamut from system design and architecture to implementation and
testing. After working with Castor and object-relational data
binding for nearly two years, Bruce recently became lead developer
of the Castor JDO team. You can contact him at ferret@frii.com.