Simple Object Persistence with the db4o Object Database

Many Java applications need to deal with persistent data. In most cases,
this means interfacing with a relational database, possibly a legacy database
or an industry standard Database Management System (DBMS). The JDBC API and
drivers for most database systems provide a standard way of using SQL to execute database queries.
However, the interface is complicated by the "impedance mismatch" between
the domain object model of the application and the relational model of the
database. The object model is based on software engineering principles and
models the objects in the problem domain, while the relational model is based
on mathematical principles and organizes data for efficient storage and retrieval.
Neither model is particularly better than the other, but the problem
is that they are different and do not always sit together comfortably
in the same application.

Some solutions to this problem, such as Hibernate and
Java Data Objects, are designed to
provide the developer with transparent
persistence: the application deals with persistent objects using
an object-oriented API without the need for SQL code to be embedded in the
Java code. Container Managed Persistence (CMP) does a similar job for EJB
containers, but it is not a general persistence facility for the Java platform.
In any of these solutions, the objects are mapped to tables in a Relational
DBMS (RDBMS) by the underlying framework, which
generates the SQL required to store and retrieve object attributes. The more
complex the object model, the more difficult the mapping. Descriptors, usually
XML files, need to be created to define the mappings.
Inheritance and many-to-many relationships, in particular, add complexity, as these
relationships cannot be directly represented in the relational model. Inheritance
hierarchies can be mapped onto a set of tables in different ways; the choice results in a tradeoff between storage efficiency and query complexity.
A many-to-many relationship, in turn, requires a separate join table.

Storing objects in a database that itself uses an object model offers another
solution. A variety of Object Oriented DBMS (OODBMS) products were developed,
particularly during the 1990s, but such tools can be complex to configure and can
require the use of an object definition language. The objects are stored as objects,
but they are not native to the application language.
These products have not had a major impact on the market
outside some niche areas, and effort appears now to be mainly concentrated
on object-oriented APIs for relational databases as well as on hybrid object-relational
databases.

Embedded Databases

In some applications the overhead of an industrial strength DBMS is unnecessary,
and data storage requirements are better provided with a small-footprint, embeddable
database engine. SQLite, for example, provides
a self-contained SQL database engine.
But since the interface with Java is through a JDBC driver, such SQL-based solutions
are still affected by the impedance mismatch.

In many cases persistence could be achieved more simply using an embedded
object database engine. This is a good time to look at db4o.
Created by Carl Rosenberger, db4o was at one time only
available commercially, but now it is open source and has recently been
licensed under the GPL.

db4o has some interesting features:

No impedance mismatch: objects are stored as they are

Automatic management of the database schema

No changes to classes to make them storable

Seamless Java (or .NET) language binding

Automated data bindings

Installation by adding a single 250KB library file (Java jar or .NET DLL)

What Is It Good For?

db4o has been chosen for applications in embedded systems in which zero administration, reliability, and
low footprint are critical features.
In Germany, BMW Car IT, for example, uses it in an embedded car electronics prototype.
Die Mobilanten, also in Germany, uses
db4o in a PDA-based solution for mid-sized utilities.
In the U.S., Massie Systems' retinal imaging system for infant eye diagnosis relies on db4o to
power its client imaging session database.

The sheer simplicity of the way in which you store objects with db4o is also
attractive for teaching purposes. The University of Essex and Texas A&M
University use db4o for teaching and research purposes. At my own college,
for students learning how to apply object-oriented concepts in their projects,
the need to interface with a relational database can have a negative influence
on their approach to design of their domain models. Using db4o allows them
to work with persistent data without the distraction of conflicting data models
and without the need to spend a significant amount of time learning to use
a tool such as Hibernate or a complex OODBMS. Also, learning about
the concepts of an object-oriented query API may prove useful in the future.

Same API, Different Storage

Sometimes you just have to use a relational database. From the Java developer's
point of view, transparent persistence is the ideal. If persistence is implemented
through an object-oriented API, then the developer does not have to learn
different techniques to make use of different kinds of data stores. Although
db4o is not JDO-compliant (it is easier to use as a result), its creators
have partnerships with other open source projects including
MySQL and Hibernate and are
working toward a single, consistent object persistence API that will interface
with object databases including db4o itself, relational databases, and alternative
storage schemes such as Prevayler.
If doing things the JDO way is important to you, then you might look at
ObjectDB, which is a JDO-compliant pure object database.

An Example

This example shows how simple it is to create a database and store objects.
It also illustrates two query methods: Query-By-Example (QBE) and the
more flexible S.O.D.A. query API. The full source code of this example is available in the resources section below.
To run it you need to add the db4o JAR file to your classpath and run the
main class Db4oTest.

The two classes in the example represent a baseball Team and
a Player. To make things a bit
more interesting, we also have a Pitcher class. Pitcher is
a subclass of
Player and adds one extra field on top of the ones it inherits. Team has
an attribute that is a list of Player objects, which can, of course,
include Pitcher
objects. Team, Player, and Pitcher objects
are "Plain Old Java Objects" with no persistence code. No unique key attributes are required as an
object database automatically stores objects with unique object identifiers (OIDs).
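
As a rough illustration of what such persistence-free classes might look like, here is a minimal sketch. The field names battingAverage, wins, team, and won are the ones mentioned in this article; the name fields, the constructors, the addPlayer helper, and the sample data are assumptions made for the sketch and are not taken from the original example source.

```java
import java.util.ArrayList;
import java.util.List;

// Plain Java objects: no persistence interfaces, no key fields, no mapping files.
class Team {
    String name;                               // assumed field
    int won;                                   // games won
    List<Player> players = new ArrayList<>();  // may hold Player and Pitcher objects

    Team(String name, int won) {
        this.name = name;
        this.won = won;
    }

    // Hypothetical helper that keeps both directions of the relationship in step.
    void addPlayer(Player p) {
        players.add(p);
        p.team = this;
    }
}

class Player {
    String name;            // assumed field
    double battingAverage;
    Team team;              // back-reference to the owning Team

    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }

    @Override
    public String toString() {
        return name + ":" + battingAverage;
    }
}

// Pitcher inherits everything from Player and adds one extra field.
class Pitcher extends Player {
    int wins;

    Pitcher(String name, double battingAverage, int wins) {
        super(name, battingAverage);
        this.wins = wins;
    }

    @Override
    public String toString() {
        return super.toString() + ":" + wins;
    }
}

class ModelDemo {
    public static void main(String[] args) {
        Team team = new Team("Example City", 90);   // made-up sample data
        team.addPlayer(new Player("Adams", 0.310));
        team.addPlayer(new Pitcher("Baker", 0.120, 12));
        for (Player p : team.players) {
            System.out.println(p);
        }
    }
}
```

Nothing in these classes refers to a database, which is the point: the same objects can be used in memory or handed to the object database unchanged.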

Note that we can retrieve all objects of the Player class and
its subclasses (just
Pitcher in this example) without any extra effort. The Pitcher
objects can be recognized in the output because they have the extra wins attribute
listed. With a relational database we
would have had to decide how to map the inheritance tree to tables and possibly
have had to join tables to retrieve all the attributes of all the objects.
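
The idea behind Query-By-Example can be sketched in plain Java without the database. The following is an in-memory stand-in, not db4o code: it assumes that a template field left at its default value (null or zero) acts as a wildcard, and that a template of a given class also matches instances of its subclasses. The class QbeSketch, its methods, and the sample data are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class Player {
    String name;            // assumed field
    double battingAverage;
    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }
}

class Pitcher extends Player {
    int wins;
    Pitcher(String name, double battingAverage, int wins) {
        super(name, battingAverage);
        this.wins = wins;
    }
}

// Hypothetical in-memory illustration of query-by-example matching.
class QbeSketch {
    // A template field holding null, zero, or false is treated as "don't care".
    static boolean isDefault(Object v) {
        if (v == null) return true;
        if (v instanceof Number) return ((Number) v).doubleValue() == 0.0;
        if (v instanceof Boolean) return !((Boolean) v);
        return false;
    }

    // True if candidate is of the template's class (or a subclass) and agrees
    // with every non-default field of the template.
    static boolean matches(Object template, Object candidate) {
        if (!template.getClass().isInstance(candidate)) return false;
        try {
            for (Class<?> c = template.getClass(); c != Object.class; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers())) continue;
                    f.setAccessible(true);
                    Object wanted = f.get(template);
                    if (!isDefault(wanted) && !wanted.equals(f.get(candidate))) return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    static List<Object> queryByExample(List<Object> store, Object template) {
        List<Object> result = new ArrayList<>();
        for (Object o : store) {
            if (matches(template, o)) result.add(o);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> store = new ArrayList<>();
        store.add(new Player("Adams", 0.310));        // made-up sample data
        store.add(new Pitcher("Baker", 0.120, 12));
        // An all-default Player template matches players and pitchers alike.
        System.out.println(queryByExample(store, new Player(null, 0.0)).size());
    }
}
```

A template built from the superclass picks up subclass instances with no extra work, which is the behaviour described above; a Pitcher template would narrow the result to pitchers only.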

Updating and deleting

Updating objects can be achieved using a combination of the above techniques.
The following code assumes that only one match is found, and the matching object
is cast to Player so that its attributes can be modified.

More powerful query support

One of the major drawbacks of early versions of db4o was that QBE provided
fairly limited querying capability. For example, you couldn't run a query
like "all players
with batting average greater than .300". db4o now includes the S.O.D.A.
API to provide querying that comes much closer to the power of SQL. An instance
of the Query class represents a node in
a query criteria graph to which constraints can be applied. A node can represent
a class, multiple classes, or a class attribute.

The following code demonstrates how to do the query described in the previous
paragraph. We define a query graph node and constrain it to the
Player class. This means that the query will only return Player objects.
We then descend the graph to find a node representing an attribute named “battingAverage” and
constrain this to be greater than 0.3. Finally, the query is executed to return
all objects in the database that match the constraints.
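
To make the notion of a query graph more concrete, here is a small in-memory model of the steps just described: constrain the root node to a class, descend to an attribute node, and constrain that node with a comparison. It is a sketch under stated assumptions, not the db4o API; the Node class, its method names, and the sample data are hypothetical, and it evaluates against a plain list rather than a database.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class Player {
    String name;            // assumed field
    double battingAverage;
    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }
}

// Hypothetical stand-in for a node in a query criteria graph.
class Node {
    private static final Object MISSING = new Object();
    private final List<Predicate<Object>> tests = new ArrayList<>();
    private final Map<String, Node> children = new LinkedHashMap<>();

    // Constrain the value at this node to instances of the given class.
    Node constrainClass(Class<?> type) {
        tests.add(type::isInstance);
        return this;
    }

    // Constrain the value at this node to numbers greater than the bound.
    Node greaterThan(double bound) {
        tests.add(v -> v instanceof Number && ((Number) v).doubleValue() > bound);
        return this;
    }

    // Move to (or create) the child node for the named attribute.
    Node descend(String field) {
        return children.computeIfAbsent(field, f -> new Node());
    }

    // Return every stored object that satisfies this node and all nodes below it.
    List<Object> execute(List<Object> store) {
        List<Object> result = new ArrayList<>();
        for (Object o : store) {
            if (accepts(o)) result.add(o);
        }
        return result;
    }

    private boolean accepts(Object value) {
        for (Predicate<Object> t : tests) {
            if (!t.test(value)) return false;
        }
        for (Map.Entry<String, Node> e : children.entrySet()) {
            Object child = read(value, e.getKey());
            if (child == MISSING || !e.getValue().accepts(child)) return false;
        }
        return true;
    }

    // Look the field up in the object's class and then in its superclasses.
    private static Object read(Object owner, String name) {
        if (owner == null) return MISSING;
        for (Class<?> c = owner.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(name);
                f.setAccessible(true);
                return f.get(owner);
            } catch (NoSuchFieldException e) {
                // not declared here; try the superclass
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return MISSING;
    }
}

class QueryGraphDemo {
    static List<Object> goodHitters(List<Object> store) {
        Node query = new Node().constrainClass(Player.class);
        query.descend("battingAverage").greaterThan(0.3);
        return query.execute(store);
    }

    public static void main(String[] args) {
        List<Object> store = new ArrayList<>();
        store.add(new Player("Adams", 0.310));   // made-up sample data
        store.add(new Player("Clark", 0.280));
        System.out.println(goodHitters(store).size());
    }
}
```

In this model, an object that lacks the attribute a query descends to simply fails to match, so descending to an attribute implicitly restricts the result to classes that have it.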

At first glance, this is similar to an SQL query like this:

SELECT * FROM players WHERE battingAverage > 0.3

However, the design of the Player class allows inverse relationships
to be created between Team and Player objects, as
shown in the test data. A Team has a reference to a list of Player objects,
while each
Player has a reference to a Team. This means that
the result of this query contains Player and Team objects. The
code below demonstrates this:
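
The original listing is not reproduced here. As a stand-in, the following sketch shows the same navigation over in-memory objects: each matching Player leads directly to its Team through an ordinary reference. The name fields, the addPlayer helper, and the sample data are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Team {
    String name;                               // assumed field
    List<Player> players = new ArrayList<>();

    Team(String name) {
        this.name = name;
    }

    // Hypothetical helper that sets both sides of the inverse relationship.
    void addPlayer(Player p) {
        players.add(p);
        p.team = this;
    }
}

class Player {
    String name;            // assumed field
    double battingAverage;
    Team team;              // inverse reference back to the Team

    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }
}

class NavigationSketch {
    // For every player batting above .300, follow the reference to the team.
    static List<String> hittersWithTeams(List<Player> players) {
        List<String> lines = new ArrayList<>();
        for (Player p : players) {
            if (p.battingAverage > 0.3) {
                lines.add(p.name + " plays for " + p.team.name);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Team team = new Team("Example City");    // made-up sample data
        team.addPlayer(new Player("Adams", 0.310));
        team.addPlayer(new Player("Clark", 0.280));
        for (String line : hittersWithTeams(team.players)) {
            System.out.println(line);
        }
    }
}
```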

This worked because the inverse relationship was designed into the object
model. Object databases are navigational: you can only retrieve data
following the direction of predefined relationships. Relational databases,
on the other hand, have no directionality in their table joins and therefore allow
more flexibility for ad-hoc queries. However, given the right object relationships,
related objects can be retrieved from the object database with very little
programming effort. The database model and the application object model are
identical, so there is no need for the programmer to think differently about
the data. If you can get the Team for
a given Player
when the objects are in memory, you can do the same from the database.

What else can S.O.D.A. do?

SQL allows results to be sorted in order; S.O.D.A. does too. This example
shows how the stored Player objects can be retrieved in order
of battingAverage. (It's pretty obvious which ones are the pitchers now!)
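
For comparison, this is how the same ordering looks over in-memory objects using a standard Comparator. It is only an analogue of the database query, written as a sketch; the name field, the SortSketch class, and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class Player {
    String name;            // assumed field
    double battingAverage;
    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }
}

class Pitcher extends Player {
    int wins;
    Pitcher(String name, double battingAverage, int wins) {
        super(name, battingAverage);
        this.wins = wins;
    }
}

class SortSketch {
    // Return a copy of the list in ascending order of batting average.
    static List<Player> byBattingAverage(List<Player> players) {
        List<Player> sorted = new ArrayList<>(players);
        sorted.sort(Comparator.comparingDouble((Player p) -> p.battingAverage));
        return sorted;
    }

    public static void main(String[] args) {
        List<Player> players = new ArrayList<>();
        players.add(new Player("Adams", 0.310));        // made-up sample data
        players.add(new Pitcher("Baker", 0.120, 12));
        players.add(new Player("Clark", 0.280));
        for (Player p : byBattingAverage(players)) {
            System.out.println(p.name + " " + p.battingAverage);
        }
    }
}
```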

S.O.D.A. allows more complex queries to be defined using code that is quite
simple once you get past the temptation to think the relational way. To set
constraints you just navigate around the query graph to find the classes or
attributes you want to put conditions on. The query graph is closely
related to the domain object model, which should (hopefully) be well understood
by the developer. On the other hand, to achieve a similar result with SQL you
need to take account of how the domain objects have been mapped to relational
tables.

This example shows how to set conditions on two attributes of the Player
class to find players with batting average above .130
who are also pitchers with more than 5 wins. Again, we define a query graph
node and constrain it to the
Player class. We then descend the graph to find a node representing
the attribute named “battingAverage” and
constrain this to be greater than 0.13. The result of this is a Constraint object.
To set the next constraint, we descend to find the node representing the attribute "wins";
this in itself means that the query will only find Pitcher objects.
This node is constrained to be greater than 5, and this is combined using a
logical "AND" with the first Constraint object.
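
The way two constraints are built separately and then joined with a logical AND can be mirrored in memory with java.util.function.Predicate. This is an analogue rather than the S.O.D.A. API itself; the class ConjunctionSketch, the name field, and the sample data are assumptions for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Player {
    String name;            // assumed field
    double battingAverage;
    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }
}

class Pitcher extends Player {
    int wins;
    Pitcher(String name, double battingAverage, int wins) {
        super(name, battingAverage);
        this.wins = wins;
    }
}

class ConjunctionSketch {
    // First constraint: batting average above .130.
    static final Predicate<Player> BATTING = p -> p.battingAverage > 0.13;

    // Second constraint: more than 5 wins. Only Pitcher has a wins
    // attribute, so this test can only succeed for pitchers.
    static final Predicate<Player> WINS =
            p -> p instanceof Pitcher && ((Pitcher) p).wins > 5;

    static List<Player> find(List<Player> players) {
        Predicate<Player> both = BATTING.and(WINS);   // logical AND of the two
        List<Player> result = new ArrayList<>();
        for (Player p : players) {
            if (both.test(p)) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Player> players = new ArrayList<>();
        players.add(new Player("Adams", 0.310));        // made-up sample data
        players.add(new Pitcher("Baker", 0.120, 12));
        players.add(new Pitcher("Davis", 0.150, 8));
        for (Player p : find(players)) {
            System.out.println(p.name);
        }
    }
}
```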

The last example shows how to combine conditions on attributes of different
classes to find players with batting average above .300
who are in teams with more than 92 wins. The easiest way to do this is
to start with Player, and then
navigate to Team. We descend to
find the "battingAverage" node as before and set a Constraint.
We then descend to find the "team" attribute. As this attribute is of type Team,
the node represents the Team class, so we can descend again to
the node representing the "won" attribute of Team and
set a constraint on that. Finally, we combine this with the first
Constraint.
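
The same two-step navigation, from Player to its team and then to the team's won attribute, can be written over in-memory objects as follows. Again this is an analogue under stated assumptions, not db4o code: the name fields, the addPlayer helper, the CrossClassSketch class, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Team {
    String name;                               // assumed field
    int won;
    List<Player> players = new ArrayList<>();

    Team(String name, int won) {
        this.name = name;
        this.won = won;
    }

    // Hypothetical helper that sets both sides of the relationship.
    void addPlayer(Player p) {
        players.add(p);
        p.team = this;
    }
}

class Player {
    String name;            // assumed field
    double battingAverage;
    Team team;

    Player(String name, double battingAverage) {
        this.name = name;
        this.battingAverage = battingAverage;
    }
}

class CrossClassSketch {
    // Constraint on an attribute of Player itself.
    static final Predicate<Player> HITS_WELL = p -> p.battingAverage > 0.3;

    // Constraint reached by navigating Player -> team -> won.
    static final Predicate<Player> ON_WINNING_TEAM =
            p -> p.team != null && p.team.won > 92;

    static List<Player> find(List<Player> players) {
        Predicate<Player> both = HITS_WELL.and(ON_WINNING_TEAM);
        List<Player> result = new ArrayList<>();
        for (Player p : players) {
            if (both.test(p)) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        Team strong = new Team("Example City", 95);   // made-up sample data
        Team weak = new Team("Sample Town", 80);
        strong.addPlayer(new Player("Adams", 0.310));
        weak.addPlayer(new Player("Ford", 0.320));
        List<Player> all = new ArrayList<>(strong.players);
        all.addAll(weak.players);
        for (Player p : find(all)) {
            System.out.println(p.name + " (" + p.team.name + ")");
        }
    }
}
```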

Conclusion

A small-footprint, embeddable object database offers a very simple, compact
route to object persistence. db4o is now an open source object database that
offers a range of attractive features and supports both Java and .NET. The
simplicity of installation and use as well as the lack of an impedance mismatch between
object and data models make db4o very useful in a range of business and educational applications.