Time for an Object-Oriented Database?

The computing world appears fixated on relational databases as a persistence technology. To most, a database such as Oracle, MySQL, or PostgreSQL is an obvious choice, even if there’s an “impedance mismatch” — conflicting philosophies, structures, and interfaces — between the tenets of so many object-oriented programming languages and traditional, tabular schemas.

But relational database management systems (RDBMS) aren’t the only game in town. Indeed, a growing number of Linux and Windows developers are turning to a more synergistic alternative: the object-oriented database. Among other benefits, an object-oriented database stores objects “as-is.” There’s no object-to-relational translation layer.

When the programming language and the database are of the same mind — think “objects” — persistence and retrieval of even the most complex objects can be as simple as three or four lines of code.

One of the most popular object-oriented databases (OOD) is db4o. db4o is open source, is available as Java or C# source code, and can be packaged as a single .jar file for Java programmers or a single .dll for .NET developers. Consequently, db4o is easily integrated and distributed with Java and .NET applications.

Better yet, the db4o application programming interface (API) isn’t overrun with complex classes and legions of methods. Most interactions occur through the db4o ObjectContainer class, which represents the actual database. The ObjectContainer interface defines only ten methods, yet provides the bulk of all common database manipulation capabilities, including inserts, searches, updates, and deletes. With db4o, you can master database coding in a surprisingly short amount of time. And if you need advanced features, you can instantiate an “extended” ObjectContainer, which exposes more methods to provide more control over the internal parameters and machinations of the database engine.

Of course, being object-oriented gives db4o some significant advantages that should not be ignored. First and foremost, objects are stored “as-is”, without any object-to-relational translation layer, either explicit or invisible. Database manipulations are performed by method calls, not strings of SQL that are passed to an SQL interpretation and execution engine. You don’t have to create and maintain schema definitions for mapping object structures to relational tables. Nor do you have to pollute your object definitions with ‘foreign key’ fields in order to model object relationships across a set of relational tables. In short, your objects’ structure is the database schema.

And, last but certainly not least, db4o is fast. Because the Java and .NET versions of db4o are written in the native languages of those platforms, there are no language or runtime “boundaries” to cross (from process to process, say) as the database executes. For comparison, Table One shows the results of recent benchmarking of db4o and some well-known RDBMS packages.

Each number in Table One indicates the factor by which db4o outperforms a popular relational database system for a given database operation. A factor of 1.0 means the database performs on par with db4o. Larger numbers indicate how much better db4o performs, and a factor below 1.0 means the relational database was faster. (These numbers were taken from the Pole Position open-source database benchmark. Further information can be found at http://www.db4objects.com/about/productinformation/benchmarks.)

As indicated, db4o easily holds its own against such systems in a majority of database operations.

TABLE ONE: Performance of a number of relational databases as compared to db4o (a factor of 1.0 means the database performs on par with db4o, while larger numbers indicate how much better db4o performs)

Driver/Database    Read       Write    Query      Delete
JDBC/MySQL         10.8       14.6     1.7        6.5
JDBC/HSQLDB        0.4        1.7      677.8      0.7
JDBC/Derby         3,696.0    12.9     1,299.7    7.1

Kicking the Tires

db4o’s refreshingly terse API is best illustrated via an example. Suppose you’ve written a monitoring package for systems on your local intranet. Computers being monitored run a small client stub that communicates with a master control server over TCP/IP. From the control server, you can choose to collect data from any client. This data is a stream of samples taken at a fixed interval for a chosen number of data points. Data is gathered in the context of a monitoring session.

Given that specification, you might define a class to model such a session like the code in Listing One (in Java, although the implementation can just as easily be realized in C#). (For brevity, Listing One doesn’t show the methods associated with MonitoringSession, nor does it show how the data points are collected.)
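Since the listing itself is not reproduced here, the following is only a minimal sketch of what such a session class might look like. The field names (sessionName, parameter, intervalSeconds, sampleCount) and the values of the PARAM_ constants are assumptions made for illustration.

```java
// Hypothetical sketch: the field names and constant values below are
// assumptions for illustration, not the contents of Listing One.
class MonitoringSession {
    // Assumed codes identifying which statistic a session collected.
    static final int PARAM_CPU = 1;
    static final int PARAM_DISKIO = 2;

    private String sessionName;  // for example "10-13-05MORNING"
    private int parameter;       // one of the PARAM_ codes above
    private int intervalSeconds; // fixed time between samples
    private int sampleCount;     // chosen number of data points

    MonitoringSession(String sessionName, int parameter,
                      int intervalSeconds, int sampleCount) {
        this.sessionName = sessionName;
        this.parameter = parameter;
        this.intervalSeconds = intervalSeconds;
        this.sampleCount = sampleCount;
    }

    String getSessionName() { return sessionName; }
    int getParameter() { return parameter; }
    int getIntervalSeconds() { return intervalSeconds; }
    int getSampleCount() { return sampleCount; }
}
```

Note that the class is an ordinary Java class: it extends nothing and implements no persistence interface.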

Now suppose that data for a session has been gathered on the control server, and you want to record it — in other words, you want to store the associated MonitoringSession object in the database. You can do this easily in db4o using the code in Listing Two:

// Put the MonitoringSession in the database
msessionDB.set(_msession);

// The set() method call begins a transaction.
// We must commit() the transaction before
// closing the database.
msessionDB.commit();
msessionDB.close();

(Listing Two assumes that a global static or enumeration called PARAM_DISKIO has been defined to indicate that disk I/O statistics have been collected for this particular session.)

Storing the object requires a simple call to an instantiated ObjectContainer’s set() method, with a reference to the object as the sole input parameter. db4o automatically (and invisibly) begins a transaction the first time you make a call that modifies the database. The call to commit() terminates the transaction and persists the changes, ensuring that modifications won’t be lost should a system failure occur before the database is closed.

With no real effort, db4o lets you work with unmodified “plain old Java objects” (or “POJOs”). Classes that you intend to store in the database need not be derived from a persistence-aware superclass and do not have to implement the Serializable interface. Using Java reflection, db4o explores the structure of the data members of the class and properly persists the object regardless of its structure.

Retrieving a MonitoringSession object is almost as easy as storing it, thanks to db4o’s “query by example” (QBE) technique for locating persistent objects. With db4o’s QBE, you provide the query engine with a “template” object, which db4o uses to locate the search target. The template object is instantiated from the same class as the target, and has fields populated with values that you want matched by the query.

For example, to fetch the MonitoringSession object stored in the preceding code snippet (the 10-13-05MORNING monitoring session), you’d use code like Listing Three:

In the code, zeros or nulls (depending on the data type) are inserted into those template object fields that you don’t want to be considered in the query. (This, of course, reveals one of the limitations of db4o’s version of QBE: you cannot query for objects that have a numeric field equal to zero. Happily, in this example, this isn’t an issue, and db4o provides more sophisticated querying techniques that are more flexible, as you’ll see shortly.)
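The matching rule described above can be illustrated with a toy matcher written against plain Java reflection. This is a sketch of the idea only, not db4o’s query engine: the class QbeSketch, its Session class, and the method queryByExample() are hypothetical names, and an in-memory list stands in for the database.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Illustration only: a toy query-by-example matcher, not db4o's engine.
class QbeSketch {
    // Hypothetical stored class with one String field and one int field.
    static class Session {
        String name;
        int interval;
        Session(String name, int interval) {
            this.name = name;
            this.interval = interval;
        }
    }

    // A template field is ignored when it holds null, a numeric zero,
    // false, or the zero character.
    static boolean isDefault(Object value) {
        if (value == null) return true;
        if (value instanceof Number) return ((Number) value).doubleValue() == 0.0;
        if (value instanceof Boolean) return !((Boolean) value);
        if (value instanceof Character) return ((Character) value) == 0;
        return false;
    }

    // True when every non-default template field equals the candidate's field.
    static boolean matches(Object template, Object candidate) {
        if (template.getClass() != candidate.getClass()) return false;
        for (Field field : template.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) continue;
            try {
                field.setAccessible(true);
                Object wanted = field.get(template);
                if (isDefault(wanted)) continue;
                if (!wanted.equals(field.get(candidate))) return false;
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return true;
    }

    // Returns every stored object that matches the template.
    static <T> List<T> queryByExample(T template, List<T> stored) {
        List<T> results = new ArrayList<>();
        for (T candidate : stored) {
            if (matches(template, candidate)) results.add(candidate);
        }
        return results;
    }
}
```

In this sketch, a template built with new Session("10-13-05MORNING", 0) matches on the name alone. A template with a zero interval and a null name matches every stored Session, which is exactly why a zero-valued numeric field cannot be used as a search criterion.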

Next, the query is executed via a call to the ObjectContainer’s get() method. get() returns an ObjectSet, which can be iterated over to retrieve the results. Listing Three assumes that there’s no more than one matching target. The code merely checks that the ObjectSet is non-empty and then proceeds to retrieve the first member of the set.

Finally, deleting an object is as easy as adding it. You can modify Listing Three to delete the fetched 10-13-05MORNING object by replacing the ellipsis with:

msessionDB.delete(_msession);

That’s literally all there is to it. All of the object’s fields are located and removed from the database. Just hand db4o the reference to the fetched object and it does the rest.

Going Further

As mentioned earlier, one important ingredient is missing from the previous example: the monitoring session’s data itself. Assuming that session data is collected in packets and packets are stored in a LinkedList object, the class definition for MonitoringSession would include a new property, data:

Furthermore, MonitoringSession’s constructor is extended to initialize data (the LinkedList), and two additional methods add a new data point and fetch the reference to the LinkedList:

…
public void addData(Object datapoint) {
    data.add(datapoint);
}

public LinkedList getList() {
    return data;
}
…

Now, how do you save a MonitoringSession object that includes a LinkedList of data points? It works exactly the same as before. The call to msessionDB.set(_msession) stores not only the MonitoringSession object, but the enclosed LinkedList object (and all its members) as well.

Whenever you store an object in the db4o database — regardless of the object’s complexity — the db4o engine “spiders” the object structure, locates any referenced objects, and puts everything in the database. This is persistence by reachability: all objects referenced by the object being stored are also stored.
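To make persistence by reachability concrete, here is a small sketch that walks an object graph and collects everything reachable from a root object. It is an illustration of the idea, not db4o’s implementation: the names ReachabilitySketch, Session, Packet, and reachableFrom() are hypothetical, collections are followed through their elements, and JDK classes such as String are treated as leaves.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.LinkedList;
import java.util.Set;

// Illustration only: collects the objects a store-by-reachability
// engine would have to write when handed a single root object.
class ReachabilitySketch {
    static class Packet {
        int value;
        Packet(int value) { this.value = value; }
    }

    static class Session {
        String name;
        LinkedList<Packet> data = new LinkedList<>();
        Session(String name) { this.name = name; }
    }

    static Set<Object> reachableFrom(Object root) {
        // Identity-based set, so each object is visited once even in cyclic graphs.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> pending = new ArrayDeque<>();
        if (root != null) pending.push(root);
        while (!pending.isEmpty()) {
            Object current = pending.pop();
            if (!seen.add(current)) continue; // already visited
            if (current instanceof Collection) {
                // Follow a collection through its elements.
                for (Object element : (Collection<?>) current) {
                    if (element != null) pending.push(element);
                }
            } else if (!current.getClass().getName().startsWith("java.")) {
                // Follow reference fields declared by application classes
                // (superclass fields are omitted for brevity).
                for (Field field : current.getClass().getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers()) || field.getType().isPrimitive()) continue;
                    try {
                        field.setAccessible(true);
                        Object child = field.get(current);
                        if (child != null) pending.push(child);
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return seen;
    }
}
```

Calling reachableFrom() on a Session yields the Session itself, its name, its LinkedList, and every distinct Packet in the list, which is the set of objects a single store call would have to persist.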

How about reading a complex object from the database? Reading a complex object is almost as easy as writing it, with the exception that db4o gives you a bit more control. To manage memory consumption, you might not want to fetch an entire object structure from the database. Thus, db4o provides a parameter known as activation depth. The activation depth tells db4o how deep into the object tree to go when pulling that object out of the database.

By default, db4o’s activation depth is 5, so retrieving a MonitoringSession object using the default values would also retrieve the LinkedList object, as well as the contents of the LinkedList object, provided that each content object wasn’t too “deep.”

If, however, you want to retrieve only the MonitoringSession object and not the attached LinkedList, simply set the activation depth to 1 and retrieve the object. The code is shown in Listing Four.

LISTING FOUR: Retrieving only the “root” object of a complex, “nested” object

// Set the activation depth in the
// global configuration context.
Db4o.configure().activationDepth(1);

// Fetch the retrieved objects
if (result.hasNext()) {
    MonitoringSession _msession = (MonitoringSession) result.next();
    // Only the retrieved _msession object
    // will be in memory. NOT the
    // LinkedList.
    …

In the code, once the global activation depth is set to one, it affects all subsequently opened databases. (However, db4o lets you set the activation depth for individual objects, too.) You can, of course, reset the global activation depth at some point, but that doesn’t affect any databases that are already open.
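The effect of an activation depth can be sketched with a toy loader that copies a stored object tree only down to a given depth. This is a simplified model, not db4o’s mechanism: the names ActivationSketch, Node, and load() are hypothetical, and the sketch assumes that objects below the requested depth are represented by placeholders whose fields are left at their defaults.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a depth-limited loader over an in-memory tree.
class ActivationSketch {
    static class Node {
        String label;
        List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    // Returns a copy of the stored tree in which only `depth` levels
    // have their fields filled in. Anything deeper is a placeholder
    // with a null label and no children (an assumption of this sketch).
    static Node load(Node stored, int depth) {
        Node copy = new Node(null);
        if (depth <= 0) return copy; // placeholder, fields not populated
        copy.label = stored.label;
        for (Node child : stored.children) {
            copy.children.add(load(child, depth - 1));
        }
        return copy;
    }
}
```

With a depth of 1, only the root’s own fields are populated and each child is a placeholder. With a larger depth, the loader descends further at the cost of holding more objects in memory.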

Deleting a complex object follows in much the same vein as fetching a complex object.

Ordinarily, deleting an object in db4o deletes only that object; referenced objects are untouched. You can, however, use db4o’s cascaded delete feature. You can think of this as “delete by reachability”: any persistent objects referenced by the object being deleted are also deleted. Using the db4o-provided ObjectClass, you can tell db4o which classes to apply cascaded delete to. As with the global activation depth, you need to do this before you open the database.

For instance, if you enable cascaded delete and delete the MonitoringSession object, db4o also deletes the embedded LinkedList along with all objects in the list.

Once this snippet of code is executed, deleting any MonitoringSession object causes the associated LinkedList object (and its contents) to be deleted as well. Obviously, cascaded delete must be handled cautiously; a single delete operation can reverberate throughout the database.
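The difference between an ordinary delete and a cascaded delete can be sketched with a small in-memory store. This is a toy model rather than db4o’s API: the names CascadeDeleteSketch, Persistent, enableCascade(), store(), and delete() are hypothetical, and each stored class reports its direct references explicitly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Illustration only: an in-memory store with per-class cascaded delete.
class CascadeDeleteSketch {
    // Stored objects expose the objects they reference directly.
    interface Persistent {
        List<Persistent> references();
    }

    static class Packet implements Persistent {
        public List<Persistent> references() { return Collections.emptyList(); }
    }

    static class Session implements Persistent {
        final List<Persistent> data = new ArrayList<>();
        public List<Persistent> references() { return data; }
    }

    private final Set<Persistent> contents =
        Collections.newSetFromMap(new IdentityHashMap<Persistent, Boolean>());
    private final Set<Class<?>> cascadingClasses = new HashSet<>();

    // Must be called before deleting; marks a class for cascaded delete.
    void enableCascade(Class<?> type) { cascadingClasses.add(type); }

    // Stores the root and everything reachable from it.
    void store(Persistent root) {
        if (contents.add(root)) {
            for (Persistent child : root.references()) store(child);
        }
    }

    // Removes the target; follows references only for cascading classes.
    void delete(Persistent target) {
        if (!contents.remove(target)) return; // absent or already removed
        if (cascadingClasses.contains(target.getClass())) {
            for (Persistent child : target.references()) delete(child);
        }
    }

    int size() { return contents.size(); }
}
```

Without enableCascade(Session.class), deleting a Session leaves its packets behind in the store. With it, the same call removes the Session and every Packet it references, which shows how far one delete can reach.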

Going Native

db4o’s native querying feature lets you create and execute complex queries using standard, familiar programming language constructs. If you’re programming in Java, for example, you don’t have to “step out” of Java into SQL to perform a query. What’s particularly nice about native queries is that they permit you to construct just about any search condition you can imagine.

Suppose that you want to retrieve MonitoringSession objects that were recorded between two timestamps, say, DateTimeStart and DateTimeEnd. That kind of query is too difficult for db4o’s QBE system, but it’s child’s play with a native query, as shown in Listing Five.

The query uses db4o’s Predicate class, which defines a single method, match(). The argument for match() is an object of the target search class. The query engine calls match() for each candidate object: if the object meets the criteria, match() returns true and the engine adds that object to the results ObjectSet; otherwise, match() returns false and the object is skipped.
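The shape of such a query can be sketched in plain Java. The Predicate class below is a hypothetical stand-in written for this example, not db4o’s own class, and the query() method simply filters an in-memory list. The Session class and its recordedAt field are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Illustration only: a predicate-style query over an in-memory list.
class NativeQuerySketch {
    // Stand-in for a predicate base class with a single match() method.
    static abstract class Predicate<T> {
        abstract boolean match(T candidate);
    }

    // Hypothetical stored class carrying the time it was recorded.
    static class Session {
        String name;
        Date recordedAt;
        Session(String name, Date recordedAt) {
            this.name = name;
            this.recordedAt = recordedAt;
        }
    }

    // Calls match() once per candidate and keeps those that return true.
    static <T> List<T> query(List<T> stored, Predicate<T> predicate) {
        List<T> results = new ArrayList<>();
        for (T candidate : stored) {
            if (predicate.match(candidate)) results.add(candidate);
        }
        return results;
    }

    // Sessions recorded between start and end, both bounds inclusive.
    static List<Session> recordedBetween(List<Session> stored,
                                         final Date start, final Date end) {
        return query(stored, new Predicate<Session>() {
            boolean match(Session session) {
                return !session.recordedAt.before(start)
                    && !session.recordedAt.after(end);
            }
        });
    }
}
```

Because the condition is ordinary Java, the compiler checks the field names and types, and any expression the language allows can appear inside match().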

Native Queries

Querying a database from an application typically leads you down one of two paths: you can call into a proprietary API (as with an ISAM-based system), or you can pass SQL statements, represented as strings, to a SQL query engine.

But neither option is really attractive. Calling into a proprietary API often results in hard-to-follow code, while using SQL almost always costs execution time (as the SQL must be parsed before execution) and can make certain kinds of applications, such as web applications, vulnerable to SQL injection attacks. In addition, because the SQL code is derived from a different domain than the programming language, data type checking (usually performed by the compiler) is simply unavailable, so simple programming errors that would otherwise be caught at compile time surface at runtime.

Native queries, first introduced in a paper by William Cook and Siddhartha Rai, and expanded upon by William Cook and Carl Rosenberger, offer an attractive alternative. Put simply, native queries allow a programmer to express queries in the same language that the application is written in. The programmer uses familiar constructs to create what amounts to a “filter” that selects the objects that satisfy the desired query. The most recent release of db4o supports native querying.

For example, using a class named Employee, a native query that returns all employees in the “QA” department might look like this:

A programmer need merely extend the Predicate class and implement the match() method, as shown above. The returned List contains all of the objects from the database (identified by the class of object passed to match()) that satisfy the test.

A native query is much easier to read and understand. It fits smoothly into the object-oriented paradigm (being itself written in the same language as the application). It is also typesafe, and best of all, the current release of db4o performs invisible optimizations on the query itself.

Native queries have numerous advantages, even beyond the obvious fact that you don’t have to learn a new query language to query the database. Native queries are typesafe: you manipulate specific classes, rather than the overarching Object class. Consequently, it’s difficult to enter a query that doesn’t return the class of object that you intend it to. Also, like the QBE system, a native query doesn’t require you to enter queries as strings passed to an SQL processing layer. This means that if code that uses native queries ends up as part of a web application, its queries are not exposed to SQL injection attacks.

Finally, native queries sidestep the problem of the “impedance mismatch” between the business logic portion of an application, typically written in an object-oriented language, and the database manipulation portion, which is usually written in a non-procedural language. In fact, the db4o native query facility works much more cleanly than most other implementations, including the QBE examples given toward the beginning of the article. The native language of the application is the query language.

Easy Persistence

The object-oriented database system db4o makes object persistence easy. Its terse API lets you create an application quickly. You can use its super-simple “query by example” API for instances that require simple object matching, or its native query API for more complex querying. db4o also supports transactions and adheres to the database-safety concepts of atomicity, consistency, isolation, and durability (ACID).

For many database-driven applications, db4o is a powerful alternative to an RDBMS. It is cross-platform (which means your applications and databases are easily moved from Windows to Linux and Unix and back) as well as cross-language. There is a great deal that recommends db4o to your attention.