Using Oracle Berkeley DB Java Edition as a Persistence Manager for the Google Web Toolkit

by Erick Audet and Gregory Burd

When building Web applications with the Google Web Toolkit, using Berkeley DB Java Edition as a persistent data store can reduce project costs significantly.

Published February 2008

The standard Java Platform, Enterprise Edition (Java EE) approach for object persistence in Web applications is to use Enterprise JavaBeans (EJB), an object-to-relational mapping (ORM) technique. Java objects are translated back and forth into SQL statements and stored within a relational database management system (RDBMS) as rows related across a number of database tables. This is the most common approach for good reason. Most customers will find that many aspects of their organization depend on SQL and RDBMS servers from Oracle to manage business critical information. In these cases EJB and Java Persistence API (JPA) are the best approaches for object persistence. They bring Java applications and existing applications together, speaking the same SQL language, to access the same relational data.

Interestingly, some EJB applications store data in a relational database simply because it is a common and familiar design pattern, not because there are multiple applications accessing the same data using SQL for different reasons. If the data isn't in an existing relational database and there are no plans for it to be accessed by any other means than through the ORM layer, then the ORM layer can be overhead. There are some specific cases where data must be accessed by other SQL-based solutions running ad hoc SQL queries, but this was not true for our application. Viewed objectively, the ORM design pattern adds a great deal of unnecessary overhead when it is the only aspect of the system interacting with the RDBMS using SQL. In contrast, Oracle Berkeley DB Java Edition's Direct Persistence Layer (DPL) stores object data directly into a Btree database, without the need to translate it into some intermediate language.

In this article, you will explore real-world experiences from combining Berkeley DB Java Edition and the Google Web Toolkit to create a highly effective solution for object persistence.

Oracle Berkeley DB Java Edition

Oracle Berkeley DB Java Edition is one of three products acquired by Oracle when it purchased Sleepycat Software in February 2006. It is a pure-Java database engine designed to store data to a local file system. It provides Atomicity, Consistency, Isolation, Durability (ACID) transactions, allows for high levels of concurrency, can scale to terabytes of data, and will use a predictable portion of the Java virtual machine's memory as a cache for that stored data. Berkeley DB Java Edition is a reimplementation of the highly successful Berkeley DB product that was written in ANSI C, but it has significant differences. Primary among those differences is the DPL API provided by Berkeley DB Java Edition. This layer provides a JPA-like set of Java annotations allowing for easy storage, indexing, updating, and deleting of object graphs, without the need to translate object data to and from SQL. There is no ORM component or back-end database when using the Berkeley DB Java Edition DPL and no client/server connection. The DPL stores object graphs, relationships, and indexes in a very intuitive and powerful manner. The DPL annotations are similar in name and concept to the JPA. The DPL can store any Plain Old Java Objects (POJO) and keeps all the persistence rules and configuration within the class itself. This is the first key difference and advantage of the Berkeley DB Java Edition DPL when compared to ORM solutions; it is not necessary to maintain external configuration files. This approach greatly enhances the developer's efficiency, reduces project complexity, and eliminates overhead.

Applications developed using EJB-based solutions will typically need a configuration file for each class mapped to a physical database table. Although persistence frameworks such as Hibernate also use Java annotations, in practice you will need to at least understand and control the hibernate.cfg.xml file to manage the connection strings and other settings enabling your application to connect to a relational database server via Java Database Connectivity (JDBC).

Despite excellent integrated development environment (IDE) integration, configuration file management, and other support for EJB-style solutions, the developer will inevitably end up hand editing them at some point, a highly error-prone and time-consuming process. Berkeley DB Java Edition's use of Java annotations to encapsulate these settings within the persistent POJO classes is unique, simple, and less error prone by contrast. The DPL annotations keep the storage context in a single, obvious, and intuitive location: the source code where a developer would expect to find it.

The Google Web Toolkit

The Google Web Toolkit (GWT) is a model-view-controller (MVC) framework abstracting the process of writing HTML, JavaScript, and associated Asynchronous JavaScript And XML (Ajax) messaging into code written in Java, and then processed by the GWT for deployment in Java EE application servers. It eliminates the complexity of hand coding HTML, JavaScript, and other elements that bind the JavaScript actions triggered in an HTML page (clientside code run within the user's browser) to the controlling layer executing within the Java Enterprise application server (serverside, where the database resides). When developed within a good IDE, the GWT can provide an immeasurable advantage over other frameworks and considerably lower the cost and complexity of the project.

The GWT also provides a debugging environment called hosted mode. This enables developers to debug their clientside code (JavaScript and HTML) in real time simply by stepping through their Java source code in a familiar debugger-style manner. The GWT will handle, in real time, the mappings between the generated JavaScript/HTML code and the Java code, drastically reducing development time for complex Web applications.

Storage aspects of the application can also be quickly debugged in hosted mode. Developers can concentrate on the application when their supporting tools eliminate configuration overhead and unnecessary complexity.

Peanut Butter and Chocolate: The GWT and Berkeley DB Java Edition DPL

It is very simple to combine the GWT and Berkeley DB Java Edition DPL. The GWT controller framework is straightforward and easy to customize. The sequence diagram below presents a typical flow of calls between the GWT remote procedure call (RPC) mechanism and Berkeley DB Java Edition. A widget (or any other GWT user-interface client class in the com.google.gwt.user.client.ui package) contains user-interface (UI) controls such as Buttons or ListBoxes. These controls handle events triggered by a user viewing a page within their Web browser. Each of these controls has a definition of the screen events it can handle. Screen events such as "OnClick" can be directed to invoke code on the serverside classes using the GWT RemoteService interface. Before calling a RemoteService, the Widget must create a Data Transfer Object (DTO, a common Java EE design pattern) from its widget fields. Then an AsyncCallback (Save DTO) is created and passed to the RemoteService. This serverside class will use the DTO to create a model object, a POJO representing the data delivered from the browser to the server, which is then handled by Berkeley DB Java Edition via calls through a BusinessService object. DataAccessors are then used by the business services object to manage the transactional storage of the various conceptual objects within the Oracle Berkeley DB Java Edition database.
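The call flow described above can be sketched in plain Java. This is a minimal in-process approximation, not the article's code: SaveCallback is a hypothetical stand-in for GWT's AsyncCallback (same onSuccess/onFailure shape), UserRemoteService is an assumed name, and the list inside UserService is only a placeholder for the Berkeley DB Java Edition store so that the example runs without either library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for GWT's AsyncCallback: one method per outcome of the remote call.
interface SaveCallback<T> {
    void onSuccess(T result);
    void onFailure(Throwable caught);
}

// DTO built from the widget fields; basic types only.
class UserDto {
    String userName;
    String injury;
}

// Model object created on the server from the DTO.
class User {
    final String userName;
    final String injury;

    User(String userName, String injury) {
        this.userName = userName;
        this.injury = injury;
    }
}

// Business service: works on model objects only and never sees the DTO.
class UserService {
    private final List<User> stored = new ArrayList<>(); // placeholder for the database

    void createUser(User user) {
        if (user.userName == null || user.userName.isEmpty()) {
            throw new IllegalArgumentException("user name is required");
        }
        stored.add(user);
    }

    int count() {
        return stored.size();
    }
}

// Serverside remote service: converts DTO to model, calls the business service,
// and reports the outcome through the callback.
class UserRemoteService {
    private final UserService service;

    UserRemoteService(UserService service) {
        this.service = service;
    }

    void saveUser(UserDto dto, SaveCallback<String> callback) {
        try {
            service.createUser(new User(dto.userName, dto.injury));
            callback.onSuccess(dto.userName);
        } catch (RuntimeException e) {
            callback.onFailure(e);
        }
    }
}
```

In a real GWT application the callback runs in the browser after the HTTP round trip completes; here it is invoked directly so the ordering of the steps stays visible.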

Demo Details and Code

This section describes in more detail how to implement the proposed architecture by showing a few key source code examples (download zip). To understand it, the reader must have some level of comfort with Ajax, Java annotations, and object-oriented design.

Data Model

Two conceptual model classes are used to illustrate this example: a generic class called Person and a subclass of Person called User. These objects and their relationships to one another are easily expressed with DPL annotations. Berkeley DB Java Edition will then store the data using transactions to local database files.

The GWT UI, Callbacks, and DTOs

Our example uses a very simple concept: recording nominal information and an injury for an existing user (or patient). It also shows how inheritance can be used with the Berkeley DB Java Edition DPL. The following picture is a simple Widget used to record a user's injury report.

As you can see in the example code, the SaveButton event "OnClick" will call the SaveProfile() method. This method uses a simple callback block to handle the return object from the serverside class method UserService.createUser(). The UserDTO, a POJO used to carry data from the serverside to the clientside, contains only basic Java datatypes, allowing it to be a serializable class.

JavaScript works with basic datatypes only. They must be serialized to be transferred between the client Web browser and the remote application server; this restriction thus becomes a requirement of all GWT DTOs. If you look closely at our entity classes, they do not have any concept of client application nor do they have to implement GWT interfaces. They are completely isolated from the client application. To achieve this separation, a Java EE DTO design pattern is used. These DTOs contain only basic datatypes and are used by both packages (Services and GWT Remote Services). DTOs must implement the GWT IsSerializable interface and a GWT configuration XML file must also be created. Another important reason for using DTOs is that GWT client classes must be transformed into JavaScript code. To achieve that, GWT uses a Java-to-JavaScript compiler, which does not support the complete Java Platform, Standard Edition API or the latest features such as annotations and typed arrays. Although this is a minor design limitation of the GWT, it generally doesn't affect the majority of programming tasks.
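The separation between DTOs and entity classes can be illustrated with a small sketch. It is only an approximation under stated assumptions: the field names and the UserMapper helper are hypothetical, the DTO uses java.io.Serializable as the marker interface (IsSerializable would need the GWT library on the classpath), and the DPL annotations that the real model class would carry are left out so the code compiles with the JDK alone.

```java
import java.io.Serializable;

// DTO shared by the client and server packages: basic types only, no annotations.
class UserDTO implements Serializable {
    private static final long serialVersionUID = 1L;

    public String userName;
    public String firstName;
    public String lastName;
    public String injury;

    // GWT-serializable types need a no-argument constructor.
    public UserDTO() {
    }
}

// Serverside model object. It has no dependency on GWT or on the DTO;
// in the article's design this class would carry the DPL annotations.
class User {
    private final String userName;
    private final String firstName;
    private final String lastName;
    private final String injury;

    User(String userName, String firstName, String lastName, String injury) {
        this.userName = userName;
        this.firstName = firstName;
        this.lastName = lastName;
        this.injury = injury;
    }

    String getUserName() { return userName; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getInjury() { return injury; }
}

// Hypothetical mapper used by the remote service, which sits between
// the GWT client and the business service.
final class UserMapper {
    private UserMapper() {
    }

    static User toModel(UserDTO dto) {
        return new User(dto.userName, dto.firstName, dto.lastName, dto.injury);
    }

    static UserDTO toDto(User user) {
        UserDTO dto = new UserDTO();
        dto.userName = user.getUserName();
        dto.firstName = user.getFirstName();
        dto.lastName = user.getLastName();
        dto.injury = user.getInjury();
        return dto;
    }
}
```

Keeping the conversion in one helper means the business service only ever receives User objects, and the DTO package stays free of anything the GWT compiler cannot translate.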

Side Notes:

An important change in version 1.4.60 of the GWT is new support for the Serializable interface, which does not behave the same way as IsSerializable. This causes problems when a member attribute is initialized with a nonbasic datatype [such as a Class rather than int]. Using the IsSerializable interface will now throw a runtime error exception [on both Linux and Windows]. It would be wise to use Serializable instead of IsSerializable. GWT will probably deprecate the IsSerializable interface in future versions or solve this bug.

Here's an implementation tip: Use three different projects; one project to contain the DTOs, a second project with conceptual classes, and the third and final project that contains the GWT Web application classes. By adding the special GWT XML definition at the root of the package, the GWT Web application can inherit those classes and use them. Keep in mind that any classes used by the GWT clientside packages will be transformed into JavaScript, so they cannot use any Java 1.5-specific features such as annotations. Keep the two separated because the GWT client application will be able to use the DTO packages as well as the business service classes found in the conceptual classes project.

Remote Services, Business Services, and Accessors

At this point the conceptual entities have been created and are properly annotated with Berkeley DB Java Edition DPL annotations. The basic services methods will manage the transaction states of those entities. The next step is to create GWT Remote Services. They will invoke the Berkeley DB Java Edition DPL code to store the data. Remote Services are the standard way to communicate with the Java EE application server from the clientside browser. This is done using a standard Ajax, essentially an HTTP POST, call generated and managed by the GWT. The serverside code for that Ajax call is found within the "server" package of the GWT application. This is code that executes within a servlet container or a Java EE application server (because we are on the server side, this code is Java, whereas the code running in the browser was JavaScript) and this is where we tie user actions at the browser into business logic and then the Berkeley DB Java Edition DPL for object persistence.

Model objects are business data classes that implement the data management aspects of the application within the database. The EJB programmer can roughly think of the UserService as a stateless session bean.

The class BaseService itself does not handle any logic. This class handles the transactional state of a specific task on one or more objects in the database. To add business process logic to a service, the application will need to extend that BaseService class. In the example, the class UserService will function in this role and handle all access to the User entity by providing implementations for methods such as createUser(), saveUser(), and getUser().
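The division of labour between BaseService and UserService might look roughly like the sketch below. It is an illustration, not the article's implementation: open() and startTransaction() are named in the text, while commit(), abort(), read(), write(), and getInjury() are assumed helper names, and two in-memory maps stand in for the entity store so the example runs without Berkeley DB Java Edition. With the real library, the transaction steps would correspond to Environment.beginTransaction, Transaction.commit, and Transaction.abort.

```java
import java.util.HashMap;
import java.util.Map;

// Base class: owns the store and the transaction state, contains no business logic.
abstract class BaseService {
    private final Map<String, String> committed = new HashMap<>(); // placeholder for the entity store
    private Map<String, String> pending;                           // writes of the active transaction
    private boolean open;

    protected void open() {
        open = true;
    }

    protected void startTransaction() {
        if (!open) {
            throw new IllegalStateException("call open() first");
        }
        if (pending != null) {
            throw new IllegalStateException("transaction already active");
        }
        pending = new HashMap<>();
    }

    protected void write(String key, String value) {
        if (pending == null) {
            throw new IllegalStateException("no active transaction");
        }
        pending.put(key, value);
    }

    // Reads see committed data only.
    protected String read(String key) {
        return committed.get(key);
    }

    protected void commit() {
        if (pending == null) {
            throw new IllegalStateException("no active transaction");
        }
        committed.putAll(pending);
        pending = null;
    }

    protected void abort() {
        pending = null; // discard uncommitted writes
    }
}

// Business logic lives in the subclass; every operation is wrapped in a transaction.
class UserService extends BaseService {
    void createUser(String userName, String injury) {
        open();
        startTransaction();
        try {
            if (read(userName) != null) {
                throw new IllegalArgumentException("duplicate user: " + userName);
            }
            write(userName, injury);
            commit();
        } catch (RuntimeException e) {
            abort();
            throw e;
        }
    }

    String getInjury(String userName) {
        return read(userName);
    }
}
```

Because the commit-or-abort decision is made in one place per operation, a failed business rule leaves the store exactly as it was before the call.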

This implementation resides within the serverside package of the project. It calls a business service called UserService. It is the UserService that implements the business logic for a particular instance of User. It's at this point in our example that we transform the UserDTO object into a User object, the model POJO with DPL annotations that will be stored within an Oracle Berkeley DB Java Edition database. Recall that the database is simply a set of files located on the file system of a server accessible by the Web application servlet container.

Tried-and-true design patterns dictate that the application should separate the DTO objects from the business objects by layering and abstraction. Business services must not know the DTO classes, they need only to deal with the conceptual model objects. These objects are persisted into the Berkeley DB Java Edition database. The business service classes are responsible for the management of the conceptual model objects in the data store. To relate this to EJB, the UserService is similar to a stateless session bean.

This method uses the inherited methods open() and startTransaction() to set up and use Berkeley DB Java Edition as a database (aka entity store), and then begins a transaction. The separation helps to keep developers from making errors when setting up the Berkeley DB Java Edition connections and transactions later in the process. The UserAccessor class is responsible for setting up the various Berkeley DB Java Edition DPL accessors, essentially indexes, for data access. This class encapsulates the indexes used later in the code to create, update, and delete objects in the Berkeley DB Java Edition database. Here is the content of the UserAccessor class:
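The sketch below is a runnable approximation of such an accessor class, not the article's listing. The index names userById and userByUserName and the id and injury fields are assumptions (only userByUserName appears in the text), and java.util.Map stands in for the DPL index types so the code needs no external library. The comments note the DPL calls each stand-in would replace.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model class; the comments show where DPL annotations would go.
class User {
    final long id;         // DPL: @PrimaryKey
    final String userName; // DPL: @SecondaryKey(relate = ONE_TO_ONE)
    final String injury;

    User(long id, String userName, String injury) {
        this.id = id;
        this.userName = userName;
        this.injury = injury;
    }

    String getUserName() {
        return userName;
    }
}

// Keeps every index for User in one place so services never build indexes themselves.
class UserAccessor {
    // With the DPL: PrimaryIndex<Long, User>,
    // obtained from store.getPrimaryIndex(Long.class, User.class).
    final Map<Long, User> userById = new HashMap<>();

    // With the DPL: SecondaryIndex<String, Long, User>,
    // obtained from store.getSecondaryIndex(userById, String.class, "userName").
    final Map<String, User> userByUserName = new HashMap<>();

    // The DPL maintains secondary indexes automatically when an entity is stored;
    // this stand-in has to do it by hand.
    void put(User user) {
        User holder = userByUserName.get(user.userName);
        if (holder != null && holder.id != user.id) {
            // ONE_TO_ONE means a user name may belong to one user only.
            throw new IllegalStateException("user name already taken: " + user.userName);
        }
        User previous = userById.put(user.id, user);
        if (previous != null) {
            userByUserName.remove(previous.userName);
        }
        userByUserName.put(user.userName, user);
    }
}
```

Because Map.get and the DPL index get method have the same shape, a lookup such as userByUserName.get(name) reads the same against either one.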

By setting up these accessors, the business service layer is simplified and consistent. The task of creating and managing indexes is isolated in one place and will not interfere with your business logic. Here is an example that uses an index to retrieve an instance of User by name.

wUser = wUserAccessor.userByUserName.get( pUser.getUserName() );

Otherwise, to retrieve a user object without the use of these accessor methods, the code would look more like this:

As with any object persistence solution, there are trade-offs. Berkeley DB Java Edition is no different in this regard.

Advantages:

Minimal configuration and setup. There are no XML files to maintain at all. All aspects of object storage are handled by the Java annotations of the DPL in the application's entity classes. Even the EntityStore does not need to be configured; it is simply a local directory accessible from the Java EE application server executing the code.

Simplicity. The MVC design pattern is well understood. Here, the view and controller are managed by the GWT, and the model (how data is stored to disk) is Oracle Berkeley DB Java Edition, managed by a simple set of service classes similar to EJB session beans.

Fast data access. Oracle Berkeley DB Java Edition doesn't spend time translating object graphs to and from SQL, and it doesn't send data over the network to a remote server; this reduces the work involved in object storage by two-thirds. There are no redundant caches of data, only the one in the Java EE application server managed by Berkeley DB Java Edition. With EJB there are two, one in the RDBMS and one managed by EJB/JPA within the application server.

Fast, scalable, concurrent. Berkeley DB Java Edition is transactional and can manage thousands of concurrent threads simultaneously accessing data. It will manage terabytes of data on disk caching the most commonly used portions within a subset of the JVM's available memory.

Robust transactional storage. If for any reason the application server fails, Berkeley DB Java Edition will recover the database to a consistent and usable state on restart.

Easy to administer. All aspects of database administration are simple API calls rather than complex scripts or command-line tools. The application is designed from the start to administer itself and requires no oversight in production.

Easy to deploy. Deployment of Berkeley DB Java Edition is a two-step process: (1) include the Berkeley DB Java Edition JAR (Java Archive) file in the WAR (Web Archive) file and (2) identify and create a directory on the local file system for the database files.

Open source. Berkeley DB Java Edition is made available under the Sleepycat License, which is one of the Open Source Initiative approved open source licenses. If the terms of the Sleepycat License are too restrictive, you will need to purchase a commercial license from Oracle.

Very responsive support forums. I have found that the developers of Berkeley DB Java Edition are available on the Oracle Technology Network (OTN) forums to answer questions and provide help as well as tuning and architectural advice.

Shortcomings:

Weak model change management. Although Berkeley DB Java Edition provides methods to migrate databases when the model object changes, the process isn't ideal during development. Once deployed, the support for class migration and mutation is excellent. Just remember that when you are developing, it is good practice to entirely remove the Berkeley DB Java Edition database directory and start from scratch each test run. Otherwise, you could encounter runtime errors. On the upside, this is good practice during development and the process is easy. (This is no different from other approaches like ORMs and EJBs however.)

Limited reporting. Reporting is almost always a necessity in real-life applications. There is only one solution for Berkeley DB Java Edition at this time: JasperSoft's JasperReports has a Berkeley DB Java Edition adapter (available on JasperForge). It would be nice to see support for Oracle's reporting tools as well as others to have a variety of options.

Lack of integration with Oracle Database or other RDBMS systems. Berkeley DB Java Edition can be an effective database of record for your Web applications as demonstrated above using the GWT. Inevitably some amount of data will need to exist in one or more relational databases deployed within the enterprise. At this time, there is no supported, simple way to accomplish that integration. To do this, you would need to write code that uses JDBC or JMS or some other method to move data between a Berkeley DB Java Edition database and other databases. Hopefully this will be addressed in a future release, or by some open source or commercial third party.

Immature standards support. Berkeley DB Java Edition does support a few Java EE related standards (JTA, JMX, JCA) but none related directly to persistence. Notably, EJB and JPA are not supported. This is for obvious reasons of course. Berkeley DB Java Edition and its DPL are alternatives to the ORM approach exemplified by all Java persistence standards except for Java Data Objects, which hasn't had much market success or widespread adoption to date. Berkeley DB Java Edition does support (or perhaps enhance is a better way to describe it) a pseudostandard, that of the Java Collections Framework (the collection interfaces in java.util). Berkeley DB Java Edition allows the standard collection classes to be transactional and stored to disk rather than only useful in memory. This is a significant and interesting feature, and it is used within the DPL, but it is not technically a Java (Java Specification Request, or JSR) standard. Also nonstandard, but very useful, is the BIND API in Berkeley DB Java Edition. The API provides a way to encode/decode Java types into byte streams. Again, although not a standard, the BIND API can be very handy for use in Berkeley DB Java Edition as well as other parts of applications. Overall I sympathize with the Berkeley DB Java Edition team. They don't have many options. There is a bias for ORM as the only solution for Java object persistence and so there are not many standards that they can implement. This is as much an issue with the Java community as it is with Berkeley DB Java Edition. Perhaps the DPL can become a JSR at some point.

Conclusions

The primary benefit of combining the GWT and Berkeley DB Java Edition is simplicity. Both frameworks are designed for ease of use and eliminate common, and often taken for granted, overhead in all stages of application development.

Web applications can be complex for many reasons. The myriad of technologies that together support today's Web browsing experience is dizzying. Developers must pick and choose from dozens of different combinations of design patterns and frameworks when laying the foundations of their Web applications. Thankfully, with the help of companies like Google and Oracle, developers are able to reduce the complexity of Web application development to a manageable level. There are many cases where EJB and ORM make sense for the storage (or Model) aspect of a Web application, but in our particular case we didn't require external SQL access to our data. An ORM layer and RDBMS back end was overhead that we didn't need. It was nice to discover a new way to manage data storage that was just as robust and fast as an RDBMS and yet, in our case, far more appropriate in many ways.

Berkeley DB Java Edition is a true revolution in choices for a persistence layer. It eliminates the ORM barrier and solves many problems, such as:

ORM mapping

Maintaining and supporting an RDBMS

SQL overhead

ORM compatibility with features (access intent, pessimistic locking, among others)

Having to maintain a conceptual model and a physical model

Frameworks such as GWT, JavaServer Faces, Echo2 and others are recent additions to the Web 2.0 programming paradigm. They bring back the rapid development factor that was present before the Web era. These tools enable you to create professional and scalable Web applications. They all offer different features, but one thing they have in common is the ability to abstract the complexity of clientside syntax. They do this by generating and maintaining the clientside, in-browser code.

How do the GWT and Berkeley DB Java Edition work in the real world of development? As with any new and innovative idea, people hesitate and wait for it to prove itself in some major implementation. Berkeley DB Java Edition might be new as a persistence layer for Web frameworks, but it has matured over the last five years and already stores terabytes of data in systems you may use and depend on today without knowing it. Web developers should take a close look at replacing their ORM mindset with one that allows for storage solutions like Berkeley DB Java Edition. If Oracle Berkeley DB Java Edition worked for us, it could work for you too.

Eric Audet, M.Sc., TechSolCom, has 16 years of experience in IT, mainly as a software and data architect. He currently works as a software architect for the development of transactional SOA, Wireless and Web applications using WSDL, Java EE and Java ME technologies.

Gregory Burd is the Product Manager for Oracle's Berkeley DB, Berkeley DB Java Edition, and Berkeley DB XML database products. He has held this position since 2003 and continues working on Berkeley DB products within Oracle's Embeddable Databases Group. Greg has a diverse background including software engineering, product management, enterprise software consulting, software alliances and sales, and has contributed to open source and free software projects.