Monday, March 29, 2010

I have been working with Java and related technologies at multiple companies since 2004. Most of the major business problems that I have encountered revolve around working with relatively small data objects and relatively small data stores (less than 50 GB). One commonality in the development environment at each of these companies, other than Java, has been some form of legacy data store. In most cases, the legacy data store was not originally designed to support all of the applications that now depend on it. In some cases, performance issues would arise that were most likely due to overutilization.

One approach to help alleviate utilization issues on legacy resources is data caching. With data caching we can use available memory to keep our data objects closer to our running application. We can take advantage of technologies like Hazelcast, a data distribution platform for Java, to provide support for distributed data caching. In particular, this example focuses on Hazelcast's distributed Map to manage our in-memory caching. Hazelcast is easily integrated with most Web applications (include the Hazelcast jar and XML file), so the overhead is minimal. When we take advantage of Aspect Oriented Programming (AOP), with the help of Spring and AspectJ, we can leave our currently implemented code in place and implement our distributed caching strategy with minimal code changes.

Let's look at a simple example where we are loading and saving objects in a simple Data Access Object (DAO). PersistentObject is the persistent data object we are going to use in this example. Note that this object implements Serializable, which is required if we want to put it into Hazelcast's distributed Map (it is also a good idea for applications that utilize session replication).
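As a rough illustration, a Serializable data object and the DAO interface could look something like the sketch below. The "id" and "name" fields are hypothetical placeholders, and the interface is inferred from the two methods ("fetch" and "save") used in the rest of this example. The types are package-private only so the sketch fits in a single file.

```java
import java.io.Serializable;

// Hypothetical shape of the persistent object; the fields are illustrative only.
class PersistentObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long id;
    private String name;

    // No-argument constructor, as used by the DAO in this example.
    PersistentObject() { }

    PersistentObject(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

// DAO contract inferred from the "fetch" and "save" methods in this example.
interface DataAccessObject {
    PersistentObject fetch(Long id);
    PersistentObject save(PersistentObject persistentObject);
}
```

Because the object is Serializable, it can be written to and read back from a byte stream, which is what a distributed map needs in order to move entries between cluster members.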

Here is the implementation of the DataAccessObject interface. Again, it is really simple; in fact, I left out the meat of the implementation for brevity, but it will work for this example. Each of these methods would usually have some JDBC or ORM-related code. The key here is that our DAO code will not change when we refactor to use the distributed data cache, because the caching will be implemented with AspectJ and Spring.

    // Log and LogFactory are assumed to come from Apache Commons Logging.
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    public class DataAccessObjectImpl implements DataAccessObject {

        private static final Log log = LogFactory.getLog(DataAccessObjectImpl.class);

        @Override
        public PersistentObject fetch(Long id) {
            log.info("***** Fetch from data store!");
            // do some work to get a PersistentObject from data store
            return new PersistentObject();
        }

        @Override
        public PersistentObject save(PersistentObject persistentObject) {
            log.info("***** Save to the data store!");
            // do some work to save a PersistentObject to the data store
            return persistentObject;
        }
    }

The "getFromHazelcast" method lives in the DataAccessAspect class. It is "Around" advice that runs whenever any method named "fetch" is called. The purpose of this advice and its pointcut is to let us intercept the call to the "fetch" method in the DAO and possibly reduce "read" calls to our data store. In this method, we get the Long "id" argument from the intercepted "fetch" call, get our distributed Map "persistentObjects" from Hazelcast, and try to return a PersistentObject from it. If the object is not found in the distributed Map, we let the "fetch" method handle the work as originally designed.
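To make the flow concrete without pulling in any libraries, here is a plain-Java sketch of the same interception idea. It is not the DataAccessAspect code: a JDK dynamic proxy stands in for the "Around" advice, and a ConcurrentHashMap stands in for the Hazelcast distributed Map. The class name FetchInterceptorSketch and the trimmed-down PersistentObject and DataAccessObject types are assumptions made so the snippet compiles on its own.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal stand-ins so the sketch compiles on its own.
class PersistentObject implements java.io.Serializable {
    private static final long serialVersionUID = 1L;
    final Long id;
    PersistentObject(Long id) { this.id = id; }
}

interface DataAccessObject {
    PersistentObject fetch(Long id);
}

class FetchInterceptorSketch {
    // Stand-in for the Hazelcast distributed Map "persistentObjects".
    static final ConcurrentMap<Long, PersistentObject> persistentObjects =
            new ConcurrentHashMap<>();

    // Wraps a DAO so that calls to "fetch" consult the map first,
    // playing the role of the "Around" advice described above.
    static DataAccessObject intercept(DataAccessObject target) {
        return (DataAccessObject) Proxy.newProxyInstance(
                DataAccessObject.class.getClassLoader(),
                new Class<?>[] { DataAccessObject.class },
                (proxy, method, args) -> {
                    if ("fetch".equals(method.getName())
                            && args != null && args.length == 1
                            && args[0] instanceof Long) {
                        PersistentObject cached = persistentObjects.get((Long) args[0]);
                        if (cached != null) {
                            return cached; // cache hit: the data store is never touched
                        }
                    }
                    try {
                        // Cache miss: let the original method do its work.
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // surface the DAO's own exception
                    }
                });
    }
}
```

Note that, as in the description above, a miss does not populate the map; entries only arrive when objects are saved.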

The "putIntoHazelcast" method also lives in the DataAccessAspect class. It is "AfterReturning" advice that runs whenever any method named "save" returns. As each PersistentObject is persisted to the data store in the "save" method, the "putIntoHazelcast" method inserts or updates that PersistentObject in the distributed Map "persistentObjects". This way we have our most recent PersistentObject versions available in the distributed Map. If we just keep inserting and updating every PersistentObject in the distributed Map, we will eventually have to look into the Map's eviction policy to keep the more relevant application data in our cache, unless we have abundant memory.
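The same plain-Java approach can illustrate the "save" side. Again, this is only a sketch and not the DataAccessAspect code: a JDK dynamic proxy stands in for the "AfterReturning" advice, a ConcurrentHashMap stands in for the distributed Map, and keying the map by the object's "id" is my assumption. SaveInterceptorSketch and the trimmed-down types are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal stand-ins so the sketch compiles on its own.
class PersistentObject implements java.io.Serializable {
    private static final long serialVersionUID = 1L;
    final Long id;
    final String name;
    PersistentObject(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

interface DataAccessObject {
    PersistentObject save(PersistentObject persistentObject);
}

class SaveInterceptorSketch {
    // Stand-in for the Hazelcast distributed Map "persistentObjects".
    static final ConcurrentMap<Long, PersistentObject> persistentObjects =
            new ConcurrentHashMap<>();

    // Wraps a DAO so that every successful "save" also updates the map,
    // playing the role of the "AfterReturning" advice described above.
    static DataAccessObject intercept(DataAccessObject target) {
        return (DataAccessObject) Proxy.newProxyInstance(
                DataAccessObject.class.getClassLoader(),
                new Class<?>[] { DataAccessObject.class },
                (proxy, method, args) -> {
                    Object result;
                    try {
                        result = method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // a failed save never reaches the map
                    }
                    if ("save".equals(method.getName())
                            && result instanceof PersistentObject) {
                        PersistentObject saved = (PersistentObject) result;
                        if (saved.id != null) {
                            // Insert or overwrite, keyed by id (an assumption).
                            persistentObjects.put(saved.id, saved);
                        }
                    }
                    return result;
                });
    }
}
```

Saving the same id twice leaves a single entry holding the latest version, and a save that throws leaves the map unchanged.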

This is a simple example of how Spring, AspectJ and Hazelcast can work together to help reduce "read" calls to a data store. Imagine reducing one application's "read" executions against a legacy data store while improving read performance metrics. This example doesn't really answer all questions and concerns that will arise when implementing and utilizing a Hazelcast distributed data cache with Spring and AspectJ, but I think it shows that these technologies can help lower resource utilization and increase performance. Here is a link to the demo project.