The POJO Container: POJO Clustering at the JVM Level : Page 3

POJO clustering at the JVM level not only provides a simple way for applications to achieve scale-out, but also sets the stage for a stack of enterprise-scale services and practices based on simple POJOs: the POJO container.

by Ari Zilka

Jan 10, 2007


Introducing the POJO Container
POJO clustering provides a good foundation for a new, simpler, scaled-out architecture, but it is not enough by itself. Applications built around POJO clustering should be contained by some well-understood design patterns to ensure consistent and predictable application architecture and performance. What emerges from a system of consistent application design layered on top of POJO clustering services is what Terracotta calls the POJO container: a stack of services and practices based on simple POJOs that exhibit enterprise-scale performance, availability, and control when deployed in a production datacenter.

The Unit of Work Pattern
Inside the POJO container, Terracotta turns to the Unit of Work pattern to provide a simple and consistent application architecture. Think of the Unit of Work pattern inside the POJO container in terms of the classic master/workers design pattern. The Unit of Work pattern consists of a master, a work queue, a results queue, and workers. The master communicates with the workers simply by introducing units of work into the work queue. Workers then remove those units of work, execute the work, and put any necessary responses onto the results queue for master aggregation and summary.

This design pattern is easy to use and run in a single JVM, and Java 1.5 offers some built-in support for it, but deploying Unit of Work applications in a cluster today requires some heavyweight infrastructure (such as a JMS message bus), which inevitably violates the POJO principle. Therefore, the Unit of Work pattern, institutionalized inside a cluster-ready container, is the core of the POJO container. Figure 3 illustrates the container's basic design.

The master is passed an object that implements the UnitOfWork interface. The master then calls the routeMe() method, from which the UnitOfWork object returns a reference to the Sink that will carry the unit of work to a worker. The master then sends that unit of work down the Sink to the worker. Workers de-queue only the work they logically own. When they do, they pass the unit of work to the handler attached to it, which contains the business logic for the given type of work. The worker then en-queues the return result onto the results queue. This logical structure is a loop: every time a UnitOfWork is en-queued by any member of the cluster, routeMe() gets called. This means work can pass from worker to worker, not simply between master and worker.
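The interplay just described can be sketched in plain Java. This is a minimal illustration, not Terracotta's actual API: the exact signatures of Sink and UnitOfWork, and the queue-backed wiring, are assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sink and UnitOfWork mirror the interfaces described in the article;
// their exact signatures here are assumptions for illustration.
interface Sink {
    void put(UnitOfWork work) throws InterruptedException;
}

interface UnitOfWork {
    Sink routeMe();   // which Sink (worker queue) should carry this work
    Object doWork();  // business logic for this type of work
}

public class MasterDemo {
    static Object runOnce() throws InterruptedException {
        BlockingQueue<UnitOfWork> workQueue = new ArrayBlockingQueue<>(16);
        Sink sink = workQueue::put;

        // A trivial unit of work that routes itself to the shared sink.
        UnitOfWork uow = new UnitOfWork() {
            public Sink routeMe() { return sink; }
            public Object doWork() { return "done"; }
        };

        // The master asks the work where it belongs, then sends it there.
        uow.routeMe().put(uow);

        // A worker would now de-queue the work it owns and execute it.
        return workQueue.take().doWork();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce()); // prints "done"
    }
}
```

In a clustered deployment the queue behind the Sink would be a shared object, so the same routeMe() call works whether the work lands on a local or a remote worker.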

Simple and Scalable
The Unit of Work pattern in the POJO container can be very simple. A lead developer can implement the master. Workers needn't be implemented at all because they are application agnostic; they implement the following simple loop:

1. Call queue.pop(), blocking until it returns;
2. Take the UnitOfWork off the queue;
3. Fire UnitOfWork.doWork();
4. Fire resultsQueue.push( return result ); and
5. Loop.
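A worker implementing that loop might look like the following minimal sketch. The nested UnitOfWork interface and the queue wiring are illustrative assumptions; a blocking take() stands in for the pop-and-block steps.

```java
import java.util.concurrent.BlockingQueue;

// Application-agnostic worker: it knows nothing about the business logic,
// only how to pull work, run it, and push the result back.
public class Worker implements Runnable {
    interface UnitOfWork { Object doWork(); }   // assumed shape of a unit of work

    private final BlockingQueue<UnitOfWork> workQueue;
    private final BlockingQueue<Object> resultsQueue;

    Worker(BlockingQueue<UnitOfWork> work, BlockingQueue<Object> results) {
        this.workQueue = work;
        this.resultsQueue = results;
    }

    public void run() {
        try {
            while (true) {
                UnitOfWork work = workQueue.take();  // pop; blocks until work arrives
                Object result = work.doWork();       // execute the business logic
                resultsQueue.put(result);            // hand the result back
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();      // shut down cleanly
        }
    }
}
```

Because the loop carries no application state, any number of identical workers can be pointed at the same queues.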

The average developer need develop only units of work (UoWs) and handlers for the different types of UoWs. The only caveat is that when UoWs depend on other UoWs, they are no longer atomic. In such cases, one amongst the master, the UoW, or the worker has to carry some notion of state. To maintain simplicity, the UoW is the best place to keep this state. A worker can fire doWork() and then re-enqueue the work for itself to pick up at a later time. This way, workers remain stateless and completely scalable and restartable, with all those good operational traits that IT likes.
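Keeping that state on the UoW itself might look like this sketch; the step counter, the step count, and the doWork() signature that takes its own queue are assumptions for illustration.

```java
import java.util.concurrent.BlockingQueue;

// A multi-step unit of work that carries its own progress, so workers can
// stay stateless: each pass runs one step and re-enqueues the remainder.
public class MultiStepWork {
    private int step = 0;                 // state travels with the work itself
    private static final int STEPS = 3;   // assumed number of steps

    /** Runs one step; re-enqueues itself until all steps are done. */
    boolean doWork(BlockingQueue<MultiStepWork> queue) throws InterruptedException {
        step++;
        if (step < STEPS) {
            queue.put(this);              // come back for the remaining steps later
            return false;                 // not finished yet
        }
        return true;                      // all steps complete
    }
}
```

Any worker that happens to de-queue the work can run the next step, because everything it needs rides along in the object.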

But what of scalability? Scaling the POJO container does not require understanding the black-box networking infrastructure that moves the objects amongst the JVMs. In fact, the biggest challenge is bottlenecking on a single queue, since that queue functions as the routing core for communications between master and workers. For the average application, this is rarely an issue.

At high data volumes, a single work queue will most certainly be a bottleneck. The solution is to create a queue per worker and put the load-balancing work on the master. With a queue per worker, the mutations to the queue data structure will not be broadcast to the cluster, lowering the container's tendency to bottleneck on the network as more workers are introduced. Furthermore, multiple load-balancing schemes can be used because the master now has discrete queues per worker. Consider a few examples:

Round-robin would require the master to en-queue each UnitOfWork into workers' queues, one after the other.

Workload-sensitive balancing would require the master to look at each worker's queue depth and en-queue work to the least utilized worker.

Data affinity (a.k.a. sticky load balancing) would require the developer to implement a routeMe() method that sends the same class of work to the same worker each time.
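With a queue per worker, the three schemes above reduce to a choice of routing method on the master. The sketch below is illustrative only: the UnitOfWork shape, the routing signatures, and the kind() key used for affinity are all assumptions.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Three routing strategies over one queue per worker, as described above.
public class Balancers {
    interface UnitOfWork { String kind(); }  // assumed affinity key

    private final List<BlockingQueue<UnitOfWork>> queues;  // one per worker
    private int next = 0;

    Balancers(List<BlockingQueue<UnitOfWork>> queues) { this.queues = queues; }

    // Round-robin: each worker's queue gets work in turn.
    BlockingQueue<UnitOfWork> roundRobin() {
        BlockingQueue<UnitOfWork> q = queues.get(next);
        next = (next + 1) % queues.size();
        return q;
    }

    // Workload-sensitive: route to the worker with the shallowest queue.
    BlockingQueue<UnitOfWork> leastLoaded() {
        BlockingQueue<UnitOfWork> best = queues.get(0);
        for (BlockingQueue<UnitOfWork> q : queues)
            if (q.size() < best.size()) best = q;
        return best;
    }

    // Data affinity: the same kind of work always lands on the same worker.
    BlockingQueue<UnitOfWork> sticky(UnitOfWork work) {
        return queues.get(Math.floorMod(work.kind().hashCode(), queues.size()));
    }
}
```

In a real container the sticky routing would live in the work's routeMe() method, but the hashing idea is the same.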

The scalability characteristics of applications developed against the Unit of Work pattern and deployed in a clustered POJO container offer significant advantages. Workers can be introduced on demand. Load-balancing techniques can be introduced separately from the basic business logic. The average developer does not see threads and concurrency in this event-driven model and so will tend to introduce bugs that are functional in scope rather than infrastructural or performance-related; such bugs are easier both to reproduce and to automate tests for.

The key to clustered POJO-based application development is to have access to the source behind the Master/Worker/UoW interfaces so that lead-level developers can freely manipulate workers, master, routing/load balancing, and work-state. The container loses its power to simplify when it begins to constrain development rather than abstract or contain it. A black-box UnitOfWork engine not based on POJOs will tend toward constraint since, logically, a one-size-fits-all abstraction doesn't exist.