Clustering with JBoss/Jetty

Back in January 2001, my company was experiencing some serious dot-com woes. We had been developing a product with a trial license, but we could not afford to purchase the software after the trial period had ended. Fortunately for us, open source solutions such as JBoss/Jetty had matured to the point where we thought that they were viable J2EE solutions. And while our financial condition has improved since January, our experience with open source has been so positive that we no longer consider switching back to any closed-source solutions.

Our application is a Web-based service providing e-marketing tools for small to mid-sized businesses. For us to succeed, our architecture has to scale to tens of thousands of merchants and millions of transactions a day. Our current hardware configuration consists of three dual-processor Linux boxes running our J2EE application server and one dual-processor Sparc running an Oracle database. Sitting in front of the three app server machines is a Cisco CSS 11150 Load Balancer with a redundant hot backup. Could JBoss/Jetty running on this hardware meet our needs?

The major obstacle to replacing a commercial Web application server was losing its clustering features. We defined clustering as having these four attributes: synchronization, load-balancing, fail-over, and distributed transactions. Our commercially licensed product offered, among other things, distributed transactions, replica-aware HTTP sessions, and RMI and EJB stubs that provide load-balancing and fail-over facilities. Although JBoss/Jetty can be integrated with Tyrex, a distributed transaction manager, it does not have any replication or fail-over capabilities. The question was: did we actually need all these features? Could we implement some of them ourselves? Or were there ways we could simulate some of them? The remainder of this article details these four attributes and describes how we achieved similar functionality with JBoss/Jetty.

Synchronization

Sometimes, certain database tables need strictly synchronized, serialized access. A clustering solution needs to support some kind of distributed locking mechanism for entity beans that map to these sensitive tables, to protect against dirty reads and other concurrent-access problems. JBoss/Jetty does not have a distributed caching or locking mechanism, so to get around this problem, we used a combination of two JBoss/Jetty features.

The EJB 2.0 specification (10.5.9 Commit Options) defines the ability to apply different types of commit behavior to entity beans. We used commit option "C," which states that an entity bean must always do an ejbLoad() to synchronize its state with the datastore before it enters a transaction. On commit, the container passivates the entity bean. In a clustered environment, this option is great because it protects against reading any entity bean from an out-of-date cache.
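In JBoss, the commit option is set per container configuration in the deployment descriptors. As a rough sketch only (element names have varied between JBoss releases, and the container name below is illustrative), the relevant standardjboss.xml entry looked something like this:

```xml
<container-configuration>
  <container-name>Standard CMP EntityBean</container-name>
  <!-- Commit option "C": the bean's state is reloaded from the
       database at the start of every transaction, so a node never
       reads stale state from its local cache. -->
  <commit-option>C</commit-option>
</container-configuration>
```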

To serialize access to critical tables in our application, we used our Oracle database as a distributed locking mechanism. JBoss/Jetty CMP has a nice feature called "select-for-update." On an entity bean load, CMP issues the SQL load statement with syntax that acquires a row-level lock. This feature, combined with commit option "C," gave us synchronized, serialized, isolated access to our entity beans.
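With select-for-update enabled, the SELECT that CMP generates for ejbLoad() gains a FOR UPDATE clause, so the row stays locked in Oracle until the transaction commits or rolls back. A sketch of the CMP descriptor entry, assuming the JAWS-style configuration of the JBoss 2.x era (the exact element placement may differ in your release):

```xml
<jaws>
  <default-entity>
    <!-- Generate "SELECT ... FOR UPDATE" on entity load, taking a
         row-level lock for the life of the transaction. -->
    <select-for-update>true</select-for-update>
  </default-entity>
</jaws>
```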

Now, we did not use option "C" and "select-for-update" for all our entity bean types. For our read-mostly beans, or beans we considered to have very low concurrent access, we figured commit option "C" alone would be enough to guard against access problems. For our read-only beans, we used commit option "A" (this option actually makes JBoss/Jetty entity caching useful), along with the RequiresNew transaction attribute defined for every method. With commit option "A," database traffic is cut substantially when accessing these read-only beans because they are read from the database only once and cached locally in memory. Since JBoss/Jetty pessimistically locks entity beans as they enter a transaction, RequiresNew means our read-only beans are locked only for the duration of each method invocation, which allows for higher throughput under heavy concurrent access.
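Transaction attributes are declared in the standard ejb-jar.xml assembly descriptor. A minimal sketch for a hypothetical read-only bean (the bean name is made up for illustration):

```xml
<assembly-descriptor>
  <container-transaction>
    <method>
      <ejb-name>CountryCodeBean</ejb-name>
      <!-- Apply RequiresNew to every method, so the container's
           pessimistic lock is held only for the duration of each
           individual invocation. -->
      <method-name>*</method-name>
    </method>
    <trans-attribute>RequiresNew</trans-attribute>
  </container-transaction>
</assembly-descriptor>
```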

Load-Balancing

You should be able to scale a cluster simply by throwing more hardware at it. Ideally, a good solution load-balances HTTP and EJB calls evenly across the nodes of a cluster. Our application takes input solely through our user interface (JSPs) and batched XML-RPC calls, so basically all input arrives over HTTP. Here is where the Cisco CSS 11150 Load Balancer comes in. It really is a cool little box. The Cisco Load Balancer presents a single IP address and can route HTTP traffic to any awaiting HTTP server on any machine plugged into it. It can load-balance using specialized round-robin mechanisms, or you can set it up to create "sticky" sessions. Our XML-RPC traffic is load-balanced strictly round robin. Since JBoss/Jetty does not support a distributed HttpSession, we used the Cisco Load Balancer to set up "sticky" sessions: if a user comes to our site, they stay on the same JBoss/Jetty instance for their entire HTTP session.

Fail-Over

If a node in a cluster goes down, HTTP sessions and EJB invocations should be recoverable and routed to other nodes in the cluster. We met this requirement by, one, using our Cisco Load Balancer and, two, ignoring the problem. For our UI, we decided that it was acceptable not to have fail-over HTTP sessions or EJBs. Even when we were committed to our previously licensed commercial Web application server, we were considering not enabling its fail-over features, fearing the performance costs. The Cisco Load Balancer is able to detect node failures and stop routing HTTP traffic to dead nodes. So if a node fails in the middle of one of our users' sessions, they just have to log in again and start over. Since our UI is written on top of the EJB framework, user actions that are transactional get rolled back by our database on a node failure. For our batched XML-RPC calls, we needed to make sure that clients would re-send their XML-RPCs on failures and that our service could detect duplicate messages. We had already decided to put these XML-RPC features into our architecture when we were using the previous Web application server, so this was not an issue in our decision to move to JBoss/Jetty.
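The duplicate-detection half of that scheme can be sketched in a few lines. This is a hypothetical in-memory version for illustration only (a production version would track processed message IDs in a database table so the set survives restarts and is shared across nodes); the class and method names are made up:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of server-side duplicate detection for re-sent XML-RPC
// batches: each batch carries a client-generated message ID, and a
// batch is processed only the first time its ID is seen.
public class DuplicateFilter {

    // IDs of batches already processed. In production this would be a
    // database table, not in-memory state local to one node.
    private final Set<String> seen =
            Collections.synchronizedSet(new HashSet<String>());

    // Returns true if this message is new and should be processed;
    // false if it is a duplicate of an earlier delivery.
    public boolean accept(String messageId) {
        // Set.add() returns false when the ID was already present.
        return seen.add(messageId);
    }

    public static void main(String[] args) {
        DuplicateFilter filter = new DuplicateFilter();
        System.out.println(filter.accept("batch-42")); // first delivery: true
        System.out.println(filter.accept("batch-42")); // client re-send: false
    }
}
```

The client side is symmetric: it keeps re-sending a batch under the same message ID until it receives an acknowledgment, which makes delivery effectively at-least-once with idempotent processing.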

If you have not noticed already, we do have a single point of failure. We have redundant load-balancers, multiple machines running app servers, but we do not have redundancy in our database. For this, we are hoping that Oracle can provide us with a fail-over solution. I will have to get back to you on this one.

Distributed Transactions

Distributed transactions give a single transaction the ability to span multiple resources, across multiple application servers and databases. Luckily for us, we currently have only one database and no need for two-phase commit or distributed transactions. All UI and XML-RPC actions are confined to a single app server.

All in all, we have found that even though JBoss/Jetty does not have many clustering features in its current release, our hardware and software architectures can make up for these deficiencies. It is good to know that we did not have to pay any expensive licensing fees to get the scalable J2EE architecture that we required. Also, having the source code of our J2EE solution has really been a joy for us. Our application server has turned from a mysterious, unpredictable, unfixable, closed black box into an understandable, predictable, debuggable, open tool for application development. As clustering features are added to JBoss/Jetty, we will be able to use this openness to accurately determine whether these features fit into and scale in our application environment. But for now, we are confident that we have a scalable "clustered" solution.

Bill Burke
is a Fellow at the JBoss division of Red Hat Inc. A long-time JBoss contributor and architect, his current project is RESTEasy, RESTful Web Services for Java.