Using a Request Filter to Limit the Load on Web Applications

How do you respond when a web application is running a little slow? If you are like me, you try to help it along by clicking on another link, refreshing the page, using the Back button, or otherwise sending more requests to the server. Although this may seem like an innocent way to get the site "unstuck," it really just makes the situation worse by increasing the load.

Consider reading the front page of an online newspaper. A headline at the top of the page catches your attention, so you click on the link. While waiting for that page to load, you also decide to click on a link for tomorrow's weather that you found halfway down the page. A split second later, you spot what you were really looking for in the sports section at the bottom of the page, and click on that link. As an IT professional, you probably spend a lot of time reading web pages, and are efficient at skimming sites. So, as in this example, it is conceivable that you may click three links on a familiar page within the three or four seconds it may take for a server to start sending back a response from your first request.

In this article, we present the design of a filter that synchronizes client requests and restricts the load each user can put on your applications. The source is available as RequestControlFilter.java. Application designers can use our filter to prevent a downward performance spiral where well-intentioned users bring an already overloaded server to its knees.

In the example above, you changed your mind twice and sent three requests to the server. You only care about the last request. Our filter will help the server detect that you do not care about the others and prevent it from doing unnecessary work to process them.

Restricting Requests

To control the load and prevent unnecessary processing, we want to restrict the server so that it only processes one request at a time per session. This limits the load any individual user can put on the server, regardless of how aggressively they click through the application. We do this with a prioritized request queue. Impatient users will make multiple additional requests while waiting for a first one to finish, so we need intelligence in the queue to handle the user's requests without doing unnecessary work on the server. This is how the queue works:

The queue holds a maximum of one request at a time.

A new request always replaces an old request in the queue, except for the request that is currently being processed by the application.
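These two rules amount to a one-slot holder in which the newest arrival always wins. The sketch below is a simplified model of that behavior, not the filter's actual code: it uses a plain Java class and represents requests as strings, whereas the real filter queues HttpServletRequest objects per session.

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified model of the one-slot request queue: a new
// request always replaces whatever is already waiting.
public class OneSlotQueue {
    private final AtomicReference<String> slot = new AtomicReference<>();

    // Enqueue a request, displacing any request already waiting.
    // Returns the displaced request, or null if the slot was empty.
    public String enqueue(String request) {
        return slot.getAndSet(request);
    }

    // Remove and return the waiting request, or null if none.
    public String dequeue() {
        return slot.getAndSet(null);
    }

    public static void main(String[] args) {
        OneSlotQueue queue = new OneSlotQueue();
        // The headline request is being processed, so the
        // weather and sports requests go to the queue.
        queue.enqueue("weather");
        String displaced = queue.enqueue("sports"); // replaces "weather"
        System.out.println("displaced: " + displaced); // prints "displaced: weather"
        System.out.println("next: " + queue.dequeue()); // prints "next: sports"
    }
}
```

Because the slot only ever holds the most recent request, the server never even sees the abandoned ones.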

This queue models the way that browser software handles a user clicking through a web site. In the earlier example, your web browser will request the headline article, then the weather, and finally the sports article. However, it will only display the sports article from your last request, even if the server starts sending back the other pages first. Similarly, here is how the example plays out with the filter installed on the server:

The server receives the request for the headline article. The queue is empty, so the server starts processing this request immediately.

The server receives the request for the weather, and since it is still processing the previous article, this second request is placed in the queue.

The request for the sports article arrives, and it replaces the weather request in the queue.

After the server finishes processing the headline article, it processes the request for the sports article.

With the queue design presented here, the web server detects that it does not need to waste CPU cycles processing the request for the weather.

This code fragment implements the queue. Requests are queued per session.

synchronized( getSynchronizationObject( session ) )
{
    // If another request is already being processed
    // for this session, then wait.
    if( isRequestInProcess( session ) )
    {
        // Put this request in the queue and wait.
        enqueueRequest( httpRequest );
        if( !waitForRelease( httpRequest ) )
        {
            // This request was replaced in the queue
            // by another request, so it need not be
            // processed.
            return;
        }
    }
    // Lock the session, so that no other requests
    // are processed until this one finishes.
    setRequestInProgress( httpRequest );
}

// Process this request, and then release the
// session lock regardless of any exceptions
// thrown farther down the chain.
try
{
    chain.doFilter( request, response );
}
finally
{
    releaseQueuedRequest( httpRequest );
}

This implementation expects that the HttpSession is not clustered across multiple virtual machines. Although HTTP session clustering is supported by recent versions of the major servlet containers, the popular consensus is that it is rarely justified. It would be possible to build a filter that synchronizes requests spread across multiple servers within the same session, but the overhead may exceed the potential gain. In those environments, an easier solution to performance problems may be to increase the size of the server farm.

We implement the javax.servlet.Filter interface. This feature of the Servlet 2.3 API allows us to intercept requests before any servlet has a chance to process them. The application server takes care of setting up the filter chain and calls each filter's doFilter() method; our implementation then decides whether to continue down the chain or cancel the request.

To use the filter with an application, register it in the filter section of the application's web.xml deployment descriptor.
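A registration might look like the following fragment. The filter class is the RequestControlFilter mentioned earlier; the package name and the url-pattern are assumptions you should adjust for your own application.

```xml
<!-- Hypothetical registration; adjust the class's package
     and the url-pattern to match your application. -->
<filter>
    <filter-name>RequestControlFilter</filter-name>
    <filter-class>RequestControlFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>RequestControlFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```

The url-pattern of /* applies the filter to every request; a narrower pattern would limit it to the pages where duplicate clicks are actually expensive.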

This filter is useful not only for limiting the load on the application server, but also for simplifying the programming model. Since the server will only be processing one request per session at a time, there may not be a need to worry about synchronizing access to objects in the session context. Only objects that could be accessed from multiple sessions might need to be synchronized.