Performance Analysis of J2EE Applications Using AOP Techniques

In a complex distributed computing environment like J2EE, it is very difficult to pinpoint the component that is causing a performance bottleneck. Applications can be profiled by including instrumentation code manually, but this could be cumbersome and time-consuming, and might impact the stability of the application itself. Aspect-Oriented Programming (AOP) technology can be elegantly and effectively applied for performance analysis, as illustrated by Davies et al.

Aspect-oriented programming allows the programmer to inject pieces of functionality into existing code. This can be done either at compile time (AspectJ) or at run time (Aspectwerkz). The injected functionality typically addresses cross-cutting concerns that are spread among existing pieces of code. In AOP terminology, functionality that can be injected into existing code is termed an advice. The point of execution in the existing code where the advice is to be applied is termed a point-cut. A point-cut together with an advice is termed an aspect. For more information on AOP, refer to Graham O'Regan's ONJava article "Introduction to Aspect-Oriented Programming."
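To make the three terms concrete, here is a minimal sketch in plain Java that imitates an around advice with java.lang.reflect.Proxy. This is not how AspectJ or Aspectwerkz weave code; it only illustrates the vocabulary. The names OrderService, SimpleOrderService, and TimingAspect are hypothetical, and the "point-cut" is assumed to be a simple method-name prefix check.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface standing in for existing application code.
interface OrderService {
    String placeOrder(String item);
    String describe();
}

class SimpleOrderService implements OrderService {
    public String placeOrder(String item) { return "ordered " + item; }
    public String describe() { return "simple order service"; }
}

// The "aspect": a point-cut (which methods to intercept) plus an advice (what to do there).
class TimingAspect implements InvocationHandler {
    private final Object target;
    final List<String> timedMethods = new ArrayList<>();

    TimingAspect(Object target) { this.target = target; }

    // Point-cut (assumed rule): select only methods whose name starts with "place".
    private boolean matches(Method method) {
        return method.getName().startsWith("place");
    }

    // Calls the original method and unwraps exceptions thrown by it.
    private Object proceed(Method method, Object[] args) throws Throwable {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }

    // Advice: runs around each selected method and records its elapsed time.
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (!matches(method)) {
            return proceed(method, args);
        }
        long start = System.nanoTime();
        try {
            return proceed(method, args);
        } finally {
            long elapsedNanos = System.nanoTime() - start;
            timedMethods.add(method.getName());
            System.out.println(method.getName() + " took " + elapsedNanos + " ns");
        }
    }

    // Returns a proxy that applies the aspect to every call on the target.
    static OrderService weave(OrderService target, TimingAspect aspect) {
        return (OrderService) Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[] { OrderService.class },
                aspect);
    }
}
```

Calling placeOrder on the object returned by weave triggers the advice, while describe passes straight through because the point-cut does not select it. The application classes themselves are never modified, which is the property the rest of this article relies on.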

In this article, we demonstrate AOP techniques through which J2EE applications can be instrumented without any modifications to application code. We have developed a simple tool to achieve this objective. Since the instrumentation has very low overhead, the tool can be deployed in staging environments to identify problematic Java method calls and SQL statements.

We describe the architecture of the profiling tool and then the advices that were developed to instrument the application. This is followed by an illustration of how the instrumentation can be added to the necessary method calls through point-cuts, and finally, we show some of the results obtained through this tool.

Architecture

The architecture of the system is shown in Figure 1 and detailed in the following sections.

Figure 1. Architecture of AOP Profiler

AOP Infrastructure

We considered both AspectJ and Aspectwerkz for providing the AOP infrastructure and chose Aspectwerkz, since it does not require the J2EE application to be recompiled. Because of this capability, we can profile an existing J2EE application without any additional development activity. We did notice, however, that Aspectwerkz introduces a small overhead compared with AspectJ; this is due to the reflection (java.lang.reflect.Method.invoke()) that Aspectwerkz uses to incorporate the advices into the application code. Because the point-cuts are defined in a simple XML file, Aspectwerkz also makes it much easier in our situation to decide which methods need to be profiled.

Agent-Server Architecture

To reduce the overhead on the application that is being profiled, we use an agent-server architecture. The aspects incorporate lightweight code that captures the timing information and then transmits it to a server, which is expected to run on a different machine on the network. The server parses this information and stores it in a MySQL database. Since all of the profiling information is in a database, we can write different kinds of SQL queries to view the profiling data from different angles. For example, a simple group by with an avg function gives the average execution time of every method, and sorting that list with order by pinpoints the most expensive method.
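The query described above computes a per-method average and sorts by it. Purely as an illustration, the following sketch performs the same aggregation in memory with Java streams. In the tool itself this work is done by SQL against the MySQL database; the TimingRecord class and its fields are assumptions for the example, not the tool's actual schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical timing record; the real table layout is not shown in this article.
class TimingRecord {
    final String methodName;
    final long elapsedMillis;

    TimingRecord(String methodName, long elapsedMillis) {
        this.methodName = methodName;
        this.elapsedMillis = elapsedMillis;
    }
}

class TimingReport {
    // In-memory analogue of "group by method, avg(elapsed), order by avg descending".
    static List<Map.Entry<String, Double>> averageByMethod(List<TimingRecord> records) {
        Map<String, Double> averages = records.stream()
                .collect(Collectors.groupingBy(
                        r -> r.methodName,
                        Collectors.averagingLong(r -> r.elapsedMillis)));
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(averages.entrySet());
        // Most expensive method first.
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return sorted;
    }
}
```

The first entry of the returned list corresponds to the method with the highest average execution time, which is the one a performance investigation would usually start with.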

Capturing CPU Time

While it is easy to capture the elapsed time using System.currentTimeMillis(), this measure is not accurate in all situations, especially under contention, when a thread spends part of the interval waiting rather than executing. The CPU time is a more accurate measure of the execution time of a method. We capture the CPU time using the JVM Profiler Interface (JVMPI). Please refer to "Using JVM Profiler Interface for Accurate Timing," by Jesper Gortz, for more information.
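JVMPI is a native interface, so it cannot be shown in a few lines of Java. As a stand-in, the sketch below uses java.lang.management.ThreadMXBean, which is available on Java 5 and later and exposes per-thread CPU time. This is a different technique from the one used in our tool, and the class name CpuTimeSample is hypothetical; the point is only to show how elapsed time and CPU time can diverge.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuTimeSample {
    // Returns {elapsedNanos, cpuNanos} for the task run on the current thread.
    // cpuNanos is -1 if this JVM cannot measure CPU time for the current thread.
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported()
                && threads.isThreadCpuTimeEnabled();

        long wallStart = System.nanoTime();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : -1L;

        task.run();

        long cpuEnd = cpuSupported ? threads.getCurrentThreadCpuTime() : -1L;
        long wallEnd = System.nanoTime();

        long cpuNanos = cpuSupported ? cpuEnd - cpuStart : -1L;
        return new long[] { wallEnd - wallStart, cpuNanos };
    }
}
```

For a task that mostly sleeps or waits on a lock, the elapsed value is dominated by the waiting time while the CPU value stays small, which is why CPU time is the better indicator of how expensive the method's own work is.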

Capturing SQL Execution Time

Most J2EE applications are data-centric and typically persist data in relational databases. A critical part of the performance analysis of a J2EE application therefore depends on timing information for the SQL statements issued by the application. We capture this information by utilizing the P6Log driver. This piece of software acts as a layer between the J2EE connection pool and the actual JDBC driver and captures the timing information of the SQL statements issued. We apply aspects to this software to retrieve that information.

Capturing the Sequence of Method Execution

Apart from obtaining the individual method execution times, it is also helpful to observe the control flow through the container as it fulfills a request. We capture this flow using ThreadLocal variables. The ThreadLocal variable holds a unique ID for each request, along with a sequence number that increases in the order of method execution. The limitation of this implementation is that the sequence can be captured meaningfully only when all of the components to be profiled execute in the same JVM; i.e., there are no remote calls. A servlet filter is used to trap all requests, and an aspect is applied to the filter method to initialize the ThreadLocal variable with a new request ID and a sequence starting from 0.
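The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the code shipped with the tool: the class name RequestContext and its methods are hypothetical, and it assumes that each request is served from start to finish by a single thread.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder for the per-request ID and the running sequence number.
class RequestContext {
    private static final AtomicLong NEXT_REQUEST_ID = new AtomicLong(1);

    // One {requestId, sequence} pair per thread; ID 0 means "no request started yet".
    private static final ThreadLocal<long[]> CURRENT =
            ThreadLocal.withInitial(() -> new long[] { 0L, 0L });

    // Called at the request entry point, for example by the advice on the servlet filter.
    static long startRequest() {
        long id = NEXT_REQUEST_ID.getAndIncrement();
        CURRENT.set(new long[] { id, 0L });
        return id;
    }

    // ID of the request currently being served by this thread.
    static long requestId() {
        return CURRENT.get()[0];
    }

    // Called by the method-timing advice; yields 0, 1, 2, ... within one request.
    static long nextSequence() {
        return CURRENT.get()[1]++;
    }
}
```

Because the state lives in a ThreadLocal, concurrent requests on different threads never see each other's IDs or sequence numbers. The same property explains the limitation noted above: a remote call runs on a thread in another JVM, where this context does not exist.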

Using Advices to Capture Performance Data

In this section, we describe the three different advices we use to capture performance metrics
from a J2EE application. The first advice is used to capture the sequence of method
execution, the second one captures the performance metrics of Java method execution,
and the third one captures the SQL execution times. These classes and some supporting code are available in the source code link at the bottom of the article.

StartRequest Advice

This advice is used to capture the sequence of method execution. It should typically be applied at the entry point of the relevant layer of the J2EE application. In the web layer, it can be applied either to the main controller servlet (if MVC is implemented) or to a servlet filter that filters all requests to the web layer. The advice stores a new request ID in the ThreadLocal variable and resets the sequence count to 0.