Learn with our tutorials and training

developerWorks provides tutorials, articles and other
technical resources to help you grow your development skills
on a wide variety of topics and products. Learn about a specific
product or take a course and get certified. So, what do you want to learn
about?

Featured products

Featured destinations

Find a community and connect

Learn from the experts and share with other developers in one of our
dev centers. Ask questions and get answers with dW answers. Search for local events
in your area. All in developerWorks communities.

This content is part # of # in the series: Java programming dynamics, Part 4

This content is part of the series:Java programming dynamics, Part 4

Stay tuned for additional content in this series.

After covering the basics of the Java class format and runtime access through reflection, it's time to move this series on to more advanced topics. This month I'll start in on the second part of the series, where the Java class information becomes just another form of data structure to be manipulated by applications. I'll call this whole topic area classworking.

I'll start classworking coverage with the Javassist bytecode manipulation library. Javassist isn't the only library for working with bytecode, but it does have one feature in particular that makes it a great starting point for experimenting with classworking: you can use Javassist to alter the bytecode of a Java class without actually needing to learn anything about bytecode or the Java virtual machine (JVM) architecture. This is a mixed blessing in some respects -- I don't generally advocate messing with technology you don't understand -- but it certainly makes bytecode manipulation much more accessible than with frameworks where you work at the level of individual instructions.

Javassist basics

Javassist lets you inspect, edit, and create Java binary classes. The inspection aspect mainly duplicates what's available directly in Java through the Reflection API, but having an alternative way to access this information is useful when you're actually modifying classes rather than just executing them. This is because the JVM design doesn't provide you any access to the raw class data after it's been loaded into the JVM. If you're going to work with classes as data, you need to do so outside of the JVM.

Javassist uses the javassist.ClassPool class to track and control the classes you're manipulating. This class works a lot like a JVM classloader, but with the important difference that rather than linking loaded classes for execution as part of your application, the class pool makes loaded classes usable as data through the Javassist API. You can use a default class pool that loads from the JVM search path, or define one that searches your own list of paths. You can even load binary classes directly from byte arrays or streams, and create new classes from scratch.

Classes loaded in a class pool are represented by javassist.CtClass instances. As with the standard Java java.lang.Class class, CtClass provides methods for inspecting class data such as fields and methods. That's just the start for CtClass, though, which also defines methods for adding new fields, methods, and constructors to the class, and for altering the class name, superclass, and interfaces. Oddly, Javassist does not provide any way of deleting fields, methods, or constructors from a class.

Fields, methods, and constructors are represented by javassist.CtField, javassist.CtMethod, and javassist.CtConstructor instances, respectively. These classes define methods for modifying all aspects of the item represented by the class, including the actual bytecode body of a method or constructor.

The source of all bytecode

Javassist lets you completely replace the bytecode body of a method or constructor, or selectively add bytecode at the beginning or end of the existing body (along with a couple of other variations for constructors). Either way, the new bytecode is passed as a Java-like source code statement or block in a String. The Javassist methods effectively compile the source code you provide into Java bytecode, which they then insert into the body of the target method or constructor.

The source code accepted by Javassist doesn't exactly match the Java language, but the main difference is just the addition of some special identifiers used to represent the method or constructor parameters, method return value, and other items you may want to use in your inserted code. These special identifiers all start with the $ symbol, so they're not going to interfere with anything you'd otherwise do in your code.

There are also some restrictions on what you can do in the source code you pass to Javassist. The first restriction is the actual format, which must be a single statement or block. This isn't much of a restriction for most purposes, because you can put any sequence of statements you want in a block. Here's an example using the special Javassist identifiers for the first two method parameter values to show how this works:

A more substantial limitation on the source code is that there's no way to refer to local variables declared outside the statement or block being added. This means that if you're adding code at both the start and end of a method, for instance, you generally won't be able to pass information from the code added at the start to the code added at the end. There are ways around this limitation, but the workarounds are messy -- you generally need to find a way to merge the separate code inserts into a single block.

Classworking with Javassist

For an example of applying Javassist, I'll use a task I've often handled directly in source code: measuring the time taken to execute a method. This is easy enough to do in the source; you just record the current time at the start of the method, then check the current time again at the end of the method and find the difference between the two values. If you don't have source code, it's normally much more difficult to get this type of timing information. That's where classworking comes in handy -- it lets you make changes like this for any method, without needing source code.

Ask the expert: Dennis Sosnoski on JVM and bytecode issues

For comments or questions about the material covered in this article series, as well as anything else that pertains to Java bytecode, the Java binary class format, or general JVM issues, visit the JVM and Bytecode discussion forum, moderated by Dennis Sosnoski.

Listing 1 shows a (bad) example method that I'll use as a guinea pig for my timing experiments: the buildString method of the StringBuilder class. This method constructs a String of any requested length by doing exactly what any Java performance guru will tell you not to do -- it repeatedly appends a single character to the end of a string in order to create a longer string. Because strings are immutable, this approach means a new string will be constructed each time through the loop, with the data copied from the old string and a single character added at the end. The net effect is that this method will run into more and more overhead as it's used to create longer strings.

Adding method timing

Because I have the source code available for this method, I'll show you how I would add the timing information directly. This will also serve as the model for what I want to do using Javassist. Listing 2 shows just the buildString() method, with timing added. This doesn't amount to much of a change. The added code just saves the start time to a local variable, then computes the elapsed time at the end of the method and prints it to the console.

Doing it with Javassist

Getting the same effect by using Javassist to manipulate the class bytecode seems like it should be easy. Javassist provides ways to add code at the beginning and end of methods, after all, which is exactly what I did in the source code to add timing information for the method.

There's a hitch, though. When I described how Javassist lets you add code, I mentioned that the added code could not reference local variables defined elsewhere in the method. This limitation blocks me from implementing the timing code in Javassist the same way I did in the source code; in that case, I defined a new local variable in the code added at the start and referenced that variable in the code added at the end.

So what other approach can I use to get the same effect? Well, I could add a new member field to the class and use that instead of a local variable. That's a smelly kind of solution, though, and suffers from some limitations for general use. Consider what would happen with a recursive method, for instance. Each time the method called itself, the saved start time value from the last call would be overwritten and lost.

Fortunately there's a cleaner solution. I can keep the original method code unchanged and just change the method name, then add a new method using the original name. This interceptor method can use the same signature as the original method, including returning the same value. Listing 3 shows what a source code version of this approach would look like:

This approach of using an interceptor method works well with Javassist. Because the entire body of the method is a single block, I can define and use local variables within the body without any problems. Generating the source code for the interception method is also easy -- it only needs a few substitutions to work for any possible method.

Running the interception

Implementing the code to add method timing uses some of the Javassist APIs described in Javassist basics. Listing 4 shows this code, in the form of an application that takes a pair of command-line arguments giving the class name and method name to be timed. The main() method body just finds the class information and then passes it to the addTiming() method to handle the actual modifications. The addTiming() method first renames the existing method by appending "$impl" to the end of the name, then creates a copy of the method using the original name. It then replaces the body of the copied method with timing code wrapping a call to the renamed original method.

The construction of the interceptor method body uses a java.lang.StringBuffer to accumulate the body text (showing the proper way to handle String construction, as opposed to the approach used in StringBuilder). This varies depending on whether the original method returns a value or not. If it does return a value, the constructed code saves that value in a local variable so it can be returned at the end of the interceptor method. If the original method is of type void, there's nothing to be saved and nothing to be returned from the interceptor method.

The actual body text looks just like standard Java code except for the call to the (renamed) original method. This is the body.append(nname + "($$);\n"); line in the code, where nname is the modified name for the original method. The $$ identifier used in the call is the way Javassist represents the list of parameters to the method under construction. By using this identifier in the call to the original method, all the arguments supplied in the call to the interceptor method are passed on to the original method.

Listing 5 shows the results of first running the StringBuilder program in unmodified form, then running the JassistTiming program to add timing information, and finally running the StringBuilder program after it's been modified. You can see how the StringBuilder run after modification reports execution times, and how the times increase much faster than the length of the constructed string because of the inefficient string construction code.

Trust in the source, Luke?

Javassist does a great job of making classworking easy by letting you work with source code rather than actual bytecode instruction lists. But this ease of use comes with some drawbacks. As I mentioned back in The source of all bytecode, the source code used by Javassist is not exactly the Java language. Besides recognizing special identifiers in the code, Javassist implements much looser compile-time checks on the code than required by the Java language specification. Because of this, it will generate bytecode from the source in ways that may have surprising results if you're not careful.

As an example, Listing 6 shows what happens when I change the type of the local variable used for the method start time in the interceptor code from long to int. Javassist accepts the source code and converts it into valid bytecode, but the resulting times are garbage. If you tried compiling this assignment directly in a Java program, you'd get a compile error because it violates one of the rules of the Java language: a narrowing assignment requires a cast.

Depending on what you do in the source code, you can even get Javassist to generate invalid bytecode. Listing 7 shows an example of this, where I've patched the JassistTiming code to always treat the timed method as returning an int value. Javassist again accepts the source code without complaint, but the resulting bytecode fails verification when I try to execute it.

This type of issue isn't a problem as long as you're careful with the source code you supply to Javassist. It's important to realize that Javassist won't necessarily catch any errors in the code, though, and that the results of an error may be difficult to predict.

Looking ahead

There's a lot more to Javassist than what we've covered in this article. Next month, we'll delve a little deeper with a look at some of the special hooks Javassist provides for bulk modification of classes and for on-the-fly modification as classes are loaded at runtime. These are the features that make Javassist a great tool for implementing aspects in your applications, so make sure you catch the follow-up for the full story on this powerful tool.

Downloadable resources

Related topics

Javassist was originated by Shigeru Chiba of the Department of Mathematics and Computing Sciences, Tokyo Institute of Technology. It's recently joined the open source JBoss application server project where it's the basis for the addition of new aspect-oriented programming features. Javassist is distributed under the Mozilla Public License (MPL) and the GNU Lesser General Public License (LGPL) open source licenses.