It's been an interesting month for me. Jack and I have just returned from Hong Kong, where
we gave the Asian premiere of our Java Performance Tuning courses. For both of us,
it was our first trip to Hong Kong, and we both found it to be a lovely place.
And for all of you who have been hearing that companies
are no longer willing to bear the expense of training their employees: although
there is most likely some truth to that statement, what we are finding is that there
are still many progressive companies that see value in providing their employees with
opportunities to enrich their skill sets. Beyond allowing me to assist in that
process, teaching also gives me an opportunity to advance my own knowledge. Yes, that's right,
the trainees are also training the trainer. In every instance, it happens in a slightly different way.

Since some of our exercises do not have a single best answer, we always have a couple
of students who come up with very inventive answers that satisfy the requirements.
Our experience in Hong Kong was no different. We also often get asked questions that
are directly related to the trainees' day-to-day problems. When that happens, it's a
great opportunity for everyone to learn. This too happened in Hong Kong. In this case,
the problem was an application that appeared to be bottlenecking on a search. The search wound
its way through tens of thousands of objects looking for "almost perfect" matches. It
was this inexact-comparison aspect of the search that eliminated many traditional optimizations.
So, Jack and I sat down over dinner and talked about a couple of possible solutions. When we
got back in the next morning and started to discuss the problem again, we asked a few
questions so that the whole class could catch up and then follow along. One of the first questions
we asked was, "did you profile the code?" Not surprisingly, the answer was no. I
say "not surprisingly" because it's very common for people not to profile the code when
they have a bottleneck. They just note that this part of the application takes a long
time to execute, and since this is the complicated bit, it must surely be the bottleneck.
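Before reaching for a full profiler, even a crude timing harness will tell you whether the "complicated bit" really is where the time goes. The sketch below is a minimal illustration of that idea; the loadCandidates and searchCandidates methods are hypothetical stand-ins for the real application code, not anything from the thread.

```java
import java.util.ArrayList;
import java.util.List;

public class TimingProbe {
    // Hypothetical stand-in for loading the objects to be searched.
    static List<String> loadCandidates() {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) data.add("item-" + i);
        return data;
    }

    // Hypothetical stand-in for the suspected "bottleneck" search.
    static int searchCandidates(List<String> data) {
        int hits = 0;
        for (String s : data) if (s.endsWith("7")) hits++;
        return hits;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        List<String> data = loadCandidates();
        long t1 = System.nanoTime();
        int hits = searchCandidates(data);
        long t2 = System.nanoTime();
        // Compare the phases: is the search really the expensive part?
        System.out.printf("load: %d ms, search: %d ms, hits: %d%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, hits);
    }
}
```

Crude, yes, but a few lines like these are often enough to confirm (or, as in this case, refute) a hunch about where the time is being spent.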

Well, Jack, being the geek that he is (and I do mean that with the deepest of respect),
sat down in the airport and wrote a simulation of the searching aspect of the problem.
After a few minutes of pounding away on the keyboard, he popped up from behind the screen
and commented that he didn't see that the timings on the search were all that bad. This
result confirmed what we had both felt: were we trying to solve the right problem? From the
results of the simulation, it would appear that the answer was no, we were not going
after the real bottleneck, which meant we had insufficient information.
Though we did not spend a lot of time on this problem, we still
did spend some time on it. The lesson that we can take away from this experience is that profiling
gives us the metrics we need to ensure that we are tackling the real (and not the
perceived) problem. In other words, profiling helps us to use our time much more effectively.
That said, let's move on to the Java Ranch to see what the current topic of discussion is down at the Saloon.

Top of the list is the question, "how to code review for performance?" When a posting collects
so many responses that are all tightly focused on the original question, you know that
you've run into a topic where people's experiences are universal. The answers to this
question bear that out. Unequivocally, every response gives the advice that, unless
the programmers have failed to follow best practices, the chances of improving performance by
inspecting the code alone are close to zero. Every posting offered the advice that the first
step was to profile the code. One posting even offered the advice that Jack and I have always
given: the first step is to set performance goals. There are many reasons for this but, in
this case, performance tuning the application is an effort that will consume resources. By
setting specific performance targets, you will be able to limit the quantity of resources
consumed by the process. Without performance targets, when will you stop tuning any
particular piece of the application? When will it be fast enough?

A number of the postings went on to back up their advice with some real-life experiences.
And, just to show you how durable and ubiquitous this advice is, one posting offered an
experience acquired in the '70s while working on a mainframe application written in
(you guessed it) COBOL! Tuning is a very dynamic process; code reviews are
very static. Given these two facts, it is not surprising that code reviews are generally
not able to provide many improvements in overall performance.

In yet another posting, the thread focused on efficient caching. The real question
seemed to be: if I have a number of objects in a collection, how can I find them,
given that I may need to search on a range of values? Of course, this is what relational
technology is very good at doing, so it was a foregone conclusion that some of the
postings would suggest that the query be done in a relational database. In many cases
that answer would work, but in this instance, the whole purpose of caching was to avoid
a trip to the database. In fact, the purpose of any caching technique is to wrap a much
slower technology so that we can avoid having to call upon it. Now, one could just use
an in-memory database, but that is not always a good option, and one still has to incur
the cost of inflating objects. So, if moving the data into a relational model is not a
solution, then let's move the relational process into the object model.

In the relational model, adding more indexes to a table enhances searching capabilities.
In the object world, a collection is analogous to a table. So, it seems logical that we
should create a special collection that maintains multiple collections (or indexes) over
the underlying data. In this instance, we are interested in searching on a range, and
an N-M tree is a data structure that easily allows one to conduct that type of
search. So we can add HashMaps where appropriate and also add an N-M tree to cover the
range searching. If this sounds expensive, then yes, you're correct, it does add some
expense. But then again, no multiple-indexing scheme comes for free. And when you consider
the alternatives, the extra expense of a multiple-indexing scheme is well worth it.
After all, if it weren't, it wouldn't be so prevalent in the relational world.
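To make the idea concrete, here is a minimal sketch of such a multiply-indexed cache. A HashMap gives constant-time exact lookup by key, while a TreeMap (a red-black tree, standing in here for the N-M tree) supports the range query. The Product record and its fields are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class IndexedCache {
    // Invented example type; any cached object with a key and a
    // range-searchable attribute would do.
    record Product(int id, String name, double price) {}

    // Index 1: exact lookup by id, O(1) on average.
    private final Map<Integer, Product> byId = new HashMap<>();
    // Index 2: sorted by price, supporting range queries in O(log n).
    private final NavigableMap<Double, List<Product>> byPrice = new TreeMap<>();

    // Every insert pays to maintain both indexes -- the same tax that
    // every extra relational index imposes on writes.
    public void add(Product p) {
        byId.put(p.id(), p);
        byPrice.computeIfAbsent(p.price(), k -> new ArrayList<>()).add(p);
    }

    public Product findById(int id) {
        return byId.get(id);
    }

    // Range search: all products with low <= price <= high.
    public List<Product> findByPriceRange(double low, double high) {
        List<Product> result = new ArrayList<>();
        for (List<Product> bucket : byPrice.subMap(low, true, high, true).values()) {
            result.addAll(bucket);
        }
        return result;
    }
}
```

The write path touches every index on every add, but each query can then pick whichever index fits it best, exactly as a relational optimizer would.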

With that, we move on to the Server Side, where a question is being asked as
to which J2EE application server one should use. For the most part, the
responses run like a popularity contest, but that all ends with an intelligent
post pointing out that the best load-balancing technique is to put a load
balancer in front of the application servers and to ensure that the applications
are stateless and use optimistic transactions. Ensuring that your application
is stateless means that your servers will not have to replicate state. This is
often a very expensive operation that can result in serious performance degradation.
The second point is to use optimistic transactions. As we all know, holding onto
locks for longer than is necessary can also degrade performance. Using both of these
techniques seems like a very sensible choice.
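The essence of the optimistic approach is to hold no lock while the work is done and to check a version number only at commit time. The sketch below illustrates that pattern with an invented in-memory Account-style class; real optimistic transactions would of course be handled by the database or the container, not hand-rolled like this.

```java
import java.util.concurrent.atomic.AtomicLong;

public class OptimisticAccount {
    private volatile long balance;
    private final AtomicLong version = new AtomicLong();

    public OptimisticAccount(long initial) { this.balance = initial; }

    public long balance() { return balance; }

    // Returns true if the update committed; false means a concurrent
    // writer committed first and the caller should re-read and retry.
    public boolean tryDeposit(long amount) {
        long v = version.get();             // snapshot the version, no lock held
        long newBalance = balance + amount; // do the "work" outside any lock
        synchronized (this) {               // brief critical section only at commit
            if (version.get() != v) {
                return false;               // someone else committed in between
            }
            balance = newBalance;
            version.incrementAndGet();
            return true;
        }
    }
}
```

The lock is held only for the version check and the write, not for the duration of the computation, which is precisely why optimistic schemes scale better when conflicts are rare.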

It would seem that the performance implications of AOP (Aspect-Oriented Programming)
are now being considered. I'm not sure if this implies that AOP is mature enough that it
is starting to move into mainstream thinking, or that we have just run into a
very progressive post. In either case, does AOP help performance? Well, I don't see how
AOP can help performance. I do see how it might hurt performance, or even obfuscate the
performance-tuning process. Imagine that you've identified a bottleneck in an injected
piece of code. First, how would you know that it's injected, and secondly, how would you
find the source? My guess is that if you were familiar with AOP, the fact that the
bytecodes may not align with any source would not be a point of confusion. Having said this,
how many of us are already confused when everything is (supposedly) apparent? It would
seem to me that of all the advantages AOP has to offer, improved performance
is not one of them. At best, it should be neutral, but as the technique is still new,
the jury is still out on that one.

Last but not least, we visit my favorite discussion group, the Java Gaming group at
http://community.java.net/games. As I scanned the list of topics, I was stunned to see
"Converting from JDK 1.4 to JDK 1.1" appear in the list. Why on earth would one want
to convert back to JDK 1.1? Surely, this must be a mistake! I drilled down into the
posting and, sure enough, someone wanted to migrate to JDK 1.1, and for a perfectly
valid reason: not many people have taken the time to install the latest JDK. In fact,
I can't imagine any non-geek taking the time to install any JDK other than the one that
came with the OS. And in this case, that VM is Microsoft's. It's a shame to see
progress stifled by politics and questionable business practices but, unfortunately,
this is what has motivated the subject of this thread. It is interesting to see the
thread run through many of the improvements in the Java platform that have been inspired
by the participants of this gaming group.

The last thread that we look at this month is one concerning the JNI. The posting is a
reminder that although we are playing in a sophisticated piece of software that is capable
of managing multiple memory spaces, we are still dealing with a C application and, with it,
all of its strengths and frailties. The question is quite simple: does using the JNI allocate
space on the Java heap or on the OS heap? For clarification, the Java VM is a process like any other.
And although its memory model is dictated by the OS for which it has been compiled,
there are some commonalities, or things that we can count on. One of these commonalities is
that the process will contain heap space. It is out of this heap space that the Java VM
allocates its "memory spaces". In the Sun VM, that includes a new space, an old space, survivor
spaces, and a perm space. All Java objects are created in one of these Java heap spaces.
All other structures are created in the process heap space. This includes all of the structures
needed to support the interactions between the Java and process heap spaces. So, we can see from
this that any space used by the JNI will be allocated in the process heap space. Surprisingly, this
can have an effect on performance, as improper or excessive use of the JNI can bloat the process's
in-memory image size. Yet most Java profiling and monitoring tools only work within the
Java heap space. Is this yet another reason to avoid using the JNI unless it is absolutely necessary?
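You can see the same visibility problem without writing any C at all. A direct ByteBuffer's backing storage is allocated from process memory, much like memory malloc'd in JNI code, so a tool that only inspects the Java heap will not account for it. The sketch below uses nothing beyond the standard java.nio API.

```java
import java.nio.ByteBuffer;

public class OffHeapDemo {
    public static void main(String[] args) {
        // Backing array lives in the Java heap; heap profilers see it.
        ByteBuffer onHeap  = ByteBuffer.allocate(64 * 1024 * 1024);
        // Backing storage lives in process (native) memory, like a
        // malloc in JNI code; a Java-heap histogram will not show it.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        System.out.println("on-heap buffer direct?  " + onHeap.isDirect());
        System.out.println("off-heap buffer direct? " + offHeap.isDirect());
        // The 64 MB behind offHeap shows up in the process's resident
        // size, but not in the Java heap usage reported by the VM.
    }
}
```

Either way, the memory counts against the process's footprint, which is exactly why JNI (or direct-buffer) bloat can go unnoticed until the OS starts to complain.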