Profiling Again

Now on with the tuning. Algorithm analysis is no longer useful: we've already fully analyzed the algorithm, and the variations between the tests on different datasets are explained by the different number of Boolean tests executed, which depends on the types of string parameters passed to the method. So we proceed with profiling. Profiling the latest version of the method indicates that the next bottleneck is the Integer creation, specifically in parsing the string to create the numeric value. For our method the parse is actually very simple, and we may be paying the overkill cost of the more generic parsing algorithm that Integer must use. In fact, we are already parsing the string once, to make sure that the characters are all digits. It is quite a small step to extend the looped digit test with the extra code that extracts the integer value:
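The article's actual listing is not reproduced here, but the combined loop might look something like the following sketch. The method name and the use of -1 as a failure value are illustrative assumptions, not the original code.

```java
// Sketch: combine the all-digits test with extraction of the int value in
// a single pass. Method name and the -1 failure convention are assumptions;
// the article's actual checkInteger() code is not reproduced here.
public class DigitParseSketch {

    // Returns the parsed non-negative int, or -1 if s is not all digits.
    static int parseIfAllDigits(String s) {
        if (s.length() == 0) {
            return -1;
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return -1;                   // failed the digit test
            }
            value = value * 10 + (c - '0');  // extract the int as we validate
        }
        return value;
    }
}
```

The point is that validation already visits every character, so accumulating the numeric value in the same loop costs almost nothing extra.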

And we will obviously change the Integer creation method to use the int value rather than the String. The test measurements produce the best results so far, as shown in Table 6.
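In code terms, the change is just which overload creates the Integer. This sketch uses Integer.valueOf() rather than the now-deprecated Integer constructors, and the method names are illustrative:

```java
public class IntegerCreationSketch {
    // Before (sketch): the Integer is created from the String, so the
    // digits are parsed a second time inside the library call.
    static Integer fromString(String s) {
        return Integer.valueOf(s);       // re-parses the digits
    }

    // After (sketch): reuse the int already extracted by the digit-test
    // loop, so no second parse is needed.
    static Integer fromInt(int value) {
        return Integer.valueOf(value);
    }
}
```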

Table 6: Adding number parsing

                                                 Dataset 1   Dataset 2   Dataset 3
Baseline                                           100%        84.1%      540.0%
With looped digit test                             114.8%      66.1%       51.4%
Restructured algorithm                              46.6%      39.9%       29.8%
Eliminating toString() calls                        51.9%      40.7%       34.7%
Restructured algorithm with looped digit test       54.2%      41.6%       33.2%
As last, with number parsing                        36.2%      28.6%       23.0%

Round Three and More

The next profile, using the latest version of the method, shows that Character.isDigit() takes the most significant time, followed by String.equals() and garbage collection. The latest JVM (1.4.0) has possibly the most efficient garbage collection mechanisms of any JVM released so far; earlier JVM versions would probably have flagged garbage collection as a bottleneck much earlier in the tuning. My inclination is to leave Character.isDigit() alone: it should already be a simple test in the Character class, and since it is a static method, the JIT compiler should have no trouble inlining it efficiently.

The String.equals() method is overkill for testing for an empty string: it is quicker to test whether the length of the string is 0. And the garbage objects being reclaimed are all of those Integer objects we generate on each pass through the method. We don't even need the Integer objects now, since we are parsing the string directly and already have the int value. So we can change our method to:
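The final listing is not reproduced here, but a sketch combining the two changes just described, a length() == 0 test instead of String.equals() and no Integer allocation at all, might look like this; the signature and names are assumptions:

```java
public class FinalVersionSketch {
    // Sketch of the two changes described above: test length() == 0 rather
    // than equals(""), and keep the parsed value as a primitive int so no
    // Integer garbage is created. Signature and names are assumptions.
    static boolean isUnsignedInteger(String s) {
        if (s.length() == 0) {               // cheaper than s.equals("")
            return false;
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isDigit(c)) {
                return false;
            }
            value = value * 10 + (c - '0');  // primitive int, no Integer object
        }
        return true;                         // value is available here if needed
    }
}
```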

The measured times for all of the tests are listed in Table 7. Hands up: who predicted these or equivalent optimizations in the challenge at the beginning of the article? Well done if you did; I didn't. The equals() method turning up as a bottleneck came as a complete surprise to me.

When to Stop Tuning

I've labeled Table 7 as "the final version." Why am I stopping now? Why don't I carry on profiling, looking for more speedups? No reason at all. In fact I should carry on, because I have set no target speed to reach. This is crucial to all forms of tuning: with no target, tuning can carry on for much longer than is needed. I could carry on until the profile shows nothing except the checkInteger() method itself. Then I could re-analyze the code, try different datasets, search for literature that may suggest further speedups, analyze the bytecode to see if it is the most efficient, analyze the generated native machine code on several platforms to see if it could be improved, or look at why the method is being called at all, to possibly eliminate the method completely. (Now that would terminate the tuning!) Always set a target before starting tuning, so that you know when to stop.

The Procedure

The procedure I've followed in this article is one that you can copy for any micro-tuning exercise. The difficulties often come more from creating reliable measurements and useful datasets, and from interpreting profile results, than from the step-by-step flow of micro-tuning itself. In macro-tuning, there is an extra step in each round: determining which method or area of the application is the current bottleneck. Macro-tuning can lead you to focus on a different method in each round of tuning, rather than repeatedly attacking one method. To help you with your own tuning, here is a summary of the steps suggested by this article.

1. Identify the bottleneck: Identify a method that needs to be speeded up.

2. Set a performance target: Decide how much performance gain you need the method to achieve. With macro-tuning, it usually makes more sense to decide on an acceptable execution time for the task that uses the method.

3. Use representative data: Identify a dataset that is representative of how the method is used within the application. The dataset should cover all of the data needed by the method.

4. Measure the baseline: Take one or more baseline measurements as a reference. You may need to execute the method repeatedly to obtain acceptable measurements. With macro-tuning, the task that executes the method may need to be started repeatedly.

5. Analyze the method: Use one of the following analysis methodologies to determine potential speedups for the method: profiling, algorithm analysis, or analysis of the causes of any variations in measurements. If in doubt, use profiling.

6. Test the change: Test that the change determined by step 5 provides a speedup. If it doesn't, repeat step 5. Otherwise retain the change.

7. Repeat: Repeat from step 5 until the target speedup is achieved. For macro-tuning, repeat from step 1 until the target performance is achieved.

Jack Shirazi is the author of Java Performance Tuning. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance.