In the last entry we discussed how the TPROF utility that is part of the Performance Inspector package of tools can be used to carry out performance profiling where HealthCenter is not available. HealthCenter has a number of other capabilities, including: lock analysis, garbage collection monitoring, class loading and environment information. This leads to the question: "Are there any other tools I can use on earlier releases of Java where HealthCenter is unavailable?"

Lock analysis for Java 1.4.2: JLM
The answer is yes, at least for the case of lock analysis. The Performance Inspector package also provides a utility called Java Lock Monitor (JLM) which provides exactly the same data that you see in HealthCenter, although without the extra level of analysis that highlights potential or actual problems in locking data.
JLM is available for AIX, Linux, Windows and z/OS where the IBM JRE is being used, and provides data on which locks, both Java object locks (synchronized objects and blocks) and system locks (JVM monitors and JNI requested raw monitors), are being used and how often those locks are contended, i.e. whether they are a bottleneck in the code execution.

Enabling Java Lock Monitor for a Java application
In order to track the lock usage of the Java VM and the Java application a JProf agent, supplied as part of the package from the Performance Inspector tools site, needs to be loaded into the Java process. This can be done as follows:
For Java 5.0 and earlier:

java -Xrunjprof AppName

For Java 5.0 and later:

java -agentlib:jprof AppName

The JProf agent opens a listener socket in the Java process, which waits to be notified by the supplied rtdriver utility to start and stop data collection, and to write out the gathered data to file.

Starting the Java process with the JProf agent causes the following information to be written out to the console:

Collecting lock data from the Java application
Once the Java application has been started with the JProf agent loaded, the rtdriver utility can then be run; it connects to the JProf socket on the local machine. This starts a command prompt process which can be used to control the data gathering in the Java process:

Of the system monitors, the Heap lock is the only one likely to be of interest, as contention on the heap lock means that an extremely high rate of object allocation is occurring and this is causing a performance bottleneck.

The "Java (Inflated) Monitors" section lists all of the Java synchronised objects that have had some degree on contention between threads. If these has been no contention then the synchronised object will not be listed:

What to look for in the log-jlm report
The most important value to look at is the %MISS, which gives you an idea of how often a thread was blocked trying to get a lock because it was already owned by another thread. The higher this %MISS value, the more contention that has occurred on that lock.
Where there is a high %MISS value showing contention, the next value to look at is the AVER_HTM, which is the average time that the lock has been held by a thread. If this value is low then the contention is occurring because many threads are trying to take the same lock, and reducing the number of threads will likely improve the performance. If the value is high then the lock is being held for a long time because a lot of work is being done under the lock, and moving some of the work outside the lock will likely improve the performance.
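As an illustration of the second case, here is a minimal sketch (not from the article; the class and method names are invented for illustration) of expensive work being moved from inside a synchronized block to outside it, which reduces the average hold time (AVER_HTM) of the lock:

```java
import java.util.ArrayList;
import java.util.List;

public class LockHoldTime {
    private final List<String> shared = new ArrayList<>();

    // Expensive work done while holding the lock: every other thread
    // queues behind it, so AVER_HTM (and %MISS) go up.
    public synchronized void addSlow(int value) {
        String formatted = String.format("value=%08d", value);
        shared.add(formatted);
    }

    // The same expensive work moved outside the lock: the lock is now
    // held only for the cheap list update, so AVER_HTM drops.
    public void addFast(int value) {
        String formatted = String.format("value=%08d", value);
        synchronized (this) {
            shared.add(formatted);
        }
    }

    public synchronized int size() {
        return shared.size();
    }
}
```

In a JLM report, the first version would show the monitor for this object with a high AVER_HTM; the second holds the lock only for the brief list update.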

Differences between JLM and HealthCenter
There are very few differences between JLM and HealthCenter in terms of the data presented, as they both gather the data in the same way. HealthCenter however has the added level of helping you to interpret the data, and to alert you to locks which may be causing a performance bottleneck.

Sample based profiling for Java 1.4.2: TPROF
One option for sample based profiling at Java 1.4.2 is Time Profiler (TPROF) which is part of the Performance Inspector package of tools available from sourceforge.net.
TPROF is a system profiling tool available for AIX, Linux, Windows and z/OS that records what process, thread and code is executing on CPU(s) using time based sampling. This provides information on what C language functions are executing and occupying CPU time, and with the addition of an agent into the Java process will provide information on what Java language methods are executing.

Enabling Java method sampling in TPROF
In order to track the Java methods being executed in addition to the C functions a TPROF agent, supplied as part of the package from the Performance Inspector tools site, needs to be loaded into the Java process. This can be done as follows:
For Java 5.0 and earlier:

java -Xrunjprof:tprof,fnm=D:\ibmperf\bin\log,pidx AppName

For Java 5.0 and later:

java -agentlib:jprof=tprof,fnm=D:\ibmperf\bin\log,pidx AppName

If you are running Java on AIX, a version of TPROF is available as part of the operating system install.
Information on setting up and using the AIX TPROF to profile Java applications is available in the Java Troubleshooting Guide.

Running TPROF to sample the Java application
Once the Java application has been started with the TPROF agent loaded, you need to start the TPROF data gathering process, which will begin sampling data from the system. The simplest way to run TPROF is to invoke run.tprof with no options, which will create a trace file and a tprof.out file, the report of the data gathered by TPROF.

TPROF can also be run in trace-only mode using the "-t" flag, which produces just the trace file; this has to be post-processed later using run.tprof with the "-p" option. The start and stop of the data gathering can also be set automatically, using the "-s" flag to set a delay time before profiling starts and the "-r" option to specify the time to gather data for:

run.tprof -t -s 10 -r 60

This will run TPROF for 60 seconds, starting roughly 10 seconds after the run.tprof command is entered, and produce a raw trace file called swtrace.nrm2 which can then be post-processed to produce the report file using run.tprof with the "-p" option.

Interpreting the tprof.out report
The TPROF report is broken down into sections. The section that is the closest equivalent to that provided by HealthCenter is the Process Module Symbol section, which shows the highest used Java methods or C functions on a per process basis.
Looking up the Java process, you get the following kind of output:

This gives a very similar output to what we saw previously when running with HealthCenter, with the majority of the time being spent in the compile() method of java.util.regex.Pattern.

Differences between TPROF and HealthCenter
HealthCenter has a number of advantages for Java method profiling over TPROF:

HealthCenter provides a live view, whereas TPROF only provides snapshots from a given sample period

HealthCenter has an Invocation Paths view, which shows the call relationships between methods (which method is calling which method)

TPROF, however, also has visibility into the C code that is running in the Java process, and as such will show you whether the CPU time is being spent in the Java Virtual Machine (JVM) itself, for example running Garbage Collection or class loading/unloading, or in Java Native Interface (JNI) code.

Java applications run in a virtualized Java Runtime Environment (JRE), which provides the Java heap that is commonly associated with Java memory usage as it is where Java objects are created and managed for the application.
The JRE process itself is a native process running on the Operating System (OS), in the same way as any other native process would. As such it consumes native resources, including memory. This "native" memory consumed by the JRE process includes all of its allocations, for example: the Java Virtual Machine (JVM), Just In Time (JIT) compiled code, Java Native Interface (JNI) allocations used to back Java objects, as well as the Java heap itself.

The main reason for monitoring "native" memory usage is the same as for Java heap memory: should the available memory become exhausted, an OutOfMemoryError will be generated. In addition, it's useful for understanding whether or not it's possible to increase the Java heap size should you want to. As the Java heap is one of the constituent parts of the native memory usage, increasing the Java heap size will increase the native memory usage, and will lead to an OutOfMemoryError due to native memory exhaustion if it is increased to the point that the native memory usage exceeds the limit of what is available.

An excellent discussion on this topic, which includes what the OS memory limits are, is available from Andrew Hall's article "Thanks for the memory" on developerWorks.

Monitoring native memory usage in GCMV

GCMV can load and parse native memory usage data generated using svmon on AIX, ps on Linux and perfmon on Windows. Full instructions on how to generate the data are provided in the "Help" information provided by GCMV in ISA, in the following section: Tool: IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer > Using Garbage Collection and Memory Visualizer > Supported data types > Native memory

Once loaded into GCMV, the memory usage data is available as raw, structured and tabbed data, is graphed to produce a line plot, and a report is provided that performs statistical analysis and looks for potential memory leaks.

You may have realised that, as GCMV is graphing native memory, it's not specific to Java applications so can be used for any process. In fact, the output above is from the web browser running on my desktop.

Have you ever received a "JVM Terminated. Exit code=-1" message when starting an Eclipse application?
Unfortunately this message doesn't tell you much about what caused the Java process to exit. Hopefully some of the information below will help you to troubleshoot and solve this problem.

Solving the Problem?

There are a few methods to get more information from the failure and troubleshoot the cause of the problem, but it's actually worth trying a common solution to this problem first: reducing the Java heap size.
In order to reduce the Java heap size, you need to add the "-vmargs -Xmx512M" options to Eclipse, which can either be done on the command line or by modifying the .ini file in the same directory as the Eclipse executable. Once this is done, Eclipse should hopefully start successfully and the problem is solved.
If you'd like more information on the maximum Java heap size that will run on Windows, John Pape has an excellent blog post here.
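For example, assuming the default Eclipse layout, the end of the eclipse.ini file (next to the Eclipse executable) might look like the following; everything after -vmargs is passed straight to the JVM, one argument per line:

```
-vmargs
-Xmx512M
```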

If this doesn't solve the problem, then we have some troubleshooting to do.

Troubleshooting the Problem

The reason for trying the solution of reducing the Java heap size first is that it's a problem that is very hard to debug under Eclipse. The IBM JVM does produce some useful messages if the heap size is too large:

However, as Eclipse is still starting, the JVM's error message can't be logged to the Eclipse error log in workspace/.metadata/.log, and if Eclipse isn't being run from the MS-DOS prompt or console, the message is lost.
Also, because allocating the Java heap is one of the first actions the JVM takes during startup, a number of the JVM's troubleshooting capabilities are not yet enabled either. So if reducing the Java heap size has not solved the problem, then the JVM's troubleshooting capabilities should be available to us to do further debugging.

Obtaining more information on the "JVM Terminated. Exit code=-1" Event

The IBM JVMs offer several mechanisms for getting more data from this event. In this case the best option would be to request the generation of a Javadump file (javacore.txt) when the Java process exits. This can be done by adding the following command line argument to the Java invocation, again using the -vmargs option to Eclipse or the .ini file:

-Xdump:java:events=vmstop

This requests a dump of type "java" (a javacore.txt file) on an event of "vmstop" (JVM shutdown). It's also possible to add options specifying which exit code to produce dumps on, and where to write the javacore.txt file:

-Xdump:java:events=vmstop,filter=#-1
only produce a javacore.txt file if the exit code is -1

-Xdump:java:events=vmstop,filter=#-1,file=c:\javacore.txt
write the file as c:\javacore.txt

When specifying the file option, there are a number of substitutions that you can add into the file name. By default the file name is:

javacore.%Y%m%d.%H%M%S.%pid.%seq.txt

where the substitutions are for year, month, day, hours, minutes, seconds, process id, and sequence.

This should cause a javacore.txt file to be generated when the exit code -1 is received. The javacore.txt gives you a whole host of information about the state of the Java process, including the command line arguments and the stack traces of all the running threads.
Looking for the "Current Thread Details" section will give you the stack trace for the thread that exited, and this should give you an idea of why the JVM shut down.

In addition, here are the questions and answers from the end of the session:

Q) Is there any way I can configure which port HealthCenter uses?
A) Yes. You can configure the port used by HealthCenter using the "port" option to the HealthCenter agent when you start the monitored application:
e.g. java -Xhealthcenter:port=<port_number> HelloWorld
When you start the HealthCenter client you then specify which port you are using in the connection wizard.

Q) Which tools are ideal for copying logs, dumps, etc. from a UNIX production system to a laptop for analysis?
A) Diagnostic Collector will collect and package the data for you on the production system so that you only have a single file to transfer. You can then use the collector capability under "Collect and Send data" in the IBM Support Assistant (ISA) to bring the data from the production machine on to the laptop running ISA.

Quite often when I write a Java application I'll get a feeling that it's slower than I expected at completing a particular operation. In the past I've usually resorted to code inspection and printlns to time the execution of sections, using trial and error!

In this post, I'll explain to you how I used the Health Center tool to investigate a badly-performing Java app. The problem turned out to be a simple lack of foresight when I wrote one of the methods for the app.

Pre-Reqs

I've installed the Health Center agent into the JVM running my app, and I've got the Health Center client installed within the IBM Support Assistant. Installation instructions are available here, and a YouTube™ video can be found here.

Whenever I start the JVM running my app, I'm using the JVM options suggested in the help documentation for Health Center to enable the agent.

Finding out what's wrong

I want to improve a simple Java app which extracts some statistics from my Twitter account. The particular feature picked out here is to track the number of followers I have, over time. I've been downloading my followers list in xml format a few times a day for quite a while and have ended up with a couple of thousand of these xml files. There are more efficient ways to count followers, but I want to do more processing on the file at the same time so this approach fits for me.

Running the application, which opens each file and scans through each line for the <screen_name> tag to find follower names, I noticed that it was very slow: it took about 25 seconds to go through the couple of thousand files. Running the code and looking at the task manager, I could see that the app was using a lot of processor time, which suggested to me that the bottleneck was in processing rather than file loading. I dug out the Health Center to get an idea of what was happening.

Connecting the Health Center client to my application and looking at the Profiling view, I can see the view of hot methods:

From this I can see that java.util.regex.Pattern.compile() is the hottest method - over 20% of the samples had compile() as the currently running method. Given what my application is doing, there should be some regex, but I wasn't expecting it to be far and away the most expensive task. Knowing that this is at the top of my list, and I hadn't expected it to be, there are two ways I can reduce its cost:

Reduce the amount of processing done within the compile() method

Reduce the number of times compile() is called

The compile() method is part of java.util.regex.Pattern and I can't change its source, so I'll have to look at reducing how often it's called.

Looking at the Invocation Paths view to see which methods call Pattern.compile():

I can see that my FollowerCount.findScreenName() seems to be the culprit for most of the calls. Here's the source for that method:
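The original listing was a screenshot; here is a sketch of what the uncached version might have looked like (the exact tag name, regex and return convention are assumptions), with Pattern.compile() invoked on every call:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FollowerCountSlow {
    // Pattern.compile() runs on every call; with one call per line of
    // every file, regex compilation dominates the profile.
    public static String findScreenName(String line) {
        Pattern p = Pattern.compile("<screen_name>(.*?)</screen_name>");
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }
}
```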

This method is being called for every line of every file we try to read. The pattern returned from the Pattern.compile() call will be the same for every invocation, so we can change the code to cache the pattern after the first time:
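A sketch of the cached version (again illustrative rather than the original source; the tag name and regex are assumptions): the Pattern is compiled once into a static field and reused, so Pattern.compile() runs once instead of once per line:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FollowerCount {
    // Compiled once and reused; Pattern instances are immutable and
    // safe to share across threads.
    private static final Pattern SCREEN_NAME =
            Pattern.compile("<screen_name>(.*?)</screen_name>");

    public static String findScreenName(String line) {
        Matcher m = SCREEN_NAME.matcher(line);
        return m.find() ? m.group(1) : null;
    }
}
```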

As a member of the team working on the Java implementation here at
IBM, I'm lucky enough to have access to some extra resources that are
very useful. For example, if a Java application I wrote isn't behaving
as I expected, I can have a look inside some parts of the Java
implementation code, and try and work out what's going on as my
application runs. If that doesn't help, I can go and sit at the desk of
someone who really knows the Java implementation code, because they
wrote it!

Unfortunately, most of our customers aren't so fortunate. In IBM,
we've been thinking a lot about how we can help, by providing more
information about the implementation 'black box' to the outside world.

The team I'm in at the moment have been developing a tool - IBM
Monitoring and Diagnostic Tools for Java - Health Center. The Health
Center tool provides live diagnostic information about the VM, and
helps you identify some bottlenecks that can occur in your application.
This information is gathered with a low overhead, so there is almost no
impact on your application.

The kind of information you can get from Health Center includes
method profiling, GC details, locking information, class loading, and
details about the JVM environment. Here's a screenshot of the profiling
information to whet your appetite:

In the next blog post I'll explain a bit more about the profiling
view, and how you can use it to track down inefficiencies in your Java
applications.

On 02 July 2009 at 11:00 AM EDT, there will be a WebSphere Support Technical Exchange presentation on "Low overhead performance monitoring for your JVM with IBM Monitoring and Diagnostic Tools for Java - Health Center". The presentation will be delivered by Dave Nice, one of the original Health Center developers in the IBM Java Team.

You will learn more about IBM Monitoring and Diagnostic Tools for Java - Health Center, a new performance tool for the Java 5 and Java 6 SDKs. Health Center provides live monitoring of applications, with very low overhead. It can help you with method profiling, lock profiling, garbage collection, and class loading information.

The presentation is expected to last for 30-45 minutes, followed by a question and answer session.

More details on the webcast are available from the WebSphere Technical Exchange Webcast page:

You might have noticed that the Java Team at IBM has been running a forum for a while. It has given us a better understanding of the sort of
problems you encounter when working with IBM Java runtimes. Hopefully,
you find the forum interesting, and it has helped solve your issues.

Generally, forums are reactive; the conversation follows a
particular problem or question. The responses are focussed on helping
with that particular issue. We want to be more proactive, so we thought
we'd introduce this blog as a way of discussing tools, capabilities and
methods for solving problems in a more general way. The blog will help
provide some important feedback on how we might make it easier to troubleshoot IBM Java problems.

The idea is that different members of the IBM team will blog on a
regular basis. We'll talk about problems we've seen, and how we applied
troubleshooting tools to solve them. We'll look at tried and tested
ways of solving common problems, and new features and capabilities.

If you have any ideas for subjects you'd like us to blog about, go ahead and add a comment - we'll see what we can do!