The article covers the use of Health Center to understand what a web application that has been deployed on the WebSphere Application Server is doing, and shows a good example of doing this with the Plants by WebSphere sample application.

There are a number of presentations, articles and entries in this (and other) blogs that talk about how to use the IBM Monitoring and Diagnostic Tools for Java™ (Health Center, Memory Analyzer (and IBM Extensions!), GCMV, etc), but there's nothing like learning to use the tools by solving real problems - and preferably in an environment where there's no time pressure to get the problem fixed!

As a result, we've created a hands-on exercise which we termed the "Troubleshooting Masterclass", which was used at IMPACT 2011, the European WebSphere Technical Conference and a number of other places. This exercise is now available online for anyone to run via the "SOA Sandbox".

What is the Troubleshooting Masterclass?
The Troubleshooting Masterclass is an exercise in debugging problems with a faulty application running in WebSphere Application Server. You will be using the tooling included in the IBM Support Assistant -- the Health Center, the Garbage Collection and Memory Visualizer, and the Memory Analyzer -- to overcome common problems caused by a poorly implemented application: memory leaks, unexpected garbage collection cycles triggered by the system, and the performance hit from large application objects and HTTP session sizes. These tools are all part of the IBM Monitoring and Diagnostic Tools for Java™, delivered through the IBM Support Assistant.

What is the SOA Sandbox?
The SOA Sandbox is a cloud offering that provides a dedicated, hosted environment with IBM products already installed just for you. Each SOA Sandbox activity is centered around a simple, realistic customer exercise. You simply install a (provided) Citrix client and request to run an exercise. You then log into the exercise and run it remotely - everything is provided for you!

How long will it take?
The exercise should take between 1 hour and 1 hour 30 minutes.

Where do I find it?
http://www.ibm.com/developerworks/downloads/soasandbox/was4dev.html

In the last entry we discussed how the TPROF utility that is part of the Performance Inspector package of tools can be used to carry out performance profiling where HealthCenter is not available. HealthCenter has a number of other capabilities, including: lock analysis, garbage collection monitoring, class loading and environment information. This leads to the question: "Are there any other tools I can use on earlier releases of Java where HealthCenter is unavailable?"

Lock analysis for Java 1.4.2: JLM
The answer is yes, at least for the case of lock analysis. The Performance Inspector package also provides a utility called Java Lock Monitor (JLM) which provides exactly the same data that you see in HealthCenter, although without the extra level of analysis that highlights potential or actual problems in locking data.
JLM is available for AIX, Linux, Windows and z/OS where the IBM JRE is being used. It provides data on which locks are being used - both Java object locks (synchronized objects and blocks) and system locks (JVM monitors and JNI requested raw monitors) - and how often those locks are contended, i.e. how often they are a bottleneck in the code execution.
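To make the distinction concrete, here is a minimal sketch of the kind of Java object lock that JLM reports on (the class and field names are invented for illustration, not taken from JLM output): two threads repeatedly entering a synchronized block guarding the same monitor, so each entry is a potential point of contention.

```java
// Hypothetical example: a shared monitor that JLM would list under the
// Java object locks if the two threads ever contend for it.
public class ContendedCounter {
    private final Object lock = new Object(); // the Java object monitor
    private int count = 0;

    public void increment() {
        synchronized (lock) { // a contended entry here is counted as a "miss"
            count++;
        }
    }

    public int getCount() {
        synchronized (lock) {
            return count;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ContendedCounter counter = new ContendedCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.getCount()); // 200000
    }
}
```

System locks, by contrast, are taken internally by the JVM itself, so you only ever see them in the report rather than in application source.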

Enabling Java Lock Monitor for a Java application
In order to track the lock usage of the Java VM and the Java application a JProf agent, supplied as part of the package from the Performance Inspector tools site, needs to be loaded into the Java process. This can be done as follows:
For Java 5.0 and earlier:

java -Xrunjprof AppName

For Java 5.0 and later:

java -agentlib:jprof AppName

The JProf agent opens a listener socket in the Java process, which waits to be notified by the supplied rtdriver utility to start and stop data collection, and to write out the gathered data to file.

Starting the Java process with the JProf agent causes confirmation information to be written out to the console.

Collecting lock data from the Java application
Once the Java application has been started with the JProf agent loaded, the rtdriver utility can then be run, connecting to the JProf socket on the local machine. This starts a command prompt process which can be used to control the data gathering in the Java process.

Of the system monitors, the Heap lock is the only one likely to be of interest, as contention on the heap lock means that an extremely high rate of object allocation is occurring and is causing a performance bottleneck.
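As an illustration (the names here are invented for this sketch, not taken from any JLM report), the following allocation-heavy loop run on several threads is the kind of pattern that can drive heap lock contention; reducing the allocation rate, for example by reusing objects or buffers, is the usual remedy.

```java
// Hypothetical sketch: four threads each allocating fresh temporary
// objects on every iteration, i.e. a very high object allocation rate.
public class AllocationStorm {
    public static long churn(int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            // each iteration allocates a new StringBuilder (and its buffer)
            total += new StringBuilder("item-").append(i).length();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable r = () -> churn(1_000_000);
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(r);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }
}
```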

The "Java (Inflated) Monitors" section lists all of the Java synchronised objects that have had some degree of contention between threads. If there has been no contention then the synchronised object will not be listed.

What to look for in the log-jlm report
The most important value to look at is the %MISS, which gives you an idea of how often a thread was blocked trying to get a lock because it was already owned by another thread. The higher this %MISS value, the more contention that has occurred on that lock.
Where there is a high %MISS value showing contention, the next value to look at is the AVER_HTM, which is the average time that the lock has been held by a thread. If this value is low then the contention is occurring because many threads are trying to take the same lock, and reducing the number of threads will likely improve the performance. If the value is high then the lock is being held for a long time because a lot of work is being done under the lock, and moving some of the work to outside of the lock will likely improve the performance.
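The second case, a high AVER_HTM, can often be fixed by shrinking the scope of the synchronized region. A hedged sketch of that refactoring (the class and method names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of narrowing a lock's scope: the expensive work is
// moved outside the synchronized block so the lock is held for less time.
public class LockScope {
    private final List<String> results = new ArrayList<>();

    // Before: the formatting work is done while holding the lock,
    // inflating the average hold time (AVER_HTM).
    public void addHoldingLock(int value) {
        synchronized (results) {
            String formatted = "value=" + Integer.toBinaryString(value);
            results.add(formatted);
        }
    }

    // After: only the brief list update happens under the lock.
    public void addOutsideLock(int value) {
        String formatted = "value=" + Integer.toBinaryString(value);
        synchronized (results) {
            results.add(formatted);
        }
    }

    public int size() {
        synchronized (results) {
            return results.size();
        }
    }
}
```

Both methods produce the same result; the difference only shows up under contention, which is exactly what the JLM report measures.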

Differences between JLM and HealthCenter
There are very few differences between JLM and HealthCenter in terms of the data presented, as they both gather the data in the same way. HealthCenter, however, adds a level of analysis that helps you interpret the data and alerts you to locks which may be causing a performance bottleneck.

Sample based profiling for Java 1.4.2: TPROF
One option for sample based profiling at Java 1.4.2 is Time Profiler (TPROF) which is part of the Performance Inspector package of tools available from sourceforge.net.
TPROF is a system profiling tool available for AIX, Linux, Windows and z/OS that records what process, thread and code is executing on CPU(s) using time based sampling. This provides information on what C language functions are executing and occupying CPU time, and with the addition of an agent into the Java process will provide information on what Java language methods are executing.

Enabling Java method sampling in TPROF
In order to track the Java methods being executed in addition to the C functions a TPROF agent, supplied as part of the package from the Performance Inspector tools site, needs to be loaded into the Java process. This can be done as follows:
For Java 5.0 and earlier:

java -Xrunjprof:tprof,fnm=D:\ibmperf\bin\log,pidx AppName

For Java 5.0 and later:

java -agentlib:jprof=tprof,fnm=D:\ibmperf\bin\log,pidx AppName

If you are running Java on AIX, a version of TPROF is available as part of the operating system install.
Information on setting up and using the AIX TPROF to profile Java applications is available in the Java Troubleshooting Guide.

Running TPROF to sample the Java application
Once the Java application has been started with the TPROF agent loaded, you need to start the TPROF data gathering process, which will begin sampling the data from the system. The simplest mode is to run the supplied run.tprof command with no additional options, which will create a trace file and a tprof.out file, the report of the data gathered by TPROF.

TPROF can also be run in trace-only mode using the "-t" flag, which produces just the trace file; this has to be post-processed later using run.tprof with the "-p" option. The start and stop of the data gathering can also be set automatically: the "-s" flag sets a delay time before profiling starts, and the "-r" option specifies the time to gather data for:

run.tprof -t s10 -r 60

This will run TPROF for 60 seconds, starting roughly 10 seconds after the run.tprof command is entered, and produce a raw trace file called swtrace.nrm2. This can then be post-processed to produce the report file using run.tprof with the "-p" option.

Interpreting the tprof.out report
The TPROF report is broken down into sections. The section that is the closest equivalent to that provided by HealthCenter is the Process Module Symbol section, which shows the highest used Java methods or C functions on a per process basis.
Looking up the Java process, you get output very similar to what we saw previously when running with HealthCenter, with the majority of the time being spent in the compile() method of java.util.regex.Pattern.

Differences between TPROF and HealthCenter
HealthCenter has a number of advantages for Java method profiling over TPROF:

HealthCenter provides a live view, whereas TPROF only provides snapshots from a given sample period

HealthCenter has an Invocation Paths view, which shows the call relationships between methods (which method is calling which method)

TPROF however also has visibility into the C code that is running in the Java process, and as such will show you whether the CPU time is being spent in the Java Virtual Machine (JVM), for example running Garbage Collection or class loading/unloading, or in Java Native Interface (JNI) code.

On 02 July 2009 at 11:00 AM EDT, there will be a WebSphere Support Technical Exchange presentation on "Low overhead performance monitoring for your JVM with IBM Monitoring and Diagnostic Tools for Java - Health Center". The presentation will be delivered by Dave Nice, one of the original Health Center developers in the IBM Java Team.

You will learn more about IBM Monitoring and Diagnostic Tools for Java - Health Center, a new performance tool for the Java 5 and Java 6 SDKs. Health Center provides live monitoring of applications, with very low overhead. It can help you with method profiling, lock profiling, garbage collection, and class loading information.

The presentation is expected to last for 30-45 minutes followed by a question and answer session.

More details on the webcast are available from the WebSphere Technical Exchange Webcast page:

Detecting the (over) use of System.gc() using Health Center

Health Center provides live monitoring and analysis of a number of areas of a running JVM and application, including Garbage collection (GC). Part of the GC analysis determines if calls to System.gc() are being made, and the rate at which those calls are occurring. If the rate is high, Health Center will flag up a warning or an error in the Status panel:
In this case Health Center has highlighted a warning that calls to System.gc() are accounting for 15% of the GC cycles.

If you're worried about any calls to System.gc(), clicking on the "Garbage Collection" link will give you a more detailed view of GC, including a count of the number of calls to System.gc() (forced GCs), the number of allocation failures (non-forced GCs) and the total number of GC cycles.

Determining what is calling System.gc()

If you want to know what code is making the calls to System.gc(), the "Profiling" link contains the information. Clicking on the link opens the following panel:

This gives a breakdown of the various methods that are being called in the application. If you then filter on "System.gc()", and click on the "Invocation paths" tab in the bottom window, you see the call stack for the calls to System.gc():

In this case, there is only one method calling System.gc(), which is StoreData.run(). If there were multiple methods calling System.gc(), there would be an expandable tree under the System.gc() call, and the percentage against each displayed method would show the proportion of the System.gc() calls that method is responsible for. In the case above there is only one caller, so it is responsible for 100% of the calls to System.gc().
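For illustration, here is a hedged reconstruction of the sort of code being flagged (only the StoreData.run() name comes from the profile above; the body is invented): a worker that forces a collection after every batch of work. The fix is simply to remove the System.gc() call and let the JVM schedule collections itself.

```java
// Hypothetical reconstruction: each run() forces a GC cycle, which Health
// Center reports as a "forced GC" in the Garbage Collection view.
public class StoreData implements Runnable {
    static int batchesProcessed = 0;

    @Override
    public void run() {
        byte[] batch = new byte[1024 * 1024]; // simulate a batch of work
        batch = null;                         // the batch becomes garbage
        System.gc();                          // the call Health Center flags
        batchesProcessed++;
    }
}
```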