
Exception in "CompilerThread0" java.lang.OutOfMemoryError

Ever since we switched our build process from Java 1.4 to Java 5 (1.5.0_09), we have seen FindBugs crash with an OutOfMemoryError when started from ant. The whole thing is running under RedHat Enterprise Linux 4. The output is always the same:

Our first attempts to increase the heap size with the -Xmx VM parameter did not help. We suspected the newer FindBugs release we had installed roughly at the same time, but this turned out to be wrong, because newer and older versions showed the same behaviour.

Armed with fresh knowledge about the SAP Memory Analyzer, which I had just picked up at jax.07, I tried to get a heap dump to find out what was causing the problem. Unfortunately the VM ignored my -XX:+HeapDumpOnOutOfMemoryError option completely (Sun's VM Options page lists it as -XX:-Heap... with a minus; I tried that, too).

Getting suspicious, I realized that Java 5 VMs usually say something like OutOfMemoryError: Java heap space, which was missing in this case.

I had never before seen that strange "CompilerThread0" in an error message, same for the "requested xx bytes for Chunk::new. Out of swap space?" part. Looking around I stumbled across this bug report (and some more, all related to this one - just follow the related bugs list):
Bug #6487381 "Additional path for 5.0 jvm crash on exhaustion of CodeBuffer".

Apparently something goes wrong when the HotSpot JIT tries to compile something to native code for faster execution. One of the other bugs you can find by following the link above mentions a problem in situations where there is not enough room for further compiles in what is called the CodeBuffer: instead of failing gently, and either continuing to run the application without compiling or throwing some older compiled code away, the VM would just crash. This should have been fixed with Java 5.0, but apparently there is another code path in the VM that will cause a hard failure. (I wonder what may be requesting another 128MB(!), but anyways...)

From what the bug says this issue will be fixed with 5.0u12, however 5.0u11 is the most recent release to download, so no help there.

To find out whether a single method might be causing the crash reproducibly, I tried the -XX:+PrintCompilation option. It produced lots and lots of output about compiled methods. A "guide" to its output can be found here: decyphering -XX:+PrintCompilation output.

As one might have expected, this did not reveal anything useful either. So the last resort was to switch to the Client VM, which managed to complete the FindBugs run. In retrospect this explains why we were able to build without any problems on a Windows machine: the Client VM is the default there unless you explicitly request the server VM.
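When it is unclear which VM a build is actually running on, a few lines of Java can report it. This is only a minimal sketch: the class name `VmInfo` is made up for illustration, and it assumes a HotSpot VM, where the `java.vm.name` system property typically contains "Client" or "Server".

```java
// Hypothetical helper, not part of FindBugs or ant.
class VmInfo {
    /** Returns a one-line description of the VM this code is running on. */
    static String describe() {
        // On HotSpot, the VM name usually mentions "Client" or "Server".
        String name = System.getProperty("java.vm.name");
        String version = System.getProperty("java.version");
        // Upper limit of the Java heap (what -Xmx controls), in megabytes.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        return name + " (Java " + version + "), max heap " + maxHeapMb + " MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same `java` executable and options as the build (for example once with `-client` and once with `-server`) shows which compiler the process ends up with.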

For now we will leave it at that and try again as soon as 5.0u12 comes out.

This error occurs because the JIT compiler could not allocate enough C-Heap (the one you get via malloc()), so increasing the Codecache size will not help (on the contrary, since it will decrease the amount of C-Heap available). The problem is that the JIT (the server compiler) has some places where it might use excessive amounts of memory (130MB in your case). Sun has added some heuristics to prevent this, but as you saw they will not always save you.

There are a few possible solutions. First, you could file a bug report with Sun, but you would probably need a smaller test case that shows the error. Second, you could find out which method's compilation causes the error (with the compilation traces) and add it to a .hotspot_compiler file to exclude it from JIT compilation. Or you can try to reduce the Java-Heap and Perm-Gen size, so more C-Heap is available, and you might be lucky that this is enough. Or, as you found out yourself, you can use the client compiler, which doesn't use the sophisticated algorithms that need so much memory.

And sometimes there is really not enough swap space available (as the message suggests), so increasing it would help. But most of the time it is not the available memory but the available address space that is missing on x86.
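To illustrate the .hotspot_compiler suggestion from the comment: as far as I know, each line of that file has the form `exclude <class> <method>`, with the class name written with slashes instead of dots, and the VM picks the file up from its working directory. The helper below is only a sketch that builds such a line; the class `HotspotExclude` and the example class and method names are hypothetical, not taken from FindBugs or from the crash above.

```java
// Hypothetical helper for writing .hotspot_compiler entries.
class HotspotExclude {
    /**
     * Builds one exclude line for the given class and method.
     * Assumes the "exclude <slash/separated/Class> <method>" format.
     */
    static String excludeLine(String className, String methodName) {
        // Convert the usual dotted class name to the slash-separated form.
        return "exclude " + className.replace('.', '/') + " " + methodName;
    }

    public static void main(String[] args) {
        // Made-up names; substitute the method found via -XX:+PrintCompilation.
        System.out.println(excludeLine("com.example.analysis.Detector", "visit"));
    }
}
```

The printed line would then go into a .hotspot_compiler file in the directory the VM is started from, one line per excluded method.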
