at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at sun.security.util.ManifestDigester.<init>(ManifestDigester.java:117)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:250)
at java.util.jar.JarVerifier.update(JarVerifier.java:188)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:321)
at java.util.jar.JarFile.getInputStream(JarFile.java:386)
at org.jboss.virtual.plugins.context.zip.ZipFileWrapper.openStream(ZipFileWrapper.java:215)
at org.jboss.virtual.plugins.context.zip.ZipEntryContext.openStream(ZipEntryContext.java:1084)
at org.jboss.virtual.plugins.context.zip.ZipEntryHandler.openStream(ZipEntryHandler.java:154)
at org.jboss.virtual.VirtualFile.openStream(VirtualFile.java:241)

Gathering and validation of facts

As usual, a Java EE problem investigation requires gathering both technical and non-technical facts so we can either derive other facts and/or conclude on the root cause. The facts below were verified before applying any corrective measure:

·Recent change to the platform? Yes, a deployment performed a few days before the incident increased the load on the affected JBoss environment by roughly 50%

·A problem with our primary downstream system was confirmed, given the high number of Threads waiting on blocking IO / socket.read() for a very long time

·The maximum JBoss EJB3 pool size of our stateless Web Services was reached, potentially explaining why Java Heap utilization was so high at that time, ultimately failing with an OutOfMemoryError condition

·Conclusion #1: The instability and slowdown of our main downstream system was the trigger of the incident / OutOfMemoryError condition

·Conclusion #2: The incident revealed a possible lack of HTTP timeout between the JBoss application and the affected downstream system (Web Service provider)

The problem replication and further analysis of the Heap dump and Thread dump captured during performance testing revealed:

·Our application's Web Service footprint for a single call is quite high and can take up to 5.5-6 MB of Java Heap

·Lack of Web Service HTTP timeout between our JBoss Web Service and one of our downstream systems

·Lack of JBoss EJB3 pool tuning (the EJB pool was allowed to grow to a size our system could not handle given the application footprint and the current JVM memory settings)

Given the above problems, any slowdown of our primary downstream system (non happy path) leads to a rapid Thread surge (Threads waiting for the downstream system response). The combination of a missing HTTP timeout and an untuned/uncapped EJB pool caused a sharp increase in Java heap usage: since each request can take up to 5 MB+ of Java heap, the heap was rapidly depleted under heavy load.
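
As a hedged illustration of the missing safeguard (not the actual production code), the sketch below shows connect and read timeouts being enforced on a JAX-WS client port. The DownstreamService / DownstreamPort types are hypothetical generated artifacts, and the property keys shown are JAX-WS RI specific; JBossWS and CXF use different keys, so both the keys and the timeout values are assumptions to validate against the actual Web Service stack.

import java.util.Map;
import javax.xml.ws.BindingProvider;

public class DownstreamClientFactory {

    // DownstreamService / DownstreamPort are hypothetical JAX-WS generated artifacts.
    public DownstreamPort createPortWithTimeouts() {
        DownstreamService service = new DownstreamService();
        DownstreamPort port = service.getDownstreamPort();

        Map<String, Object> requestContext = ((BindingProvider) port).getRequestContext();

        // Fail fast instead of letting the worker Thread block forever on socket.read().
        // Property keys are JAX-WS RI specific (assumption); JBossWS / CXF use different keys.
        requestContext.put("com.sun.xml.ws.connect.timeout", 5000);   // 5s to establish the connection
        requestContext.put("com.sun.xml.ws.request.timeout", 30000);  // 30s to receive the full response

        return port;
    }
}

With such timeouts in place, a downstream slowdown causes the call to fail fast instead of parking the worker Thread on socket.read() indefinitely.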

Solution and tuning

3 areas were looked at during the performance testing and tuning exercise:

·Item #1: Application Web Service memory footprint per call (5.5-6 MB of Java Heap)

·Item #2: Implementation of a Web Service HTTP timeout between our JBoss application and the affected downstream system

·Item #3: Tuning and capping of the JBoss EJB3 stateless session bean pool
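
To illustrate why item #3 matters (numbers assumed for illustration only): if an uncapped pool grows to 200 concurrent bean instances and each in-flight request holds roughly 5.5 MB, that alone represents about 1.1 GB of live Java heap. The sketch below caps the stateless session bean pool with the JBoss-proprietary StrictMaxPool; the @Pool annotation, its attributes and the maxSize / timeout values are assumptions for a JBoss AS 5.x-era stack and must be validated against the exact JBoss version and load test results.

import javax.ejb.Stateless;
import javax.jws.WebService;
import org.jboss.ejb3.annotation.Pool;

// Hedged sketch: JBoss-proprietary pool cap on a hypothetical stateless Web Service bean.
// maxSize / timeout values are illustrative assumptions, to be derived from load testing.
@Stateless
@WebService
@Pool(value = "StrictMaxPool", maxSize = 50, timeout = 10000)
public class OrderWebServiceBean {

    public String processOrder(String orderId) {
        // Call the downstream system here, using the timeout-enabled client port.
        return "OK";
    }
}

With a hard cap in place, excess requests queue (or time out) at the pool boundary instead of each allocating 5 MB+ of Java heap, which keeps the heap footprint bounded when the downstream system slows down.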

The performance testing done before and after the tuning of item #2 and item #3 revealed a great improvement of the JBoss environment and an overall reduction of the Java Heap memory footprint during negative testing scenarios.

This tuning has also been deployed to our production environment for a few weeks now and, so far, it is showing a very healthy Java heap footprint with no recurrence of the problem.

The application memory footprint (item #1) will also be revisited in future releases.