Thursday, August 28, 2014

When the following warning is seen:

VM warning: CodeCache is full. Compiler has been disabled.

you may wonder whether any extra JVM parameters were specified to reach the state of a full CodeCache. Some comments on the internet indicate this happens if you specify too low a "-XX:CompileThreshold", so that too much bytecode gets compiled by HotSpot very early.

In this article, we will look at tuning CompileThreshold and ReservedCodeCacheSize in JDK 8.

CompileThreshold

By default, CompileThreshold is set to be 10,000:

intx CompileThreshold = 10000 {pd product}

As described in [2], we know {pd product} means "platform-dependent product option". Our platform is linux-x64 and that is what will be used for this discussion.

Very often, you see people setting the threshold lower. For example

-XX:CompileThreshold=8000

Why? Since the JIT compiler does not have time to compile every single method in an application, all code initially starts out running in the interpreter; once a method becomes hot enough, it gets scheduled for compilation. To help determine when to convert bytecodes to compiled code, every method has two counters:

Invocation counter

Which is incremented every time a method is entered

Backedge counter

Which is incremented every time control flow moves from a higher bytecode index to a lower one

Whenever either counter is incremented by the interpreter it checks them against a threshold, and if they cross this threshold, the interpreter requests a compile of that method.

The threshold used for the invocation counter is called CompileThreshold; the backedge counter uses a more complex formula derived from CompileThreshold and OnStackReplacePercentage. So, if you set the threshold lower, HotSpot compiles methods earlier. And, in some cases, that can help the performance of server code.
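The two counters and their thresholds can be sketched as a toy model. Everything below is illustrative: the class and field names are invented, the OSR formula is the simplified non-tiered one, and real HotSpot (especially with tiered compilation) uses considerably more elaborate policies.

```java
// Toy model of the interpreter's two per-method counters; illustrative only.
public class CompileTriggerModel {
    static final int COMPILE_THRESHOLD = 10_000;        // -XX:CompileThreshold default
    static final int ON_STACK_REPLACE_PERCENTAGE = 140; // classic server-VM default

    int invocationCounter;  // bumped on every method entry
    int backedgeCounter;    // bumped on every backward branch

    // Returns true when a (normal) compile of the method would be requested.
    boolean onMethodEntry() {
        invocationCounter++;
        return invocationCounter >= COMPILE_THRESHOLD;
    }

    // Returns true when an on-stack-replacement (OSR) compile would be
    // requested; the threshold is derived from CompileThreshold.
    boolean onBackedge() {
        backedgeCounter++;
        int osrThreshold = COMPILE_THRESHOLD * ON_STACK_REPLACE_PERCENTAGE / 100;
        return backedgeCounter >= osrThreshold;
    }
}
```

Lowering COMPILE_THRESHOLD in this model simply makes onMethodEntry() return true earlier, which is exactly the effect of -XX:CompileThreshold=8000.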

ReservedCodeCacheSize

The code cache is the area the JVM uses to store the native code generated for compiled methods. As described in [3], to improve an application's performance, you can set the "reserved" code cache size:

-XX:ReservedCodeCacheSize=256m

when tiered compilation is enabled for HotSpot. Basically, it sets the maximum size of the compiler's code cache. In [4], we have shown that an application can run faster if tiered compilation is enabled in a server environment. However, the code cache size also needs to be set larger.

What's New in JDK 8?

We have seen people setting the following JVM options:

-XX:ReservedCodeCacheSize=256m -XX:+TieredCompilation

or

-XX:CompileThreshold=8000

in JDK 7. In JDK 8, do we still need to set them? The answer is that it depends on the platform. On linux-x64 platforms, those settings are no longer necessary. Here we will describe why.

In JDK 8, HotSpot chooses the following default values for linux-x64 platforms:

TieredCompilation is enabled by default.

A bigger code cache is reserved. Internally, HotSpot sets it to 240 MB (i.e., 48 MB * 5) when tiered compilation is enabled.

That's why we say that people don't need to set the following options anymore in JDK8:

-XX:ReservedCodeCacheSize=256m -XX:+TieredCompilation

or

-XX:CompileThreshold=8000

Note that the “reserved” code cache is just an address-space reservation; it does not really consume any additional physical memory unless it is used. On 64-bit platforms, it does not hurt at all to set a higher value. However, if you set the cache size too small, you will definitely see a negative impact on your application's performance.
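One way to watch the committed (as opposed to merely reserved) code cache grow at runtime is through the standard management API. This is a sketch using only java.lang.management; note that JDK 7/8 expose a single "Code Cache" pool, while JDK 9+ splits it into several "CodeHeap ..." pools, so we match both.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Print used/committed/max sizes of the code cache memory pool(s).
public class CodeCacheProbe {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Code Cache") || name.startsWith("CodeHeap")) {
                MemoryUsage u = pool.getUsage();
                System.out.printf("%s: used=%dK, committed=%dK, max=%dK%n",
                        name, u.getUsed() / 1024, u.getCommitted() / 1024,
                        u.getMax() / 1024);
            }
        }
    }
}
```

Running this in a long-lived server process shows "committed" staying well below "max" unless a lot of code actually gets compiled.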

Acknowledgement

Some writings here are based on the feedback from Igor Veresov and Vladimir Kozlov. However, the author would assume the full responsibility for the content himself.


Our platform is linux-x64, and these two options (UseCompressedOops and UseCompressedClassPointers) are both set to true based on ergonomics.

UseCompressedOops vs. UseCompressedClassPointers

CompressedOops are for the compression of pointers to objects in the Java heap. Class data is no longer in the Java heap, and the compression of pointers to class data is done under the flag UseCompressedClassPointers. In the next sections, we will discuss them in more detail.

Oops and Compressed Oops

Oops are "ordinary" object pointers. Specifically, an oop is a pointer into the GC-managed heap, implemented as a native machine address, not a handle. Oops may be directly manipulated by compiled or interpreted Java code, because the GC knows about the liveness and location of oops within such code. Oops can also be directly manipulated by short spans of C/C++ code, but must be kept by such code within handles across every safepoint.

Compressed Oops represent managed pointers (in many but not all places
in the JVM) as 32-bit values which must be scaled by a factor of 8 and
added to a 64-bit base address to find the object they refer to in Java Heap.
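The decoding just described can be written out as plain arithmetic. This is an illustrative sketch of the scale-by-8-and-add-base step, not HotSpot's actual code; the class and method names are invented.

```java
// Decode a 32-bit compressed oop into a 64-bit address:
// address = heapBase + (narrowOop << 3), per the description above.
public class CompressedOopDecode {
    static final int OOP_SHIFT = 3; // scale factor 8 = 1 << 3

    static long decode(long heapBase, int narrowOop) {
        // Widen without sign extension, scale by 8, add the heap base.
        return heapBase + ((narrowOop & 0xFFFFFFFFL) << OOP_SHIFT);
    }
}
```

With the 3-bit shift, a 32-bit compressed oop can address up to 32 GB of heap above the base.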

Compressed Class Pointers

Each object has (in its second word) a pointer to its VM metadata class, which can be compressed. If compressed, it is decoded using a base that points to the Compressed Class Pointer Space.

Before we continue, you need to know what Metaspace and the Compressed Class Pointer Space are. A Compressed Class Pointer Space (which is logically part of Metaspace) is introduced for 64-bit platforms. Whereas the Compressed Class Pointer Space contains only class metadata, the Metaspace contains all the other large class metadata, including methods, bytecode, etc.

For 64-bit platforms, the default behavior is to use compressed (32-bit) object pointers (-XX:+UseCompressedOops) and compressed (32-bit) class pointers (-XX:+UseCompressedClassPointers). However, you can modify the default settings if you like. When you do modify them, be warned that there is a dependency between the two options—i.e., UseCompressedOops must be on for UseCompressedClassPointers to be on.

To summarize, the differences between Metaspace and the Compressed Class Pointer Space are described in [3].

Friday, August 22, 2014

The system dictionary is an internal JVM data structure which holds all the classes loaded by the system. As described in JDK-7114376,[1]

The System Dictionary hashtable bucket array size is fixed at 1009. This value is too small for large programs with many names, too large for small programs and just about right for medium sized ones. The default should remain at 1009, but it should be possible to override it on the command line.

Finally, this has happened in JDK 8 and you can set hashtable bucket array size to be larger if you have more classes loaded in your applications by using:[2]

-XX:+UnlockExperimentalVMOptions -XX:PredictedLoadedClassCount=<#>

This can help calls like Class.forName(), which do lookups into this data structure.

In this article, we will look into sizing the system dictionary in more detail.

Performance of System Dictionary

In JDK 1.4.2, there were some performance enhancements. One of them made system dictionary reads lock-free.[3] From this enhancement, you can probably guess that tuning the system dictionary could be important to the JVM's performance.

The current number of buckets in hash table for system dictionary is set to be 1009 by default, which is relatively small for an application which has 49987 classes loaded as shown below:

Loaded    Bytes       Unloaded    Bytes     Time
49987     110488.8    2453        7743.8    105.24

From experience, we know:[5]

In a good hash table, each bucket has zero or one entries, and sometimes two or three, but rarely more than that.

Assuming the average ideal length of buckets is three, the hashtable bucket array size then should be:

Ideal hashtable bucket array size = 49987 / 3 = 16662

System Dictionary Tuning

Seeing 49987 classes loaded in our application at the end of its run, we decided to set:

-XX:PredictedLoadedClassCount=16661 (a prime)

Note that PredictedLoadedClassCount is an experimental flag. So, you also need to set:

-XX:+UnlockExperimentalVMOptions

before it in the command line.

When the JVM allocates memory, the largest chunk is the Java heap. The system dictionary is in the C heap and it is the loaded-class cache. The goal of setting the PredictedLoadedClassCount flag is to increase the size of the system dictionary in order to make lookups of loaded classes faster. Before any class is loaded, a check is required to see whether the class is already loaded. A larger system dictionary will improve JVM performance during this class loading and resolution phase.

Note that the total number of entries in the hashtable does not change—that is based on the number of loaded classes. What sizing the system dictionary does is increase the spread of the entries, so that the average length of the buckets under a single hash result is reduced. This reduces the time it takes to find a given entry.
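The sizing rule above can be sketched as a small helper: divide the number of loaded classes by a target average bucket (chain) length, then step down to a prime, mirroring the article's choice of 16661. The helper names are invented.

```java
// Suggest a hashtable bucket count for the system dictionary.
public class SystemDictSizing {
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    static int suggestBucketCount(int loadedClasses, int targetChainLength) {
        int n = Math.max(2, loadedClasses / targetChainLength);
        while (!isPrime(n)) n--;  // prime sizes spread hash values more evenly
        return n;
    }
}
```

suggestBucketCount(49987, 3) yields 16661, the value passed to -XX:PredictedLoadedClassCount above.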

How to Find Number of Loaded Classes?

There are multiple ways to find out the number of loaded classes in an application. For example, you can use -verbose:class to determine which classes are loaded. Another way, which we have used in this article, is to set -XX:+PerfDataSaveToFile and then use the "jstat -class" command to decipher the class statistics offline.
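The workflow just described can be sketched as follows. The process id (27429) and file path are hypothetical, and the exact column set jstat prints depends on your JDK version.

```shell
# List instrumented JVMs to find the target process id.
jps -l

# Live class-loader statistics for a hypothetical pid 27429.
jstat -class 27429

# Offline: run the application with -XX:+PerfDataSaveToFile, which saves
# the instrumentation buffer to an hsperfdata_<pid> file at JVM exit;
# jstat can then read it via the file: vmid protocol (path hypothetical).
jstat -class file:/tmp/hsperfdata_27429
```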

Wednesday, August 20, 2014

This article is part of the Metaspace series for JDK 8 on Xml and More. Please read the previous articles (i.e., [1, 2, 3]) for background information.

Here we use a case study to show you what we mean by:

Proper monitoring and tuning of the Metaspace is still required in order
to limit the frequency or delay the garbage collections of metaspace.

JVM Setup

We have used the following print options:

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps

to generate GC log files. Besides that, we have set:

-XX:MetaspaceSize=128m

to specify the initial high water mark to trigger metaspace cleanup by inducing a GC event.

In the GC log file, you can find the following entries:

[Metaspace: 21013K->21013K(1069056K)]

The first value shows the used size before the GC, the second shows the used size after the GC, and the value in parentheses shows the current reserved size.
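Such entries can also be extracted programmatically, for example when post-processing GC logs. This is a hypothetical helper, not part of any JDK tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Parse the "[Metaspace: used->used(reserved)]" fragment of a GC log line.
public class MetaspaceLogParser {
    // e.g. "[Metaspace: 21013K->21013K(1069056K)]"
    static final Pattern P =
        Pattern.compile("\\[Metaspace: (\\d+)K->(\\d+)K\\((\\d+)K\\)\\]");

    // Returns {usedBeforeK, usedAfterK, reservedK}.
    static long[] parse(String line) {
        Matcher m = P.matcher(line);
        if (!m.find())
            throw new IllegalArgumentException("no Metaspace entry: " + line);
        return new long[] {
            Long.parseLong(m.group(1)),  // used before GC (KB)
            Long.parseLong(m.group(2)),  // used after GC (KB)
            Long.parseLong(m.group(3))   // reserved (KB)
        };
    }
}
```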

Before Tuning

Below is the experiment running with all default values. As you can see, there are six Full GC events triggered by the "Metadata GC Threshold" being reached. The initial high water mark was set by the default MetaspaceSize (i.e., 21807104 bytes). So, the first Full GC event was induced when the committed memory of all metaspaces reached the initial high water mark. To find HotSpot's default option values, read [4].

Why Tune?

So, you may ask what the deal is with tuning MetaspaceSize. Yes, it just sets the initial high water mark; later, HotSpot will adjust the high water mark based on ergonomics. Hopefully, when HotSpot's ergonomics becomes more mature, no tuning will be necessary at all. But, even then, you may still want to set MetaspaceSize on some occasions. One of them is when you run a benchmark over a short period and your focus is on GC activities other than metadata being loaded or unloaded.

At the first try, the headers printed were "PC" and "PU", because the default jstat was run, which belongs to JDK 7. You could also see some gibberish text because of "unresolved symbols." To run jstat, you need to use the correct version (i.e., the one from your JDK 8 installation). After correcting this error, we have seen the proper output.

Note that we now see "MC", "MU", "CCSC", and "CCSU" instead of "PC" and "PU".

Meaning of "MC", "MU", "CCSC", and "CCSU"

There is no good documentation on what they mean yet. But, a good guess would be:

MC (Metaspace Committed)

MU (Metaspace Used)

CCSC (Compressed Class Space Committed)

CCSU (Compressed Class Space Used)

If you compare the values printed by "jstat -gc" with the values printed at the JVM's exit time, you will find they are similar. Note that the "jstat -gc" command was issued close to the end of our JVM run.

To learn more about these new headers, read [2].

Acknowledgement

Some writings here are based on the feedback from Jon Masamitsu. However, the author would assume the full responsibility for the content himself.

Metaspace

Internal JVM memory management is, to a large extent, kept off the Java
heap and allocated natively in the operating system, through system
calls like malloc or mmap. This non-heap system memory allocated by the JVM is referred to as native memory. Similar to the Oracle JRockit and IBM JVMs, the JDK 8 HotSpot JVM now uses native memory for the representation of class metadata, which is called Metaspace.[2,4]

In JDK 8, HotSpot explicitly manages the space used for metadata. Space is requested from the OS and then divided into chunks. A class loader allocates space for metadata from its chunks (a chunk is bound to a specific class loader). When classes are unloaded for a class loader, its chunks are recycled for reuse or returned to the OS. Note that metadata uses mmap'ed space, not malloc'ed space.

Used/Capacity/Committed/Reserved

When the JVM exits, it prints a "Metaspace" line and a "class space" line if you have enabled GC printing.

In the “Metaspace” line, the “used” is the amount of space used for loaded classes and other metadata. The “capacity” is the space available for metadata in currently allocated chunks. The “committed” value is the amount of space available for chunks. The “reserved” is the amount of space reserved (but not necessarily committed) for metadata. On the “class space” line, those values are the corresponding values for the class area in Metaspace when compressed class pointers are used.

Metaspace is dynamically managed by HotSpot. Metadata are deallocated when their class loader is garbage collected (GC'ed). A high water mark is used for inducing a GC—when the committed memory of all metaspaces reaches this level, a GC is triggered. You can use the following flag:

-XX:MetaspaceSize=<size>

to specify the initial high water mark.

The interaction between metaspace growth and GC can be summarized as follows:

The metaspace expands the native memory it uses until it reaches some level (which starts at MetaspaceSize). When it hits that level, a GC is done to see if classes can be unloaded. After the GC, the freed space can be used for metadata. If not enough space has been freed, more native memory is used.

After the GC, it also decides what the next level is for doing a GC to unload classes. The level mostly increases, to have fewer GCs. It will sometimes decrease if lots of space has been freed due to class unloading. If MetaspaceSize is set higher, fewer GCs will be done early.
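The interaction above can be caricatured in a toy model. The doubling/shrinking rules below are invented purely for illustration; real HotSpot ergonomics is governed by flags such as MinMetaspaceFreeRatio and MaxMetaspaceFreeRatio.

```java
// Toy model of the metaspace high-water-mark (HWM) behaviour described above.
public class MetaspaceHwmModel {
    long hwm;      // committed-memory level that triggers the next GC
    int gcCount;   // number of induced metadata GCs

    MetaspaceHwmModel(long initialHwm) { this.hwm = initialHwm; }

    // Simulate reaching `committed` units; a GC frees `freedByGc` units.
    // Returns the committed size after the (possible) GC.
    long commit(long committed, long freedByGc) {
        if (committed >= hwm) {                 // hit the high water mark
            gcCount++;
            committed -= freedByGc;             // class unloading freed space
            if (freedByGc * 2 < hwm) {
                hwm *= 2;                       // little freed: raise the mark
            } else {
                hwm = Math.max(committed * 2, 1); // lots freed: lower the mark
            }
        }
        return committed;
    }
}
```

Starting the model with a larger initialHwm (the analogue of a larger MetaspaceSize) delays the first induced GC, which is exactly the tuning effect described above.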

Acknowledgement

Some writings here are based on the feedback from Jon Masamitsu. However, the author would assume the full responsibility for the content himself.

Tuesday, August 12, 2014

If you upgrade from JDK 7 to JDK 8, there are some changes in JVM that you should pay attention to. One of them is:

The removal of Permanent Generation (PermGen) space.[1]

Similar to the Oracle JRockit and IBM JVMs, the JDK 8 HotSpot JVM now uses native memory for the representation of class metadata, which is called Metaspace. This may have removed the old OOM error.[2] But ...

As described in [3], proper monitoring and tuning of the Metaspace is still required in order to limit the frequency or delay of metaspace garbage collections. In this article, we will show how to achieve that.

Tuning Metaspace

When space (either PermGen or Metaspace) fills up, classes need to be unloaded. In the PermGen era, when PermGen filled up, a garbage collection (GC) would occur and unload classes. Without PermGen, we still need some way to know when to do a GC to unload classes. This is when

-XX:MetaspaceSize=<size>

comes into play. When the amount of space used for classes reaches MetaspaceSize, a GC is done to see if there are classes to be unloaded. If you know that you are going to have lots of classes loaded and will use more than 20 MB (the default value on my platform) of class data, increasing MetaspaceSize can delay the GCs that are done to check for class unloading. During your upgrade from JDK 7 to 8, as a first estimate, you can use your old value of PermSize as your new MetaspaceSize. You can also read [4] for more details.
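As a first-cut mapping when migrating the flags above, a sketch might look like this; the sizes shown are placeholders, not recommendations.

```shell
# JDK 7 (PermGen era)
java -XX:PermSize=256m -XX:MaxPermSize=512m MyApp

# JDK 8 first estimate: reuse the old PermSize as the initial high water mark.
# A hard cap via -XX:MaxMetaspaceSize is optional and, as noted below, not advised.
java -XX:MetaspaceSize=256m MyApp
```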

Since Metaspace uses native memory, class metadata allocation is theoretically limited only by the amount of available native memory (capacity will of course depend on whether you use a 32-bit or 64-bit JVM, along with OS virtual-memory availability). However, if you want to limit the amount of native memory used for class metadata, you can set:

-XX:MaxMetaspaceSize=<size>

which restricts how far the metaspace can grow at runtime.

How Many Classes Are Loaded?

There are multiple ways[6,7] to find out how many classes are loaded and how much space they take. Here we introduce one way of finding that information.

jps is a Java tool that you can use to list the instrumented HotSpot Java Virtual Machines (JVMs) on the target system. For example, it can list the JVMs running on our Linux system.

To gather class information, you can use the jstat tool, which displays performance statistics for an instrumented HotSpot JVM. For example, we attached jstat to process 27429 and displayed the statistics on the behavior of the class loader.

From the above output, we know that 46167 classes were loaded and that they have taken 96 MB. At the time of the snapshot, HotSpot had also unloaded 406 classes, from which 1.46 MB were freed. Based on this information, you can then decide how to set your MetaspaceSize and MaxMetaspaceSize (the latter not advised) as described above. However, be warned that once you have set MaxMetaspaceSize, you may still run into an OOM error if the metaspace cannot be extended further.

Acknowledgement

Some writings here are based on the feedback from Jon Masamitsu. However, the author would assume the full responsibility for the content himself.

Trying to bring over each and every tuning option from a JR configuration to an HS one is probably a bad idea.

Even when moving between major versions of the same JVM, we usually recommend going back to the default (just pick a collector and heap size) and then redoing any tuning work from scratch (if even necessary).

Friday, August 1, 2014

In [1], we have discussed how to find out default HotSpot JVM values. In the output, you may have spotted:

{pd product}

What does it mean? In this article, we will examine that in depth.

Platform Dependent

{pd product} means platform-dependent product option. First of all, if a flag is a product option, maybe you want to pay attention to it and may want to change its value for better performance. If it's an experimental or diagnostic flag, you may want to ignore it.

If an option is {pd product}, its default value depends on the platform its implementation is compiled for. Supported platforms currently include, but are not limited to:

ARM

PPC

Solaris

x86
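To see which flags are {pd product} on your own platform, you can dump the final flag values and filter them; the grep pattern below is just one way to do it.

```shell
# Print every flag with its final (ergonomically chosen) value, then
# keep only the platform-dependent product options.
java -XX:+PrintFlagsFinal -version | grep "pd product"
```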

Examples

We would like to discuss two of the {pd product} flags further here. In JDK 7u51,[2] they have the following default values on x86 platforms.

Disclaimer

The statements and opinions expressed here are my own and do not necessarily represent those of Oracle.
