Oracle Blog

Garbage Collection - Let the VM do it.

Wednesday Dec 14, 2005

You know why you get an out-of-memory exception, right? Your live data exceeds the space
available in the Java heap. Well, that's very nearly always right. Very, very nearly.

If the Java heap is barely large enough to hold all the live data, the JVM could be doing
almost continual garbage collections. For example, if 98% of the data in the heap
is live, then there is only 2% that is available for new objects. If the application
is using that 2% for temporary objects, it can seem to be humming along quite nicely,
but not getting much work done. How can that be? Well, the application runs until it has
allocated that 2% and then a garbage collection happens and recovers that 2%.
The application runs along happily allocating and the garbage collector runs along
respectfully collecting. Over and over and over. The application will be making forward
progress but maybe oh so slowly. Are you out of memory?
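
To put rough numbers on that scenario, here is a back-of-the-envelope sketch. The heap size, allocation rate, and per-collection time are made-up values for illustration, not measurements from any real JVM, and the class and method names are hypothetical.

```java
// Back-of-the-envelope model of a heap that is 98% live.
// All inputs are illustrative assumptions, not measurements.
final class NearlyFullHeap {
    // Bytes the application can allocate between two collections.
    static double freeBytes(double heapBytes, double liveFraction) {
        return heapBytes * (1.0 - liveFraction);
    }

    // Fraction of elapsed time spent in GC, if the application allocates
    // allocBytesPerSecond while it runs and each collection takes gcSeconds.
    static double gcTimeFraction(double heapBytes, double liveFraction,
                                 double allocBytesPerSecond, double gcSeconds) {
        double runSeconds = freeBytes(heapBytes, liveFraction) / allocBytesPerSecond;
        return gcSeconds / (gcSeconds + runSeconds);
    }

    public static void main(String[] args) {
        double heap = 1000e6;      // 1000 MB heap (assumed)
        double allocRate = 100e6;  // 100 MB/s allocated while running (assumed)
        double gcSeconds = 1.0;    // 1 s per collection of a nearly full heap (assumed)
        System.out.println("free MB between collections: "
                + freeBytes(heap, 0.98) / 1e6);
        System.out.println("fraction of time in GC: "
                + gcTimeFraction(heap, 0.98, allocRate, gcSeconds));
    }
}
```

With these assumed numbers the application runs for about a fifth of a second between one-second collections, so most of the elapsed time goes to GC even though the application never actually fails an allocation.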

Back in the 1.4.1 days a customer noticed this type of behavior and asked for help
in detecting that bad situation. In 1.4.2 the throughput collector started throwing an
out-of-memory exception if the VM was spending the vast majority of its time
doing garbage collection and not recovering very much space in the Java heap.
In 5.0 the implementation was changed some, but the idea was the same. If you are
spending way too much time doing garbage collections, you're going to get an out-of-memory.
Interestingly enough this identified at least one case in our own usage of Java
applications where we were spending most of our time doing garbage collection.
We were happy to find it.
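
The idea behind that check can be sketched roughly as follows. This is not the HotSpot code. The class, the method, and the two thresholds are assumptions chosen for illustration; the text above only says "the vast majority of its time" and "not recovering very much space".

```java
// Sketch of a "too much time in GC, too little recovered" check.
// Hypothetical names and thresholds; not the HotSpot implementation.
final class GcOverheadCheck {
    static final double GC_TIME_LIMIT = 0.98;   // assumed stand-in for "vast majority"
    static final double HEAP_FREE_LIMIT = 0.02; // assumed stand-in for "not very much"

    // gcMillis and totalMillis cover some recent window of execution.
    // freeAfterGc is the free space left after the last collection.
    static boolean exceedsLimit(long gcMillis, long totalMillis,
                                long freeAfterGc, long heapCapacity) {
        if (totalMillis <= 0 || heapCapacity <= 0) {
            return false; // nothing measured yet
        }
        double gcFraction = (double) gcMillis / totalMillis;
        double freeFraction = (double) freeAfterGc / heapCapacity;
        // Both conditions must hold: lots of GC time AND little space recovered.
        return gcFraction > GC_TIME_LIMIT && freeFraction < HEAP_FREE_LIMIT;
    }

    public static void main(String[] args) {
        // 990 ms of GC in a 1000 ms window, 1% of the heap free afterwards.
        System.out.println(exceedsLimit(990, 1000, 10, 1000));
    }
}
```

Requiring both conditions matters: an application that spends a lot of time in GC but gets plenty of space back each time is slow, not out of memory.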

Why do I bring this up? Well, mostly because it was brought up in our GC meeting
this morning. If you're in this situation of spending most of your time in
garbage collection, I think you are out of memory and you need a bigger heap.
If you don't think that, you can turn off this behavior with the
command line flag -XX:-UseGCTimeLimit. May you never need it.

Tuesday Dec 06, 2005

Just a friendly warning. This one verges on GC stream-of-consciousness
ramblings.

GC ergonomics has been implemented so far in the throughput collector only.
We've been thinking about how to extend it to the low pause collector. The
low pause collector currently is implemented as a collector that does
some of its work while the application continues to run. It's described
in

http://java.sun.com/docs/hotspot

Some of the policies we used in the throughput collector will also be
useful for the low pause collector, but because the low pause
collector can be running at the same time as the application, there
are some intriguing differences.
By the way, the low pause collector does completely stop the application in order
to do some parts of the collection, so some of our experience with the
throughput collector is directly applicable. On the other hand having this mix
of behaviors can be interesting in and of itself.

When we were developing the low pause collector we decided that any
part of the collection that we could do while the application continued
to run was good. It was free. If there are spare cycles on the
machine, that's almost true. If there aren't spare cycles, then it can
get fuzzy. If the collection steals cycles that the application could use, then
there is a cost. Especially if there is only one processor on the machine.
If there is more than one processor on the machine and I'm doing GC,
am I stealing cycles from the application? If I steal cycles from another
process on the machine, does it become free again? We've been thinking about
how to assess the load on a machine and what we should do in different
load situations. That type of information may turn out to be input for GC
ergonomics.

Another aspect that we have to deal with is the connection between the young generation
size and the tenured generation pause times (pauses in collecting the
tenured generation, that is). When collecting the tenured generation, we
need to be aware of objects in the young generation that can be referencing (and thus keeping
alive) objects in the tenured generation. In fact we have to find those objects in
the young generation. And the larger the young generation is the longer it takes to
find those objects. With the throughput collector the time to collect
the tenured generation is only distantly related to the size of the young generation.
With the low pause collector the connection is stronger. If we're trying to meet a
pause time goal for a pause that is part of the tenured generation collection, then
maybe we should reduce the size of the young generation as well as reduce the size of the
tenured generation. But maybe not.

With the throughput collector a collection is started when the application attempts to
allocate an object and there is no room left in the Java heap. With the low pause collector
we want the collection to finish before we run out of room in the Java heap. So when does the
low pause collector start a collection of the tenured generation? Just In Time, hopefully.
Starting too early means that some of the capacity of the tenured generation is not used.
Starting too late makes the low pause collector not a low pause collector. In the 5.0
release we did some good work to measure how quickly the tenured generation was being filled
and used that to decide when to start a collection. It's a nice self contained problem as
long as we can start a collection early enough. But if we cannot start a collection in time
then we probably need a larger tenured generation. So a failure to JIT/GC needs to feed into
GC ergonomics decisions. Well, really we don't actually want to fail to JIT/GC before we
expand the tenured generation so there's more to think about. But not right now.
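
A minimal sketch of the "just in time" decision, assuming we already have a measured fill rate and an estimate of how long a concurrent collection takes. The names and the safety factor are hypothetical and this is not how the 5.0 code is written; it only shows the shape of the comparison.

```java
// Sketch: start a concurrent collection when the time left before the
// tenured generation fills is no longer comfortably larger than the time
// a collection is expected to take. Hypothetical names throughout.
final class ConcurrentStartPolicy {
    // Seconds until the tenured generation is full at the measured fill rate.
    static double secondsUntilFull(long freeBytes, double fillBytesPerSecond) {
        if (fillBytesPerSecond <= 0) {
            return Double.POSITIVE_INFINITY; // not filling, no deadline
        }
        return freeBytes / fillBytesPerSecond;
    }

    // safetyFactor > 1 starts earlier to allow for estimation error (assumed knob).
    static boolean shouldStart(long freeBytes, double fillBytesPerSecond,
                               double expectedCollectionSeconds, double safetyFactor) {
        return secondsUntilFull(freeBytes, fillBytesPerSecond)
                <= expectedCollectionSeconds * safetyFactor;
    }

    public static void main(String[] args) {
        // 50 MB free, filling at 10 MB/s, collection expected to take 4 s.
        System.out.println(shouldStart(50_000_000L, 10e6, 4.0, 1.5));
    }
}
```

If `shouldStart` would already be true right after a collection finishes, no start time is early enough, which is the case where a larger tenured generation is probably needed.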

Monday Oct 24, 2005

There were some decisions made during the development of GC ergonomics that perhaps deserve some
explanation. For example,

The pause time goal comes first.

A pause is a pause is a pause.

Ignore the cost of System.gc()'s.

Why is the pause time goal satisfied first?

GC ergonomics tries to satisfy a pause time goal before considering any throughput goal.
Why not the throughput goal first? I tried both ways with a variety of applications.
As one might expect it was not black and white. In the end we chose to consider the
goals in this order.

1. Pause time goal

2. Throughput goal

3. Smaller footprint

The pause time goal definitely
has the potential for being the hardest goal to meet. Its dependence on heap size
is complicated and trying to meet the pause time goal without the encumbrances of either
of the other goals was easier to think about. If we could meet the pause time goal, then
increasing the heap to try and meet a throughput goal felt safer (i.e., the relationship
between throughput and heap size is more linear so it was easier to understand
how undoing an increase would get us back to where we started).

In retrospect it also seems more natural
to have the pause time goal (which pushes heap size down) competing with the throughput
goal (which pushes heap size up). And only then to have the throughput goal (which again
pushes the heap size up) competing with the footprint goal (which, of course, pushes the
heap size down).

A pause is a pause ...

We talked quite a bit about whether the pause time goal should apply to both the major
and minor pause times. The issue was whether it would be effective to shrink the
size of the old generation to reduce the major pause times. With a young generation
collection you can shrink the heap more easily because there is always some place to
put any live objects in the young generation (namely into the old generation). It was clear
that reducing the young generation size would reduce the minor collection times
(after you've paid the cost of getting the collection started and shutting it down).
Well, that's true if you can ignore the fact that more frequent collections give objects less
time to die. With the old generation it was much less obvious what would happen. The old generation
can only be shrunk down to a size big enough to hold all the live data in the old generation.
Also the amount of free space in the old generation has an effect on the young generation
collection in that young generation collection may need to copy objects into the old
generation. In the end we decided that trying to limit both the major pauses and minor pauses with the
pause time goal, while harder, was more meaningful. Would you have accepted the
excuse "Yes, we missed the goal but it was a major collection pause not a minor
collection pause"?

System.gc()'s. Just ignore them.

During development I initially tried to include the costs of System.gc()'s in the calculation of the averages used by GC ergonomics. In calculating the cost of
collections the frequency of collections matters. If you are
having collections more often, then the cost of GC is higher.
The strategy to reduce that cost is to increase the size of the heap so that collections
are less frequent (i.e., since the heap is larger you can do more allocations
before having to do another collection). The difficulty
with System.gc()'s is that increasing the size of the heap does not in general increase
the time between System.gc()'s. I tried to finesse the cost of a System.gc() by considering
how full the heap was when the System.gc() happened and extrapolating to how long the interval
between collections would have been. After some experimentation I found that picking how
to do the extrapolation was basically picking the answer (i.e., what the GC cost would have
been). I could tailor an extrapolation to fit one application, but invariably it did not fit
some other applications. Basically it was too hard. So GC ergonomics ignores System.gc()'s.
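
To illustrate why the extrapolation was "picking the answer", here is a small sketch. It assumes GC cost is measured as pause time divided by pause time plus the interval between collections, and it uses a made-up family of extrapolations controlled by an exponent. Neither the formula for the extrapolation nor the names come from the actual implementation.

```java
// Sketch: the GC cost attributed to a System.gc() depends heavily on how
// the "natural" interval between collections is extrapolated.
// Hypothetical names; the exponent is an invented knob for illustration.
final class SystemGcCost {
    // GC cost as a fraction of time: pause / (pause + time between collections).
    static double gcCost(double pauseSeconds, double intervalSeconds) {
        return pauseSeconds / (pauseSeconds + intervalSeconds);
    }

    // Guess how long the application would have run before the heap filled,
    // given that the heap was only fractionFull when System.gc() was called.
    // exponent = 1 is a linear guess; other values give other guesses.
    static double extrapolatedInterval(double observedSeconds, double fractionFull,
                                       double exponent) {
        return observedSeconds / Math.pow(fractionFull, exponent);
    }

    public static void main(String[] args) {
        double pause = 0.5;     // seconds spent in the System.gc() (assumed)
        double observed = 2.0;  // seconds since the previous collection (assumed)
        double full = 0.25;     // heap was a quarter full (assumed)
        System.out.println("no extrapolation: " + gcCost(pause, observed));
        System.out.println("linear guess:     "
                + gcCost(pause, extrapolatedInterval(observed, full, 1.0)));
        System.out.println("another guess:    "
                + gcCost(pause, extrapolatedInterval(observed, full, 2.0)));
    }
}
```

The same System.gc() gets a very different cost depending on the guess, and nothing in the measurements says which guess is right.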

Tuesday Oct 04, 2005

In general GC ergonomics works best for an application that has reached a steady state behavior in terms of its allocation pattern. Or at least it is not changing its allocation pattern quickly. GC ergonomics measures the pause times and throughput of the application and changes the size of the heap based on those measurements.

The measurements of pause time and throughput are kept in terms of a weighted average where (as one would expect) the most recent measurements are weighted more heavily. By using a weighted average GC ergonomics is not going to turn on a dime in response to a change in behavior by the application, but it is also not going to go flying off in a wrong direction because of normal variations in behavior.
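
One common way to build such an average is an exponentially weighted one, sketched below. This is an assumption about the general technique, not a copy of what the VM does; the class name and the weight value are hypothetical.

```java
// Sketch of an exponentially weighted average: each new sample gets a fixed
// share of the result and older samples fade out gradually.
// Hypothetical class; the VM's actual weighting may differ.
final class WeightedAverage {
    private final double weight; // share given to the newest sample, in (0, 1]
    private double average;
    private boolean hasSample;

    WeightedAverage(double weight) {
        if (weight <= 0.0 || weight > 1.0) {
            throw new IllegalArgumentException("weight must be in (0, 1]");
        }
        this.weight = weight;
    }

    // Folds a new measurement into the average and returns the new average.
    double sample(double value) {
        if (!hasSample) {
            average = value; // first sample seeds the average
            hasSample = true;
        } else {
            average = weight * value + (1.0 - weight) * average;
        }
        return average;
    }

    double average() {
        return average;
    }

    public static void main(String[] args) {
        WeightedAverage pauseMillis = new WeightedAverage(0.25);
        pauseMillis.sample(100);
        pauseMillis.sample(100);
        // One unusually long pause moves the average only part of the way.
        System.out.println(pauseMillis.sample(200));
    }
}
```

A small weight makes the average steadier but slower to notice a real change; a large weight does the opposite. That is the same trade-off described above.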

If past behavior is not a good indicator of future performance, then GC ergonomics can lag behind in its decision making. If a change is just an occasional bump in the road, GC ergonomics will catch up. If behavior is all over the map, well, what can I say.

The easiest way to get into trouble with GC ergonomics is to specify a pause time goal that is not reachable. Typically what happens is that GC ergonomics reduces the pause times by reducing the size of the heap. As the heap is shrunk the frequency of collections goes up and throughput goes down. GC ergonomics is willing to drive throughput to nearly zero (by doing collections nearly all the time) in order to reach the pause time goal. I tell people to run without a pause time goal initially and see how large the collection pauses get. That gives a baseline for experimenting with a pause time goal. Then have a little fun and try some pause times.

Another thing you should be aware of is that GC ergonomics is going to run at every collection. If your application has settled into a stable steady state, GC ergonomics is still looking to see if anything is changing so it can adjust. It does cost you some cycles, but I don't think it's significant. Let me put it this way. I've never seen GC ergonomics code show up in a significant way on performance profiles. This is probably less a sharp edge than a mild poke in the ribs. If you think that your application is really not going to be changing its behavior after it has settled in and want those last few cycles, run with GC ergonomics until your application has reached its steady state and look to see how the heap is sized. You'll have to pay attention to how the generations are sized also. Then select those sizes on the command line and turn GC ergonomics off. At least for most of you that should be plenty good. If performance is not quite as high and you don't already know about survivor spaces, you may have to learn about them. The document "Tuning Garbage Collection with the 5.0 Java Virtual Machine" should help. It can be found under the URL below (same one as in "Magic"). If performance is actually better, rejoice and let me know how we can be doing better.
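
One way to look at how the heap and its generations are currently sized is through the standard `java.lang.management` API, as in the sketch below. The class name is hypothetical. Pool names depend on the collector in use, so treat whatever it prints as specific to your VM and settings.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: report the committed size of each heap memory pool so the
// steady-state sizes can be noted down. Hypothetical helper class.
final class HeapPoolSizes {
    // Committed size in bytes of each heap pool, keyed by pool name.
    static Map<String, Long> committedHeapPools() {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // null if the pool is no longer valid
                sizes.put(pool.getName(), usage.getCommitted());
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : committedHeapPools().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() / (1024 * 1024) + " MB");
        }
    }
}
```

Call it (or log the same values periodically) once the application has settled. The totals map onto the usual heap sizing options such as `-Xms`, `-Xmx` and `-Xmn`; the survivor spaces take a little more care, as noted above.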

Monday Sep 26, 2005

In our J2SE (tm) 1.5.0 release we added a new way of tuning the Java(tm) heap
which we call "garbage collector (GC) ergonomics". This was added
only to the parallel collector. You may also have seen it referred
to as "Smart Tuning" or "Simplified Tuning". GC ergonomics allows a
user to tune the Java heap by specifying a desired behavior for
the application. These behaviors are a maximum pause time goal and a
throughput goal.

So what is GC ergonomics? Prior to J2SE 1.5.0 if you wanted to
tune the Java Virtual Machine (JVM)(tm) for an application you typically
did it by trial-and-error. You would run the JVM on
your application without changing any parameters and see how it ran.
If the throughput of the application was not as high as you wanted,
the usual solution was to increase the heap size.
With a larger heap collections happen less often so the cost of
garbage collection decreases as a percentage of the total execution
time. But as you increase the size of the heap, often the
length of the garbage collections increases. Since the
garbage collector pauses all application threads to do a collection,
the application would see longer and longer pauses as you chose
larger and larger heaps. If the pauses became too
long for your application, then you would have to reduce the size
of the heap. You usually have to choose a compromise between pause
times and throughput.

With GC ergonomics in J2SE 1.5.0 you choose a pause time goal and a
throughput goal and let the JVM increase or decrease the size of
the heap to try to meet those goals. On big machines a larger
maximum heap size is chosen as a default. GC ergonomics
only grows the heap enough to meet your goals so the maximum
heap size is not necessarily used. Sometimes you might have to
increase the maximum size of the heap if the default maximum size
is too small.
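
As I understand the 5.0 ergonomics document, the pause time goal is given with `-XX:MaxGCPauseMillis=<nnn>` and the throughput goal with `-XX:GCTimeRatio=<nnn>`, where the ratio of GC time to application time is 1 / (1 + nnn). The sketch below only does that arithmetic; the class and method names are hypothetical.

```java
// Sketch: translate a GCTimeRatio-style throughput goal into the fraction
// of total time that may be spent in GC, and test measurements against it.
// Hypothetical helper; only the 1 / (1 + N) relationship is from the docs.
final class ThroughputGoal {
    // With a ratio of N, GC may take 1 / (1 + N) of the total time.
    static double gcTimeFraction(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("ratio must not be negative");
        }
        return 1.0 / (1.0 + gcTimeRatio);
    }

    // True if the measured GC share of total time is within the goal.
    static boolean meetsGoal(double gcSeconds, double appSeconds, int gcTimeRatio) {
        double total = gcSeconds + appSeconds;
        return total > 0 && gcSeconds / total <= gcTimeFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println("ratio 19 allows GC fraction " + gcTimeFraction(19));
        System.out.println("ratio 99 allows GC fraction " + gcTimeFraction(99));
        System.out.println("2 s GC in 100 s with ratio 99: " + meetsGoal(2, 98, 99));
    }
}
```

So a ratio of 19 asks for no more than 5% of the time in GC, and a ratio of 99 asks for no more than 1%.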

So how does this work? Actually GC ergonomics does pretty much
what you would do to tune the heap. As I say in the title, it's not
magic. But it does have the benefit of being able to tune dynamically
during the execution of the application. GC ergonomics

Measures the performance (both throughput and pause times)
of your application.

Compares the performance against the goals.

Decreases the heap size to shorten pause times, OR

Increases the heap size to get fewer collections.

If both the pause time goal and the throughput goal are being met,
GC ergonomics will decrease the size of the heap to try and minimize
the application's footprint.
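
The steps above can be sketched as a single decision made after each collection. This is a simplification under stated assumptions: the names are hypothetical, the inputs are taken to be the averaged measurements, and the real policy sizes the generations separately rather than returning one action for the whole heap. The pause time goal is checked first, then throughput, then footprint.

```java
// Sketch of the measure / compare / adjust step of GC ergonomics.
// Hypothetical names; a simplification, not the VM's policy code.
final class ErgonomicsSketch {
    enum Action { SHRINK_FOR_PAUSE, GROW_FOR_THROUGHPUT, SHRINK_FOR_FOOTPRINT }

    static Action decide(double avgPauseMillis, double pauseGoalMillis,
                         double gcTimeFraction, double gcTimeFractionGoal) {
        if (avgPauseMillis > pauseGoalMillis) {
            return Action.SHRINK_FOR_PAUSE;     // pauses too long: smaller heap
        }
        if (gcTimeFraction > gcTimeFractionGoal) {
            return Action.GROW_FOR_THROUGHPUT;  // too much GC time: larger heap
        }
        return Action.SHRINK_FOR_FOOTPRINT;     // both goals met: give memory back
    }

    public static void main(String[] args) {
        // 300 ms average pause against a 200 ms goal.
        System.out.println(decide(300, 200, 0.01, 0.05));
        // Pauses fine, but 10% of time in GC against a 5% goal.
        System.out.println(decide(150, 200, 0.10, 0.05));
        // Both goals met.
        System.out.println(decide(150, 200, 0.01, 0.05));
    }
}
```

Because the decision is repeated at every collection, the heap size drifts toward a point where the goals are just met instead of jumping there in one step.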

GC ergonomics tries to meet your goals but there are no guarantees that
it can. For example, a maximum pause time of zero would be nice, but
it's not going to happen. Can you tune the heap better than GC
ergonomics? Probably yes. Is it worth your time to do it? And
to keep it tuned as your circumstances change? You'll have to tell
us.

For more information on GC ergonomics, please see "Ergonomics in the
5.0 Java Virtual Machine" under