Hi,
On Thu, 2013-12-19 at 14:16 -0800, yao wrote:
> Hi All,
>
> We have a real time application build on top of HBase (a distributed
> big data store), our original CMS GC works great most of time with
> only 0 - 2 Full GCs per node everyday. This is good performance
> considering our read/write traffic. Now, it's java 7 and we want to
> see the power of G1 GC. We tried G1 GC on two of our production but
> did not see a big improvement over CMS and we would like to hear
> feedbacks from you if there are any G1 GC tuning chances.
Thanks for trying G1 :) Expect lots of perf improvements in the future.
>> Here is our observation: G1 GC cleanup time is around 0.75 secs and
> there are also some mix pauses taking more than 5 secs when
> application is in a stable state. Due to these high STW pauses, we
> believe it is the reason why G1 doesn't do better job than CMS (their
> performance is pretty close).
>> Here is the java 7 version we use on our G1 experiment machines:
> java version "1.7.0_40"
> Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
> Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed
>> mode)
> and G1 parameters:
>> -XX:+UnlockExperimentalVMOptions
> -XX:+UseG1GC
> -Xmx67714m
> -Xms67714m
> -server
> -XX:MaxGCPauseMillis=150
> -XX:G1HeapRegionSize=32m
> -XX:InitiatingHeapOccupancyPercent=30
> -XX:G1ReservePercent=20
>>> Explanations for some G1 params we set
>> G1HeapRegionSize=32m
> We have some map/reduce jobs reading from HBase everyday
> and HBase will send data in a batch fashion to mapper tasks;
> therefore, a larger temporary object (up to 15MB) will be created
> during that time and that's why we set region size to 32MB to avoid
> humongous allocation.
How frequent are those large objects, and how many of these large
objects do you expect to have at the same time relative to live heap
size?
G1 can handle a few humongous objects well, but only degrades with
larger amounts. If the humongous object survives a few collections, it's
better to try to keep them as part of the old generation. Particularly
if marking is frequent, they will be collected as part of that.
Note that I think the default heap region size for your 66G heap is 32MB
anyway (at least in recent jdk8, not sure if that change has been
backported to 7).
> InitiatingHeapOccupancyPercent=30 and G1ReservePercent=20
>> We've observed a frequent Full GC (every one hour) before
> explicitly setting these two parameters. After reading from various GC
> tuning posts, we decided to reduce
> InitiatingHeapOccupancyPercent to 30 and increase G1ReservePercent to
> 20, this reduce full GC to about 3~5 times per day per node.
InitiatingHeapOccupancyPercent should (as a rule of thumb) be set to a
value slightly higher than average live data, so that marking is not
running constantly, burning CPU cycles.
There may be situations where this is appropriate, e.g. really tight
heaps.
E.g. given the Full GC log output of
>2013-12-19T14:23:14.592-0500: 925756.121: [Full GC 65G->43G(66G),
>45.3188580 secs]
> [Eden: 0.0B(2944.0M)->0.0B(9248.0M) Survivors: 416.0M->0.0B Heap:
> 65.9G(66.2G)->43.1G(66.2G)]
>This would mean a value of at least around 66 (43G/65G), typically at
least a few percent higher.
G1ReservePercent acts as a buffer for promotion spikes, so if you have a
load that has frequent promotion spikes (and a tight heap), this is the
way to handle them.
>> Given above background, we are still keep working on G1 GC tuning and
> hope to beat CMS on our application. For now, G1 is no worse than CMS
> on our production machines except full GC happens a little more on G1
> machines.
>> Here is the Pause time (STW time, including: initial mark, young, mix
> and clean up) graph, x-axis: gc start time, y-axis: pause time in
> milliseconds.
There will be many optimizations in the future :)
> Pauses larger than 2000 milliseconds are due to mix pause and some
> spikes are due to GC cleanup time.
> Typical G1 long clean up log
>> 2013-12-19T14:37:55.375-0500: 926636.905: [GC cleanup 50G->50G(66G),
> 0.7325340 secs]
> [Times: user=12.98 sys=0.00, real=0.73 secs]
Hmmm. Could you run with -XX:+UnlockDiagnosticVMOptions -XX:
+G1SummarizeConcMark that prints some information about the phases of
the GC Cleanup (unfortunately only at VM exit if I read the current code
correctly)?
>> Typical G1 Full GC log
I guess it is known by now that full GC and the way there
(to-space-exhaustion of any kind) are too slow. ;)
The only way to fix this is to try to avoid full GC altogether.
The main knobs to turn here are G1MixedGCLiveThresholdPercent,
G1HeapWastePercent.
G1MixedGCLiveThresholdPercent indicates at what region occupancy an old
generation region is even considered for collection. The default value
is 65%, i.e. G1 only considers regions that are less than 65% occupied
for evacuation.
Assuming you have a live set of 43G (I just took the heap size at the
end of the Full GC), in case of a heap that is evenly fragmented, you
are set up for an average heap size of 66G (43G/0.65).
Which is exactly what your maximum heap size is, which means that you
are bound to have full GCs (if your heap is "perfectly" fragmented) in
the long term with that setting.
The other variable is G1HeapWastePercent: it indicates how much wasted
space G1 can leave without collecting it after considering all
candidates (selected above).
That's the 10% threshold in the
> 925756.116: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason:
>candidate old regions available, candidate old regions: 617 regions,
>reclaimable: 9897360192 bytes (13.93 %), threshold: 10.00 %]
message.
So in the worst case, the 10% means that G1 leaves untouched at most 10%
of the heap of 65% occupied regions.
I.e. at a region size of 32M and a max heap of 66G, it leaves 6.6G/.65 =
10.1G = (rounded to regions) ~320 regions.
So in this case, your target old gen heap size is around 76G (43G/.65 +
(66G*0.1)/.65) - which is beyond the max heap, not even counting some
space for the young gen (eden and survivor) and additional safety buffer
for promotion spikes.
My suggestion is to increase G1MixedGCLiveThresholdPercent and decrease
G1HeapWastePercent appropriately to avoid mixed GCs with to-space
exhaustion and full GCs.
Note that these are worst-case calculations of course, but they explain
G1 running into Full GCs. Depending on how much static and dense data
you have, you can modify these values appropriately.
This rule-of-thumb calculation should give you a stable starting point
in most cases though.
> 2013-12-19T14:23:13.558-0500: 925755.088: [GC pause (mixed)
>[...]
> 925756.116: [G1Ergonomics (Concurrent Cycles) do not request
> concurrent cycle initiation, reason: still doing mixed collections,
> occupancy: 70598524928 bytes, allocation request: 0 bytes, threshold:
> 21310419750 bytes (30.00 %), source: end of GC]
> 925756.116: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason:
> candidate old regions available, candidate old regions: 617 regions,
> reclaimable: 9897360192 bytes (13.93 %), threshold: 10.00 %]
> (to-space exhausted), 1.0285640 secs]
> [Parallel Time: 821.1 ms, GC Workers: 18]
> [GC Worker Start (ms): Min: 925755094.0, Avg: 925755094.4, Max:
> 925755094.7, Diff: 0.7]
> [Ext Root Scanning (ms): Min: 4.6, Avg: 5.1, Max: 6.5, Diff:
> 1.8, Sum: 92.7]
> [Update RS (ms): Min: 12.4, Avg: 15.6, Max: 27.8, Diff: 15.3,
> Sum: 280.1]
> [Processed Buffers: Min: 2, Avg: 22.6, Max: 32, Diff: 30,
> Sum: 406]
> [Scan RS (ms): Min: 168.6, Avg: 180.7, Max: 184.4, Diff: 15.8,
> Sum: 3252.4]
> [Object Copy (ms): Min: 616.5, Avg: 618.1, Max: 618.9, Diff:
> 2.5, Sum: 11125.9]
> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.6, Diff: 0.5, Sum:
> 5.3]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1,
> Sum: 0.5]
> [GC Worker Total (ms): Min: 819.5, Avg: 819.8, Max: 820.2, Diff:
> 0.7, Sum: 14756.9]
> [GC Worker End (ms): Min: 925755914.2, Avg: 925755914.2, Max:
> 925755914.2, Diff: 0.1]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 13.5 ms]
> [Other: 194.0 ms]
Just fyi, the missing time in this phase is taken by some additional
fixup of the to-space-exhaustion.
> [Choose CSet: 5.6 ms]
> [Ref Proc: 16.9 ms]
> [Ref Enq: 0.9 ms]
> [Free CSet: 4.3 ms]
Hth,
Thomas