Large Object Heap Compaction: Should You Use It?

Despite the many benefits of automatic memory management in .NET, there are still a few perils which we must avoid. One of the most common, and frustrating to deal with, is fragmentation of the large object heap. In this article Chris Morter explains what LOH fragmentation is, why it’s a problem, and what you can do to avoid it.

When we talk about heap memory in .NET it’s natural to picture the heap as a single large contiguous block of memory. However, because it has been carefully architected to optimise performance, this isn’t quite true. Instead, .NET breaks the heap down into four separate chunks. The first three are known as the small object heaps (SOHs), and are referred to as generation 0, 1, and 2 respectively. We’ll be focusing on the fourth, which is known as the large object heap (LOH) and is used to store all objects that are 85,000 bytes or larger.
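As a rough illustration of the size threshold (a sketch, not code from the original article), the snippet below allocates one array on either side of it and asks the runtime which generation each belongs to. It relies on the fact that the CLR reports LOH objects as generation 2, so a brand new object that is already in generation 2 must have been placed on the LOH. The class and method names are hypothetical.

```csharp
using System;

public static class LohThresholdDemo
{
    // Only meaningful for a freshly allocated object: an SOH object that has
    // survived two collections would also report generation 2.
    public static bool IsFreshObjectOnLoh(object obj)
    {
        return GC.GetGeneration(obj) == GC.MaxGeneration;
    }

    public static void Main()
    {
        byte[] small = new byte[80000]; // comfortably under the threshold
        byte[] large = new byte[85000]; // the array data alone reaches the threshold

        Console.WriteLine("80,000-byte array on LOH: " + IsFreshObjectOnLoh(small));
        Console.WriteLine("85,000-byte array on LOH: " + IsFreshObjectOnLoh(large));
    }
}
```

On a standard desktop CLR the first check should report that the array is not on the LOH and the second that it is. Sizes very close to the threshold are deliberately avoided here, because the array's own header counts towards the object size.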

.NET Memory in a Nutshell

If you’re already familiar with generational garbage collection, you can skip ahead to the next section, but if you’d like a primer or refresher then stay with me for a moment. The reason for segmenting memory in this manner is to reduce the performance cost of garbage collection. Empirical studies have shown that, for any realistic application, the objects that have been most recently created tend to be the most likely to be destroyed, meaning that it’s advantageous to garbage collect recently allocated objects more often than the ones that have already been around for a while.

By dividing the SOH into the 3 separate generations, it is possible for the garbage collector to collect only certain parts of the SOH (thus lowering the performance cost) rather than scanning everything each time a collection happens. In short, when a new object is instantiated onto the SOHs it is placed on generation 0. If it then survives a garbage collection it is ‘promoted’ to generation 1, and if it survives a second garbage collection it will be promoted to generation 2.

This is a slight simplification: some objects may remain in their current generation because they are pinned, have been added to the finalizer queue, or were created during the garbage collection itself.
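The promotion path described above can be observed with GC.GetGeneration. The following is a minimal sketch (the class and method names are made up for illustration) that records the generation of one small object immediately after allocation and after each of two forced collections. Because of the caveats just mentioned, and because debug builds and JIT optimisations affect object lifetimes, the exact numbers are not guaranteed.

```csharp
using System;

public static class PromotionDemo
{
    // Returns the generation of a single small object after allocation,
    // after one forced collection, and after a second forced collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor);
        GC.Collect(); // survivor is still referenced, so it survives
        generations[1] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the reference live until this point
        return generations;
    }

    public static void Main()
    {
        int[] g = ObserveGenerations();
        Console.WriteLine("After allocation:  gen " + g[0]);
        Console.WriteLine("After 1st collect: gen " + g[1]);
        Console.WriteLine("After 2nd collect: gen " + g[2]);
    }
}
```

In the simple case the object should start in generation 0 and move up one generation per collection it survives.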

A generation 0 collection will happen when generation 0 is full, and a generation 1 collection will happen when generation 1 is full and will also collect generation 0. Similarly, generation 2 collections also collect all lower generations, and thus are relatively expensive. Thankfully, the CLR tracks your application’s memory allocations at run time and continually tunes the size of the various generations for maximum performance, and also decides when to perform generation 2 collections. After a collection, any remaining objects on the SOHs are ‘compacted’, meaning they are shuffled up against each other to remove any gaps in memory. This means that the CLR can allocate only as much memory as is actually needed, rather than trying to fit new and promoted objects into awkwardly sized gaps (known as fragmentation).

The LOH is also collected when a generation 2 collection happens but, unlike the small object heaps, it isn’t compacted when it is garbage collected, which means that the LOH can get into a fragmented state. This is a problem because, if the heap is sufficiently fragmented, there will be no gaps large enough for new objects to be allocated into, so they will have to be allocated at the end of the heap, thereby causing the heap to expand. If this process repeats continually, the LOH will eventually consume all the system’s available memory and the program will crash with an OutOfMemoryException.
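To make the growth mechanism concrete, here is a toy model of a heap that is collected but never compacted. It is purely illustrative: ToyHeap and ToyBlock are hypothetical types, the first-fit placement is an assumption, and adjacent gaps are not merged. It does not reflect how the CLR actually manages its free list.

```csharp
using System;
using System.Collections.Generic;

// One region of the toy heap: either a live object or a gap.
public sealed class ToyBlock
{
    public int Size;
    public bool IsFree;
}

// Hypothetical model of a heap that is collected but never compacted.
public sealed class ToyHeap
{
    private readonly List<ToyBlock> blocks = new List<ToyBlock>();

    public int HeapSize { get { return Sum(false); } }
    public int FreeSpace { get { return Sum(true); } }

    private int Sum(bool freeOnly)
    {
        int total = 0;
        foreach (ToyBlock b in blocks)
            if (!freeOnly || b.IsFree) total += b.Size;
        return total;
    }

    public ToyBlock Allocate(int size)
    {
        var live = new ToyBlock { Size = size, IsFree = false };
        for (int i = 0; i < blocks.Count; i++)
        {
            ToyBlock gap = blocks[i];
            if (!gap.IsFree || gap.Size < size) continue;

            // First gap that is big enough: place the object at its start.
            blocks.Insert(i, live);
            gap.Size -= size;
            if (gap.Size == 0) blocks.RemoveAt(i + 1);
            return live;
        }
        // No single gap is large enough, so the heap has to grow.
        blocks.Add(live);
        return live;
    }

    // Collecting an object leaves a gap behind; nothing is moved.
    public void Release(ToyBlock block) { block.IsFree = true; }
}

public static class FragmentationDemo
{
    public static void Main()
    {
        var heap = new ToyHeap();
        var objects = new List<ToyBlock>();
        for (int i = 0; i < 10; i++) objects.Add(heap.Allocate(10));
        for (int i = 0; i < 10; i += 2) heap.Release(objects[i]);

        Console.WriteLine("Heap size: " + heap.HeapSize + ", free: " + heap.FreeSpace);
        heap.Allocate(20); // 50 units are free, but no single gap holds 20
        Console.WriteLine("Heap size: " + heap.HeapSize + ", free: " + heap.FreeSpace);
    }
}
```

In this model the heap grows from 100 to 120 units even though 50 units are free, because that free space is split into five gaps of 10 units each.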

Why is LOH fragmentation so bad?

LOH fragmentation can be a difficult problem to tackle since .NET abstracts away the concept of physical memory locations. This makes it hard for the developer to figure out where the CLR may choose to allocate objects, and even harder to find out which particular allocation patterns are resulting in gaps being left in the LOH. To make troubleshooting harder still, any potential problems tend to require the program to have been running for a length of time before becoming apparent, which makes debugging a tedious process.

The usual remedies, such as breaking large objects down into smaller chunks or reducing large object churn, are each difficult, inelegant, laborious, or a combination thereof. However, in .NET 4.5.1 the .NET team at Microsoft has provided another possibility by adding the ability to easily do a one-off garbage collection followed by a LOH compaction, with the following code:

using System;         // GC
using System.Runtime; // GCSettings, GCLargeObjectHeapCompactionMode
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(); // This can be omitted

If GC.Collect() is omitted then the LOH compaction will happen when the next LOH garbage collection occurs naturally. After this modified garbage collection has finished the application will continue running as before (i.e. with no LOH compactions).
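The one-off nature of the setting can be checked by reading the property back. The sketch below wraps the two lines in a hypothetical helper (LohCompactor.CompactNow is not a framework API) and prints the mode before the request, after the request, and after the collection.

```csharp
using System;
using System.Runtime;

public static class LohCompactor
{
    // Hypothetical convenience wrapper: request a compaction and trigger it now.
    public static void CompactNow()
    {
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(); // full blocking collection, which performs the compaction
    }

    public static void Main()
    {
        Console.WriteLine("Before request: " + GCSettings.LargeObjectHeapCompactionMode);

        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        Console.WriteLine("After request:  " + GCSettings.LargeObjectHeapCompactionMode);

        GC.Collect();
        Console.WriteLine("After collect:  " + GCSettings.LargeObjectHeapCompactionMode);
    }
}
```

Once the compacting collection has run, the property should read Default again, which is why a later compaction has to be requested afresh.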

Microsoft deliberately chose not to compact the LOH by default when it is garbage collected (unlike the small object heaps) because they believe that the performance impact of regularly performing a LOH compaction outweighs the benefits of doing so. In an MSDN blog post announcing the release of .NET 4.5.1, the Microsoft .NET team gives the following warning about using LOH compaction:

“LOH compaction can be an expensive operation and should only be used after significant performance analysis, both to determine that LOH fragmentation is a problem, but also to decide when to request compaction.”

Here I will seek to explain in more detail what actually happens during a LOH compaction, and clarify when it is appropriate to use it.

So how long does a compaction take?

To investigate the performance hit of LOH compaction I wrote a simple test application targeting .NET 4.5.1 which instantiates a random number (100±40) of randomly sized large objects (84KB <= size < 16MB), and then removes a random selection of them, thereby leaving the LOH in a fragmented state.

We can infer the duration of a LOH compaction by comparing the time taken for a standard GC with the time taken for a GC with LOH compaction since the difference will presumably be the length of time taken by the compaction. For this approach to be useful the heaps must be in a consistent state before each trial, so I made sure to instantiate the same random selection of objects, and then performed a full garbage collection before each trial.
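The measurement approach described above might be sketched along the following lines. This is not the author's test application, which is not shown in the article. The type and method names are hypothetical, the object sizes are scaled down to under 1MB to keep memory use modest, and reusing a seed only makes the allocation pattern repeatable, not the exact heap layout.

```csharp
using System;
using System.Diagnostics;
using System.Runtime;

public static class CompactionTimer
{
    // Allocates large arrays from a seeded RNG, then drops a random selection
    // so that gaps are left on the LOH. The same seed gives the same pattern.
    public static byte[][] BuildFragmentedLoh(int seed)
    {
        var rng = new Random(seed);
        var objects = new byte[rng.Next(60, 141)][]; // 100 ± 40 objects
        for (int i = 0; i < objects.Length; i++)
            objects[i] = new byte[rng.Next(84 * 1024, 1024 * 1024)]; // scaled-down sizes
        for (int i = 0; i < objects.Length; i++)
            if (rng.Next(2) == 0) objects[i] = null; // this one becomes garbage
        return objects;
    }

    // Times one full blocking collection, optionally with LOH compaction.
    public static TimeSpan TimeFullCollection(bool compactLoh)
    {
        if (compactLoh)
            GCSettings.LargeObjectHeapCompactionMode =
                GCLargeObjectHeapCompactionMode.CompactOnce;
        Stopwatch watch = Stopwatch.StartNew();
        GC.Collect();
        watch.Stop();
        return watch.Elapsed;
    }

    public static TimeSpan RunTrial(int seed, bool compactLoh)
    {
        byte[][] survivors = BuildFragmentedLoh(seed);
        GC.Collect(); // bring the heaps to a settled state before timing
        TimeSpan elapsed = TimeFullCollection(compactLoh);
        GC.KeepAlive(survivors); // survivors must outlive the timed collection
        return elapsed;
    }

    public static void Main()
    {
        TimeSpan plain = RunTrial(42, false);
        TimeSpan compacting = RunTrial(42, true);
        Console.WriteLine("Standard GC:   " + plain.TotalMilliseconds + " ms");
        Console.WriteLine("Compacting GC: " + compacting.TotalMilliseconds + " ms");
        Console.WriteLine("Difference:    " + (compacting - plain).TotalMilliseconds + " ms");
    }
}
```

A single pair of timings is noisy, so in practice each trial would be repeated and averaged before reading anything into the difference.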

Repeating this process for 20 such random starting states showed the following correlation:

Figure 1: Results of the LOH compaction tests.

As you may expect, there is a strong linear correlation between how long a LOH compaction takes and the amount of data that has to be moved. To quantify the amount of data moved we must consider what happens during a compaction.

The compaction algorithm

During a compaction the compaction algorithm will look through the LOH until it finds a gap, at which point it will take the next object along the heap and simply move it down to fill the gap. It will then continue looking through the heap, shifting objects down each time it encounters a gap. Note that this has the effect that, if there is a gap near the start of the LOH (as will probably be the case assuming that the gaps are numerous and uniformly distributed), then the majority of the data in the LOH will end up being moved.

The practical consequence of this is that compacting a slightly fragmented heap will require moving nearly as much data as compacting a very badly fragmented heap, and so will take roughly the same length of time. This means that performing frequent compactions doesn’t make subsequent compactions quicker, so you should delay performing compactions until it is really necessary (if at all).
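A small model makes this consequence easy to check. The sketch below follows the sliding behaviour as described in this article, not the CLR's actual implementation: the heap layout is a list of integers where a positive entry is a live object of that size in MB and a negative entry is a gap, and every live object that sits after the first gap is counted as moved. The names and the layout encoding are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class SlidingCompactionModel
{
    // layout: positive entry = live object of that size (MB), negative = gap.
    // Once a gap has been passed, every later live object has to slide down.
    public static long MegabytesMoved(IList<int> layout)
    {
        long moved = 0;
        bool gapSeen = false;
        foreach (int entry in layout)
        {
            if (entry < 0) gapSeen = true;
            else if (gapSeen) moved += entry;
        }
        return moved;
    }

    public static void Main()
    {
        // Eight live 10MB objects with a single 1MB gap near the start.
        int[] slightlyFragmented = { 10, -1, 10, 10, 10, 10, 10, 10, 10 };
        // The same eight objects with a 5MB gap between every pair.
        int[] badlyFragmented = { 10, -5, 10, -5, 10, -5, 10, -5, 10, -5, 10, -5, 10, -5, 10 };

        Console.WriteLine("Slightly fragmented: " + MegabytesMoved(slightlyFragmented) + " MB moved");
        Console.WriteLine("Badly fragmented:    " + MegabytesMoved(badlyFragmented) + " MB moved");
    }
}
```

Both layouts result in 70MB being moved in this model, since in each case only the first object sits before a gap.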

The compactions took around 2.3ms per MB moved on the LOH on my desktop (i5-3550 CPU with 16GB of DDR3 memory). Using a tool like ANTS Memory Profiler, it is possible to measure the size of objects on the LOH, and so estimate how long your application may freeze for due to a LOH compaction.
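The estimate is a single multiplication, shown here as a hedged sketch. The 2.3ms per MB figure is the measurement quoted above for one particular desktop, so it should be treated as a placeholder to be replaced with a figure measured on your own hardware. The class and method names are hypothetical.

```csharp
using System;

public static class CompactionEstimate
{
    // Measured on the author's desktop (i5-3550); other machines will differ.
    public const double MillisecondsPerMegabyte = 2.3;

    public static double EstimatePauseMs(double lohMegabytes)
    {
        return lohMegabytes * MillisecondsPerMegabyte;
    }

    public static void Main()
    {
        double lohMegabytes = 200; // e.g. size of LOH objects reported by a profiler
        double pauseMs = EstimatePauseMs(lohMegabytes);
        Console.WriteLine("Estimated pause: about " + pauseMs.ToString("F0") + " ms");
    }
}
```

For 200MB of objects on the LOH this gives a pause of roughly 460ms, which is long enough to be noticeable in an interactive application.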

One interesting feature of compactions, both on the LOH and SOHs, is that the objects on the heaps are not reordered during the compaction, even if doing so would increase the speed of the compactions. The reason for this is to preserve locality of reference, as objects are likely to be created in a similar order to that in which they are accessed. In addition, the time required to compute a suitable reordering would likely offset the potential time saved anyway.

So when should you use compaction?

I would recommend using the LOH compaction only if the following criteria are satisfied:

You are already targeting .NET 4.5.1 or can upgrade to it.

Pauses of the length estimated in the previous section don’t seriously affect the usability of your application.

It is not possible to pursue strategies of breaking large objects down into smaller chunks, or reducing large object churn.

I think this is a very useful new feature in the .NET framework, and it suggests that more developers are realising that they need to at least be aware of what’s happening beneath all the abstractions if they want to build really great software. However, I also think that it should be a strategy of last resort.

Identifying LOH fragmentation with ANTS Memory Profiler 8

Of course this is all academic unless you know when LOH fragmentation is actually occurring, so I’ll finish by showing how you can use Red Gate’s ANTS Memory Profiler 8 to identify a LOH fragmentation problem. Here I have profiled the application I used to test the speed of LOH compactions earlier in this article, and have taken a snapshot after the objects have been allocated and a selection of them deallocated, leaving the LOH in a fragmented state. In fact, this screenshot of the ANTS Memory Profiler’s summary screen shows all the hallmarks of a badly fragmented LOH:

I have circled the salient details in the snapshot above, and if you’re identifying LOH fragmentation in your own application then there are a few details you can look for:

In the ‘Memory fragmentation’ section, ANTS Memory Profiler warns that “Memory fragmentation is restricting the size of objects that can be allocated”.

In the ‘Largest classes’ section, 146.2MB of memory is listed as free space, and similarly in the ‘.NET and unmanaged memory’ section, 146.2MB is listed as unused memory allocated to .NET. In this situation, free space could be either gaps in the LOH or unused space at the end of the heap which hasn’t been returned by the CLR to the OS. There are innocent explanations as to why there may be free/unused space, such as the CLR deciding not to return memory to the OS if it anticipates using it again soon, especially on systems with lots of spare memory. However, this will tend to be a relatively small amount and will be transient, so if your system has a large amount of free memory for a long period of time, it’s a sign there could be a LOH fragmentation problem.

The ‘Memory fragmentation’ section shows that 99.9% of free memory is taken by large fragments, i.e. gaps in the LOH. The fact that this number is close to 100% suggests that the majority of the free memory hasn’t been deliberately kept by the CLR for future allocations and so is genuinely the result of fragmentation, which confirms our suspicion that fragmentation is a problem for this application.

Of course, most memory fragmentation problems won’t be quite as obvious as this, but hopefully it should give you an idea what to look for should you suspect LOH fragmentation in your own application.

TL;DR

Avoid using compaction if you can. Favour other methods of dealing with LOH fragmentation such as breaking large objects into smaller ones or reducing object churn.

If you have to use compaction then wait until as late as is safely possible before compacting.

The duration of a compaction (in milliseconds) can be roughly estimated by multiplying the size of objects on the LOH (in MB) by 2.3.

Chris Morter is a Test Engineer at Red Gate currently working in the .NET division on ANTS Memory Profiler and ANTS Performance Profiler. When not staring at the screen he enjoys sprinting, particularly the 100 meters.

Thanks Chris for taking the time to write your test application and perform the testing. This is a great service to .NET developers. Very well written, well thought out article. Terrific!

Subject: on demand LOH compaction?

Posted by: Anonymous (not signed in)

Posted on: Tuesday, October 8, 2013 at 11:24 AM

Message:

Nice job, but are you aware of any mechanism that allows a LOH compaction to occur when an Out of Memory exception would otherwise be thrown? I probably don't want to compact the LOH manually in code, but I would absolutely want to enable a flag that would tell the CLR to compact the LOH before throwing an out of memory exception, and I'm still amazed that MS doesn't seem to want to supply this functionality.

Subject: Reply - on demand LOH compaction?

Posted by: Chris Morter (not signed in)

Posted on: Friday, October 11, 2013 at 7:58 AM

Message:

At the moment there is no framework mechanism to force a LOH compaction after an OutOfMemory exception occurs.

I suspect that the reason for this is that the CLR currently cannot distinguish between relatively benign OOM exceptions and those which leave the program in a fundamentally corrupt state.

For instance, trying to allocate a 1GB array could cause an OOM exception which could be safely handled by catching it and then performing a LOH compaction, freeing some other memory, or just not doing the allocation.

On the other hand, an OOM exception resulting from failing to allocate enough memory to JIT something means there may not be enough memory to JIT your exception handling code either, thereby leaving the application in a corrupt state.

The upshot is that, since the CLR can’t differentiate between these two cases and can’t risk performing a LOH compaction on a corrupt application, it can’t perform a LOH compaction after a benign OOM exception either.