Memory Leak Detection in .NET

Introduction

Memory leaks are usually cumbersome to detect and locate. This article describes one way to locate them in .NET applications. First, I will cover resource allocation and the garbage collection algorithm, and then move on to detecting leaks in .NET apps.

Background

Resource Allocation

The CLR (Common Language Runtime) allocates all the resources on the managed heap and releases them when they are no longer required by the application. C/C++ applications were prone to memory leaks because programmers had to manually allocate and free memory.

The runtime maintains a pointer, NextObjPtr, to the next free space on the managed heap. When a new process is initialized, the CLR reserves a contiguous region of address space for the heap and sets NextObjPtr to its start. Each allocation places the new object at NextObjPtr and advances the pointer past it. The space is contiguous, unlike the C++ heap, which is maintained as a linked list. Allocation on the GC heap is efficient compared to the C++ heap because the GC doesn't have to search a list of free memory blocks for one that fits. As time passes, gaps start to appear in the heap as objects die, so the GC has to compact the heap, which is costly. The GC in .NET uses the Win32 API VirtualAlloc or VirtualAllocEx to reserve memory.
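The pointer-bump allocation described above can be sketched as a toy model. This is only an illustration of the idea, not how the CLR is implemented; the BumpAllocator class and its members are hypothetical names, and offsets stand in for real addresses.

```csharp
using System;

// Toy model of bump-pointer allocation. Not the CLR's implementation.
public sealed class BumpAllocator
{
    private readonly int _capacity;
    private int _nextObjPtr; // offset of the next free byte in the segment

    public BumpAllocator(int capacity)
    {
        _capacity = capacity;
    }

    public int NextObjPtr => _nextObjPtr;

    // Returns the offset of the new block, or -1 when the segment is full
    // (the point at which a real runtime would trigger a collection).
    public int Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (size > _capacity - _nextObjPtr) return -1;

        int start = _nextObjPtr;
        _nextObjPtr += size; // no free-list search, just advance the pointer
        return start;
    }
}

public static class BumpAllocatorDemo
{
    public static void Main()
    {
        var heap = new BumpAllocator(64);
        Console.WriteLine(heap.Allocate(24)); // 0
        Console.WriteLine(heap.Allocate(24)); // 24
        Console.WriteLine(heap.Allocate(24)); // -1, only 16 bytes remain
    }
}
```

Allocation is a bounds check and an addition, which is why it is cheap; the cost is paid later, when dead objects leave gaps and the heap has to be compacted.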

.NET uses several types of memory like stack, unmanaged heap, and managed heap.

Stack: It is managed on a per-thread basis, and is used to store local variables, method parameters, and temporary values. The GC doesn't clean the stack, as it is cleaned automatically when the method returns. References to objects are stored on the stack, but the actual objects are allocated on the heap, and the GC is aware of that. When the GC cannot find a reference to an object, it removes the object from the heap.

Unmanaged Heap: Unmanaged code allocates objects on the unmanaged heap or the stack. Managed code can also allocate objects on the unmanaged heap by calling Win32 APIs.

Managed Heap: Managed code allocates objects on the managed heap, and the GC takes care of managing it. The GC also maintains a Large Object Heap to avoid the cost of moving large objects in memory.

The garbage collector checks the heap for objects which are no longer used by the application. If such objects exist, then the GC removes those objects from the heap. Now, the question is how GC finds out about these objects which are not used by the application. Every application maintains a set of roots. Roots are like pointers to the objects on the heap. All global and static object pointers are considered as application roots. Any local variable on the thread stack is considered as application root. This list of roots is maintained by the JIT compiler and the CLR, and is made available to the GC.

When the GC starts running, it treats all objects as garbage, assuming that none of the objects on the heap are accessible. It then walks the list of application roots and builds a graph of accessible objects. It marks an object on the heap as accessible if it is directly accessible from an application root or indirectly accessible via another object. For each application, the GC maintains a tree of references that tracks the objects referenced by the application. Using this approach, the GC builds a list of live objects, and then walks through the heap in search of objects that are not in this list. It marks those objects as garbage, and compacts the memory to remove the holes left by unreferenced (dead) objects. It uses the memcpy function to move objects from one memory location to another, and updates the application roots to point to the new locations.
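The marking step described above amounts to a reachability walk over the object graph. The following is a minimal sketch of that idea on a hand-built graph, not the CLR's collector; Node, MarkPhase, Mark and Sweep are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// A stand-in for a heap object that can reference other objects.
public sealed class Node
{
    public Node(string name) { Name = name; }
    public string Name { get; }
    public List<Node> References { get; } = new List<Node>();
}

public static class MarkPhase
{
    // Returns every node reachable from the roots, directly or indirectly.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var live = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!live.Add(current)) continue; // already marked
            foreach (Node child in current.References) pending.Push(child);
        }
        return live;
    }

    // Everything on the "heap" that was not marked is garbage.
    public static List<Node> Sweep(IEnumerable<Node> heap, HashSet<Node> live)
    {
        var garbage = new List<Node>();
        foreach (Node node in heap)
        {
            if (!live.Contains(node)) garbage.Add(node);
        }
        return garbage;
    }
}

public static class MarkPhaseDemo
{
    public static void Main()
    {
        var a = new Node("a");
        var b = new Node("b");
        var c = new Node("c");
        var d = new Node("d");
        a.References.Add(b); // b is reachable through a
        d.References.Add(d); // d references itself but has no root

        var heap = new List<Node> { a, b, c, d };
        HashSet<Node> live = MarkPhase.Mark(new[] { a });

        var names = new List<string>();
        foreach (Node node in MarkPhase.Sweep(heap, live)) names.Add(node.Name);
        Console.WriteLine(string.Join(", ", names)); // c, d
    }
}
```

Note that d is collected even though it references itself: what matters is reachability from a root, not whether any reference to the object exists.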

If there is a live reference to an object, the object is said to be strongly rooted. .NET also has the concept of a weak reference. Any object can be held through a weak reference, which tells the GC that we want to be able to access this object, but that the GC may collect it if a garbage collection takes place. Weak references are generally used for very large objects which are easy to create but costly to keep in memory.
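As a minimal sketch of that use case, the class below caches a large buffer through System.WeakReference<T>, so the GC stays free to reclaim it once no caller holds it. WeakBufferCache and GetBuffer are hypothetical names chosen for this example.

```csharp
using System;

public static class WeakBufferCache
{
    // The cache holds the buffer only weakly, so it does not keep it alive.
    private static WeakReference<byte[]> _cached;

    public static byte[] GetBuffer(int size)
    {
        // Reuse the buffer if the GC has not collected it yet.
        if (_cached != null
            && _cached.TryGetTarget(out byte[] existing)
            && existing.Length == size)
        {
            return existing;
        }

        // Otherwise recreate it (cheap to create, costly to keep around).
        var fresh = new byte[size];
        _cached = new WeakReference<byte[]>(fresh);
        return fresh;
    }
}

public static class WeakBufferCacheDemo
{
    public static void Main()
    {
        byte[] first = WeakBufferCache.GetBuffer(1024 * 1024);
        byte[] second = WeakBufferCache.GetBuffer(1024 * 1024);

        // While 'first' is still strongly referenced, the cache returns it.
        Console.WriteLine(ReferenceEquals(first, second));
        GC.KeepAlive(first);
    }
}
```

Whether a later call gets the old buffer or a new one depends on whether a collection ran in between, so callers must not rely on either outcome.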

Moving objects in memory carries a significant performance cost. To improve performance, the GC applies several optimizations, such as the large object heap and generations. Objects larger than 85,000 bytes are allocated on the large object heap. Moving large objects in memory is costly, so the GC maintains a separate heap for large objects, which it does not compact by default. The GC also maintains generations of objects. Whenever a new object is to be allocated and the managed heap doesn't have enough memory for it, a collection is performed. Initially, every object on the heap is in Gen 0. The objects which survive a Gen 0 collection are moved to Gen 1, and similarly, those which survive a Gen 1 collection move to Gen 2. The GC assumes that a new object will have a short lifetime and an old object will have a longer one. Whenever new memory is required, the GC tries to reclaim it from Gen 0, and if enough memory can't be obtained from a Gen 0 collection, then a Gen 1 or even a Gen 2 collection is performed.
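Generations can be observed from code with GC.GetGeneration. The sketch below is only a probe: the exact numbers depend on the runtime version and GC configuration, so no particular output is promised. GenerationProbe and Observe are hypothetical names.

```csharp
using System;

public static class GenerationProbe
{
    // Returns the generation of a small object before and after a forced
    // collection, plus the generation reported for a large array.
    public static int[] Observe()
    {
        var small = new object();
        var large = new byte[100000]; // above the 85,000-byte threshold

        int smallBefore = GC.GetGeneration(small); // typically 0
        int largeGen = GC.GetGeneration(large);    // large objects typically report the highest generation

        GC.Collect(); // forced here only for demonstration
        int smallAfter = GC.GetGeneration(small);  // survivors are typically promoted

        GC.KeepAlive(small);
        GC.KeepAlive(large);
        return new[] { smallBefore, smallAfter, largeGen };
    }
}

public static class GenerationProbeDemo
{
    public static void Main()
    {
        int[] result = GenerationProbe.Observe();
        Console.WriteLine("small before: " + result[0]);
        Console.WriteLine("small after:  " + result[1]);
        Console.WriteLine("large:        " + result[2]);
        Console.WriteLine("max generation: " + GC.MaxGeneration);
    }
}
```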

The GC can track an unmanaged resource's lifetime, but it can't reclaim the memory used by the resource unless destructors are used or code is written to override the Finalize in the base class.

A finalizer exists to allow the programmer to clean up the native resources used by an object before the object is garbage collected. But using a finalizer promotes the object to the next generation. Whenever a new object with a Finalize method is allocated on the heap, a pointer to that object is placed on the Finalization queue. During garbage collection, if the GC finds that an object is not reachable, it searches the Finalization queue for a reference to the object. If it finds one, it removes the reference from the Finalization queue and appends it to another data structure called the Freachable queue. At this point, the garbage collector has finished identifying garbage and compacts the memory. After that, the finalization thread empties the Freachable queue by executing each object's Finalize method. The next time a collection is performed, the GC sees this object as garbage and reclaims the memory assigned to it.

It takes more time to reclaim the memory from objects having Finalize methods and affects performance, so a Finalize method should only be used when required.
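One common way to keep the finalizer as a safety net while avoiding its cost in the normal case is to implement IDisposable and call GC.SuppressFinalize once the resource has been released. The following is a minimal sketch, assuming the native resource is a block obtained from Marshal.AllocHGlobal; NativeBuffer is a hypothetical class for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;

    public NativeBuffer(int bytes)
    {
        _memory = Marshal.AllocHGlobal(bytes); // unmanaged allocation
    }

    public bool IsDisposed => _memory == IntPtr.Zero;

    public void Dispose()
    {
        Release();
        // The resource is already freed, so the object no longer needs
        // to go through the finalization queues.
        GC.SuppressFinalize(this);
    }

    // Safety net: runs only if Dispose was never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}

public static class NativeBufferDemo
{
    public static void Main()
    {
        using (var buffer = new NativeBuffer(4096))
        {
            Console.WriteLine(buffer.IsDisposed); // False
        } // Dispose runs here and frees the block deterministically
    }
}
```

With this shape, an object that is disposed properly is reclaimed in a single collection, and only forgotten instances pay the extra cost of finalization.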

Memory leaks can occur in the stack, the unmanaged heap, or the managed heap. There are many ways to notice that memory is leaking, such as memory usage increasing in the Task Manager. Before starting to correct the memory problem, you need to determine which kind of memory is leaking. Perfmon can be used to examine counters such as Process/Private Bytes, .NET CLR Memory/# Bytes in all Heaps, and .NET CLR LocksAndThreads/# of current logical Threads. If # of current logical Threads is increasing unexpectedly, then thread stacks are leaking. If only Process/Private Bytes is increasing while # Bytes in all Heaps is not, then unmanaged memory is leaking; if both are increasing, then managed memory is leaking.

Figure 1: Perfmon output for Private Bytes and # Bytes in all Heaps
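The decision rule above can be written down as a small function. This is only a sketch of the rule as stated in this article, applied to two samples noted down by hand from Perfmon; CounterSample, LeakKind and LeakClassifier are hypothetical names, and a real diagnosis should look at a trend over many samples rather than a single pair.

```csharp
using System;

public enum LeakKind { NoneObserved, ThreadStack, Unmanaged, Managed }

// One reading of the three counters discussed above.
public readonly struct CounterSample
{
    public CounterSample(long privateBytes, long bytesInAllHeaps, int logicalThreads)
    {
        PrivateBytes = privateBytes;
        BytesInAllHeaps = bytesInAllHeaps;
        LogicalThreads = logicalThreads;
    }

    public long PrivateBytes { get; }
    public long BytesInAllHeaps { get; }
    public int LogicalThreads { get; }
}

public static class LeakClassifier
{
    public static LeakKind Classify(CounterSample before, CounterSample after)
    {
        // A growing thread count points at leaked thread stacks.
        if (after.LogicalThreads > before.LogicalThreads) return LeakKind.ThreadStack;

        bool privateGrew = after.PrivateBytes > before.PrivateBytes;
        bool managedGrew = after.BytesInAllHeaps > before.BytesInAllHeaps;

        if (privateGrew && managedGrew) return LeakKind.Managed;
        if (privateGrew) return LeakKind.Unmanaged;
        return LeakKind.NoneObserved;
    }
}

public static class LeakClassifierDemo
{
    public static void Main()
    {
        // Made-up readings for illustration only.
        var before = new CounterSample(100000000, 40000000, 12);
        var after = new CounterSample(300000000, 41000000, 12);
        Console.WriteLine(LeakClassifier.Classify(before, after)); // Managed
    }
}
```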

Stack Memory

Stack memory is reclaimed when the method returns. Stack memory can leak in two ways. First, a method that consumes a significant amount of stack space may never return, so its stack frame is never released. Second, background threads may be created and never terminated, which leaks their thread stacks.

Unmanaged Heap Memory

If the total memory usage is increasing but the .NET CLR memory is not, then unmanaged memory is leaking. Unmanaged memory can leak in several ways. The simplest is when managed code interoperates with unmanaged code and the leak is in the unmanaged code itself. Another involves finalizers: .NET doesn't guarantee that the finalizer of every object will be called. In the current implementation, .NET has one finalizer thread. If a finalizer blocks this thread, the other finalizers never get called, and the unmanaged memory they were supposed to release leaks. When an AppDomain is torn down, the CLR tries to run all the finalizers, but a blocking finalizer can prevent the CLR from completing the teardown. To prevent this, the CLR imposes a timeout on the finalization process, after which it stops running finalizers, and the unmanaged memory which was supposed to be released is left leaked.

Managed Heap Memory

Managed memory can also leak in several ways, such as fragmentation of the Large Object Heap. The memory in the Large Object Heap is not compacted, so free space there becomes fragmented and is effectively lost. Also, if objects that are no longer needed are still referenced from somewhere, the GC never reclaims the memory assigned to them.
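A typical source of such lingering references is an event subscription: the publisher's delegate holds a reference to every subscriber, so a subscriber stays reachable for as long as the publisher lives. The sketch below uses hypothetical Browser and Editor classes to show the reference and how unsubscribing removes it.

```csharp
using System;

public sealed class Browser
{
    public event EventHandler SelectionChanged;

    // Number of handlers the event currently holds references to.
    public int HandlerCount =>
        SelectionChanged == null ? 0 : SelectionChanged.GetInvocationList().Length;
}

public sealed class Editor : IDisposable
{
    private readonly Browser _browser;

    public Editor(Browser browser)
    {
        _browser = browser;
        // From here on, the browser's delegate references this editor.
        _browser.SelectionChanged += OnSelectionChanged;
    }

    private void OnSelectionChanged(object sender, EventArgs e)
    {
        // React to the selection change.
    }

    public void Dispose()
    {
        // Without this line the editor stays rooted through the browser.
        _browser.SelectionChanged -= OnSelectionChanged;
    }
}

public static class EventLeakDemo
{
    public static void Main()
    {
        var browser = new Browser(); // long-lived object
        var editor = new Editor(browser);
        Console.WriteLine(browser.HandlerCount); // 1

        editor.Dispose();
        Console.WriteLine(browser.HandlerCount); // 0
    }
}
```

If Dispose is never called, every Editor ever created remains reachable from the Browser, which is the shape of leak that the SOS walkthrough in this article tracks down.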

This kind of leak is common, and can be resolved using SOS.dll. There are two ways to use SOS.dll:

Run the application which you want to debug for memory problems. Start the WinDbg tool and attach it to the application process.

OR

Open the application in Visual Studio 2005. Go to the properties of the project. In the Debug tab, make sure you have "Enable unmanaged code debugging" checked, or if you are attaching to a process, then in the "Attach to Process" window, click Select, and then select "Managed code" and "Native code". Run the application, and set a breakpoint in the code somewhere you want your application to break. Hit the breakpoint. Go to Debug -> Windows -> Immediate.

Run .load SOS.dll

The SOS.dll is the most popular debugging extension used to debug managed code. It has many powerful commands that can obtain information such as managed call stack, details about managed heap, objects in heap, and much more.

Run !dumpheap -stat or !dumpheap -type PolicyEditor

This command scans the GC heaps and lists the objects that lie therein. The -stat argument restricts the output to a statistical summary per type. More information about this (or any) command can be found by using !help, for example !help dumpheap.

The above command will list all the objects present in the memory. If you think that an object should not be present in memory and should have been garbage collected, then open a Find (Ctrl +F) window and type the name of the object and search for it. If you cannot find the object, then either it has been garbage collected, or it was not instantiated. If you find the object, for example, the PolicyEditor object in the above list, then copy the MT (Method Table) address, which is 081d9ac4 in the above case.

Run !dumpheap -mt 081d9ac4

This will list all the objects with this MethodTable address. Those objects are the instances of PolicyEditor.

Take the address of one of these instances and run !gcroot with it. This will list the path to the object from the roots of the GC tree. If such a path exists, the object is not considered garbage and is not collected by the GC, which could be the reason for the memory leak.

In the above output, we can see that the PolicyClarificationBrowser object has an event handler which holds a reference to PolicyEditor. To find out who is hooking up this event, take the address of the event handler, which is 2448f478, and dump this object.

If you take the address of the target in the above output, which is 2c4e3714, and run a !dumpobj with this address, then you can see in the output that it is of type PolicyEditor. To get the method that is hooked up as a handler, convert the int value in _methodPtr to hex.

?0n140144060

(This command will not work in the Immediate Window. To run it, you need to attach WinDbg to this or any other managed process, or convert the value to hex with any other tool.)

The output will be:

Evaluate expression: 140144060 = 085a6dbc

And then run !ip2md 085a6dbc

Failed to request MethodData, not in JIT code range

Sometimes this works and gives you the name of the method, but if it doesn't, we can dump the object and find the method that way.

In the above output, search for the hex address we calculated above from _methodPtr, and you will get the name of the method which is still holding a reference to the PolicyEditor object and is the reason for the memory leak.

Very good insight and instructions! Unfortunately, they no longer work on Win7/x64. VS2008 issues the following error message:
"The debugger does not support debugging managed and native code at the same time on this platform."

So, I'm attempting to narrow down a process that apparently causes an "OutOfMemory" exception every now and then, with more regular frequency on our client's systems than our own. So we figured it has to be some sort of memory leak.

Before running the process in question, the application has approximately 100 MB allocated to it (via Process Explorer). After running the process, the application retains some 300 MB, with a spike in the middle that carries it up to 450-500 MB.

If I happen to run the process a second time, the memory rises to 500 MB with a spike up to 750 MB, and settles back down to 450 MB after the process is completed. But the memory usage never returns to the 100 MB that it showed before running the process at all.

So, I know several things about this process:
-It operates through DevExpress's XPO library with Sql Server saving a string value.
-Processes a large amount of pasted columnar/row data from a DataTable into an XmlSerializable object that produces a 45-50 MB XML file. This string is then saved to the database.

That's it. To GC tree to where this 90MB of bytes are being referenced, used, called, held, or put on a shelf. It's just allocated and no cause or reason why it is being retained in memory. How do I find out what this massive array of bytes is being referenced by so that I can determine the source of the leak.

PS - I am aware of the string data listed above, and it can be purged from memory if necessary, but it does not appear to duplicate, whereas this Byte[] array does seem to grow with successive executions of the process, without shrinking (much).

Thanks for the nice article.
I followed all the steps given in the article. I did the same for an event handler of one of my objects after running the dumpheap statistics.
I got stuck at the last step: if, even after converting the _methodPtr value to hex, I don't find any matching method entry in the method table, what can I assume? Can I assume that there is no memory leak involving that object? From my dumpheap statistics, though, I can see that the object is occupying more memory.

I have a question. I've heard from some people, and read in some articles on the Internet, about using Windows Task Manager for detecting memory leaks. They said Windows Task Manager is not a very good tool for this, because the information it shows is not exactly what is needed to detect a memory leak. I'm talking about the "Mem Usage" column on the "Processes" tab, which is supposed to contain the main information about a memory leak. Do you happen to know anything about using Windows Task Manager for detecting memory leaks?

I think it's a great way to identify whether you got a memory leak possibility. It's not specific, but it's just an idea to show what your memory is doing. If you suspect a memory leak (e.g. rising memory and very little decrease in memory), then you can check it out with Perfmon like he said.

Granted, I just stumbled upon this article and found it very interesting.

I recently have a program that is allocating a bunch of memory but isn't releasing it (or is releasing it at a poor rate imo). That's how I found this article.

I'm looking for a library that could be included as a reference into a .NET project.

I develop a large software and I would like to include a "report system" that could be fired from clients when they have memory problems (actually it will be automatic). In that way, I will be able to receive and analyse reports to find memory leaks.

I have a medium sized application that is used to do file conversion and index\metadata conversion from different formats to other formats. I have written this application using DLL plugins for both the Source and for the Destination. I have spent hours and hours checking for memory leaks and stuff. I am down to one now that I can't get past.

If my program runs thru say 100,000 records, it will eventually crash (and a hard crash) telling me I am out of system resources. I have finally traced this down to thousands of undisposed Windows.Forms.Cursor objects. What is funny is I don't access\use\create\dispose\modify\etc... any cursor objects anywhere in the 500k lines of code I got. I have even changed all of my GUI objects to not use the WaitCursor, just to make sure.

When it crashes, it is ugly. I can catch it, but can't create any more GUI items (including a simple message box). I can't recover or de-allocate any type of memory, nor can I find where these items are being created so that I can add my own GC to it. Just a pain in the a__.

A friend sent me a link to this article, and it looks interesting. If anything, I would like to try to at least get my application to store its current state and restart when it gets close to the threshold. This has kind of pointed me in the right direction. I just need to figure out how to do this via code.

I mostly just wanted to post this to tell people that if they are writing applications that do a lot of processing, they need to be very careful of memory leaks. Read into the GC. Use GC.Collect and GC.WaitForPendingFinalizers, etc...

So far there are like 3 memory leaks I can recreate in .NET very easily, which I never had to think about before .NET. So, just because it is new doesn't mean it is any better, and even some of the simpler programs need to think about this stuff.

Have you used GC in a solution with many many projects?
You'll be sorry and you will really hate GC. It'll make you wonder about the motivation to have such a useless tool in such a fine product. It just does not make sense. If I wanted to write a recipe or an inventory program for my CDs, why would I need it? I can manage 5 objects easily, can't I? For a very high volume, high performance, 24x7 year-round maintenance-free real-world application, post the URL or references and we will shut up.

I have seen a number of out of memory exceptions that were unexpected, and I have been told that it is likely fragmentation in the large object heap. How can I use Windbg and SOS to view the large object heap and assess the fragmentation?