.NET Garbage Collector PopQuiz – Followup

It was really exciting to see that so many people answered the .NET GC PopQuiz, especially seeing that so many had great answers. Perhaps the questions were too easy:)

The reason I posted the pop quiz in the first place is that, unlike Phil, who commented that none of this should really matter to the developer:), I do think that a good understanding of what happens behind the scenes is important when you are programming on top of a lot of code that you don't control, since it tells you a lot about how to design your app for best performance. Granted, some of it might be of less importance than other things, but still...

Without further ado, here are my answers...

1. How many GC threads do we have in a .NET process running the Server version of the GC on a dual-core machine?

Two, one per processor, or rather one per logical processor, so as many of you pointed out it would have been 4 if it was hyper-threaded. In a process running the workstation version of the GC we would have no dedicated GC threads. Instead, garbage collection runs on the thread that initiates the GC, as there is no point in switching to a different thread for garbage collection when you only have one proc/one thread doing the GC.

Why is this important to you? Since different GC modes are optimized for different things, your memory usage, GC latency etc. may vary a lot depending on what GC mode you are using. For example, a Windows service gets the workstation GC by default, but if there is a lot of throughput (lots of short-lived allocs), you are probably a lot better off running the server GC for memory usage and perf.
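If you want to check which flavor a process actually ended up with, a minimal sketch along these lines should do. GcModeReport is a made-up name for illustration; GCSettings.IsServerGC and Environment.ProcessorCount are the framework members it relies on.

```csharp
using System;
using System.Runtime;

// Hypothetical helper (not part of the framework) that reports the GC flavor
// the current process is running with.
public static class GcModeReport
{
    public static string Describe()
    {
        // IsServerGC is true only when the server GC is really in use,
        // regardless of what the configuration asked for.
        string flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return string.Format(
            "GC flavor: {0}, logical processors: {1}",
            flavor, Environment.ProcessorCount);
    }
}
```

Logging the result of GcModeReport.Describe() at startup is a cheap way to catch a service that silently runs with a different GC than you expected.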

Btw, I really enjoyed the fact that Brian used the debugger to find this out, that's the spirit:)

2. What GC mode is used in the web development server (cassini) on a quad proc machine? Why? (you can choose from server, workstation or concurrent-workstation)

Nice catch for those of you who figured out that it was workstation GC because it is a winforms app. More specifically it is concurrent workstation meaning that most of the GC stuff will be done while other threads are executing to avoid pausing the process.

Typically you wouldn't stress test against the web development server, but if you do any kind of memory investigation, looking at when memory is released etc. you need to know that what happens on your web dev server is probably not the same thing that will happen on your multiproc web server in terms of garbage collection.

3. How many finalizers do we have in a .NET process running the Server version of the GC on a quad proc machine?

This is probably one of the more important questions, and there were different bids on this but I think most people answered one per process which is correct.

Ok, so why is this important to you? Well, it's just a reminder that any objects you create that require finalization (i.e. that have a finalizer method or a destructor) will have to go through one single point (the finalizer thread) unless the object is disposed and GC.SuppressFinalize is called.
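A minimal sketch of that dispose-and-suppress combination could look like the following. The class name and the placeholder cleanup are made up for illustration; only IDisposable and GC.SuppressFinalize come from the framework.

```csharp
using System;

// Hypothetical wrapper around some native resource; all names are illustrative.
public sealed class NativeResourceHolder : IDisposable
{
    private bool disposed;

    public bool IsDisposed
    {
        get { return disposed; }
    }

    public void Dispose()
    {
        ReleaseResource();
        // The resource is already released, so tell the GC this object
        // no longer needs a trip through the finalizer thread.
        GC.SuppressFinalize(this);
    }

    ~NativeResourceHolder()
    {
        // Safety net: only runs (on the single finalizer thread)
        // if Dispose was never called.
        ReleaseResource();
    }

    private void ReleaseResource()
    {
        if (disposed)
        {
            return; // makes repeated Dispose calls harmless
        }
        // Release the native handle here.
        disposed = true;
    }
}
```

Wrapping an instance in a using statement gives you the Dispose call even when an exception is thrown.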

4. When is an object garbage collected?

Now this is interesting, and there were a lot of different bids on this one, but I think Brian said it best:

When a GC occurs which happens either when you allocate an object which makes gen 0 exceed its current capacity, or when GC.Collect is explicitly called and the object is no longer referenced.

A few clarifications here:

a) If we exclude GC.Collect calls, this means that a GC will only occur on allocation. I have mentioned this before, but I think it is worth mentioning again... a classic mistake to make is to run a stress test and then come back 10 mins after the stress test and wonder why memory is not being released. In other words, objects may well be ready to be released but no allocations are made, meaning no GCs will occur, so memory usage will stay flat.

b) There are places in the framework where GC.Collect is called. NativeCPP mentioned that a GC would occur when there is memory pressure. I suspect he meant when a gen limit was reached, but it will also occur in ASP.NET apps when we are closing in on the memory limit set in IIS/machine.config. To see if the app is calling GC.Collect, you can look at the perfmon counter .NET CLR Memory/# Induced GC.

Recently I also found another location where GC.Collect is called. In some parts of the System.Drawing namespace GC.Collect is called to avoid having too many handles to brushes/fonts etc. in the process and getting a stale process because of this. Typically no one should run into this, since it is only likely to happen in a server app and System.Drawing is not supported in server apps according to the MSDN docs, but still it happens.

c) Regarding the object no longer being referenced, a reference in this case can be a lot of things, but in short, references are typically:

strong - static/global objects, including cache or in-proc sessions, since they are rooted in static objects

reference counts, like x.static suggested - mostly for COM wrappers

stack - objects that are still alive on a thread

pinned objects - usually happens when there are native API calls, or remoting/webservices involved

finalizer - if the object has a finalizer the object will still be garbage collected, but it will hang out afterwards, waiting to be finalized
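To check point a) for yourself, you can sample the collection counts before and after an idle period. This is only a sketch: GcActivity is a made-up helper, and it relies on GC.CollectionCount, which reports how many times a given generation has been collected since the process started.

```csharp
using System;

// Hypothetical helper that captures how many collections each generation has seen.
public static class GcActivity
{
    // Returns { gen 0 count, gen 1 count, gen 2 count } at the time of the call.
    public static int[] Snapshot()
    {
        return new int[]
        {
            GC.CollectionCount(0),
            GC.CollectionCount(1),
            GC.CollectionCount(2)
        };
    }

    // True if any generation was collected between the two snapshots.
    public static bool AnyCollectionBetween(int[] before, int[] after)
    {
        for (int gen = 0; gen < before.Length; gen++)
        {
            if (after[gen] > before[gen])
            {
                return true;
            }
        }
        return false;
    }
}
```

If two snapshots taken before and after the quiet period come back identical, no GC ran in between, which would explain why memory did not drop.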

After some of the comments I realized that the wording of the question is a bit ambiguous. What I meant with the question was basically "when does a collection occur", but I am adding in some comments from Maoni that would answer the question I actually posed, "when is an object collected":)... plus a comment on what NativeCPP had mentioned. Just goes to show, you learn something new every day:)

"Tess, re question 4. When is an object garbage collected?

When a GC occurs which happens either when you allocate an object which makes gen 0 exceed its current capacity, or when GC.Collect is explicitly called and the object is no longer referenced.

This is a bit incorrect - the correct answer should be:

When a GC that collects the generation your object is in happens, your object, if it is dead, is collected. If your object is in gen2 and we are only doing a gen0 collection, your object is not gonna be collected.

>>>NativeCPP mentioned that a GC would occur when there is memory pressure, I suspect he meant when a gen limit was reached, but it will also occur in ASP.NET apps when we are closing in on the memory limit set in IIS/machine.config.

Actually he/she is right - we do trigger GCs when the machine is under memory pressure. This is described in my first Using GC Efficiently blog entry."

5. What causes an object to move from Generation 0 to Generation 1 or to Generation 2?

If the object is still referenced during a garbage collection it will automatically move into the next generation. This includes objects referenced by the finalizer (the freachable queue).

Brian mentioned that they would not be moved if they are pinned... I don't want to make a categorical statement here since I might very well be wrong, but I don't see why pinning would make it not move into the next generation. The term "move" here is somewhat figurative. The objects don't necessarily move; instead the generation lines move, so that at the end of each garbage collection Gen 0 will always be empty, meaning that if a pinned object was located in Gen 0, by the end of the GC it would have to be in Gen 1.
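You can watch the promotion happen with GC.GetGeneration. The sketch below uses a made-up helper, PromotionDemo, that records which generation a still-referenced object reports after each forced collection. The exact numbers depend on the runtime and GC mode, so treat the output as something to observe rather than rely on.

```csharp
using System;

// Hypothetical helper that tracks the generation of a surviving object.
public static class PromotionDemo
{
    // Returns the generation of 'survivor' before any collection,
    // followed by its generation after each forced collection.
    public static int[] GenerationsAcrossCollections(object survivor, int collections)
    {
        int[] generations = new int[collections + 1];
        generations[0] = GC.GetGeneration(survivor);

        for (int i = 1; i <= collections; i++)
        {
            // Forced here only for demonstration; 'survivor' is still
            // referenced, so it survives each collection.
            GC.Collect();
            generations[i] = GC.GetGeneration(survivor);
        }

        // Keeps the reference alive until this point.
        GC.KeepAlive(survivor);
        return generations;
    }
}
```

Calling PromotionDemo.GenerationsAcrossCollections(new object(), 2) and printing the result shows how the object ages. Note that a large object would report the highest generation straight away, since the large object heap is collected together with Gen 2.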

6. If you look at the GC sizes for Generation 0, 1 and 2 in perfmon, why is most of the memory in the process in Gen 2?

As Stefan and others mentioned gen 0 and 1 have fairly small sizes, and once these are reached the objects in there that are still referenced move into Gen 2. Although the sizes are not "fixed" as Stefan suggests, but rather dynamic over the life of the process, in order to get the most value out of each GC, they are still limited, and objects will only stay there until the next Gen 0/Gen 1 GC as opposed to Gen 2 where referenced objects will stay forever.

In other words, given a limit x for Gen 0 and y for Gen 1, the rest of the .NET memory usage (for managed objects) has to be in either Gen 2 or the large object heap. No matter how good your allocation pattern is, there is no way that you can fit say 100 MB in Gen 0 and Gen 1:) The trick is just to not let objects spill over to Gen 2 and then die immediately so that you have a lot of turnover in Gen 2.

7. How many heaps will you have at startup on a 4 proc machine running the server GC? How many would you have if the same machine was running the workstation GC? Will the memory used for these show up in private bytes or virtual bytes in perfmon or both?

In retrospect I should have been a little more specific here. Some people mentioned the runtime heaps, NT heaps, loader heap etc. and I have to admit, I was just too snowed in in my own little .NET object world when I wrote this question. What I meant was, how many .NET GC heaps will you have. Even there the question is debatable. In an interview situation I would have said that 4 was ok, but what I really wanted the answer to be was 8: 4 small object heaps and 4 large object heaps.

Ok, so why am I such a stickler for this number of threads and number of heaps bladibladibla? Well, a lot of people pose the question: why do I have so many virtual bytes at the startup of a .NET process, and why do virtual bytes go up in chunks? When you look at that, it is important to know how much of that memory goes to these GC heaps, and also to know that they will probably eventually be filled with .NET objects, so a large difference between private bytes and virtual bytes at the startup of the process is not necessarily a sign of something really bad going on.

8. (Leading question:)) Is the fact that you have mscorwks.dll loaded in the process in 2.0 an indication that you are running the workstation version of the GC?

Ok, that was probably not one of the best questions:) As pretty much all of you figured out, both workstation and server now live in one single dll called mscorwks. You can check out !eeversion to see which one you are running and, in the server case, with how many heaps.

9. Can you manually switch GC modes for a process? If so, how and under what circumstances?

Surprisingly, a lot of people talked about gcserver enabled=true, and then answered no to this question:) For the correct answer, check dal's response

a) you cannot run the server version on a single proc box, it will default to workstation

b) you cannot run concurrent while also running server

c) if the runtime is hosted, the host's GC mode will override the configuration
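For reference, the settings people mentioned live under the runtime element of the application's configuration file. This is a sketch of the general shape for a standalone executable's app.config, subject to the caveats in a) through c) above; where the file lives for a hosted runtime depends on the host.

```xml
<configuration>
  <runtime>
    <!-- Ask for the server GC; ignored on a single proc box (see a) -->
    <gcServer enabled="true"/>
    <!-- Concurrent GC applies to the workstation flavor only (see b) -->
    <gcConcurrent enabled="false"/>
  </runtime>
</configuration>
```

Since these values are read when the runtime starts, changing them takes effect the next time the process is started.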

10. Name at least 2 ways to make objects survive GC collections unnecessarily.

There are plenty of ways to do this and a lot of you had good answers on this one. To mention two... create an unnecessary finalize method, or write code that causes objects to have a mid-life crisis, i.e. create a function that sets up a lot of objects and then goes on to call a long-running operation (database request or webservice call). The objects are then rooted by the thread during the whole long-running operation, giving the process a good chance to perform a GC in the meantime.

In the first case (finalizer), dispose the objects when you are done. In the second case (mid-life crisis), set the objects to null if you are not planning on using them anymore so that the GC knows that they are ready for cleanup.
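One way to get the same effect as setting things to null is to keep the temporary objects inside a helper method, so they are out of scope before the long-running call starts. The sketch below is only an illustration of that idea; ReportJob, BuildRequest and the slowCall delegate are made-up names standing in for your own setup code and your database or webservice call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical job that prepares a request and then makes a slow call.
public static class ReportJob
{
    public static string Run(Func<string, string> slowCall)
    {
        // Only the finished request string is still referenced here;
        // the temporary list built inside BuildRequest is already unreachable.
        string request = BuildRequest();

        // Stand-in for the long-running database or webservice call.
        return slowCall(request);
    }

    private static string BuildRequest()
    {
        // Short-lived working set: it dies with this method instead of
        // staying rooted by the thread during the slow call.
        List<string> rows = new List<string>();
        for (int i = 0; i < 3; i++)
        {
            rows.Add("row" + i);
        }
        return string.Join(",", rows.ToArray());
    }
}
```

The design choice here is simply to let method scope do the work, which avoids sprinkling null assignments through the code.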

11. Can a .NET application have a *real* memory leak? In the C++ sense where we allocate a chunk of memory and throw away the handle/pointer to it?

Again there were a lot of good answers in the comments. Although you can't leak a .net object in the classic sense of the word, i.e. create an object and throw away the pointer, unless you are in unsafe mode, you can do plenty of things to create memory leaks.

Btw, there are also plenty of ways to create high and increasing memory usage in .NET apps by rooting objects without realizing that you are rooting them. Take a look at the memory issues section of my blog for a few of the ways you can do this.
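A classic example of accidental rooting is subscribing to a static event and never unsubscribing. In the sketch below, AppEvents and CachedView are made-up names; the point is that the static event's delegate list holds a reference to every subscriber until the handler is removed.

```csharp
using System;

// Hypothetical application-wide event source. Because the event is static,
// its delegate list acts as a root for every subscriber.
public static class AppEvents
{
    public static event EventHandler SettingsChanged;

    // Number of handlers currently rooted by the static event.
    public static int SubscriberCount
    {
        get
        {
            EventHandler handlers = SettingsChanged;
            return handlers == null ? 0 : handlers.GetInvocationList().Length;
        }
    }
}

// Hypothetical subscriber that would stay alive forever without Dispose.
public sealed class CachedView : IDisposable
{
    public CachedView()
    {
        // From here on, AppEvents keeps this instance reachable.
        AppEvents.SettingsChanged += OnSettingsChanged;
    }

    private void OnSettingsChanged(object sender, EventArgs e)
    {
        // React to the change here.
    }

    public void Dispose()
    {
        // Removing the handler is what makes the instance collectable again.
        AppEvents.SettingsChanged -= OnSettingsChanged;
    }
}
```

Each CachedView that is created and dropped without Dispose leaves one more entry in the delegate list, which shows up as slowly growing memory.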

12. Why is it important to close database connections and dispose of objects? Doesn't the GC take care of that for me?

I think pretty much all of you got this one:) To paraphrase Arnaud, "The finalizer will eventually be called, after the object has been made available for garbage collection. Knowing that there may be quite some time until an object gets GC'ed, and that many resources are limited, you call Dispose yourself as soon as you're over with an object. It doesn't get GC'd when you call Dispose, but it releases its resources."

And of course, you avoid dragging it through the Finalizer thread.

Oh, btw, if you enjoy this kind of thing, and you live in the Seattle area, you may just want to check out Maoni's blog, I hear they have a job opening in the GC team, although I'm sure the interview questions there will be a bit harder than this quiz:)

For some reason, on the question about manually switching GC modes, I thought you meant after the process has already started up (hence my comment about doubting it but if it’s even possible it would probably be in a hosting scenario).

Thanks for the great quiz (and please keep ’em coming). I think we all learned a lot (as we always do from your posts!)

Most, if not just about all of these questions / answers really should not matter to the developer. They are specific to the implementation. There are Microsoft’s (multiple) versions, Mono’s version, and a few other minor versions out there as well. Garbage collection is done differently on each version. Threads are allocated differently as well.

Saying that, you did specifically mention ".NET", so that would be (a) Microsoft implementation. The discussion is interesting, but it does not mean much when clients execute my applications under alternative implementations.

I find all this commenting about these questions not mattering to the developer a bit tiresome. I read this blog precisely because I’m the kind of developer who cares about this stuff. It’s a great point that a fair number of implementation details vary depending on whether you’re talking the CLR, SSCLI, Mono or any other implementation of the CLI (though it’s a lousy reason to avoid understanding them!). These kinds of issues should definitely be kept in mind (and explored) so you can keep a broader perspective when making design decisions. I think the more a developer takes time to understand what’s under the covers, the better off they’ll be. There’s not enough time in one career to be the expert in every detail, but I’m gonna retire trying…

To chime in on what Brian said, if implementations of .NET on other platforms matter to you, then you should learn the equivalent paradigms from the MS implementation on the other ones. The GC is one of the most integral parts of .NET, and a good understanding of how it works (and more importantly, what NOT to do) can make or break an application.

If you’re a C/C++ developer, you certainly know how all the different memory allocation APIs work, right? Point is that stuff like this touches every single area of an application, and is therefore integral to understanding it.

You are completely correct, and apologies for not catching this when I copied and pasted Arnaud's answer. Dispose will not automatically be called, but in most cases, as you mentioned, the finalizer/destructor will call Dispose.

Quote: "[…] if implementations of .Net on other platforms matters to you, then you should learn the equivalent paradigms from the MS implementation on the other ones."

I care about how my applications run on all platforms, regardless of the implementation. Optimizing it for a particular implementation does not optimize it for all implementations.

I agree with your point about C/C++. However, the crucial difference is that there I must, at the very least, recompile and, at the most, port the code before running it on another platform. My .Net applications are compiled once and run on multiple platforms. There is a world of difference there.

While I am interested in the inner workings, they are implementation details and I should not and cannot rely on them being consistent on multiple platforms.

>>>When a GC occurs which happens either when you allocate an object which makes gen 0 exceed its current capacity, or when GC.Collect is explicitly called and the object is no longer referenced.

This is a bit incorrect – the correct answer should be:

When a GC that collects the generation your object is in happens, your object, if it is dead, is collected. If your object is in gen2 and we are only doing a gen0 collection, your object is not gonna be collected.

>>>NativeCPP mentioned that a GC would occur when there is memory pressure, I suspect he meant when a gen limit was reached, but it will also occur in ASP.NET apps when we are closing in on the memory limit set in IIS/machine.config.

Actually he/she is right – we do trigger GCs when the machine is under memory pressure. This is described in my first Using GC Efficiently blog entry.

Alright, Stephens…if you’re going to get all technical on us, you’d better be prepared to add "and the object doesn’t implement IFinalizable"…we can play this game too 🙂 And don’t even make me go all "rude thread abort" on you! 🙂

I have a question about % time in GC and the 3 generations. I have an app that has a very low % of time in the GC, but the ratio between the generations doesn’t seem to be ideal. The average % of time in the GC is 2.27%, but the ratio between gen0 and gen1 is .65, and the ratio between gen1 and gen2 is .19. Should I be looking further, or since the % is so low, just ignore the ratio?

daveblack, on 32-bit the usermode process (the .net process) can address 2 GB, unless the /3GB switch is added to boot.ini (only available on some operating systems, and you need to really think about it before using it as it "steals" 1 GB from the OS).

On 64-bit it’s some ginormous number (16 TB I think), but in both cases, the amount of mem you use (in all processes on the system) has to be backed by RAM or page file.

In workstation mode there is one .net GC heap + a large object heap, and each heap can have multiple segments. In server mode there is one .net GC heap + 1 LOH per logical processor and each of them can have one or more segments. Then there is also a .net loader heap, and several NT heaps used by a process, so there is a variety of different heaps involved, depending on how you see it.

Segment sizes for the GC heaps vary between .net versions, service packs, GC modes and architectures.

I am dealing with a heavy web application that runs around 120 users. The worker process grows to 1.8 GB and then it crashes. We are running a 12 GB server with Win2k3 Enterprise 32-bit and the /3GB parameter in boot.ini.

Isn’t there a way, without changing code, to "collect the garbage" more often? The vendor told us that the system requires 16 MB per user, but when users log off the memory is not released, new users cannot log in, and eventually it all crashes.

We initially used process recycling, but then all users lost their sessions.

Given that you commented on this post, I assume you already read up on how/when the GC collects etc.

If your process goes up to 1.8 GB and then crashes (with OOMs), your problem is probably not that the GC doesn’t collect often enough, but rather that the objects are not collectable when the collections occur.

You can set a memory limit on the app pool at which the GC starts scavenging cache etc. more aggressively, but based on what you are saying above that probably won’t help.

You need to get a couple of dumps of the system as memory grows and investigate to see what the memory is used for and why the objects can’t be collected even after the users log off. Perhaps they store a lot of items per user in session scope and then, unless you abandon the session, the session will be alive 20 minutes after the user hits their last page. If that is the case, and the mem is needed, perhaps you could add code to abandon the session on a logout.

We are dealing with a 3rd party product, so we have no access to source code. We also tried to spawn multiple worker processes for the pool that holds the application, but then it stops working. That means that we have 1 worker process and a max of 3 GB (using /3GB) of memory.

I understand that 3 GB is the max a worker process can use on a 32-bit system.

Is it possible that the number of users we are trying to crunch is so high (100-150) that even though the GC is doing its work, the process still reaches a limit and then causes an OOM error?

The vendor claims that a user requires 16 MB, but I can clearly see that when I log in, the memory reserved is around 30-35 MB. So 100 users x 30 MB is already 3 GB.

Anyway, I will follow your advice and create some dumps to send to Microsoft and see what they have to say.

Updating my issue in case someone else has a similar problem. After careful analysis of the debug diagnostic dumps, we have concluded that the issues are caused by multiple datatables put in session. The vendor of the application did not expect a high user load on the system and did not build the application with that in mind.

Since we are quickly filling the memory that a worker process can use on a 32-bit system [2 GB], and we are now targeting 300 concurrent users, our solution is to load balance the application based on the average amount of memory a user session requires [20 MB in our case].

What I mean is that whenever you have a finalizer or destructor, that finalizer/destructor has to be called by the finalizer thread before you can garbage collect the object.

If you close or dispose the object, and if the close/dispose calls GC.SuppressFinalize, the finalizer thread no longer has to call your finalizer/destructor. Instead your object is ready to be garbage collected as soon as it is not referenced anymore.

An Excellent Blog. Very informative. We have .NET app using ServicedComponent running in DCOM (dllhost). We consistently see the memory reach close to 2GB, but always wondered why the GC doesn’t seem to be running on idle.