The True Cost of .NET Exceptions

Here's an article under Jon's Homepage that was just brought to my attention. It's an interesting analysis of (approximately) the raw cost of throwing an exception. Jon is definitely right that 200 exceptions per hour isn't gonna hurt anything.

However, there are some problems with the analysis, at least in my opinion, because I believe it understates the true costs. I thought it would make a fun exercise to think about the approach taken, because it is an interesting topic and the techniques for thinking about the true cost are generally applicable.

So, asking Jon's kind forgiveness for picking on him, gentle readers, can you tell me why I would say this benchmark understates the cost?

Well, one thing is that it will cost developers a lot of time stepping through exceptions that are part of regular control flow, instead of only breaking into the debugger when a catastrophic failure occurs.

A few things come to mind, but they are probably outside the scope of the "true cost." This is a topic that interests me greatly, though, so I am very curious which specific things you are referring to.

Here are my 2 cents…

1. The time taken doesn't include the cleanup costs of objects associated with throwing the exception.

2. The article discusses ApplicationExceptions. A common exception is the NullReferenceException, which carries the additional cost of throwing a native access violation exception.

3. If exceptions are not handled directly in the function where they are thrown, they may be rethrown at multiple levels. For example, if you throw an exception in an ASP.NET app and it is not handled directly, you incur the cost of moving up the chain to the global exception handler.

4. In the example we have one looping function. In the real world we would have multiple threads executing simultaneously. The added CPU usage may have a detrimental effect on the process if we throw enough exceptions.

5. Common coding practices for handling "unexpected exceptions" involve logging them to the event log or SQL Server, which makes the actual cost of the exception a lot larger.

I have more real-life examples where the cost of an exception is much larger than the cost of just throwing it, but I think I'll stop there. :)
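Point 3 is easy to visualize with a quick sketch. This is Python rather than C# (the mechanics differ, but the shape of the cost is similar), and the function names here are purely illustrative: the farther the handler sits from the throw site, the more frames have to be unwound, and the per-exception cost grows with that distance.

```python
import time

def deep_throw(depth):
    """Recurse `depth` frames, then raise; the only handler is at the top."""
    if depth == 0:
        raise ValueError("boom")
    deep_throw(depth - 1)

def cost_per_exception(depth, iterations=5_000):
    """Average seconds per thrown-and-unwound exception at the given depth."""
    start = time.perf_counter()
    for _ in range(iterations):
        try:
            deep_throw(depth)
        except ValueError:
            pass  # the "global" handler: every intermediate frame is unwound
    return (time.perf_counter() - start) / iterations

# Catching 100 frames away costs noticeably more than catching 1 frame away.
assert cost_per_exception(100) > cost_per_exception(1)
```

The absolute numbers are meaningless; the point is that the per-exception cost scales with how far the exception has to travel before something catches it.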

The cost of an exception depends on the call stack beneath it; it's not fixed, and it's highly dependent on the environment. Every rethrow of the exception multiplies its cost proportionally. So in a situation where an exception traverses a stack that only eventually catches it once, the cost might not be very big. But an exception that traverses a stack where it's rethrown n times costs about n times as much. Also add a slight cost for every specific catch block it traverses without being caught, some cost for any finally blocks that must be executed, and the cost of creating container exceptions, though these are usually negligible compared to rethrowing.

So I think the trap you can fall into is that your initial calculated cost can be multiplied without you ever being aware of it!

Another pitfall: code that throws exceptions might find that, in some circumstances, the exceptional conditions are met far more often than originally anticipated, due to, say, an architectural change or a shift in the data, in which case something that wasn't a problem before could become a serious performance problem.

Putting all this together, I want to stay as far away from exceptions as possible!
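The multiplication described above can be sketched directly. Again a Python illustration with invented names: each catch-wrap-rethrow layer pays the construct-and-raise cost again, so n layers cost roughly n times a single throw.

```python
import time

def raise_wrapped(levels):
    """Each level catches the inner exception and rethrows it wrapped in a new one."""
    if levels == 0:
        raise ValueError("original failure")
    try:
        raise_wrapped(levels - 1)
    except Exception as exc:
        raise RuntimeError("context added at level %d" % levels) from exc

def cost(levels, iterations=3_000):
    """Average seconds per fully wrapped-and-rethrown exception."""
    start = time.perf_counter()
    for _ in range(iterations):
        try:
            raise_wrapped(levels)
        except RuntimeError:
            pass
    return (time.perf_counter() - start) / iterations

# Eight wrap layers cost several times what a single throw-and-catch does.
assert cost(8) > cost(1)
```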

The glaring omission I notice is the cost to JIT the exception handling code throughout the callstack. If you’re throwing exceptions from *the same place* all the time (as the tests do), you only pay once and may not notice that cost over the AppDomain’s lifetime. But, if you’re using exceptions inappropriately *throughout* your code, you’re paying all the costs associated with JITting the exception handlers and keeping the EH pages in memory (plus the opportunity cost of paging out data or other "warm" code).

I would say he is not considering CPU utilization. Yes, his example might run through the exceptions quickly, but he didn’t measure CPU utilization, so if we had more processes running at the same time, things might look very different. I’d also say the stack trace plays a very important role in any realistic project, without even considering serializing it; at that point it just gets nasty.

One time my boss asked why my code had access violations. We couldn’t even step to the place where the ACCVIO was occurring. Finally I figured out what to search for on Google. Not only that, but Google found one hit in the entire world. One person in the entire world had found, and had posted a comment in another Microsoft employee’s blog about, a Microsoft component that used exceptions that way. I don’t remember the details now, but I think it was an editbox control.

I would hazard a guess that perf really suffers based on what you decide to do when you handle the exception. E.g., if you automatically send the error report to some central location (via web service, email, etc.), or just dump it to a log file, those costs are WAY higher than anything to do with the actual throwing of the exception.

(Good design would take care of this, use another thread and do all your reporting async.)

Rico, perhaps this would be a good occasion to explain the vast difference in exception cost that we observed between XP and 2003.

Using a test program similar to Jon Skeet’s (simpler, actually) we observed that the maximum number of exceptions/sec was much lower, even though the XP machine was a common laptop and the server a recent dual Xeon beast.

Presumably throwing an exception causes the CPU to fetch code and data it would otherwise not have executed. This will evict data from the cache that might now need to be fetched again later when it would otherwise have still been in the cache.

So the net result of this is that everything might end up running in slow motion for a short while after the exception has been thrown.

Jon’s test won’t be illustrating this because he’s doing nothing but throwing exceptions. It’d be hard to get a meaningful measure of this outside the context of a specific workload – it seems like the kind of thing where a microbenchmark like this is going to give even more unrepresentative results than normal. You need to be doing meaningful work with a reasonable quantity of data. (Of course you could deliberately skew the results by making the benchmark execute just enough code working on just enough data to have the caches brimming but not quite overflowing, and see how fast you can do work, and then see how much of a difference throwing an exception every N iterations, for various N, including N=1. That should give a particularly bad impression so it’s probably not terribly realistic, but if you could manage to engineer a ‘worst case scenario’ and benchmark it, that would actually be an interesting result, despite being skewed.)
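A sketch of that "worst case" benchmark might look like the following. This is Python, so it won't reproduce the CLR's cache behaviour; it only shows the structure being proposed (real work per iteration, an exception thrown every N iterations, throughput compared against an exception-free baseline), and all the names are invented:

```python
import time

def work(data):
    """Stand-in for 'meaningful work on a reasonable quantity of data'."""
    return sum(data)

def throughput(data, n_between_throws, iterations=20_000):
    """Iterations/sec when an exception is thrown every `n_between_throws`
    passes (0 means never throw)."""
    start = time.perf_counter()
    for i in range(iterations):
        work(data)
        if n_between_throws and i % n_between_throws == 0:
            try:
                raise ValueError("interrupt the hot loop")
            except ValueError:
                pass
    return iterations / (time.perf_counter() - start)

data = list(range(200))
baseline = throughput(data, 0)  # no exceptions
worst = throughput(data, 1)     # N=1: throw on every single iteration
assert worst < baseline
```

Sweeping N from large values down to 1 would give the curve the comment describes: skewed, but an interesting bound on the damage.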

Another possibility: I don’t know if the CLR does this, but a technique I’ve often heard described involves generating code and data structures relating purely to exceptions in completely different pages from the main line stuff. (This improves the density of useful stuff in the pages used most of the time.) So the net result of this would be that there’s a very high cost for the first time you throw a particular exception (and also when you throw one when you’ve not previously thrown one for ages).

This is actually observable in some cases. The suggestion that you should use the computer bleeping 3 times and stopping for 2 seconds as a rough guide to the intrusiveness of an exception doesn’t seem so far off when I think of the "exception premonition" experience. If I’m demoing something in code, I can usually tell when I’m going to see an exception dialog a second or two before it appears, because everything grinds to a halt, and the machine thrashes away for a bit.

Of course a lot of that will be because I’m at the debugger… But it does illustrate the point that throwing an exception may cause loads of stuff to be paged in.

Again, Jon’s test won’t see that because it amortizes that particular aspect of the cost of the exceptions. Then again, if you’re throwing 200 exceptions an hour, the cost may well also be amortized there too, because the exception code paths will most likely be part of your working set.

Of course, allowing this stuff into your working set may amortize the transient costs, but that’s likely to have resultant effects on performance elsewhere. The extent of these will depend on how close you were sailing to the wind on memory usage.

I’m also wondering if there are metadata costs associated with the stack trace. Throwing the same exception from the same place every time will again amortize these, but if you’re throwing from lots of different places, perhaps you could end up materializing all sorts of reflection objects that wouldn’t otherwise get created. (I’d hope that the stack trace is held in terms of metadata tokens or something similar rather than actual MethodInfos until such time as you ask to see them. So I’m hoping this one’s not an issue. Although if you have a logging methodology that writes out the stack trace every time an exception is detected, this could be a real issue…)

I think this scenario is rather misleading. On my system (an AMD Athlon 64, 2.2 GHz, Server 2003 SP1) on .NET 2.0 (32-bit, release build), this benchmark produces about 29,000 exceptions/sec. That means it takes on average about 35 microseconds to throw/catch an exception.

Compared to an hour, that’s tiny. But compared to returning an error code, it’s about 35,000 times more expensive.

Compared to simple system calls (e.g. SetEvent), it’s about 70-100 times more expensive.

It’s also several times slower than writing 128 bytes to a file (where the file write is completed in the FS cache).

To evaluate the 200/hr scenario, one has to ask several other questions. Do these 200 exceptions/hr happen to coincide with the 200 transactions/hr that the application processes? If so, what is the time with and without the exception? If it takes 35 us to process a transaction without the exception and 70 us with it, the exception has doubled your processing time. That’s a big difference.
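The "compared to returning an error code" multiplier is easy to reproduce in spirit. Here is a Python sketch (the absolute ratio on the CLR will differ, and these function names are invented) comparing the failure path of an exception-raising parse against an error-code style parse:

```python
import time

def parse_error_code(s):
    """Error-code style: return (ok, value) instead of raising."""
    if s.isdigit():
        return True, int(s)
    return False, 0

def parse_exception(s):
    """Exception style: int() raises ValueError on non-numeric input."""
    return int(s)

def seconds_per_call(fn, arg, n=50_000):
    """Average seconds per call, swallowing any ValueError raised."""
    start = time.perf_counter()
    for _ in range(n):
        try:
            fn(arg)
        except ValueError:
            pass
    return (time.perf_counter() - start) / n

# On the failure path, the exception style is many times more expensive.
assert seconds_per_call(parse_exception, "oops") > seconds_per_call(parse_error_code, "oops")
```

On the success path the two are nearly identical; the gap only appears when the "error" actually occurs, which is exactly why the frequency of the exceptional case matters so much.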

I agree that this test is too simple, since the exception creation is very cheap compared to the stack unwinding. A much more realistic test would throw an exception that is caught two or three functions up the stack, to let more stack unwinding happen.

7. Stack traces captured into the exception object add overhead: the memory, plus the time and effort to capture them.

8. Unless the delivery mechanism has changed, ring transitions from kernel to user mode occur as the exception is delivered to the Windows subsystem via the kernel.

9. The debugger gets notified, potentially adding more overhead.

10. It may turn into an unhandled exception, which adds more overhead and could also result in terminating the application.

11. Exception handling is thread specific; one may get captured on one thread and rethrown on another thread, which introduces the initial overhead all over again (ring transition, debugger notification, stack walk, etc.).

12. If using catch-wrap-throw to add context, each cwt operation incurs the initial cost all over again.

13. This is speculation on my part, but I would expect that the runtime uses metadata associated with methods to determine where a catch block is located in the JITed code, so it has to load that information as it walks the stack searching for a handler. This may result in more cache misses as the metadata may not be loaded into the cache.

14. Nesting of try-catch-finally blocks is not shown, nor is the method call stack depth increased.

15. The test does not show any finally blocks, which could incur a potentially unlimited amount of processing time. This would be a more pronounced problem if the finally blocks were located lower in the stack than the catch handler – each finally clause would get executed before the catch handler did. An interesting side problem is what happens if another exception gets thrown within a finally block.

16. Not shown are additional time delays due to accessing culture-specific resources to populate the exception message (which typically must be done in production code).

17. Non-local effects and side effects are not measured. In other words, in real code you don’t always know where, or if at all, an exception will be caught and handled, so knowing the total cost is not possible.

18. I’m not sure of the side effects of mixing exception handling with delivering thread aborts into the thread that is walking the stack (i.e. not in a finally block). Also, thread aborts are not delivered to code executing a finally block, but I believe they are to code executing in a catch block. There are some potentially nasty race conditions there (hopefully handled by the runtime).

All that said, it does not mean that exceptions should not be used, just don’t abuse them – you don’t want to write an exception-bound application! The more stringent the performance requirements, the more care must be spent analyzing how to detect and report errors. For example, I would not use exceptions for error handling in a tight loop, but I do not hesitate to use exceptions in applications, and I catch-wrap-throw as needed to add meaningful context information. The extra overhead is often less important than the clarity of a clear description of the problem.
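Item 7, combined with the common practice of logging the trace, is worth quantifying. A Python sketch with invented names: merely catching an exception is cheap next to formatting its stack trace for a log, which is what many "unexpected exception" handlers do on every throw.

```python
import time
import traceback

def failing():
    raise ValueError("boom")

def seconds_per_exception(handler, n=10_000):
    """Average seconds to throw, catch, and run `handler` in the catch block."""
    start = time.perf_counter()
    for _ in range(n):
        try:
            failing()
        except ValueError:
            handler()
    return (time.perf_counter() - start) / n

swallow = seconds_per_exception(lambda: None)        # just catch and drop
log_fmt = seconds_per_exception(traceback.format_exc)  # render the full trace
# Formatting the trace multiplies the per-exception cost several times over.
assert log_fmt > swallow
```

And this only formats the trace to a string; actually writing it to the event log or a database would dwarf even that.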

If you’re throwing thousands of exceptions/second then there’s a design problem. If you’re throwing several hundred per hour then I say party-on.

David’s _magnum_opus_ of a response is full of excellent points. While I can’t say I agree with all of them the spirit seems right on to me and you’ll find my points of disagreement to be not very important compared to the points of agreement when I post my own thinking on the matter (probably later today).

I think what Jon’s article is overlooking is that his timing is done in isolation. He only compares throwing a lot of exceptions with throwing a few.

Generally, if a process is throwing many exceptions, either it is experiencing some catastrophic failure (and does know enough to die), or (far more likely) it’s using exceptions to handle normal program flow.

(It parses a string by potentially throwing an exception on each character.)

If you read my comments on that article (follow the "Discuss in the Forums" link at the top, or take this direct link: http://dotnetjunkies.com/Forums/ShowPost.aspx?PostID=897), I show that a properly written algorithm, which doesn’t abuse exceptions, is 500 (non-debug) to 6000 (debug) times faster.
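The parsing comparison generalizes beyond C#. A rough Python analogue (the helpers here are invented for illustration, and Python won't show a 500x gap since its exceptions are comparatively cheaper, but the direction is the same): one version raises on every non-digit character, the other simply tests each character first.

```python
import time

DIGITS = set("0123456789")

def sum_digits_exceptions(s):
    """Abuses exceptions for flow control: int() raises on every non-digit."""
    total = 0
    for ch in s:
        try:
            total += int(ch)
        except ValueError:
            pass
    return total

def sum_digits_checks(s):
    """Same result, but tests each character instead of catching."""
    total = 0
    for ch in s:
        if ch in DIGITS:
            total += int(ch)
    return total

text = "abc123xyz" * 5_000  # mostly non-digits, so the first version throws a lot

start = time.perf_counter()
slow = sum_digits_exceptions(text)
t_exc = time.perf_counter() - start

start = time.perf_counter()
fast = sum_digits_checks(text)
t_chk = time.perf_counter() - start

assert slow == fast  # identical answers
assert t_chk < t_exc  # but checking beats catching by a wide margin
```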