Windows Error Reporting and the AppDomain.UnhandledException Event

Sometimes applications fail. If you are the author and one fails on your machine, you typically fire up the code in a debugger, figure out the issue, fix it, and rebuild the code. If the application is out in the wild, perhaps with millions of users, it gets much more complicated. Yet the goal is still to fix the application to improve the user experience. Without knowing the details of the failures, they’re very difficult to fix.

In the old days, people used to write amazing exception handlers in their applications. The FoxPro world had several folks giving talks around the world just on good exception handlers. We called them error handlers at the time. They would try to capture as much information about the error as possible, perhaps including the user’s machine configuration, filenames, call stacks, etc. Some would even include recorded user actions.

The handlers would try to send the information back to the application author. That raises a question of trust: the application user has no idea what an application really does under the covers (even in non-error conditions). It could be secretly gathering credit card numbers. That’s one of the biggest reasons why Easter Eggs in software are verboten: the author has hidden some code in the application that could be doing anything.

In the old days, not many people were concerned with users’ privacy. In fact, when you run an application, or even visit a web site, you’re actually asking someone (perhaps a remote computer or software developer) to do something on your own machine. Perhaps this is merely showing some text, but it could be running a script that reads your personal information or plants a virus. Keep in mind a computer can execute a billion instructions in a second.

In order to handle errors, some code has to run to handle the exception. Without operating system support, a totally crashed application wouldn’t be able to run exception handling code.

Many years ago, Microsoft was getting lots of reports of “Blue screens”: crashes of Windows. These were seriously hurting customer experiences, and even today, BSOD, or Blue Screen of Death, is a well-known term. Instrumenting Windows to have an error recording and reporting service led to a huge reduction in these errors. It turns out that at one point, 90% of the BSODs were due to poorly written device drivers (specifically video drivers). To lend credence to that, device drivers run in kernel mode, and thus have full access to the entire system. User applications run in “User mode”: a user application is isolated from the system, so it’s much harder for it to crash the entire computer; users experience only the application failing. It wasn’t necessarily all the fault of the device driver authors: perhaps the documentation or device driver architecture could have been improved to prevent erroneous code.

Thus was born Windows Error Reporting (WER), which enables Microsoft to capture information about application failures when they occur and send it to the application authors (internal or external to Microsoft) so that the failures can be fixed.

When an application fails, WER is invoked. It checks various settings to determine whether to upload the data, or present a dialog first.

The end user or the Domain policy for the user’s machine can control this behavior.

Microsoft is very concerned about preserving privacy when capturing information, especially without consent. We’re also concerned about using customer-paid network bandwidth when sending data.

WER encapsulates and provides common exception handling tasks so all applications can take advantage of them:

1. Common user permissions, control, consent and user experience

2. Capture information

3. Send it back, handling various connectivity states

Sometimes you want to write custom data to the Windows Error Reporting service. This requires capturing all possible crashes before the OS WER kicks in, then formatting the data and calling the WER functions to package up the info and send it. You can easily add a Try..Catch block at the main entry point of your application. The Catch can then capture information about the error. However, Try..Catch only catches exceptions on the currently executing thread. Thus you must surround all your thread routines with Try..Catch. This gets cumbersome, especially with ThreadPools, Tasks, and Windows Workflow Foundation Activities.
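The thread-affinity point is not specific to .NET. As a rough illustration only (this is not the post's sample code, and every name in it is hypothetical), here is a minimal Python sketch. It starts two threads from inside a try/except: one routine has no handler of its own, the other wraps its body in its own try/except. Only the handler that lives inside the thread routine ever fires.

```python
import threading

caught_outside = []  # filled only if the try/except around start() fires
caught_inside = []   # filled by the try/except inside the thread routine


def failing_routine():
    # Raised on the worker thread, not on the thread that called start().
    raise ValueError("failure on worker thread")


def guarded_routine():
    # The handler must live inside the code that runs on the worker thread.
    try:
        failing_routine()
    except ValueError as exc:
        caught_inside.append(exc)


for routine in (failing_routine, guarded_routine):
    try:
        worker = threading.Thread(target=routine)
        worker.start()
        worker.join()
    except ValueError as exc:
        # Never reached: the exception unwinds the worker's stack, not this one.
        caught_outside.append(exc)

print("caught outside:", len(caught_outside), "caught inside:", len(caught_inside))
```

One difference to keep in mind: for the unguarded routine, Python prints a traceback to stderr and the process keeps running, whereas an unhandled exception on any .NET thread normally ends the process.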

The sample code adds 3 controls to a window: a checkbox, a button and a textbox.

The textbox just shows what’s happening.

The button throws an exception from a threadpool thread. Even though the code is surrounded by a Try..Catch, the exception is thrown on a different thread, so the Try..Catch does not catch it. This simulates what happens to users when they write code that spawns threads.

Now click on the checkbox to subscribe to the UnhandledException event for the AppDomain, then click the button to cause the exception.

The UnhandledException handler is invoked and writes an event to the event log. It also shows a separate window describing the exception. When that window is closed, the application terminates but does not invoke WER because, to the OS, it did not crash.