“The new C++ for the new Windows / Part 0 / Putting bugs into buckets” by Kenny Kerr

Kenny Kerr is a software developer and entrepreneur. He is passionate about software and getting it done right. His new company recently launched Window Clippings 3, a screen capture tool designed to produce the highest quality screenshots.

This is the first article in a new series about C++ for Windows. I would have liked to start the series with a more entertaining topic, but this needs to be said first. There are certain prerequisites for starting a project on the right track, and one of those is defining the approach for dealing with run-time errors. Many writers avoid talking about error handling when it comes to C++ because there are different approaches and differing views on how it should be done. I would like to say that you can use whatever approach suits you. I must, however, prepare the way for the remainder of this series, and without a consistent way of dealing with errors subsequent articles won’t make sense.

As my approach relies on the Standard C++ Library, and in particular the Standard Template Library, the use of C++ exceptions is a given. The challenge then is to come up with a rational strategy for handling run-time errors. First I’ll describe what is to be done with exceptions, and then how run-time errors in the Windows API are handled.

The first rule of exception handling is to do nothing. Exceptions are for unexpected run-time errors. Don’t throw exceptions you expect to catch. That also means you must not use exceptions for expected failures. If there are no exception handlers then Windows automatically generates an error report, including a minidump of the crash, as soon as an exception is thrown. This provides a perfect snapshot of the state of your application without unwinding the stack. This helps you to get as close to the source of the error as possible when you perform postmortem debugging. If you sign up with the Windows Quality Online Services (Winqual) then Microsoft will even collect these error reports, categorize them and provide them to you at no charge. This is tremendously valuable.

The next step is to clearly distinguish between fatal run-time errors, those that will crash your program, and run-time errors that are expected and that you will handle in your program so that it can continue to run. Some of these will be unique to your application, but many will be common to most applications. Keep in mind that unexpected run-time errors reported with exceptions indicate one of two things. Firstly, they may indicate a bug in your application. You assumed the contents of a file are in a certain format when they are not. You expected a folder to exist when it’s actually missing. You sent a message to a window that’s already destroyed. Your algorithm dereferences an invalid iterator. And so on. Secondly, they may indicate problems outside of your control. Some other factor on the computer causes memory allocations in your application to fail. You fail to get the size of your window’s client area. You fail to write a value to the registry. These types of run-time errors typically point to a bigger problem. In both cases you don’t want your application to continue. An exception that results in an error report is the fastest way to bring your application down so that it doesn’t cause any further harm, and it lets you debug the problem when the error report arrives.

On the other hand, many errors can and should be handled by your application. You may expect writing a value to the registry to succeed, but you probably shouldn’t expect reading a value to succeed. Parsing text should be expected to fail. Creating a file may fail if the complete directory structure isn’t already present. And so on. In these cases using exceptions is not usually appropriate. It is usually simpler and more efficient to handle the error directly and as close to the source of the failure as possible.
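As a minimal sketch of handling an expected failure at its source, the hypothetical try_parse_int helper below (an illustration, not part of the article’s listing) reports a parse failure through its return value instead of an exception. It assumes only std::strtol from the standard library.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true and stores the number in 'value' only
// if the whole string is a decimal integer that fits in an int.
// Note that std::strtol itself skips leading whitespace.
inline bool try_parse_int(const std::string& text, int& value)
{
    if (text.empty()) return false;

    errno = 0;
    char* end = nullptr;
    const long parsed = std::strtol(text.c_str(), &end, 10);

    if (errno == ERANGE) return false;                      // out of range for long
    if (end != text.c_str() + text.size()) return false;    // no digits or trailing junk
    if (parsed < INT_MIN || parsed > INT_MAX) return false; // out of range for int

    value = static_cast<int>(parsed);
    return true;
}
```

The caller decides what to do with a false result right where the text was read, and no exception handler is involved.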

Now let’s turn our attention to the many functions in the Windows API and the various ways they report run-time errors. The Windows API is unfortunately not very consistent when it comes to reporting run-time errors. You can think of it as having islands of consistency in a sea of inconsistency. There are four common types used for reporting errors explicitly using a return value.

BOOL

Many functions return a BOOL, a typedef for int, indicating either success or failure. It is best to compare the result against 0 rather than 1, since some functions only guarantee that the result will be nonzero upon successful completion. Some, but certainly not all, of these functions provide a more descriptive error code that you can retrieve by calling the GetLastError function upon failure. The values returned by GetLastError are usually the non-HRESULT error codes defined in the winerror.h header file.
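To illustrate why comparing against 0 is safer, here is a small sketch that compiles without the Windows headers. The BOOL typedef inside the sketch namespace only mirrors the Windows definition, and hypothetical_query is an invented stand-in for an API that reports success with a nonzero value other than 1.

```cpp
namespace sketch
{
    typedef int BOOL;   // same underlying type as the Windows BOOL typedef

    // Hypothetical function: succeeds, but reports success as 2 rather than 1.
    inline BOOL hypothetical_query() { return 2; }

    // Fragile: only recognizes the value 1 as success.
    inline bool succeeded_fragile(BOOL result) { return result == 1; }

    // Robust: any nonzero value counts as success.
    inline bool succeeded(BOOL result) { return result != 0; }
}
```

With this stand-in, succeeded_fragile misreports a successful call as a failure, while succeeded gets it right.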

LONG/DWORD

Different libraries use various typedefs for long and unsigned long, including LONG, DWORD, NET_API_STATUS and others, to return an error code directly. In most cases a return value of 0 indicates success. These functions typically return the non-HRESULT error codes defined in the winerror.h header file directly.

HRESULT/NTSTATUS

Many newer libraries, as well as most member functions of COM interfaces, use an HRESULT return value to report errors. Some functions that have roots in the Windows Driver Kit use an NTSTATUS return value. Both pack a severity, a facility and an error code into a single 32-bit value, although the two layouts differ in detail and their values are not interchangeable. It is not uncommon for values from winerror.h, possibly returned by GetLastError, to be packed inside an HRESULT before it is returned to the caller. An HRESULT or NTSTATUS can have multiple values indicating success and, of course, multiple values indicating failure. You need to check the documentation for any function that you’re using, but in most cases a return value of 0 indicates success and negative values indicate failure. Additional values greater than 0 may be defined to distinguish between different variations of success.
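The packing described above can be sketched in portable C++. The names in the sketch namespace are stand-ins invented for illustration: hresult plays the role of HRESULT, succeeded and failed mimic the SUCCEEDED and FAILED macros, and hresult_from_win32 is a simplified version of what HRESULT_FROM_WIN32 does for ordinary positive error codes. It assumes the usual two’s complement conversion from unsigned to signed.

```cpp
#include <cstdint>

namespace sketch
{
    typedef std::int32_t hresult;              // stand-in for HRESULT

    const std::uint32_t facility_win32 = 7;    // value of FACILITY_WIN32

    // Zero and positive values are success, negative values are failure.
    inline bool succeeded(hresult hr) { return hr >= 0; }
    inline bool failed(hresult hr)    { return hr < 0; }

    // Simplified packing of a winerror.h code: failure bit, facility, code.
    inline hresult hresult_from_win32(std::uint32_t error)
    {
        if (error == 0) return 0;              // ERROR_SUCCESS stays 0 (S_OK)
        const std::uint32_t packed =
            0x80000000u | (facility_win32 << 16) | (error & 0xFFFFu);
        return static_cast<hresult>(packed);   // high bit set, so negative
    }

    // Recover the low 16 bits that hold the original error code.
    inline std::uint32_t code_of(hresult hr)
    {
        return static_cast<std::uint32_t>(hr) & 0xFFFFu;
    }
}
```

For example, ERROR_ACCESS_DENIED (5) packs to 0x80070005, which is negative when viewed as a signed 32-bit value, while S_FALSE (1) still counts as a success.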

A small number of functions have a void return type. This either means that the function cannot fail, usually because whatever resources it relies on have already been allocated, or that any failure will be reported at a later stage. Other return types often signal an error with a sentinel value. This is common in functions that return a handle or pointer to a resource. You need to read the documentation carefully to determine how to distinguish success from failure, as it is not always true that 0 or nullptr alone indicates failure. For example, CreateFile reports failure with INVALID_HANDLE_VALUE whereas CreateEvent returns a null handle.

Finally, for all those internal assumptions in your application, there are assertions. Prefer static_assert for compile-time validation. When that’s not possible, use an ASSERT macro that is compiled away in release builds. I prefer to use the _ASSERTE macro from crtdbg.h as it stops the program in the debugger right on the offending line of code.
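The two kinds of assertion can be sketched as follows. This is a portable approximation rather than the article’s listing: it builds ASSERT on the standard assert macro and keys off NDEBUG, whereas a Visual C++ project would more likely map ASSERT to _ASSERTE under _DEBUG. The percentage function is a made-up example.

```cpp
#include <cassert>

// Compile-time check: a violation fails the build, not the running program.
static_assert(sizeof(long) >= 4, "long must be able to hold a 32-bit error code");

// Debug-only check: compiled away entirely when NDEBUG is defined.
#ifdef NDEBUG
#define ASSERT(expression) ((void)0)
#else
#define ASSERT(expression) assert(expression)
#endif

// Hypothetical function with an internal assumption about its argument.
inline int percentage(int part, int whole)
{
    ASSERT(whole != 0);   // a caller bug, not an expected run-time failure
    return part * 100 / whole;
}
```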

The listing at the end of this article includes error.h and error.cpp used for error handling in subsequent articles.

Although I avoid macros as much as possible, they remain the only solution for implementing debug assertions. I also define VERIFY and VERIFY_, mainly for checking the result of functions called within destructors, where exceptions should not be used. They at least let me assert the result of these functions in debug builds.
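One plausible shape for such macros is sketched below. The definitions are an assumption based on the description above, not a copy of the listing: in debug builds they assert the result, and in release builds they still evaluate the expression but ignore its value. The slot class and release_slot function are hypothetical.

```cpp
#include <cassert>

#ifdef NDEBUG
// Release: still evaluate the expression, but ignore the result.
#define VERIFY(expression) ((void)(expression))
#define VERIFY_(expected, expression) ((void)(expression))
#else
// Debug: evaluate the expression and assert on the result.
#define VERIFY(expression) assert(expression)
#define VERIFY_(expected, expression) assert((expected) == (expression))
#endif

// Hypothetical cleanup function: returns true on success.
inline bool release_slot(int& open_slots)
{
    if (open_slots <= 0) return false;
    --open_slots;
    return true;
}

// Hypothetical resource wrapper whose destructor must not throw.
class slot
{
public:
    explicit slot(int& open_slots) : m_open_slots(open_slots) { ++m_open_slots; }
    ~slot() { VERIFY(release_slot(m_open_slots)); }

    slot(const slot&) = delete;
    slot& operator=(const slot&) = delete;

private:
    int& m_open_slots;
};
```

Unlike ASSERT, the expression passed to VERIFY must survive in release builds because it performs the actual cleanup.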

Namespaces are used to partition types and functions unless they’re specifically designed to work together. The error handling functions are, however, so fundamental that they reside in the root kerr namespace. A few overloaded functions are provided for checking the return value of most functions in the Windows API. Argument matching and integral promotion rules help to funnel the various return types into the appropriate check functions. Specifically, the int overload handles bool and BOOL return types, while the long and unsigned long overloads take care of the rest. The check template function also comes in quite handy when you need to check for a specific value rather than the usual logical success or failure return values.
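The funneling can be observed with a small experiment. The chosen_overload functions below are hypothetical and exist only to report which overload the compiler picks. On Windows, BOOL is a typedef for int, LONG and HRESULT are typedefs for long, and DWORD is a typedef for unsigned long, so the built-in types used here stand in for them.

```cpp
#include <string>

namespace sketch
{
    // Each overload simply names itself so the selection can be inspected.
    inline std::string chosen_overload(int)           { return "int"; }
    inline std::string chosen_overload(long)          { return "long"; }
    inline std::string chosen_overload(unsigned long) { return "unsigned long"; }
}
```

A bool argument reaches the int overload through integral promotion, which ranks ahead of the conversions needed for the other two, so chosen_overload(true) yields "int" without ambiguity. Arguments of type int, long and unsigned long are exact matches for their respective overloads.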

The check functions throw check_failed exceptions. The check_failed type includes a member that holds the error code passed to the check functions or returned by GetLastError. This comes in handy when you receive a minidump: the dump contains the address of the exception, which lets you easily find this error code. This can often be invaluable in determining the cause of the crash.
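A rough sketch of how such functions might look follows. It is written to compile without the Windows headers, so last_error is an invented stand-in for GetLastError, and the sketch namespace replaces kerr. How a single long overload should treat positive values is a guess: here it follows the HRESULT convention that only negative values are failures, while the unsigned long overload treats any nonzero value as an error code.

```cpp
namespace sketch
{
    // Stand-in for GetLastError: a settable per-program value.
    inline unsigned long& last_error()
    {
        static unsigned long value = 0;
        return value;
    }

    // Carries the error code so it can be found in a crash dump.
    struct check_failed
    {
        explicit check_failed(long result) : error(result) {}
        long error;
    };

    // bool and BOOL results: zero means failure, details come from last_error.
    inline void check(int result)
    {
        if (result == 0) throw check_failed(static_cast<long>(last_error()));
    }

    // Assumed HRESULT-style results: negative means failure.
    inline void check(long result)
    {
        if (result < 0) throw check_failed(result);
    }

    // Assumed DWORD-style results: zero means success.
    inline void check(unsigned long result)
    {
        if (result != 0) throw check_failed(static_cast<long>(result));
    }

    // Specific expected value rather than logical success or failure.
    template <typename T>
    void check(T expected, T actual)
    {
        if (expected != actual) throw check_failed(0);
    }
}
```

In keeping with the first rule above, application code would call these functions and not catch check_failed; the error member is there to be read from the minidump.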

Why did I title this part “Putting bugs into buckets”? Well, that’s because Windows Error Reporting categorizes error reports into what it calls buckets. That’s all for today, and as always I’d love to hear what you think.