Of all the great brain teaser and logic games out there, for my money nothing can beat debugging kernel mode software. None of them are as complex, vast, and downright satisfying as solving a complex interaction between complex software running on a complex platform. And if that wasn't good enough, debugging is incredibly self reinforcing. The more you do it the better you get and yet there always seems to be a more complex problem lurking around the corner ready to push the limits of your current skill set.

Hmm, maybe I'm on to something. Maybe someday the Sudoku section in the newspaper will be replaced by a kernel debugging section. I can see it now, my Mom sitting in a chair on Sunday morning with her reading glasses on wondering if the address was pageable or completely invalid. Shigeru Miyamoto will take to the game and soon people across the world will be playing Kernel Debug Heroes on the Nintendo Wii.

Back to Reality

Until that day comes, however, we're left to the small minority, stuck having only each other to entertain with war stories of particularly difficult problems we've debugged. One thing I find particularly fascinating in speaking with others about debugging problems is the process they used to come to the solution. What I find is that people who debug often have a core set of tools and commands they use to get their job done and everyone's is slightly different. Commands they use all of the time I've never heard of, elegant approaches they use to a particular class of problem make mine seem barbaric, and vice-versa.

For a long while I've wanted to put together an article on a random collection of approaches we use to solve particular problems for two reasons. One, I want to see if someone else out there can offer alternate, time saving approaches, and two I'd hope that it would inspire others to open up their chests and share what their unique methods are.

Thus, the following is a mixed bag of techniques we find ourselves going to constantly while tracking down issues.

Application X No Longer Works Properly

We come across these a fair amount, especially in the file system and file system filter space. A filter is installed and application "X" starts to misbehave. These bugs are hard, especially when the application in question is complex and you don't have source for it (Office applications are a particular pain point here).

The main question on these is where to start? Typically you either get a silent failure or a not very helpful error message such as, "Error Saving File". Failures of this type generally fall into one of two categories:

1) API call fails unexpectedly

2) API call succeeds but returns unexpected result

By far the second type of failure is much more difficult to track down, typically requiring quite a bit of time analyzing the application, trace logs, and information from utilities such as FileSpy or IrpTracker. While the first type can also be tracked down this way, there is a trick you can go to in an attempt to quickly track the problem.

The trick is based on this assumption: most applications written for Windows use the Win32 API either directly or indirectly. As most of us know, the Win32 API is built upon the underlying NT native API, which provides a superset of features actually exploited by the Win32 API.

Because I'm trying to confirm or deny that a failing API call is to blame, what I want to know is, "Which API calls are failing during the failure case?" This basically means finding a single point in the system that is always called as a result of an API failure. While there's nothing like this documented, I do know that all Win32 APIs that use native APIs must translate the native NTSTATUS value returned from the O/S into a Win32 API return value. And, as luck would have it, they all call the same API: RtlNtStatusToDosError. This used to be undocumented, but a Google search shows that MSDN has provided the API:

ULONG RtlNtStatusToDosError( IN NTSTATUS Status );

What I want to do then is set a breakpoint on RtlNtStatusToDosError in the faulty application. When the breakpoint fires, I want to see the failure status and the call stack of the failing call. If all goes well, I'll see some obvious API failing and that will lead the rest of my analysis.

For fun, let's see an example. We'll trace trying to save a read only file in Notepad running on our target machine. We'll put a breakpoint in the status conversion routine that prints out the first parameter, the callstack, and then resumes. Note that this is an x86 target.

While this doesn't immediately provide the solution, it does give a starting point for the investigation.

Breaking on a Particular API Call From a Particular Driver

I often find myself in need of setting a breakpoint on a particular API but only when it's called by a particular driver.

For example, say I want to inspect every time the floppy disk driver completes an IRP with IoCompleteRequest. If I try to set a breakpoint in IoCompleteRequest I'm never going to be able to weed through all of the IRPs completing that I don't care about, I really just want the IRPs that the floppy disk driver completes.

Luckily, there's a feature of the Windows PE format that we can exploit: the import address table. Modules that call exports of other modules have something called an IAT which is fixed up by the loader to contain pointers to the locations of the routines in memory. A call "through"the IAT shows up as a pointer dereference in the debugger. For example:

call dword ptr [flpydisk!_imp_IofCompleteRequest (f9cacbf8)]

This instruction dereferences a value that lies somewhere within the floppy disk module that the loader fills in with the location of the real IoCompleteRequest:

1: kd> dps f9cacbf8 l1f9cacbf8 804eef48 nt!IofCompleteRequest

We can then exploit this to our advantage. If we care to see all of the calls that the floppy disk driver makes to IoCompleteRequest we can set an access breakpoint on its IAT entry that points to the real IoCompleteRequest.

To find the value, we can use the X command to look for the symbol that represents the import:

Breaking on a Particular API Call From a Particular Driver When You Don't Have Symbols

The above was easy enough because we had symbolic information for the floppy disk driver, thus it was simple to find the address of the IoCompleteRequest import using the X command.

Let's try to do this again with a driver that we don't have symbols for, such as vmscsi.sys:

1: kd> x vmscsi!**** ERROR: Module load completed but symbols could not be loaded for vmscsi.sys

Oops, so much for that. So, how can we achieve the same result? Quite easily, actually, all we need to do is find the IAT of the module and figure out the module address for ourselves. First thing we'll need is the base address of the vmscsi module. We can do this with the Loaded Modules command and a match string:

This gives us the offset and size of the IAT. Because the IAT is simply an array of pointers filled in by the loader, we can inspect each pointer and what it points to by dumping them out with the dps command.

This is a pretty well known trick at this point but I'm always surprised that people haven't picked up on it since I find it incredibly useful. Say you're having a memory leak on your system and there is a particular pool tag implicated from the !poolused output. The tag doesn't belong to your driver and you'd like to know where it's coming from.

The first option you might think of would be a conditional breakpoint on ExAllocatePoolWithTag. This would work just fine, but as we know, conditional breakpoints can be excruciatingly slow as the debugger needs to break, evaluate the condition, then resume if it doesn't match.

Turns out there's a better way, the "PoolHitTag" global. Simply fill it in with the tag of your choosing and the O/S will break into the debugger whenever that tag is allocated. Let's set a breakpoint in ExAllocatePoolWithTag, see what the tag used is, then see if we can catch it when we hit GO.

As indicated by the call stack, this trick only works if pool tagging is enabled on the target system.

Manually Recreating a Stack

This one is always a fun one. For one reason or another the stack is hosed. The biggest culprit is generally a module on the stack that uses FPO for which you don't have symbols. Here, I showsa recent example I ran into.

Notice how the stack disappears into that module. This is a clear sign that there is more stack to be seen but WinDBG is no longer able to reconstruct it properly. Generally, getting a call stack for the missing piece can be a real pain. However, there is a slacker approach to get "close enough" with minimal effort. All it requires is some ability to recognize EBP frames on the stack.

For some detailed into on EBP frames, check out Stacking the Deck(The NT Insider, Volume 9, Issue 6, November-December 2002). However, the main point you'll need to know about EBP frames is that EBP points to the previous EBP on the stack and EBP+4 points to the return address.

Our trick will involve inspecting the raw contents of the stack after the corrupt point looking for a pointer to a stack address followed by something that looks like an instruction address. If we follow the pointer to the stack address we should see another stack address followed by another instruction address, and so on.

kd> dps bd74e7a4

bd74e7a400000000

bd74e7a8be745e75 savrt+0x18e75

bd74e7ace5d590b0

bd74e7b0e369d7a8

bd74e7b4be74355a savrt+0x1655a

bd74e7b8bd74e7c8

bd74e7bce369d7a8

bd74e7c0e369d7a8

bd74e7c4be741b0f savrt+0x14b0f

bd74e7c800000000

bd74e7cce5da1e68

bd74e7d0e5da1e70

bd74e7d4be741c26 savrt+0x14c26

bd74e7d800000000

bd74e7dce5da1e70

bd74e7e0e1bffb48

bd74e7e4be8329bc SYMEVENT+0xd9bc

bd74e7e800000000

bd74e7ece5da1e70

bd74e7f0bd74e810

bd74e7f4be834e74 SYMEVENT+0xfe74

bd74e7f8e5da1e70

bd74e7fce1374c28

bd74e800e5da1e68

bd74e804bd74e870

bd74e808e1374c30

bd74e80ce1374c28

bd74e810bd74e824

bd74e814be835023 SYMEVENT+0x10023

bd74e818e5da1e68

bd74e81cbd74e870

bd74e820e1374c48

bd74e824bd74e838

bd74e828be82b07f SYMEVENT+0x607f

bd74e82cbd74e870

Do you notice the pattern of the numbers in underline? If you keep following the pointer values you keep finding stack values followed by addresses somewhere inside the SYMEVENT module. This is exactly what an EBP frame looks like on the stack. If you can find this pattern you can get pretty close to finding the real stack with a quick trick.

Take the second frame from the top that you found and feed the possible EBP address, that address minus eight, and the instruction address found in the first frame from the top to the K command.

Again, this is a shortcut to getting a stack back and isn't the entirely correct way to calculate the values. However, it has personally not failed me once in getting close enough to analyze the problem with minimal effort.

That's All For Now

Obviously this is just a start and there are many, many more techniques to be covered. Hopefully in future issues we can go over more of our techniques and even some of yours!