Debug Tutorial Part 1: Beginning Debugging Using CDB and NTSD

Introduction

Debugging is one of the most valuable skills in software development and maintenance, and it is used at every stage of a product's life cycle. The developer first creating the project will inevitably run into bugs, ranging from syntax and compiler errors to logic bugs. Quality assurance may run into problems as more advanced scenarios are tested and the software interacts with other environments. Finally, after release, the product must be supported. Debugging does not end when the customer gets the software; bugs are generally escalated back to the company, which must then debug again.

What is the goal of this tutorial?

This tutorial is merely an introduction to debugging. Consider it "tutorial #1"; I will write more add-ons if the feedback is good. There are so many complex debugging techniques and issues that it's hard to know where to start. This tutorial attempts to start at the beginning and get you acquainted with debugging. I hope to expose novice and intermediate-level programmers to the world of advanced debugging. By "advanced" debugging, I basically mean debugging without recompiling and without resorting to "message box or printf debugging".

Debuggers and Operating Systems

CDB, NTSD and WinDbg

This article will generally talk about Windows 2000 and higher Operating Systems. The three debuggers that we will talk about here are CDB, NTSD and WinDbg. Windows 2000 and higher systems generally have NTSD already installed on the system! This is a big bonus as you do not need to install any extra software for quick debugging.

So what's the difference? The documentation says "NTSD does not need a console window and CDB does". That is true: NTSD does not need a console window in order to run, while CDB does. However, I have found many more differences. The first is that older NTSDs do not support PDB symbol files; they only support DBG! I also found that NTSD does not support the symbol server, while CDB does. Older NTSDs could not create a memory dump, and I've also found other problems, such as NTSD supporting only up to 2 breakpoint commands. NTSD does have one advantage over CDB: the ability to run without a console window.

The ability to run without a console window is vital when you are debugging a user-mode service or process before anyone has logged onto the system. If no one has logged on, you cannot create a console window. There is a command line option, -d, which tells NTSD to communicate through the attached kernel debugger (CDB has the same option). This can be used on processes during startup to debug them through the kernel debugger. While you can already debug a process using the kernel debugger, this gives you the flexibility to debug the process using the user-mode debugger. This is outside the scope of this introductory article; just digest the concept for now.

WinDbg and CDB are basically the same, with a few exceptions. The first is that WinDbg is a GUI application and CDB is a console application. WinDbg also supports kernel debugging and source level debugging.

Visual C++ Debugger

I do not use this debugger and I would not recommend using it. The first reason is that this debugger is a resource hog: it is slow to load and contains more than just debugging tools, which makes it cumbersome. The second reason is that you generally need to reboot after you install it. I generally work off the principle that the machine running or testing the software may not already have a debugger installed. VC++ is also a large, time-consuming installation.

Windows 9x/ME

What can we do on Windows 9x/ME? Well, you can actually use WinDbg. The debug APIs are the same for all systems, so I have long expected that WinDbg should just "work" on Windows 9x/ME. My only concern was that WinDbg might detect that it was on Windows 9x and refuse to debug. I recently found that it does not. The only problem is that the latest WinDbg installs are MSI packages that do not natively install on Windows 9x. This can be solved simply by installing them on an NT-based machine and sharing the directory, or even putting it on a CD. There are side effects, though: do not expect all of the !xxx commands to work, as NT and 9x place their data in different memory locations. Do symbols work? Yes, PDBs work. I did find that stepping through code after setting a ba r1 xxxxx was very slow, though. This article does not cover Windows 9x/ME.

Setting Up Your Environment

Setting up your environment is a very important step before you start debugging. The system needs to be configured to your liking and contain all the tools you need.

Symbols and the Symbol Server

Symbols are an important part of any debug operation. Microsoft provides a location where you can download all the symbols for any particular Operating System (Windows XP, etc.). The problem is that you need a lot of hard disk space, and if you debug many Operating Systems on one machine (from crash dumps, etc.), this becomes cumbersome.

To accommodate this need to debug many Operating Systems, Microsoft supports a "symbol server". This will help you to get the correct symbols onto your system. The symbol server is located here. If you set your symbol path to this location, your debugger will automatically download the system symbols that you need. The symbols that you need for your application are up to you.

Image File Execution Options

There is a location in the registry that will automatically attach a debugger to an application when it starts to run. This registry location is the following:

Under this registry key, you simply create a new registry key with the name of the process you want to debug, such as "myapplication.exe". If you have not used this before, there is probably a default key already created called "Your Application Here" or something similar. You can rename that key and use it if you like.

One of the values on this key is "Debugger". This should point to the debugger you want to start when this application is run. The default for "Your Application Here" is "ntsd -d". You cannot use this unless you have a kernel debugger attached so I would remove the "-d" part.

Note: Keeping "-d" without a kernel debugger attached could lock up your system every time that application is run! Be careful. If you have a kernel debugger set up, you can unlock the system by hitting "g".

There is another value that may be there called "GlobalFlag". This is another tool that can be used for debugging; however, it is outside the scope of this article. For more information on that, look up "gflags.exe".

Kernel Debugging Equipment

In order to kernel debug, you first need to boot the Operating System in debug mode. Although there is a GUI under system properties to do this, I generally edit the boot.ini directly. Locate the boot.ini on the root of your C:\ drive. It is most likely a hidden system file. I would attrib -r -s -h boot.ini and then open it for edit.

Caution: Editing this file incorrectly can prevent you from ever booting again!

Duplicate the line for your Operating System under [operating systems]; the duplicated line can then contain your setup: /debug, then /debugport=port and finally /baudrate=baudrate. The debug port to use is the port on that machine where you would hook up your SERIAL NULL MODEM CABLE. This is a piece of hardware that you need. You will also need another machine. Aside from the COM ports, you can use FireWire, which is a lot faster.

Next time you boot, just select the "Debugger Enabled" selection in order to boot in debug mode.

Environment Variables

I would generally set up _NT_SYMBOL_PATH to point to the Microsoft symbol server and the local directory that contains your symbol information. To set this environment variable, go to System Properties -> Advanced -> Environment Variables.

Default Debugger

This is the default debugger that will be used when any crash happens on the system. By default, it's generally set to "Dr. Watson". That program is not worth mentioning here. The registry key is this location:

Assembly

I highly recommend that you learn assembly programming. These tutorials will not show source level debugging, as I never do it and I don't even know how! The problems with source level debugging are that the source is not always available and that sometimes the problem cannot be seen by just looking at the source, but only in the generated code. Knowing assembly also makes walking the system much easier. If you understand how the environment was set up, you can easily work backwards through the system to find the information you need, which may not always be available using source level debugging.

The other thing I hate about source level debugging is that if the source does not match the symbols, the source debugger will not show you the correct information. This means that if you create multiple builds of your program or change your program after you've built, you better be able to find the source that matches the build you're debugging!

Let's Get Started

This tutorial is basically Part One and if it's liked, I will write more, each getting more and more advanced. This first tutorial will walk through a couple of simple scenarios of user-mode programming problems.

Symbols For Release Executables

First, how do you create symbols for "release" binaries? That's simple. You create a make file that properly rebases the binaries.

This will create the .PDB for your project. Of course, with the introduction of VC++ 7, they have gotten rid of .DBGs (so /debugtype:both may error on this compiler). .DBG is a smaller version of the .PDB and it does not contain source information, strictly symbol look ups. It does not even contain the parameters or anything. If you're using a compiler that can still generate them, here's what you do:

rebase -b 0x00100000 -x $(TARGETDIR) -a $(TARGETDIR)\$(TARGET)

The -b is the new memory location to rebase the executable to. However, this will also strip the debug symbols from the release executable, making it smaller in size. If you build an executable using the default Visual Studio method, it may be a tiny bit smaller than this executable; however, you will not have symbols. The generated code is the same and just as optimized using the optimization flags you specify. The difference is that these binaries are now more useful: no matter where they go or who uses them, you can still get symbols!

Remember, the best debugging always occurs if you do not have to rebuild the executable. Once you have to rebuild the executable, you must also know that you've now changed the memory footprint of the executable. You may also have changed the speed of the executable. This is critical since you now have to reproduce the problem using this binary! What if it took 4 days to cause this problem? It would be best to be able to debug it as much as possible on the spot.

Simple Access Violation Trap

Let's walk through a simple problem. Your program crashes with "Access Violation". This is not uncommon! It is probably the most frequent problem that occurs when running an executable. There are three questions that help in solving this problem.

1. Who is attempting to do the access? What module?

2. What is it attempting to access? Where did the memory come from?

3. Why is it attempting to access it? What does it want to do with it?

These are general guidelines to solving this problem. Number 2 is probably the most important of the three. However, answering 1 and 3 can also help determine 2 if it is not immediately apparent.

I have created a very simple program that crashes. I have set up my default debugger to be CDB and I have now just run the program. I have also created symbols for this executable as well as set the _NT_SYMBOL_PATH to the Microsoft symbol server.

What is the first thing we notice? This trap occurred in MSVCRT.DLL. This is apparent because the debugger generally displays this information using <module>!<nearest symbol>+offset. This means the closest symbol in MSVCRT.DLL is _output and we are +18h bytes into it. Given that this is such a small offset, and provided that the symbols are correct (even symbols can be incorrect, but that's a later tutorial), we can assume that we are in the _output() function of MSVCRT.

A module listing (the lm command in CDB) gives us all the modules in the process with their beginning and ending memory locations. Our trap is at 77c3f10b, and 77c10000 <= 77c3f10b <= 77c63000, so we are definitely trapped in MSVCRT. The next thing to do is find out where this memory came from.
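The range check is simple enough to express in code. The following is a minimal sketch, not part of the original program: the module_range type and the find_module function are hypothetical names, and only the MSVCRT range comes from the text. It treats the end address as exclusive, which is an assumption about how the listing reports ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one line of a module listing. */
typedef struct {
    const char *name;
    uint32_t    start; /* first address of the module */
    uint32_t    end;   /* assumed to be one past the last address */
} module_range;

/* Returns the name of the module containing addr, or NULL if none does. */
const char *find_module(const module_range *mods, size_t count, uint32_t addr)
{
    assert(mods != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= mods[i].start && addr < mods[i].end) {
            return mods[i].name;
        }
    }
    return NULL;
}
```

Given a single entry for MSVCRT covering 77c10000 to 77c63000, looking up the trap address 77c3f10b returns that entry's name, while an address outside every range returns NULL.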

There are a few methods of doing this. We could un-assemble the code and attempt to find out where the memory came from, or we could get a stack trace and figure out who's on the stack. Let's first attempt to disassemble the _output function to see where the memory came from.

I have highlighted all the important instructions to look at. Even if you do not know assembly, you will want to hear this out for what it is. First, we notice that the memory is coming from EAX. It's a register in the CPU, but we can just consider it a variable. The []s around EAX are the same as doing *MyPointer in C. This means we are referencing the memory pointed to by EAX. Where did EAX come from? EAX came from [EBP + 0Ch], which you could think of as "DWORD *EBP; EAX = EBP[3];". This is because in assembly, there are no types. EAX is a 32 bit (DWORD) register. Dereferencing the DWORD at EBP + 12 is the same in C as adding 3 to a DWORD pointer (or adding 12 to a byte pointer and then typecasting to a DWORD pointer).
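To make the pointer arithmetic concrete, here is a small sketch in portable C. It only mimics the memory read; read_dword_at is a hypothetical helper name, and an ordinary array stands in for the memory that EBP points to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mimics MOV EAX, [base + offset]: read the 32-bit value stored
   `offset` bytes past `base`. memcpy sidesteps alignment and
   aliasing concerns in portable C. */
uint32_t read_dword_at(const void *base, size_t offset)
{
    uint32_t value;
    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With a four-element uint32_t array named frame standing in for the memory at EBP, read_dword_at(frame, 0x0C) yields the same value as frame[3], which is exactly the "add 12 bytes" versus "add 3 DWORDs" equivalence.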

The next thing to look at is MOV EBP, ESP. ESP is the STACK POINTER. As you should know, parameters (depending on calling convention and optimizations) are pushed on the stack, return addresses are pushed on the stack and local variables are on the stack. ESP points to the stack! In memory, a function call would look like this for the C calling convention:

[Parameter n]
...
[Parameter 2]
[Parameter 1]
[Return Address]

Now, we see PUSH EBP. PUSH means put something on the stack. So, we are saving EBP's previous value on the stack. So, our stack looks like this now:

[Parameter n]
...
[Parameter 2]
[Parameter 1]
[Return Address]
[Previous EBP]

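After MOV EBP, ESP, EBP points at the saved previous EBP, so [EBP + 4] is the return address, [EBP + 8] is the first parameter and [EBP + 0Ch] is the second. This being the case, we know that our variable came from the second parameter of _output. So, now what? Let's un-assemble the calling function! We know that EBP + 4 points to the return address, or we could try to just get a stack trace.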
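If it helps, the frame can be pictured as a C structure whose first field sits at the address EBP holds. This is only a sketch of the layout described in this section, assuming a 32 bit frame with the C calling convention and a standard PUSH EBP / MOV EBP, ESP prologue; frame32 and second_parameter are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lowest address first, as seen from EBP after the prologue. */
typedef struct {
    uint32_t previous_ebp;   /* [EBP]        saved by PUSH EBP      */
    uint32_t return_address; /* [EBP + 4]    pushed by the CALL     */
    uint32_t parameter1;     /* [EBP + 8]                           */
    uint32_t parameter2;     /* [EBP + 0Ch]  the value put in EAX   */
} frame32;

/* Reads what MOV EAX, [EBP + 0Ch] would read from this frame. */
uint32_t second_parameter(const frame32 *ebp)
{
    assert(ebp != NULL);
    return ebp->parameter2;
}
```

Because every field is four bytes wide, offsetof(frame32, parameter2) works out to 0x0C, matching the offset seen in the disassembly.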

"KB" is one of the commands to do this. Now, we may not always get a full stack trace, however, this too is for a more advanced tutorial. In this simple tutorial, we will assume we got the full stack trace. We notice, this is a printf function call or it looks that way. As we notice, printf called _output. Let's un-assemble printf. Please note that we may not always want to disassemble the entire function and we may use discession. Sometimes, we can find out the trap just from doing a stack trace (I will go over this in this simple context at the end). These are small functions though and we may be able to trace them simply.

This is simple. We notice that the second parameter to _output is [EBP + 8]. We also notice that PUSH EBP and MOV EBP, ESP are there, and thus the stack is set up the same way I mentioned previously. This is *not always* the case, but we are starting out slowly here.

Thus, we can determine that the first parameter to printf() is where the memory came from. And, as luck would have it, printf() was called from our program! From the trap information, we know that EAX was 0, so we were trying to dereference a NULL pointer.

You can notice a lot of problems with it! However, the printf is what trapped since it was NULL. *TheLastParameter is NULL. Surprisingly it didn't trap on sprintf(). So, how would we have solved this just with KB? Look at this trace:

We had symbols and we had the stack trace. The first parameter in the trace is 0. We also know that we called it. This is a very simple scenario, though, and I tried to portray some of the techniques that could be used to back trace to the location of a problem. Learn the stack. Knowing how the stack is set up and what memory is on the stack can be vital to finding and tracing where data came from. You will not always be lucky enough to find all the information you need just by doing "kb".
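Following the chain of saved EBP values is the basic idea behind building a stack trace by hand. Here is a rough sketch of that idea over a simulated stack; it is not how CDB implements "kb". The function name walk_frames, the use of array slot indexes instead of real addresses, and the rule that a saved value of 0 ends the chain are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walks a chain of frames inside a simulated stack of 32-bit slots.
   ebp_index is the slot EBP "points" at. In each frame, stack[ebp]
   holds the caller's EBP slot index and stack[ebp + 1] holds the
   return address. Returns how many return addresses were stored in out. */
size_t walk_frames(const uint32_t *stack, size_t slots, uint32_t ebp_index,
                   uint32_t *out, size_t out_cap)
{
    size_t n = 0;
    assert(stack != NULL && out != NULL);
    while (ebp_index != 0 && (size_t)ebp_index + 1 < slots && n < out_cap) {
        out[n++] = stack[ebp_index + 1];      /* return address of this frame */
        uint32_t next = stack[ebp_index];     /* caller's saved EBP */
        if (next != 0 && next <= ebp_index) {
            break; /* callers must live higher up; stop on a corrupt chain */
        }
        ebp_index = next;
    }
    return n;
}
```

The check that each saved EBP lies above the current one is a simple guard against looping forever on damaged stack data, which is one reason a real trace can come back incomplete.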

Program Not Working As Expected

This is a popular error. You run the program and you don't see the correct output or the program keeps giving you an error message. The file you want to create is not being created, etc. This is a very common problem that can range from easy to complex to solve. What are some of the first steps you would take to debug this?

1. What is not working?

2. What APIs or modules would this revolve around?

3. What would cause those APIs to not function properly?

These are some steps, though they are not general. Let's say you have a program that attempts to create a file in Windows. The file is not created though. Let's look at some code:

This is your code. Generally, you would want to recompile with perhaps GetLastError() and print it out. However, you do not have to do that. Although in this case it may be simple to, if you're stepping through code and a function fails, wouldn't you want to know what happened on the spot? Let's try to debug this. First, we'll start the debugger and break on our function. Since we have symbols, this is easy. If we didn't, we could just break on CreateFile as it is an exported symbol and would always be available.

After we call CreateFile, EAX will have the return value. We notice it's ffffffff or "Invalid Handle Value". We want to know the GetLastError. This is stored at fs:34. FS is the TEB selector, so we can dump it.

Luckily, it's a constant so the memory will still be around. It would still be around even if it wasn't though since we didn't step too far away from the return of CreateFile.

We can then use "da", "dc" or "du". "da" is dump ANSI string, "du" is dump Unicode string and "dc" is similar to "dd" except it dumps all characters, even unprintable ones. Since we know it's an ANSI string, just use da.

0:000> da 403010
00403010 "c:MyFile.txt"
0:000>

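That's wrong! The backslash is missing from the string. In C source, a single backslash starts an escape sequence, so we need to write C:\\MyFile.txt to get a literal C:\ in the path!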
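Here is a small sketch of what the two spellings look like once they are in memory. The helper has_root_backslash is a hypothetical name, and I am assuming the original source spelled the path with a single backslash, which the dump suggests but does not prove.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when the path has a backslash right after the drive
   colon (as in c:\MyFile.txt) and 0 otherwise. */
int has_root_backslash(const char *path)
{
    assert(path != NULL);
    return strlen(path) >= 3 && path[1] == ':' && path[2] == '\\';
}

/* "\\" in source becomes ONE backslash character in the compiled string. */
const char good_path[] = "c:\\MyFile.txt";

/* This is the string the debugger showed at 403010. */
const char bad_path[] = "c:MyFile.txt";
```

Calling has_root_backslash on good_path gives 1 and on bad_path gives 0. Dumping the string with "da" after the fix should show the backslash in place.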

So, we fix this. But wait, it still won't write! We need to debug this further.

0:000> !gle
LastErrorValue: (Win32) 0x5 (5) - Access is denied.
LastStatusValue: (NTSTATUS) 0xc0000022 - {Access Denied}
A process has requested access to an object,
but has not been granted those access rights.
0:000>

Access denied? What could cause that? Let's check. Wait, we opened the file for READ access only! We didn't open the file for write access! So, we can easily fix this problem and move on to our next project!
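The mistake comes down to which bits were set in the access mask passed when opening the file. The sketch below is portable C rather than real Windows code: the two constants copy the values of GENERIC_READ and GENERIC_WRITE from winnt.h under SKETCH_ names so that it compiles without the Windows headers, and requests_write is a hypothetical helper. It deliberately ignores other ways to ask for write access, such as the specific file access rights.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as GENERIC_READ and GENERIC_WRITE in winnt.h,
   renamed so this sketch does not need the Windows headers. */
#define SKETCH_GENERIC_READ  0x80000000u
#define SKETCH_GENERIC_WRITE 0x40000000u

/* Returns 1 if the requested access mask asks for generic write access. */
int requests_write(uint32_t desired_access)
{
    return (desired_access & SKETCH_GENERIC_WRITE) != 0;
}

/* Self-check mirroring the bug described in this section. */
void check_read_only_open(void)
{
    uint32_t opened_with = SKETCH_GENERIC_READ;   /* what the program asked for */
    assert(!requests_write(opened_with));          /* so a later write is refused */
    assert(requests_write(opened_with | SKETCH_GENERIC_WRITE)); /* the fix */
}
```

In the real program, the fix is to include write access in the mask passed to CreateFile.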

Conclusion

In summary, this is just an introduction to some very basic debugging techniques. The examples were simple but you must take their value for the techniques they displayed. This is just the first installment of this debugging tutorial. Hopefully, if there is interest, I may add more tutorials getting more advanced.

To some, this tutorial may be simple; to others, too advanced. You will not become a good debugger overnight; it takes practice. I would suggest using the debugger to solve even the simplest of problems. The more you practice, the better you get. I guarantee that the more you fool around with the tools, the more you will learn.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Toby Opferman has worked in just about all aspects of Windows development including applications, services and drivers.

He has also played a variety of roles professionally on a wide range of projects. This has included pure researching roles, architect roles and developer roles. He was also solely responsible for debugging traps and blue screens for a number of years.

Previously of Citrix Systems, he is very experienced in the area of Terminal Services. He currently works on Operating Systems and low level architecture at Intel.

Comments and Discussions

DeBugging is one of those skills that separates an asset from the unemployed when it comes down to downsizing the herd. This is the bastardized child of development as many seem transfixed of pretty stuff and then their dreams crumble before their eyes and they haven't the skill to prevent, diagnose or treat the symptoms.

Hi, I have a problem with debugging but with visual studio debugging since I've installed the SP1 for vs2005. I am on a fresh XP SP2 machine.

In any project, when I hit F10 or set a breakpoint and F5, visual studio hangs, winamp continues to play music, mouse moves, keyboard does not respond. Only solution is to power-off the PC.

There are no entries in the eventlog, no error messages of any kind. ctfmon.exe is disabled. I have made a repair with SP1 setup, no help. Uninstalled visual studio and reinstalled it, no help. Ran add/remove windows components no help. COM+ is reinstalled. It is not damaged now.

When I attach windbg to devenv.exe, aforementioned symptoms occur. So I cannot get a minidump or anything from visual studio.

40GB diskspace empty, 2GB physical RAM, 2GB pagefile.

I have read about 40 different forum/newsgroup/Microsoft KB pages. They don't help. Is there anything I can do apart from reinstalling Windows and many applications from scratch?

Dear Programmer, let me introduce myself. My name is Andrio, from Indonesia. I'm a newbie and an idiot at programming. I don't know about the commands and code in Debug Tutorial Part 1: Beginning Debugging Using CDB and NTSD. I would like to know how to use and run the code. I'm using Slackware and XP SP2. I don't know how to write the code on my Operating System or how to run it. Can any programmer here please help me? I need a LIVE CD for practice at home.

Firstly, the parameter passing always depends on the calling convention, the method listed is one commonly used in 32 bit applications (There are CDECL, STDCALL and FASTCALL which are the 3 most common in 32 bit applications).

The "previous EBP" just means the previous value in EBP. When a function sets up a stack frame they do: