Administrators' Intro to Debugging

Get familiar with using the Windows debugger as a backup diagnostic tool

Executive Summary: As a Microsoft Windows administrator, you'll sometimes encounter situations in which the tools you usually use to diagnose Windows system problems won't work. In those cases, the Debugging Tools for Windows can be a useful alternative troubleshooting toolkit. Get acquainted with the debugger by walking through this real-life case of troubleshooting the cause of bloating heap memory on a system.

As a Windows administrator, you know the crucial role that tools play in helping you diagnose and resolve system problems. I've previously discussed two essential tools that help identify process leaks: user-mode dump heap (UMDH) and DebugDiag. (See this article's Learning Path for links to these and my other columns.) However, sometimes these tools add too much overhead, driving CPU usage so high that running UMDH or DebugDiag becomes impractical. When your diagnostic tools fail or become unavailable, the problem becomes much harder to solve, so having a backup troubleshooting plan is vital.

Do you have an alternative approach when you can't use your regular set of troubleshooting tools? My aim in this article is to provide you with at least one backup plan and also demonstrate the importance of learning about Windows internals and using the Windows debugger to find helpful nuggets of diagnostic information.

Debugging a Bloated Process

Recently we resolved a customer issue in which wmiprvse.exe (the WMI Provider Host process) was consuming increasing amounts of memory. The customer provided a dump file to help determine the root cause. We started by enabling DebugDiag and UMDH separately. However, with either tool running, the CPU stayed pegged at 100 percent, effectively freezing the system. Because our regular troubleshooting tools overstressed the system, we needed to turn to our backup plan.

We had only the 185MB wmiprvse.exe process dump file to help us identify why the process had bloated to such a large size and what, if anything, the customer could do to decrease the memory consumption. First we opened the dump file using the windbg.exe debugger included in the Debugging Tools for Windows. (You can download the Debugging Tools for Windows at www.microsoft.com/whdc/devtools/debugging/default.mspx.) The debugging toolset is a must-have for any administrator who wants to dig a little deeper into system problems. You can retrieve a lot of information from a dump file using the debuggers without being a debugging expert or having a lot of code knowledge.

To debug the dump file, we followed these steps:

1. Start windbg.exe.
2. Ensure that the symbol path is set correctly by clicking File, Symbol File Path, then typing your symbol path into the box provided.

3. Open the wmiprvse.dmp file in the Windbg debugger by selecting File, then clicking Open Crash Dump.
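A commonly used symbol path (an assumption here, not necessarily the exact path used in this case) points the debugger at Microsoft's public symbol server with a local cache directory, for example c:\symbols:

```
srv*c:\symbols*http://msdl.microsoft.com/download/symbols
```

With this path set, windbg downloads any symbols it needs on demand and caches them locally for later sessions.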

We then entered this command in the debugger:

!address -summary

This is a handy command because it provides a summary of the type of memory consumed within the process, as you can see in Figure 1.

Figure 1: Output of !address -summary command

The most important column in the output is the Pct(Busy) column, which represents the type of memory consumed on a percentage basis. We quickly scanned this column and identified that the RegionUsageHeap memory was the highest consumer at 74 percent. RegionUsageHeap is the label for heap memory, an area of memory reserved for processes to store data. Every process has heap memory, which is initially 1MB by default. However, this area can grow, and more memory will be allocated for the heap as the process requires.
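The arithmetic behind the Pct(Busy) column is simply each region type's share of all busy memory. A rough sketch, using made-up byte counts (the real figures come from the !address -summary output in Figure 1):

```python
# Hypothetical region sizes in bytes; the real numbers come from !address -summary.
regions = {
    "RegionUsageHeap":  140_000_000,
    "RegionUsageImage":  25_000_000,
    "RegionUsageStack":   5_000_000,
    "RegionUsageOther":  19_000_000,
}

total_busy = sum(regions.values())

# Print each region's share of busy memory, largest first, roughly
# mirroring how you'd scan the Pct(Busy) column for the top consumer.
for name, size in sorted(regions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {size:>12,d} bytes  {100 * size / total_busy:5.1f}%")
```

With these illustrative numbers, RegionUsageHeap works out to about 74 percent of busy memory, matching the proportion we saw in the real dump.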

Understanding that heap memory is where a program stores its data was a key component in our investigation. You can think of heap memory as a bucket represented as an address of memory where applications store process-specific data. Since a process can have several heap "buckets," it's important to know which bucket you want to explore because not every heap will be filled with data and high in memory usage. Our goal was to dump out the heap bucket consuming the highest amount of memory to try to identify some clues about why the process was consuming so much memory.

Consulting the Debugger Help File

As I mentioned, a process will have several heap buckets available to explore, so we needed a way to conduct a more targeted investigation. This is where the debugger Help file (debugger.chm) saved the day. I knew that we needed to use the !heap debugger command to investigate a heap-related problem in a process. However, when you aren't sure what debugger command to use, try searching in debugger.chm for a term relevant to your investigation. For example, if you search for the term "heap," the first hit you receive is the !heap command and all the !heap parameters.

Running the !heap command displays a list of heap memory addresses, but this list alone wouldn't provide us enough information to find the heap bucket containing the highest amount of memory usage. As I scrolled down through the !heap parameters, I noticed the -stat parameter, which displays heap usage statistics. Remembering our goal was to find the heap (or bucket) containing the most memory, I ran this command:

!heap -stat

As Figure 2 shows, running this command displays each heap in descending order, from highest to lowest consumer. So the first heap displayed is the heap address we need to investigate.
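Conceptually, the -stat ordering is just a descending sort on each heap's committed bytes. A toy sketch of that idea (the heap addresses are illustrative; only heap 00700000's committed size is taken from this case):

```python
# Illustrative (address -> committed bytes) pairs; !heap -stat derives the
# real values from the process's heap structures in the dump.
heaps = {
    0x00700000: 0x06C03000,  # the bloated heap from this investigation
    0x00090000: 0x00100000,
    0x00280000: 0x00040000,
}

# Like !heap -stat, list heaps from highest to lowest committed bytes,
# so the first line printed is the heap worth investigating.
for address, committed in sorted(heaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"heap {address:08x}: {committed:#010x} committed bytes")
```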

Figure 2: Running !heap -stat to view highest-to-lowest heap address

Notice in Figure 2 that each heap bucket is designated by a specific address. The first heap bucket is designated by the 00700000 address; the next heap is located at address 00090000. The important row is Committed bytes, the amount of memory committed in this particular heap. We know that heap 00700000 is the highest memory-consuming heap because the -stat parameter displays each heap in highest-to-lowest order based on memory consumption. For further confirmation, we can convert the committed bytes hexadecimal number 06c03000 to decimal, either using calc.exe (i.e., the Windows Calculator) or within the debugger by using the ? (Evaluate Expression) command, as follows:

0:000> ?06c03000
Evaluate expression: 113258496 = 06c03000

(The second line is the command's output.) 113258496 is 06c03000 converted into decimal, so this output confirms for us that this heap address contains 113MB of memory. Considering the entire process consumed 185MB of memory, we can be fairly certain that we're on the right track. We know our process is bloating because of heap usage, and we know that our highest-consuming heap contains 113MB. Now we can enter debugger commands that will display the contents of this heap, which may point to hints about why the process is utilizing so much memory. Simply displaying the heap might not reveal a smoking gun, but the information you find could point you in the right direction, as it did for us.
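If a debugger isn't handy, the same hexadecimal-to-decimal conversion can be checked in any language; in Python, for instance:

```python
# Committed bytes value reported by !heap -stat, written as a hex literal.
committed = 0x06C03000

print(committed)                  # 113258496, matching the debugger's output
print(committed / 1_000_000)      # about 113, i.e., roughly 113MB in decimal units
```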

To dump out the heap, we use the dc command (which displays memory as double-word values alongside their ASCII characters) followed by the heap address. So in our case, the command and partial output would look like that in Figure 3.

Usually you'll need to keep dumping out more and more of the heap (by pressing Enter after the initial dc command output is displayed) to reveal interesting information. Figure 4 shows the section of the dump containing information that helped us deduce that our heap memory was consumed by information generated by Event Tracing for Windows (ETW).
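To get a feel for why a readable string such as a file path stands out in dc output, here's a minimal sketch (my own illustration, not the debugger's actual implementation) that renders a byte buffer the same general way: DWORD values on the left, their ASCII characters on the right:

```python
import struct

def dc_dump(data: bytes, base: int = 0x00700000) -> str:
    """Render bytes roughly the way windbg's dc command does:
    four little-endian DWORDs per line, followed by their ASCII text."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16].ljust(16, b"\x00")
        dwords = struct.unpack("<4I", chunk)
        hex_part = " ".join(f"{d:08x}" for d in dwords)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base + offset:08x}  {hex_part}  {ascii_part}")
    return "\n".join(lines)

# A recognizable string, like the .etl path, jumps out in the ASCII column.
print(dc_dump(b"D:\\AppData\\Logs\\MSTrace_20080615_221501_SERVER1.etl"))
```

Scanning the ASCII column of successive dc lines for strings like this is exactly how the trace-file path surfaced in our heap dump.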

Notice we even found the filename and location in the dump: D:\AppData\Logs\MSTrace_20080615_221501_SERVER1.etl.

Problem Solved

Our biggest clue that the data contained in the heap was indeed ETW tracing-related was the .etl file extension, which is associated with ETW. As it turned out, the customer hadn't realized that ETW tracing was still enabled from a previous problem resolved months before. Turning off ETW tracing resolved the customer's problem, and the wmiprvse.exe process's memory consumption decreased immediately. By knowing a few debugger commands and a little about OS internals, you can successfully troubleshoot when your usual tools aren't available.

Special thanks to Venkatesh Ganga, a Microsoft senior escalation engineer, who contributed to this article.