Archive for September, 2009

As promised earlier, I am including some more savory kernel debugging topics. Some of these scenarios are one-off corner cases.

Virtual Machine (VM) debugging

Debugging a virtual machine

You will need to enable debugging on the VM just as on a regular OS. You will also need to map one of its COM ports (COM1 or COM2) to a named pipe (\\.\pipe\<com1 | com2 | yourstring>). On the VM host OS you can then attach to the debuggee by pointing the debugger at that pipe; the usual WinDbg form is windbg -k com:pipe,port=\\.\pipe\<yourstring>.

This is the scenario where you use one VM running on the host to debug another VM running on the same host. It is useful whenever you may have to reimage the host OS, for example in a lab environment. I had to set this up to demonstrate kernel debugging for KMDF at WinHEC.

This is how I did it:

I created a named pipe on COM1 on the VM acting as host/debugger.

When I tried this with the COM1 port on the debuggee VM I couldn’t get it to work, but a named pipe on the COM2 port worked for me. If you had a different experience, please share it for others’ benefit.

Invoke the debugger on host VM as:

windbg -k com:port=com2

Local/single mode debugging

In some cases you may not have access to another machine, or you may just want to look at some device state or read a global variable for a driver. In the absence of other applications or tools you can use the local debugger. You will need to enable debugging on the local machine on Vista and later:

bcdedit /debug on

followed by a reboot. For a pre-Vista OS, see the documentation on how to edit the boot.ini file to do this.

Then you can begin debugging with the following command:

C:\> kd -kl

However, it has limited use. You can’t set break points or check call stacks.

You can use it to check the state of global variables in your drivers, etc. You could also use it to try out commands and approaches when you don’t have another machine handy, for example if you are going to visit a customer.

I am stuck, now what? How do I get my machine back?

At some point in your life, just by the law of averages :), you will come across a system that crashes so regularly, maybe even at boot, that you can’t get any work done or even log on to the system. These are a few techniques you could use to get your system back.

Safe mode

Try booting into safe mode. The drivers loaded in safe mode are a subset of all the drivers on the system. Most of the time these are well-tested, critical boot drivers. This is especially useful if one of the non-safe-mode drivers is failing, since that driver doesn’t get loaded in safe mode.

In some cases (more of a driver developer/test/verification scenario) the non-safe-mode driver may be failing a verifier check because you enabled verifier for all drivers on your machine, perhaps to debug a corruption problem.

System restore

System restore helps you restore your system to a previous state. Note that driver binaries are not rolled back. It is still useful because it primarily reverts the registry to a prior known-good state. If the state the system is in involves the registry directly or indirectly, this is a good option to try.

The scenario I discussed earlier, enabling driver verifier on safe mode drivers and having one such driver barf, could leave your system unable to boot even in safe mode. It is unfortunate, but even safe mode drivers sometimes fail verifier checks.

Driver Verifier saves its state in the registry, so by reverting the registry (with system restore) you can disable the verifier and claim your machine back, perhaps only temporarily since the offending driver is still on your system. If you are lucky there is an update waiting for you from the IHV/ISV which fixes the issue flagged by driver verifier.

BIOS and disabling devices

Most BIOSes also have the option of disabling devices. If the offending driver is for hardware and you can figure out the name of the crashing driver binary (from the BSOD), you can try disabling the device from the BIOS. For example, this is useful for devices you couldn’t care less about, like a fingerprint reader (unless you actually use it) that has a buggy driver. Once the driver is fixed you can re-enable the device.

Kdfiles and Windbg

Kdfiles is an excellent way to replace a boot driver or another driver, especially if you are debugging/developing a driver. WinDbg acts as a conduit for passing the bits of the driver you need to change on to the target machine. It can be used for boot drivers as well, except on Windows Vista.

If all else fails and you desperately need a crash dump to pass on to the IHV for debugging, note that on Windows 7 a kernel crash dump, which is normally good enough for debugging, is enabled by default.

F10 trick to edit the boot parameters passed to the kernel at boot time

If you forgot to enable kernel debugging with bcdedit you can always do it at boot time. Pressing F8 brings up the boot menu, which has an option to enable debugging. There is another way of doing it, but it requires precise timing: press the F10 key and add one of the following lines to the boot debug options. The debug option doesn’t persist across a boot.

Serial: "/debug /debugport=comX /baudrate=115200"

1394: "/debug /debugport=1394 /channel=[1-63]"

USB: "/debug /debugport=usb /targetname=String"

Getting a crash dump for a process without enabling the kernel debugger

If the machine is not booted in debug mode (so local kd cannot be used) and you can’t enable debugging and reboot for fear of losing the repro, you could try the following:

kdbgctrl -td <pid> <file>

This will capture a driver dump of the hang that should contain the IRP information needed to see whether any thread in the process is hung on an IRP. Kdbgctrl.exe is a tool from the debuggers package.

Debugging over USB-serial converter

When you set a target to use a COM port for debugging, the debugging engine takes over the port and drives it itself. This is why a COM port no longer appears in Device Manager while it is being used for debugging. Now let’s say you want to use a port on a USB device. When the system first boots, that port does not exist as far as the system hardware is concerned, so there is no such thing as COMX and it just doesn’t work. Ports on USB-serial converters do not get created until later, when the PnP manager loads the driver for the USB device.

The kernel drives the debug port itself, so a port that has to be driven by a driver won’t work.

So debugging over USB-serial converters doesn’t work.

Accessing the registry from Windbg

Wouldn’t it be great if you could access registry values from WinDbg? That way you could disable driver verifier through the debugger or set/unset some driver registry value. There are numerous scenarios in which this could be handy. You can actually do this from WinDbg. Let me warn you that it may require a little poking and digging around, as I show below.

The registry holds system information, such as configuration data, for both hardware and software. The registry is broken up into “Hives” which are registry files for parts of the registry. For example, the Software hive is where the HKEY_LOCAL_MACHINE\Software information is kept. The Hives are broken up into “Bins” and the Bins are broken up into “Cells” which hold the registry key and value data.

The diagram below which is borrowed from the above article shows the layout of the registry.

How is this information managed?

The registry subsystem maintains the registry on a Hive basis. That is how the registry keeps track of open registry files. The open Hives in the system can be displayed by using the debugger extension “!reg hivelist”.

Before we can access the registry, we need to know where to begin. So we need the root block:

In the above case I pick the root block for the System hive. Below I try to dump a value under the “HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Session Manager\Memory Management\” key, namely “SystemPages”.

lkd> !reg baseblock 8bc1a5a0

FileName : SYSTEM

Signature: HBASE_BLOCK_SIGNATURE

Sequence1: 84a7

Sequence2: 84a7

TimeStamp: 1ca3f22 34d2fd1c

Major : 1

Minor : 5

Type : HFILE_TYPE_PRIMARY

Format : HBASE_FORMAT_MEMORY

RootCell : 20-> Index of the root cell

Length : dbb000

Cluster : 1

CheckSum : 4bd3452c

Since everything in the registry is represented as a cell, we use the cell index of the root cell to calculate the cell address. The cell index is broken down into the map directory offset, the map table offset, and then the block offset to access registry data. The debugger extension !reg cellindex does this for us automatically and returns the cell address.

lkd> !reg cellindex 8bc1a5a0 20

Map = 8bc22000 Type = 0 Table = 0 Block = 0 Offset = 20

MapTable = 8bc23000

BlockAddress = 937d3000

pcell: 937d3024
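
The breakdown that !reg cellindex prints can also be reproduced by hand. The sketch below is inferred from the outputs shown in this post (0x20, 0xaf8c0, 0x98068 and 0x303f70), not taken from an official structure definition: I am assuming the top bit is the Type, the next 10 bits the Table, the next 9 bits the Block and the low 12 bits the Offset, and that pcell is BlockAddress + Offset + 4 because the cell starts with a 4-byte size field. The function names are made up for illustration.

```python
def decode_cell_index(cell_index):
    """Split a 32-bit registry cell index into the fields !reg cellindex prints."""
    return {
        "type": (cell_index >> 31) & 0x1,     # 0 = stable, 1 = volatile
        "table": (cell_index >> 21) & 0x3FF,  # entry in the map directory
        "block": (cell_index >> 12) & 0x1FF,  # entry in the map table
        "offset": cell_index & 0xFFF,         # byte offset inside the block
    }


def cell_address(block_address, cell_index):
    """Address of the cell data, assuming a 4-byte size field precedes it."""
    return block_address + (cell_index & 0xFFF) + 4


fields = decode_cell_index(0x20)
print(fields)
print(hex(cell_address(0x937d3000, 0x20)))  # matches pcell above: 0x937d3024
```

The block address itself still has to come from the debugger, since it is looked up through the hive's map tables in memory.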

The !reg subkeylist command can do this for you. This breaks down the Key Node and does all the work.

What the List points to depends on the number of values. The registry uses an index mechanism to access the sub-key data. The type of index is determined from the type code. Here 0x686c indicates a CM_KEY_HASH_LEAF, where each entry contains a hash of the key name and the cell index of the key. Another example is a CM_KEY_FAST_LEAF, where the first four characters of the key name are used instead of a hash. Which one is used depends on the version of the hive. When there are many sub-keys, the index can also be a CM_KEY_INDEX_ROOT, which contains cell indexes that point to leaves.
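
The type code is really a two-character ASCII tag read as a little-endian 16-bit number, which is why 0x686c shows up as “lh” in a dump. Here is a minimal sketch of that mapping; the “lf” and “ri” tags for the fast leaf and index root are what I believe the hive format uses, so treat them as an assumption, and the helper name is hypothetical.

```python
import struct

# ASCII tag -> index type. Only the three types mentioned in this post.
INDEX_SIGNATURES = {
    b"lh": "CM_KEY_HASH_LEAF",
    b"lf": "CM_KEY_FAST_LEAF",   # assumed tag
    b"ri": "CM_KEY_INDEX_ROOT",  # assumed tag
}


def index_type(type_code):
    """Turn a 16-bit type code (as read from memory) back into its name."""
    tag = struct.pack("<H", type_code)  # 0x686c -> b"lh"
    return INDEX_SIGNATURES.get(tag, "unknown")


print(index_type(0x686c))
```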

You’ll see that some hives are volatile and don’t have associated files. The system creates and manages these hives entirely in memory; the hives are therefore temporary in nature. The system creates volatile hives every time the system boots. An example of a volatile hive is the HKEY_LOCAL_MACHINE\HARDWARE hive, which stores information regarding physical devices and the devices’ assigned resources. Resource assignment and hardware detection occur every time the system boots, so not storing this data on disk is logical.

You can use !reg subkeylist recursively until you hit a roadblock where not all the child keys show up, in which case we resort to manual means. Since we have ControlSet001, let’s try to go down to its child key “Control”.

lkd> !reg subkeylist 8bc1a5a0 937d3164

Dumping SubkeyList of Key <ControlSet001> :

SubKeyCount[Stable ]: 0x5

SubKeyLists[Stable ]: 0x2ccbb0

SubKeyCount[Volatile]: 0x0

SubKeyLists[Volatile]: 0xffffffff

[ 5] Stable SubKeys:

[Idx] [SubKeyAddr] [SubKeyName]

[0] 937d3634 Control

[1] 937d3634 Control

[2] 937d3634 Control

[3] 937d3634 Control

[4] 937d3634 Control

[ 0] Volatile SubKeys:

Use ‘!reg knode <SubKeyAddr>’ to dump the key

Trying again recursively to dump the subkeys didn’t work, as the debugger extension dumped only ACPI out of the 84 subkeys.

lkd> !reg subkeylist 8bc1a5a0 937d3634

Dumping SubkeyList of Key <Control> :

SubKeyCount[Stable ]: 0x54

SubKeyLists[Stable ]: 0x12a968

SubKeyCount[Volatile]: 0x3

SubKeyLists[Volatile]: 0x8001d930

[ 84] Stable SubKeys:

[Idx] [SubKeyAddr] [SubKeyName]

[0] 93732884 ACPI

[1] 93732884 ACPI

[2] 93732884 ACPI

[3] 93732884 ACPI

[4] 93732884 ACPI

[5] 93732884 ACPI

[6] 93732884 ACPI

[7] 93732884 ACPI

[8] 93732884 ACPI

[9] 93732884 ACPI

[10] 93732884 ACPI

[11] 93732884 ACPI

[12] 93732884 ACPI

[13] 93732884 ACPI

…

[79] 93732884 ACPI

[80] 93732884 ACPI

[81] 93732884 ACPI

[82] 93732884 ACPI

[83] 93732884 ACPI

[ 3] Volatile SubKeys:

[Idx] [SubKeyAddr] [SubKeyName]

[0] 8bc31484 hivelist

[1] 8bc31484 hivelist

[2] 8bc31484 hivelist

Use ‘!reg knode <SubKeyAddr>’ to dump the key

Since I was interested in getting to the address of “Session Manager”, I decided to do this manually. On my particular system I looked at the position of Session Manager and it was the 60th key in alphabetical order.

I took a guess that the keys are arranged alphabetically, so if you have a rough idea of where your key falls you can use that to calculate the offset of the key you are looking for. The alphabetical position as displayed in the registry doesn’t correlate exactly with the position in the key list, but I have been lucky most times in getting close.

Let me warn you that your mileage may vary, but instead of going brute force and looking at every key this approach can be fast if you get lucky. So I looked at offset 60. Each sub-key entry takes 8 bytes, so it was simple math.

After I got to the Session Manager, the same dilemma presented itself on the way to the “Memory Management” sub-key: only the first sub-key was visible and the others weren’t, so I had to go sub-key hunting again :).

lkd> !reg subkeylist 8bc1a5a0 937bbe34

Dumping SubkeyList of Key <Session Manager> :

SubKeyCount[Stable ]: 0xf

SubKeyLists[Stable ]: 0xaf8c0

SubKeyCount[Volatile]: 0x0

SubKeyLists[Volatile]: 0xffffffff

[ 15] Stable SubKeys:

[Idx] [SubKeyAddr] [SubKeyName]

[0] 937bbe94 AppCompatCache

[1] 937bbe94 AppCompatCache

[2] 937bbe94 AppCompatCache

[3] 937bbe94 AppCompatCache

[4] 937bbe94 AppCompatCache

[5] 937bbe94 AppCompatCache

[6] 937bbe94 AppCompatCache

[7] 937bbe94 AppCompatCache

[8] 937bbe94 AppCompatCache

[9] 937bbe94 AppCompatCache

[10] 937bbe94 AppCompatCache

[11] 937bbe94 AppCompatCache

[12] 937bbe94 AppCompatCache

[13] 937bbe94 AppCompatCache

[14] 937bbe94 AppCompatCache

[ 0] Volatile SubKeys:

Use ‘!reg knode <SubKeyAddr>’ to dump the key

In this case at least the challenge was a little less since I was dealing with only 15 sub-keys instead of 84 earlier. So just like earlier first get the cell address of the subkey list.

lkd> !reg cellindex 8bc1a5a0 0xaf8c0

Map = 8bc22000 Type = 0 Table = 0 Block = af Offset = 8c0

MapTable = 8bc23000

BlockAddress = 93724000

pcell: 937248c4

Now let’s go through this list and use the position of the “Memory Management” sub-key, which happens to be 11 on my particular system. By trial and error I found that the correct offset was 0x54 bytes for “Memory Management”.

lkd> dc 937248c4+0x54

93724918 00098068 b76d431e 00040910 092eb60d h….Cm………

93724928 00304c98 122bab7f 000af950 381a2f7e .L0…+.P…~/.8

93724938 000af178 0001dd10 00000000 00000000 x……………

93724948 00000000 00000000 ffffffa0 00206b6e …………nk .

93724958 03a3c1c6 01ca3ef7 00000000 00018e30 …..>……0…

93724968 00000000 00000001 ffffffff 80035830 …………0X..

93724978 00000007 000b0340 0000fc10 ffffffff ….@………..

93724988 0000000a 00000000 00000010 00000246 …………F…
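
The 0x54 is not as arbitrary as it looks. As a rough sketch, assuming the hash leaf starts with a 4-byte header (2-byte type code plus 2-byte count) followed by 8-byte entries (4-byte cell index plus 4-byte hash), the 11th entry sits at 4 + 10 * 8 = 0x54 bytes from the start of the list. The constant and function names below are my own, and the header layout is an assumption that happens to fit the dump above.

```python
LEAF_HEADER_SIZE = 4  # assumed: 2-byte type code + 2-byte entry count
LEAF_ENTRY_SIZE = 8   # 4-byte cell index + 4-byte hash of the key name


def leaf_entry_offset(index):
    """Byte offset of a zero-based entry from the start of the sub-key list."""
    return LEAF_HEADER_SIZE + index * LEAF_ENTRY_SIZE


def leaf_entry_address(list_address, index):
    """Address to hand to 'dc' for a zero-based entry in the list."""
    return list_address + leaf_entry_offset(index)


# "Memory Management" was the 11th sub-key, so its zero-based index is 10.
print(hex(leaf_entry_offset(10)))
print(hex(leaf_entry_address(0x937248c4, 10)))
```

The first dword at that address (00098068) is the cell index of the sub-key and the second (b76d431e) is its hash.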

To confirm it, let’s calculate the cell address and check the knode.

lkd> !reg cellindex 8bc1a5a0 00098068

Map = 8bc22000 Type = 0 Table = 0 Block = 98 Offset = 68

MapTable = 8bc23000

BlockAddress = 9373b000

pcell: 9373b06c

lkd> !reg knode 9373b06c

Signature: CM_KEY_NODE_SIGNATURE (kn)

Name : Memory Management

ParentCell : 0x18e30

Security : 0x3a68c8 [cell index]

Class : 0xffffffff [cell index]

Flags : 0x20

MaxNameLen : 0x24

MaxClassLen : 0x0

MaxValueNameLen : 0x30

MaxValueDataLen : 0x2a

LastWriteTime : 0x 1ca3ef7:0x 3a88487

SubKeyCount[Stable ]: 0x2

SubKeyLists[Stable ]: 0x2a0c48

SubKeyCount[Volatile]: 0x0

SubKeyLists[Volatile]: 0xffffffff

ValueList.Count : 0xf

ValueList.List : 0x303f70

What about Values?

Key nodes have values as well. Now that we have the actual sub-key, we need to get the value from it. This task is similar to sub-key hunting, as each key node maintains both a sub-key list and a value list.

Just like with subkeylist, there is also a valuelist debugger extension:

lkd> !reg valuelist 8bc1a5a0 9373b06c

Dumping ValueList of Key <Memory Management> :

[Idx] [ValAddr] [ValueName]

[ 0] 9373b1a4 ClearPageFileAtShutdown

[ 1] 9373b204 DisablePagingExecutive

[ 2] 9373b374 LargeSystemCache

[ 3] 9373b39c NonPagedPoolQuota

[ 4] 9373b3cc NonPagedPoolSize

[ 5] 9373b3f4 PagedPoolQuota

[ 6] 9373b43c PagedPoolSize

[ 7] 9373b4e4 SecondLevelDataCache

[ 8] 9373b53c SessionPoolSize

[ 9] 9373b564 SessionViewSize

[ a] 9373b514 SystemPages

[ b] 93724f9c PagingFiles

[ c] 935077ec PhysicalAddressExtension

[ d] 9cd4ffcc IOPageLockLimit

[ e] 9cd0d434 ExistingPageFiles

Use ‘!reg kvalue <ValAddr>’ to dump the value

lkd> !reg kvalue 9373b514

Signature: CM_KEY_VALUE_SIGNATURE (kv)

Name : SystemPages {compressed}

DataLength: 80000004

Data : c3000 [cell index]

Type : 4

In our case the value of SystemPages is 0xc3000. You can dump the node at address 9373b514 directly and change the value with the “ed” command. The actual value is at offset 0x8.

lkd> dc 9373b514

9373b514 000b6b76 80000004 000c3000 00000004 vk…….0……

9373b524 00090001 74737953 61506d65 00736567 ….SystemPages.

9373b534 00098538 ffffffd8 000f6b76 80000004 8…….vk……

9373b544 00000004 00000004 00000001 73736553 …………Sess

9373b554 506e6f69 536c6f6f 00657a69 ffffffd8 ionPoolSize…..

9373b564 000f6b76 80000004 00000030 00000004 vk……0…….

9373b574 00000001 73736553 566e6f69 53776569 ….SessionViewS

9373b584 00657a69 ffffffe0 00086b76 00000042 ize…..vk..B…

Now you can change the value of SystemPages directly using “ed”.

lkd> ed 9373b514+0x8 <new value>
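
To see why the value lives at offset 0x8, it helps to decode the first dwords of the “dc 9373b514” output by hand. This is only a sketch: the field order (signature, name length, data length, data, type, flags) is read off the dump and the !reg kvalue output above, the name starting at offset 0x14 is inferred from where “SystemPages” appears in the dump, and the high bit of DataLength meaning “the data is stored in the Data field itself” is my understanding of why 80000004 carries a 4-byte DWORD. The function name is hypothetical.

```python
import struct

INLINE_DATA_FLAG = 0x80000000  # assumed: high bit set = data stored in the Data field


def parse_key_value(raw):
    """Decode the fixed part of a 'vk' cell and its compressed (ASCII) name."""
    signature, name_len, data_len, data, value_type, flags = struct.unpack_from(
        "<2sHIIIH", raw, 0
    )
    return {
        "signature": signature.decode("ascii"),
        "name": raw[0x14:0x14 + name_len].decode("ascii"),
        "inline": bool(data_len & INLINE_DATA_FLAG),
        "length": data_len & 0x7FFFFFFF,
        "data": data,  # sits at byte offset 0x8 in the cell
        "type": value_type,
        "flags": flags,
    }


# First eight dwords of the dump above, re-packed as little-endian bytes.
DUMP = struct.pack(
    "<8I",
    0x000B6B76, 0x80000004, 0x000C3000, 0x00000004,
    0x00090001, 0x74737953, 0x61506D65, 0x00736567,
)

value = parse_key_value(DUMP)
print(value["name"], hex(value["data"]), value["length"])
```

The two fields in front of Data add up to 8 bytes (2 + 2 + 4), which is where the +0x8 in the “ed” command comes from.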

Just like with sub-keys, sometimes the debugger extension may not show all the values, in which case you can do it manually just like earlier. First let’s get the cell address of the value list.

lkd> !reg cellindex 8bc1a5a0 0x303f70

Map = 8bc22000 Type = 0 Table = 1 Block = 103 Offset = f70

MapTable = 8bc25000

BlockAddress = 934d0000

pcell: 934d0f74

Since our value list had 15 values, let’s dump the cell index of every value in the list.

lkd> dc 934d0f74 l 0xf

934d0f74 000981a0 00098200 00098370 00098398 ……..p…….

934d0f84 000983c8 000983f0 00098438 000984e0 ……..8…….

934d0f94 00098538 00098560 00098510 000aff98 8…`………..

934d0fa4 002cc7e8 00c72fc8 00cb3430 ..,../..04..

The value “SystemPages” is the last one of the 15, but by trial and error I found that its entry in the value list was not close to offset 15, so it was pretty much a matter of manually poking around at each value offset. I found it at offset 11.
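
If you already know the address of the value node (for example from an earlier !reg valuelist), you can work backwards to find its slot instead of poking at each entry. The sketch below assumes the value list is a plain array of 4-byte cell indexes, that the value lives in a stable cell in map table 0, and that pcell = BlockAddress + Offset + 4 as in the !reg cellindex outputs. The block number 0x98 and block address 9373b000 come from the “Memory Management” lookup earlier; the helper name is made up.

```python
# The 15 cell indexes from the "dc 934d0f74 l 0xf" output above.
VALUE_LIST = [
    0x000981A0, 0x00098200, 0x00098370, 0x00098398,
    0x000983C8, 0x000983F0, 0x00098438, 0x000984E0,
    0x00098538, 0x00098560, 0x00098510, 0x000AFF98,
    0x002CC7E8, 0x00C72FC8, 0x00CB3430,
]


def cell_index_for(value_address, block_address, block_number):
    """Rebuild a cell index from a node address (stable cell, table 0 assumed)."""
    offset = value_address - block_address - 4  # undo the 4-byte size field
    return (block_number << 12) | offset


target = cell_index_for(0x9373B514, 0x9373B000, 0x98)
slot = VALUE_LIST.index(target)  # zero-based position in the value list
print(hex(target), slot)
```

The zero-based slot of 10 is the 11th entry, which lines up with what I found by hand.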