
Strange lockup problem when booting Hyper-V Server

Question

I'm having a strange issue on one of our longest running Hyper-V servers (about a year now). Backup is done using BackupExec 12.5 with the Hyper-V aware backup option.

When booting the machine, it locks up for roughly 50 minutes, and then continues working normally.

After much research, I've arrived at what COULD be the problem, but I'm not sure:

C:\windows\system32\config\SYSTEM has an unusual size of 170MB

When trying to find out why using dureg.exe, I found several trees that would indicate some sort of enumeration problem with the Hyper-V VSS Writer and its backups.

All of these keys have 500-1000 subkeys, which I find highly unusual:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\DeviceClasses\{53f56307-b6bf-11d0-94f2-00a0c91efb8b}

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\STORAGE\VolumeSnapshot

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_Msft&Prod_Virtual_Disk

Can someone running Hyper-V with backups for quite some time check the size of their registry keys?
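For anyone who wants to run the same check without dureg.exe, here is a minimal sketch in Python that counts the direct subkeys of the three keys listed above. It uses the standard winreg module, so the registry part only runs on Windows. The threshold of 100 is my own assumption (a healthy host was reported with about 30 entries at most, an affected one with 500-1000), and the function names are made up for illustration.

```python
try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

# Key paths below HKEY_LOCAL_MACHINE, taken from the question above.
KEYS = [
    r"SYSTEM\CurrentControlSet\Control\DeviceClasses\{53f56307-b6bf-11d0-94f2-00a0c91efb8b}",
    r"SYSTEM\CurrentControlSet\Enum\STORAGE\VolumeSnapshot",
    r"SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_Msft&Prod_Virtual_Disk",
]


def count_subkeys(path):
    """Return the number of direct subkeys of an HKLM key (Windows only)."""
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        subkeys, _values, _last_modified = winreg.QueryInfoKey(key)
    return subkeys


def suspicious(counts, threshold=100):
    """Keep only the keys with more subkeys than the (assumed) threshold."""
    return {path: n for path, n in counts.items() if n > threshold}


if __name__ == "__main__" and winreg is not None:
    counts = {}
    for path in KEYS:
        try:
            counts[path] = count_subkeys(path)
        except OSError:
            print("could not open:", path)
    for path, n in counts.items():
        print(n, path)
    print("above threshold:", len(suspicious(counts)))
```

This only reads the registry and changes nothing. Run it from an elevated prompt if the keys cannot be opened.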

Answers

Looks like my guess was correct - removing the devices using Device Remover (http://www.pro-it-education.de/software/deviceremover/) worked perfectly. DeviceRemover detected 9500 devices; I deleted them, and the system is now at roughly 350 devices (which is similar to other servers).

The boot-up issue is resolved, but the entries are still being created when backups run.

All replies

I don't have Symantec Backup Exec installed on my Hyper-V host. I checked the registry keys you mentioned; most of them contain only a few entries, and the maximum is around 30 (perhaps because I don't have Backup Exec installed). Also, dureg.exe is a Windows Server 2003 Resource Kit tool, so there is no guarantee that it will work on Windows Server 2008.

In order to isolate the problem, please help me to collect the following information:

1. When does this symptom occur?

2. What exactly is the symptom when the server "locks up"?

Please perform the following tests to check whether the same issue persists:

1. The symptoms occur during server startup (sometimes before logon, sometimes after), never at any other time.

2. Server does not accept any mouse or keyboard input. Can't connect using RDP or RPC. Mouse still moving on screen. Clock in Taskbar is stopped. Keyboard still handling Numlock lights. After about 40-50 Minutes, server continues working, completes starting up. May then run for weeks without issues, until the next reboot with the same symptoms.

A question for you: Do you back up the Hyper-V host you checked using the Hyper-V VSS Writer? I've seen this issue on all our production Hyper-V machines, which are backed up using the Hyper-V VSS Writer, but not on our test machines, which are either not backed up or backed up using imaging software.

Yes, the SYSTEM hive is unusually large, which may cause the startup issue. Judging by the registry keys, the problem occurs when the SAN exposes its LUNs slightly differently. As a result, the system thinks the disk is a different one and adds a new entry for it when the server boots. Generally, under the \scsi branch you should see only 2 or 3 entries. To resolve this, back up and then remove the extra entries from each of the control sets. After that, you can use the following command against the hive to compact and repair it:

chkreg /f system /c /l /r

How To Use CHKREG.EXE To Check A Hive To Determine What Is Taking Up Space

Here are the steps to determine what part of the registry is taking up the most space

1. Get your system hive and place it in c:\bin

2. Get the chkreg.exe utility (please download chkreg.exe from the Skydrive)

3. Run the following command at a cmd prompt

chkreg /f c:\bin\system /d 5 /s>c:\bin\chkreg.txt

The /d 5 switch determines how far down the registry tree is displayed. You may need to increase it, but 5 is usually enough to get you heading in the right direction.

4. In the chkreg.txt file delete everything above the following section

Keys,Values, Cells, Size, SubKeys

1, 2, 6, 2824, ControlSet001\Control\Arbiters\AllocationOrder

What is listed may differ in your hive. Be sure to keep the column headings.

5. Open the chkreg.txt using Excel. Select delimited when prompted, next, then select comma for the delimiter, next, finish

6. Now select all data in the spreadsheet (the upper left cell between A and 1)

7. Click data, sort

8. Sort by size

9. Go to the bottom of the results

10. At this point you need to look at the entries with the largest size. For example:

Keys  Values  Cells  Size     SubKeys
2179  8372    18538  1201808  ControlSet002\Enum\SCSI\Disk&Ven_EMC&Prod_SYMMETRIX&Rev_5567\
2179  8372    18538  1203232  ControlSet003\Enum\SCSI\Disk&Ven_EMC&Prod_SYMMETRIX&Rev_5567\
2685  10600   23443  1482000  ControlSet002\Enum\SCSI\
2685  10600   23443  1483408  ControlSet003\Enum\SCSI\
2939  13773   32787  1566872  ControlSet002\Control\
2939  13773   32787  1567072  ControlSet003\Control\
3383  12975   28936  1764288  ControlSet002\Enum\
3383  12975   28936  1765760  ControlSet003\Enum\
4484  16262   37354  2067568  ControlSet001\
7592  31210   70993  3914608  ControlSet002\

Obviously the ControlSet001 and ControlSet002 rows are going to be large, since each is the total for the whole control set. Moving up the list, you can see that the Enum\SCSI section is quite large. We looked at these branches of the registry and found hundreds of entries for LUNs; this is what was causing the hive to grow so much. If the culprit is not obvious, you may need to increase the /d value to get further into the registry.
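If Excel is not at hand, steps 4 to 10 can also be sketched in a few lines of Python. This is only a rough sketch under some assumptions: that chkreg.txt is comma-delimited with the column headings shown in step 4, and that everything above the headings can be ignored. The function name and the sample text are made up for illustration.

```python
def largest_subtrees(report_text, top=10):
    """Return up to `top` rows as (size, keys, path), largest size first."""
    lines = report_text.splitlines()

    # Step 4: ignore everything above the column headings.
    start = -1
    for i, line in enumerate(lines):
        if line.replace(" ", "").lower().startswith("keys,values,cells,size"):
            start = i
            break

    rows = []
    for line in lines[start + 1:]:
        # Split into at most 5 fields so the key path stays in one piece.
        parts = [p.strip() for p in line.split(",", 4)]
        if len(parts) < 5:
            continue  # blank or unrelated line
        try:
            keys, size = int(parts[0]), int(parts[3])
        except ValueError:
            continue  # not a data row
        rows.append((size, keys, parts[4]))

    # Steps 7-9: sort by size and look at the largest entries.
    rows.sort(reverse=True)
    return rows[:top]


# Sample in the assumed comma-delimited layout, using figures from this post.
SAMPLE = r"""
banner text that step 4 would delete
Keys,Values, Cells, Size, SubKeys
1, 2, 6, 2824, ControlSet001\Control\Arbiters\AllocationOrder
2685, 10600, 23443, 1482000, ControlSet002\Enum\SCSI
3383, 12975, 28936, 1764288, ControlSet002\Enum
4484, 16262, 37354, 2067568, ControlSet001
"""

if __name__ == "__main__":
    for size, keys, path in largest_subtrees(SAMPLE, top=3):
        print(size, keys, path)
```

For a real report, read c:\bin\chkreg.txt into a string and pass it in instead of SAMPLE. The totals for whole control sets will still sort to the top, so look at the rows just below them.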

You mentioned that "the disk entries created are _all_ from the Hyper-V VSS writer."

Did you mean that each time you perform an action related to the Hyper-V VSS Writer, such as taking a snapshot or backing up the VMs using Backup Exec, disk entries are added automatically? If I have misunderstood your concern, please feel free to let me know.

I didn't test with Hyper-V snapshots, but when using BackupExec with the Hyper-V VSS Writer, additional devices are created. I've looked at test machines that are backed up using Windows Server Backup with the Hyper-V VSS Writer enabled, and they have exactly the same symptoms, albeit to a much lesser degree (fewer VHDs, not backed up frequently).

This is why I suspect a problem with the Hyper-V VSS Writer, and unfortunately I don't know anyone who has been running Hyper-V for a long time (6 months to a year) with daily backups.


I also have the same problem. C:\windows\system32\config\SYSTEM is around 130MB. It takes at least 15 minutes to boot. The server is running Server 2008 x64 SP2 and Hyper-V.

We are performing daily backups of the VMs using Backup Exec 12.5 SP3 with the Hyper-V agent (GRT enabled). The HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\DeviceClasses\{53f56307-b6bf-11d0-94f2-00a0c91efb8b} and HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ENUM\VMBUS registry keys contain several thousand (6800) entries.

On the first server, 5 VMs are backed up on a daily basis, also using the verify option. I have a second Hyper-V server on which I back up only 2 VMs every night. On this server C:\windows\system32\config\SYSTEM is only 40MB.

On the Backup server itself both registry keys also contain thousands of entries. On this server the C:\windows\system32\config\SYSTEM is 110MB.

I used DeviceRemover to remove the Msft Virtual Disk SCSI Disk devices but this only removes the entries in the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\DeviceClasses\{53f56307-b6bf-11d0-94f2-00a0c91efb8b} key. The VMBUS registry key still contains lots of entries.

I have exactly the same problem except we are using Windows Server Backup. We are not using any 3rd party software. Our System hive is 343MB and growing! We have over 24,000 devices in the registry! This is happening on two of our servers.

Could Microsoft please treat this as a bug and find the cause and a permanent fix? We are running Windows Server 2008 R2 Datacenter with Hyper-V. WSB was taking incremental backups every 30 minutes, but we have recently changed that to 60 minutes. No other roles or 3rd party software are running on these host machines, which are both Dell T710s (11th generation). Thanks.