Er… problem with HS21 XM (7995) and ESX 3.5

This is a bit of an issue. I’ve just test-installed ESX 3.5 onto an HS21 XM (7995) blade with BIOS v1.07. The install goes fine, and the server boots and runs stably, but every time I reboot from the console or restart using the VI Client I get a purple screen of death.

Now I know that there is an issue with quad-core Xeons and HS21 blades, but wasn’t this fixed with the latest BIOS versions? I believe it was fixed with BIOS 1.06 on the normal HS21, but was the same fix applied to the HS21 XM (7995) in v1.07?

IBM and VMware support tickets have been opened, but are there any working fixes out there?

About Hugo Phan

Discussion

8 thoughts on “Er… problem with HS21 XM (7995) and ESX 3.5”

Make sure the processor stepping levels are matched on the 2 CPUs. Check the VPD in the BIOS for the stepping levels. There is a known problem with 1.07b (the one with a December release date) and stepping levels with Red Hat. It would make sense if it bleeds over into VMware, since there is some Red Hat in there. Check my site for more information on the bug in the BIOS: www.bladevault.info. Aaron

Aaron, the customer has 1.07, dated 31/01/2008. I will ask the customer to check the CPU settings on Monday. I have a theory which I will test this week: would this problem go away if ESX 3i is used, since there is no Service Console? I will let you know. If it works, I’ll have to persuade them to use 3i on the blades. The other thought I had was to remove all of the CPUs in Socket 2 from the 8 blades, and also remove the CPUs in Socket 1 from the second set of blades and install these into Socket 2 of the first set of blades. The remaining CPUs, which were all in Socket 2 of the blades, would then be installed into the second set of blades. That would hopefully alleviate the processor mismatch… Hugo

ESXi is just one of the solutions to this problem. It appears that the problems I was experiencing were not due to the 2 x quad-core CPUs but were down to the USB KVM module that is loaded by ESX. To get around this issue on ESX 3.5 build 64607, run chkconfig gpm off, then reboot. If you are running ESX 3.5 Update 1 or ESXi, you will not get this issue.
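For anyone wanting to script the workaround above, here is a rough sketch of how it might look on the Service Console. The function name disable_gpm and the RUN variable are made up for illustration, and the extra steps (listing the runlevels and stopping the running daemon before disabling it) are my assumption that the usual Red Hat chkconfig and service tools are available; only chkconfig gpm off followed by a reboot comes from the fix described above.

```shell
# Hypothetical helper: disable the gpm (console mouse) service.
# RUN is an assumed dry-run switch: RUN=echo prints the commands
# instead of executing them; leave it unset to run them for real.
disable_gpm() {
    # Show the runlevels gpm is currently enabled for
    ${RUN:-} chkconfig --list gpm
    # Stop the running daemon (assumption: not strictly required before the reboot)
    ${RUN:-} service gpm stop
    # Keep gpm from starting at boot; this is the actual workaround
    ${RUN:-} chkconfig gpm off
}
```

Run disable_gpm as root and then reboot the host yourself. RUN=echo disable_gpm just prints the three commands, so you can check them first.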

We’re running a fully up-to-date build of ESX 3.5 on our SuperMicro barebones servers and still get a purple screen error on shutdown. We believe the problem is caused by an interaction between gpm and the IPMI card with USB KVM functionality. As in your case, our fix was simply to turn off gpm.