ML350 G3 won't boot

Hi there

I am having loads of weekend fun with an ML350 G3 that up until yesterday was working great.

Here's the sequence of events.

1. Had an existing 36GB drive in place and we wanted to put a new Smart Array Controller in and create a RAID 5 array. Done it a million times, but alas on our millionth attempt things went wrong. To cut a long story short, we've decided we just want to go back to the way things were (when things were working, that is), so...

2. Removed the array card and booted up. The original hard disk appears dead: it has a constant arrow and refuses to boot. Troubleshot that for some time and have resigned ourselves to the fact that the gods have decided we need a new hard disk, so...

3. Got another disk into that slot and we get an error that the SCSI cable is faulty. Seems reasonable; after all, who wants to relax on weekends anyway? So I've replaced the SCSI cable and NOW...

4. I get the following (in this order)

Proliant Welcome Screen
Backup ROM info blah blah
Press F8 for Multi-Initiator (played around with these settings a lot and got nowhere)
SCSI BIOS is not installed! OR SCSI BIOS installed successfully! (the latter happens more frequently now)
Then it goes straight to HP Ethernet and tries to boot from LAN, and then the system freezes

At no point do I get to boot from CD (though it sounds like the drive spins up)
After the HP Ethernet screen appears, the server usually freezes, i.e. no keyboard status lights and Ctrl-Alt-Del does nothing, leaving me the only option of switching off and back on.

I've tried a different CD drive, thinking it might be a fault with that, so I could at least load SmartStart. No dice.

When I put the Smart Array controller back in I get mixed results, ultimately though I get the BIOS Installed / Not Installed message followed by a crash - sometimes I don't even get that far.

I've tried clearing the CMOS, removed the CMOS battery and set the switch (#6) to on to reset everything back to defaults.

I've tried 3 different hard disks.

I'm thinking either Back Plane or Motherboard...

Mission critical server! (Ironically, this is why I was setting up the RAID this weekend anyway.) So help is greatly appreciated.

As I understand it, you started with a single hard drive prior to installing the RAID controller. It is just possible that your original hard drive is still OK and that the problem is somewhere on the SCSI bus. Is the original hard drive an 80-pin one or a 68-pin one?
You could also try disabling the onboard SCSI when the RAID controller is installed and see if the RAID controller goes through its BIOS routine then. The intermittent message about the SCSI BIOS being installed is a bit weird; normally you see this message when the controller detects a logical drive attached to it. When there is no logical drive, the SCSI BIOS is not installed.

Hi, it sounds a bit like your drives may not be terminated properly. Do you have a SCSI cable with a built-in terminator on the end? I probably need to know a bit more about the server: does it have a hot-swap backplane, or are the drives connected straight onto the cable? Could there be a bent pin somewhere between the motherboard and the hard drive(s)?
Regards
dave

Does the ML350 have a system config utility? Some servers have a utility to assign IRQs to devices and to ensure there are no conflicts. Your symptoms sound a bit like a resource conflict, i.e. two devices sharing the same interrupt, or, as I said before, a termination issue.

makingithappen (Author) commented: 2008-11-15

Heading back in today to check it out. I'll keep you posted!

makingithappen (Author) commented: 2008-11-15

Hi dmbgo - how do I disable the onboard SCSI? I can't see an option. Basically, the biggest problem with troubleshooting this server is that I can't get into any BIOS configuration at all.

Here's where I'm at otherwise:

Motherboard

Questionable (also includes onboard SCSI which is where the hang generally occurs)

Back Plane

Thought it was the back plane, but when I took the back plane out of the boot sequence (i.e. removed the SCSI cable) the server still hung.

RAM

Removed and replaced, still failed.

Hard Disks

Server still won't boot even with the hard disks detached. I assume I should still get a boot sequence and access to the BIOS with no hard disks present or with faulty hard disks.

Cable

Replaced with a brand new SCSI cable. No go.

CPU

Questionable but doubtful.

Other steps taken:

Removed the CMOS battery and rebooted. It still boots without CMOS because there is a redundant ROM. Anyone got any ideas on how I can wipe this?
Repeatedly tried clearing the configuration with switch #6 on the board, but now it doesn't seem to detect it and ask me to reboot like it did in the past.

makingithappen (Author) commented: 2008-11-15

I should add to this that at one stage I was able to boot from CD into Acronis True Image (to restore our backup), but on analyzing disks Acronis reported a bad sector on the hard disk. Since the disk is also new, I can only presume a motherboard/controller issue here...

makingithappen (Author) commented: 2008-11-16

Thanks to all of you who helped over the weekend. I believe this is a faulty motherboard, so I bought a replacement box off eBay. Hopefully next weekend I'll spend more time troubleshooting how to fit more beer into my fridge rather than fixing servers. I'll award some thank-you points.