DL380G2 Smart Array 5i Disk Problems

Hi,

I have a DL380G2 with an integrated Smart Array 5i RAID controller. I have 6x18.2GB Ultra 3 Wide SCSI disks configured as two in RAID 1, three in RAID 5 and one shared online spare.

The system shows failures in some of the drives but is limping along. One of the RAID 1 pair has failed, one of the RAID 5 trio has failed and the spare has failed. I have tried replacing the failed drives to no effect. I can't believe that the replacements are also faulty, or that I have just been lucky that the drives flagged as failed have not crashed the system, so it suggests to me that there is some sort of configuration problem.

Re: DL380G2 Smart Array 5i Disk Problems

Dave,

As Karlo suggests, run ADU and generate a report file. This can be analysed if you attach it to your reply.

There are many instances where replacement disks refuse to rebuild due to a hard read failure on one of the existing RAID members that may still be operational. Because the read error prevents the rebuild, a backup and restore will be necessary once the faulty drive(s) have been identified.
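
To illustrate why a single unreadable sector on a surviving member can block a rebuild, here is a minimal Python sketch of RAID 5 style XOR reconstruction. It is a toy model only and makes no claim about how the Smart Array 5i firmware actually works; the names `xor_blocks` and `rebuild_stripe` are made up for illustration, and `None` stands in for a block that returns an uncorrectable hard read error.

```python
from functools import reduce
from typing import Optional, Sequence

# None models a block that cannot be read (an uncorrectable hard read error).
Block = Optional[bytes]


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_stripe(surviving: Sequence[Block]) -> bytes:
    """Reconstruct the missing member's block for one stripe.

    Every surviving block (data and parity) must be readable, because the
    missing block is the XOR of all the others.
    """
    if any(block is None for block in surviving):
        raise IOError("hard read error on a surviving member; stripe cannot be rebuilt")
    return xor_blocks(surviving)


# Hypothetical three-member stripe: two data blocks plus parity.
d0 = b"\x01\x02"
d1 = b"\x10\x20"
parity = xor_blocks([d0, d1])

# The member holding d1 is replaced: d1 comes back from d0 and parity.
recovered = rebuild_stripe([d0, parity])

# If d0 is also unreadable, nothing is left to recover d1 from.
try:
    rebuild_stripe([None, parity])
    rebuild_blocked = False
except IOError:
    rebuild_blocked = True
```

In this sketch the rebuild of the replaced member succeeds only while every other member returns good data for the stripe, which is why a marginal drive that still appears operational can stop a replacement from rebuilding.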

The ADU report holds the answer so please run ADU and submit the report.

Re: DL380G2 Smart Array 5i Disk Problems

Hi folks,

Thanks a lot for the replies. I have attached the ADU zip file. It doesn't tell me much other than that the array can't talk to the three drives, but is there anything else in there that others can make sense of?

Really appreciate you taking the time to have a look.

Is it possible that one drive can be affecting the others? I have been swapping them around again and seemed to be getting somewhere until I installed the last drive (Address 0), when the same three failed again. Maybe coincidence, I guess?

Re: DL380G2 Smart Array 5i Disk Problems

Hi Dave,

I've tried a few times today to reply with an attachment of the analysed report, but the HP system just hangs if I have an attachment. I will reply in full when this is rectified; however, your report shows clearly that the drive at SCSI ID 2 has very serious problems and will certainly prevent other drives from rebuilding.

It also has some weird characteristics; for example, the service time value (in minutes) is abnormal. This drive has many uncorrectable hard read errors, among many other error types.

Once I have the chance I will attach the analysed report, which shows this more clearly.

Re: DL380G2 Smart Array 5i Disk Problems

Hi Brian,

Thanks a lot for looking at this. Things have got much worse now. I've spent most of the weekend trying to get to the bottom of the problem, with the system running with no redundancy, i.e., 1/2 of the RAID 1 system mirror and 2/3 of the RAID 5 data array running. I tried various combinations of drives until DISASTER: the other half of the system mirror failed!

I've lost about a day's worth of data which I hadn't got backed up. A real pain, but not "mission critical".

I tried rebuilding the machine from scratch with a RAID 1 + spare system array and things seemed to be going OK until about the 5th reboot when the system failed again.

I'm starting to suspect that the RAID controller is goosed. Either that or I have wrecked about half a dozen disks over the weekend! I can't get the system to work reliably now with even three disks; I'm getting loads of hard orange LEDs on the drives at various times.

I am keen on seeing your ADU analysis when the system is accepting uploads again. Do you think that the disk 2 errors could be caused by a bad array controller rather than a bad disk?

Re: DL380G2 Smart Array 5i Disk Problems

Hi,

Just closing this out. Thanks for the help with this one.

Brian, thanks for taking the time to look at the fault report. Much appreciated.

As suggested by Mark, the critical fault was with the SCSI backplane. Pretty surprising (to me anyway), but I've swapped the part and the machine is up again. I guess the first errors that I saw were the board starting to fail until it gave up completely, probably not helped by the one really bad drive as diagnosed by Brian.