Hi everyone...
We have systems with a Supermicro 370LE motherboard (2x Pentium III
1 GHz), each with a 20 GB IDE system disk (primary IDE master) and a
CD-ROM (primary IDE slave).
On about six of our 136 nodes we have seen errors like the following:
wait_on_bh, CPU 1:
irq: 0 [0 0]
bh: 1 [1 0]
<[c010c289]> <[c0179d91]> <[c017edfb]> <[c01533d4]> <[c0138148]>
hda: status timeout: status=0x90 { Busy }
hda: drive not ready for command
ide0: reset timed-out, status=0x90
hda: status timeout: status=0x90 { Busy }
hda: drive not ready for command
ide0: reset timed-out, status=0x90
hda: status timeout: status=0x90 { Busy }
end_request: I/O error, dev 03:01 (hda), sector 3678384
hda: drive not ready for command
EXT2-fs error (device ide0(3,1)): ext2_write_inode: unable to read inode block - inode=304663, block=622597
We are currently running the 2.2.19-6.2.1 kernel as shipped by Red Hat.
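
For anyone who wants to read the numbers in the log above, here is a
minimal sketch that decodes the status byte and the device pair. It
assumes the standard ATA status register bit layout; the bit labels
and the helper names (decode_status, decode_dev) are my own choices
for illustration, not taken from the kernel source. Note that while
the Busy bit is set the other bits are not meaningful, which is
presumably why the kernel prints only { Busy } for 0x90.

```python
# Sketch only: assumes the standard ATA status register layout.
# Labels and function names are illustrative, not from the kernel source.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def decode_dev(dev):
    """Split a 'major:minor' hex pair as printed in the end_request line."""
    major, minor = (int(part, 16) for part in dev.split(":"))
    return major, minor


if __name__ == "__main__":
    print(decode_status(0x90))
    print(decode_dev("03:01"))
```

The device pair 03:01 decodes to major 3, minor 1, which matches the
ide0(3,1) in the EXT2-fs error line.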
----------------------------------------------------
Whenever I have seen errors like this before, they have meant
a hardware fault in the disk. But on each of these nodes, we just
reboot the system, it runs fsck on the system disk, and
everything is fine again.
Can anyone give me a clue as to:
1) How errors that include an I/O error could mean anything
other than a hardware error on the disk?
2) What may be causing these errors?
3) What resources are out there on the net for IDE FAQs on Linux?
4) If we go to 2.4 kernels, is it likely to get better?
Thanks
Steve Timm
------------------------------------------------------------------
Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Division/Operating Systems Support
Scientific Computing Support Group--Computing Farms Operations