One node has developed a "slow disk", although at this point I'm not
at all sure that the disk is actually at fault. Have any of you ever
seen something like this:
1. hdparm -t -T /dev/hda on the slow node is typically:
Timing cached reads: 504 MB in 2.00 seconds = 251.61 MB/sec
Timing buffered disk reads: 104 MB in 3.55 seconds = 29.29 MB/sec
but the second line varies A LOT, from as low as 24 MB/sec to over
30 MB/sec. However, the same test on the other nodes varies only a little:
Timing cached reads: 514 MB in 2.01 seconds = 256.03 MB/sec
Timing buffered disk reads: 124 MB in 3.04 seconds = 40.74 MB/sec
(plus or minus about 1 MB/sec on the second line).
2. hdparm -i -v and -i -m are identical on all machines, except
for serial number. Here is -i -v on the slow one:
/dev/hda:
multcount = 16 (on)
IO_support = 1 (32-bit)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 256 (on)
geometry = 65535/16/63, sectors = 78165360, start = 0
Model=WDC WD400BB-00DEA0, FwRev=05.03E05, SerialNo=WD-WMAD11736294
Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
RawCHS=16383/16/63, TrkSize=57600, SectSize=600, ECCbytes=40
BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=78165360
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 udma3 udma4 *udma5
AdvancedPM=no WriteCache=enabled
Drive conforms to: Unspecified: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3
ATA/ATAPI-4 ATA/ATAPI-5
3. Western Digital diagnostics and smartctl show nothing wrong with
the disk. It has no bad sectors or other errors logged. The smartctl
tests do take longer to complete than on the other disks.
4. DMA is working (at least partially), since turning that off drops
the hdparm test down to about 4 MB/sec.
5. Opened the case and all jumpers were as they should be. The power
supply tested good.
6. dmesg from the slow node and a normal node shows no significant
differences. (Changes in the 3rd digit after the decimal of the
bogoMIPS value, for instance.)
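To put numbers on the run-to-run variation described in item 1, the
output of repeated hdparm -t runs could be summarized with something like
the sketch below. It is only an illustration: the names buffered_rates
and summarize are made up here, and the pattern assumes the "Timing
buffered disk reads" line has the same layout as the lines quoted above.

```python
import re
import statistics

# Matches the "buffered disk reads" line as quoted in this post.
# Assumption: other hdparm versions may format this line differently.
BUFFERED_RE = re.compile(
    r"Timing buffered disk reads:\s+\d+ MB in\s+[\d.]+ seconds\s+=\s+([\d.]+) MB/sec"
)


def buffered_rates(hdparm_output):
    """Return every buffered-read rate (MB/sec) found in the given text."""
    return [float(m.group(1)) for m in BUFFERED_RE.finditer(hdparm_output)]


def summarize(rates):
    """Return (min, max, mean, sample stdev) for a non-empty list of rates."""
    spread = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return min(rates), max(rates), statistics.mean(rates), spread
```

Feeding it the concatenated text of, say, ten runs from the slow node and
ten from a normal node would give a min/max/stdev for each, which is a
little more precise than "varies A LOT" versus "varies only a little".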
The only oddity I've found was that the setting "32 bit I/O" in the BIOS
was disabled for some reason on the slow node. Changing it to enabled
made no difference, even after several reboots, cold and warm. Is it
possible that the OS has the earlier setting hidden away somewhere and
is still using it? This was particularly weird because hdparm showed
IO_support = 1 (32-bit)
even when this BIOS bit was disabled.
The speed issue initially turned up in a run where a certain program was
required to allocate about 1.6 GB of memory (at least 0.6 GB of which had
to come out of the 2 GB swap, since there was only 1 GB of RAM). That
large region was then ordered with qsort(). Bizarrely, this took
forever on one node (hours longer than on any other node with similarly
sized data), and when it finished the sort the resulting binary data was
written to disk at only 0.5 MB/sec. Yes, 500 kilobytes/sec. Nothing
else was using CPU time. Prior to that one run this node did
nothing to draw my attention to it as a "slow node".
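For comparing raw write speed between nodes without rerunning the whole
sort, a small timing test along the following lines might do. This is a
rough sketch, not the program from the run above: write_throughput is a
hypothetical helper, it counts 1 MB as 2**20 bytes, and it writes zeros
sequentially and calls fsync so the page cache does not hide the disk.

```python
import os
import time


def write_throughput(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes of zeros to path, fsync, and return MB/sec.

    Assumption: 1 MB here means 2**20 bytes.
    """
    chunk = b"\0" * chunk_bytes
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # force the data out to the device
    # Guard against a zero interval on very small writes.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / (1024 * 1024) / elapsed
```

Running it with a file of a few hundred MB on the slow node and on a
normal node, on an otherwise idle machine, should show whether plain
sequential writes are also slow or only writes made under heavy swapping.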
This is on one of the notorious Tyan S2466 boards. I'm beginning to
wonder if perhaps it now has a bit stuck somewhere in the BIOS, in which
case maybe wiping the BIOS settings and redoing them will fix it.
I already tried powering it off for 15 minutes unplugged, but that
did not help.
Also, the whole cluster had to be powered down for about 15 minutes for
A/C service on the morning before this started. If the battery
on that board is iffy, it might explain how the "32 bit I/O" became
disabled. However, it did not reset again on a subsequent long power down.
Thanks,
David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech