I have seen references to changing the
kernel I/O scheduler at boot time... not sure if it applies to RHEL 3.0,
or whether it will help, but try setting 'elevator=deadline' on the
kernel command line at boot or via grub.conf. Have you tried running a
simple 'dd' on the LUN? The drives are in a RAID10 configuration, right?
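For what it's worth, a minimal sketch of both suggestions; the kernel
version, root device, and LUN device below are placeholders, so adjust
them to your setup:

    # /boot/grub/grub.conf: append elevator=deadline to the kernel line
    kernel /vmlinuz-2.4.21-15.ELsmp ro root=/dev/sda1 elevator=deadline

    # rough sequential-read test on the LUN (assuming it is /dev/sdb):
    # read 1 GB straight off the raw device and time it for throughput
    time dd if=/dev/sdb of=/dev/null bs=1M count=1024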

Have you tried a different kernel?
We run with a NetApp over NFS without any issues, but we have seen high
I/O wait on other Dell boxes (both running and not running postgres) under
RHES 3. We replaced a Dell PowerEdge 350 running RH 7.3 with a PE750 with
more memory running RHES 3, and it gets bogged down with I/O waits just
from syslog writing messages to disk; the old, slower server handled the
same load fine. I don't know if it is a Dell thing or a RH kernel thing,
but we try different kernels on our boxes to find one that works better.
We have not found one that consistently stands out over another, but we
have been moving away from the Update 2 kernel (2.4.21-15.ELsmp) due to
server lockup issues. Unfortunately, we get the best disk throughput on
our few remaining 7.3 boxes.

The behavior we see is that when running queries that do random reads on
disk, iowait goes over 80% and actual disk I/O falls to a crawl, with
throughput below 3000 kB/s (we usually average 40000 kB/s to 80000 kB/s
on sequential read operations against the NetApps).
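In case it helps anyone reproduce this, we watch it roughly like so (the
2-second interval is arbitrary; iostat comes from the sysstat package,
and the 'wa' column assumes your procps/kernel has iowait accounting):

    # 'wa' column = % CPU time stalled on I/O; bi/bo = blocks in/out per second
    vmstat 2

    # per-device throughput in kB/s, refreshed every 2 seconds
    iostat -k 2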

The stats on the NetApp confirm that it is sitting idle. Doing an strace
on the PostgreSQL process shows that it is doing seeks and reads.
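For reference, this is roughly how we traced it (1234 stands in for the
pid of the backend running the query; on 32-bit boxes the seeks may show
up as _llseek rather than lseek):

    # attach to the running backend; -T shows time spent in each syscall,
    # and the filter limits output to the file I/O calls of interest
    strace -T -e trace=read,lseek -p 1234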

So my question is: where is this iowait time spent?
Is there a way to pinpoint the problem in more detail?
We are able to reproduce this behavior with PostgreSQL 7.4.8 and 8.0.3.

I have included the output of top, vmstat, and strace from System B, and
sysstat from the NetApp, captured while running a single query that
generates this behavior.