Over the last few weeks, three of our secondary (log-shipped) DBs have been marked 'Suspect', requiring a drop and restore. I've been advised to check the I/O and try to fault-find.

What practices/native tools exist for SS2K to get started on the investigation? BTW, if the initial diagnosis involves creating non-temp tables/objects, I'd rather avoid that, as even slight changes mean having to raise an RFC.

Also, would you recommend checking I/O on both the Primary and Secondary servers?

I've done some Perfmon analysis during the roughly 100 seconds in which log shipping runs (every 15 minutes, on the hour). So far I've only captured the logical disk (physical tomorrow), but the results for the W: drive, to which the logs are copied and from which they're restored, are as follows; I presume the values are milliseconds:

GilaMonster (4/12/2012): Perfmon is not the place to look; you don't have disk performance problems, you have disk stability problems.

Agreed, but I don't have many immediate avenues of investigation left, so I was reaching. The event log (app/system) showed nothing suspicious around or immediately before the initial failure. We don't have the SAN/RAID guys in until Monday, and stopping the SQL service, even temporarily on the Secondary, will require a bunch of form-filling. OK, actually swapping the disk out is not a lengthy procedure, but I need to make a business case for the switch, and for that I need proof that the disk is not quite stable.
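One thing worth noting for the business case: Perfmon's `Avg. Disk sec/Read` and `Avg. Disk sec/Write` counters are recorded in seconds, not milliseconds, so a value of 0.025 is 25 ms. As a sketch of summarising a counter log exported to CSV (e.g. via `relog`); the server name, column layout and sample values below are invented for illustration:

```python
import csv
import io
import statistics

# Hypothetical excerpt of a Perfmon counter log exported to CSV;
# the header follows Perfmon's export layout, but the values are invented.
SAMPLE = '''"(PDH-CSV 4.0)","\\\\SERVER\\LogicalDisk(W:)\\Avg. Disk sec/Read","\\\\SERVER\\LogicalDisk(W:)\\Avg. Disk sec/Write"
"04/12/2012 09:00:01","0.004","0.012"
"04/12/2012 09:00:16","0.006","0.250"
"04/12/2012 09:00:31","0.005","0.031"
'''

def summarize(csv_text):
    """Return {counter: (avg_ms, max_ms)}; Perfmon stores these counters in seconds."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cols = {name: [] for name in header[1:]}  # skip the timestamp column
    for row in reader:
        for name, value in zip(header[1:], row[1:]):
            cols[name].append(float(value) * 1000.0)  # seconds -> milliseconds
    return {name: (statistics.mean(v), max(v)) for name, v in cols.items()}

for counter, (avg_ms, max_ms) in summarize(SAMPLE).items():
    print(f"{counter}: avg {avg_ms:.1f} ms, max {max_ms:.1f} ms")
```

Sustained spikes in the max (hundreds of ms, or dropouts where the counter stops logging entirely) would be the sort of evidence that points at stability rather than load.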

Zip. The app log filled up with informational noise and doesn't stretch back that far. However, I DID check it on the morning in question (the 11th) and found nothing. The only other 'critical' error was in the System log, a Virtual Disk Service error, about 8 hours before and after the restore job failed:
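To pin those VDS entries down relative to the failed restore, filtering a CSV export of the System log is enough; the column layout, sample rows, assumed failure time and the 9-hour window below are all illustrative assumptions:

```python
import csv
import io
from datetime import datetime, timedelta

# Hypothetical CSV export of the System event log (Event Viewer -> Save As);
# the columns and rows are invented for illustration.
SAMPLE = '''Date,Time,Source,Type,EventID,Description
11/04/2012,01:12:45,Virtual Disk Service,Error,10,Unexpected failure...
11/04/2012,06:30:02,Service Control Manager,Information,7036,...
11/04/2012,17:20:10,Virtual Disk Service,Error,10,Unexpected failure...
'''

def vds_errors_near(csv_text, failure_time, window_hours=9):
    """Return (timestamp, type, event id) rows from the Virtual Disk Service
    source that fall within +/- window_hours of the restore failure."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stamp = datetime.strptime(row["Date"] + " " + row["Time"],
                                  "%d/%m/%Y %H:%M:%S")
        if (row["Source"] == "Virtual Disk Service"
                and abs(stamp - failure_time) <= timedelta(hours=window_hours)):
            hits.append((stamp, row["Type"], row["EventID"]))
    return hits

failure = datetime(2012, 4, 11, 9, 15)  # assumed time the restore job failed
for stamp, kind, event_id in vds_errors_near(SAMPLE, failure):
    print(stamp, kind, event_id)
```

If VDS errors cluster around every failed restore rather than appearing at random, that correlation is the kind of proof the SAN/RAID team (and the RFC) would want to see.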