Strange issue. I have finally got Gentoo installed with everything I require, and I notice that mdadm has removed 2 partitions from one of my raid arrays. Being more than just a little pissed off, I downloaded some diag tools from the HDD manufacturer's site and tested the drives; all came back clean, even after full SMART scans and a media test. I couldn't test one drive as it is too new for the DOS version, and the only other option was a Windows version, and I wasn't going to install Windows just for that, so I have tested the remaining drive using smartmontools under Gentoo and even that says it is fine.

I have re-added the missing partitions to the array and changed all the cables for brand new ones on all the drives, but if it was a drive or cable problem, why did it only affect one partition on each of two separate drives, and why weren't more partitions missing from the other arrays?
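In case it is useful to anyone, re-adding a dropped partition can be done roughly like this; the md number and partition below are placeholders for the real ones:

Code:

# tell md to take the dropped partition back into the array
mdadm /dev/md2 --re-add /dev/sdb3
# watch the rebuild progress
cat /proc/mdstat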

Should I be concerned about this?

_________________
I know 43 ways to kill with a SKITTLE, so taste my rainbow bitch.

In this instance, the drive failed to relocate a bad sector in time; the kernel got fed up waiting and kicked the underlying block device /dev/sda3 out of the array.
It was a 5 element raid 5, so it dropped to degraded mode on 4 drives. The other raids, using sda1 and sda2, kept going.
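If you want to see what state the set is in and which member was kicked, something like the following shows it (md2 is just an example, use your own arrays):

Code:

# lists each member as active, faulty or removed
mdadm --detail /dev/md2
# a degraded 5 element set shows up as [5/4] with a _ in the [UUUU_] status
cat /proc/mdstat
# the kick is normally preceded by ata/sector errors in the kernel log
dmesg | grep -Ei 'md|ata'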

Run a check on your raid sets

Code:

echo check > /sys/block/md2/md/sync_action

Change md2 to whichever mdX you want to check. This verifies that the raid's redundant data is valid everywhere, even in space the filesystem has not used yet.
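One way to follow the check and read its result when it finishes; substitute your own mdX again:

Code:

# shows the check progress as a percentage
cat /proc/mdstat
# after the check completes, 0 here means all the redundant data matched
cat /sys/block/md2/md/mismatch_cnt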

The above sample is good - no reallocated sectors and none pending. Reallocated sectors are mostly harmless; that's how the drive never seems to have any bad blocks. When the drive struggles to read a sector, it's rewritten to a spare, which is good when it works.

This example is from a less healthy drive. In fact, I've just replaced it in my desktop raid.
What happened was that another drive was kicked out of the set, and during the rebuild, this drive was kicked out too. That's a really bad thing, as I then had a raid5 missing two drives.
ddrescue imaged the dud above onto a new drive, all except 58 sectors. I was able to determine that the unread sectors were in /usr somewhere and I'm guessing they were unused as they were not in the Current_Pending_Sector count, which is only 14.
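For anyone wanting to do the same, a GNU ddrescue run might look roughly like this; the device names and map file are placeholders, and you want to be very sure which drive is the source and which is the new one:

Code:

# first pass, copy everything that reads easily and skip the difficult areas
ddrescue -f -n /dev/old_disk /dev/new_disk rescue.map
# second pass, go back and retry the bad areas a few times
ddrescue -f -d -r3 /dev/old_disk /dev/new_disk rescue.map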

A write to the dud sectors will force them to be relocated, but do I really trust a drive that looks like it's slowly dying?
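If you do decide to force the rewrite, one way is hdparm's sector commands. Treat this as a sketch only: the sector number is a placeholder, the write destroys whatever was in that sector, and you want to be certain of the device and sector number first:

Code:

# confirm the sector really is unreadable
hdparm --read-sector 1234567 /dev/sdb
# overwrite it so the drive remaps it to a spare
hdparm --yes-i-know-what-i-am-doing --write-sector 1234567 /dev/sdb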

_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

From memory it was saying buffer I/O error. Looking at the smartctl output, there are 0 bad sectors and none pending correction or needing to be swapped out, and the long test with smartmontools has passed on all drives. I am wondering if it was a cable issue; I replaced all of them with brand new ones and haven't seen any issues so far, so I will keep an eye on it.
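For what it's worth, the counters and self-test results mentioned above were read with smartmontools along these lines (sda is just an example device):

Code:

# the reallocated / pending / uncorrectable counters
smartctl -A /dev/sda | grep -Ei 'reallocated|pending|uncorrect'
# start a long self-test, then read the results once it has finished
smartctl -t long /dev/sda
smartctl -l selftest /dev/sda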

Am running the test you suggested now and will leave the PC for a while.

Code:

[12956.221875] md: data-check of RAID array md3
[12956.221880] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[12956.221882] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
[12956.221888] md: using 128k window, over a total of 943010816k.
[13245.715051] md: delaying data-check of md0 until md3 has finished (they share one or more physical units)
[13249.427192] md: delaying data-check of md1 until md3 has finished (they share one or more physical units)
[13253.899782] md: delaying data-check of md2 until md3 has finished (they share one or more physical units)

/proc/mdstat will tell you about progress. If you use the system, the check will take longer, as it will get out of the way to let you read/write your data.
mdadm should have emailed you about the issue, if you had it set up and running.
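Setting that up is only a couple of steps. A minimal sketch; the mail address is a placeholder, and the config file may live at /etc/mdadm.conf or /etc/mdadm/mdadm.conf depending on the install:

Code:

# in mdadm.conf
MAILADDR you@example.com

# send a test mail for every array, then leave the monitor running
mdadm --monitor --scan --oneshot --test
mdadm --monitor --scan --daemonise

If I remember rightly, the mdadm init script on Gentoo will run the monitor for you instead of the --daemonise line, if you add it to the default runlevel.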

_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

I haven't got that part of it set up yet; I had only just got Gentoo on there and working when mdadm removed, or something happened to remove, 2 of the 4 partitions (which were on separate drives). None of the partitions of the other raid arrays were touched, so I am at a loss. I will leave the other machine for a few hours and see what happens.

_________________
I know 43 ways to kill with a SKITTLE, so taste my rainbow bitch.