[SOLVED] RAID 5 array not assembling all 3 devices on boot using MDADM, one is degraded.



I have been having this problem for the past couple days and have done my best to solve it, but to no avail.

I am using mdadm, which I'm not the most experienced with, to run a RAID 5 array across three separate disks (/dev/sda, /dev/sdc, /dev/sdd). For some reason not all three drives are assembled at boot, but I can re-add the missing drive later without any problems; it's just that the resync takes hours.

As you can see, the state is clean, degraded.
After running:
sudo mdadm --add /dev/md0 /dev/sda
it goes to a clean state where all drives are active; it's just that the re-added drive disappears again after a reboot and I have to do the whole process over.
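
For reference, I've been checking the state with something like this (md0 is my array):

Code:

# kernel view of the array; a degraded 3-disk RAID 5 shows something
# like [UU_] instead of [UUU]
cat /proc/mdstat
# detailed view; this is where the "clean, degraded" state line comes from
sudo mdadm --detail /dev/md0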

There is also another config file at /etc/default/mdadm, and I noticed that /etc/init.d/mdadm-raid has a variable called DEBIANCONFIG that points to it. I tried pointing this "default" config at "my" config, but that didn't work either.

Anyway, here is that config file just in case
/etc/default/mdadm

Code:

# mdadm Debian configuration
#
# You can run 'dpkg-reconfigure mdadm' to modify the values in this file, if
# you want. You can also change the values here and changes will be preserved.
# Do note that only the values are preserved; the rest of the file is
# rewritten.
#
# INITRDSTART:
# list of arrays (or 'all') to start automatically when the initial ramdisk
# loads. This list *must* include the array holding your root filesystem. Use
# 'none' to prevent any array from being started from the initial ramdisk.
INITRDSTART='none'
# AUTOSTART:
# should mdadm start arrays listed in /etc/mdadm/mdadm.conf automatically
# during boot?
AUTOSTART=true
# AUTOCHECK:
# should mdadm run periodic redundancy checks over your arrays? See
# /etc/cron.d/mdadm.
AUTOCHECK=true
# START_DAEMON:
# should mdadm start the MD monitoring daemon during boot?
START_DAEMON=true
# DAEMON_OPTIONS:
# additional options to pass to the daemon.
DAEMON_OPTIONS="--syslog"
# VERBOSE:
# if this variable is set to true, mdadm will be a little more verbose e.g.
# when creating the initramfs.
VERBOSE=false
# MAIL_TO:
# this variable is now managed in /etc/mdadm/mdadm.conf (MAILADDR).
# Please see mdadm.conf(5).
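
For completeness, my understanding (and I may be wrong, since I'm new to mdadm) is that the arrays assembled at boot come from the ARRAY lines in /etc/mdadm/mdadm.conf, and this prints the line my array would need:

Code:

# print an ARRAY line describing the currently running array;
# this is what belongs in /etc/mdadm/mdadm.conf
sudo mdadm --detail --scan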

And finally, the most important part: the dmesg output. Hopefully you guys can make sense of it.

You're using whole-device RAID rather than partitions, which is great, but I would've thought the three disks would therefore look identical: sda, sdc and sdd. Yet the output from dmesg doesn't appear to reflect that.
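
A quick way to compare them would be something like this (skipping sdb, which I assume is the system disk):

Code:

# partition tables and sizes of the three members, side by side
sudo fdisk -l /dev/sda /dev/sdc /dev/sdd
# md superblocks on each member; array UUIDs and event counts should agree
sudo mdadm --examine /dev/sda /dev/sdc /dev/sdd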

What is the physical hardware of the four disks? I see only the first two SATA disks in the dmesg output...

Have you tried changing your config file so that mdadm does not start during boot? Western Digital makes it very clear that their "normal" drives do not support RAID; they want you to spend considerably more money on their "RAID Edition" drives. The problem is that some of those drives take a while to respond, and hardware RAID engines (and possibly mdadm) see that lag as a drive failure.
It may be possible to start mdadm manually after your system is up and the drives have had a few minutes to stabilize. It may also help to use the "verbose" option so that more detailed info is presented; that could help with troubleshooting.
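
For example, something like this once the system is up; the device list is taken from the first post:

Code:

# stop whatever was partially assembled at boot
sudo mdadm --stop /dev/md0
# reassemble by hand, printing what mdadm decides about each member
sudo mdadm --assemble --verbose /dev/md0 /dev/sda /dev/sdc /dev/sdd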

Putting in a delay with something like "sleep 20" in the startup file that starts the RAID would address the "drive not ready" issue mentioned above; it might be worth trying. I have a delay in rc.local; not sure if that's the same in Debian. I think you'll need to put it in /etc/init.d/mdadm-raid.
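
Something like this near the top of the script is what I have in mind; the 20 seconds is a guess, adjust as needed:

Code:

# give the drives a moment to spin up and settle before assembly is attempted
sleep 20
# then assemble everything listed in /etc/mdadm/mdadm.conf
mdadm --assemble --scan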

If your RAID is started by your initrd, which is likely only if you boot off it, then you need to remake your initrd. Look at http://wiki.xtronics.com/index.php/Raid, especially the sections labeled "Regenerate initrd" and immediately below, as well as the final section, "Notes from others".
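
On Debian, remaking the initrd usually comes down to something like this (assuming the ARRAY line is already in /etc/mdadm/mdadm.conf):

Code:

# rebuild the initramfs for the running kernel so it picks up the current mdadm.conf
sudo update-initramfs -u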

If all else fails, back up your data, recreate the RAID from scratch, and try again, carefully taking note of the commands used. I still think it's weird that your three RAID disks don't have the same partition table setup, but since sdd is the outlier, not sda, I can't see how that's the problem.

I am fairly green with RAID, but I set up our server with mdadm with two RAID 1 arrays. I have a partition on each disk (type fd, "Linux raid autodetect"); after the arrays were constructed, I formatted the array /dev/md0 with mkfs.ext3.

I prefer partition tables, even if the whole drive is one partition. You don't make a filesystem (ext3 or anything else) on the partitions; you just make the partition (e.g. with fdisk), then use mdadm to make the array from the partitions
(e.g. mdadm --create with other options, then /dev/sda1 /dev/sdc1 etc.; n.b. the numerals), then make the filesystem on the array
(e.g. mkfs -t ext3 /dev/md0, or something like that).

Not to say there's anything wrong with using the whole device, just my preference.
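
Roughly, the whole sequence would look like this; just a sketch, assuming three empty disks and RAID 5 as in this thread:

Code:

# 1. on each disk, create one partition spanning the drive and set its type
#    to fd (Linux raid autodetect); repeat for /dev/sdc and /dev/sdd
sudo fdisk /dev/sda
# 2. build the array from the partitions -- note the trailing numerals
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdc1 /dev/sdd1
# 3. make the filesystem on the array device, not on the individual partitions
sudo mkfs -t ext3 /dev/md0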

After a bunch of headaches, it seems to work. There was one strange hiccup where fdisk claimed that some of my identical drives were not identical (different numbers of sectors); after reformatting and restarting, that problem magically disappeared.

I was having the same problem with my three-disk RAID 5 setup after upgrading to Fedora 14 x64. My array always started with two drives active and one removed, and when two drives were not active it could not start at all.
It seems the system boots too fast: at least one of the drives does not have time to settle, so mdadm sees it as not ready/failed and removes it from the array.

I added rootdelay=9 to my boot line and now everything works like a charm.
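
For anyone else who lands here: Fedora 14 still uses GRUB legacy, so "boot line" means the kernel line in /boot/grub/grub.conf; the version and root= values below are placeholders:

Code:

# /boot/grub/grub.conf -- append rootdelay to the existing kernel line
kernel /vmlinuz-<version> ro root=<your-root> rootdelay=9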