Overview of mdadm and RAID5
mdadm is a tool created by Neil Brown that makes administering Linux multiple-device (md) storage arrays straightforward. One task it can perform, demonstrated in this writeup, is creating a RAID array such as a RAID level 5 array with a hot spare.

RAID 5 arrays increase the fault tolerance of the data on the array, and depending on the configuration and hardware used they may increase (or decrease) read and write performance. These traits are achieved by slicing data into chunks, striping the chunks across multiple devices, and performing an exclusive OR on the data to create parity. A RAID 5 array can tolerate one device failure at a time and can rebuild the data from the failed drive using the parity and the remaining data. The steps below show, start to finish, how to create a RAID 5 array with one hot spare from 5 disks, along with some administrative best practices.

These steps were run on a Debian 6.0 "Squeeze" virtual machine (Linux debian 2.6.32-5-amd64) on ESXi 4.1.0 build 348481. The hardware is one Intel E5620, 24GB of Kingston RAM, and two 1TB Western Digital EARS drives, all connected through a Supermicro X8DT3. Root access is required, and it is assumed you have either an Internet connection or the Debian package DVDs available.

The steps outlined below will damage any data on the selected disks. Do not run them on a production system without fully understanding the process and testing in a development environment.

These instructions are not meant to be exhaustive and may not be appropriate for your environment. Always check with your hardware and software vendors for the appropriate steps to manage your infrastructure.

Add your media devices to the machine. This process will differ depending on your setup: plug in the hard drives, thumb drives, SSDs, floppies, etc. to be used for the RAID array. Shutting the system down to plug in interfaces such as SATA is recommended and will not interfere with this process; just remember to perform steps 1 and 2 before continuing. The system used in this example is a VM, and new hard drives were added while it was powered on.

Rescan for new media devices. This step can be skipped if the system was powered off to add the storage devices.
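The original scan command is not reproduced here; on a typical Linux system the rescan is done through sysfs and looks roughly like the following (host0 is an example, your host numbers will vary):

```shell
# List the SCSI hosts present on the system
ls /sys/class/scsi_host/

# Trigger a rescan on one host; the three dashes are wildcards for
# channel, target, and LUN, meaning "scan everything" (requires root)
echo "- - -" > /sys/class/scsi_host/host0/scan
```

Repeat the echo for each hostN entry if you are unsure which controller the new disks are attached to.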

Note that running the scan command without the arguments will not work; the dashes sent to the command are wildcards (which I believe tell the controller to scan all buses). Unfortunately I could not find enough documentation on this command to fully understand how it works, so if you have details please contact me.

If you are using an HBA for fiber channel, iSCSI, or other protocols it is possible you may have to use vendor supplied programs to scan for new devices.

The output shown here describes 6 total disks: /dev/sda, sdb, sdc, sdd, sde, and sdf. The first, /dev/sda, is where my boot, root, and swap partitions are installed. Each partition is assigned an identifier based on its disk and partition number (/dev/sda1, /dev/sda2, and /dev/sda5), and these will not be used to create the array. The other 5 disks are 5GB each and have no meaningful data on them, such as partitions or filesystems (see footnote 1).
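To verify for yourself that the kernel sees the new disks, the kernel's partition list can be checked directly (the fdisk line is commented out because it needs root; the /dev/sd[b-f] names match this example system and will differ on yours):

```shell
# Show every block device and partition the kernel currently knows about
cat /proc/partitions

# fdisk can additionally show sizes and partition tables for the new
# disks; device names here are from this example system (requires root):
# fdisk -l /dev/sd[b-f]
```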

The mdadm application was called, placed in create mode, and told to print verbose output; the device to create is defined as /dev/md0, the RAID level is set to 5, the number of RAID devices is set to 4 with each listed, and the number of spare devices is set to 1 with the spare listed.
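Reconstructed from that description, the create command would be (device names are from this example system; a real run requires root and will destroy data on the listed disks):

```shell
mdadm --create --verbose /dev/md0 --level=5 \
      --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde \
      --spare-devices=1 /dev/sdf
```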

mdadm responds with its status: the RAID parity layout is left-symmetric, the chunk size is 512K (see footnote 2), the size used from each disk is 5241344K, the superblock version is 1.2, and the device has been started.

The line starting with md0 describes all device information for the newly created /dev/md0: the device is active, running raid5, and has 5 member devices with their device numbers listed, /dev/sdf being marked as a spare.

The next line lists the configuration and status of /dev/md0: the total number of blocks in the array, superblock version 1.2, RAID level 5, a 512K chunk size, algorithm 2, and 3 of the array's 4 total devices active. The last part is the status of the devices in the array: the first 3 devices are up and one is down, as the 4th is being rebuilt.

The final line in the group is the activity line which shows that the array is currently in recovery mode, 5.1% complete, the amount of data written of the total, the estimated finish time, and the I/O speed.

The last line of the file confirms that all devices have been assigned.
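For reference, the /proc/mdstat contents being described look roughly like the following on this example system; this is a reconstruction rather than a verbatim capture, and the ordering, sizes, and speeds will differ on other hardware:

```shell
cat /proc/mdstat
# Personalities : [raid6] [raid5] [raid4]
# md0 : active raid5 sdf[5](S) sde[4] sdd[2] sdc[1] sdb[0]
#       15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
#       [=>...................]  recovery =  5.1% (269184/5241344) finish=1.2min speed=67296K/sec
#
# unused devices: <none>
```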

Wait until the array rebuild is completed. Though you can use the device while it is rebuilding, the performance will be slow and it is a general best practice to allow it to complete.
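One simple way to keep an eye on the rebuild is to poll /proc/mdstat; the 5-second interval here is an arbitrary choice:

```shell
# Re-display the array status every 5 seconds; press Ctrl-C to exit
watch -n 5 cat /proc/mdstat
```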

Above is the output once the array has finished rebuilding. Notice from the "[4/4]" text that the total number of devices is 4 and all 4 are now active. All of the devices are also marked as up, since [UUUU] shows none down.

Create a backup of the new device's configuration, the status of the device and information about each disk:
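The exact commands are not reproduced here; a plausible reconstruction that produces the two files described below would be (requires root, and the member device names match this example system):

```shell
# General information about the array as a whole
mdadm --detail /dev/md0 > /etc/mdadm/md0.info

# Detailed superblock information for each member disk
mdadm --examine /dev/sd[b-f] > /etc/mdadm/md0.disks
```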

These steps create two files: /etc/mdadm/md0.info and /etc/mdadm/md0.disks. The first holds general information about the new device, and the latter holds detailed information on each disk comprising the device. Data from these two files can later be used to re-create a damaged device. View them to become familiar with the data.

The device /dev/md0 is now ready for use. It is possible to create partitions, put file systems on it, dd images to it, delete the device, or even use it with other devices to create a nested RAID device.
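For example, to put a filesystem on the array and mount it, something like the following works; ext3 and the mount point are arbitrary choices for illustration (requires root):

```shell
# Create an ext3 filesystem on the array
mkfs.ext3 /dev/md0

# Mount it and check the available space
mkdir -p /mnt/md0
mount /dev/md0 /mnt/md0
df -h /mnt/md0
```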

Conclusion
Following the steps above should allow you to successfully create a device running RAID level 5 with a single hot spare from five drives using the powerful mdadm tool. These steps can be modified to configure a device with a different RAID level, such as 0, 1, 10, 4, or 6, and possibly a different number of disks, such as 4 or 24. Be sure to test different configurations for the best performance and fault tolerance levels.

Footnotes

The devices used to create the new RAID device /dev/md0 do not have any data on them; they are completely blank. It is possible to build a device out of specific partitions on drives, though that was not done here. There are benefits to both approaches, and it is worth researching them before deciding which to use.

Chunk size is the amount of data, in bytes, written to each member device of a stripe set before moving to the next. The default is 512KB, which may or may not be well suited to your configuration. Chunk size directly impacts I/O performance and should be tested in your environment before going to production.
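When creating the array, a different chunk size can be requested with the --chunk option; the 64K value here is just an example to benchmark against the default (device names match the earlier example, requires root, destroys data on the listed disks):

```shell
mdadm --create --verbose /dev/md0 --level=5 --chunk=64 \
      --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/se \
      --spare-devices=1 /dev/sdf
```

Create, benchmark, and tear down the array with several candidate chunk sizes before committing to one.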

Eric Wamsley
Howdy! I am a technology dude based in the USA. My goal is to combine data, technology, and people; then document the process here so we can all learn from my errors and maybe even get a smile or two.