RAID Recovery Planning

RAID is often used for critical data because its redundancy keeps that data available when a drive fails. Problems still come up, and when they do it is vital to have everything you need to repair the array and bring it back online.

Throughout this page are commands that log information about your system and RAID setup and back up important files and information.
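
As a minimal sketch of that kind of logging, the function below saves the current md state into a dated directory. The function name raid_snapshot and the default path /root/raid-info are hypothetical, and both config file locations are assumptions about common distributions, so adjust them to your system.

```shell
#!/bin/sh
# Sketch: save a dated snapshot of the RAID state (names and paths are hypothetical).
raid_snapshot() {
    outdir=${1:-/root/raid-info}/$(date +%Y%m%d-%H%M%S)
    mkdir -p "$outdir" || return 1
    # Kernel's view of all md arrays; absent if the md driver is not loaded.
    if [ -r /proc/mdstat ]; then
        cat /proc/mdstat > "$outdir/mdstat"
    fi
    if command -v mdadm >/dev/null 2>&1; then
        # ARRAY lines as mdadm sees them now (needs root to read the devices).
        mdadm --detail --scan > "$outdir/mdadm-detail-scan" 2>/dev/null || true
        mdadm --examine --scan > "$outdir/mdadm-examine-scan" 2>/dev/null || true
    fi
    # Keep a copy of the config file from either common location, if present.
    for f in /etc/mdadm.conf /etc/mdadm/mdadm.conf; do
        if [ -r "$f" ]; then cp "$f" "$outdir/"; fi
    done
    echo "$outdir"
}
```

Calling raid_snapshot with no argument writes under the default path and prints the directory it created; store a copy somewhere other than the array itself.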

Speed Boosts

Monitor md:

# tmux split-window -l 8 "watch -t 'cat /proc/mdstat'"

View your current usage with:

# sar -bdpqu -P ALL 60 1

Sync Speed

drivers/md/md.c

/*
* Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit'
* is 1000 KB/sec, so the extra system load does not show up that much.
* Increase it if you want to have more _guaranteed_ speed. Note that
* the RAID driver will use the maximum available bandwidth if the IO
* subsystem is idle. There is also an 'absolute maximum' reconstruction
* speed limit - in case reconstruction slows down your system despite
* idle IO detection.
*
* you can change it via /proc/sys/dev/raid/speed_limit_min and _max.
* or /sys/block/mdX/md/sync_speed_{min,max}
*/
static int sysctl_speed_limit_min = 1000;
static int sysctl_speed_limit_max = 200000;
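
The limits named in that comment can be inspected from the shell. The sketch below prints the system-wide values; RAID_SYSCTL_DIR is a hypothetical override added only so the function can be tried on a machine without md, and the value 50000 and the array name md0 in the comments are illustrative.

```shell
#!/bin/sh
# Sketch: show the system-wide resync speed limits (values are in KB/sec).
show_sync_limits() {
    dir=${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}   # hypothetical override for testing
    for name in speed_limit_min speed_limit_max; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s KB/sec\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}
show_sync_limits

# To raise the guaranteed minimum (as root; 50000 is an illustrative value):
#   echo 50000 > /proc/sys/dev/raid/speed_limit_min
# Or for one array only, here assumed to be md0:
#   echo 50000 > /sys/block/md0/md/sync_speed_min
```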

Increase Stripe Cache Size
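
For RAID 4/5/6 arrays the stripe cache is set through /sys/block/mdX/md/stripe_cache_size, and the kernel md documentation gives its memory cost as page size × number of disks × stripe_cache_size. The sketch below only estimates that cost before you change anything; the function name, the value 8192 and the array name md0 are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: estimate stripe cache memory = entries * disks * page size (bytes).
stripe_cache_bytes() {
    entries=$1
    disks=$2
    page=${3:-$(getconf PAGESIZE)}   # default to the running system's page size
    echo $(( entries * disks * page ))
}

# 8192 entries on a 4-disk array with 4 KiB pages: 134217728 bytes (128 MiB).
stripe_cache_bytes 8192 4 4096

# To apply the setting (as root; md0 is an assumed array name):
#   echo 8192 > /sys/block/md0/md/stripe_cache_size
```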

Increase Sector readahead

Set read-ahead to 32 MiB for /dev/md0:

# blockdev --setra 65536 /dev/md0
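
The --setra argument is a count of 512-byte sectors, which is why 32 MiB becomes 65536. A small sketch of that conversion follows; the helper name ra_sectors is hypothetical.

```shell
#!/bin/sh
# Sketch: convert a read-ahead size in MiB to the 512-byte sector count
# that blockdev --setra expects.
ra_sectors() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

ra_sectors 32    # prints 65536

# Check the value currently in effect (as root):
#   blockdev --getra /dev/md0
```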

mdadm Options

Help

Any parameter that does not start with '-' is treated as a device name or, for --examine-bitmap, a file name. The first such name is often the name of an md device. Subsequent names are often names of component devices.

Create

# mdadm --create device --chunk=X --level=Y --raid-devices=Z devices

This usage will initialise a new md array, associate some devices with it, and activate the array. In order to create an array with some devices missing, use the special word 'missing' in place of the relevant device name.

Before devices are added, they are checked to see if they already contain raid superblocks or filesystems. They are also checked to see if the variance in device size exceeds 1%. If any discrepancy is found, the user will be prompted for confirmation before the array is created. The presence of a '--run' can override this caution.

If the --size option is given then only that many kilobytes of each device is used, no matter how big each device is. If no --size is given, the apparent size of the smallest drive given is used for raid level 1 and greater, and the full device is used for other levels.

Options that are valid with --create (-C) are:

--bitmap= : Create a bitmap for the array with the given filename, or an internal bitmap if 'internal' is given

--run -R : insist on running the array even if not all devices are present or some look odd.

--readonly -o : start the array readonly - not supported yet.

--name= -N : Textual name for array - max 32 characters

--bitmap-chunk= : bitmap chunksize in Kilobytes.

--delay= -d : bitmap update delay in seconds.
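
As an illustration of the 'missing' keyword described above, this sketch builds the command for a two-device RAID1 with only one device present. It prints the command instead of running it, because --create writes new superblocks; the function name and the device names are placeholders.

```shell
#!/bin/sh
# Sketch: print (not run) a --create command for a degraded two-device mirror.
# $1 is the md device, $2 is the one component that exists today.
create_degraded_mirror() {
    md=$1
    disk=$2
    echo mdadm --create "$md" --level=1 --raid-devices=2 "$disk" missing
}

create_degraded_mirror /dev/md0 /dev/sdb1
# prints: mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
```

The second device can be added later with the management mode's --add option, at which point the mirror resyncs onto it.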

Build

# mdadm --build device --chunk=X --level=Y --raid-devices=Z devices

This usage is similar to --create. The difference is that it creates a legacy array without a superblock. With these arrays there is no difference between initially creating the array and subsequently assembling the array, except that hopefully there is useful data there in the second case.

The level may only be 0, 1, 10, linear, multipath, or faulty. All devices must be listed and the array will be started once complete.

Options that are valid with --build (-B) are:

--bitmap= : file to store/find bitmap information in.

--chunk= -c : chunk size in kibibytes

--rounding= : rounding factor for linear array (==chunk size)

--level= -l : 0, 1, 10, linear, multipath, faulty

--raid-devices= -n : number of active devices in array

--bitmap-chunk= : bitmap chunksize in Kilobytes.

--delay= -d : bitmap update delay in seconds.

Assemble

# mdadm --assemble device options...

# mdadm --assemble --scan options...

This usage assembles one or more raid arrays from pre-existing components. For each array, mdadm needs to know the md device, the identity of the array, and a number of sub devices. These can be found in a number of ways.

The md device is given on the command line, is found listed in the config file, or can be deduced from the array identity. The array identity is determined either from the --uuid, --name, or --super-minor commandline arguments, from the config file, or from the first component device on the command line.

The different combinations of these are as follows: If the --scan option is not given, then only devices and identities listed on the command line are considered. The first device will be the array device, and the remainder will be examined when looking for components. If an explicit identity is given with --uuid or --super-minor, then only devices with a superblock which matches that identity are considered, otherwise every device listed is considered.

If the --scan option is given, and no devices are listed, then every array listed in the config file is considered for assembly. The identity of candidate devices is determined from the config file. After these arrays are assembled, mdadm will look for other devices that could form further arrays and try to assemble them. This can be disabled using the 'AUTO' option in the config file.

If the --scan option is given as well as one or more devices, then those devices are md devices that are to be assembled. Their identity and components are determined from the config file.

If mdadm cannot find all of the components for an array, it will assemble it but not activate it unless --run or --scan is given. To preserve this behaviour even with --scan, add --no-degraded. Note that "all of the components" means as many as were present the last time the array was running as recorded in the superblock. If the array was already degraded, and the missing device is not a new problem, it will still be assembled. It is only newly missing devices that cause the array not to be started.

Options that are valid with --assemble (-A) are:

--bitmap= : bitmap file to use with the array

--uuid= -u : uuid of array to assemble. Devices which don't have this uuid are excluded

--super-minor= -m : minor number to look for in super-block when choosing devices to use.

--name= -N : Array name to look for in super-block.

--config= -c : config file

--scan -s : scan config file for missing information

--run -R : Try to start the array even if not enough devices for a full array are present

--force -f : Assemble the array even if some superblocks appear out-of-date. This involves modifying the superblocks.
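
The case of an explicit identity without --scan can be sketched as follows. The helper prints the command rather than running it; its name, the all-zero UUID and the device names are placeholders, and the real UUID would come from the array's superblock or config file.

```shell
#!/bin/sh
# Sketch: print an --assemble command that accepts only components whose
# superblock carries the given UUID. Extra arguments are candidate devices.
assemble_by_uuid() {
    md=$1
    uuid=$2
    shift 2
    echo mdadm --assemble "$md" --uuid="$uuid" "$@"
}

assemble_by_uuid /dev/md0 00000000:00000000:00000000:00000000 /dev/sdb1 /dev/sdc1

# Config-driven assembly that refuses to start newly degraded arrays:
#   mdadm --assemble --scan --no-degraded
```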

Manage

# mdadm arraydevice options component devices...

This usage is for managing the component devices within an array. The --manage option is not needed and is assumed if the first argument is a device name or a management option. The first device listed will be taken to be an md array device, any subsequent devices are (potential) components of that array.

Options that are valid with management mode are:

--add -a : hotadd subsequent devices to the array

--re-add : subsequent devices are re-added if they were recent members of the array

--remove -r : remove subsequent devices, which must not be active

--fail -f : mark subsequent devices as faulty

--set-faulty : same as --fail

--run -R : start a partially built array

--stop -S : deactivate array, releasing all resources

--readonly -o : mark array as readonly

--readwrite -w : mark array as readwrite
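
A common use of these options is swapping out a failing member. The sketch below prints the three commands in order, since a device must be marked faulty before it can be removed; the function name and device names are placeholders, and nothing is executed.

```shell
#!/bin/sh
# Sketch: print the fail / remove / add sequence for replacing one member.
# $1 = md array, $2 = member being retired, $3 = replacement device.
replace_member() {
    md=$1
    old=$2
    new=$3
    echo mdadm "$md" --fail "$old"     # mark the old member faulty
    echo mdadm "$md" --remove "$old"   # remove it once it is no longer active
    echo mdadm "$md" --add "$new"      # hot-add the replacement
}

replace_member /dev/md0 /dev/sdb1 /dev/sdd1
```

After the --add step the rebuild progress shows up in /proc/mdstat.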

Misc

# mdadm misc_option devices...

This usage is for performing some task on one or more devices, which may be arrays or components, depending on the task. The --misc option is not needed (though it is allowed) and is assumed if the first argument is a misc option.

Options that are valid with the miscellaneous mode are:

--query -Q : Display general information about how a device relates to the md driver

Monitor

# mdadm --monitor options devices

This usage causes mdadm to monitor a number of md arrays by periodically polling their status and acting on any changes. If any devices are listed then those devices are monitored, otherwise all devices listed in the config file are monitored. The address for mailing advisories to, and the program to handle each change, can be specified in the config file or on the command line. There must be at least one destination for advisories, whether an email address, a program, or --syslog.

--test -t : Generate a TestMessage event against each array at startup
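
A sketch of a typical invocation follows, using a mail address as the advisory destination. The helper only prints the command; its name, the address root@localhost and the 1800-second delay are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: print a --monitor command that polls the arrays from the config
# file and mails advisories. $1 = mail address, $2 = poll delay in seconds.
monitor_cmd() {
    echo mdadm --monitor --scan --mail="$1" --delay="$2" --daemonise
}

monitor_cmd root@localhost 1800

# One-off check that advisories actually arrive (sends a TestMessage per array):
#   mdadm --monitor --scan --oneshot --test
```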

Grow

# mdadm --grow device options

This usage causes mdadm to attempt to reconfigure a running array. This is only possible if the kernel being used supports a particular reconfiguration.

Options that are valid with the grow (-G --grow) mode are:

--level= -l : Tell mdadm what level to convert the array to.

--layout= -p : For a FAULTY array, set/change the error mode. For other arrays, update the layout

--size= -z : Change the active size of devices in an array. This is useful if all devices have been replaced with larger devices. Value is in Kilobytes, or the special word 'max' meaning 'as large as possible'.

--assume-clean : When increasing the --size, this flag will avoid a resync of the new space

--chunk= -c : Change the chunksize of the array

--raid-devices= -n : Change the number of active devices in an array.

--add= -a : Add listed devices as part of reshape. This is needed for resizing a RAID0 which cannot have spares already present.

--bitmap= -b : Add or remove a write-intent bitmap.

--backup-file= file : A file on a different device to store data for a short time while increasing raid-devices on a RAID4/5/6 array. Also needed throughout a reshape when changing parameters other than raid-devices.

--array-size= -Z : Change visible size of array. This does not change any data on the device, and is not stable across restarts.
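
As a sketch of the most common reshape, the helper below prints a command that increases the number of active devices and names a backup file. The function name, device count and backup path are placeholders; the backup file is assumed to live on a device that is not part of the array, as the option description requires.

```shell
#!/bin/sh
# Sketch: print a --grow command that changes the number of active devices.
# $1 = md array, $2 = new device count, $3 = backup file on another device.
grow_devices_cmd() {
    echo mdadm --grow "$1" --raid-devices="$2" --backup-file="$3"
}

# The extra device is first added as a spare, then the array is reshaped.
echo mdadm /dev/md0 --add /dev/sde1
grow_devices_cmd /dev/md0 4 /root/md0-grow.backup
```

The filesystem on the array still has to be resized separately once the reshape finishes.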

Incremental

# mdadm --incremental [-Rqrsf] device

This usage allows for incremental assembly of md arrays. Devices can be added one at a time as they are discovered. Once an array has all expected devices, it will be started.

Optionally, the process can be reversed by using the fail option. When fail mode is invoked, mdadm will see if the device belongs to an array and then both fail (if needed) and remove the device from that array.

Options that are valid with incremental assembly (-I --incremental) are:

--run -R : Run arrays as soon as a minimal number of devices are present rather than waiting for all expected.
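
The two directions of incremental mode can be sketched as below. Both helpers print the command only; their names and the device names are placeholders. The fail form is shown with a kernel device name (sdb1 rather than /dev/sdb1), following the mdadm manual page.

```shell
#!/bin/sh
# Sketch: print the command that offers a newly discovered device to mdadm,
# starting the array as soon as a minimal number of devices is present.
incremental_add_cmd() {
    echo mdadm --incremental --run "$1"
}

# Sketch: print the reverse operation, which fails and removes the device
# from whatever array it belongs to. $1 is a kernel name such as sdb1.
incremental_fail_cmd() {
    echo mdadm --incremental --fail "$1"
}

incremental_add_cmd /dev/sdb1
incremental_fail_cmd sdb1
```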