Why Is RAID 5 Slow on Writes?

Books about storage often refer to RAID 5 as striping with distributed parity. A RAID 5 array is a logical volume, built from three or more disk drives, that stores redundant data so that the failure of one disk doesn't cause the loss of the entire volume. The RAID controller creates a special parity block for each stripe of information, as Figure A shows. (Features at the OS level, such as disk striping with parity in Windows 2000, or Win2K, can also perform this function.) The parity is typically the bitwise exclusive OR (XOR), indicated here with the ~ symbol, of all the data blocks in the stripe. In Figure A, the RAID controller calculates parity as S3 = A3 ~ B3 ~ D3 ~ E3.
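As a rough illustration of the parity calculation, the following Python sketch XORs four data blocks into a parity block and then rebuilds one block from the survivors. The function name xor_blocks and the 4-byte block contents are assumptions made up for this example; a real controller works on much larger blocks, usually in hardware or firmware.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Made-up contents for the four data blocks of stripe 3.
a3 = bytes([0x11, 0x22, 0x33, 0x44])
b3 = bytes([0xA0, 0xB0, 0xC0, 0xD0])
d3 = bytes([0x0F, 0x0E, 0x0D, 0x0C])
e3 = bytes([0x55, 0xAA, 0x55, 0xAA])

# S3 = A3 ~ B3 ~ D3 ~ E3
s3 = xor_blocks(a3, b3, d3, e3)

# If the disk holding B3 fails, XORing the surviving data blocks with
# the parity block yields the lost data again.
rebuilt_b3 = xor_blocks(a3, d3, e3, s3)
print(rebuilt_b3 == b3)  # True
```

The recovery works because XORing a value with itself cancels it out, which leaves only the missing block.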

When a write operation occurs on a RAID 5 volume (e.g., on block B3), the controller must update parity block S3. Because reading all the blocks in the stripe to recreate the parity block would be costly, most RAID 5 controllers instead go through the following steps, in which single quotation marks and double quotation marks represent modifications:

Read the block that needs modification (B3).

Read the parity block (S3).

Remove the old contents of block B3 from parity S3 (S3' = S3 ~ B3).

Calculate the new parity from the new data B3' (S3" = S3' ~ B3').

Update the data block (B3').

Update the parity block (S3").

In other words, one application I/O requires four disk I/Os (two reads and two writes), and these four I/Os occur on two spindles, potentially disturbing other transfer operations on those drives.
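The write sequence above can be sketched in Python as follows. This is a minimal model, not a description of any particular controller: the Disk class, the small_write function, and the 2-byte block contents are hypothetical names and values invented for the example, and each disk simply counts the reads and writes it receives.

```python
class Disk:
    """Hypothetical in-memory disk that counts its I/O operations."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.io_count = 0

    def read(self, number):
        self.io_count += 1
        return self.blocks[number]

    def write(self, number, data):
        self.io_count += 1
        self.blocks[number] = data


def xor(x: bytes, y: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(p ^ q for p, q in zip(x, y))


def small_write(data_disk, parity_disk, stripe, new_data):
    """Update one data block and its parity with a read-modify-write."""
    old_data = data_disk.read(stripe)        # read B3
    old_parity = parity_disk.read(stripe)    # read S3
    partial = xor(old_parity, old_data)      # S3' = S3 ~ B3
    new_parity = xor(partial, new_data)      # S3" = S3' ~ B3'
    data_disk.write(stripe, new_data)        # write B3'
    parity_disk.write(stripe, new_parity)    # write S3"


# Made-up 2-byte blocks for stripe 3; A3, D3, and E3 are never touched.
a3, b3 = bytes([0x11, 0x22]), bytes([0xA0, 0xB0])
d3, e3 = bytes([0x0F, 0x0E]), bytes([0x55, 0xAA])
s3 = xor(xor(xor(a3, b3), d3), e3)

disk_b = Disk({3: b3})
disk_s = Disk({3: s3})
new_b3 = bytes([0xDE, 0xAD])
small_write(disk_b, disk_s, 3, new_b3)

# The updated parity matches a full recomputation over the stripe.
print(disk_s.blocks[3] == xor(xor(xor(a3, new_b3), d3), e3))  # True
print(disk_b.io_count + disk_s.io_count)                      # 4
```

Only the data disk and the parity disk are accessed, two I/Os each, which is where the four-I/O cost of a single small write comes from.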

Some high-end controllers optimize disk activity so that when the controller needs to write an entire stripe (e.g., during a restore operation, in which the I/Os are typically large), it calculates the new parity on the fly from the new data and overwrites the old parity without reading it first. However, this optimization applies only when the write updates all the data blocks in the stripe.
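To make the difference concrete, here is a small Python sketch that compares the disk I/O count of a full-stripe write with the cost of updating the same blocks one at a time. The function names and block contents are assumptions for illustration, and the model ignores caching and any other optimizations a real controller might apply.

```python
from functools import reduce
from operator import xor


def full_stripe_write(new_blocks):
    """Return (parity, disk_ios) for a write that replaces every data block."""
    # Parity comes straight from the new data held in memory, so no
    # reads from disk are needed.
    parity = bytes(reduce(xor, column) for column in zip(*new_blocks))
    disk_ios = len(new_blocks) + 1  # one write per data block plus parity
    return parity, disk_ios


def small_write_ios(block_count):
    """Disk I/Os when each block is updated by its own read-modify-write."""
    return 4 * block_count


# Made-up 2-byte blocks for a stripe with four data blocks.
new_stripe = [bytes([1, 2]), bytes([3, 4]), bytes([5, 6]), bytes([7, 8])]
parity, ios = full_stripe_write(new_stripe)

print(ios)                               # 5
print(small_write_ios(len(new_stripe)))  # 16
```

Under these assumptions, a stripe with four data blocks takes 5 disk writes as one full-stripe write, versus 16 disk I/Os as four separate small writes.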