RAID Technology Part 1

RAID is a buzzword that you hear often in the realm of computing. What does it mean and what technology is behind it? In this first part of a series of articles on RAID, learn about the technology behind the word.

Page 1: RAID Tech

Intro:

RAID: Redundant Array of Independent (or Inexpensive) Disks. What does that mean? Basically, you take two or more drives and combine them for speed, redundancy, or both. There are many RAID technologies and RAID levels in the marketplace today. This article goes over the technologies.

When RAID is not:

Many different factors determine which RAID level you should use, or are already using. The technical aspects go hand in hand with your redundancy requirements and with other performance characteristics. There is a RAID level that has nothing to do with RAID at all; we will get into that in a later article on RAID. Each RAID level is built from a few underlying technologies, and many levels share the same ones, which is why it is important to go over the technology behind the RAID levels first.

Mirroring:

A mirror is the simplest form of redundancy. You take the normal stream of data and output it to two or more drives. Basically, what is on one drive is on all drives. A mirror usually consists of two drives, and the usable space is n/2, where n is the total space of both drives. If the drives are not equal, you only get the capacity of the smallest drive, no matter how many drives are in the mirror. If you have two equal drives, you lose half of the space to redundancy. This is fine with hard drives being so cheap. Mirroring also gives you an advantage on reads, since the controller can read from both drives at the same time. Writes are slower, since the controller needs to write everything to both drives. When you use a mirror, if one member (such as a single drive) fails, the array keeps functioning. The controller should report that a drive has failed and needs to be replaced. No data loss should occur unless all the members fail or the controller fails.
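As a rough sketch of the behaviour described above, the following Python models a mirror in memory. The `Mirror` class and its method names are hypothetical and exist only for illustration; they are not taken from any real controller or library, and each "drive" is just a byte buffer.

```python
class Mirror:
    """Toy in-memory mirror (hypothetical model, not a real controller API)."""

    def __init__(self, drive_sizes):
        # Usable capacity is limited by the smallest drive in the set.
        self.capacity = min(drive_sizes)
        self.drives = [bytearray(self.capacity) for _ in drive_sizes]
        self.failed = set()

    def write(self, offset, data):
        # Every write is duplicated on each working drive.
        if offset < 0 or offset + len(data) > self.capacity:
            raise ValueError("write outside the array")
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[offset:offset + len(data)] = data

    def fail(self, index):
        # Simulate the loss of one drive.
        self.failed.add(index)

    def read(self, offset, length):
        # Any surviving drive can serve the read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return bytes(drive[offset:offset + length])
        raise OSError("all drives in the mirror have failed")


mirror = Mirror([500, 320])   # two unequal drives, sizes in arbitrary units
mirror.write(0, b"RAID")
mirror.fail(0)                # lose one drive; the data is still readable
print(mirror.capacity, mirror.read(0, 4))
```

The sketch ignores rebuilds and read balancing; it only shows that capacity follows the smallest drive and that data survives until the last drive is gone.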

Duplexing:

Duplexing takes mirroring a step further. Instead of mirroring only the drives, the controller itself is mirrored with a second one. This is usually done with some form of software RAID, since most hardware controllers do not support it. Duplexing removes the controller as a single point of failure, which provides extra redundancy. It also costs more than mirroring because you are duplicating more hardware. Since hardware RAID is typically set up so that one RAID controller handles all the drives in the array, duplexing is not an option in most PC hardware RAID solutions. The only real duplexing solutions I have seen are on some external RAID devices.

Striping:

Striping is the concept of spreading data across multiple physical hard drives while presenting them as one contiguous space. Striping is the only RAID technology that does not offer redundancy, so it should only be used for high-performance, non-critical data. There are upsides to using a stripe. The first is speed: a stripe gives you the fastest reads and writes. The second is that you do not lose any space: all the space on your drives is used if they are equal. The downside is that if one drive goes, the entire array is destroyed. This RAID technology should not be used by itself unless you have a proper backup plan.
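To make the idea concrete, here is a minimal sketch of striping in Python. The function names `stripe` and `unstripe` and the 4-byte chunk size are assumptions chosen for illustration; real arrays use much larger stripe units and work on disk blocks rather than byte strings.

```python
def stripe(data, num_drives, chunk_size=4):
    """Split data into chunks and deal them round-robin across the drives."""
    drives = [bytearray() for _ in range(num_drives)]
    for n, start in enumerate(range(0, len(data), chunk_size)):
        drives[n % num_drives] += data[start:start + chunk_size]
    return drives


def unstripe(drives, total_length, chunk_size=4):
    """Reassemble the original data; every drive must be present."""
    out = bytearray()
    offsets = [0] * len(drives)
    num_chunks = -(-total_length // chunk_size)   # ceiling division
    for n in range(num_chunks):
        d = n % len(drives)
        out += drives[d][offsets[d]:offsets[d] + chunk_size]
        offsets[d] += chunk_size
    return bytes(out)


data = b"ABCDEFGHIJKLMNOPQRST"
drives = stripe(data, 2)
print(drives[0])                    # chunks 0, 2, 4 land on drive 0
print(drives[1])                    # chunks 1, 3 land on drive 1
print(unstripe(drives, len(data)))  # the original data again
```

Because consecutive chunks live on different drives, no single drive holds a usable copy of the data, which is why losing one drive destroys the whole array.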

Striping with Parity:

There are a few RAID levels that use this. It takes the concept of striping and adds a 'parity' check to the data. It is more than just a check: the parity can be used to reconstruct the data if a drive goes bad. The smallest array using this technology has three drives. You basically lose one drive's worth of space to parity. The upside of this technology is that you get both redundancy and speed. On reads this technology is very fast. On writes, not so much. You do not lose as much space with this as with mirroring.
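The space trade-offs of the three technologies covered so far can be compared with a little arithmetic. The helper below is a hypothetical sketch: the function name and layout labels are made up for this example, and limiting every layout to the size of the smallest drive is an assumption carried over from the mirroring case.

```python
def usable_capacity(drive_sizes, layout):
    """Return the usable space for a simple array (illustrative only)."""
    smallest = min(drive_sizes)
    count = len(drive_sizes)
    if layout == "mirror":
        if count < 2:
            raise ValueError("a mirror needs at least two drives")
        return smallest                  # every drive holds the same data
    if layout == "stripe":
        return smallest * count          # no space is given up
    if layout == "stripe+parity":
        if count < 3:
            raise ValueError("striping with parity needs at least three drives")
        return smallest * (count - 1)    # one drive's worth goes to parity
    raise ValueError("unknown layout: " + layout)


sizes = [1000, 1000, 1000]               # three equal drives, in GB
for layout in ("mirror", "stripe", "stripe+parity"):
    print(layout, usable_capacity(sizes, layout))
```

With three 1000 GB drives this prints 1000 for the mirror, 3000 for the stripe, and 2000 for striping with parity.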

How does this error checking and correcting thing work? XOR! How about an example... We have a three-drive array; byte 1 is written to drive 0 and byte 2 is written to drive 1. To determine what we need to write to drive 2 for the parity check, we just XOR byte 1 and byte 2 together. XOR is a bitwise function and basically means "not equal".

Code

    11001011 Drive 0
XOR 11101100 Drive 1
---
    00100111 Parity (Drive 2)

Bitwise means that the operation acts on each bit independently of the others. Notice that when the bits are equal (0,0 or 1,1) the XOR function outputs 0. When they are different (1,0 or 0,1) it outputs 1. Now, if drive 0 goes down, we can calculate the missing data from the surviving drive and the parity. How? If you XOR something that has been XORed, you get the original thing!

Code

    11101100 Drive 1
XOR 00100111 Parity (Drive 2)
---
    11001011 Missing Data! (Drive 0)
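The same arithmetic can be checked in a few lines of Python, where `^` is the bitwise XOR operator. This is only a sketch of the principle: the function names `parity` and `rebuild` are hypothetical, and a real array works on whole blocks and usually rotates the parity between drives.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the one missing block by XORing the parity with the survivors."""
    return parity(list(surviving_blocks) + [parity_block])


drive0 = bytes([0b11001011])
drive1 = bytes([0b11101100])

p = parity([drive0, drive1])
print(format(p[0], "08b"))                       # 00100111

# Whichever data drive is lost, the other drive plus the parity restores it.
print(format(rebuild([drive1], p)[0], "08b"))    # 11001011 (drive 0)
print(format(rebuild([drive0], p)[0], "08b"))    # 11101100 (drive 1)
```

The same `rebuild` call works with more than two data drives, as long as only one block is missing.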

Interesting, eh?

I didn't know there'd be this much talking...

And then the blank stare... This is from Korgoth of Barbaria on Adult Swim.

JBOD:

JBOD? Just a Bunch Of Disks. That is the term used to mean that no RAID technology is in use. Anyone who has multiple drives with no array is using JBOD. Anyone using a single drive is using JBOD.

Next Up:

The next article will focus on the performance and positioning of each of these technologies. Stay tuned for a follow-up describing the RAID levels. Thank you for reading, and be sure to enter our monthly contests in the forums. Read the second part here.