
Everybody needs storage; some of us just need more of it than others (I'm looking at you, 10TB+ storage topic). The primary media for data storage these days are the hard disk drive (HDD) and, in recent years, the solid state drive (SSD). From the most basic of users to enterprise-level systems, these two devices are used to fulfill our storage needs in varying configurations to meet a given workload.

My hope is that this will serve as a tutorial to guide storage newcomers in choosing their own storage configurations. While this post will advise the type of drive to use, there will be additional posts linked here to shorter tutorials on:

First, we must ask ourselves: Exactly what ARE hard drives and solid state drives?

A hard disk drive is a data storage device that stores digital information (bits) in magnetic material on rotating platters and accesses it using a read/write head on a moving actuator arm. A single bit these days is stored on a patch of material only tens of nanometers across, enabling high data densities but also requiring sophisticated and precise technology to accurately read and write these bits.

A solid state drive does the same job as a hard drive, but stores data in flash memory (similar to what is found in USB “Flash” drives) managed by a complex controller that chooses where to write data to and read it from. SSDs have no moving parts; however, the memory they use to store data can only be written to a limited number of times, and in general they have (at present) lower storage density than hard drives.

SSD performance depends greatly on the speed of the flash memory and the controller, but also on how full the drive is, and optimally reaches between 350 and 550 megabytes per second.

Source: Tom's Hardware (link is the source)

Okay good, we got that out of the way. Now let's review what goes on our storage device:

Everything you want to exist when you boot up your computer goes on your storage medium, including your documents, music, videos, applications and, most importantly, the operating system.

When you start your computer up, what happens?

To use a technical term: A lot.

But after a lot of stuff happens, your computer starts booting the operating system (usually). Once you are logged in, your computer starts up dozens of processes and any applications you have designated to run on startup. Then you might launch a web browser, or start up your favorite document writer, play a movie or maybe launch a sweet game.

Now that you know what storage does, you might also be interested in a better or even “the best” (gasp) storage solution. You start asking some questions.

What type of drive is best for me?

Well, that depends on a few things, so I'll answer your questions with more questions:

What size do you want?

How much are you willing to pay for the space you want?

How reliable do you want it to be?

How fast do you want it to be?

What will you be doing with it?

Okay, now you ask another question:

What size is best for me?

I'm going to frustrate you by asking more questions:

How many applications do you want to install?

Do you have a lot of games you want to install on it?

How much other stuff (docs, music, videos) are you going to put on it?

This really is the point to jump off of. Regardless of drive type, you need a certain amount of storage space, and it's helpful to know how much you need before you buy. Skip to the bottom of the paragraph for the TL;DR.

Application space varies wildly from a few megabytes to tens of gigabytes. Games these days range from hundreds of megabytes for some indie games to tens of gigabytes for the latest triple-A titles. Documents take up very little space, so don't worry too much. Music can go from less than a megabyte per minute for compressed files to multiple megabytes per minute for lossless formats like FLAC or for uncompressed files. Video follows the same trend as music, from ~700MB for a feature-length compressed 1080p movie to many tens of gigabytes for full-quality Blu-Ray movies. If you work with uncompressed video, you probably won't be reading this tutorial since you'll already know what you need, but uncompressed video can be tens of gigabytes per minute of footage depending on the quality.

In short:

Applications: Average

Games: Lots

Documents: Little

Music: Average

Videos: LOTS

If multiple of these apply to you, go with the largest. If you do applications and videos, you still need lots of storage (depending on the size of your video collection).

Cool, so now you know if you need a little, average, or a lot of storage space. Now on to your next question:

How much does storage cost?

Hard drives are much cheaper than SSDs per gigabyte of storage, about eight times less expensive on average. You can find hard drives as low as $0.04/GB, such as the Seagate Desktop drives, or in the higher end at around $0.10/GB for enterprise-level hard drives like the WD RE series. Solid state drives are more expensive, being around $0.63/GB for “cheaper” drives like the Crucial M500 series, or in the range of multiple dollars per gigabyte for enterprise-level SSDs.
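To make those per-gigabyte figures a bit more concrete, here is a rough sketch that turns a $/GB rate into a drive price. The function name is just something I made up for illustration, and the rates are the ones quoted above, so treat the output as ballpark numbers that will go out of date.

```python
# Rough drive cost from a $/GB figure.
# The rates below are the ones quoted in this guide; real prices change constantly.

def drive_cost(capacity_gb, dollars_per_gb):
    """Return the approximate price in dollars for a drive of the given size."""
    return capacity_gb * dollars_per_gb


if __name__ == "__main__":
    rates = [
        ("desktop HDD", 0.04),
        ("enterprise HDD", 0.10),
        ("budget SSD", 0.63),
    ]
    for label, rate in rates:
        print(f"1000 GB of {label}: about ${drive_cost(1000, rate):.0f}")
```

Swap in current prices and the capacity you worked out earlier to compare your own options.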

Why pay more for a hard drive or SSD, you might ask?

Better technologies in hard drives that make them more resistant to vibrations, run cooler, less likely to fail, or perform better all add to the cost. Similarly, better controllers, flash memory and other tech in high-end SSDs make them perform better and last longer, but add to the cost.

In short:

HDD: Cheap storage

SSD: Expensive storage

Now you might be worried that, at some point, your storage device is going to fail.

All storage devices will fail at some point, but the question is when?

Hard drives can be written to essentially an infinite number of times; however, they have moving parts, and everything with moving parts will break at some point. Warranties on hard drives generally range from 2 to 5 years. Depending on their usage, they may last a shorter or longer amount of time. However, if dropped they are essentially dead. If a dropped drive still works, get your data off it and get a new drive; it's not reliable anymore.

Remember how we mentioned that SSDs can only be written to a certain number of times before becoming unusable? They have a lifetime too, and with their increased speed they could die much sooner than hard drives, right? Maybe. But the amount of writes you would have to do to get this is tantamount to writing tens of gigabytes to the device every day for the length of the warranty. Higher-end SSDs can withstand multiple full drive-writes daily for years. Most likely, you won't ever encounter this problem. Accidentally dropped it on the floor? No moving parts to wreck, no big deal. SSDs are rated for a good amount of beating.

In short:

HDD: Be careful, and they'll last.

SSD: Don't worry unless you're a super user.

Let's talk about performance.

We mentioned earlier that hard drives can get about 100 to 200 MB/s. That's when you're accessing big files like movies and music, so the read head doesn't have to move back and forth all over the place. When you're grabbing a bunch of little files (like loading applications and games and day-to-day operation), hard drive performance tanks. Big time. Like less than a megabyte per second tanks.

We also mentioned that SSDs can get between 350 and 550 MB/s. That is also for big file transfers. However, since SSDs can grab data at a more consistent speed than hard drives, they are also faster in little file transfers, in the range of multiple tens of megabytes per second to hundreds of megabytes per second. This is why SSDs make computers feel “snappier”: they speed up applications that access the drive in a random way, which is exactly where a hard drive struggles.

It has to do with something called Queue Depth (QD), basically how many access commands are waiting. The higher the queue depth, the more efficiently the drive can get and write data, so the faster it operates. The problem is, most operations take place at a QD of 1, maybe going up between 2 and 4 for some operations. This makes SSDs much slower for day-to-day tasks than they would be at a queue depth of, say, 32 or more, where the drive will be able to achieve hundreds of megabytes per second in random operations. However, only enterprise environments will ever really see lots of operations in this range. In loading some games, you can see higher queue depths being achieved but only for a handful of operations, and applications like the Adobe suite can utilize higher queue depths on a scratch disk.

When buying an SSD, you might see a rating for something called IOPS (Input/Output Operations Per Second). This rating indicates how many data chunks of a small size (~4KB) can be processed per second. You might see close to 100,000 IOPS for some drives (like the Samsung 840 Pro). Doing some math, we calculate that this comes out to 100,000 IOPS * 4 KB per operation = 400 MB/s in random operations. However, this level of performance is measured only at those insanely high QDs we talked about earlier. In practice, you might get between 5000 and 10000 IOPS at a QD of 1, which corresponds to ~20 to 40MB/s. This is still way faster than hard drives though, and one of the reasons why SSDs are becoming so popular.
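If you want to redo that IOPS math for a drive you're looking at, a minimal sketch follows. It assumes 4 KB operations and 1 MB = 1000 KB, same as the numbers above; the function name is my own.

```python
# Convert an IOPS rating into an approximate random-access throughput.
# Assumes 4 KB per operation and 1 MB = 1000 KB, matching the guide's arithmetic.

def iops_to_mb_per_s(iops, block_kb=4):
    """Return throughput in MB/s for a given IOPS figure and block size."""
    return iops * block_kb / 1000


if __name__ == "__main__":
    for iops in (100_000, 10_000, 5_000):
        print(f"{iops} IOPS is roughly {iops_to_mb_per_s(iops):.0f} MB/s")
```

Just remember that the big number on the box is the high queue depth figure, not what you'll see at a QD of 1.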

In short:

HDD: Good for sequential data

SSD: Good for all kinds of data

Okay, now it's time to ask:

What will you be doing with your system?

Since lots of people do many different things with their computers, we'll start generalizing to make it easier.

Average: You browse the internet and create office documents.

Movie/music buff: You do the regular stuff, but also have a lot of music and movies.

The gamer: You do the regular stuff, and have lots of games.

The Pro: You use large applications and have lots of projects in storage.

For the average user, it depends on your budget. You could buy a 500GB hard drive for 50 bucks and have a working computer. You could spend $25 more and get a 64GB SSD and use 75% of it, and have a snappier computer. You have to decide if it's worth it, but it comes highly recommended by any PC enthusiast.

For the movie/music buff, you will have media players on your computer in addition to the regular stuff. But where to store all those movie and music files? We know that current hard drives have no trouble playing back media files, so we'll put those on a large hard drive. Since you still have applications, you could store those on an SSD and have a snappy experience.

For the gamer, you could use an SSD for applications, but how much storage space do you need for your games, and how much are you willing to pay for it? If you want fast-loading games, you might want an SSD but you will have to pay more. If you want lots of storage for games, you could get a hard drive for cheap but have slower load times. Once again, you decide what's worth your money.

Finally, the pro. You need an SSD, man. Your time is money, and those professional applications are pretty large, so pick yourself up a properly-sized SSD. If you have the money, maybe pick up a ~120GB high-performance SSD like the 840 Pro to use as a dedicated scratch disk. Grab a hard drive for your media files.

In the end, the choice is up to you, the user. I hope this guide has been successful in helping you choose the type of storage that will give you the most value for your money, and I wish you luck as you explore the vast possibilities of the wonderful world of storage.

I'd also like to give credit to Whaler_99 for reviewing and providing some critique on this guide, and to looney for directing me to Whaler.

Finally, I'd like to thank Linus for his videos, from which I learned much of my initial knowledge about storage, which was built upon by my years in the IT industry.

Or, maybe you want to run drives in a NAS or in a custom RAID setup. In this case, you'll want to go with a Seagate NAS drive or a WD Red or SE, all of which feature special technology which make them perform better, cooler, and more reliably in such an environment.

Either way, you will have plenty of choices of size for what type of drive you need. Find your needed size and then pick the best for your application based on price.

If you're buying an SSD, it matters much less.

As we saw before in our discussion on Queue Depth, most high-performance SSDs don't differentiate themselves in day-to-day operation.

What differentiates SSDs?

Once again, use case. Since SSDs have a lifetime limited by the number of writes, we would then choose our drive based on the number of writes it can sustain.

For example, if you are using an SSD for the OS, applications and games, it's better to buy an SSD based on size and $/GB rather than performance. The Crucial MX100 and the Sandisk Ultra Plus are great examples of this.

However, if you are doing lots of writes to your SSD, like using it as a scratch disk for Adobe software, then you might want to get a Samsung 850 Pro or Intel 730, and just buy the size you need. Using it as a scratch disk can generate operations that access the drive at higher queue depths, making the drive more effective.

If you have a high need for data protection, then the Crucial MX100 and Intel 730 both have some form of power loss protection (the MX100 only protects data already written to the drive, though), and the 730 has end-to-end data protection.

If you want the absolute cutting-edge, then the up-and-coming NVMe standard is showing significant improvements in both random and sequential performance. As of this writing, the Intel 750 series is the fastest on the market.

Will a PCI-E or NVMe SSD get me around the SATA bottleneck?

There is a bottleneck, however the SATA interface only bottlenecks the sequential transfer speeds. In most other workflows, such as 4k random r/w, SSDs don't even come close to approaching the bottleneck of the SATA interface. You can get PCI-E SSDs like the Plextor M6e and the ASUS RAIDR. These drives have shown potential to be faster than SATA III drives, but only in sequential operations. Here is a performance analysis of the RAIDR, which is not much better than most SATA SSDs. We also have a LTT user review of the speed of the RAIDR.

The NVMe standard is designed to replace the old AHCI standard, and drives that support it show significantly better performance in all categories -- see the review of the Intel 750 for a current example.

Miscellaneous Q/A:

Can I use a NAS drive as my boot drive?

Yes. The only real difference is in performance, noise, and power consumption, which you can find on the spec sheet. You won't gain any benefit from the drive's TLER feature, but it won't hurt either.

What the eff is Bit Error Rate?

Bit Error Rate (BER) is a measure of how often Unrecoverable Read Errors occur. It is usually specified in # of bad sectors per # of bits read. For example, the WD Red series specifies 1 bad sector per 10^14 bits (12.5 TB) read. Confusingly, WD's datacenter drives rate their BER in terms of 10 bad sectors per 10^15/10^16 bits rather than 1 bad sector per 10^14/10^15.
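Here's a small sketch showing where the 12.5 TB figure for the WD Red comes from, plus how many read errors the spec would predict for a given amount of data. It uses decimal terabytes (1 TB = 10^12 bytes) and takes the spec sheet number at face value, which is an assumption; the function names are mine.

```python
# Turn a "1 bad sector per N bits read" spec into more familiar units.
# Uses decimal terabytes (1 TB = 1e12 bytes) and treats the spec as an exact rate.

def bits_to_terabytes(bits):
    """Convert a number of bits into terabytes."""
    return bits / 8 / 1e12


def expected_ures(terabytes_read, bits_per_error=1e14):
    """Average number of unrecoverable read errors the spec predicts."""
    return terabytes_read * 1e12 * 8 / bits_per_error


if __name__ == "__main__":
    print(f"1e14 bits = {bits_to_terabytes(1e14)} TB")
    print(f"Expected UREs after reading 4 TB: {expected_ures(4):.2f}")
```

The spec is usually a worst-case rating, so real drives may well do better than this suggests.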

What the eff is an Unrecoverable Read Error?

An Unrecoverable Read Error (URE) occurs when a hard drive is unable to read a sector. This could be due to physical drive damage, radiation, cosmic rays, natural degradation of the hard drive platters, etc. This usually will result in corrupted data, but it doesn't happen very often (see Bit Error Rate). On consumer grade drives the drive will attempt to read the sector for a long time, then give up and continue reading the next sector, while on NAS and enterprise drives they will stop after only a few seconds, allowing the overarching RAID controller to deal with the error and try to fix the data.

What makes a drive a "NAS" drive or "Enterprise" drive?

A NAS drive is a hard drive that supports Time-Limited Error Recovery (TLER). That's usually about it, though they may have some other additional technologies that make the drive perform more consistently or be more reliable. Examples include the WD Red/Purple and Seagate NAS drive. They're optimized to run in a consumer NAS, usually 1-5 bays.

There's a lot more to Enterprise drives; they're basically drives with many of the same features as NAS drives, but with additional tweaks that optimize them for some given task. Examples are below:

One thing to keep in mind: All of these drives are perfectly reliable as long as they are used in the appropriate environment. Putting an SE drive in the same environment as an XE drive will probably result in the drive dying much quicker.

By now you've probably chosen your drives, so the next step is to attach them to your computer. Barring some elaborate RAID configuration, you'll likely be asking the question:

Where do I plug my drives in to?

Easy. On your motherboard, there are these things called Serial Advanced Technology Attachment (SATA) ports, which allow your computer to access the data on your hard drive. There are multiple kinds of SATA ports: SATA I, II, and III, which operate at 1.5 Gb/s, 3 Gb/s, and 6 Gb/s respectively.

So which ports do I plug my drives in to?

Well now you have to ask yourself:

What port is the slowest port that will still allow my drive to reach its maximum performance?

To answer this, let's go with the best case scenario in the operation of a drive: sequential performance.

For example, one of the fastest hard drives available is the WD Velociraptor, which has a transfer rate of around 164 MB/s. Let's first get the maximum throughput of the SATA controllers in MB/s:

SATA I: 1.5 Gb/s → 187 MB/s

SATA II: 3 Gb/s → 375 MB/s

SATA III: 6 Gb/s → 750 MB/s

However, these speeds aren't what you'll see in practice. SATA uses 8b/10b encoding to send data, which basically means that it must send 10 bits on the wire for every 8 bits of "real" data. That leaves about 150/300/600 MB/s for the SATA revisions, respectively. Even this assumes maximum efficiency (every possible timestep is dedicated to transferring data); in practice we can get close to it, around 550 MB/s on SATA III for the highest performing drives on the market (as of this writing).
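The SATA numbers above can be reproduced with a few lines of arithmetic. This is only a sketch of the conversion (line rate in Gb/s, divided by 8 for bytes, then scaled by 8/10 for the encoding overhead); the function name is made up for this example.

```python
# Raw vs. usable SATA throughput.
# 8b/10b encoding sends 10 bits on the wire for every 8 bits of payload.

def sata_throughput_mb_s(line_rate_gbps):
    """Return (raw_mb_s, usable_mb_s) for a SATA line rate given in Gb/s."""
    raw = line_rate_gbps * 1000 / 8   # gigabits per second -> megabytes per second
    usable = raw * 8 / 10             # strip the 8b/10b encoding overhead
    return raw, usable


if __name__ == "__main__":
    for name, rate in (("SATA I", 1.5), ("SATA II", 3), ("SATA III", 6)):
        raw, usable = sata_throughput_mb_s(rate)
        print(f"{name}: {raw} MB/s raw, {usable} MB/s usable")
```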

Notice that one of the fastest hard drives, even though it goes over the practical limits of SATA I, won't gain very much from going to SATA II, although there will be some gain. This means that we can plug almost any hard drive into any port and it will still operate at its maximum performance.

NOTE: Hard drives do have a cache of around 64 MB which will operate faster than SATA I ports. However this cache fills up so fast that it really doesn't matter much in practice.

Now let us consider an SSD. Most SSDs these days have sequential performance over 250 MB/s and a majority of those are between 400 and 550 MB/s. So clearly we should choose SATA II for lower-end SSDs and SATA III for high-end SSDs, right?

Well, mostly right. That's in sequential performance, but remember that most SSDs operate at between 20 and 40 MB/s in day-to-day operations. This is well below the SATA I maximum bandwidth, and further below the SATA II and SATA III maximum throughput.

Now if we attach it to a SATA I port, we will have slower sequential reads and writes than if we plugged it in to a SATA II or III port. However, since day-to-day operations sit well below even the SATA I limit, there is still value in adding an SSD to an older computer running SATA II or even SATA I.

In short:

HDD: Don't worry

SSD: SATA III preferred, SATA II or I still useful.
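To check a specific drive against a specific port, you can simply take the smaller of the two speeds. The sketch below uses the 150/300/600 MB/s practical limits from earlier; the dictionary and function names are just for illustration.

```python
# The speed you actually get is the slower of the drive and the port.
# Port limits are the practical 8b/10b figures quoted in this guide.

SATA_USABLE_MB_S = {"SATA I": 150, "SATA II": 300, "SATA III": 600}


def effective_speed(drive_mb_s, port):
    """Return the throughput in MB/s after the port limit is applied."""
    return min(drive_mb_s, SATA_USABLE_MB_S[port])


if __name__ == "__main__":
    print(effective_speed(164, "SATA I"))   # Velociraptor, slightly capped
    print(effective_speed(550, "SATA II"))  # fast SSD, sequential transfers
    print(effective_speed(40, "SATA I"))    # the same SSD in day-to-day random work
```

The last case is the important one: random, day-to-day work doesn't come close to any of the port limits.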

What about USB hard drives?

USB hard drives are external, and are great for backups. You also can install applications and programs on them if you really want to.

There are two factors to consider:

Is my HDD USB 2.0 or USB 3.0?

Am I plugging it in to a USB 2.0 or USB 3.0 port?

Remember how we said that hard drives get around 100 MB/s average transfer rates for large files? That corresponds to 800 Mbps, which is much faster than USB 2.0's 480 Mbps maximum transfer rate. This means that hard drives will (generally) get better performance if they are USB 3.0 enabled, as long as they are plugged into a USB 3.0 port. USB 3.0 is capable of 5 Gbps, which is plenty for modern hard drives and enough for the few external SSDs out there. A USB 2.0 drive gains nothing from a USB 3.0 port, since the drive itself doesn't support the standard that would let it go faster than 480 Mbps (about 60 MB/s).
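The same kind of check works for USB. This sketch converts the bus speed from megabits to megabytes per second and compares it to the drive. It uses the raw signalling rates (480 Mbps and 5 Gbps), so real-world throughput will be lower once protocol overhead is counted; the function names are my own.

```python
# Is the USB link slower than the drive behind it?
# Uses raw signalling rates; actual throughput is lower due to protocol overhead.

def mbps_to_mb_per_s(megabits_per_second):
    """Convert megabits per second to megabytes per second."""
    return megabits_per_second / 8


def usb_is_bottleneck(drive_mb_s, bus_mbps):
    """Return True if the drive can move data faster than the USB link."""
    return drive_mb_s > mbps_to_mb_per_s(bus_mbps)


if __name__ == "__main__":
    print("USB 2.0 limit:", mbps_to_mb_per_s(480), "MB/s")
    print("100 MB/s HDD on USB 2.0 bottlenecked?", usb_is_bottleneck(100, 480))
    print("100 MB/s HDD on USB 3.0 bottlenecked?", usb_is_bottleneck(100, 5000))
```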

RAID stands for Redundant Array of Independent Disks, and allows the user to combine multiple drives into a single volume. Popular in enterprise environments and among storage enthusiasts, it divides data among its drives in multiple possible ways:

RAID 0 (Stripe)

Hardly used in mission-critical setups, RAID 0 arrays are mostly used to service large numbers of IOPS that occur in frequently-accessed data banks. Unreliable.

Writes every other block of data to a different drive in the volume

Performance is roughly equivalent to the sum of the performance of the available drives.

Total storage is equivalent to the sum of all drives.

There is no data replication whatsoever, and any one drive failure will result in complete loss of data.

NOTE: Must have at least 2 drives.

I don't consider this RAID at all, since there is no redundancy.
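To picture what striping does, here is a toy sketch that deals blocks out to drives round-robin. Real controllers work with fixed-size stripe units rather than Python lists, so this is only an illustration of the idea, and the function name is hypothetical.

```python
# Toy model of RAID 0: block i goes to drive (i mod number_of_drives).

def stripe(blocks, num_drives):
    """Distribute blocks round-robin and return one list of blocks per drive."""
    drives = [[] for _ in range(num_drives)]
    for index, block in enumerate(blocks):
        drives[index % num_drives].append(block)
    return drives


if __name__ == "__main__":
    for number, contents in enumerate(stripe(list("ABCDEF"), 2)):
        print(f"drive {number}: {contents}")
```

Notice that every drive holds only part of each file, which is why losing any one drive takes the whole volume with it.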

RAID 1 (Mirror)

Mostly used among individual users due to its inefficiency, RAID 1 is mostly used to provide data security in small NAS boxes like the WD MyBook Live Duo. Most reliable.

Mirrors all data across multiple drives, usually two drives, although you can have a 3-drive RAID 1.

Performance is equivalent to that of a single drive.

Total storage is equivalent to a single drive.

Can sustain the loss of all drives but one in the RAID array.

NOTE: Must have at least 2 drives.

RAID 5 (Parity)

Popular among enthusiasts, RAID 5 provides efficient data security using either chipset, hardware, or software RAID. It has mostly been replaced in the industry by RAID 6, 50, or 60. Most efficient.

Involves calculating parity bits and spreading them across all drives.

Assuming RAID hardware is fast enough, performance is equivalent to that of the sum of all drives (minus 1).

Total storage is equivalent to the sum of all drives (minus 1).

Can sustain the loss of any one drive.

NOTE: Chipset and software RAID 5 can tank write performance below that of a single drive due to the intensive parity calculations. Hardware RAID usually performs better, but has slower write speeds than read speeds. Must have at least 3 drives.
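The parity in RAID 5 is an XOR across the data blocks in a stripe, which is what lets any single missing block be recomputed from the others. The sketch below shows just that one trick on three tiny blocks; it ignores how real arrays rotate parity between drives, and the names are made up for the example.

```python
# XOR parity, the core idea behind RAID 5 recovery.
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data = [b"disk", b"one!", b"two?"]   # one stripe spread over three data drives
    parity = xor_blocks(data)            # stored on a fourth drive

    # Pretend the second drive died: rebuild its block from the survivors plus parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

Every write has to update the parity too, which is where the write penalty mentioned in the note comes from.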

RAID 10 (Striped + Mirrored)

A combination of RAID levels 1 and 0, which involves running RAID 0 on a series of drive pairs which are mirrored.

Read performance equivalent to the sum of all drives, write performance equivalent to sum of half of the drives.

Total storage is equivalent to half the sum of all drives.

Can sustain the loss of one drive per mirrored pair, so between 1 and n/2 drives in total (n is the total number of drives).

NOTE: Must have at least 4 drives, and only an even number of drives.
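If you'd like to compare the levels above for your own drive count, here is a small calculator sketch. It assumes every drive in the array is the same size and reports the number of failures that is always survivable (RAID 10 can survive more if the failures land in different mirrored pairs). The function name and return format are my own choices.

```python
# Usable capacity and guaranteed fault tolerance for the RAID levels covered above.
# Assumes all drives are the same size.

def raid_summary(level, num_drives, drive_tb):
    """Return (usable_tb, guaranteed_survivable_failures)."""
    if level == 0:
        minimum, usable, survivable = 2, num_drives * drive_tb, 0
    elif level == 1:
        minimum, usable, survivable = 2, drive_tb, num_drives - 1
    elif level == 5:
        minimum, usable, survivable = 3, (num_drives - 1) * drive_tb, 1
    elif level == 10:
        if num_drives % 2:
            raise ValueError("RAID 10 needs an even number of drives")
        minimum, usable, survivable = 4, (num_drives // 2) * drive_tb, 1
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return usable, survivable


if __name__ == "__main__":
    for level in (0, 1, 5, 10):
        usable, survivable = raid_summary(level, 4, 4)
        print(f"RAID {level}, 4 x 4 TB: {usable} TB usable, survives {survivable} failure(s)")
```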

There are more advanced RAID levels that are only possible through use of hardware RAID cards or systems like ZFS, such as RAID 6 (same as RAID 5 but with two parity calculations), RAID 50/60 (striped RAID 5 and RAID 6 volumes), and RAID-Z1/Z2/Z3 (ZFS parity RAID with 1/2/3 parity calculations).

If you have a need for a high level of data redundancy, availability and security (other than nightly backups), then you will not only benefit from RAID, but you should use RAID.

If you have a need for a lot of storage that is also high-performance but relatively inexpensive, then you can benefit from RAID.

If you have need for a super-fast storage device that is relatively small, then you can benefit from RAID.

Or if you're extreme and want a lot of solid-state drive space and data redundancy, you can benefit from RAID. Mind you it'll cost you a lot of money.

Otherwise, don't bother.

What kind of controller should I use for RAID?

There are plenty of options:

You can get a hardware RAID card which allows for higher performance RAID-arrays with dozens of drives and advanced RAID levels. These cost a good deal of money.

You can use FlexRAID which is a software RAID that is done on a scheduled basis. It runs on Windows operating systems and allows for multiple RAID levels (but not advanced RAID levels that can be found on high-end RAID cards).

You can use Windows Storage Spaces which allows for the usage of random hard drives in RAID configurations, but is software RAID and relatively slow.

There are also storage operating systems and file systems like FreeNAS, ZFS on Linux, etc. which can support advanced RAID levels as well as protocols like FTP, iSCSI, and the creation of NFS exports and CIFS shares.

Finally, you can use the chipset on your motherboard to run RAID.

NOTE: Don't use a hardware RAID card with ZFS! If you must because you have lots of drives, configure it in JBOD mode. The hardware RAID card can interfere with aspects of ZFS that relate to data integrity. Pick one or the other. In general, it's bad to run a combination of hardware and software RAID.

If you don't need a hardware RAID card, the simplest are Windows Storage Spaces and FlexRAID, which @looney has a nice tutorial for here.

If you want to experiment with FreeNAS and ZFS, I believe there is a tutorial being developed for it, but I'm not sure who is doing it. Thanks to IdeaStormer for linking this post by Nex7 on ZFS tips. Here is a link to performance information on ZFS solutions, and here is an LTT user example of a ZFS system running RAID-Z2.

If you are creating a home server, you probably don't need a hardware RAID card and can go with any of the other options. Run in some sort of redundant configuration like RAID 1, 10, or 5 (or 1/Z1/Z2/Z3 for FreeNAS/ZFS).

If you want a high-performance home server, then you might want to consider a hardware RAID card or run a FreeNAS or ZFS setup with lots of CPU power and RAM. Run in some redundant configuration.

There are reasons other than having an awesome setup to consider RAID. For example, let's say I'm watching a movie stored on a hard drive in my server. Now let's say someone else decides to copy lots of files to the server or even starts running a backup. A single hard drive is going to be hit hard with data, and my movie could start lagging.

Or, say multiple people are backing stuff up or copying files to/from the same server. Each person will get much slower data transfer.

By increasing the maximum performance of our storage, I might be able to keep watching my movie uninterrupted while someone else is backing up their data (assuming the network doesn't become the bottleneck).

To summarize:

Running a hardware RAID array can dramatically increase performance and help to alleviate some of the previously mentioned problems.

RAID can protect against data loss for an always-on system.

If you want a dedicated scratch disk or some other small, high-performance volume, chipset RAID should be fine using SSDs. Run in RAID 0.

NOTE: RAID is NOT backup! If your RAID array fails and isn't recoverable, you're boned. You should always have a backup system in place. Companies realize this; that's why there are extremely complex storage systems in existence and entire data centers designed to protect valuable data.

What about using smaller SSDs and just RAIDing them? That'll be super fast.

I would not recommend SSD RAID 0 for your OS drive unless you are a power user. Yes, you can restore the volume from a backup if it fails, but that takes time. Yes, you will get awesome sequential performance, but I doubt you'll be transferring many gigabytes of large files often. SSD RAID provides NO benefit unless you do lots of work at queue depths of 4 or higher, and it can even make other operations slower.

With that said, there are cases where a RAID 0 SSD array might make sense. For users who do a lot of editing work with Adobe Premiere, for example, the disk gets hit at queue depths between 1 and 8. Using an SSD RAID 0 would provide a tangible benefit here.

Or, for users recording with FRAPS (especially if recording at 60 FPS at 2560x1440 uncompressed), the disk will get hit with incredibly high data rates at queue depths between 1 and 6. With writes that could go as high as 700 MB/s, having a large SSD array will not only provide better storage performance, but in this case could also improve in-game frame rates. Note that at 700 MB/s, you will quickly run out of space.
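To get a feel for how quickly "quickly" is, this sketch works out how long a volume lasts at a sustained write rate. The 512 GB array size in the demo is purely a hypothetical example, and the function name is mine.

```python
# How long until a volume is full at a constant write rate?
# Uses 1 GB = 1000 MB.

def minutes_until_full(capacity_gb, write_mb_s):
    """Return the number of minutes of recording a volume can hold."""
    seconds = capacity_gb * 1000 / write_mb_s
    return seconds / 60


if __name__ == "__main__":
    # Hypothetical 512 GB SSD array at the 700 MB/s rate mentioned above.
    print(f"{minutes_until_full(512, 700):.1f} minutes of footage")
```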

Users whose work is closer to enterprise-class loads, such as running dozens of VMs on a disk, will also benefit from an SSD RAID array. Since every VM is competing for storage resources, the disk will be hit at queue depths that are much higher than normal, and that is where a RAID array provides a tangible benefit.

It's important to recognize that while RAID can increase performance, that performance increase is still dependent on the drives you use. For example, using SSDs will be much faster than using hard drives (but much more expensive).

You could run a bunch of 5400 RPM drives in RAID and get more performance, but don't expect it to be on the same level as a bunch of 7200 or 10000 RPM drives in RAID.

NOTE: While it's true that you can run a modern SATA III hard drive on a SATA I port without bottlenecking, modern drives tend to have better performance than older hard drives, and thus it's better to use newer hard drives (which conveniently are labeled as SATA III).

I'd mostly recommend SATA drives, but you can also get SAS (Serial Attached SCSI) drives, which are mostly used in enterprise environments and are very similar to SATA. The main difference is that you can chain dozens of drives together to form extremely large arrays. I wouldn't recommend them, since you will only be able to use them in a RAID setup with a hardware RAID card, and you can get similar capacity with SATA drives -- assuming you don't need more than 96 TB of storage (24 x 4 TB enterprise drives).

As with so many other things, it depends on your use case. If you're running a server or a NAS, you will want RAID optimized drives for storage like the Seagate NAS or the WD Red. If you're creating a large scratch disk, you'll want lots of small, high-performance hard drives like the WD Velociraptor or a few high-performance SSDs. Other than that, there isn't much else RAID is particularly useful for.

Can I RAID different hard drives together?

Technically, yes you can. This is called heterogeneous RAID, which is the placement of drives in an array where not all drives have the same size, model, or performance numbers. This differs from homogeneous RAID, where all drives in an array are identical.

This RAID configuration can still provide redundancy and performance increases over that of a single hard drive. However, it will be slower than a homogeneous RAID array, since the slowest drive in the configuration will cause the other (faster) drives to be underutilized.
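A very simplified way to see the "slowest drive" effect is to model a striped array as every drive running at the pace of the slowest member. That is an idealized assumption on my part (real arrays have caching and other overheads), and the function name is hypothetical.

```python
# Idealized striped throughput: every member is held to the slowest drive's speed.

def striped_throughput(drive_speeds_mb_s):
    """Return the estimated sequential throughput of a striped array in MB/s."""
    return len(drive_speeds_mb_s) * min(drive_speeds_mb_s)


if __name__ == "__main__":
    print("matched drives:", striped_throughput([150, 150, 150]), "MB/s")
    print("one slow drive:", striped_throughput([150, 150, 100]), "MB/s")
```

In this model a single slower drive costs you more than just its own shortfall, because the faster drives end up waiting on it.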

Single Point of Failure: If your drives are hooked up to a hardware RAID controller, then what happens if your controller fails? Your entire array goes down, meaning all your data is offline until you can get a replacement. Once you get one, you might be able to restore the array. Maybe. The same idea follows for motherboard chipset RAID or poor software RAID implementations.

Solution: There isn't one for a hardware RAID controller, but it can be done with some software RAID solutions (ZFS being the most notable one). By using a JBOD controller (or running a hardware RAID controller in JBOD or IT mode), a software RAID solution can build a RAID array with drives from multiple controllers, meaning one controller failure doesn't have to spell D-O-O-M for your data, if planned properly.

Unrecoverable Read Errors: This isn't unique to a RAID configuration; drives that aren't running in RAID can experience UREs just the same as drives running in RAID. However, consumer drives (that is to say, anything but the WD Red/Purple/SE/RE or Seagate NAS/Constellation) lack a feature called TLER (Time-Limited Error Recovery), which makes the drive give up on reading a bad sector after a few seconds. This is important for a RAID array, because if a drive stops responding for more than a few seconds the RAID controller can label the drive as bad.

What this means is that if you aren't using proper hard drives, and you hit a URE, your RAID array will become degraded even though only one or two sectors might be bad, whereas on a TLER drive the RAID controller might be able to reconstruct the data and work around that bad sector (or fix it). Once you replace the drive, the RAID array will have to rebuild. If you hit another URE and aren't using a TLER drive, the RAID array will deem that drive bad too. If you happen to be running RAID 5, you now have two "dead" drives in the eyes of the controller, meaning all your data is gone, even though you might only have a few bad sectors.
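The timeout interaction described above can be sketched in a few lines of Python. This is a deliberately simplified model: the controller drops a drive when the drive stays unresponsive longer than the controller is willing to wait. The function name and all of the numbers are illustrative assumptions, not vendor specifications.

```python
def drive_dropped_from_array(recovery_seconds, controller_timeout_seconds):
    """Return True if the controller would give up on the drive.

    Simplified model: the controller marks a drive as bad when the drive
    stays unresponsive longer than the controller's timeout.
    """
    return recovery_seconds > controller_timeout_seconds


# Illustrative numbers only (assumptions, not vendor specifications).
CONTROLLER_TIMEOUT = 8      # seconds the controller is willing to wait
TLER_RECOVERY_LIMIT = 7     # a TLER drive gives up and reports the error
DESKTOP_RECOVERY_TIME = 60  # a non-TLER drive may keep retrying much longer

print(drive_dropped_from_array(TLER_RECOVERY_LIMIT, CONTROLLER_TIMEOUT))    # False
print(drive_dropped_from_array(DESKTOP_RECOVERY_TIME, CONTROLLER_TIMEOUT))  # True
```

The TLER drive reports the read error in time, so the controller can try to reconstruct the sector from the other drives; the non-TLER drive is still retrying when the controller loses patience and marks it as bad.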

Solution: You should be using WD Reds or Seagate NAS drives at a minimum. If you aren't, then I hope you have reliable backups, because you'll likely lose your data during a RAID rebuild. The risk is greatest to parity RAID setups, and is smaller with a RAID 1 or RAID 10 because less data needs to be reconstructed; in parity RAID, all data must be read and rebuilt, while in RAID 10 only the corresponding mirror needs to be rebuilt.

If you only have a partially full array and your hardware or software RAID implementation supports it, you will have a reduced risk of hitting a URE because it will only rebuild the actual data, ignoring empty sectors. An example of this is ZFS. Most hardware RAID controllers will rebuild all sectors on the drive, even empty ones. This is because RAID controllers only know about blocks, not files, so they can't tell which blocks need to be rebuilt and which don't.
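To put rough numbers on the rebuild risk discussed above, here is a back-of-the-envelope Python sketch. It assumes a URE rate of one error per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets but not taken from this post, and it treats every bit as failing independently, which real drives do not strictly do. The function names and the `used_fraction` parameter are hypothetical; `used_fraction` only applies to rebuilds that skip empty space, as ZFS does.

```python
import math


def bytes_read_during_rebuild(level, drives, drive_bytes, used_fraction=1.0):
    """Estimate how much data must be read to rebuild one failed drive.

    RAID 5 reads every surviving drive; RAID 10 reads only the surviving
    half of the affected mirror. `used_fraction` models rebuilds that
    skip empty space (1.0 means every sector is read).
    """
    if level == "raid5":
        surviving = drives - 1
    elif level == "raid10":
        surviving = 1
    else:
        raise ValueError(f"unsupported level: {level}")
    return surviving * drive_bytes * used_fraction


def rebuild_ure_probability(bytes_read, ure_per_bit=1e-14):
    """Probability of at least one URE while reading `bytes_read` bytes.

    Computes 1 - (1 - p)^bits using log1p/expm1 to avoid rounding
    problems with very small p. The default rate is an assumption.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


TB = 10**12  # decimal terabyte, as used on drive labels

# Hypothetical array: four 4 TB drives.
raid5 = rebuild_ure_probability(bytes_read_during_rebuild("raid5", 4, 4 * TB))
raid10 = rebuild_ure_probability(bytes_read_during_rebuild("raid10", 4, 4 * TB))
half_full = rebuild_ure_probability(
    bytes_read_during_rebuild("raid5", 4, 4 * TB, used_fraction=0.5)
)

print(f"RAID 5, full rebuild:      {raid5:.2f}")      # roughly 0.62
print(f"RAID 10, one mirror:       {raid10:.2f}")     # roughly 0.27
print(f"RAID 5, half-full rebuild: {half_full:.2f}")  # roughly 0.38
```

Treat the output as an order-of-magnitude guide only. The point is the ranking: a parity rebuild reads far more data than a mirror rebuild, and skipping empty space cuts the exposure further.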

Cost: RAID controllers aren't cheap. The cheapest one that I could recommend goes for around $200 USD, and you won't get good performance out of it. Better controllers with onboard cache go for hundreds of dollars more, and controllers with more ports go for even more. Then there's the matter of you paying for drive space you can't use due to parity or mirroring. It is expensive to set up RAID.
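The "paying for drive space you can't use" point can be worked out with a little arithmetic. The Python sketch below uses the standard capacity rules for common RAID levels with identical drives; the function names, drive size, and price are hypothetical and chosen only for illustration.

```python
def usable_drive_count(level, drives):
    """Number of drives' worth of usable space for identical drives."""
    if level == "raid0":
        return drives                 # striping, no redundancy
    if level == "raid1":
        return 1                      # every drive holds the same copy
    if level == "raid5" and drives >= 3:
        return drives - 1             # one drive's worth of parity
    if level == "raid6" and drives >= 4:
        return drives - 2             # two drives' worth of parity
    if level == "raid10" and drives >= 4 and drives % 2 == 0:
        return drives // 2            # half the drives are mirrors
    raise ValueError(f"unsupported combination: {level} with {drives} drives")


def cost_per_usable_tb(level, drives, drive_tb, drive_price):
    """Drive cost divided by usable capacity (controller cost not included)."""
    usable_tb = usable_drive_count(level, drives) * drive_tb
    return (drives * drive_price) / usable_tb


# Hypothetical example: four 4 TB drives at $150 each.
for level in ("raid0", "raid5", "raid6", "raid10"):
    print(level, cost_per_usable_tb(level, 4, 4, 150))
# raid0 37.5
# raid5 50.0
# raid6 75.0
# raid10 75.0
```

The controller itself is left out of the calculation, so the real cost per usable terabyte is higher again for hardware RAID.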

If you're reading this, you've seen Linus' videos and might be curious about having a centralized storage location in the form of a NAS or a home server, so let's talk about that.

There are many benefits to having a home server or a NAS, the most prominent of which is a single location to store data and to back up computers to, keeping your data safe. Other aspects might involve additional services that your centralized storage location can provide, including shared printers, media streaming, and e-mail services.

There are plenty of existing systems out there, such as those made by Synology, QNAP, and Netgear. There are also small business units available, such as the WD Sentinel. All of these come either pre-filled with drives or empty (in which case you'll need to buy drives). They tend to have a web-based interface as well as a Windows folder interface, which allows you to access your data from both inside and outside your home network.

The advantages of pre-built systems are:

Easy to set up

(Relatively) inexpensive

Come pre-loaded with features

Come with warranties

Have built-in RAID

Can have "special features" like being fireproof/waterproof/shockproof, such as the IOSafe N2

However, they also have their disadvantages:

Performance can be lacking sometimes, especially when running in RAID 5

Unless you buy an expandable unit, you are limited to the number of drives the NAS can hold

You also have the option to build your own, which allows you to build any number of configurations from the very simple to a system rivaling those found in the LTT 10TB+ Storage Topic.

I would love to see a build using RAID 5 or 6 as a video/photography editing/content creation rig, or the best implementation for large data sets, together with reliability, speed, and redundancy.

One of the issues with RAID 5 is that it isn't very redundant. RAID 6 is better, but in a hardware setup it requires a RAID controller (adding a single point of failure).

In terms of end-to-end data protection, you are better off with network storage with software RAID (FreeNAS is a good one in general, but FlexRAID is good for Windows) that, when configured correctly, will eliminate single points of failure and provide better data protection through other features. That's just my opinion, though.

Your posts are epic.. would it be too much to ask for one about how you got into Enterprise Storage, and suggestions for an aspiring Storage engineer to get into the field?

I got a job at Dell EqualLogic, but I work in software automation for quality assurance. I get to play with millions of dollars' worth of equipment, though, and it's all really cool stuff. For testing we need all sorts of other software (lots of VMware products, Backup Exec, various server OSes). The coolest things I've seen so far are VMware Virtual SAN and VMware SRM.

Make yourself valuable to companies that work in storage (a Computer Engineering or Computer Science degree is a good place to start). If you've built PCs, mention that to them. If you have a PC lying around, experiment with different storage OSes. Familiarize yourself as much as possible with software like FreeNAS and ESXi.

Enterprise storage companies take people from all sorts of areas. They have network teams that work on the network stack for their storage products (our products are iSCSI based, so we have a large network team). They have power engineers (power delivery to the controllers). They have computer engineers (design of the controller hardware). They have general CS majors (software design). They have hybrids between these groups and product marketing (coming up with new features).

Once you have the hardware, though, the future really is software-based. We support years' worth of hardware, but add new software features that they can use (time periods vary). I would like to work in an area that focuses on the merging of hardware and software: coming up with a feature and designing the hardware/software interface that would allow it to happen.