NicApicella writes "My new system has two sparklin' SATA drives which I would like to mirror. After having been burned by a not-so-cheap, dedicated RAID controller, I have been pointed to software RAID solutions. I now stand in front of two choices for setting up my RAID: a Windows 7 RC software RAID or a hardware RAID done by the cheap integrated RAID controller of my motherboard. Based on past experiences, I have decided that only my data is worth saving — that's why the RAID should mirror two disks (FAT32) that are not the boot disk (i.e. do not contain an OS or any fancy stuff). Of course, such a setup should secure my data; should a drive crash, I want the system up and running in no time. Even more importantly, I want any drive and its data to be as safe and portable as possible (that's the reason for choosing FAT32), even if the OS or the controller screw up big time. So, which should I choose? Who should I trust more, Microsoft's Windows 7 or possibly the cheapest RAID controller on the market? Are there other cheap solutions?"

RAID is only marginally valuable. In my experience, for all but the most carefully controlled environments, RAID simply adds complexity: the number of things that can go wrong increases, and with it the likelihood of lost data. Do it only if you want the *experience* of running RAID, but don't count on RAID to "save your data".

I've worked as a system administrator for more than a decade, in medium-large scale deployments with good success, (think: servicing thousands of users, hundreds of domain names, tens of thousands of email addresses, etc) so I think I have some useful experience you can benefit from.

IMHO, you are most likely to lose data from the following things (in order):

1) Aw sh1tz. "I didn't mean to delete that folder"... or "Whoops! I formatted the wrong drive", "I saved the wrong version of the file!", whatever. Although I *myself* don't have this happen often, it does happen. And even in my case I've lost about as much useful information this way as by drives dying. Users delete stuff all the time, and it's usually my job to bring it back, which is why I perform redundant, historical backups EVERY SINGLE DAY.

2) Malware. Don't minimize this - it's real, and it's why I reply to Parent. You are more likely to lose information from a virus/worm/malware and/or b0rked install of something that hoses your filesystem than by a hard disk crash given stable hardware.

3) Bugs. Filesystems have bugs. So do applications, utilities, anything with software. Strange, unexpected conditions, often caused by bugs in applications can cause data to "disappear", files to get corrupted, filesystems to get corrupted, folders to be incompletely written, etc. This is about as likely to cause lost data as:

4) Hardware failure. This is one of the lowest orders of lost data, although when it happens, it can be one of the most extreme.

Let me say this: RAID 1/5 only PARTIALLY protects you from the last one. Actual, bona-fide backups protect you from all of these. If you care about the data, get backups. If you care about uptimes at great expense, RAID *may* be worth it.

My advice is something most people don't want to hear: for personal use, get backups online for $5/month. Mozy/Carbonite/etc. There are a zillion vendors, just Google it. In two years, it will cost you about as much as that 2nd hard drive. It protects you far better than that 2nd hard drive, and it's so automatic that you'll hardly notice it until the moment it actually matters: when you have just discovered that your data is gone.

Your 4 points are correct. However, the reason for using RAID is NOT as a backup. RAID != Backup.

RAID is for redundancy and performance increases.

I had a drive die in my NAS a few weeks ago. It took 5 minutes to walk to the server room and plug in a new drive. There's no added complexity for the sysadmin, everything is done automagically by the RAID controller. Losing a server or data for hours while the drive is restored from tape is more expensive and complex.

I learned this important lesson when setting up a RAID array for audio and video production. It increased my throughput tremendously. When you need to stream digital video or audio into an editor or a multitrack recorder, it really helps to have more than one disk doing it. Of course, I can't use FAT32 like the author because I often have very large files to move around.

However, RAID on my regular desktop has never been much more than a headache. Now I j

Plus, the RAID array can keep track of where the head is on each drive and choose the one that's closest to the requested sector. Linux software RAID does this, though I don't know specifically who else does.

Heh. Linux software RAID doesn't do jack. I've looked at the source code. The md RAID1 driver (the one you manage with mdadm) just alternates drives for reads whenever the requests are not contiguous. That is all. Nothing more. There's no intelligence in there. No keeping track of head positions, no attempts to discover or infer physical drive geometry. Nothing. Just a simple round-robin. It just so happens that for MOST things that involve random access, the effective throughput is nearly doubled. More intelligence wouldn't actually buy you much in the general case, so why bother?

Also, the dmraid (fakeraid) RAID1 driver only does reads from one disk. I made the mistake of using dmraid instead of mdraid, only to discover through performance tests and iostat that there are basically two software RAID drivers that CLAIM to do identical things but in fact do not.
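For anyone trying to picture the two read policies being argued about above, here is a minimal sketch of both. It is a toy model of the descriptions in these two comments, not code taken from any kernel driver; the function names, the `(sector, length)` request format and the idea of a tracked "head position" are all assumptions made for illustration.

```python
def alternate_on_jump(requests, n_drives=2):
    """Policy described in the reply: switch drives whenever a read
    does not continue where the previous read ended."""
    picks, current, expected = [], 0, None
    for sector, length in requests:
        if expected is not None and sector != expected:
            current = (current + 1) % n_drives   # non-contiguous: next drive
        picks.append(current)
        expected = sector + length               # where a contiguous read would start
    return picks


def nearest_head(requests, n_drives=2):
    """Policy described in the parent: pick the drive whose last known
    position is closest to the requested sector."""
    heads = [0] * n_drives                       # assumed position after each drive's last read
    picks = []
    for sector, length in requests:
        drive = min(range(n_drives), key=lambda d: abs(heads[d] - sector))
        picks.append(drive)
        heads[drive] = sector + length
    return picks


if __name__ == "__main__":
    reads = [(0, 8), (8, 8), (1000, 8), (16, 8), (1008, 8)]
    print("alternate:", alternate_on_jump(reads))
    print("nearest:  ", nearest_head(reads))
```

Either way, two interleaved sequential streams end up spread over both disks, which is why mirrored reads can approach double the throughput under random access.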

I would expect this - a modern drive (when I say modern I mean any drive made in the last 15 years) is effectively a black box to the computer. The OS has absolutely no idea where the heads are, or even how the sectors are actually laid out on disk. Any attempt to be "clever" in ordering reads is doomed.

... you'll hardly notice it until the moment it actually matters: when you have just discovered that your data is gone.

Or until the backup company disappears. I suspect hardware is much more stable than any company providing any online backup.

I'd feel safer by far with an outfit that picks up your physical tapes and can return them as needed. If the company is going belly up, it's a lot harder for them to "lose" a warehouseful of tapes than a bunch of files on rotating memory.

I'm not sure why the parent and GP were modded funny, I refuse to buy drives by Maxtor or WD these days because of the crap quality of the past. Sure it's been long enough that they probably have fixed the issue, I just don't trust them, even running mirrored ZFS.

I'm still somewhat astonished that WD would think that it's acceptable to have external drives that work on OSes other than Win except for the power management features. Saying you're just supporting Win for a hard disk is nowhere near acceptable.

Personally, what I do at home is I use ZFS to mirror a pair of 1tb Seagate drives and that seems to work fine, it's not really the best set up, but it's hard to get such things located off site for the amount of money I have to spend.

My advice is something most people don't want to hear: for personal use, get backups online for $5/month. Mozy/Carbonite/etc. There are a zillion vendors, just Google it. In two years, it will cost you about as much as that 2nd hard drive. It protects you far better than that 2nd hard drive, and it's so automatic that you'll hardly notice it until the moment it actually matters: when you have just discovered that your data is gone.

And is so slow that an LS-120 drive reading a 1.44MB floppy would actually be faster. Or a 1x CD-ROM. Or a 16 year old hard drive.

Also, I have to trust that the service and my internet connection will be available when I need to restore my data.

I've had #4 happen to me. A power supply in my computer failed (a name brand one, not a cheap no-name brand) and damaged everything attached to one of the 12V rails. This included both drives of a RAID1 set. (Ironically, all my drives that weren't part of a RAID set were completely undamaged.) I was later able to recover the data from both drives (both had damaged sections, but different areas were damaged on the 2 drives, allowing for a complete recovery between the 2 of them). However, it goes to show that just having a RAID array won't completely protect you from hardware failures.

That's an interesting thing... the power supply has more potential to cause damage than anything else in the PC, but nobody ever thinks about protecting against its failure. Makes me wonder why we don't have surge protectors on the 5/12V rails as standard yet.

This is why, instead of a simple "name brand" PS, I have one that not only employs internal surge protection features, but whose maker actually places an insurance guarantee on all the components connected to it internally, including data recovery (up to $10,000). It's conditional on you having a properly rated UPS attached (you even have to send a copy of the UPS receipt with the warranty registration card), and if it's a true electrical failure, they expect the UPS company's insurance guarantee to kick in first.

I had a similar case where the controller decided it wanted to die and started writing spurious data to the disks. RAID won't protect you from the controller itself dying - and that also can occur for software RAID as well, the controller can still bork your data.

Personally, I haven't yet encountered anyone who really got benefit from those personal Internet backup services like Mozy. In regular use, it always seems like the person exceeds their storage allotment or Internet connectivity issues prevent them from recovering what they need, when they need it.

I tend to recommend people buy an inexpensive external USB or firewire drive, leave it attached and assigned as a backup device, and have some software package run a daily backup of all the relevant folders and files they might need to save.

It's great that your data is stored offline and off-site... but I'm just not sold on most of the implementations for "home use" being as great a solution as they first appear to be. Many of the providers have come and gone over the years, too. What happens when your offline backup company goes under?

Having run RAID quite a bit myself one must remember having all your drives in one box is always an invitation for trouble since hardware failures on a higher order will likely hit all the drives.

Not to mention the temptation to use _Identical_ disks in your redundant array... I've had a RAID1 pair fail totally when both drives died within 24 hours of each other because of a firmware bug. This happens a lot more than most people think. Statistical analysis of the reliability of RAID _always_ assumes failures arrive independently of each other, but a large proportion of failures are caused not by random events but by external circumstances and therefore happen either simultaneously or nearly simultaneously.
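To put rough numbers on the point about failures not being independent, here is a minimal sketch using a simple common-cause ("beta factor") style approximation. The 5% per-period failure chance and the 10% common-cause share used in the example are made-up illustration values, not figures from this thread or from any drive study.

```python
def both_fail_probability(p_single: float, common_share: float) -> float:
    """Approximate chance that both drives of a mirror fail in the same period.

    p_single     -- assumed chance that one drive fails in the period
    common_share -- assumed fraction of failures with a shared cause
                    (firmware bug, power supply, heat) that takes out both drives
    """
    shared = common_share * p_single                     # one event kills both drives
    independent = ((1 - common_share) * p_single) ** 2   # two unrelated failures
    return shared + independent


if __name__ == "__main__":
    print(f"independent only:   {both_fail_probability(0.05, 0.0):.4f}")
    print(f"10% common causes:  {both_fail_probability(0.05, 0.1):.4f}")
```

With these invented inputs, even a small share of common-cause failures nearly triples the chance of losing both halves of the mirror, which is the effect the comment describes.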

Remember that you don't have to back up 1TB every month, just the changes to your files, which for most people are minimal. You don't need to back up your entire collection of movies from thepiratebay, just important documents, photos, things that can't be replaced. And then you only need to upload each month the important files that are new or have changed. These deltas for most people are probably less than a gigabyte. At a 1 Mbit/s upload speed, that would take less than 3 hours _PER MONTH_ to upload. Now just schedule your backups to run nightly while you sleep and I think you'll be just fine.
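The arithmetic behind that estimate can be checked with a few lines. This is a back-of-the-envelope sketch that reads the comment's "1mb/s" as 1 megabit per second and uses decimal gigabytes; both are assumptions, and real uploads will be somewhat slower because of protocol overhead.

```python
def upload_hours(size_gigabytes: float, uplink_megabits_per_s: float) -> float:
    """Hours needed to upload size_gigabytes over a link of the given speed."""
    bits = size_gigabytes * 8 * 1000**3                   # decimal gigabytes -> bits
    seconds = bits / (uplink_megabits_per_s * 1000**2)    # megabits/s -> bits/s
    return seconds / 3600


if __name__ == "__main__":
    print(f"1 GB of monthly changes: {upload_hours(1, 1):.1f} hours")
    print(f"full 1 TB upload:        {upload_hours(1000, 1) / 24:.0f} days")
```

The monthly delta fits comfortably in a few nights, while the first full upload of a whole terabyte at that speed would take months, which is why selecting only irreplaceable files matters.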

4) Hardware failure. This is one of the lowest orders of lost data, although when it happens, it can be one of the most extreme.

I don't think this is quite right. Hardware fails all the time, it should not be underestimated, and often it is useful to augment backups with RAID. It just depends on your exact needs... what is this data? Does the world end if you can't get back up and running within an hour of a hardware failure? If 12 hours of downtime is ok, then restoring a backup, is of cours

The article smacks of false dichotomy. There are a number of solutions, not just Windows 7 or a hardware RAID controller.

To begin with, every NT-lineage Windows version ever produced supports software RAID out of the box. Add that to the fact that any major Linux distro today supports software RAID. And so do the *BSDs. And Mac OS X. And Solaris. And probably a bunch of other platforms I can't think of right now.

Hell, you could buy one of these [linksysbycisco.com] and throw the drives in it, connect it to your network switch, and presto -- instant RAID+NAS.

I think we would all like to know why you think Windows 7 is your only option, because if that's what you think, you don't know how mistaken you are.

The article smacks of false dichotomy. There are a number of solutions, not just Windows 7 or a hardware RAID controller.

Agreed.

As I see it, if you want guaranteed repairability then you basically have two options: enterprise-class hardware with a support contract (and price tag to match), or an Open Source software solution.

Put another way, either you pay someone to take responsibility for fixing it, or you take responsibility yourself. A Microsoft solution doesn't give you enough control to take full responsibility, because you can't be certain that it will be legally or technically possible to recreate your current setup

Your data is most important and you plan to use FAT? Good luck with that!

Seriously, though. No RAID solution that is not totally S/W is portable. But do you really need RAID? It sounds like what you need is a good backup solution with frequent backups. Does your data change so much that losing one day's worth of data would be a problem?

Windows can toast NTFS just as often as FAT. I know Microsoft has trained everyone on the gospel of NTFS but it isn't a big selling point. One difference is that FAT gives you a much larger variety of recovery options. You can have a FAT toasted beyond recognition and still get it back by putting it into a Win 9X box. It is amazingly resilient.

The big problem in this picture is the way that Windows deals with drive errors. It doesn't report them and people commonly discover that one of the drives in a mir

No. NTFS is not perfect, but to think FAT is as bad is deluded. I've honestly never seen a HD formatted with NTFS that I couldn't repair with built-in tools, unless it had physical defects, and in such a case ANY file system would have problems. But I've seen so many FAT drives get hosed by little problems, it's not even funny.

I've messed with software RAID in Windows 2003 server for years and have never had a drive failure not reported to the Event Log.

Well, sure, but how often do you read your event log? Most users _never_ read their event log, so logging the failure like this is next to useless. This is something the user needs to know about, immediately. At the very least a notification area icon and pop-out box would be appropriate.

"Most users _never_ read their event log, so logging the failure like this is next to useless."

The original complaint was that Windows doesn't report disk errors, which is not true. If this were Linux, people would point out that the information desired is right there in the logs..

Software RAID isn't (wasn't?) even available on consumer desktop versions of Windows, so you'd expect some minimum level of cluefulness on the part of the user and less handholding on the part of Microsoft.

Agreed, RAID protects you from a drive failure, period. If your controller fails you lose data, if your motherboard fails you may lose data, if your memory fails you may lose data, etc...

As the poster said, I would go first with a proper backup strategy, it is more important in order to secure your data. Then, go with RAID if you evaluate that you still need it. Heck, as the other poster mentioned, you may find out that you do not need RAID anywhere although I always like to add RAID AFTER my strategy is ma

I'm no expert, but it seems that RAID1 doesn't provide as much safety as some people think, because corrupted data just gets copied twice, so now you have two copies of the corrupted data. Same with accidental deletion--both copies are gone.
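That intuition is easy to demonstrate with a toy model. The sketch below is not how any real RAID implementation works internally; the `Mirror` class and its methods are hypothetical names, and each "disk" is just a Python dictionary.

```python
class Mirror:
    """Toy RAID 1: every write and delete goes to both disks."""

    def __init__(self):
        self.disks = [{}, {}]            # each dict stands in for one drive

    def write(self, name, data):
        for disk in self.disks:
            if disk is not None:
                disk[name] = data

    def delete(self, name):
        for disk in self.disks:
            if disk is not None:
                disk.pop(name, None)

    def fail_disk(self, index):
        self.disks[index] = None         # simulate a dead drive

    def read(self, name):
        for disk in self.disks:
            if disk is not None and name in disk:
                return disk[name]
        raise FileNotFoundError(name)


if __name__ == "__main__":
    m = Mirror()
    m.write("thesis.doc", "final draft")
    m.fail_disk(0)
    print(m.read("thesis.doc"))          # survives a single drive failure

    m = Mirror()
    m.write("thesis.doc", "final draft")
    m.write("thesis.doc", "garbage")     # bad write is mirrored faithfully
    print(m.read("thesis.doc"))
    m.delete("thesis.doc")               # so is an accidental delete
```

The mirror survives the dead drive, but the overwrite and the delete are applied to both copies at once, so there is nothing left to go back to.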

If all you want is multiple copies of your data, then really what you want is an automated incremental backup system, that copies your files to a second hard drive, and ideally keeps a few older copies so that if a file gets accidentally deleted or somehow corrupted, you have a chance to go back and find a usable copy. This is what I do on my system: I keep multiple incremental copies from the last few days/weeks/months. It was easy to set-up (in Linux, mind you). Do hourly syncs if necessary.
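The comment doesn't say which tool it uses, so as an illustration only, here is a minimal sketch of one common way to keep several dated copies cheaply: each snapshot looks like a full copy, but files that are unchanged since the previous snapshot are hard-linked instead of copied again. The function name, the folder layout and the dated snapshot names are assumptions, and hard links need a file system that supports them (NTFS or ext3 do, FAT32 does not).

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source: Path, backup_root: Path, name: str) -> Path:
    """Create backup_root/name as a complete-looking copy of source.

    Files identical to the most recent earlier snapshot are hard-linked,
    so unchanged data is stored only once. Snapshot names are assumed to
    sort chronologically (for example "2009-06-01").
    """
    backup_root.mkdir(parents=True, exist_ok=True)
    earlier = sorted(p for p in backup_root.iterdir() if p.is_dir())
    previous = earlier[-1] if earlier else None
    target = backup_root / name
    target.mkdir()                                  # fails if the snapshot already exists
    for src in source.rglob("*"):
        rel = src.relative_to(source)
        dst = target / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)                       # unchanged: share storage with older snapshot
        else:
            shutil.copy2(src, dst)                  # new or changed: store a fresh copy
    return target
```

Called once a day (or once an hour) with a new name, this leaves a directory per snapshot, and deleting an old snapshot directory never harms the newer ones. The backup drive should be treated as read-only, since editing a hard-linked file in place would change it in every snapshot that shares it.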

Also critical, if the poster is truly concerned about never losing data, is to get some kind of offsite backup. Two hard drives don't do you much good when the computer is stolen or your roof leaks. You need to have a way to regularly copy data offsite (ideally automated over the network, or via external hard drive if you're sufficiently disciplined).

RAID has its uses, to be sure. But if the poster is most worried about never losing important user files, then it seems like what he wants is the multiple-redundancy of backup, not the immediate failover of RAID.

With two drives you can set up a backup scenario. Put the second drive in an external enclosure and use robocopy on a regular basis. Most data loss is from user error. RAID 1 won't protect you from that. Screw FAT32 and use NTFS. If you need to access it in an emergency, put the drive in an external enclosure.

A serious backup solution HAS to include incremental backups where you keep ALL previous copies of a file for a given period of time.

Otherwise, you will quickly get screwed in various scenarios. One of them is your main controller failing so slowly that you won't realize that your data is slightly corrupted before it is too late (e.g. you have already overwritten the copy and you no longer possess any non-corrupted copy)

It also needs to have all components redundant. The cheapest way to achieve this is to h

If you want data integrity, use NTFS. Using FAT32 is like saying you want a reliable car, so you're buying an Edsel because they've been around a long time-- it doesn't make sense. Every other OS on earth can read NTFS (if not write it), so it won't affect your portability requirement.

Secondly, before you make any decision regarding Windows 7 RAID, make sure the edition of Windows 7 you want to buy ships with software RAID support before you put all your eggs in that basket-- early betas and RCs of Vista had software RAID enabled, only to have it disabled before release. I've seen no guarantees about Windows 7 software RAID support, and which editions will have it enabled. (If any.)

If you're planning to move to a server OS after Windows 7 expires, I can practically guarantee software RAID will be enabled, but that still doesn't mean you can necessarily upgrade your Windows 7 software RAID array to a Windows Server software RAID array. Do your homework.

You sound like someone who needs to be reminded that RAID IS NOT BACKUP! Google for that sentence. All you talk about is saving your data, and RAID will not do that for you. You'd be better off just using the second drive as a backup. RAID will not save you from accidental overwriting of data, corrupt filesystems, broken chipsets, etc. The only thing RAID will save you from is downtime. If you lose that much money on the downtime it takes to recover from a backup, then by all means, use RAID, but don't treat it as a backup solution that will protect your data. That's not what it's made for.

All it does is write the metadata on the disk in a specific format so that you can see the RAID volumes via the BIOS. Note: you can only "see" their status. If you replace a drive, the resync is still done in software and you must boot into the operating system. One clue is the fact that in Linux the dmraid package uses exactly the same driver for accessing fakeraid-mirrored drives and Linux's own software RAIDs; device mapper just does a bit of magic at init.

However, if faced with choice of Windows-only or motherboard-raid, I'd go with the motherboard-version, because that's at least supported both by Windows and Linux so in case something goes wrong with your Windows installation you can always pop in Knoppix or some other Linux CD for recovery.

DO NOT buy a real RAID card unless you have a pretty good budget for your system, and need the highest performance. The problem with buying a real RAID card is that you need to buy not 1, but 2 or better yet 3 of them, so that you can have spares. If your RAID card dies (and they do, more often than you'd think), the only way you'll be able to access that data, because of the proprietary on-disk storage method used by RAID card vendors, is to have an identical card (with the same firmware version, to be safe). And since hardware is constantly being obsoleted, you need to buy your replacements when you buy your card, not hope they're still available later. It's also a good idea to have spares of the same make and model hard drive, because hardware RAID controllers aren't usually as flexible as Linux software RAID in allowing you to pair up different sized drives.

For many purposes, software RAID using Linux is really a much better solution, because the on-disk format is open-source and standardized, so it doesn't matter what hardware you have, you can plug the disks into a different Linux system and you'll be able to read the data with no trouble. The only downside is a slight performance decrease since the CPU has to do all the work, but even then unless the system is heavily loaded, it's still faster than hardware RAID because the hardware RAID cards aren't that fast.

With the giant drives that are now common, I think the best solution, at least for home/desktop systems, is to forget about RAID5/6 altogether and just get a couple of 1-2TB SATA drives and mirror them with software RAID 1 in Linux.

Cheap RAID controllers suck - at least you can trust Windows to be consistent between installations if need be. External (preferably offsite) backup is also a must! As I'm sure you'll be reminded 1000 times in this thread, RAID is not backup.

RAID1 serves only one function. Increased uptime. If avoiding having to spend 2 hours restoring from a backup is your primary goal, then RAID1 might make sense for you. Do you have an office full of workers that will all lose productivity if you have a system crash? If so, then RAID may make sense.
Any other use of RAID1 is fool's gold. It will not protect your data from a system-level problem. It will not protect your data from corruption (especially not on a FAT32 file system, which Windows' own format tool won't even create on a partition above 32GB). It will not even always protect you from a single drive failure, since the rebuild process in a RAID1 setup often kills the second drive while trying to recover data.
As many have said already on the thread, RAID is not backup. Backup needs to be a completely independent device. Unless you have serious uptime considerations, RAID1 should not be part of your backup strategy.

Actually RAID1 is quite good for reading data: it minimizes seek time. Of course, it works fine only as long as there are not many writes. For example, think analytic databases, cubes, etc. Those are not written to in real time (unlike the more common transactional databases).

RAID is no substitute for backups. RAID is very good at propagating errors and problems very quickly, be they software glitches or human errors.

For consumer class storage, weekly / daily backups might be more efficient than investing a lot of effort into live RAID. Since I'm a Mac guy, I see the best answer to this question as Time Machine to a network / USB attached drive -- hourly (configurable for more or less often) differential backups, almost transparent to the user. To my knowledge, Windows has no similar set of software to allow reinstallation to the last hourly backup -- my wife had the misfortune of having to restore a blank drive from her last backup and it was a flawless process that truly left her where she left off less than an hour before the hardware failure. The reinstall wizard just had to ask where the backup was. Casting aside MacOSX advocacy, there is truly no substitute for a good automated backup solution that is regularly tested. I think the best method would use the fewest common components, like a NAS, followed by an external drive with its own power supply. My least favored option would be an internal drive with every single component shared.

Most likely it's just BIOS headers for some RAID functions. The actual RAID stuff is done by the OS. Almost all, if not all, "SATA RAID" cards that cost less than $100 are just SATA controllers with a thing in the card's BIOS that says RAID. The OS will still see the individual drives and will have to piece them together. Go search Linux SATA RAID and you will see many, many articles on this. A "true" RAID card will show up as a single device to the OS (or several LUNs), you will not see the individual physical dri

The only way to keep your data secure in any reasonable fashion is to make a copy of it and store it offline, off site. Ideally "off site" would be in another building or city, but it at least has to be on something not attached or accessible to your computer.

Without regard to if you use software or hardware RAID or the quality of the RAID system, RAID only protects you from a physical disk failure. If you as a user screw up (delete or change something you didn't want to) or if some software bug screws up for you, or if you have a non-disk related hardware failure (causing a data corrupting machine crash) then you have lost your data -- RAID doesn't help.

Even if you are only trying to protect against disk errors, if the RAID system fails (even expensive quality ones can), or if you don't know and follow the recovery procedures EXACTLY, you can lose all your data.

The only reliable solution is making a copy or a "backup". Backup does not mean making a copy of the data on the same machine. (Whatever took out your RAID might also take out the other non-RAID disk or directory that you put your copy on.) If you are paranoid (or just prudent) your backup should not be a mapped or mounted drive on another machine. (Viruses can write to the network as well.)

And finally... Backups only count if you have tested your restore process.
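One simple way to test a restore is to restore into a scratch folder and compare checksums against the live data. The sketch below assumes exactly that workflow; the function names are hypothetical, and it only checks file contents, not permissions or timestamps.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):   # read in 1 MiB pieces
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def restore_problems(original: Path, restored: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy."""
    a, b = tree_digests(original), tree_digests(restored)
    problems = [f"missing: {p}" for p in sorted(a.keys() - b.keys())]
    problems += [f"differs: {p}" for p in sorted(a.keys() & b.keys()) if a[p] != b[p]]
    return problems
```

An empty list means every file in the original folder came back byte-for-byte identical. Files changed since the backup was taken will naturally show up as "differs", so the check is most meaningful right after a backup run.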

I won't try and improve on the comments above, almost all of which I agree with, but I will make one observation. The reason for mirroring is to protect against drive failure. The one time I had a drive failure, mirroring saved the day's data. The best way to protect against drive failure is to buy server grade SATA drives, which are designed for 24/7/365*5 operation, and not cheap PC drives which are designed for 10 hours per day for 3-4 years. Buy server grade SATA drives, mirror them using a hardware con

The best way to protect against drive failure is to buy server grade SATA drives, which are designed for 24/7/365*5 operation, and not cheap PC drives which are designed for 10 hours per day for 3-4 years. Buy server grade SATA drives

This is just pure marketing baloney. Do you have any real-world tests that actually back this claim up? I've never used "server-grade" drives, and never will. I've seen "server grade" drives fail in large quantities and "desktop" grade drives last for years running 24/7/365.

With RAID mirroring, if you overwrite or delete an important file, its copy on the mirror is immediately overwritten/deleted too, and the file is lost. Wouldn't you rather have a good regular backup?

And as someone pointed out already, FAT is really not a reliable file system. If you are on Windows, use NTFS. It is still portable, having read/write drivers for both Linux and Mac (see this guide [alma.ch]).

Since the files you want to keep safe appear to be regular files, not system files, any simple file copy mechanism could do. For an easy and simple system, you can use the Windows robocopy.exe tool in a batch file. For a more sophisticated system which can keep older file versions, and can easily be adapted for use over the network, you could try a Windows version of rsync like cwrsync. There are also a few rsync GUI frontends for Windows.

If you decide you really want RAID mirroring and go with the hardware solution, my understanding is that you need a replacement controller in case yours breaks. Since your controller seems to be embedded in the motherboard, you would need a replacement motherboard.

With the Windows software RAID, you are dependent on that software, and have portability only between machines with this Windows 7 software RAID (possibly even only this particular version).

Wrong. You need to buy at least two of these controllers, at the same time, or else when your "real" RAID card dies (and they do), you'll lose all your data unless you can find an identical card (you may even need the exact same firmware version).

Software RAID on Linux is a much better solution, as the underlying hardware doesn't matter. You can mix and match different drive models/sizes (can't do that on HW RAID), and swap the drives to a different system and still read them thanks to the standardized on-disk data format.

I'm with him. I have a bunch of external FW800 RAID enclosures (for both SATA and older IDE drives.) Advantages include not using/depending on the computer for RAID drivers (in my case, Mac OS X software RAID), independence from computer failures (e.g. bad power supply), cooling (less hot drives in the case). Disadvantages include performance (unless you have a good eSATA RAID case) and Size, Weight & Power (more than for drives within the computer case).

...then you probably don't need RAID. Use NTFS and set up some kind of scheduled backup to the second drive. Or, build a Linux NAS device and run BackupPC (backuppc.sourceforge.net). BackupPC works great for this sort of thing, it can do incremental and full backups of all your data, on the schedule you choose.

Mirroring with RAID 1 does not take the place of regular backups of data. RAID 1 is for rapid recovery to minimize downtime in MOST instances. Data should still be backed up separately in case of a catastrophic failure.

If you just want a convenient backup of your music collection, porn collection, musical pr0n collection, or your pr0n musical collection then RAID is not a horrible thing. However, if you're backing up important files, like the only existing scans of the now-burned dossiers William Mark Felt [wikipedia.org] left you, then you should not stop at RAID. Statistically speaking, if something happens to one HD in your machine, like a massive power surge or being confiscated by tight-lipped men in black suits and black sunglasses, it has a pretty high probability of happening to the other HD. Offsite backups are, therefore, prudent. Leaving a HD in a box at the bank and giving the key to your lawyer is one of the safer things you can do, but not terribly convenient. There are a variety of online backup services available that are decent. I'll leave it to others to speculate on which ones are least likely to be fronts for the NSA. If you feel that your data might actually be interesting to more than one human being on Earth, don't forget to encrypt it. (Be honest with yourself. You are posting to /. after all.) I'm rather fond of emailing moderate risk files to my gmail account. (Stupid, I know, but very low effort and they're available anywhere you feel safe enough to check your email.)

As for motherboard RAID chipsets... Keep in mind that your motherboard has a non-zero probability of frying, having its caps go bad, being peed on by irate government agents, etc. I once had a RAID 0 array that was hooked up to one of those things. After the mobo died I had to do without letters K through P of my Japanese horror-comedy-porno-game-show collection until I was able to find a used computer with the same RAID chipset. (I don't know if it's changed, but at the time each different RAID chipset made RAID 0 arrays that were not compatible with anything else on this lump of rock.) If data portability rather than performance is a priority for you, my advice would be to avoid hardware RAID entirely.

There are a number of reasons not to use FAT:
1. Unreliable
2. Doesn't support large files
3. Doesn't support advanced permissions
Since you are running Windows, use NTFS, an external USB drive for backup (also NTFS) and the free Microsoft SyncToy to make periodic backups to the external drive.
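On the "large files" point: FAT32 records each file's size in a 32-bit field, so a single file cannot exceed 4 GiB minus one byte. A small sketch for spotting files that would not fit (the function names are made up for this example):

```python
from pathlib import Path

# FAT32 keeps the file size in a 32-bit field: the limit is 4 GiB - 1 byte.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes: int) -> bool:
    """True if a file of this size can be stored on a FAT32 volume."""
    return 0 <= size_bytes <= FAT32_MAX_FILE_BYTES


def files_too_large_for_fat32(root: Path) -> list:
    """List the files under root that exceed the FAT32 per-file limit."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and not fits_on_fat32(p.stat().st_size)
    )
```

Disk images and long video captures are the usual offenders.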

NTFS and ext3 have journaling, FAT12/16/32 and ext2 don't have journaling.

FAT12/16/32 have a central structure (the FAT). Damage it and your data is lost. ext2 and ext3 store their meta data redundantly.

RAID is no replacement for Backup.

A real hardware RAID is expensive, and appears to be a single disk to both BIOS and OS. Its on-disk meta data is proprietary, i.e. if your HW RAID controller dies, you need exactly the same controller again to get access to your data. HW RAID works with every OS, because it appears to be a single disk (typically, SCSI). Booting from complex RAID configurations is no problem, as each RAID appears to be a single disk. The RAID controller is a small computer on its own, taking care of the required calculations for non-trivial RAID levels, of switching to hot-standby disks, and of detecting broken disks.

A software RAID is cheap as dirt; every single disk of the RAID appears in BIOS and lower levels of the OS. The on-disk meta data depends only on the OS, so you can mix controllers as you like. A broken controller is no problem: replace it with any controller that has the same connectors and your data is back. Booting can be a problem, because the BIOS does not know anything about the RAID. Usually, booting is only possible for RAID-0 and RAID-1. Booting another OS is problematic, because there is no standard for software RAIDs. Linux may be able to work with Windows RAID volumes, but Windows can't work with Linux RAID volumes. Calculation and monitoring is done by the host CPU.

A host RAID is nearly as cheap; the only difference from a software RAID is that the BIOS decides about the on-disk meta data. Special drivers for each supported OS know the structure of the meta data, but they don't allow other controllers to be used in the same RAID. A broken controller is a problem, because drivers will refuse to work with other controllers. Booting is no problem, because the BIOS knows about the RAID.

I prefer pure software RAIDs, for a simple reason: They do not depend on available hardware. If one controller dies, switch to another one: Other brand, other type, other drivers, and the RAID still works. If you insist, you can even mix an IDE drive, a USB drive, a SATA drive and a SCSI drive into a single RAID. Try that with a hardware or host RAID. Some people even built RAIDs of floppy disks or USB sticks (not for permanent use, of course).

My faithful old Linux home server runs two RAIDs, both in software: a RAID-1 for the OS (remember: the BIOS does not know about the RAID), and a RAID-5 for the data. The RAID-1 used to run on old SCA drives, but recently, I switched to two small IDE drives due to unrecoverable SCA cabling problems. The RAID-5 is composed of four IDE drives, connected to two IDE controllers, each disk on a single IDE cable. An external USB disk is used to back up my data, rotating through 10 days. All filesystems are ext3, all disks are monitored using SMART, all RAIDs are monitored. If anything goes wrong, I will get an e-mail from the monitoring software.

Until recently, one of the controllers was an el-cheapo non-RAID controller, and the other one was a donated, expensive, well-known brand, RAID-capable controller running in non-RAID mode. The latter one decided to randomly take some free time on the job, and either disconnected from the PCI bus or disturbed it, causing panics in the OS above. Only pure luck protected me from data loss. I ripped it out of the machine, kicked it into the trash bin, rewired the RAID to use two disks per IDE cable, and verified and reconstructed my data. Some days later, another el-cheapo non-RAID IDE controller arrived, the same brand, model and type that already sat in the next PCI slot. So I rewired the RAID again to work with one disk per cable, and everything was fine again.

For a new small business or home server, I would use nearly the same setup again: Two software RAIDs, one for the OS, and one for the data. Upgrading the OS is just fun when you can

My faithful old Linux home server runs two RAIDs, both in software: a RAID-1 for the OS (remember: the BIOS does not know about the RAID), and a RAID-5 for the data.

Beware of RAID-5, it's dangerous.

The problem is that reconstructing the array after a disk fails is a very intense operation that touches every sector of every disk. If another disk in the array has a latent failure, the reconstruction operation will trigger it, and when you lose two disks from the array, you're hosed.

This happened to me. I had a RAID 5 array with a hot spare, one drive failed and dropped out of the array, and the process of reconstructing onto the hot spare triggered another failure.
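A back-of-the-envelope way to see why rebuilds are risky is to estimate the chance of hitting at least one unreadable sector while reading every surviving disk in full. The sketch below is only a model: it assumes an unrecoverable read error rate of 1 per 10^14 bits (a figure commonly quoted on consumer drive datasheets, not taken from this thread) and treats errors as independent. The function name and the disk sizes in the usage note are hypothetical.

```python
import math


def rebuild_read_error_probability(surviving_disks: int,
                                   disk_bytes: int,
                                   errors_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error during a rebuild.

    A RAID-5 rebuild reads every surviving disk end to end, so the number
    of bits read is surviving_disks * disk_bytes * 8.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - rate) ** bits, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-errors_per_bit))
```

For example, `rebuild_read_error_probability(3, 500 * 10**9)` (a four-disk array of 500 GB drives with one failed) comes out at roughly 11% under these assumptions, and the figure climbs quickly as disks get larger.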

I often use inexpensive SATA RAID controllers from Promise (I do not work for Promise). They don't cost very much and they have been absolutely reliable for me (for many years now!). I often stick with RAID 1. I've built several RAID 5 arrays and I don't find a lot of value in them for low capacities. Mirroring (RAID 1) is straightforward, and if you ever have a problem you can always read one of the RAID 1 drives using a SATA to USB interface, or if you ever need to clone a hard disk it's easy. Promise seems to use the LAST 64k of the hard drive for its mirror info, not the FIRST 64k! This makes either of the two drives in the RAID array easy to use out of the array when/if you're in a jam (for whatever the reasons). As far as RAIDing your data only, in my opinion RAID is designed to avoid lengthy recovery procedures - don't put yourself in a disadvantaged position - all hard drives fail eventually - RAID the OS, your data, everything! If your server is a very busy server - start looking at higher end RAID solutions.

If you want security, you need backups, and backups are:
- off line (viruses, power surge, sabotage...)
- off site (fires, theft...)
- tested (I've got horror stories of people that THOUGHT they had backups...)
- multiple (... and of backups that turn bad at the worst possible moment)

Raid is none of that. I know plenty of people who thought their data was safe because they had raid. It isn't, it wasn't, it ain't ever gonna be.
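The "tested" part is the one most often skipped. A minimal sketch of one way to test a file-level backup is to compare checksums between the originals and the copies; the helper names here are invented for the example, and a real test should also include an actual trial restore.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_backup_problems(source: Path, backup: Path) -> list:
    """Return (relative_path, reason) for files missing or different in backup."""
    problems = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        copy = backup / relative
        if not copy.is_file():
            problems.append((relative, "missing"))
        elif sha256_of(copy) != sha256_of(src_file):
            problems.append((relative, "differs"))
    return problems
```

An empty result only says the copies match the originals right now; it says nothing about whether the backup media will still be readable next year, which is where "multiple" comes in.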

Then forget about RAID. RAID is designed to protect the integrity of the underlying volume - NOT the data that's on it.

> Of course, such a setup should secure my data; should a drive crash,

Then forget about RAID. RAID will only secure your data under some very specific cases of hardware failure of the drive. It does absolutely nothing towards preventing data loss due to (say) a corrupt file allocation table, virus, accidental deletion, or corruption.

> Even more importantly, I want any drive and its data to be as safe and portable as possible

Then use proper backups - not RAID. Preferably off-site backup. I use Carbonite which backs up to the 'cloud' at minimal cost.

By all means use RAID to protect you from hard disk failure, but don't under any circumstances assume it stops you losing your data. For backups, I always use the rule that, at any given point in time, you should assume that the next time you walk back into your house/office, NOTHING in that building is still there. Do you have a copy of everything you care about somewhere else?

I'm still amazed by people that carry 12 months of work around on a single floppy disk/USB stick/laptop, then cry when they go to the helpdesk asking what "sector not found reading drive A:" means, or perhaps "A USB device attached to the system is not functioning".

Get your data in as many places as possible - preferably three. A drive which is mounted one inch above the main one is *NOT* a valid second place!

"Based on past experiences, I have decided that only my data is worth saving"

See? He is asking for backup, not RAID. It has been said one thousand times but it seems it must be said again: RAID is *NOT* there to protect your data. NOT, NOT, NOT and then NOT again.

RAID (not talking about RAID-0) is there in order to enhance your data's availability (as in, say, instead of being able to get to my data 99% of the time, I can get to it 99.9%) but when it's hosed, it's hosed. To protect your data you need backups, not RAID.
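To make the availability figures concrete, here is the arithmetic as a tiny Python sketch. The function name is made up, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected hours per year during which the data is NOT reachable."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)
```

Going from 99% to 99.9% takes expected downtime from about 87.6 hours a year to about 8.8 hours. That is a real gain in availability, and it says nothing at all about whether the data survives.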

"Of course, such a setup should secure my data"

Of course not. Of course you will get quite a funny face when you discover it, more or less the one the guy from this story had about six months ago, with the very enlightening title "Why Mirroring Is Not a Backup Solution": http://hardware.slashdot.org/article.pl?sid=09/01/02/1546214 [slashdot.org]

"Even more importantly, I want any drive and its data to be as safe and portable as possible"

Then, *even* if RAID could be considered for data security (which it cannot), you already answered your question: as a general matter, hardware RAID will only work when using exactly the same controller model, possibly down to its minor revision. You can't count on breaking a hardware-managed mirror, taking one disk to a standard SATA controller and getting any data out of it. If your controller dies and miraculously doesn't take the disks with it, you can't count on buying a different RAID card (as it will most probably be in about a year for consumer-grade hardware) and getting any data out of the mirror. So you should go with software RAID.

Preach Brother! I have had several of my customers ask about RAID, but when actually sitting down and talking to them it turns out they are looking for a backup solution which RAID most certainly is NOT.

Here is what I recommend to my clients: use whatever you want inside your machine, but get a USB HDD or, even better, a NAS for backups. Most come with very capable backup solutions provided, and they are much better for the purpose than RAID, which, as you so very eloquently put it, is for access NOT backup. There are several

Yes, Mod parent up. RAID is for providing data resiliency, not data protection. In corporations where either very large data sets would simply take too LONG to restore, or where spindle count for achieving IOPS is critical, RAID permits a reduction in failure rate. However, it is NOT a replacement for backups, and RAID should generally only be considered when you're already considering a multiple disk setup (either for capacity or performance reasons).

Why RAID is not a backup:
1) not fireproof.
2) not mistake proof: "oops, didn't mean to delete that"
3) not immune to file system corruption.
4) not immune to power supply failure/surge/lightning/other destructive forces
5) more expensive than a good backup
6) not portable offsite
7) does not track version history or old files (something that should be of critical importance to a programmer...)
8) Viruses, malware, hackers, oh my!
9) bad/corrupt install
10) OS failure

I could easily go on. I worked in DR for 4 years...

Nearly all of the above have a higher frequency of occurrence over a typical 5 year HDD life. Even if you continually replace drives without a data failure, you're still eventually going to have an issue RAID cannot deal with.

My Qnap was a $399 device. The 4 drives in it were $90 each (and the 5th spare too). The HDDs I run the PC off on the RAID 1/0 were $40 each. I only run the RAID 1/0 for performance during video editing. I chose 1/0 vs 0 since 0 halves the reliability of the drives. Even though I do have a good recovery solution, neither the downtime nor the effort involved in recovery would be welcome, and the extra $80 to mirror the performance stripe was easily spent.

The Qnap is also my iTunes media server, my FTP server, included the price of the DR software, and runs 2 IP cameras I set up at home too (which let me tell the insurance company I have real-time video monitoring, and they knocked an extra 5% off my homeowners policy cost, which by itself is enough to fund replacement drives as I'll need them).... Oh, yeah, and it's a NAS too... It has a lot of value beyond a backup system.

I'm guessing you've not got a child yet, or a large family. You probably don't value the pictures you take, files you have, and other stuff on your PC. That's fine, someday you likely will.

There are cheaper ways than mine to do backups. I have over a TB, and 3 (currently; soon to add 2 Macs to the list and decom 1 old laptop, leaving me with 4) computers I'm backing up, so backing up centrally makes sense. If you have 1-2 machines, a small amount of data, and don't value most of it, then 2 external USB drives and a safety deposit box (Dad's house) usually suffice... Or, just an online backup account for $5 a month...

RAID 1 might save you from a firmware failure, or a disk going bad, but that's about it... Also, RAID 1 may be cheap, but a backup is cheaper. Also, good luck rebuilding that RAID if your MOTHERBOARD fails... RAIDs are proprietary to a particular controller. Unless your new board uses the same chipset (and firmware too in most cases) you're screwed without a backup.

And what a lot of people don't realize is this: if you build a RAID array and a drive fails, can you replace the drive with the exact make and model? Raids work best when every disk in the array is the same model and revision. If you plan to build a 5 disk raid array you should also purchase a 6th drive to keep as a cold spare.

I built a RAID 5 array using three 500GB disks via mdadm under Linux. I assembled the array and formatted it. Within minutes of testing I was getting mail from mdadm telling me the array was degraded. I then began to test each disk for defects and, lo and behold, one disk was bad right from the start. I tried to RMA the disk but Newegg informed me those disks were now obsolete. Great. I was credited for the bad disk and purchased a new one that closely matched the other two. It was a nightmare, as during some boots the disks went haywire and I would get a "Could not bd_claim sdaX". It would hang for a while and I would have no array. It happened only rarely at first, until it became a real problem. I kept my most precious data safely backed up on different disks I had spread around. It finally got so bad that I would have to reboot the machine up to ten times before the disks were synced up and the array worked. I purchased a 1TB disk, copied all the data off the array to it, and used the 500GB disks in other systems. RAID is great for big fat storage arrays but it can become very sensitive and then one day POOF, it's all gone.

This is the reason OEM drives from Dell, Apple, HP, etc. cost four times what a retail drive would cost. The cost is in no way associated with quality but rather with consistency. Retail SATA drives are constantly changing: fewer/more platters, faster seek and read speeds, and firmware revisions. Those costly OEM drives are the same disk every time, right down to the inner workings and firmware. So if you buy an Apple 1TB disk on a sled and it takes a dump in three years, you can be confident Apple will replace that drive with the EXACT same one. It's not a magical Apple disk of superior quality but a Maxtor/WD/Hitachi disk that is produced for Apple with no revision changes unless Apple orders it, unlike retail drives, which are changed at the manufacturer's whim.

So if you are building your own raid, plan for failures and try to buy a spare for your array. I don't know disk shelf life, but it will save you down the line. Also keep a USB or 1394 disk around for backups. Spread your most precious data around, like pictures, home movies and documents. If you have a few computers around the house, keep a mirror of that data on those machines. Music and downloaded video can be re-downloaded but home movies and pictures cannot. Put all the silly stuff on the raid along with the precious stuff for access, but keep backups of the good stuff!

Whoa, hold the boat. I've had a lot of experience with Dell & HP/Compaq (Proliant) provided RAID systems and they are not sensitive to disks with vastly different innards. All that matters is block count, and software mirroring doesn't even care about that, because you'll simply be limited to the size of the smaller disk. If you're using mirroring or RAID, try to go with different makes of the same size. This article [ssdirect.com] talks about MTBF. It turns out that if 2 drives of the same exact model come off the line and end up in your PC, there is a chance they could fail within a very close time to one another. So your mirror or RAID could fail permanently while rebuilding from the first failure. But if all your drives are of a different make, chances are they won't fail at the same time and you'll get the critical time needed to rebuild your array.

When I'm going to do mirroring or RAID on hardware that doesn't have a high-end dedicated server RAID controller, I use Windows or Linux software RAID. Performance is surprisingly good and I'm not married to a specific hardware implementation. I've had _none_ of the issues you've described with Linux software RAID on several servers for several years. Mdadm has only whined after a power outage or genuine disk failure.

Raids work best when every disk in the array is the same model and revision. If you plan to build a 5 disk raid array you should also purchase a 6th drive to keep as a cold spare.

I hate to break it to you, but you're actually wrong.

A RAID array is most effective using completely different drives, but of the same capacity. Five hard disks from the same manufacturer, of the same model, bought at the same time means that you're highly likely to get five drives from the same batch. Let's posit that there was some defect in this batch. Now all five of your drives have a significantly higher probability of failing at the same time. Oops! RAID can only deal with one (or two) drive failures!

Using drives from different manufacturers or model lines means you spread the risk of simultaneous drive failure.

A typical RAID implementation writes a stripe at a time, by issuing a series of writes to each drive. If your disks have the same geometry, then each write will be at the same physical location on each drive and so complete in almost exactly the same time. If they do not, then the different disks will be moving their heads at different times. The RAID controller (hardware or software) will then be bottlenecked by the slowest drive. To make things worse, the slowest drive can be different for each write. One write may require moving the head sideways on one disk, the next may require moving the head sideways on the other. In both cases, you are limited by the worst-case performance for the disk. The same is true for reads on RAID-5, but not RAID-1, which can just use the result from whichever disk returns first.

You may have noticed that some hard drives are marketed as being designed for RAID use. These work slightly differently to most consumer disks. Typically, a small region of a disk is hidden. If the disk discovers a bad sector then it will use one from the hidden region to replace it, so every write to the bad sector goes to one of the spare ones instead. This is very bad for RAID, because two drives writing to the same sector may be writing to two different physical locations (if one is remapped), with the same problems I outlined above.

All modern disks remap sectors as necessary. The main difference between consumer and RAID drives is the timeout for error correction [wikipedia.org].

Is it worth keeping a spare for the sole purpose of having the same model available in the event of a failure when you can get a newer and faster drive in the future?

I would say not, but when one drive fails you should replace all of them. For a home array, expect one drive to fail every few years. I had a disk in a RAID-1 array fail last year. It was a 40GB disk which cost around £100 new. For the same price, I can buy two 500GB+ disks now.

Is the difference in performance between modern SATA drives so significant?

It's not a question of performance, it's a question of the difference between a linear access and a seek. The time for a seek is 4ms+. If a drive can read 50MB/s then a linear access to one 512-byte block takes around 10 microseconds. If one disk is doing a linear access while the other is doing a seek then you are limited by the time of the seek (for RAID-1 writes and RAID-5 reads and writes). If you have to seek after every block, your maximum throughput is 125KB/s. If you do a linear read, your throughput is 50MB/s. If your drives have different geometries, you double the number of seeks you need, dramatically reducing your throughput.
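The numbers above can be reproduced with a few lines of Python. The 4 ms seek and 50 MB/s figures are from the comment; the 512-byte block size is an assumption (it is the size that makes the quoted 10 microseconds and 125KB/s come out), and like the comment the seek-bound figure ignores the tiny transfer time itself.

```python
BLOCK_BYTES = 512                      # assumption: classic 512-byte blocks
SEEK_SECONDS = 0.004                   # 4 ms per seek
LINEAR_BYTES_PER_SECOND = 50_000_000   # 50 MB/s sequential rate


def linear_block_seconds(block_bytes: int = BLOCK_BYTES,
                         rate: float = LINEAR_BYTES_PER_SECOND) -> float:
    """Time to transfer one block when no seek is needed."""
    return block_bytes / rate


def seek_bound_bytes_per_second(block_bytes: int = BLOCK_BYTES,
                                seek_seconds: float = SEEK_SECONDS) -> float:
    """Throughput when every block costs one full seek."""
    return block_bytes / seek_seconds
```

With these inputs a linear block takes about 10.24 microseconds, while seeking before every block caps throughput at 128,000 bytes per second, which is 125 KiB/s: roughly a 400-fold drop from the sequential rate.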

He wants to mirror the drives. This means he wants RAID 1. Therefore, the array only loses data if both disks fail before the first failure is repaired, so its failure rate is well below the failure rate of each disk (not as far below as you'd hope, actually, because they're likely identical drives that will fail at the same time, but you get my point).
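To put rough numbers on mirroring versus striping, here is a sketch under one strong simplifying assumption: each disk fails independently with the same probability p over some period. The function names are invented, and as noted above, identical drives from one batch tend to fail together, so the real mirror figure will be worse than this model suggests.

```python
def mirror_loss_probability(p: float) -> float:
    """Two-disk RAID 1: data is lost only if BOTH disks fail."""
    return p * p


def stripe_loss_probability(p: float, disks: int = 2) -> float:
    """RAID 0: data is lost if ANY one of the disks fails."""
    return 1 - (1 - p) ** disks
```

With a made-up p of 0.05, the mirror loses data with probability 0.0025 while a two-disk stripe loses it with probability 0.0975, nearly double the single-disk risk.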

Funny... cause that "onboard" raid is most likely NOT a true raid controller. Sure, there are real raid controllers that get built into mboards, but that generally adds a good $200+ to the cost, and is almost always SCSI. SATA "RAID" cards that cost less than $100 are usually just SATA cards with a RAID tag in the BIOS for certain basic raid-like functions. They do not handle the actual raid IO; the OS (in this case Windows) does the work. A true RAID card will appear as a single device (or multiple LUNs if