Weirdest thing about RAID-1. When it breaks, it sometimes takes out both drives.

Shit happens - drives from the same batch can die within a short time of each other (especially if you have very disk-intensive rebuilds... mirroring isn't too bad, raid-5 is BAD). And then there's stuff like power surges etc. So yeah, stuff dies.

It wouldn't have been so disturbing if it were just the drives that failed. They're mechanical devices so you have to expect that. "Omnes vulnerant, ultima necat," as those old sundials used to say.

But it wasn't the drives that caused the problem. In both cases it was a controller issue (one HP and one IBM branded) using drives originally installed by the manufacturer. Since these are big league server manufacturers, I'm confident they did the necessary mix & match games to minimize the chance of getting two "bad batch" drives on the same machine.

In both instances the controllers unexpectedly started writing total garbage to both drives thereby rendering them useless. In the case of the IBM card, a firmware update corrected the "engineering issue." With HP, a replacement was necessary because there was a "marginal hardware condition" on the card.

Having it happen two different times on servers from two different manufacturers is a little too much bad luck AFAIC.

Raid-5 (and other "big storage" schemes) would be silly on SSD until their storage capability goes massively up. The added writes of raid-5 are a real concern,

Okay, this has been bugging me. What added writes?? RAID5 is striping with parity...So 2 of the drives split the writes and each get half the file. Parity is written to drive 3 which is (in that config) its sole purpose for existing. What's extra? Traffic on the controller?

Nevertheless, the parity info is staggered between the drives. So if something is written to the array it will be striped between 2 of the drives, and a third drive will catch the parity info for any given write operation. Which goes back to my original quandary ... Where is the extra per-disk write that would possibly cause it to prematurely fail?

A gets half
B gets half
C gets parity

All are on separate physical disks. So other than controller traffic (that's a given) I don't see where anything is really getting doubled-up...At the per disk level. The parity section of each disk isn't going to take any more or less of a hit than its corresponding data segments. So it's not like it's being subjected to exhaustive localized rewrites that are going to "burn-a-hole" in it.
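To put rough numbers on the per-disk question, here is a minimal sketch that counts chunk writes per disk for full-stripe writes on a three-drive RAID-5. The rotating parity layout in `parity_disk` is an assumption for illustration; real controllers use their own layouts.

```python
from collections import Counter

def parity_disk(stripe: int, n_disks: int) -> int:
    """Disk index holding parity for a stripe (simple rotating layout, assumed)."""
    return (n_disks - 1 - stripe) % n_disks

def full_stripe_writes(n_stripes: int, n_disks: int = 3) -> dict:
    """Count (data, parity) chunk writes per disk when every stripe is written once."""
    data = Counter()
    parity = Counter()
    for stripe in range(n_stripes):
        p = parity_disk(stripe, n_disks)
        for disk in range(n_disks):
            if disk == p:
                parity[disk] += 1
            else:
                data[disk] += 1
    return {d: (data[d], parity[d]) for d in range(n_disks)}

counts = full_stripe_writes(n_stripes=300)
for disk, (d, p) in counts.items():
    print(f"disk {disk}: {d} data chunks, {p} parity chunks, {d + p} total")
```

Under these assumptions every disk takes the same number of chunk writes, so for large sequential writes no single disk is singled out. Small in-place updates are a different story, since they touch a whole chunk on two disks.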

It's not so much the actual read/write as it is the fact that every drive in the array spins up for every read/write - so there's more wear and tear on the drive mechanics rather than the disk platter's surface.

If you saved a file to a single drive, only it would spin up and be written to (along with the housekeeping of finding sufficient free clusters). On a three element RAID-5, three drives would be spun up to accomplish the same thing, plus need to write additional information (i.e. parity) above and beyond that contained in the actual file itself. That's three times the disk activity plus "parity tax" plus three times the heat generated over a single drive save operation.

So when you add in the MTBF for each of the three drives, you have a higher probability of a drive failing all other factors being equal. And most arrays have more than three drives since that's the least cost effective RAID-5 configuration since you always sacrifice one drive to parity even if that drive doesn't exclusively hold the parity data.

Most times, the drives chosen for arrays are built to a higher quality standard than those normally deployed in PCs - so that may even up the failure occurrence rate between server and non-server drives despite a higher utilization rate.

I'll have to see if I can locate any hard stats for drive reliability on a per disk basis when used in an array. I'm sure studies have been done. It's just a matter of finding them.

It's not so much the actual read/write as it is the fact that every drive in the array spins up for every read/write - so there's more wear and tear on the drive mechanics rather than the disk platter's surface.

If you saved a file to a single drive, only it would spin up and be written to (along with the housekeeping of finding sufficient free clusters). On a three element RAID-5, three drives would be spun up to accomplish the same thing, plus need to write additional information (i.e. parity) above and beyond that contained in the actual file itself. That's three times the disk activity plus "parity tax" plus three times the heat generated over a single drive save operation.

Hm... three times the disk activity regarding spinning up three drives, ok. But three times the i/o I ain't buying (I'm thinking closer to 1.5). The scenario also assumes the drives weren't already spun up for some reason (do SCSI drives ever spin down?).
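The 1.5 guess can be checked with a quick back-of-envelope sketch. It assumes a full-stripe write, where each stripe holds `n_disks - 1` data chunks plus one parity chunk; the function name is made up for illustration.

```python
def raid5_full_stripe_factor(n_disks: int) -> float:
    """Bytes hitting the disks per byte of file data, for full-stripe writes."""
    if n_disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    # n_disks chunks are written for every (n_disks - 1) chunks of real data.
    return n_disks / (n_disks - 1)

for n in (3, 4, 6, 8):
    print(f"{n} disks: {raid5_full_stripe_factor(n):.2f}x total write traffic")
```

For three disks this comes out at 1.5x in total, spread across all three spindles, so each individual disk writes only half of what a lone drive would for the same file.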

Energy consumption/heat issues I can see (kinda) but it makes me wonder how much extra is another HDD actually gonna cost in a year (filed under why I hate accountants ) $5?

So when you add in the MTBF for each of the three drives, you have a higher probability of a drive failing all other factors being equal.

Granted statistics isn't my thing ... But if the MTBF for a given drive is 3,000 hours, then for 3 drives it should still be 3,000 hrs. Or is this just the Murphy's Law More moving parts... argument? <-I'll buy that - as it appeals to my cynical side (hehe)).
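A rough way to see the "more moving parts" effect: each drive keeps its own 3,000 hour MTBF, but the wait until the first failure anywhere in the group gets shorter. The sketch below assumes independent drives with a constant failure rate (exponential lifetimes), which is a simplification, and the function names are only illustrative.

```python
import random

def expected_first_failure(mtbf_hours: float, n_drives: int) -> float:
    """Mean time until the first of n independent drives fails (exponential model)."""
    return mtbf_hours / n_drives

def simulate_first_failure(mtbf_hours: float, n_drives: int,
                           trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw one lifetime per drive and keep the earliest failure.
        total += min(rng.expovariate(1.0 / mtbf_hours) for _ in range(n_drives))
    return total / trials

MTBF = 3000.0
for n in (1, 3):
    print(n, "drive(s): analytic", expected_first_failure(MTBF, n),
          "simulated", round(simulate_first_failure(MTBF, n)))
```

Under that model three drives see a first failure after about 1,000 hours on average, even though no single drive got any worse.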

And most arrays have more than three drives since that's the least cost effective RAID-5 configuration since you always sacrifice one drive to parity even if that drive doesn't exclusively hold the parity data.

Funny, I would consider 3 or a multiple thereof to be the best choice for RAID5. As regardless of how many drives you have you're going to sacrifice 33% of the total storage for parity info. Goes back to the less moving parts is better argument. Use a smaller number of larger drives. *Shrug* Having 3 drives just makes the 33% parity "overhead" more obvious, not higher.
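For reference, the usual accounting is that RAID-5 gives up one drive's worth of capacity to parity no matter how many drives are in the set, so the fraction lost is 1/N rather than a fixed 33%. A small sketch, assuming equal-sized drives:

```python
def raid5_parity_fraction(n_disks: int) -> float:
    """Share of raw capacity spent on parity in an n-disk RAID-5 set."""
    if n_disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return 1 / n_disks

for n in (3, 4, 6, 9):
    print(f"{n} disks: {raid5_parity_fraction(n):.1%} of raw capacity goes to parity")
```

Three drives is where the 33% figure comes from; at six drives it is closer to 17%.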

Most times, the drives chosen for arrays are built to a higher quality standard than those normally deployed in PCs - so that may even up the failure occurrence rate between server and non-server drives despite a higher utilization rate.

I understand where you're going here, but I can't help but think that the design of a Server/Enterprise class drive would sort of have to be predicated on the fact that it would not be getting very much sleep (e.g. spinning down) ... Know what I mean?

Keep in mind that when you RAID, you're not addressing at sector or filesystem cluster sizes anymore - you're addressing RAID block sizes. So a 1-byte change to a file on RAID-5 can end up pretttty expensive - multiple drives as well as large blocks per drive.

But I guess you'd have a smart administrator that tries to match FS cluster size, RAID block size and, in the case of SSDs, erase-block sizes to something reasonable.

Keep in mind that when you RAID, you're not addressing at sector or filesystem cluster sizes anymore - you're addressing RAID block sizes. So a 1-byte change to a file on RAID-5 can end up pretttty expensive - multiple drives as well as large blocks per drive.

...sssSo, a 1 bite (file change) write automatically requires/results in (assuming 3 drives) a complete rewrite of both of the corresponding blocks? That does sound a bit pricey. But it does explain why the block size selection is so critically dependent on intended usage during setup.

But I guess you'd have a smart administrator that tries to match FS cluster size, RAID block size and, in the case of SSDs, erase-block sizes to something reasonable.

Hm... Any chance you could give an example on the first part before I commit to a yes or no on that??

The price, performance, reliability trinity will be keeping SSDs out of my range for a while yet. I just can't justify paying top dollar for cutting edge performance that might grenade if ya look at it funny. Pretty much the same reason I never got into overclocking heavily.

...sssSo, a 1 bite (file change) write automatically requires/results in (assuming 3 drives) a complete rewrite of both of the corresponding blocks? That does sound a bit pricey. But it does explain why the block size selection is so critically dependent on intended usage during setup.

Yep - read+modify(inmemory)+write. Just like you've gotta do when dealing with a plain IDE drive, you're only dealing with a single drive and a single sector there, though.
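The read+modify+write cycle can be sketched with XOR parity on a toy three-drive stripe. The 8-byte chunk size is an assumption to keep the example readable; real arrays use chunks of many kilobytes, which is what makes a 1-byte change expensive.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

CHUNK = 8  # toy chunk size in bytes (assumed)

# One stripe on three drives: two data chunks plus their parity.
d0 = bytes([0x11] * CHUNK)
d1 = bytes([0x22] * CHUNK)
parity = xor_bytes(d0, d1)

# Change a single byte of d0. The controller reads the old data chunk and
# the old parity chunk, patches them in memory, then writes both back.
new_d0 = bytes([0x99]) + d0[1:]
new_parity = xor_bytes(xor_bytes(parity, d0), new_d0)

bytes_changed = 1
bytes_written = len(new_d0) + len(new_parity)  # whole chunks go back to disk
recovered_d1 = xor_bytes(new_d0, new_parity)   # parity still protects d1

print(f"{bytes_changed} byte changed, {bytes_written} bytes written")
print("d1 recoverable from d0 + parity:", recovered_d1 == d1)
```

In this sketch the drive holding `d1` is never touched; the cost of the small write lands on the data drive and the parity drive for that stripe.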

Okay, to add my 2 bits. SJ brought up a good point - if the main penalty is due to spinup/spindown, then an SSD shouldn't be affected WRT MTBF issues. There is still that nasty penalty that f0dder alluded to: a change as small as one bit requires all drives to be rewritten (read data -> change bit -> recalculate parity -> write data+new parity), which will take its toll on the write-life of an SSD, but that shouldn't change its MTBF, just its lifespan, if you will.

As to SJ's question on why Mean Time Between Failure -

Quote

Granted statistics isn't my thing ... But if the MTBF for a given drive is 3,000 hours, then for 3 drives it should still be 3,000 hrs. Or is this just the Murphy's Law More moving parts... argument? <-I'll buy that - as it appeals to my cynical side (hehe)).

You have the essence of it. MTBF measures any failure within the system. The more parts, the more pieces there are to fail, and the more failures there will be - eventually. This does NOT measure the severity of a failure or even provide a directly useful measure of lifespan, since most failures will occur as the device ages, but it does give a good idea of the expected quality. The correlation is that lower MTBF means it will fail sooner, and while that may statistically be the case, it doesn't mean that any one device (or system in this case) will last longer than any other one. It just means the one with the lower MTBF is statistically more likely to fail before the one with the higher MTBF.

Example cancelled - Can't find the formulas other than calculus that I don't want to get into....

One failure of MTBF Marketing is that redundant systems automatically have a lower MTBF even though they are actually more reliable (because they are redundant and can be fixed without losing system operability). This is not a knock on the measure, but on the usurped use of the measurement for marketing purposes. Further research proved this false. Redundancy is one way to reduce MTBF at the expense of complexity since MTBF looks at the system holistically.

MTBF (Mean Time Between Failure) as I understand it is the earliest statistically likely point for a given device to fail.

Now (operating completely without a net...), the odds of a coin landing heads up are 50/50. Which is to say that statistically there is a 50% chance of it coming up heads (I do believe it's safe to interchange them in that fashion ...Yes?).

The fun starts when you look at the odds of a coin coming up heads if it's flipped (oh lets say...) 3 times ... Because it is still 50/50 due to each flip being a separate event with 2 possible outcomes.

So I have a bit of trouble getting my head around the idea that the MTBF of 3 devices, is lower than the MTBF of 1 device. When they all individually have the same odds (statistically 0 until age X) of failure at any one given point in time.
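The coin analogy can be pushed one step further. Each flip stays 50/50, but the question an array cares about is "did at least one of them come up heads", and that number does move with the count. A small sketch, assuming independent events and with `p_at_least_one` as an illustrative name:

```python
def p_at_least_one(p_single: float, n: int) -> float:
    """Chance that at least one of n independent events occurs."""
    return 1 - (1 - p_single) ** n

for flips in (1, 2, 3):
    print(f"{flips} flip(s): P(at least one heads) = {p_at_least_one(0.5, flips)}")
```

The same arithmetic applies to drives: if each has some small chance of failing in a given period, three of them have a higher chance that at least one fails in that period, even though each drive's own odds are unchanged.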

So just to summarize, I want a server at home. My needs are simple, but because I'm so unreasonably picky and overengineer things, I make a big show out of it. I basically want a really big external hard drive. In this case, the really big external drive is the server, and the bigness comes from having several hard drives in the rack somewhere. That's the clearest way to describe what I want.

Well, first a crucial piece of information is missed, and then the rest of it just falls into place naturally...

[Supervisors hate me]

Two things bugging me about that config. It's a rack system, why? It's much easier to stuff a tower in a corner somewhere. Rack systems require a...Rack. Which is going to take up a good bit of room, or it'll have to sit on top of something large/flat. Which is still not a stellar option as cooling could get tricky.

6 drive bays with 4 1TB drives and 2 empties doesn't sound like a lot of room for expansion. I thought you were after something like 13TB+??

Although you could go with a Gen III 4U PowerEdge 2900 with 10 Drive slots...That'll give you some room to grow.

Yeah, I noticed that too, but wasn't sure. I actually asked for a rack, so I want it that way. But I don't want to buy any storage stuff for the server component. For the drives, I want to buy one of those big Norco enclosures that hold 10-15 hard drives. So I may ask to remove any storage things that I don't need, but it's pretty cheap anyway, maybe I'll just keep it for now. I don't know. That's why I want to kind of figure it out here.

So if the storage is going to be in an external enclosure, what's with the 4TB Server storage? Me confused.

I still think the more mainstream (and brutally tested) straight server hardware option is the best/safest. Those external boxes make me a bit nervous about getting parts/support in a few years down the road when something fails. Name brand server parts of today will still be available to our grandchildren out of a warehouse somewhere.

SJ does make a good point about rack mounting. Resting it flat on a sturdy shelf is suboptimal since most rack enclosures are designed to have a few inches of airspace all round them. If you do go the sturdy shelf route (since equipment racks are expensive and generally unsightly in living spaces) try for one of those open wire shelving units that usually come in chrome or black. Get the chrome if at all possible since it absorbs less heat than a dark finish will.

Note too that most rackmount servers are NOISY because they have multiple high-velocity variable speed fans. The fan speeds are likely something like high and turbocharged. But they're designed for server room installations where noise levels usually aren't a consideration. I'd plan on keeping your rackmount beastie in a spare room - or down in a cool dry basement - unless you like the sound of fan noise.

The rest of your configuration is an absolute bear for a personal server! The phrase 'massive overkill' does not begin to do it justice. I have business clients that aren't packing half of what your rig has. And they're running serious business functions on them.

I don't think you'll really be needing that remote access card unless you plan on doing a lot of out-of-band system management. That's more for remote service management types (like me and SJ) who might need to diagnose and reboot servers without going to a client's site. Read a bit more about it here. So unless it's required for your support contract, I'd forgo it if it will save you some decent money. It may not affect your price much since I'd guess it was part of the unit when it came in for refurbishment. In which case I'd just leave it in. (You might also want to play with it. Out-of-band management isn't a bad thing to have some experience with.) But it's normally an expensive accessory to buy - so it might be worth thinking about how much you'll really use it.

As far as storage capacity goes, SJ again makes a good point. But with what's happening (OMG! 3.0 and 3.5 TB drives now coming) in the marketplace it's kinda moot. Get what you need for now. You can always backfill and regroup if you actually do end up needing that much. I'd go with a separate basic OS storage server if I ever needed that much. By the time that came around we'd finally be using btrfs or a similar "super" file system.

I basically want a really big external hard drive. In this case, the really big external drive is the server, and the bigness comes from having several hard drives in the rack somewhere. That's the clearest way to describe what I want.

6 bays with 2TB each (too many 1TB drives are a semi-waste of slot space these days) would in theory get you 12TB of storage; however, since we're scared of losing all the eggs in one basket in one go, you'll probably have to consider RAID 10 (or at minimum RAID 5 if you use all enterprise-grade drives). That'd leave you a usable capacity of 5.XTB-9.XTB.

You end up with slightly more than 9TB of practically non-expansible storage at best.
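The capacity figures can be reproduced with a few lines of arithmetic. This is a sketch assuming six equal 2TB drives and no filesystem overhead; the guess that the 5.X to 9.X range reflects binary units (TiB, as most operating systems report) is an assumption, not something stated in the post.

```python
TB = 10 ** 12   # decimal terabyte, as printed on the drive label
TIB = 2 ** 40   # binary tebibyte, as most operating systems report

def usable_tb(level: str, n_disks: int, size_tb: float) -> float:
    """Usable capacity in decimal TB for a few common RAID levels."""
    if level == "raid10":
        if n_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        return (n_disks // 2) * size_tb   # half the disks hold mirror copies
    if level == "raid5":
        return (n_disks - 1) * size_tb    # one disk's worth goes to parity
    raise ValueError(f"unsupported level: {level}")

def tb_to_tib(tb: float) -> float:
    """Convert decimal TB to binary TiB."""
    return tb * TB / TIB

for level in ("raid10", "raid5"):
    cap = usable_tb(level, n_disks=6, size_tb=2)
    print(f"{level}: {cap} TB usable, about {tb_to_tib(cap):.2f} TiB")
```

With these assumptions RAID 10 lands at 6 TB and RAID 5 at 10 TB before unit conversion.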

A plain but reliable home server coupled with one or more raid-enabled NAS boxes/appliances is probably a more cost-effective solution in this case. Much more flexible, expansible, less noisy, lower maintenance and cheaper!

P.S. If by any remote chance your projected storage growth increments by a few TB month over month, you might want to look into periodically rotating out your data onto spindles of Blu-ray dual layer discs, and optionally getting a software/hardware disc cataloger. That's my current workaround anyway as I tried and was utterly unsuccessful in finding any affordable permanent solution going forward.

I can let go of the rack, since everyone is saying it's way too much overkill even for me. I agree. So let's say I build a normal tower instead, I still need some box that will be able to handle 5-10 drives. What is that box? How will it connect to the tower? Can I have everything in one enclosure somehow? Is there some kind of enclosure where I can stack a tower and an additional disk bank in?

I'd like to hear more about what options exist as far as the disk drive banks. I've settled on the Norco because it has the best bang for the buck by far, from what I've seen.

Yes, do that (Synology). Let go of your over-engineering and save yourself $1000s. Seriously. Please. Dear lord. If you have money burning a hole in your wallet, I can help you spend it more usefully.

There's overkill, and then there's just "will needlessly consume space and power, and generate lots of heat, with absolutely no benefit and increased cost and complexity to boot". You seem hell bent on making this impact your life in a big way and I'm not sure why. I feel like you can meet your *actual needs* with much lower cost and much less hassle. If you'd skipped the whole server idea you could have bought something by now.

Let go of the idea that "more = better". It doesn't. It really doesn't. Also, you will never get a perfect solution, ever. Trying to do so only makes it take longer until you have something, and in the meantime your data is not as protected/secure/available as it could/should be.

The NAS above is still a bit of an overkill in many ways lol. One or more $300 4-bay NAS (no frills but with gigabit NIC and raid controller) will do. Plenty of these on eBay up for grabs.

Storage/Network/Server is really at the bottom of any quality home theater architecture... important infrastructure, but it does not affect the audio/visual experience. Money is better saved from these areas and later spent on things that do directly affect your eyes and ears, such as a good HDTV and HTS. You'd be surprised to find out what $5,000 can buy you in an immersive, 100% digital HD HT setup, really not that much, maybe just a TV if you get a bargain.

IO Mega StorCenter 150, power blip wiped the config. Something was corrupt because it couldn't be reconfigured in a fashion that allowed anyone access to the files.

Tech Support says... It can't be fixed without a firmware update, but the firmware update it needs frequently wipes all the data on the box. Can't pull the data off the drive externally, because it's using some type of *nix based software RAID ... Good Times!!!