1) If you have a 6 drive array, you really should have raid-5. It is free with linux, and if windows doesn't support it, then that is a good reason to avoid it.

2) You really need ECC memory. Since the motherboards listed don't support ECC, it rules them out, despite them being small, quiet and having lots of nice features.

WHS Drive Extender is quite a useful and unique drive management system that requires very little technical expertise. There are many excellent articles about its strengths and weaknesses, which is why we did not cover it here. In terms of protection, pulling a drive (or a drive failing) incurs no loss of data as long as the total capacity of the remaining drives is not exceeded. So for a 6 identical HDD WHS setup, keeping at least 17% of the total space free is a good policy for defending against data loss from HDD failure. But there is no substitute for backups, which is why we have a section on that topic.
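The 17% figure is simply one drive's share of a six-drive pool. A minimal sketch of that arithmetic, assuming identical drives and that whatever sat on the lost drive has to fit on the survivors (the function name is made up for illustration):

```python
def min_free_fraction(num_drives: int) -> float:
    """Fraction of total pool capacity to keep free so that the contents
    of one failed drive still fit on the remaining identical drives."""
    if num_drives < 2:
        raise ValueError("need at least two drives to survive a failure")
    # One drive out of N holds 1/N of the total capacity.
    return 1 / num_drives


for n in (2, 4, 6):
    print(f"{n} drives: keep at least {min_free_fraction(n):.1%} free")
```

With mixed drive sizes the safe figure would be the largest drive's share of the pool, not 1/N.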

I disagree about absence of RAID5 being a good reason to avoid WHS altogether. This is NOT meant to be a business server, it is a home server, used mostly by people who have no desire to become network/server experts. Besides, RAID is protection only against a HDD failure. If there is a problem in the server that affects all the drives.... Same points apply to your comments about ECC.

[quote="MikeC"]Besides, RAID is protection only against a HDD failure. If there is a problem in the server that affects all the drives.... Same points apply to your comments about ECC.[/quote]

So what is the excuse not to use ECC? It is a home fileserver? Do memory errors avoid the home? It isn't any more complex to buy a motherboard with ECC memory support and some ECC memory than a plain motherboard. Or is it that there are few if any micro atx ECC motherboards?

Do you know any company that sells business class servers that don't come with ECC? If business needs it for reliability, doesn't a home user need it also? The price difference between ECC and non-ECC is trivial.

It's not clear why you're coming on so aggressively. There is no "excuse" not to use ECC. It simply was not a priority.

The scores of folks who use an old P3 for a home file server certainly aren't using ECC memory, neither are ANY commercial WHS boxes from HP, Asus, etc, nor any commercial NAS boxes, even the multi-drive bay models that cost >$1000. Are they all wrong? Have they all suffered data losses due to memory errors?

micro-ATX? Did you even read the article? We eliminated micro-ATX boards early on for a bunch of reasons, our SFF build is mini-ITX.

The tiny number of mini-ITX boards that support ECC are rare industrial models with poor feature sets that carry hefty price tags.

I doubt anyone looking to build a SFF home server is going to seek out a pricey industrial ECC-supported mini-itx board just to avoid memory errors that according to IBM, who you cite in your article, might occur once a week with a gig of RAM.

Maybe you're just here to shill your article at THG or Chipkill Memory for IBM?

[quote="MikeC"]The scores of folks who use an old P3 for a home file server certainly aren't using ECC memory, neither are ANY commercial WHS boxes from HP, Asus, etc, nor any commercial NAS boxes, even the multi-drive bay models that cost >$1000. Are they all wrong? Have they all suffered data losses due to memory errors?

Maybe you're just here to shill your article at THG or Chipkill Memory for IBM?[/quote]

My 2 old P3-based NASes use the asus cur-dls motherboard, which has ECC. How do you know if you have data loss? I write a md5sum for each file I have, and I check periodically. This was after a hard drive started corrupting my data, but passed all the mfg tests. I was already paid for my THG article. I don't make more money if it is viewed more, though they might. I have never worked for IBM or any company making Chipkill products.

I just don't like losing data. Why don't home users deserve raid-5 to protect their data? Why don't home users need ECC? It is true that my newer motherboards don't log ECC errors, but my old intel pr-440fx did, and I got one every few months. That was with 320mb of ram. Most 'server' systems have 10 times that much memory. I posted 3 references showing why ECC is important.

I do think all the small embedded NAS machines without ECC are a joke. They are overpriced and run windows software. Building your own is cheaper, faster, and more reliable. Looking at the toms small business reviews, most NASes won't do anything near gigabit speeds for file reads or writes. My cheap NAS with 8 sata drives can saturate my gigabit on reads or writes. Mine is cheaper, more reliable, and faster. Therefore the commercial ones are a joke.

@Mike: I'm not crazy about some of the stuff lopgok pushes (such as RAID5 or burning >160W on idle) but the cheap and efficient Biostar mobo that was discussed heaps on SPCR is supposed to support chipkill for instance. This ain't some kind of exclusive technology. But it's a uATX board. So far as I know, you'll indeed be hard-pressed to find ECC support on something smaller. I tried and the new HP server is the only reasonably-priced option I'm aware of.

I don't think ECC is a must-have but I wouldn't spend over 1K on a home server either. The more you're spending and the more data you have, the more attractive ECC becomes I think. If one were not using Windows and not using much RAM (like a cheap NAS), I guess it would be less of an issue. If you were willing to forego mini-ITX, I understand you could get ECC (along with a more reputable mobo) for a Clarkdale like the one you selected while keeping the power consumption on the low side.

Drives: Is there any substance to the rumors of WD green drives dropping off arrays? Maybe Windows knows how to deal with that. Has anyone used these drives with md RAID for any length of time?

Backups: Especially if you're going to give ECC a pass on a Windows-based 4+TB array holding large static files, it might be prudent to keep hashes of these files. Your files can get corrupted and, unless you're backing up to another server, it would be risky and/or time-consuming to check every file against your backups. It doesn't have to be complicated: if you've got torrents of your files, your Bittorrent app should be able to verify all your files anytime. Best excuse ever for installing a file-sharing app on your server!

A single eSATA enclosure is not a very good backup but it's not like there are good affordable options when you have so much data. Recycling an old tower into a backup server might be some people's best option (noise, size and power consumption aren't such a big deal if the thing is powered off and stowed away when it's not needed). And, depending on the nature of one's data, single drives (plus a dock) might be worth considering as well.

EDIT: I hadn't considered all potential issues with the poor man's backup schemes under consideration. If all you have is a mirror (presumably that's how you would use the options discussed by SPCR), moving a corrupted file to another directory (or altering its timestamp) and then syncing would corrupt your only backup. Unless you can make sure your backup's files are never going to be overwritten, better check those hashes!
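Keeping and checking hashes needs nothing more than a small script. Below is a rough sketch, not anyone's actual setup from this thread: it uses SHA-256 from Python's standard library rather than md5sum, and the function names are invented for the example.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files don't have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def find_changes(root, manifest):
    """Return the files that are missing or whose contents no longer match."""
    changed = []
    for rel_path, expected in manifest.items():
        full = os.path.join(root, rel_path)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            changed.append(rel_path)
    return sorted(changed)
```

The idea would be to build the manifest once while the files are known to be good, store it somewhere other than the array itself, and run `find_changes` before every sync so a corrupted file never overwrites the good copy in the backup.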

I am just regular user with photos, and video... As the routine goes, ecc is nothing to notice at first... but then the same vids in experimenting in compression get WAY smaller for the same compression routine. It is no mystery to me.. there is a pile of crap with every file that babbles nonsense.. you can read a data size, but it is NOT the whole file. This means time reveals it eventually.

ECC is the way to go. If you do something serious, ECC is just a rule of thumb. Even the older cpus can go years further with that one helper... and I am glad that was mentioned about older stuff being a server. One can find, among other hoaxes, power is related to the functions demanded. A switch is a switch and time is time. 13.4 mhz crystals are all the freakin same. You don't save anything in other words. Small micron or not... sub 2 volts don't add to anything. I have kept a 3.4 prescott wide open for days (supposedly 130w)... all my pc since the 90s, the same result for 24/7 runtime. This power stuff is a problem realized by some big corporate network. As a home user, it is nothing... Corporate is also where the concept of "cores" came from BTW. It means nothing to regular people. Feel free to go back a few years and spark up a once decent setup.

I find it hard to take anything that's said seriously from the anti-MS anti-Windows Linux zealots.

ECC memory may not add much expense in the cost of the memory, but it adds significant cost in other components. I don't know about AMD systems, but with Intel you have to use Xeon CPUs and the server chipsets to get ECC support with any of their current or recent products.

[quote]I am just regular user with photos, and video... As the routine goes, ecc is nothing to notice at first... but then the same vids in experimenting in compression get WAY smaller for the same compression routine. It is no mystery to me.. there is a pile of crap with every file that babbles nonsense.. you can read a data size, but it is NOT the whole file. This means time reveals it eventually.[/quote]

It's a mystery to me, because that makes no sense whatsoever. ECC does not make things compress better! Data is data, ECC doesn't change that.

It is true that ECC ram isn't very expensive. I own 2 asus cur-dls mb's (dual pIII 933 with ecc), an intel sti-2 (dual pIII 1000 with ecc), an asus pc-dl (dual xeon 2.4 with ecc), asus ncch-dl (dual xeon 2.6 with ecc), and an intel pr-440fx (dual ppro 200 with ecc). Lately, for reasons that escape me, intel has decided to limit ecc to their expensive xeon chips. As a result, I have voted with my wallet. My last 2 systems have used the asus m3a78t (phenom II with ecc) and asus m3nws (phenom II with ecc). As long as amd and amd chipsets support ecc at reasonable prices, they are the only game in town for me.

There are many windows based systems that choke up every so often. I am sure sometimes it is a software issue, but I suspect it is often a memory error. With ECC, memory errors are greatly reduced. My phenom II motherboard supports chipkill (a great ibm innovation), which does 4 bit detect and 3 bit correct. I posted a reference to it in my first post in this thread.

Software raid is great. When my ncch-dl mb became unreliable, I bought a new mb, new ram, and a new cpu. I moved over my drives, reinstalled linux, and my data was all intact. A mb/cpu/ram combo costs less than a hardware raid controller (typically starting around $500). The software solution might not have an onboard battery backup, but my systems all have big UPS's. A software system is far more flexible. I have 2 supermicro 5 bay hot swap enclosures, and I have enough sata ports to use them all. I could go to raid-6. I can change my filesystem as newer, more reliable file systems become available. I have been building fileservers for a long time. My first system (which is still operational) uses 6 250gb pata drives.

As for windows vs linux, does windows support software raid-5 and raid-6? Until recently windows didn't support 2tb+ sized drive arrays. How much innovation do windows file systems have? I do use windows on some desktops, but all of my servers are running linux.

Last edited by lopgok on Mon Oct 18, 2010 2:37 am, edited 1 time in total.

About the EVDS WD drives, I don't believe this is a good choice...in fact a bad one...

I read an article - can't find the link at the moment - explaining that these drives have their error correction functionality limited to prevent video record operations from missing frames...i.e. if an error is not corrected within X time, the drive will skip error correction for that piece of data and move on...

For video recording devices and for video files only, that may be a good choice...for a file server or backup solution and a raid configuration that's a "stay the hell away" option if the above case is correct.

[quote="HFat"]For the people claiming Intel does not support ECC on non-Xeon CPUs: please do your research and make sure you're not confusing "fully buffered" with ECC in general.[/quote]

I know about buffering and ECC. Buffering reduces the load on the data lines, and is used when there are lots of memory chips to drive. If it is used on a mb, then it is required. My pIII boards use buffered ECC, my pentium xeon boards use unbuffered ecc.

So what exactly are the repercussions of not having ECC? I mean, the majority of regular computer systems don't have ECC, so why should a file server? Maybe ECC memory is cheap in the US, but in Australia ECC RAM seems to run about 30-50% more expensive, before factoring in MB/chipset compatibility, and limited processor compatibility.

The repercussions of having memory errors which are not properly handled by the hardware or the OS can be dire... as dire as you care to imagine such as having an app write all over your data, corrupting many files (and you won't know which ones). Crashes and a little corruption here and there are more likely. The real question is what are the odds of something really bad happening in any particular situation and I don't know that anybody actually knows. But the more data you have, the more chances you're taking. I assume you'd want a server to be more reliable than a cheap computer, not less!

In any case there are things more important than ECC such as good backups. Something else that people are advising is to disable write caches if you don't have a battery-powered controller and can't guarantee power to your drives (that would mean redundant power supplies and UPS I guess).

Your choice of filesystem (and OS) also matters when it comes to the consequences hardware problems can have. Google for people losing data with high-end filesystems on low-end hardware! And the kind of data you're handling will determine the actual consequences corruption might have. If your file server is a DVD rip library or something, who cares? You can easily detect corruption of DVD rips and the like if you care to and getting a corrupted rip back should be trivial.

@lopgok: It says there is no ECC support for those chipsets. So what? There are other chipsets (some of which take both registered and regular ECC RAM by the way). Be careful with your assumptions...

I like the build guide. Even if the component choices don't necessarily meet one's requirements (e.g. ECC), it's still an informative article. If you're building a car, you can still learn from someone building a tractor. :)

Some random thoughts...

I'm in the "you should use ECC" camp. My understanding is that memory bit errors become more likely as the amount of memory grows. A few years ago, most systems didn't have >1 GB memory, but now 2 or even 4 GB is the norm. I think it's like what is being said about modern hard drive data integrity. The manufacturers cite expected bit errors, I forget the exact numbers, but say 1 in X bytes. The point is, X hasn't increased proportionally with capacities, so the likelihood of bit errors has gone up. There's a few articles out there suggesting that RAID-5 is no longer sufficient given >1 TB hard drives. The gist is that the likelihood of a bit error is high enough that it could strike while you're rebuilding your array, for example, after replacing a failed drive. And then you're toast! (But as we all know, RAID is for high availability, not a substitute for backups, right? ;) )
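To put rough numbers on the rebuild worry, here is a back-of-the-envelope sketch. It assumes an unrecoverable read error rate of 1 per 10^14 bits (a figure often quoted on consumer drive spec sheets, not one taken from this thread) and treats errors as independent, which is a simplification; the function name is invented.

```python
import math


def p_read_error(bytes_read: float, bit_error_rate: float = 1e-14) -> float:
    """Chance of hitting at least one unrecoverable read error while
    reading `bytes_read` bytes, assuming independent bit errors."""
    bits = bytes_read * 8
    # 1 - (1 - rate)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-bit_error_rate))


TB = 10 ** 12
drives_in_array = 6
for drive_tb in (0.5, 1, 2):
    # A RAID-5 rebuild has to read every surviving drive in full.
    to_read = (drives_in_array - 1) * drive_tb * TB
    print(f"{drive_tb} TB drives: {p_read_error(to_read):.0%} chance of a read error")
```

Under those assumptions, rebuilding a six-drive array of 2 TB disks comes out at slightly better than a coin flip of hitting an error. Real drives often beat their spec sheet, so this is best read as a pessimistic bound on how the risk grows with capacity.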

For those of us going the ECC route, the cheapest way I know of is to get a Biostar A760G-M2+ motherboard and an Athlon (not Sempron) or Phenom CPU. A few other AMD motherboards have been mentioned as well. The biggest drawback seems to be that the ECC support is "unofficial". That is, if you ask Biostar, they say they don't support ECC, but there's a whole page of ECC options in the BIOS.

If you want official ECC support, you either pay for a true server grade AMD board (which generally costs substantially more than a consumer board), or go the Intel Xeon route, which we all know is expensive. I use the Supermicro X8SIL-F motherboard and a Xeon 3440.

I'm also a little paranoid about bit rot on long-term storage. Taking hashes of your data is good; it allows you to see if something got corrupted or not. I thought I read that the ZFS filesystem has built-in checksum capabilities; i.e. the filesystem itself can detect media bit errors. But ZFS requires you to use Solaris, or a BSD (such as FreeBSD). (Technically, you can use it on Linux, but you have to jump through some hoops.)

For stuff I consider really important, I create "par2" files. These are basically parity files that add redundant information, so if some part of your data is lost, it can be reconstructed. (Anyone ever download binary files from Usenet? Par2 was popular in that arena, because if your news service dropped posts, it gave you a better chance of actually being able to get the files.) I find par2 to be a nice compromise between no backup and full duplicate copies.
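The idea of parity data can be shown with a toy example. This is not how par2 itself works (par2 uses Reed-Solomon codes and can repair several lost blocks); it is the simpler single-parity XOR scheme that RAID-5 is built on, with made-up block contents, and it can rebuild exactly one missing block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three equal-sized data blocks plus one parity block.
data = [b"photo-01", b"photo-02", b"video-03"]
parity = xor_blocks(data)

# Pretend block 1 was lost: XOR of the survivors and the parity restores it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The cost is one extra block of storage per set, which is why this sits between having no backup at all and keeping a full duplicate.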

With regards to backups... my strategy is basically just to build a second NAS. But for this one, I go much cheaper (mostly old recycled parts), as it stays powered off except for when I'm actually using it (so noise and power efficiency aren't concerns in this case). I have it on a rack with my main server, and I find wake-on-lan (WOL) to be the most important feature of a backup server's motherboard. I can remotely turn on the backup server, do the backing up, and power it down.
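Waking the backup box remotely can be scripted. A wake-on-LAN "magic packet" is six 0xFF bytes followed by the target's MAC address repeated sixteen times, usually sent as a UDP broadcast. The sketch below assumes port 9 and the limited broadcast address, and the MAC in the usage comment is a placeholder, not a real machine.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Usage (placeholder MAC): wake("00:11:22:33:44:55")
```

WOL generally has to be enabled in the backup machine's BIOS and on its network adapter before the packet has any effect.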

Asus is one board maker which makes cheap boards which officially support ECC. On the AMD side there's a wealth of options and they make at least one affordable Intel server board which is supposed to take affordable Clarkdales and ECC. But the power consumption and reliability of Asus boards might not be stellar...

I've not read any study which would lead me to believe that RAM errors have been scaling with the capacity of RAM chips of late. But I'm willing to be enlightened...

But it must be paired with the 3450 server chipset, which is not widely used outside of oem workstations. But you can get a base Dell T110 server for $399 with a Xeon X3430 and 2GB of ECC DDR3. Our recent Dell server purchases at work have NOT been quiet, however.

Thanks for the info, Jay, esp. the DRAM Study, it's very educational. Here's a key stat: "About a third of machines and over 8% of DIMMs in our fleet saw at least one correctable error per year." And: "In many production environments, including ours, a single uncorrectable error is considered serious enough to replace the dual in-line memory module (DIMM) that caused it."

The real issue is whether the approach of the second quote makes sense for the typical home server user. It's safe to say everyone has experienced random, unexplainable crashes with their PCs which could be attributed to memory errors. Sometimes we have lost valuable data as a result. Would such errors be catastrophic in a home server? Is it worth moving to ECC RAM for this -- even tho this does not cover uncorrectable errors? IMO, the answers are generally no, and the absence of coverage in our article can hardly be called a critical or uncorrectable error. However, we aim to please...

I've searched for mini-itx boards w/ECC memory support. There aren't many, hardly any Intel based ones, mostly AMD. It's not a searchable criterion even at such mini-itx-focused resellers as Logic Supply. I've asked my contact there for a list of current ECC-supported boards -- and will add this info in an addendum to the home server article, for all those who believe ECC to be essential.

[quote="MikeC"]Is it worth moving to ECC RAM for this -- even tho this does not cover uncorrectable errors? [/quote]

Actually, ECC will correct single bit uncorrectable errors. Chipkill will correct up to 4 bit uncorrectable errors. Of course, if you are getting uncorrectable errors it is time to R&R the memory. All memory I buy has lifetime warranties. Virtually all of the name brand memory companies offer lifetime warranties.
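For anyone wondering how a few extra bits can repair a flipped one, here is a toy Hamming(7,4) code: 4 data bits protected by 3 check bits. It only illustrates the principle. ECC DIMMs use a wider code of the same family (64 data bits plus 8 check bits per 72-bit word), and chipkill is a different, stronger scheme that this sketch does not attempt to model.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three check results spell out the 1-based position of a flipped bit.
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a single-bit memory error at position 5
print(hamming74_decode(word))
```

Two flipped bits in the same word would fool this small code into "correcting" the wrong bit, which is why real ECC adds one more overall parity bit so that double-bit errors are at least detected.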

As for raid-5, I use it to deal with a single failing hard drive. I am sure everyone has experienced that, or knows someone who has had a failed hard drive.

Quoting ilovejedd: "I personally use unRAID for my media server"

I do as well. After a lengthy search covering a huge variety of OSes, I concluded that it was the best available product for my needs (mainly media storage).

@ MikeC: As I understand it, single bit errors (the correctable kind) are much more common than uncorrectable errors. See Table 1 in the study.

I think questions relating to "whether X makes sense" or "is Y worth it" have to be answered based on personal or case-by-case applications.

In my case, at the time I was shopping for unRAID server hardware - back in the days of cheap ram (2x1 GB ECC DDR2 for $29!) - ECC memory was only a few dollars more expensive than non-ECC memory. So the question of "is it worth it" was easily a YES.

But I did not design my server around the need for ECC. I am happy that I could accommodate the want, though. I was lucky to read Matt_Garman's Biostar A760G-M2+ motherboard thread while assembling my server shopping list. The combination of 6 onboard SATA, a PCIe slot that takes non-graphics cards, (unofficial) ECC support, and low cost satisfied several needs/wants. M-ATX was OK since I was planning for 12 HDD's.

Despite the minimal difference in cost, I think ECC's relative obscurity IS because of cost. ECC dimms have 9 chips instead of 8, and even as inexpensive as they are, those costs are non-trivial as you scale up quantity. For an OEM buying - what, millions of dimms per year? - the small difference in unit cost adds up in total. I believe this is why you don't see ECC in most consumer devices, including NAS appliances. But it's standard in even low-end enterprise servers and switches.

Quoting MikeC: "and the absence of [ECC] coverage in our article can hardly be called a critical or uncorrectable error."

LOL - only if you make the same error twice! I wonder if ECC is "more silent" than a regular DIMM? ;)

What you don't say, Mike, is that the average DIMM in that study had nearly 4000 correctable errors a year! It seems that most memory subsystems are solid but the flaky ones can be quite the wreckers. And ECC can apparently handle most of them (the per-DIMM incidence of correctable and uncorrectable has a 6-1 to 24-1 ratio even for the non-chipkill systems). It might be possible to weed out the worst of the defective hardware with some patience and dedication and therefore to mitigate the need for ECC but I'd rather not go through such testing sprees or play at the memory roulette if I can avoid it.

That said, it would of course be foolish to say it was an error on your part not to advise the use of ECC. Clearly, memory issues are not the biggest threat hanging over the average 6-HDD home server. My main beef with your article really is that you're using your skills to write for this unseemly market I don't get instead of designing quiet work servers for a big name. But I guess it wouldn't be half as fun.

As to the RAID thing: yeah, we've all lost drives. Sure, we know people who lost drives... and we also know people who have lost their RAID5 array. I've never used RAID5 so I only lost single drives, never an array.

What you don't say, Mike, is that the average DIMM in that study had nearly 4000 correctable errors a year! It seems that most memory subsystems are solid but the flaky ones can be quite the wreckers.

Yet, one of the mini-itx vendors I contacted on this issue just emailed me...

Quote:

AMD or Xeon is typically the way to go, but it’s really enterprise level servers that use ECC – modern RAM is good enough that it’s really hard to justify the additional cost for most people. Spec’ing ECC RAM on your MB really limits your customer base, so not too many companies do it, especially in small form factors.

Modern RAM won't be good enough for me until someone figures out a way to take duds off the market. It's certainly not good enough for the consumers who are returning it. And I've wasted enough time arguing about what makes a DIMM defective to know that return rates don't tell the whole story. ECC is useful to the extent that it can mitigate some of the damage.

You might say that speccing ECC limits your consumer base *precisely because* not too many companies are doing it. AMD and Asus have long demonstrated that the "added cost of ECC" is just a pretense to justify certain parties' market segmentation. And, on the SFF front, HP's new AMD-based server is cheap and its ECC support ain't going to limit its consumer base. But I figure it took a heavyweight to pull it off. How would common mortals source such parts at a reasonable price?

It's very easy to justify the artificial added cost of ECC. Three magic words: risk, uninsured, mitigation. They won't work on the average home user obviously but the average home user wouldn't know what to do with 2TB either, never mind 12TB. Like I said, I simply don't get this market.

So far as I know, the CPU you selected for this build supports ECC (contrary to what your supplier believes) and consumes less power at idle than Xeons and Athlons (but I'm out of my depth here so someone please correct me). A Xeon might actually be a downgrade. The only reason your build can't take advantage of ECC is that Intel has the best CPUs and you have to play by their rules if you want them. And Intel has deemed that low-power SFF be crippled.

So, what makes ECC so important that you need it in a home server, but not your desktop computer?

BTW, my original question still stands: what's the secret sauce that makes AV-GP drives special? What other differences, apart from disabled head parking and error correction, are there from regular old Green Power drives?

Mike, the mini-ITX vendor said, "Spec’ing ECC RAM on your MB really limits your customer base, so not too many companies do it, especially in small form factors." How does adding additional features limit your customer base? You can offer ECC support, but the end-user (or even OEM) can still use non-ECC memory.

What follows is pure speculation on my part: I think what the vendor is really saying is: "the additional cost to manufacture and validate ECC support isn't justified by the marginal increase in sales." I thought I read once that the motherboard market is pretty competitive, with slim margins. So adding features needs to be fully justified by a proportional increase in sales. (Again, that's all a guess on my part.)

Another speculation: maybe it's too hard to market ECC boards to the non-server/enterprise market. How do you advertise the ECC feature without scaring away people who don't want/need it, or don't know what it is?

Anyway, I find it interesting that ECC has sparked such a lively discussion. I've seen similar enthusiasm for ECC on other forums as well. If that's any indication, I think there is a market for a mini-itx board targeted at the DIY NAS group. Take, for example, the Biostar A760G-M2+, shrink it down to mini-itx, add dual intel NICs, add an additional on-board SAS or SATA controller (e.g. LSI SAS1068E), and keep the PCIe x16 slot for general use. And support ECC officially, of course. Oh, and an integrated IPMI/IP-KVM (like the Supermicro X8SIL-F). Not sure if all that would fit on a mini-itx board, but it would be nice!

[quote="Monkeh16"]It's a mystery to me, because that makes no sense whatsoever. ECC does not make things compress better! Data is data, ECC doesn't change that.[/quote]

I think his point was that bit errors are much more noticeable in information-dense (compressed) formats. A single bit error could ruin an entire archive if proper safeguards are not in place. The same error might result in a minor video glitch in your high-bitrate video file. And in a bitmap it could be a single pixel error you might never see.

Not sure what sort of archiving he was doing, but it sounds like he was noticing errors because of high compression with limited parity.

Disclaimer: I am not familiar with the specifics of modern compression formats. I have no idea how much parity (or what type) they have.

[quote="KayDat"]So, what makes ECC so important that you need it in a home server, but not your desktop computer?[/quote]

+1. This is a question I'd like to ask, too. My home server only stores DVD, HD-DVD and Blu-ray rips. It doesn't really do anything mission critical that would require ECC RAM. Only reason I can think of for opting for ECC memory is most people tend to leave servers on 24/7.
