Backblaze runs most of its storage on consumer-grade drives, but also has a selection of enterprise-class systems from Dell and EMC.

"We have also been running one Backblaze Storage Pod full of enterprise drives storing users’ backed-up files as an experiment to see how they do. So far, their failure rate has been statistically consistent with drives in the commercial storage systems," said engineer Brian Beach in a blog post.

Over four years, Backblaze has tracked 14,719 drive-years of consumer-grade hardware (the number of drives multiplied by the time each has been in service) and recorded 613 failures, an annual failure rate of 4.2%.
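The arithmetic behind that figure can be sketched as follows. This is only an illustration of the drive-years calculation using the two numbers quoted above; the function names are hypothetical and are not taken from Backblaze.

```python
def drive_years(days_in_service):
    """Sum each drive's time in service (in days) and convert to drive-years."""
    return sum(days_in_service) / 365.0


def annual_failure_rate(failures, total_drive_years):
    """Failures per drive-year, returned as a fraction (0.05 means 5%)."""
    if total_drive_years <= 0:
        raise ValueError("total_drive_years must be positive")
    return failures / total_drive_years


# Figures quoted in the article for the consumer-grade drives.
rate = annual_failure_rate(613, 14_719)
print(f"{rate:.1%}")  # 4.2%
```

Counting drive-years rather than drives means a drive that ran for three months contributes a quarter as much exposure as one that ran for a full year.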

However, Beach admits that the enterprise data covers only the first two years of operation, since the firm has only used enterprise drives for that period.

"It turns out that the consumer drive failure rate does go up after three years, but all three of the first three years are pretty good," he notes. "We have no data on enterprise drives older than two years, so we don’t know if they will also have an increase in failure rate. It could be that the vaunted reliability of enterprise drives kicks in after two years, but because we haven’t seen any of that reliability in the first two years, I’m sceptical."

He also pointed out that the company's usage of the drives is different, with enterprise drives being used more heavily than their consumer counterparts. However, he noted that the enterprise drives are "coddled in well-ventilated, low-vibration enclosures".

Worth the cost?

Are enterprise drives worth the extra cost, then? "From purely a reliability perspective, the data we have says the answer is clear: No."

However, Beach noted one reason why a company might choose enterprise-class drives.

"Enterprise drives do have one advantage: longer warranties. That’s a benefit only if the higher price you pay for the longer warranty is less than what you expect to spend on replacing the drive."

"This leads to an obvious conclusion: If you’re okay with buying the replacements yourself after the warranty is up, then buy the cheaper consumer drives."
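Beach's warranty test can be written out as a rough break-even check. The sketch below is a simplification under stated assumptions: the prices and the two extra warranty years are made-up illustrative values, and the failure rate is held constant at 4.2% even though the article notes that it rises after three years.

```python
def expected_replacement_cost(annual_failure_rate, extra_warranty_years, replacement_price):
    """Expected spend on replacements during the years only the longer warranty covers.

    Assumes a constant failure rate, which is a simplification.
    """
    return annual_failure_rate * extra_warranty_years * replacement_price


def longer_warranty_pays_off(price_premium, annual_failure_rate,
                             extra_warranty_years, replacement_price):
    """True if the premium is less than the expected cost of self-funded replacements."""
    expected = expected_replacement_cost(
        annual_failure_rate, extra_warranty_years, replacement_price
    )
    return price_premium < expected


# Hypothetical numbers: a 100-unit consumer drive, a 100-unit premium for the
# enterprise model, and two additional years of warranty cover.
cost = expected_replacement_cost(0.042, 2, 100.0)
print(f"expected replacement spend: {cost:.2f}")
print("premium worth it:", longer_warranty_pays_off(100.0, 0.042, 2, 100.0))
```

With these assumed inputs the expected replacement spend is far below the premium, which matches the conclusion quoted above.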

Comments

There's more to "reliability" than "has my drive broken". Look at non-recoverable read errors per bits read, where there is typically a 100-fold difference between consumer and enterprise-class drives. Sure, your consumer drive is spinning nicely and handing back data. But is it the same data you stored? Much less likely with consumer drives. The NRER for a consumer drive is typically 1E-14, which sounds really low until you start working out how many bits are actually read in a full rebuild of a 3-disk RAID 5 array of 4TB disks.
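The rebuild arithmetic that comment alludes to can be sketched like this. The inputs are assumptions: 4 TB per disk, the 1E-14 rate quoted for consumer drives, and 1E-16 for enterprise drives (inferred from the "100-fold" remark). The model also treats bit errors as independent and takes the quoted rates at face value, so the result is a rough indication rather than a measurement.

```python
import math


def raid5_rebuild_bits(disks, disk_bytes):
    """Bits read during a RAID 5 rebuild: every surviving disk is read in full."""
    return (disks - 1) * disk_bytes * 8


def prob_at_least_one_error(bits_read, error_rate_per_bit):
    """1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p."""
    return -math.expm1(bits_read * math.log1p(-error_rate_per_bit))


bits = raid5_rebuild_bits(disks=3, disk_bytes=4 * 10**12)  # assumed 4 TB disks
consumer = prob_at_least_one_error(bits, 1e-14)
enterprise = prob_at_least_one_error(bits, 1e-16)  # assumed 100x better

print(f"bits read during rebuild: {bits:.1e}")
print(f"consumer:   {consumer:.0%}")    # roughly 47%
print(f"enterprise: {enterprise:.1%}")  # roughly 0.6%
```

Under these assumptions, a rebuild on consumer drives has close to even odds of hitting at least one unreadable sector.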

My sample was much smaller (about 30 disks over a 3-year period), but when I had consumer-level Seagate drives in some heavily utilized Netgear ReadyNAS units they were failing like crazy (the hard drives were on Netgear's compatibility list). Once I replaced them with enterprise-level WD drives, the failures stopped.

"we replace the drives when the read/write errors become too high" As I read that, some of your drives exhibit read and/or write errors, and you *continue to use them* until the error rate exceeds some threshold. Does that mean you have drives in production that you know are not providing complete data integrity? It's one thing to have a failing drive and not know it - it's quite another to have a known failure and keep using it, especially for an online storage firm. I really hope I've misunderstood what you wrote.

... is there actually any physical difference between consumer and enterprise? Maybe it is just a commercial distinction regarding the warranty and the level of service surrounding it. The longer warranty might be valuable for non-random failures, i.e. a whole batch starting to fail earlier than expected due to a manufacturing variation or design fault.

@ElectronShepherd: If the filesystem can guarantee data integrity, individual disk errors can be logged and corrected. If it can't, choose a better one! It is better to know when errors occur than risk silent data corruption.

Hi ElectronShepherd, we actually checksum all files written/read from the disks, and we replace the drives when the read/write errors become too high. Taking that into account, the enterprise drives still fail more than the consumer ones.
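The general idea of checksumming every read and retiring a drive once errors pass a threshold might look like the sketch below. It is not Backblaze's implementation: the comment does not say which checksum or threshold is used, so SHA-256, the simple error count, and the class name `DriveErrorCounter` are all assumptions made for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Checksum stored alongside the file when it is written."""
    return hashlib.sha256(data).hexdigest()


class DriveErrorCounter:
    """Hypothetical per-drive tally of checksum mismatches."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # assumed policy: replace once exceeded
        self.errors = 0

    def verify(self, data: bytes, expected_digest: str) -> bool:
        """Recompute the checksum on read; count a mismatch as a drive error."""
        ok = sha256_of(data) == expected_digest
        if not ok:
            self.errors += 1
        return ok

    def should_replace(self) -> bool:
        return self.errors > self.threshold


stored = b"user backup block"
digest = sha256_of(stored)

drive = DriveErrorCounter(threshold=1)
print(drive.verify(stored, digest))                # intact read
print(drive.verify(b"user backup blocK", digest))  # corrupted read is detected
print(drive.should_replace())
```

Because a mismatch is detected at read time, the bad copy can be repaired from redundancy elsewhere, so a drive with a few logged errors is not silently returning wrong data.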

@ElectronShepherd, never trust important data to a filesystem without end-to-end checksums! My observations are similar to Backblaze's. We have a mix of consumer and enterprise drives in production, with similar failure rates (in our case, ~2% after 5 years). One thing that is impossible to predict is how the current crop of consumer and enterprise drives will fare long term. Manufacturing processes are continuously evolving, so historical data only have limited value.
