Backblaze’s Basic Cloud Storage is 25 Times Cheaper than Amazon S3

Backblaze has launched version 2.0 of its cloud backup product, and the new version brings with it a host of changes that remove even more limits from a service that has always been billed as “unlimited.” I got the chance to talk with Backblaze co-founder Gleb Budman about how the service manages to offer cloud storage that’s some 25 times cheaper than Amazon’s S3 product. The secret, it turns out, is all in what you optimize for.

Backblaze’s basic pitch to consumers and businesses is simple: for $5 per month, you can back up an unlimited amount of data over the Internet. Not only is Backblaze cheap, but it’s also designed to be easy to use — install the client on your Mac or PC, and it starts backing everything up in the background, much like Apple’s Time Machine.

Previous versions of the backup service had a hard file-size cap of 9 GB, and the types of files you could back up were limited. Backblaze’s new 2.0 release removes both of those limitations, so you can now use the product to back up extremely large VMware images and 1080p home videos. Backblaze has also included a host of performance tweaks that will reduce the service’s load on your system and on your network connection.

After checking out Backblaze’s pitch and the details of the 2.0 release, I called the company’s co-founder, Gleb Budman, to find out how his team managed to get their storage costs down so far below Amazon S3’s rates. After all, shouldn’t S3 be cheaper, since it’s a public cloud with massive economies of scale?

The answer to how Backblaze beat Amazon in cost per petabyte comes in several parts. The first is that Backblaze’s storage is cheap because it had to be cheap, or else the company couldn’t have launched. When Budman and his co-founders decided in 2007 to bootstrap an online backup service out of their own pockets, they began by surveying potential users to find the monthly price point that would maximize adoption. They found that $5 per month was the threshold for getting the majority of those surveyed to sign up for an online backup service.

“Our initial plan was to put everything into Amazon S3,” Budman told Wired.com, “so we did the math on that to see how much data we could back up at $5/month, and it was 30 GB of data. So S3 just wasn’t going to work.” The company then began looking at enterprise-grade storage from the likes of NetApp and Hitachi, but those products command a 10X price premium per terabyte versus bare hard drives. Since the founders didn’t have venture capital backing, buying this kind of heavy-duty hardware was out of the question.
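The math Budman describes can be reproduced as a rough back-of-the-envelope sketch. It uses only the figures quoted in this article ($5 per month, 30 GB on S3, and the "25 times cheaper" claim) and assumes, purely for illustration, that the 25x ratio applies to the same all-in cost per gigabyte. The function and variable names are made up for this example.

```python
# Back-of-the-envelope sketch using only figures quoted in the article.
# Assumption: the "25 times cheaper" ratio applies to the same all-in
# monthly cost per gigabyte that Budman's S3 estimate implies.

BUDGET_PER_MONTH = 5.00   # dollars per user, the surveyed price point
S3_GB_AT_BUDGET = 30      # GB that $5/month would cover on S3, per Budman
COST_RATIO = 25           # "some 25 times cheaper", per the article


def implied_cost_per_gb(budget: float, gigabytes: float) -> float:
    """Monthly cost per GB implied by a budget and the data it covers."""
    return budget / gigabytes


def gb_at_budget(budget: float, cost_per_gb: float) -> float:
    """How many GB a monthly budget covers at a given cost per GB."""
    return budget / cost_per_gb


s3_cost = implied_cost_per_gb(BUDGET_PER_MONTH, S3_GB_AT_BUDGET)
cheaper_cost = s3_cost / COST_RATIO
break_even_gb = gb_at_budget(BUDGET_PER_MONTH, cheaper_cost)

print(f"Implied S3 cost:      ${s3_cost:.3f} per GB per month")
print(f"At 1/25th of that:    ${cheaper_cost:.4f} per GB per month")
print(f"Data covered at $5:   about {round(break_even_gb)} GB per user")
```

Under those assumptions, the same $5 covers roughly 750 GB per user instead of 30 GB, which suggests how a flat-rate "unlimited" plan could work as long as the average user stores less than that.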

Both Amazon’s S3 service and storage servers from NetApp are so expensive because they carry many layers of complexity aimed at enhancing their performance, reliability, and accessibility for a wide variety of application classes.

“S3 is designed to serve many masters, to do many different things, to provide a public API,” Budman said. “Some people are going to do backup, some are going to host a website, and some will do other things.”

But all Backblaze wanted to do was write users’ data to hard disk and archive it long-term; the company doesn’t even need to read the data very often. “We don’t need to do database transactions against our system,” Budman said. “The data is write-once, read-rarely.”

So what Budman and company set out to build was the simplest, cheapest possible long-term backup solution — a networked storage system made entirely of commodity parts, and without even a database component or a load balancer.

“What we wanted was raw hard drives, and we wanted those hard drives sitting on the Internet,” Budman explained. “In an ideal scenario, we’d plug an Ethernet cable right into a hard drive.”

The stripped-down storage system that the Backblaze team ultimately designed packs 135 TB of storage into a single 4U Backblaze Pod. Connected together via simple Gigabit Ethernet, these Pods make up Backblaze’s “write-once, read-rarely” storage infrastructure.
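To put that density in perspective, here is a small sketch of how many Pods a petabyte would take. Only the 135 TB and 4U figures come from the article. The 42U rack height and the decimal conversion of 1,000 TB per petabyte are assumptions made for illustration, and the calculation counts raw capacity only, ignoring any redundancy, networking gear, or power constraints.

```python
import math

POD_CAPACITY_TB = 135   # from the article
POD_HEIGHT_U = 4        # from the article
RACK_HEIGHT_U = 42      # assumption: a common full-height rack
TB_PER_PB = 1000        # assumption: decimal units


def pods_needed(capacity_tb: float) -> int:
    """Smallest whole number of Pods whose raw capacity covers capacity_tb."""
    return math.ceil(capacity_tb / POD_CAPACITY_TB)


def pods_per_rack() -> int:
    """How many 4U Pods fit in one rack, counting rack units only."""
    return RACK_HEIGHT_U // POD_HEIGHT_U


def rack_capacity_tb() -> int:
    """Raw capacity of one rack filled with Pods."""
    return pods_per_rack() * POD_CAPACITY_TB


print("Pods for one petabyte:", pods_needed(TB_PER_PB))
print("Pods per rack:", pods_per_rack())
print("Raw TB per rack:", rack_capacity_tb())
```

By this rough count, a petabyte of raw storage fits in eight Pods, and a single rack filled with Pods would hold more than a petabyte.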

Backblaze later open-sourced the hardware specifications for the Pod, and now the product is in use at a wide variety of sites. Shutterfly switched all of their back-end storage to these Backblaze Pods, and Vanderbilt’s medical school is using the machines to store medical images.

Ultimately, Backblaze’s low cost points to something important about cloud platforms that’s worth exploring at length in another piece. Specifically, more generalized public cloud platforms can suffer from what we might call diseconomies of scale: they have to do more in order to grow, but as their size and functionality grow, so does the cost of their overhead. Because that overhead cost is distributed across a cloud’s entire user base, each user ends up paying for complexity that he or she doesn’t use. As Backblaze has demonstrated, when the amount of complexity that a user actually needs is much lower than what that user would be paying for in the public cloud, it can be drastically cheaper to build something in-house. This goes directly against the conventional wisdom that using a public cloud for a task is cheaper than taking that task in-house, but it’s nonetheless true.