How big is the cloud?

Last month, ExtremeTech revealed to you the true scale of internet porn. At any one time, streaming adult videos probably utilize around 30% of the internet’s total bandwidth, which equates to around 6 terabytes of porn being consumed every second. But what about the other 70%? Netflix, YouTube, and other non-adult video sites are huge bandwidth hogs, possibly accounting for as much as 40% of internet traffic. Digital file lockers, such as Rapidshare and Megaupload, account for around 10% of traffic worldwide. Web surfing and email (and spam!) are another 15%. And then there’s cloud computing.

Today, the vast majority of web services and sites are hosted in the cloud. By this I mean that, instead of companies (such as Ziff Davis/ExtremeTech) managing their own hardware, third-party cloud storage and computing services are used. Amazon Web Services (AWS), Microsoft Azure, and Google are three prominent examples of huge cloud clusters, but there are hundreds of smaller operations that range in size from a whole data center down to a few racks.

The power of the cloud lies in the fact that it can be put to work on tasks as disparate as cloud-based supercomputing, webmail, and simple document storage. On a single cloud cluster, Google can host and serve petabytes of YouTube videos and store all of your email and documents. Of all the facets of the cloud, though, today we’re going to focus on cloud storage.

A Microsoft data center

While storage might not be as sexy as terabytes of RAM and thousands of CPU cores, it is the most reliable way of measuring the size of the cloud, especially when we factor in bandwidth usage. From the total amount of storage we can also work out the cost of cloud storage — and from there, we can finally work out why the likes of Google, Microsoft, and Dropbox are falling over themselves to provide cloud storage services.

As with the porn story, we’ll start with some theoretical numbers, and then move on to some real-world figures (and hardware) from Backblaze, a cloud backup provider.

Petabytes

For the most part, real numbers from the big companies, such as Google, Facebook, Amazon, and Microsoft, are few and far between. If you scour the web, though, some rough ballpark figures emerge:

Facebook, in its IPO filing, said it stores over 100 petabytes (PB) of media (photos and videos). It’s not unrealistic to say that Facebook probably has a total storage capacity well beyond that, once you factor in backups and other data (status updates, likes, and so on), possibly in the 300PB range.

Microsoft recently admitted that Hotmail stores over 100 petabytes, and that SkyDrive, with “17 million customers,” stores 10PB of data. Like Facebook, Microsoft’s total capacity, once we factor in the rest of Azure and its web properties, is probably well over 300 petabytes.

Amazon, rather than giving us a nice, easy number of petabytes, instead announces the total number of objects stored by its S3 cloud storage service. As of April 2012, Amazon S3 stored 905 billion objects. If we assume an average size of 100KB, that’s around 90 petabytes; if the average size is 1MB, that’s 900 petabytes — almost an exabyte!

Dropbox, a year ago, stored “10+ petabytes” of data. It had 25 million users then, and 100 million users today, so all things being equal the company now stores around 40PB of data.
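The Amazon and Dropbox estimates above are simple multiplication, and a short Python sketch makes it easy to try other assumptions. The object count and user numbers are the ones quoted above; the average object sizes are guesses, decimal units (1PB = 10^15 bytes) are assumed, and the function names are made up for illustration.

```python
# Back-of-envelope storage estimates from the figures quoted in the article.
# Assumptions: decimal units, guessed average object sizes, and storage that
# grows linearly with user count.

KB = 10**3
MB = 10**6
PB = 10**15


def s3_storage_pb(objects: int, avg_object_bytes: int) -> float:
    """Total storage in petabytes for a given object count and average size."""
    return objects * avg_object_bytes / PB


def scale_by_users(known_pb: float, users_then: int, users_now: int) -> float:
    """Scale a known storage figure in proportion to the user count."""
    return known_pb * users_now / users_then


s3_objects = 905 * 10**9  # objects stored in S3 as of April 2012

s3_low = s3_storage_pb(s3_objects, 100 * KB)   # small-object guess
s3_high = s3_storage_pb(s3_objects, 1 * MB)    # large-object guess
dropbox_now = scale_by_users(10, 25_000_000, 100_000_000)

print(f"S3 at 100KB per object: {s3_low:.1f} PB")
print(f"S3 at 1MB per object:   {s3_high:.1f} PB")
print(f"Dropbox today:          {dropbox_now:.0f} PB")
```

The tenfold gap between the two S3 results shows how much the answer depends on the assumed average object size, which Amazon doesn't publish.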

To put these figures into perspective, an average computer probably has a 500GB or 1TB hard drive, and a petabyte is 1024TB. At the very least, then, Microsoft and Facebook data centers play host to more than 100,000 hard drives. Without building custom hardware, you can squeeze 48 drives into a 4U enclosure. After accounting for networking gear, that means you’re probably looking at around 400 hard drives per 40U rack — or 250 racks, each of which occupies around one square meter of floor space. This might sound like a lot, but when you consider that Google, Amazon, Facebook, and Microsoft regularly roll out data centers with floor plans of over 30,000 square meters (300,000+ square feet), it’s really not that much. In the grand scale of things, a lot more space is dedicated to servers (i.e. CPUs) and networking gear.
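The drive and rack arithmetic can be sketched the same way. This assumes 1TB drives, the 1024TB-per-petabyte convention used above, the rough figure of 400 drives per rack, and one square meter per rack; the helper names are hypothetical.

```python
import math

TB_PER_PB = 1024          # binary convention used in the article
DRIVES_PER_RACK = 400     # article's estimate after networking gear
RACK_FLOOR_M2 = 1.0       # article's estimate of floor space per rack


def drives_needed(storage_pb: float, drive_tb: float) -> int:
    """Number of drives needed to hold storage_pb, ignoring redundancy."""
    return math.ceil(storage_pb * TB_PER_PB / drive_tb)


def racks_needed(drives: int, drives_per_rack: int = DRIVES_PER_RACK) -> int:
    """Number of racks needed to house the given number of drives."""
    return math.ceil(drives / drives_per_rack)


# Upper bound on raw drive slots: ten 4U enclosures of 48 drives in a 40U rack.
raw_slots_per_rack = (40 // 4) * 48

drives = drives_needed(100, 1)      # 100PB on 1TB drives
racks = racks_needed(drives)
floor_m2 = racks * RACK_FLOOR_M2
share_of_datacenter = floor_m2 / 30_000   # versus a 30,000 m^2 facility

print(f"Raw slots per rack: {raw_slots_per_rack}")
print(f"Drives: {drives}, racks: {racks}")
print(f"Floor space: {floor_m2:.0f} m^2 ({share_of_datacenter:.2%} of the data center)")
```

With these inputs the result is a little over 100,000 drives and about 250 racks, under 1% of the floor space of a 30,000-square-meter facility. Real deployments would need more drives to cover redundancy and backups.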

Bandwidth

Bandwidth-wise, we have even less data from the big boys. We know that, as of last year, one million files were being saved to Dropbox every five minutes (200,000 per minute), so today, with four times as many users, that’s around 800,000 files per minute. Amazon S3, which is significantly larger than Dropbox, handles “650,000 requests per second.”

If we assume that the average file stored on Dropbox is 500KB (a mix of photos, videos, and documents), then Dropbox takes in a total of 400,000 megabytes (0.4TB) per minute, or around 6.7GB per second (roughly 54Gbps). We don’t have any data on how much Dropbox sends per minute (i.e. people downloading files from their Dropbox), but it’s probably in the region of 10 to 20Gbps.
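Here is the Dropbox upload estimate as a quick Python sketch. The 500KB average file size and the assumption that uploads scale linearly with users are guesses carried over from the text, and decimal units are assumed. Computed without intermediate rounding, the result is about 53Gbps, in line with the roughly 54Gbps quoted above.

```python
# Rough Dropbox ingest estimate. The average file size is an assumption,
# not a published figure, and uploads are assumed to scale with user count.

KB = 10**3
GB = 10**9

files_per_minute_last_year = 1_000_000 / 5   # one million files per five minutes
user_growth = 100_000_000 / 25_000_000       # 25M users then, 100M now
avg_file_bytes = 500 * KB                    # assumed average file size

files_per_minute = files_per_minute_last_year * user_growth
bytes_per_minute = files_per_minute * avg_file_bytes
gb_per_minute = bytes_per_minute / GB
gb_per_second = gb_per_minute / 60
gbps = gb_per_second * 8                     # gigabits per second

print(f"Files per minute: {files_per_minute:,.0f}")
print(f"Ingest: {gb_per_minute:.0f} GB/min, {gb_per_second:.2f} GB/s, {gbps:.1f} Gbps")
```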

Amazon S3, which is mainly used to store static files for websites (images, style sheets, videos), probably has a lower average file size than Dropbox. If we assume an average of 100KB transferred per request, then 650,000 requests per second comes to a grand total of around 61 gigabytes of data transferred per second, or 488Gbps. This is in the same ballpark as the 800Gbps figure that we estimated for a large porn site, which equates to around 2% of total internet traffic. Amazon is pretty darn big!
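The S3 figure depends slightly on whether you count in decimal or binary units. The sketch below works it out both ways, assuming every request transfers one 100KB object (a guess, since many requests will be smaller or won't return a payload at all). The 61GB per second quoted above appears to correspond to the binary calculation, rounded down.

```python
# Rough S3 throughput estimate, computed with decimal and binary units.
# Assumption: every request moves one object of the assumed average size.

requests_per_second = 650_000
avg_kb_per_request = 100   # assumed average payload per request

# Decimal units: 1 KB = 1,000 bytes, 1 GB = 10**9 bytes
decimal_bytes = requests_per_second * avg_kb_per_request * 1000
decimal_gb_per_s = decimal_bytes / 10**9
decimal_gbps = decimal_gb_per_s * 8

# Binary units: 1 KiB = 1,024 bytes, 1 GiB = 2**30 bytes
binary_bytes = requests_per_second * avg_kb_per_request * 1024
binary_gib_per_s = binary_bytes / 2**30
binary_gibps = binary_gib_per_s * 8

print(f"Decimal: {decimal_gb_per_s:.1f} GB/s, {decimal_gbps:.0f} Gbps")
print(f"Binary:  {binary_gib_per_s:.1f} GiB/s, {binary_gibps:.0f} Gibps")
```

Either way the answer lands at roughly 500Gbps, so the choice of units matters far less than the guessed average request size.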

Facebook and Microsoft, with between 100 and 300PB of storage each, probably fall somewhere between Dropbox and Amazon in terms of bandwidth usage, maybe 200Gbps apiece.

I pay for both dropbox and google drive, and I find they both have their place (for now).

Dropbox gives me some peace of mind knowing that they don’t have advertising or world domination motives behind their cloud services ;) I see dropbox as an OS neutral alternative to the corporate giants. Dropbox specializes in cloud storage and it works quite well on the devices I’ve tested it on. It doesn’t seem to favor one OS over another as far as I know. I don’t like dropbox’s pricing, but I’m guessing they will have to change their pricing in the near future in order to compete.

Some features I love about google drive are the collaborative document editing and versioning. I can roll back to previous versions if I need to. I also like their tagging structure because it helps keep me more organized than Windows or dropbox.

ChrisChristoff

Ad for GoDaddy says it manages 25 petabytes.

René W. Vergé

You probably have double counting in there since DropBox uses Amazon for storage. I would assume that this may also be the case for the other numbers. What we know though, is that there’s a lot of data out there…

Ellad Kushnir

Thanks for this interesting article! I was wondering where I might be able to find information on the average storage usage by end users. Thanks!

Use of this site is governed by our Terms of Use and Privacy Policy. Copyright 1996-2015 Ziff Davis, LLC.PCMag Digital Group All Rights Reserved. ExtremeTech is a registered trademark of Ziff Davis, LLC. Reproduction in whole or in part in any form or medium without express written permission of Ziff Davis, LLC. is prohibited.