“How much server space do companies like Google, Amazon, or YouTube, or for that matter Hotmail and Facebook need to run their sites?” is the question I’ve been asked to answer on ABC Radio National Drive this evening.

This isn’t a simple question to answer as the details of data storage are kept secret by most online services.

An exabyte is the equivalent of 50,000 years' worth of DVD video. A typical new computer comes with a terabyte hard drive, so one exabyte is the equivalent of a million new computers.

The numbers involved in this topic are so great that petabytes are probably the best unit for measuring data; a thousand of them make up an exabyte. A petabyte is the equivalent of filling the hard drives of a thousand new computers.
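The conversions above are easy to check with some back-of-envelope arithmetic. This is a minimal sketch assuming decimal (SI) units, i.e. a terabyte as 10^12 bytes rather than the binary 2^40:

```python
# Back-of-envelope unit arithmetic for the sizes above (decimal units assumed).
TERABYTE = 10**12   # bytes: a typical new computer's hard drive
PETABYTE = 10**15
EXABYTE = 10**18

# A petabyte fills the drives of a thousand new computers...
assert PETABYTE // TERABYTE == 1_000
# ...a thousand petabytes make up an exabyte...
assert EXABYTE // PETABYTE == 1_000
# ...so an exabyte fills the drives of a million new computers.
print(EXABYTE // TERABYTE)  # 1000000
```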

Given that cloud computing and data centres have grown exponentially since 2007, it's possible that number has doubled in the last five years.

For Amazon, details are harder to find. In June 2012 Amazon announced that its S3 cloud storage service was hosting a trillion 'objects'. If we assume the 'objects' – which could be anything from a picture to a database running on Amazon's service – have an average size of a megabyte, then that's an exabyte of storage.
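The estimate is just object count multiplied by an assumed average size. A quick sketch (the `total_storage` helper and the one-megabyte average are illustrative assumptions, not Amazon's figures):

```python
# Sketch of the storage estimate: object count times an assumed average size.
MEGABYTE, PETABYTE, EXABYTE = 10**6, 10**15, 10**18

def total_storage(objects, avg_size=MEGABYTE):
    """Total bytes for a given object count at an assumed average object size."""
    return objects * avg_size

# At a one-megabyte average, a billion objects is a petabyte
# and a trillion objects is an exabyte.
assert total_storage(10**9) == PETABYTE
assert total_storage(10**12) == EXABYTE
```

The interesting thing about this arithmetic is how sensitive the total is to the assumed average: halve the average object size and the estimate halves with it.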

The amount of storage is only one part of the equation; we have to be able to do something with the data we've collected, so we also have to look at processing power. This comes down to the number of computer chips, or CPUs – Central Processing Units – being used to crunch the information.

Probably the most impressive data cruncher of all is Google's search engine, which processes phenomenal amounts of data every time somebody searches the web. Google have put together an infographic illustrating how they answer over a billion queries a day in an average time of less than a quarter of a second.
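To put "a billion queries a day" in perspective, a rough conversion to a per-second rate:

```python
# Turning "over a billion queries a day" into a sustained per-second rate.
queries_per_day = 10**9
seconds_per_day = 24 * 60 * 60  # 86,400
print(round(queries_per_day / seconds_per_day))  # 11574
```

That's over eleven thousand searches every second, each answered in under a quarter of a second, and the true peak rate would be higher still since traffic isn't spread evenly across the day.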

The numbers involved in answering the question of how much data is stored by web services are mind-boggling, and they are growing exponentially. One of the problems with researching a topic like this is how quickly the source data becomes outdated.

It’s easy to overlook the complexity and size of the technologies that run social media, cloud computing or web searches. Asking questions on how these services work is essential to understanding the things we now take for granted.

12 Responses to “How much server space do Internet companies need to run their sites?”

As I got older, there were times when I have moved house, upsized, downsized for my changing needs. Through this process, I learnt to let go of the physical load of memories and 'junk' that weighed me down. So instead of coining the next 'wecan'tfitanythingmoreinhere'-byte term, when will the concept of culling be applied to the digital world? When do we stop packing everything into the shed that we all share now (i.e. the internet)? Let's stop, breathe and clean out the junk that we will never look at again, and relieve us and the people around us of this burden. When will we stop building bigger places in our physical, mental, emotional and digital states to store more stuff, rather than cleaning out our cupboards, thoughts, lives and the internet to make our lives cleaner and more simple?

Kim, I don’t think we’ll see that culling soon. If anything it’s going to get worse as digital storage increases. The real challenge, as much for individuals as businesses, is in managing these masses of data.

On the other hand, finding an obscure video on YouTube that reminds you of something you experienced in your childhood 20 years ago is an amazing feeling. The Internet is like an unlimited Library of Alexandria. For almost every piece of information there is somebody somewhere who will be extremely happy that it was preserved. After all, if we start judging what information is "worthy" of preserving, who gets to decide what's "worthy" information and what is not? Is your opinion that something is "junk" more valuable than somebody who will cry when they see it twenty years from now?

[…] engines of our informational and social world (Google, Facebook, etc.) could currently or soon have at least exabytes (1000^6 bytes) of information on us. This means that we have to know much more about them: secrecy in corporations as powerful as these […]

Wow… it's just mind-blowing. I was just thinking of my first hard drive: 20 MB, probably the size of 8 CD cases stacked, just the physical media of it, and it probably cost 500 bucks. I just spent 20 minutes looking for a 32 GB SD card I dropped because it's the size of a fingernail. That SD card was 15 bucks. I remember Moore's law; I thought I understood it, was excited by it when I was younger, and now I'm just scared by it, honestly. A new, OK desktop is 500 bucks and comes with a terabyte of storage – that's 50,000 times more than my first computer. My cell has more computing power than NASA had in the '80s and probably most of the '90s. Just mind-blowing and headache-causing when you really stop to think about it.