Americans consume 3.6 zettabytes of data, most of it pixels

An estimate of US consumers' data consumption places the figure at 3.6 …

Researchers from the University of California, San Diego, have produced the latest in a series of reports entitled "How Much Information?" The goal of the work is to provide some estimate of the amount of content that a typical US consumer goes through in a given year, in this case 2008. When expressed in terms of raw bytes, the report's authors estimate a staggering 3.6 zettabytes, which works out to 34GB a day per consumer. But that figure counts everything, including pixels that are rendered and thrown onto the screen by gaming consoles. When considered in terms of words delivered to the actual human consumers, games barely register.
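As a rough sanity check on how the annual total relates to the per-consumer figure, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation, not the report's method: it assumes decimal (SI) units for zettabytes and gigabytes, a 366-day year for 2008, and a hypothetical helper name, `implied_consumers`, none of which come from the report itself.

```python
# Back-of-the-envelope check: how many consumers do the two headline
# figures imply? Assumes SI (decimal) units, which the article does not state.
ZETTABYTE = 10**21
GIGABYTE = 10**9
DAYS_IN_2008 = 366  # 2008 was a leap year

total_bytes_per_year = 3.6 * ZETTABYTE       # annual total from the report
bytes_per_consumer_per_day = 34 * GIGABYTE   # per-consumer daily figure


def implied_consumers(total_bytes: float, per_day_bytes: float, days: int) -> float:
    """Number of consumers implied by an annual total and a daily per-person figure."""
    return total_bytes / (per_day_bytes * days)


consumers = implied_consumers(total_bytes_per_year, bytes_per_consumer_per_day, DAYS_IN_2008)
print(f"Implied consumers: {consumers / 1e6:.0f} million")
```

Under those assumptions the result lands in the neighborhood of 290 million people, which suggests the two headline figures are at least consistent with each other.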

The report involved collecting a large number of estimates of various forms of media consumption: hours spent gaming, number of newspapers sold, etc. These were combined with estimates of the information content of each medium, such as the number of words in a typical newspaper, and, when necessary, converted into bytes. As such, there are undoubtedly significant error bars on most of these estimates, although they're not provided with the numbers in the report. Still, some of the differences are pronounced enough that it's fair to say that even large errors wouldn't change many of the overall conclusions.

The other caveat is that past studies of this sort, produced primarily at UC Berkeley (in part by Hal Varian, who's now with Google), have focused on unique content. As a result, they're not directly comparable to the current figures. The only prior study that is directly comparable dates from 1980, which is ancient history in digital terms.

The numbers are broken down in three ways. The first is raw hours of consumption, which counts time spent multitasking—browsing the Web while watching TV, for example—twice. All told, US consumers spend nearly a dozen hours a day immersed in some sort of media. Given the typical work and sleep requirements, that implies a lot of us are heavy multitaskers. Combined, traditional media like TV and radio still account for over half those hours, although the shrinking market for print is apparent, as only 0.6 hours a day were spent on that. Computers and gaming, which seem to dominate media discussions, still only account for about a quarter of the daily consumption.

When it comes to the information conveyed by these media, the report turns to a fairly crude measure: word count. Here, TV comes to the fore, accounting for nearly half the words ingested by US consumers. Print actually registers as more significant here, while radio shrinks, as a lot of people are simply using it for listening to music. Computers do play a significant role in this measure of consumption, accounting for over a quarter of the words we see, but the impact of games plunges into insignificance here. Compared to the 1980 figures, we're seeing a 140 percent increase in the total number of words we encounter in a day.

The last measure the analysis examined was the number of bytes involved in the different forms of media. Here, because of their visual richness, computer games accounted for half the total data that US consumers are exposed to, TV accounted for just over a third, and movies, for the first time, appear to be a significant contributor. Because of the amount of data involved in displaying visual information, the total number of bytes is enormous; this is where the 3.6 zettabytes figure for annual consumption appears.

But it's important to note that much of this data is transient: TV broadcasts that only exist as a limited number of permanent copies, scenes rendered on-the-fly in games, etc. It's a necessary thing, too, as it's estimated that the total number of bytes consumed is 20 times larger than the entire world's hard disk capacity.

Compared to 1980 figures, the number of bytes has increased by 350 percent. That sounds like a lot until the annual rate of growth is considered: 5.4 percent, a number that's far below the rate of change for storage and processing power over that period. In short, we're able to handle a lot more content, but we don't seem to be doing so.
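The relationship between a total percentage increase and an annual growth rate is just compound growth, and it can be sketched as follows. This is an illustrative calculation, not the report's own: it assumes a 28-year span (1980 to 2008), treats "increased by 350 percent" as a final value 4.5 times the starting one, and uses a hypothetical helper named `annual_growth_rate`.

```python
# Compound annual growth rate implied by a total percentage increase.
# Assumption: the comparison spans 1980 to 2008, i.e. 28 years.
YEARS = 2008 - 1980


def annual_growth_rate(percent_increase: float, years: int) -> float:
    """Return the constant yearly growth rate that yields the given total increase."""
    growth_factor = 1 + percent_increase / 100  # a 350% increase means 4.5x
    return growth_factor ** (1 / years) - 1


bytes_rate = annual_growth_rate(350, YEARS)  # total bytes consumed
words_rate = annual_growth_rate(140, YEARS)  # total words encountered

print(f"Bytes: {bytes_rate:.1%} per year")
print(f"Words: {words_rate:.1%} per year")
```

With these inputs the bytes figure comes out at roughly 5.5 percent a year, close to the 5.4 percent quoted; the small gap is presumably down to rounding in the 350 percent figure. The same calculation puts the growth in words at a little over 3 percent a year.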

The authors provide a number of possible explanations for this discrepancy, including the fact that commercial data isn't considered, and the amount of direct, computer-to-computer data exchange appears to be on the rise. Still, part of this difference may reflect real trends. Although HD hardware and some HD programming are now available, for the most part, TV signals haven't changed dramatically since 1980. And, as more consumers turn to online video, compression ensures that they're actually consuming fewer bytes.

Although many of the numbers provided by the report are intriguing, and a few of the trends identified are likely to be significant, the large uncertainties associated with any of these figures suggest that they should be used cautiously. And the authors go into an extensive discussion of why measures like total bytes, or even total words, may be poor measures of significance. After all, they point out, a typical episode of Lost blows away Lincoln's Gettysburg Address on both counts, but the latter will probably remain quite a bit more significant.