I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.

Saturday, June 9, 2007

A Petabyte For A Century

In a talk at the San Diego Supercomputer Center in September 2006, I started arguing (pdf) that one of the big problems in digital preservation is that we don't know how to measure how well we are doing it, which makes it difficult to improve. Because supercomputer people like large numbers, I used the example of keeping a petabyte of data for a century to illustrate the problem. This post expands on my argument.

Let's start by assuming an organization has a petabyte of data that will be needed in 100 years. They want to buy a preservation system good enough that there will be a 50% chance that at the end of the 100 years every bit in the petabyte will have survived undamaged. This requirement sounds reasonable, but it is actually very challenging. They want 0.8 exabit-years of preservation with a 50% chance of success. Suppose the system they want to buy suffers from bit rot, a process that has a very small probability of flipping a bit at random. By analogy with the radioactive decay of atoms, they need the half-life of bits in the system to be at least 0.8 exa-years, or roughly 100,000,000 times the age of the universe.
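The arithmetic behind that half-life can be sketched in a few lines of Python. This is only an illustration: it assumes a decimal petabyte (8 x 10^15 bits), independent bits that each decay exponentially, and a round 1.4 x 10^10 years for the age of the universe. The function name and constants are mine. With those assumptions the ratio comes out at some tens of millions, the same order of magnitude as the round figure above.

```python
import math

BITS_PER_PETABYTE = 8 * 10**15   # assumption: decimal petabyte, 8 bits per byte
AGE_OF_UNIVERSE_YEARS = 1.4e10   # assumption: round figure

def required_half_life_years(n_bits, years, p_all_survive):
    """Half-life each bit needs so that all n_bits survive `years`
    with probability p_all_survive, assuming independent exponential decay."""
    # P(all survive) = 2 ** (-n_bits * years / half_life), solved for half_life
    return n_bits * years * math.log(2) / -math.log(p_all_survive)

half_life = required_half_life_years(BITS_PER_PETABYTE, 100, 0.5)
ratio = half_life / AGE_OF_UNIVERSE_YEARS
print(f"required half-life: {half_life:.1e} years")
print(f"that is {ratio:.1e} times the age of the universe")
```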

In order to be confident that they are spending money wisely, the organization commissions an independent test lab to benchmark the competing preservation systems. The goal is to measure the half-life of bits in each system to see whether it meets the 0.8 exa-year target. The contract for the testing specifies that results are needed in a year. What does the test lab have to do?

The lab needs to assemble a big enough test system so that, if the half-life is exactly 0.8 exa-years, it will see enough bit flips to be confident that the measurement is good. Say it needs to see 5 bit flips or fewer to claim that the half-life is long enough. Then the lab needs to test an exabyte of data for a year.
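To get a feel for the sample size, here is a rough sketch of how many flips an exabyte-year test would expect to see if the system sat exactly on the 0.8 exa-year target. The post does not specify a statistical test, so the Poisson model for the flip count and the function names are my assumptions, not a proper experimental design.

```python
import math

EXABYTE_BITS = 8 * 10**18        # assumption: decimal exabyte
TARGET_HALF_LIFE_YEARS = 0.8e18  # the 0.8 exa-year target

def expected_flips(n_bits, years, half_life_years):
    """Mean number of bits that flip at least once, assuming independent decay."""
    # expm1 avoids the rounding to zero that 1 - 2**(-tiny) would suffer
    p_flip = -math.expm1(-math.log(2) * years / half_life_years)
    return n_bits * p_flip

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson-distributed count X with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

flips = expected_flips(EXABYTE_BITS, 1, TARGET_HALF_LIFE_YEARS)
p_five_or_fewer = poisson_cdf(5, flips)
print(f"expected flips in an exabyte-year: {flips:.2f}")
print(f"chance of seeing 5 or fewer: {p_five_or_fewer:.2f}")
```

A test on anything much smaller than an exabyte would expect less than one flip in the year, which is why the test system has to be about a thousand times the size of the system being bought.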

The test consists of writing an exabyte of data into the system at the start of the year and reading it back several times, let's say 9 times, during the year to compare the bits that come out with the bits that went in. So we have 80 exabits of I/O to do in one year, or roughly 10 petabits/hour, which is an I/O rate of about 3 terabits/sec. That is 3,000 gigabit Ethernet interfaces running at full speed continuously for the whole year.
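The I/O figures can be checked with a back-of-the-envelope calculation. The sketch below assumes a decimal exabyte and a 365-day year, and the variable names are mine. The unrounded results come out a little under the rounded figures quoted above.

```python
EXABYTE_BITS = 8 * 10**18   # assumption: decimal exabyte
HOURS_PER_YEAR = 365 * 24   # assumption: non-leap year

passes = 1 + 9              # one initial write plus nine read-backs
total_bits = passes * EXABYTE_BITS

bits_per_hour = total_bits / HOURS_PER_YEAR
bits_per_second = bits_per_hour / 3600
gigabit_links = bits_per_second / 1e9   # links running flat out all year

print(f"total I/O:   {total_bits / 1e18:.0f} exabits")
print(f"per hour:    {bits_per_hour / 1e15:.1f} petabits")
print(f"per second:  {bits_per_second / 1e12:.1f} terabits")
print(f"GigE links:  {gigabit_links:.0f}")
```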

At current storage prices, just the storage for the test system will cost hundreds of millions of dollars. When you add on the cost of the equipment to sustain the I/O and do the comparisons, and the cost of the software, staff, power and so on, it's clear that the test to discover whether a system would be good enough to keep a petabyte of data for a century with a 50% chance of success would cost in the billion-dollar range. This is of the order of 1,000 times the purchase price of the system, so the test isn't feasible.

I'm not an expert on experimental design, and this is obviously a somewhat simplistic thought-experiment. But suppose that the purchasing organization were prepared to spend 1% of the purchase price per system on such a test. The test would then have to cost roughly 100,000 times less than my thought-experiment to be affordable. I leave this 100,000-fold improvement as an exercise for the reader.

9 comments:

I forgot to mention that, if we compare 64 bits at a time, we need to do about 140 peta-comparisons to do the test. And we need to be confident that there is a very low probability that any of the comparisons will be performed incorrectly. Say we need only a 1% probability that an error will creep in. We need to have comparison hardware and software capable of about 1 in 10 exa-operations reliability. That's 18 nines of reliability.

Is it easier and feasible to store the blob of data for a year or two, then to have a person look at it and perform some tests to show that it is still fit for purpose? This may be a case where people are less costly than technology, and where a fit for purpose definition is more useful than a bit-flip definition.

David Bowen makes two points. First, he suggests that "fit for purpose" might be a more useful standard than faithful preservation of the bits. Second, he suggests that a "fit for purpose" standard could be assessed by human viewing rather than automatically.

For obvious economic reasons, large static collections of data will be stored in compressed form. Data is compressed by eliminating redundancy. As redundancy is eliminated, the effect of a single bit flip is increased, until in the limit of perfect compression a single bit flip would corrupt the entire compressed data block. Thus the standard to which storage needs to be held is faithful storage of each bit; anything less will increase the cost of ensuring that the result is "fit for purpose".
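A small experiment illustrates the point about compression. The sketch below uses Python's standard `zlib` module on made-up sample data (the record format and helper names are mine, purely for illustration). It flips each bit of the compressed block in turn and counts how often the original can no longer be recovered. Note that the zlib format carries an Adler-32 checksum, so a damaged stream is normally rejected outright rather than returned partially correct.

```python
import zlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)

def survives(compressed: bytes, original: bytes) -> bool:
    """True if the compressed block still decompresses to the original."""
    try:
        return zlib.decompress(compressed) == original
    except zlib.error:
        return False

# Hypothetical sample data: 200 short, highly redundant records.
original = b"".join(b"record %06d: the quick brown fox\n" % i for i in range(200))
compressed = zlib.compress(original, 9)

total_bits = len(compressed) * 8
broken = sum(not survives(flip_bit(compressed, i), original)
             for i in range(total_bits))

print(f"{len(original)} bytes compressed to {len(compressed)} bytes")
print(f"{broken} of {total_bits} single-bit flips lost the whole block")
```

In the uncompressed file the same single flip would damage one character of one record.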

As regards human viewing of the data to detect preservation failures, a petabyte of data is far too large for this to be feasible. Consider that a petabyte would be about 500,000 hours of DVD video. If it were to be reviewed every two years, as suggested, it would need a team of 125 viewers watching 40 hours a week. How certain would you be that not a single dropped or corrupted frame would escape their eagle eyes? Adequately auditing preserved data is always going to be such a mind-numbingly boring task that only programs could do it.
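The staffing estimate is simple enough to reproduce. In this sketch the 50 working weeks per year is my assumption (it is what makes 40 hours a week come to 2,000 hours a year), and the function name is hypothetical.

```python
import math

def viewers_needed(video_hours, review_period_years,
                   hours_per_week=40, weeks_per_year=50):
    """Full-time viewers needed to watch everything once per review period."""
    hours_per_viewer = hours_per_week * weeks_per_year * review_period_years
    return math.ceil(video_hours / hours_per_viewer)

team = viewers_needed(video_hours=500_000, review_period_years=2)
print(f"viewers needed: {team}")
```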

Clearly, storing multiple replicas of the data, auditing them, and repairing any damage are all essential tasks of any preservation system. But my larger point is that, no matter how well these tasks are implemented, whether by programs or by humans, the task of storing a petabyte for a century with a 50% chance that every bit survives unimpaired requires such an extraordinary level of reliability that we cannot know whether any system we build to do it will actually succeed.

"Lets start by assuming an organization has a petabyte of data that will be needed in 100 years. They want to buy a preservation system good enough that there will be a 50% chance that at the end of the 100 years every bit in the petabyte will have survived undamaged. This requirement sounds reasonable, but it is actually very challenging. They want 0.8 exabit-years of preservation with a 50% chance of success. Suppose the system they want to buy suffered from bit rot, a process that had a very small probability of flipping a bit at random. By analogy with the radioactive decay of atoms, they need the half-life of bits in the system to be at least 0.8 exa-years, or roughly 100,000,000 times the age of the universe."

This para produces a very alarming number, but it has worried me ever since I heard it. I have been struggling to work out why. One reason I'm worried is that a petabyte sounds like a large number, but I've been in this game long enough to remember when a gigabyte was an impossibly large number, and now there are 100 of them on this laptop.

If a PB-century requires a half-life of 100 million times the age of the universe, then keeping 1 GB for a century with the same stringency should require a half-life of 100 times the age of the universe. Keeping my 100 GB for 1 year should require the same half-life. Except I don't want a 50% chance of success; I'm expecting a >99% chance of success. So the bit half-life I can realistically expect must surely be much greater than 100 times the age of the universe. Which suggests the number is not as intuitively "bad" as it first seems.
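The scaling in this comment can be sketched with the same independent-decay model as the original post. The constants and function name below are my assumptions (decimal gigabytes, independent bits), and the output is only meant to show how the requirement moves with size, duration and the success probability demanded.

```python
import math

BITS_PER_GIGABYTE = 8 * 10**9   # assumption: decimal gigabyte

def required_half_life_years(n_bits, years, p_all_survive):
    """Bit half-life needed for all n_bits to survive `years`
    with probability p_all_survive, assuming independent decay."""
    return n_bits * years * math.log(2) / -math.log(p_all_survive)

laptop_bits = 100 * BITS_PER_GIGABYTE
at_50_percent = required_half_life_years(laptop_bits, 1, 0.50)
at_99_percent = required_half_life_years(laptop_bits, 1, 0.99)
stringency = at_99_percent / at_50_percent

print(f"100 GB for 1 year at 50%: {at_50_percent:.1e} years")
print(f"100 GB for 1 year at 99%: {at_99_percent:.1e} years")
print(f"asking for 99% instead of 50% multiplies the half-life by {stringency:.0f}")
```

Under this model, 100 GB for one year needs the same half-life as 1 GB for a century, since only the product of bits and years matters.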

Presumably the incredible half-life of real bits is due to multiple layers of technology, checksums etc?

I guess there is no such thing as immortal data. The best that companies holding such huge amounts of data can do is to replicate all of it again and again from one machine to another, repairing whatever has been damaged. The real question is how to build storage sturdy enough to withstand the test of time and preserve the data it contains. The materials that make up the storage have varied half-lives, unless we could invent something of pure silver or pure gold whose lifespan could be predicted. Then we could work on preserving the data for as long as the storage itself lasts.