Thursday, November 15, 2012

Our analysis of the costs of cloud storage assumes that the only charges for bandwidth are those levied by the cloud storage service itself. Typically services charge only for data out of the cloud. From our privileged viewpoint at major universities, this is a natural assumption to make.

At The Register, Trevor Pott looks at the costs of backing up data to the cloud from a more realistic viewpoint. He computes the cost and time involved for customers who have to buy their Internet bandwidth on the open market. He concludes that for small users cloud backup makes sense:

I can state with confidence that if you have already have a business
ADSL with 2.5Mbps upstream and at least a 200GB per month transfer limit
(not hard to find in urban areas in most developed nations) then cloud
storage for anything below 100GB per month will make sense. The
convenience and reliability are easily worth the marginal cost.
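As a rough check on the small-user case, here is a minimal sketch of how long 100GB takes to upload over a 2.5Mbps link. It assumes decimal gigabytes and that the link runs at its full rated speed around the clock with no protocol overhead; both are my simplifications rather than figures from the article, so the result is a best case.

```python
def transfer_days(gigabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `gigabytes` at a sustained line rate.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s)
    and ignores protocol overhead, so this is a lower bound.
    """
    bits = gigabytes * 1e9 * 8
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86_400  # seconds per day


# The small user from the quote: 100 GB/month over 2.5 Mbps upstream.
small_user_days = transfer_days(100, 2.5)
print(f"100 GB at 2.5 Mbps: {small_user_days:.1f} days")
```

Even in this best case the upload occupies the link for several days each month, which is tolerable for a backup that runs in the background.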

For his example large user, backing up 15TB/mo over a 100Mbit fiber connection, the bandwidth costs from the ISP are double the storage charges from Amazon, for a total of $4,374. Recovery from the backups would cost about as much as a month's backup and, to boot, would take a month. That simply isn't viable when compared to his local solution:

The 4TB 7200 RPM Hitachi Deskstar sells for $329 at my local computer retailer. Five of these drives (for RAID 5) is $1,645; a Synology DS1512+ costs $899. A 10x10 storage unit is $233/month, and the delivery guy costs me $33 per run. So for me to back up 15TB off-site each month is $2,800 per month.
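The arithmetic behind that figure can be sketched as follows. The prices are the ones quoted above; the assumptions that a fresh set of hardware is bought each month and that there is one delivery run per month are mine, chosen because they reproduce the quoted total to within rounding.

```python
# Prices as quoted in the article (US dollars).
drive_price = 329       # one 4TB Hitachi Deskstar
drive_count = 5         # RAID 5 across five drives
nas_price = 899         # Synology DS1512+
storage_unit = 233      # 10x10 storage unit, per month
delivery_run = 33       # per run; one run per month is an assumption

# RAID 5 gives up one drive's worth of capacity to parity.
usable_tb = (drive_count - 1) * 4

hardware = drive_count * drive_price + nas_price
recurring = storage_unit + delivery_run
local_total = hardware + recurring

cloud_total = 4374      # ISP bandwidth plus Amazon storage, from the article

print(f"usable capacity: {usable_tb} TB")
print(f"local off-site backup: ${local_total}")
print(f"local / cloud cost ratio: {local_total / cloud_total:.2f}")
```

Under these assumptions the local approach comes to $2,810, which matches the roughly $2,800 in the quote, and the array's usable capacity just covers the 15TB to be backed up.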

Of course, in many cases libraries and archives are part of large institutions and their bandwidth charges are buried in overhead. And the bandwidth usage of preservation isn't comparable to backup; the rate at which data is written is limited by the rate at which the archive can ingest content. On the whole, I believe it is reasonable for our models to ignore ISP charges, but Trevor's article is a reminder that this isn't a no-brainer.

Monday, November 12, 2012

A valid criticism of my blog posts on the economics of long-term storage, and of our UNESCO paper (PDF), is that we conflate Kryder's Law, which describes the increase in the areal density of bits on disk platters, with the cost of disk storage in $/GB. We waved our hands and said that it roughly mapped one-for-one into a decrease in the cost of disk drives. We are not alone in using this approximation; Mark Kryder himself does (PDF):

Density is viewed as the most important factor ... because it relates directly to cost/GB and in the HDD marketplace, cost/GB has always been substantially more important than other performance parameters. To compare cost/GB, the approach used here was to assume that, to first order, cost/GB would scale in proportion to (density)⁻¹

My co-author Daniel Rosenthal has investigated the relationship between bits/in² and $/GB over the last couple of decades. Over that time, it appears that about 3/4 of the decrease in $/GB can be attributed to the increase in bits/in². Where did the rest of the decrease come from? I can think of three possible causes:

Economies of scale. For most of the last two decades the unit shipments of drives have been increasing, resulting in lower fixed costs per drive. Unfortunately, unit shipments are currently declining, so this effect has gone into reverse.

Manufacturing technology. The technology to build drives has improved greatly over the last couple of decades, resulting in lower variable costs per drive. Unfortunately HAMR, the next generation of disk drive technology, has proven to be extraordinarily hard to manufacture, so this effect has gone into reverse.

Vendor margins. Over the last couple of decades disk drive manufacturing was a very competitive business, with numerous competing vendors. This gradually drove margins down and caused the industry to consolidate. Before the Thai floods, there were only two major manufacturers left, with margins in the low single digits. Unfortunately, the lack of competition and the floods have led to a major increase in margins, so this effect has gone into reverse.

Thus it seems unlikely that, at least in the medium term, causes other than Kryder's Law will contribute significantly to reductions in $/GB. They may even contribute to increases. We have already seen that even the industry projections have Kryder's Law slowing significantly, to no more than 20% per year for the next 5 years.
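To get a feel for what such a slowdown means for $/GB, here is a minimal sketch using Kryder's first-order approximation that cost/GB scales as 1/density. Reading the 20% as an annual growth rate in areal density is my assumption, and the 40% comparison rate is a purely illustrative stand-in for a faster regime, not a figure from the projections.

```python
def relative_cost_per_gb(annual_density_growth: float, years: int) -> float:
    """Cost/GB after `years`, relative to today, assuming cost/GB is
    inversely proportional to areal density (first-order approximation)."""
    return 1.0 / (1.0 + annual_density_growth) ** years


# Projected ceiling: density grows 20% per year for 5 years (assumed reading).
slow = relative_cost_per_gb(0.20, 5)

# Hypothetical faster regime for comparison; 40% per year is illustrative only.
fast = relative_cost_per_gb(0.40, 5)

print(f"20%/yr for 5 years: cost/GB falls to {slow:.0%} of today's")
print(f"40%/yr for 5 years: cost/GB falls to {fast:.0%} of today's")
```

Under these assumptions, five years at 20% leaves $/GB at about 40% of today's level, against about 19% in the faster case. Since the other three causes are unlikely to help, that gap flows straight through to the cost of storage.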

Thursday, November 8, 2012

Andrew Brown asked to see the echocardiogram of his ticker, which was
taken eight years ago. He was told that although the scan is still on
file in the Worcestershire Royal hospital, it will cost a couple of
grand to recreate the data as an image because it is stored in a format
that can no longer be read by the hospital's computers.

But looked at more closely, below the fold, we see that it isn't so simple.