I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.

Tuesday, June 18, 2013

Not trusting cloud storage

I'm trying to work through my stack of half-completed blog posts. Some months ago Jane Mandelbaum at the Library of Congress pointed me to Towards self-repairing replication-based storage systems using untrusted clouds by Bo Chen and Reza Curtmola. It received an “Outstanding Paper Award” at CODASPY 2013. Let me start by saying that the technique they describe is interesting and, as far as I can tell, represents an advance in some important respects on earlier work.

However, it is also an example, if not a severe one, of the problem I discussed in my post Journals Considered Harmful of authors hyping their work in order to be published in a higher-profile forum, and reviewers failing to catch this exaggeration. Follow me below the fold for the details.
Chen and Curtmola propose a system with a client, the data owner, and a number of servers cooperating to store the owner's data reliably in untrusted storage. They describe their work as motivated by two insights:

The storage servers should be required to store replicas that are different, so that a malign server cannot forward a challenge to prove that its replica is undamaged to an unwitting confederate.

The storage servers should cooperate to regenerate a failed replica rather than placing the burden on the data owner to retrieve a complete replica to complete the process.

As a system preserving data for the long term, the LOCKSS system cannot rely on the continued existence of a single, external data owner, so it is not directly comparable. The second problem observed by Chen and Curtmola is in fact a requirement for a system such as LOCKSS that does not have a data owner to drive the repair process.

LOCKSS nodes store replicas that are the same, but mitigate the first problem using random sampling from the nodes and two nonces, one shared by all nodes in a sample, and one unique to each node in the sample. The malign node cannot be sure that the node to which it forwards the challenge will not also receive the same shared nonce from another node; an unambiguous signal of malfesance. Because, in effect, the per-node nonce makes the content being challenged different at each node, a conspiracy trying to behave as N nodes has to do N times as much work as a loyal node.

Chen and Curtmola's scheme imposes the extra cost of ensuring that each challenge is different once per replica, during its creation, instead of once per challenge, albeit a much smaller cost, as in the LOCKSS case.This, and their use of Proof of Possession (PoP), which checks a sample of the content, rather than the LOCKSS system's use of Proof of Retrievability (PoR), which checks the entire content, makes integrity checks cheaper and thus potentially more frequent. More frequent checks increase the system's reliability, although this must be balanced against the possibility that the sample of a PoP check might exclude some damage. The disadvantage of their scheme is that the initial ingest of the content to the system is expensive. To defeat the replicate-on-the-fly attack they require that generating each new replica be computationally expensive, but during initial ingest the client must generate all the replicas. On balance, in Chen and Curtmola's context their approach has significant benefits. It is not clear that these benefits transfer to other contexts.

Chen and Curtmola claim:

However, in all previous work [12,7] the Repair phase imposes a significant burden on the data owner, who needs to expend a significant amount of computation and communication.

and:

To the best of our knowledge, we are the first to propose the server-side repair strategy and an RDC [Remote Data Checking] scheme that implements it in the context of a real-world CSP [Cloud Storage Provider].

and:

Similar with [sic] the work of Reiter at [sic] al. [18], our scheme relies on the idea that only a prover which has the data can respond quickly enough to pass a challenge.

Very nearly 10 years ago, my co-authors and I won a Best Paper award at SOSP for Preserving Peer Replicas By Rate-Limited Sampled Voting, which was later expanded to LOCKSS: A Peer-to-Peer Digital Preservation System and published in ACM Transactions on Computing Systems. In it we describe a peer-to-peer storage system in which there is no data owner, and in which repair happens entirely between peers. Its defense against attack is based, in part, on the idea that a peer has a very limited time to respond to a challenge. Like Chen and Curtmola, we assumed a powerful adversary capable of conspiring among a large number of peers. This work forms the basis of the LOCKSS system, which has been in production use since before there were cloud storage systems such as Amazon's, which launched in 2006. Other similarities are described above.

These are not obscure papers; one is in a top conference and the other is in a significant journal. Between them, Google Scholar reports 88 citations. Among these citations are Chen and Curtmola's references 3, 4 and 11. Each of these references includes Curtmola as a co-author.

I am not arguing that it would have been appropriate for Chen and Curtmola to cite our work. It is a decade old and the field has moved on, making more recent works more relevant citations. Their context and that of the LOCKSS papers are significantly different. My concerns are two-fold:

That, knowing of our work, they made sweeping claims of novelty against "all previous work" which can only be substantiated by the narrowest of readings of their text. Their claim of novelty against "all previous work" may be valid if by "all previous work" they mean "all previous work on client-server based systems using conventional cloud storage providers", but that is not how it would be read, and reading it that way would significantly reduce the claimed impact of their work.

That, based on our fairly well-known (if older) papers, the reviewers (presumably) did not contest these sweeping claims and suggest less ambitious ones.

Without reform of our publication system, the incentives associated with increased pressure to publish in high-ranking journals will continue to encourage scientists to be less cautious in their conclusions (or worse), in an attempt to market their research to the top journals.

This illustrates how important it is that reviewers not blindly accept the author's claims for novelty and importance. The difficulty reviewers find in doing so is a strong argument for journals such as PLoS One, whose reviewers are not asked to judge the novelty or impact of a paper. Since the reviewers cannot be swayed by exaggerated claims, authors have less motivation to make them.