Practical Private Information Retrieval (2PIR)

Project Overview

In many distributed e-commerce scenarios, such as trading digital goods, the user acts by requesting (buying) a record from a database. The user may not want to reveal which record he is buying; in other words, he may want his privacy to be preserved. Against outside observers, this is trivially achieved by encrypting the traffic between the user and the database server. The remaining problem is: how does the user hide the identity of the record from the database server itself? This problem appears especially important in light of recent cases in which e-commerce companies illegally traded user preferences, such as lists of items requested by users.

By definition, a Private Information Retrieval (PIR) protocol allows a user to retrieve a record of his choice from a database server such that nobody (not even the server) learns the identity of the record. Many PIR protocols have been developed since research on PIR started in 1995. However, those algorithms have been hard to apply to real databases for two reasons. First, the algorithms are computationally too complex, which leads to unacceptable server response times. Second, the database models considered are too simple, which makes the algorithms useless for practical applications.
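To make the definition concrete, here is a minimal sketch of the classic two-server XOR construction from the early PIR literature. It is an illustration only, not one of the algorithms developed in this project. It assumes two servers that hold identical copies of the database and do not collude, and records of equal length; the function names (`make_queries`, `answer`, `reconstruct`, `retrieve`) are hypothetical.

```python
import secrets


def make_queries(n, index):
    """Build one query per server for the record at position `index`.

    Each query is a list of n bits selecting a subset of the records.
    Taken alone, each query is a uniformly random subset, so neither
    server learns anything about `index` (assuming they do not collude).
    """
    q1 = [secrets.randbits(1) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1  # the two subsets differ only at the wanted position
    return q1, q2


def answer(database, query):
    """Server side: XOR together all records selected by the query."""
    result = bytes(len(database[0]))  # all-zero block of record length
    for record, bit in zip(database, query):
        if bit:
            result = bytes(a ^ b for a, b in zip(result, record))
    return result


def reconstruct(answer1, answer2):
    """Client side: every record selected twice cancels out under XOR,
    leaving only the record at the position where the queries differ."""
    return bytes(a ^ b for a, b in zip(answer1, answer2))


def retrieve(database, index):
    """Run the whole exchange locally (both servers simulated)."""
    q1, q2 = make_queries(len(database), index)
    return reconstruct(answer(database, q1), answer(database, q2))


if __name__ == "__main__":
    db = [b"aaaa", b"bbbb", b"cccc", b"dddd"]  # equal-length records
    print(retrieve(db, 2))
```

Even in this simple form, each query is as long as the number of records, and each server has to process about half of the database per request, which hints at the performance problem described above.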

The aim of the project is to develop new PIR algorithms and to optimize existing ones, so that they:

provide reasonable performance

fit real-world database models

This research was supported by the German Research Society (DFG), Berlin-Brandenburg Graduate School in Distributed Information Systems (GKVI) (DFG grant no. GRK 316).

Question: Can't I just use anonymization techniques (like Crowds, Onion Routing etc.) instead of PIR to privately access a DB?

Answer: It depends on what you want.

PIR protects the contents of your queries. It does not hide who issues them. Anonymization protocols (including those based on MIX nets) use a crowd of users to hide who issues a query, but the queries themselves are not protected from the server.

Consequently, another difference between PIR and those tools is that PIR resists the all-against-one conspiracy, whereas Mixes and similar tools do not. PIR is the next, and probably the last, step in privacy protection.

Question: I have just typed "PIR" in Google, and got half a million links back. Is that all about Private Information Retrieval?

Answer: No, many other terms are abbreviated as PIR. A list of links related to Private Information Retrieval can be found at this site.

Question: Look, I know nothing about computers. Could you explain to me what your research is about?

Answer: Look, have you ever seen a library? So far so good.

Normally, there is a librarian who has exclusive access to the library. When you are interested in a book, you talk to the librarian. The problem arises if you want nobody, not even the librarian, to know anything about which book you are interested in.

By definition, you cannot bypass the librarian to access the book directly.

The only solution for you is to ask for the entire library, so that nobody can suspect you of being interested in a particular book. That is not practical, either for you or for the librarian. Nothing can be done about it for conventional libraries.

It looks like it is possible to find some practical solutions in the case of digital libraries. ... practical both for you and for a digital librarian. And I am working on it, because private access is as important for digital libraries as for paper libraries.

Similarly, these solutions could be applied to provide privacy for the customers ... the customers of digital goods.

To understand the rest, you need some notion of two areas of computer science: digital libraries (databases) and digital magic (cryptography).