I am trying to come up with what could maybe be a novel algorithm for an application I am writing. Client A has a file fA. Client B has a file fB. Each party is untrustworthy and will try to rip off the other party. Client A wants fB and client B wants fA.

How can I design an algorithm where it is not possible for either of them to screw the other party?

For example:
A encrypts fA and sends the file to B.
B encrypts fB and sends it to A.
Now, both A and B have an encrypted version of the file they want.
How can they both share the decryption key at the same time... or what?

Is this even possible? This is breaking my mind.

Since each client will try to rip off the other party, client A cannot send the file fA to client B and expect client B to send back its file. If client A sends it first "with no guarantees", client B will simply run off with the file.

In real life, one person holds the pot in one hand and the other person holds the cash. They both hand over their item at the same time. The guy with the cash now ends up with the pot and vice versa. How can I do something similar with computer bits? (Clients A and B are on a network at different locations.)

EDIT: Please note that each party has a hash of the file they are supposed to get. Therefore, neither party can send a junk file to the other guy without him knowing. A trusted third party will provide the hash.

I think the only way would be through a trusted third party, who gets both files, calculates the hashes, and then forwards them to the partners. I have no proof of this, though.
–
Paŭlo Ebermann, Jan 12 '12 at 0:58

Can the hash that A and B each have be created by a customized hashing algorithm specifically designed to make this algorithm work?
–
David Schwartz, Jan 12 '12 at 2:02

8 Answers
It can be done, but the algorithm is a bit complex and the hashes have to be specially constructed. The basic idea is this:

What you need is a way for each party to give the other a verifiable "clue" that reduces the search space for the possible file, say by a factor of 10. As soon as one party stops giving clues, the other stops giving clues as well. So a deceiver can, at worst, give you one fewer clue than you have given him.

That means you may have to do 10 times as much computation as he does to recover the file, but that's not enough to make the task possible for him but impossible for you.

One way to do this is to encode the file such that every byte is needed to recover any byte of it (for example, you can do this with no waste of bytes using an erasure code). Then you can make the hash actually include four hashes. One is of the entire encoded file. The next is of the file with the last two bytes missing. The next is of the file with the next three bytes missing. The last is of the file with the next four bytes missing.

You can then exchange all but the last 10 bytes of the file with the other side and validate this by hash. Then you can exchange the next four bytes and validate. Then the next three, and so on through the four rounds. At worst, a deceiver leaves you with 256 times more work to do than he has. So if he needs the file within a day, at worst you get the file within 256 days.

You can switch to nibbles and make this 16 instead of 256 at the cost of more hashes and more exchanges.
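To make the clue-by-clue idea concrete, here is a minimal Python sketch. It is a simplification and not the exact scheme above: it assumes SHA-256 as the hash, releases exactly one trailing byte per round (so the 256x gap holds at every step), and skips the encoding step that makes every byte necessary. The function names are made up for illustration.

```python
import hashlib
from itertools import product


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def make_checkpoints(encoded: bytes, rounds: int) -> list:
    """Hashes of the file with 0, 1, ..., `rounds` trailing bytes missing.

    Assumption: the trusted hash provider publishes this whole list.
    """
    return [sha(encoded[:len(encoded) - k]) for k in range(rounds + 1)]


def verify_prefix(prefix: bytes, missing: int, checkpoints: list) -> bool:
    """Check a received prefix that still lacks `missing` trailing bytes."""
    return sha(prefix) == checkpoints[missing]


def brute_force_tail(prefix: bytes, missing: int, full_hash: bytes):
    """Try all 256**missing tails until the full-file hash matches."""
    for tail in product(range(256), repeat=missing):
        candidate = prefix + bytes(tail)
        if sha(candidate) == full_hash:
            return candidate
    return None


encoded = b"example encoded file"
checkpoints = make_checkpoints(encoded, rounds=2)

# The peer stopped after leaving us two bytes short: we can still verify what
# we hold and finish the job ourselves, at up to 256**2 hash evaluations.
held = encoded[:-2]
recovered = brute_force_tail(held, 2, checkpoints[0])
```

Each extra missing byte multiplies the search by 256, which is where the "256 times more work" figure comes from when one side is a single byte behind.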

There is a classical article by Boneh and Naor which describes a possible solution for a problem which looks similar to yours. It relies on "timed commitments". Given a piece of data D which fulfills some algebraic properties, it is possible to compute a value C which can be sent to another party, with the following properties:

it is possible to efficiently prove to the other party that the value D hidden in C fulfills the "algebraic properties", but without revealing D itself (a zero-knowledge proof);

at some later date, you can reveal D itself, and the other party can quickly verify that D matches the commitment C;

if you do not reveal D, the other party can still recompute D from C, but at the price of a heavy computation, whose cost can be configured at will and made arbitrarily high; moreover, that computation is adverse to parallelism, i.e. it cannot be sped up by throwing more hardware at it;

you can do fine-grained partial revealing, i.e. giving away pieces of information which reduce the computational cost of force-opening the commitment (this partial information does not reveal anything about D itself; the other party still has to force-open C, but this task is made less expensive).

Note: the ZK proof also covers the cost of force-opening: the verifier of that proof is convinced that he could force-open the commitment with a given effort $2^k$.

In the Boneh-Naor protocol, the data D is a digital signature, and the "algebraic properties" of D are the fact that D is indeed a valid signature over a given known text T. The context is contract-signing: Alice and Bob want to sign a contract T, but neither wants to give away their signature without getting one from the peer (if Bob has a signature by Alice over T but Alice does not get a signature of Bob over T, then Bob gains a tactical advantage, which is what Alice wants to avoid; and vice versa). So the protocol looks like this:

Alice computes her signature SA over T, then computes a commitment CA over SA, and the ZK proof that the committed signature is valid. The commitment strength (cost of force-opening it) is tuned to an extremely high value $2^k$, similar to the cost of actually breaking Alice's public key (i.e. infeasible, e.g. $k = 80$). Alice sends CA (+ZK proof) to Bob.

Bob does the same with his own signature SB, and sends his commitment CB (+ZK proof) to Alice.

Alice does a first partial revealing on her commitment; with that information, Bob could now force-open CA with effort $2^{k-1}$.

Bob responds by partially revealing his commitment, reducing the cost for Alice of force-opening CB to $2^{k-1}$.

Alice then does a second partial revealing, reducing the cost for Bob of force-opening CA to $2^{k-2}$.

And so on.

If either Alice or Bob breaks out of the protocol at any point, both Alice and Bob can potentially finish the opening job, but only at similar costs: the cost for Alice will be no more than twice the cost for Bob, and no less than half the cost for Bob. In that sense, the protocol is fair.
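The factor-of-two fairness bound can be checked with a small bookkeeping sketch in Python. This only tracks the force-opening exponents as the rounds alternate; it does not implement timed commitments or the ZK proofs, and `simulate_exchange` is a hypothetical helper name.

```python
def simulate_exchange(k: int, reveals_done: int):
    """Track force-opening costs (as exponents of 2) when the protocol stops.

    Alice reveals first and the turns alternate. Returns a pair
    (exponent for Bob to force-open CA, exponent for Alice to force-open CB).
    """
    cost_open_ca = k  # Bob's effort to force-open Alice's commitment
    cost_open_cb = k  # Alice's effort to force-open Bob's commitment
    for step in range(reveals_done):
        if step % 2 == 0:
            cost_open_ca -= 1  # Alice's partial reveal helps Bob
        else:
            cost_open_cb -= 1  # Bob's partial reveal helps Alice
    return cost_open_ca, cost_open_cb


# Suppose Bob walks away right after Alice's second reveal (3 reveals total).
bob_exp, alice_exp = simulate_exchange(k=80, reveals_done=3)
ratio = 2 ** (alice_exp - bob_exp)  # Alice's cost relative to Bob's
```

Whatever the stopping point, the two exponents differ by at most one, so neither side ever faces more than twice the other's work.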

The cornerstone of the protocol is the "algebraic properties": at any point, Alice and Bob must be convinced that force-opening would ultimately result in the kind of value that they expect (in the Boneh-Naor protocol, a valid signature). It does not work with just any data. In your scenario, you assume that Alice and Bob have hashes of the data element they wish to obtain. A Boneh-Naor-like protocol could be applied if you find a way to make commitments-with-ZK-proof of correspondence between the committed value and the known hash -- this does not seem easy.

Of course there's "a way ... known hash": the committer gives ordinary commitments to sufficiently many random bits, the receiver sends an equal amount of random bits, the committer sends the timed commitment using the ordinarily-committed bits xor the random bits provided by the receiver as the 'random bits' for the timed commitment protocol, and provides a ZK proof of the correctness of the timed commitment.
–
Ricky Demer, Jan 12 '12 at 23:01

The same goes for doing this with bits: you need a trusted third party to verify the authenticity and correctness of the files to be exchanged.

Cryptography can provide the means to ensure that a message is from a given source, but it cannot verify that the contents of a file satisfy the requirements of the recipient.

EDIT: How is the hash obtained? From the untrusted player? If so, they can't know it isn't junk. If they got it elsewhere, then that must be a trusted third-party that could be used to escrow...

EDIT2: The problem boils down to having both parties be satisfied with an encrypted message, without either of them being able to decrypt it before the other can decrypt theirs. This is just not possible without a third party that has the ability to verify the cleartext of both messages.

EDIT3: Disk-space must be preserved on the escrow (E) host.

A securely provides E with fA.

E verifies fA and generates a public/private key pair (pubA/privA)

E encrypts fA using privA, giving enc(fA,privA)

E generates a hash of file enc(fA,privA) and stores it

E securely gives A privA and keeps pubA secret

Now B must do the same.

B securely provides E with fB.

E verifies fB and generates a public/private key pair (pubB/privB)

E encrypts fB using privB, giving enc(fB,privB)

E generates a hash of file enc(fB,privB) and stores it

E securely gives B privB and keeps pubB secret

Once both parties have done this:

A encrypts fA with privA and gives it to B

B encrypts fB with privB and gives it to A

A verifies that the hash of encrypted file matches the hash E has for it

If it matches, A tells E it has the correct file

If not, B tried to scam us and we know it, ask for real file

B verifies that the hash of encrypted file matches the hash E has for it

If it matches, B tells E it has the correct file

If not, A tried to scam us and we know it, ask for real file

Once both parties have verified to E that they have the correct file

E gives A pubB securely, A can now decrypt enc(fB,privB)

E gives B pubA securely, B can now decrypt enc(fA,privA)

Total disk-space required on E per file exchange: 2 hashes and 4 keys.
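The flow above can be sketched in a few lines of Python. This is a simplified stand-in rather than the exact scheme: it assumes one symmetric key per file instead of a key pair (so E stores 2 hashes and 2 keys), E hands the ciphertext back to the owner instead of handing over an encryption key, and the cipher is a toy SHA-256 keystream that is not secure. `Escrow` and `keystream_xor` are made-up names.

```python
import hashlib
import secrets


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy XOR stream cipher (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ s for b, s in zip(data[offset:offset + 32], block))
    return bytes(out)


class Escrow:
    """Hypothetical escrow E: keeps one ciphertext hash and one key per party."""

    def __init__(self):
        self.records = {}       # owner -> (hash of ciphertext, key)
        self.confirmed = set()  # owners whose ciphertext the peer has verified

    def register(self, owner: str, file_bytes: bytes) -> bytes:
        """E sees the file once, then keeps only the hash and the key."""
        key = secrets.token_bytes(32)
        ciphertext = keystream_xor(file_bytes, key)
        self.records[owner] = (hashlib.sha256(ciphertext).digest(), key)
        return ciphertext

    def confirm(self, owner: str, received_ciphertext: bytes) -> bool:
        """The peer reports the ciphertext it received from `owner`."""
        expected, _ = self.records[owner]
        ok = hashlib.sha256(received_ciphertext).digest() == expected
        if ok:
            self.confirmed.add(owner)
        return ok

    def release_key(self, owner: str):
        """Hand out a decryption key only after both ciphertexts are confirmed."""
        if len(self.records) == 2 and self.confirmed == set(self.records):
            return self.records[owner][1]
        return None


escrow = Escrow()
enc_a = escrow.register("A", b"contents of fA")
enc_b = escrow.register("B", b"contents of fB")
escrow.confirm("A", enc_a)               # B received A's ciphertext intact
junk_ok = escrow.confirm("B", b"junk")   # B sent junk first; A detects it
early_key = escrow.release_key("A")      # still None: exchange incomplete
escrow.confirm("B", enc_b)               # B sends the real ciphertext
key_a = escrow.release_key("A")          # given to B so it can decrypt enc_a
```

The point of the design is that E never stores the files themselves, only a fixed amount of data per exchange.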

I skipped the crypto between A&E and B&E since that's somewhat trivial and not interesting in solving this use case.

I added as an edit that each party can verify the integrity of the file. They each have a hash of the file they are expecting to receive. Is it possible with this information to ensure they both send their files to the other party?
–
Alexandre H. Tremblay, Jan 12 '12 at 0:20

I've added an edit as well.
–
Ben S, Jan 12 '12 at 0:20

The hash will be provided by a trusted third party. But the function of the third party will be limited to providing a hash of the file. Think of a BitTorrent file. The hash is inside the torrent, but the files still have to be shared.
–
Alexandre H. Tremblay, Jan 12 '12 at 0:22

That doesn't make sense to me. In order to provide a trustworthy hash, the third party will have to have access to the file. At that point it may as well escrow the files.
–
Ben S, Jan 12 '12 at 0:32

Ben S. Escrowing the files will require a high-capacity web server with a capacity that increases proportionally to the number of clients? That would be too much for my application. Or is there another way to do it that I do not see? How would the escrow work?
–
Alexandre H. Tremblay, Jan 12 '12 at 0:54

Without a trusted third party or communication channel with unusual properties, it is provable that no such algorithm can exist. Here's the proof:

1) Any algorithm can be modeled as a sequence of messages, A to B, then B to A, alternating. If the algorithm has consecutive sends in the same direction, merge them and consider them a single message.

2) If the algorithm contains any "optional" messages, such that A and B still get their files without them, consider the algorithm without any such messages. Assume neither side sends any optional messages (since they don't have to, that's what "optional" means).

3) After the last non-optional message, A must have its file and B must have its file.

4) Before the last non-optional message, neither A nor B may have its file. If both sides have their file, the non-optional message is, by definition, optional, which is a contradiction. And, before the last non-optional message, it cannot be that exactly one side has its file, otherwise that side could opt not to send the last message, keeping its own file without giving the other side its file.

5) Thus the last message must convey information previously unknown to A and information previously unknown to B. But neither side can transmit information it does not know. Thus neither side can send the last message. Thus the protocol cannot have a last message, cannot terminate, and thus cannot exist.

However, you could do it with specially-constructed hashes. The basic idea is that each side gives the other information that reduces the other party's search space. So if either party bails very early, neither party will have enough information to make their search possible. If either party bails later, the other party will still have enough information to get their file, just with extra work.

Hi @David, I merged the duplicate cross-posted question into this one as they're identical and there were good answers there - however, you've answered both! Two answers isn't a problem, just letting you know so if you want to make any edits/amendments in light of the merge, you can.
–
user46, Jan 12 '12 at 20:56

where phi is Euler's totient function, so phi(n) = (p-1)(q-1). Set the private key d to be the multiplicative inverse of e mod phi(n), i.e.

e * d = 1 (mod phi(n))

You encode a message by

C = M^e (mod n)

and you decode a coded message by

M = C^d (mod n)
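As a quick sanity check of these formulas, here is a toy run in Python. The primes, exponent and message are tiny illustrative values chosen for this example (real keys use primes hundreds of digits long), and `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Toy parameters, assumed for illustration only.
p, q = 61, 53
n = p * q                 # public modulus
phi = (p - 1) * (q - 1)   # phi(n) = (p-1)(q-1)
e = 17                    # public exponent, coprime to phi(n)

# Private key: the multiplicative inverse of e mod phi(n).
d = pow(e, -1, phi)

M = 65                    # message, must be smaller than n
C = pow(M, e, n)          # encode: C = M^e (mod n)
decoded = pow(C, d, n)    # decode: M = C^d (mod n)
```

Here `decoded` comes back equal to `M`, which is exactly the property e * d = 1 (mod phi(n)) is there to guarantee.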

Now take Alice and Bob as the untrustworthies. Have Alice pick a system (n_A, e_A, d_A) and Bob pick a system (n_B, e_B, d_B). Suppose Alice wants to send Bob a message M. Alice does this by sending Bob two encoded messages:

From your link, I get that RSA is used to prove the identity of the users. This is not what I am trying to solve. I want to exchange two files between people in a way that neither party can run off with one file. Does one party have to share first, or can they be forced (through an algorithm) to share at the same time?
–
Alexandre H. Tremblay, Jan 12 '12 at 0:09

That doesn't solve the problem...
–
Ben S, Jan 12 '12 at 0:09

@AlexandreH.Tremblay It does. I'll post an example in a bit.
–
PengOne, Jan 12 '12 at 0:10

This is a typical RSA-based solution, but it won't work. Let us assume they exchange messages simultaneously. What if Alice sent the correct file, and Bob sent a junk file? What can Alice do about it? Nothing!
–
ElKamina, Jan 12 '12 at 0:49

There is no such thing as 'simultaneous' or 'at the same time' in distributed computing. There is not even a 100% correct method to let 2 remote computers agree on atomic time. What you want is that both files become accessible at the same moment (keys are sent simultaneously). If one party finishes a fraction of a second earlier they can back off.

You could try to reduce the period in which they can back off by using some sort of handshaking method. Handshaking is used in telecom to negotiate parameters. You could figure something out to agree on the key exchange, or on the time or interval at which it will take place.

Usually a combination of public-private keys (RSA) works. But your constraints are so strict, I don't think any solution would work. At the end of the day A can send any junk file instead of fA and there is no way B can know it. If it is somehow possible for B to know that what A has sent is in fact fA, then I can probably come up with something.

EDIT: I take that back. Even if you can verify the authenticity of the file, it is probably not possible to solve your problem.

If you synchronously exchange keys:

Let us assume you exchange keys simultaneously at an instant. Even then, since you can take back the key you sent, it is not possible to assure no fraud at each end.

If it is not possible to exchange keys simultaneously, ..... then forgetaboutit!

Each party has a hash of the file they are supposed to get. I added it as an edit to the question. Is this enough information about the files?
–
Alexandre H. Tremblay, Jan 12 '12 at 0:15

It is enough information, but that also implies you're trusting the opponent to give you a hash or have a third-party that could be used as an escrow as well as a hash-provider.
–
Ben S, Jan 12 '12 at 0:20

Think BitTorrent. The hash is inside the torrent, but the parties still have to share the files. So yes, a third party will provide the hash.
–
Alexandre H. Tremblay, Jan 12 '12 at 0:21

In BitTorrent, the initial seeder generates the hash, which is therefore potentially forged and not trustworthy.
–
Ben S, Jan 12 '12 at 0:22

The framework of the problem is that you have a hash and you want the data that yields that hash. It doesn't matter whether you trust the hash or not.
–
David Schwartz, Jan 12 '12 at 5:37

Here's my two cents. Since you've already said that a trusted third party is giving them the hashes, here's what to do -
team A has file fA and hash of file fB denoted as h(fB)
and team B has file fB and hash h(fA).

Both teams also get hashes of their own files, so team A has h(fA) and team B has h(fB) as well. The hash is strong enough that it would take them many years to crack (2048-bit cipher or something).

Each team sends the other a sum of their hashes. So team A sends h(fA)+h(fB) and team B sends h(fB)+h(fA).

Now both teams have the sum of the hashes but not the real files. If the sums of the hashes match, both teams go to the third party and ask for the de-hashing algo. Since both teams agree on giving out the algo, the third party does so and each team gets what they want.