bsy's Explanation of Zero Knowledge Proofs

Abstractly, a zero-knowledge proof is an interactive proof with a
prover and a verifier, where the prover convinces the verifier of a
statement (with high probability) without revealing any information
about how to go about proving that statement. Hopefully the following
example will make it all clear.

First, our assumptions. We're going to do arithmetic modulo
n, where n = pq, and p and
q are primes. Factoring n is assumed to be
intractable.

Rabin showed in [RabinFunc] that finding square roots modulo
n is equivalent to factoring n. That is, if
you have an algorithm that can find a square root of a number modulo
n, then you can use that algorithm to factor
n. Our zero-knowledge proof will consist of rounds of
interaction that show that the prover knows a square root of a
published number, without revealing any new information about the
square root. It is public knowledge that this number has a square
root, i.e., that it is a quadratic residue. The
factors of the modulus n may be entirely secret. (U.
Feige et al. show a refinement in [FFS] which allows the published
number to be a non-quadratic residue of a particular form as well,
thus revealing less information; in either case, runs of the protocol
themselves reveal no new information.)
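To make the link between square roots and factoring concrete, here is a
minimal Python sketch of the standard gcd step: two square roots of the
same value that are not equal up to sign yield a factor of n. The
function name factor_from_roots and the toy modulus 77 are assumptions
chosen for illustration; they do not come from [RabinFunc].

```python
from math import gcd

def factor_from_roots(n, a, b):
    """Given a^2 = b^2 (mod n), try to split n using gcd(a - b, n).

    Returns a nontrivial factor of n, or None when a = +-b (mod n),
    in which case the two roots reveal nothing about the factors.
    """
    assert (a * a - b * b) % n == 0, "a and b must square to the same value"
    d = gcd((a - b) % n, n)
    if d in (1, n):
        return None
    return d

# Toy modulus, far too small to be secure: n = 7 * 11.
n = 77
# 9^2 = 81 = 4 (mod 77) and 2^2 = 4, so 9 and 2 are both roots of 4.
factor = factor_from_roots(n, 9, 2)
print(factor, n // factor)
```

With a square-root algorithm in hand, one would square a random a, ask
the algorithm for a root b of the result, and repeat until b differs
from a by more than a sign.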

The prover, P, publishes the quadratic residue v
for which P claims to know a root s.

When P wishes to prove its knowledge of s to the
verifier, V, P runs several rounds of interaction. In
each round, P chooses a new random number r and
sends x = r^2 mod n to V. Now, V chooses a
random bit b, and sends it to P. P replies
with y = r s^b mod n. To verify P's claim, V
computes y^2 mod n and compares it with x v^b mod n. The two
agree for an honest P because v = s^2 mod n, so
y^2 = r^2 s^(2b) = x v^b.
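One round of the exchange can be sketched in Python as follows. This is
an illustration under stated assumptions, not a secure implementation:
the primes 1009 and 1013, the helper random_unit, and the function
run_round are all hypothetical names and values chosen here, and a real
modulus would be hundreds of digits long.

```python
import secrets
from math import gcd

# Toy parameters, assumed for illustration only.
p, q = 1009, 1013
n = p * q

def random_unit(n):
    """Pick a random value in [2, n-1] that is coprime to n."""
    while True:
        r = secrets.randbelow(n - 2) + 2
        if gcd(r, n) == 1:
            return r

s = random_unit(n)   # P's secret square root
v = pow(s, 2, n)     # the published quadratic residue v = s^2 mod n

def run_round(n, v, s):
    """Run one honest round and return True if V accepts."""
    r = random_unit(n)            # P: fresh random r for this round
    x = pow(r, 2, n)              # P -> V: x = r^2 mod n
    b = secrets.randbelow(2)      # V -> P: random challenge bit
    y = (r * pow(s, b, n)) % n    # P -> V: y = r * s^b mod n
    # V checks y^2 == x * v^b (mod n).
    return pow(y, 2, n) == (x * pow(v, b, n)) % n

accepted = all(run_round(n, v, s) for _ in range(20))
print(accepted)
```

Both parties are collapsed into one function for brevity; in a real
deployment the three messages x, b, and y would cross a network, and P
must never reuse r across rounds.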

Now, let's do the analysis. The first claim is that only P can
successfully complete the protocol for both possible values of
b. This is clear, since knowing y_0 =
r when b = 0 and y_1 = r s
when b = 1 means you also know s, since
y_1/y_0 = s. The second claim is
that an imposter P' who does not actually know s
can succeed with a probability of exactly 1/2 each round:
to see this, notice that if P' guesses correctly that b =
0, then it can just follow the protocol and succeed; on the
other hand, if P' guesses that b = 1, P'
can generate x by choosing a random number t
and setting x = t^2 / v. The response is
y = t, which passes the check because y^2 = t^2 = x v.
Since P' must send x before seeing b, a wrong
guess leaves it unable to answer. The third claim is that no new
information is
released. To see this, consider what an eavesdropper E hears.
In the case of the random bit b = 0, E sees a
random number r and its square x; in the case
of b = 1, E sees the numbers rs and
x = (rs)^2/v. These are numbers that the
eavesdropper could have generated in a closet. More precisely, a
simulator S can run both sides of the protocol, and by using
advance knowledge of the value of the random bit (the model is a
poly-time TM with an auxiliary input tape of random bits), S
can simulate the protocol without knowledge of s.
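The second and third claims can be illustrated with one Python sketch,
since the imposter's trick and the simulator's trick are the same
computation. All names below (forge_commitment, verify, imposter_round,
simulate_transcript) and the toy modulus are assumptions made for this
example. Note that no code path below uses a square root of v.

```python
import secrets
from math import gcd

n = 1009 * 1013          # toy modulus, assumed for illustration
v = pow(123456, 2, n)    # public residue; its root is not used below

def random_unit(n):
    """Pick a random value in [2, n-1] that is coprime to n."""
    while True:
        t = secrets.randbelow(n - 2) + 2
        if gcd(t, n) == 1:
            return t

def forge_commitment(n, v, guess):
    """Build (x, y) that verifies if the challenge bit equals guess."""
    t = random_unit(n)
    if guess == 0:
        return pow(t, 2, n), t            # x = t^2, y = t
    v_inv = pow(v, -1, n)                 # modular inverse, Python 3.8+
    return (pow(t, 2, n) * v_inv) % n, t  # x = t^2 / v, y = t

def verify(n, v, x, b, y):
    """V's check: y^2 == x * v^b (mod n)."""
    return pow(y, 2, n) == (x * pow(v, b, n)) % n

def imposter_round(n, v):
    """P' commits to x from a guess, then faces V's real challenge."""
    guess = secrets.randbelow(2)
    x, y = forge_commitment(n, v, guess)  # x is fixed before b is known
    b = secrets.randbelow(2)
    return verify(n, v, x, b, y)

def simulate_transcript(n, v):
    """S picks b first, so every simulated transcript verifies."""
    b = secrets.randbelow(2)
    x, y = forge_commitment(n, v, b)
    return x, b, y

wins = sum(imposter_round(n, v) for _ in range(2000))
print(wins, "of 2000 imposter rounds accepted")
```

The imposter's count should hover around half of the rounds, while the
simulator, which is allowed to choose b before x, produces an accepting
transcript every time.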

Each round of the proof leaves a 1/2 chance
that a prover P'' (we don't know whether P'' is P
or P') might not actually know s. Iterating
20 times reduces that chance to at most
2^-20, or about 0.00000095, that
P'' does not actually know s.
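The arithmetic is easy to check exactly. The following Python sketch
assumes independent rounds, each passed by an imposter with probability
1/2; the function name cheat_probability is made up for this example.

```python
from fractions import Fraction

def cheat_probability(rounds):
    """Chance that an imposter passes every one of `rounds` rounds,
    assuming each round is passed independently with probability 1/2."""
    return Fraction(1, 2) ** rounds

prob = cheat_probability(20)
print(prob)          # exact value as a fraction
print(float(prob))   # the same value as a decimal
```

Each extra round halves the bound, so the number of rounds can be chosen
to match whatever error probability the application tolerates.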

Such zero-knowledge proofs can be used for authentication -- the value
of v can be generated from a randomly chosen
s, and v is widely published as the public
authentication ``puzzle''. A successful
zero-knowledge proof showing knowledge of s authenticates
identity. In [StrongboxIn25th], Doug Tygar and I show how to obtain
superexponential scaling in security modulo the factorization
assumption, run the protocol in constant rounds while retaining the
zero-knowledge property, and simultaneously perform key exchange.

Note that using only a zero-knowledge proof of identity over a
communication channel such as that provided by TCP/IP does not suffice
to provide a secure communication channel: an attacker who has access
to a link in the Internet through which all your packets pass may
wait until the authentication protocol completes, and then hijack
your connection. Furthermore, doing the obvious protocol
piggy-backing to run zero-knowledge authentication at the same time
as, say, anonymous Diffie-Hellman key exchange is subject to message
tampering -- an attacker may substitute her/his own Diffie-Hellman
values in your packets, using your zero-knowledge authentication as a
subroutine. Care must be taken when composing two
cryptographic protocols to ensure that the needed security properties
are retained.