Safe Short Hash Scheme

When you exchange email addresses with someone and you wish to communicate with that person securely, you
also need to exchange key fingerprints. Before you encrypt or sign the first message to or from the other
person, you additionally need to look up the key and compare fingerprints. Nobody wants to check 256bit
fingerprint values against printed ones. And it is even worse if you do not have business cards ready or
you want to exchange email addresses over the phone - you don't want to dictate or write a 256bit hash.

As a first layer of security that prevents unprivileged outsiders from feeding you bad keys when you do a
key lookup by email address, the responsibility for looking up keys by email address needs to be moved over
to mailservers. This could e.g. be implemented by letting the mailserver serve users' keys from
https://[domain]/s2h_user_keys/[email address], thus ensuring that, if it is known that the
mailserver supports this key lookup method, only the mailserver and rogue CAs are able to feed bad keys to
clients.

Now you could get rid of birthday attacks by using this instead of a normal key fingerprint:
$$ fingerprint = \operatorname{hashfn}(email\_address \| seperator \| keydata) $$
Because the birthday paradox doesn't matter anymore, now you only need 128 bits.

Classical fingerprints have two important properties:

They are not secret, meaning that you can just give the same fingerprint to multiple
people, you can say the fingerprint over the phone and so on.

They can be used non-interactively, meaning that Alice can give her fingerprint to Bob and
Bob can then use that fingerprint to verify Alice's key without needing to speak to Alice again.

In the scheme I'm proposing, you use an authenticator instead of a fingerprint. That
authenticator can only have one of these two properties: Either it has to be secret or it has to be
used in an interactive verification process. However, the authenticator is only 22 bits
long.

An authenticator consists of two parts, a 9-bit hash and a 13-bit salt. Every time you
want to give your email address to someone, you have to generate a new authenticator as follows:

A MUA can now, given the mail address, the key and the authenticator, verify the correctness of the hash
for the salt contained in the authenticator. The remaining question is how much security this provides
against an attacker. It is clearly impossible for an attacker to spoof an identity with full reliability
because doing so would require bruteforcing a total of $ 2^{13} * 9 = 73728 $ independent bits, but it is
also clear that with a lot of luck, the attacker only needs to get one of the possible authenticators
right, requiring only 9 bits of bruteforcing.

In the following section, a lower bound for the number of hash function iterations required to perform
a successful attack in 1% of all cases is calculated. $S$ is the size of the salt in bits, $H$ is the size
of the hash in bits, $w$ is the wanted success probability.
The number of possible salts is $$ n = 2^S = 2^{13} = 8192 $$. Of those, the attacker needs to get
$$ k_{min} = \left\lfloor n * w \right\rfloor = \left\lfloor 8192 * 0.01 \right\rfloor = 81 $$ right.
For every salt, the chance of the corresponding hash being correct is
$$p = \frac{1}{2^H} = \frac{1}{2^9} = \frac{1}{512}$$.
The chance of a random fingerprint having enough correct authenticators is the binomial cumulative
probability
$$ q = \sum_{k=k_{min}}^{n} \binom{n}{k} * p^k * (1-p)^{n-k}
= \sum_{k=81}^{8192} \binom{8192}{k} * \left(\frac{1}{512}\right)^k * \left(1-\frac{1}{512}\right)^{8192-k} = 6.37638 \times 10^{-31} $$.
This is equivalent to a security level of
$$B = \left\lfloor -\log_2(q) \right\rfloor = \left\lfloor -\log_2(6.37638 \times 10^{-31}) \right\rfloor = 100 $$ bits.

However, the total security level should ideally be 128 bits. This can be accomplished by demanding that
$\operatorname{hashfn}(fingerprint \| x)$, where $x$ is a value that can be chosen freely by the party
generating the key and needs to be stored together with the
key, starts with $128-100=28$ bits that are zero, causing the generation of one new fingerprint to
require $2^{28}$ hash operations on average. Because $fingerprint$ contains the email address, these
fingerprints can not be reused for attacks against another user. This construct is essentially a
proof-of-work mechanism and does introduce a nontrivial computational cost on key generation, but
the cost is tolerable - according to
the bitcoin
mining hardware comparison for non-specialized hardware, generating $2*390000=780000$ SHA256
hashes takes one second on a Pentium 3 processor, meaning that on average,
$$\frac{2^{28}}{780000} = 344$$ seconds are required to generate a valid $x$. Modern CPUs like Intel's
Core i7 3930k can generate $2*66600000 = 133200000$ SHA256 hashes per second and therefore only need
$$\frac{2^{28}}{133200000} = 2$$ seconds to generate a valid $x$.

The security of this scheme depends on the attacker not knowing which salt the user will use to check a key.
As described above, this can be archieved by either transmitting the authenticator over a secret
channel - for example, by writing it on a piece of paper and giving it to someone - or by transmitting the
authenticator after the key to check has been delivered to the communication partner, which is e.g. suitable
for use over the telephone.

A very powerful attacker might be able to perform successful attacks in up to 1% of all cases. Therefore, it
is important that all clients alert the user whenever an authenticator can not be verified.

If you can print business cards with different email addresses, maybe you could even put the authenticator
into the email address in Base32,
marked with two plus signs or so and with a 8-bit checksum, like this: user++SEK3S3@example.org.
Of course, the business
cards would then have to be kept secret, and it would require mail clients to recognize the syntax.

A nice side effect is that, as long as an attacker doesn't know whether you actually have an authenticator
or not, substituting the server's response would be dangerous for him. Therefore, even if a user does not
supply an authenticator string to his mailclient, the client can retrieve a somewhat trustworthy key from
the mailserver.

The code used for the calculations, which is horribly slow, but should be very precise, is
here.