Date: Mon, 31 Dec 2012 10:08:28 -0500
From: Matt Weir <cweir@...edu>
To: crypt-dev@...ts.openwall.com
Subject: Re: Intentionally Increasing Collisions in Password
Hashing Algorithms
In previous e-mails I went over how password hash collisions might
work for a very low value site. Since I was starting out by
investigating an edge case to see if hash collisions even had a chance
of helping, I wasn't that concerned with modeling a more realistic
level of acceptable risk vs. online attacks. To put it another way, I
agree with Steven that a 15 bit hash might create unacceptable risk,
and I had still used a *14 bit* hash in my examples.
Rather than trust my gut when figuring out a better model of
acceptable risk for online attacks, I figured I should just consult
NIST’s newest version of their SP 800-63-1 Electronic Authentication
Guideline. At least in the United States this is pretty much the gold
standard when developing password creation policies. You can find a
copy of it here:
http://csrc.nist.gov/publications/nistpubs/800-63-1/SP-800-63-1.pdf
You can imagine my surprise then when I looked up the requirements for
a Level 1, (the lowest level), security site as seen in Table 6. Two
things stand out:
“The Verifier shall implement a throttling mechanism that effectively
limits the number of failed authentication attempts an Attacker can
make on the Subscriber’s account to 100 or fewer in any 30-day
period.”
“The secret provides at least 14 bits of entropy.”
This means my initial assumptions about risk made in the previous
examples matched NIST’s recommendations exactly for a level 1 site. I
wish I could say that was planned, but I guess I was just channeling
my inner NIST writer. Now there are still a lot of things to debate.
For example hash collisions can hurt a user who voluntarily chooses a
strong password even if the site allows them to pick a weak password.
Aka the site might have a high tolerance for risk but the user might
not. Then there’s all sorts of other assumptions, (for example how
many sites really enforce a 100 attempt lockout policy?) But at least
we have some standardized models of acceptable risk to work off of.
The obvious next question then is: How do password hash collisions
perform for a Level 2 site? Referring back to Table 6 in the NIST
document again:
“The secret provides at least 20 bits of entropy.”
That makes the next step easy. All I had to do was change the rate of
expected collisions from 1 for every 16,384 guesses to 1 for every
1,048,576 guesses. Plugging that number back into the Excel workbook I
had posted earlier, I was able to calculate the number of expected
collisions per password hash for the following hashes:
Hash Type: Vbulletin
Cracking Session Length: 5 seconds
Percentage Cracked: 48%
Expected Number of Collisions Per Hash: 9,925
Hash Type: Phpass Portable Hash
Cracking Session Length: 5 minutes
Percentage Cracked: 36.9%
Expected Number of Collisions Per Hash: 1,083
Hash Type: BCrypt
Cracking Session Length: 5 minutes
Percentage Cracked: 9.3%
Expected Number of Collisions Per Hash: 2
So what about higher security sites? Well there is an option for Level
2 sites where there is no account lockout, (aka the attacker can make
millions of guesses if they want). In that case a secret with at least
64 bits of entropy is required. With a 64 bit hash the expected number
of collisions with even a fast hashing algorithm like Vbulletin will
be close to 0. Following NIST guidance, Level 3 and Level 4 sites also
require the use of a secret, (in all instances), that has at least 64
bits of entropy. It should go without saying, but when talking about
64 bits of entropy NIST is no longer referring to passwords users
select. Now we’re in the realm of authentication devices like RSA’s
SecurID, or randomly generated passwords that are assigned to the
users, (and then almost certainly written down).
I’m still struggling with the question of what impact this will have
on an attacker. It certainly seems to me that an expected 1k
collisions per hash would significantly increase the amount of work to
target password re-use across sites. With a strong hash like BCrypt
though, would 2 expected collisions impact the attacker that much? I
don’t know.
One way to look at it is that currently an attacker only has to verify
hashes they cracked. If they had stolen one-thousand BCrypt hashes,
they would normally need to only try 93 base passwords against other
sites, (assuming they only cracked around 9.3% of the password hashes
using the previously detailed cracking strategy). With collisions they
would instead have to try 2,093 base passwords, (2k collisions plus 93
that were the real passwords). It would actually be slightly higher
than that, since there’s no guarantee that all the real passwords the
attacker cracked were reused at the site they were targeting. If the
user picked a different password, the attacker would then go back to
submitting collisions vs. stopping if the user had picked the same
password. Now is 2k guesses a significant impact compared to 100
guesses? Once again I don’t know.
Another impact might be that a determined attacker performing a
targeted attack will have a lot more uncertainty when using cracked
passwords to attack other sites. Aka an attacker might be willing to
spend months cracking a single BCrypt hash in the hope that they can
use that password to log into a very high value account/site they
don’t have access to. Normally they would spend those months cracking
the hash, and if they didn't crack it, well at least they would know
where they stood. If they did crack it though, they could then focus
their efforts on trying that password plus assorted mangled versions
of that password against the high value site. With collisions, even at
two every five minutes, an attacker doesn't know if they really
cracked the password until they try a guess that lets them log into
the other site. They don’t know if their authentication attempts are
failing because the user choose a different password or if they simply
are trying a collision. In such a scenario, they now have to balance
how much mangling they perform against each base password vs. trying a
different base password. They also have a higher risk alerting the
target or locking the account.
So despite the fact that there’s still a lot of uncertainties on my
part about how much impact password hash collisions will have on real
life attackers, the more I look at this the more optimistic I’m
becoming that this might actually be a good idea in some
circumstances.
Matt