Password Best Practices I - The Generator

This is the first in a series of posts about password best practices. The series will cover best practices from a few different angles- the generator targeted at developers creating those generators, the end user (you, mom, dad, etc.) as you select passwords for accounts from those generators, and the service provider storing passwords in the database for accounts that your users are signing up for.

Motivation

When end users are looking for passwords, they may turn to password generators, whether they be browser extensions, websites, or offline installable executables. Regardless, as a developer, you will need to ensure that the passwords you your providing for your users are secure. Unfortunately, that's a bit of a buzzword, and can be highly subjective. So, we'll motivate what it means to be "secure" here:

The generator uses a cryptographically secure random number generator.

The generator provides at least 70-bits of entropy behind the password.

The generator is open source.

The generator generates passwords client-side, not server-side.

The generator does not serve any ads or client-side tracking software.

I think most of us can agree on these points- the software should be downloaded over HTTPS to mitigate man-in-the-middle attacks. A cryptographically secure RNG should be used to ensure unpredictability in the generated password. In addition to that, the CRNG should also be uniformly distributed across the set, so no elements of the password are more likely to appear than any other. Creating an open source password generator ensures that the software can be audited for correctness and instills trust in the application. Generating passwords client-side, means the server hos now possible way of knowing what passwords were generated, unless the client is also calling home (the code should be inspected). And of course, we don't want any adware or malware installed in the password generating application to further compromise the security of the generator.

Okay. That's all well and good, but what about this claim to generate passwords from at least 70-bits in entropy? Let's dig into that.

Brute force password cracking

Password cracking is all about reducing possibilities. Professional password crackers will have access to extensive word lists of previously compromised password databases, they'll have access to a great amount of hardware to rip through the password space, and they'll employ clever tricks in the password cracking software, such as Hashcat or MDXFind, to further reduce the search space, to make finding the passwords more likely. In practice, 90% of leaked hashed password databases are reversed trivially. With the remaining 10%, half of that space takes some time to find, but those passwords are usually recovered. The remaining few, maybe 3%-5%, contain enough entropy that the password cracking team likely won't recover those passwords in a week, or a month, or even a year.

So the question is this- what is that minimum entropy value that thwarts password crackers? To answer this question, let's look at some real-life brute force searching to see if we can get a good handle on the absolute minimum security margin necessary to keep your client's leaked password hash out of reach.

Bitcoin mining

Bitcoin mining is the modern-day version of the 1849 California Gold Rush. As of right now, Bitcoin is trading at $3,665.17 per BTC. As such, people are fighting over each other to get in on the action, purchasing specialized mining hardware, called "Bitcoin ASICs", to find those Bitcoins as quickly as possible. These ASICs are hashing blocks of data with SHA-256, and checking a specific difficulty criteria to see if it meets the requirements as a valid Bitcoin block. If so, the miner that found that block is rewarded that Bitcoin and it's recorded in the never-ending, ever-expanding, non-scalable blockchain.

How many SHA-256 hashes is the word at large calculating? As of this writing, the current rate is 7,751,843.02 TH/s, which is 7,751,843,020,000,000,000 SHA-256 hashes per second. At one point, it peaked at 8,715,000 THps, and there is no doubt in my mind that it will pass 10,000,000 THps before the end of the year. So let's run with that value, of 10,000,000,000,000,000,000 SHA-256 hashes per second, or 1019 SHA-256 hashes per second.

If we're going to talk about that in terms of bits, we need to convert it to a base-2 number, rather than base-10. Thankfully, this is easy enough. All we need to calculate is the log2(X) = log(X)/log(2). Doing some math, we see that Bitcoin mining is roughly flipping every combination of bits in a:

63-bit number every second.

69-bit number every minute.

74-bit number every hour.

79-bit number every day.

84-bit number every month.

88-bit number every year.

What does this look like? Well, the line is nearly flat. Here in this image, the x-axis is the number of days spent mining for Bitcoin, starting from 0 through a full year of 365 days. The y-axis is the search space exhaustion in bits. So, you can see that in roughly 45 days, Bitcoin mining have calculated enough SHA-256 hashes to completely exhaust an 85-bit search space (click to enlarge):

Real-world password cracking

That's all fine and dandy, but I doubt professional password crackers have access to that sort of hardware. Instead, let's look at a more realistic example.

In the article, it is mentioned that they had a peak password cracking speed of 180 GHps, or 180,000,000,000 SHA-1 hashes per second, or 18*1010 SHA-1 hashes per second. The article mentions that's the equivalent of 25 NVidia GTX1080 GPUs working in concert. To compare this to Bitcoin mining, the team was flipping every combination of bits in a:

41-bit number every second.

47-bit number every minute.

53-bit number every hour.

58-bit number every day.

63-bit number every month.

66-bit number every year.

As we can see, this is a far cry from the strength of Bitcoin mining. But, are those numbers larger than you expected? Let's see how it looks on the graph, compared to Bitcoin (click to enlarge):

So, it seems clear that our security margin is somewhere above that line. Let's look at one more example, a theoretical one.

Theoretical password cracking by Edward Snowden

"Please confirm that no one has ever had a copy of your private key and that it uses a strong passphrase. Assume your adversary is capable of one trillion guesses per second. If the device you store the private key and enter your passphrase on has been hacked, it is trivial to decrypt our communications."

But one trillion guesses per second is only about 5x the collective power of our previous example of a small team of password cracking hobbyists. That's only about 125 NVidia GTX1080 GPUs. Certainly interested adversaries would have more money on hand to invest in more computing power than that. So, let's increase the rate to 10 trillion guesses per second. 1,250 NVidia GTX1080 GPUs would cost our adversary maybe $500,000. A serious investment, but possibly justifiable, and certainly not outside the $10 billion annual budget of the NSA. So let's roll with it.

At 1013 password hashes per second, we are flipping every combination of bits in a:

43-bits every second.

49-bits every minute.

54-bits every hour.

59-bits every day.

64-bits every month.

68-bits every year.

Plotting this on our chart with both Bitcoin mining and clustered hobbyist password cracking, we see (click to enlarge):

The takeaway

What does all this math imply? That as a developer of password generator software, you should be targeting a minimum of 70-bits of entropy with your password generator. This will give your users the necessary security margins to steer clear of well-funded adversaries, should some service provider's password database get leaked to the Internet, and they find themselves as a target.

As a general rule of thumb, for password generator developers, these are the sort of security margins your can expect with entropy:

70-bits or more: Very secure.

65-69 bits: Moderately secure.

60-64 bits: Weakly secure.

59 bits or less: Not secure.

What does this mean for your generator then? This means that the number of size of the password or passphrase that you are giving users should be at least:

Base-94: 70/log2(94)=11 characters

Base-64: 70/log2(64)=12 characters

Base-32: 70/log2(32)=14 characters

Base-16: 70/log2(16)=18 characters

Base-10: 70/log2(10)=22 characters

Diceware: 70/log2(7776)=6 words

Now, there is certainly nothing wrong with generating 80-bit, 90-bit, or even 128-bit entropy. The only thing you should consider with this, is the size of the resulting password and passphrases. For example, if you were providing a minimum of 128-bit security for your users with the password generator, then things would look like:

Base-94: 128/log2(94)=20 characters

Base-64: 128/log2(64)=22 characters

Base-32: 128/log2(32)=26 characters

Base-16: 128/log2(16)=32 characters

Base-10: 128/log2(10)=39 characters

Diceware: 128/log2(7776)=10 words

As you can see, as you increase the security for your users, the size of the generated passwords and passphrases will also increase.

Conclusion

It's critical that we are doing right by our users when it comes to security. I know Randall Munroe of XKCD fame created the "correct horse battery staple" comic, advising everyone to create 4-word passphrases. This is fine, provided that those 4 words meets that minimum 70-bits of entropy. In order for that to happen though, the word list needs to be:

You would need a word list of at least 185,364 words to provide at least 17.5-bits of entropy per word, which brings us to required 70-bits of total entropy for 4 words. All too often, I see generators providing four words, but the word list is far too small, like around Diceware size, which is only around 51-bits of entropy. As we just concluded, that's not providing the necessary security for our users.

So, developers, when creating password and passphrase generators, make sure they are at least targeting the necessary 70-bits of entropy, in addition to the other qualifications that we outlined at the beginning of this post.