Do Not Use sha256crypt / sha512crypt - They're Dangerous

Introduction

I'd like to demonstrate why I think using sha256crypt or sha512crypt on current GNU/Linux operating systems is dangerous, and why I think the developers of GLIBC should move to scrypt or Argon2, or at least bcrypt or PBKDF2.

History and md5crypt

In 1994, Poul-Henning Kamp (PHK) added md5crypt to FreeBSD to address the weaknesses of DES-crypt that was common on the Unix and BSD systems of the early 1990s. DES-Crypt has a core flaw in that, not only DES reversible (which necessarily isn't a problem here), and incredibly fast, but it also limited password length to 8 characters (each of those limited to 7-bit ASCII to create a 56-bit DES key). When PHK created md5crypt, one of the things he made sure to implement as a feature was to support arbitrary-length passwords. In other words, unlike DES-Crypt, a user could have passwords greater than 9 or more characters.

This was "good enough" for 1994, but it had an interesting feature that I don't think PHK thought of at the time- md5crypt execution time is dependent on password length. To prove this, I wrote a simple Python script using passlib to hash passwords with md5crypt. I started with a single "a" character as my password, then increased the password length by appending more "a"s up until the password was 4,096 "a"s total.

Nothing fancy. Start the timer, hash one "a" with md5crypt, stop the timer, and record the results. Start the timer, hash two "a"s with md5crypt, stop the timer, and record the results. Wash, rinse, repeat, until the password is 4,096 "a"s in length.

What do the timing results look like? Below are scatter plots of timing md5crypt for passwords of 1-128, 1-512, and 1-4,096 characters in length:

md5crypt 1-128 characters

md5crypt 1-512 characters

md5crypt 1-4,096 characters

At first, you wouldn't think this is a big deal; in fact, you may even think you LIKE it (we're supposed to make things get slower, right? That's a good thing, right???). But, upon deeper inspection, this actually is a flaw in the algorithm's design for two reasons:

Long passwords can create a denial-of-service on the CPU (larger concern).

Now, to be fair, predicting password length based on execution time is ... meh. Let's be honest, the bulk of passwords will be between 7-10 characters. And because these algorithms operate in block sizes of 16, 32, or 64 bytes, an adversary learning "AHA! I know your password is between 1-16 characters" really isn't saying much. But, should this even exist in a cryptographic primitive? Probably not. Still, the larger concern would be users creating a DoS on the CPU, strictly by changing password length.

I know what you're thinking- it's 2018, so there should be no reason why any practical length password cannot be adequately hashed with md5crypt insanely quickly, and you're right. Except, md5crypt was invented in 1994, 24 years ago. According to PHK, he designed it to take about 36 milliseconds on the hardware he was testing, which would mean a speed about 28 per second. So, it doesn't take much to see that by increasing the password's length, you can increase execution time enough to affect a busy authentication server.

The first thing addressed, was to make the cost configurable, so as hardware improved, you could increase the iteration count, thus keeping the cost for calculating the final hash expensive for password crackers. However, he also made a couple core changes to his design that differed from md5crypt, which ended up having some rather drastic effects on its execution.

Using code similar to above with Python's passlib, but rather using the sha256_crypt() and sha512_crypt() functions, we can create scatter plots of sha256crypt and sha512crypt for passwords up to 128-characters, 512-characters, and 4,096-characters total, just like we did weth md5crypt. How do they fall out? Take a look:

sha256crypt 1-128 characters

sha256crypt 1-512 characters

sha256crypt 1-4,096 characters

sha512crypt 1-128 characters

sha512crypt 1-512 characters

sha512crypt 1-4,096 characters

Curious. Not only do we see the same increasing execution time based on password length, but unlike md5crypt, that growth is polynomial. The changes Ulrich Drepper made from md5crypt are subtle, but critical. Essentially, not only do we process the hash for every character in the password per round, like md5crypt, but we process every character in the password three more times. First, we take the binary representation of each bit in the password length, and update the hash based on if we see a "1" or a "0". Second, for every character in the password, we update the hash. Finally, again, for every character in the password, we update the hash.

For those familiar with big-O notation, we end up with an execution run time of O(pw_length2 + pw_length*iterations). Now, while it is true that we want our password hashing functions to be slow, we also want the iterative cost to be the driving factor in that decision, but that isn't the case with md5crypt, and it's not the case with sha256crypt nor sha512crypt. In all three cases, the password length is the driving factor in the execution time, not the iteration count.

Again, why is this a problem? To remind you:

Long passwords can create a denial-of-service on the CPU (larger concern).

Now, granted, in practice, people aren't carrying around 4 kilobyte passwords. If you are a web service provider, you probably don't want people uploading 5 gigabyte "passwords" to your service, creating a network denial of service. So you would probably be interested in creating an adequate password maximum, such as what NIST recommends at 128 characters, to prevent that from occurring. However, if you have an adequate iterative cost (such as say, 640,000 rounds), then even moderately large passwords from staff, where such limits may not be imposed, could create a CPU denial of service on busy authentication servers.

As with md5crypt, we don't want this.

Now, here's what I find odd about Ulrich Drepper, and his design. In his post, he says about his specification (emphasis mine):

Well, there is a problem. I can already hear everybody complaining that I suffer from the NIH syndrome but this is not the reason. The same people who object to MD5 make their decisions on what to use also based on NIST guidelines. And Blowfish is not on the lists of the NIST. Therefore bcrypt() does not solve the problem.

What is on the list is AES and the various SHA hash functions. Both are viable options. The AES variant can be based upon bcrypt(), the SHA variant could be based on the MD5 variant currently implemented.

Since I had to solve the problem and I consider both solutions equally secure I went with the one which involves less code. The solution we use is based on SHA. More precisely, on SHA-256 and SHA-512.

PBKDF2 was standardized as an IETF standard in September 2000, a full 7 years before Ulrich Drepper created his password hashing functions. While PBKDF2 as a whole would not be blessed by NIST until 3 years later, in December 2010 in SP 800-132, PBKDF2 can be based on functions that, as he mentioned, were already in the NIST standards. So, just like his special design that is based on SHA-2, PBKDF2 can be based on SHA-2. Where he said "I went with the one which involves less code", he should have gone with PBKDF2, as code had already long since existed in all sorts of cryptographic software, including OpenSSL.

This seems to be a very clear case of NIH syndrome. Sure, I understand not wanting to go with bcrypt, as it's not part of the NIST standards . But don't roll your own crypto either, when algorithms already exist for this very purpose, that ARE based on designs that are part of NIST.

So, how does PBKDF2-HMAC-SHA512 perform? Using similar Python code with the passlib password hashing library, it was trivial to put together:

PBKDF2-HMAC-SHA512 1-128 characters

PBKDF2-HMAC-SHA512 1-512 characters

PBKDF2-HMAC-SHA512 1-4,096 characters

What this clearly demonstrates, is that the only factor driving execution time, is the number of iterations you apply to the password, before delivering the final password hash. This is what you want to achieve, not giving the opportunity for a user to create a denial-of-service based on password length, nor an adversary learn the length of the user's password based on execution time.

This is the sort of details that a cryptographer or cryptography expert would pay attention to, as opposed to an end-developer.

It's worth pointing out that PBKDF2-HMAC-SHA512 is the default password hashing function for Mac OS X, with a variable cost between 30,000 and 50,000 iterations (typical PBKDF2 default is 1,000).

At the 1999 USENIX conference, Niels Provos and David Mazières, of OpenBSD, introduced bcrypt to the world (it was actually in OpenBSD 2.1, June 1, 1997). They were critical of md5crypt, stating the following (emphasis mine):

MD5 crypt hashes the password and salt in a number of different combinations to slow down the evaluation speed. Some steps in the algorithm make it doubtful that the scheme was designed from a cryptographic point of view--for instance, the binary representation of the password length at some point determines which data is hashed, for every zero bit the first byte of the password and for every set bit the first byte of a previous hash computation.

PHK was slightly offended by their off-handed remark that cryptography was not his core consideration when designing md5crypt. However, Niels Provos was a graduate student in the Computer Science PhD program at the University of Michigan at the time. By August 2003, he had earned his PhD. Since 1997, bcrypt has withstood the test of time, it has been considered "Best Practice" for hashing passwords, and is still well received today, even though better algorithms exist for hashing passwords.

bcrypt limits password input to 72 bytes. One way around the password limit is with pre-hashing. A common approach in pseudocode is to hash the password with SHA-256, encode the digest into base64, then feed the resulting ASCII string into bcrypt. In pseudocode:

pwhash = bcrypt(base64(sha-256(password)))

This results in a 44-byte password (including the "=" padding) that is within the bounds of the 72 byte bcrypt limitation. This prehashing allows users to have any length password, while only ever sending 44 bytes to bcrypt. My implementation in this benchmark uses the passlib.hash.bcrypt_sha256.hash() method. How does bcrypt compare to md5crypt, sha256crypt, and sha512crypt in execution time based on password length?

bcrypt 1-128 characters (prehashed)

bcrypt 1-512 characters (prehashed)

bcrypt 1-4,096 characters (prehashed)

Now, to be fair, bcrypt is only ever hashing 44 byte passwords in the above results, because of my prehashing. So of course it's running in constant time. So, how does it look with hashing 1 to 72 character passwords without prehashing?

bcrypt 1-72 characters (raw)

Again, we see consistent execution, driven entirely by iteration cost, not by password length.

Colin Percival, Tarsnap, and scrypt

In May 2009, mathematician Dr. Colin Percival presented to BSDCan'09 about a new adaptive password hashing function called scrypt, that was not only CPU expensive, but RAM expensive as well. The motivation was that even though bcrypt and PBKDF2 are CPU-intensive, FPGAs or ASICs could be built to work through the password hashes much more quickly, due to not requiring much RAM, around 4 KB. By adding a memory cost, in addition to a CPU cost to the password hashing function, we can now require the FPGA and ASIC designers to onboard a specific amount of RAM, thus financially increasing the cost of production. scrypt recommends a default RAM cost of at least 16 MB. I like to think of these expensive functions as "security by obesity".

scrypt was initially created as an expensive KDF for his backup service Tarsnap. Tarsnap generates client-side encryption keys, and encrypts your data on the client, before shipping the encrypted payload off to Tarsnap's servers. If at any event your client is lost or stolen, generating the encryption keys requires knowing the password that created them, and attempting to discover that password, just like typical password hashing functions, should be slow.

It's now been 9 years as of this post, since Dr. Percival introduced scrypt to the world, and like bcrypt, it has withstood the test of time. It has received, and continues to receive extensive cryptanalysis, is not showing any critical flaws or weaknesses, and as such is among the top choices as a recommendation from security professionals for password hashing and key derivation.

How does it fare with its execution time per password length?

scrypt 1-128 characters

scrypt 1-512 characters

scrypt 1-4,096 characters

I'm seeing a trend here.

The Password Hashing Competition winner Argon2

In 2013, an open public competition, in the spirit of AES and SHA-3, was held to create a password hashing function that approached password security from what we knew with modern cryptography and password security. There were many interesting designs submitted, including a favorite of mine by Dr. Thomas Pornin of StackExchange fame and BearSSL, that used delegation to reduce the work load on the honest, while still making it expensive for the password cracker.

In July 2015, the Argon2 algorithm was chosen as the winner of the competition. It comes with a clean approach of CPU and memory hardness, making the parameters easy to tweak, test, and benchmark. Even though the algorithm is relatively new, it has seen at least 5 years of analysis, as of this writing, and has quickly become the "Gold Standard" for password hashing. I fully recommend it for production use.

Any bets on how it will execution times will be affected by password length? Let's look:

Argon2 1-128 characters

Argon2 1-512 characters

Argon2 1-4,096 characters

Execution time is not affected by password length. Imagine that. It's as if cryptographers know what they're doing when designing this stuff.

Conclusion

Ulrich Drepper tried creating something more secure than md5crypt, on par with bcrypt, and ended up creating something worse. Don't use sha256crypt or sha512crypt; they're dangerous.

For hashing passwords, in order of preference, use with an appropriate cost:

UPDATE: A note about PBKDF2 that was brought up in a Twitter thread from @solardiz. PBKDF2-HMAC-SHA512 isn't really an upgrade from sha512crypt (nor PBKDF2-HMAC-SHA256 an upgrade from sha256crypt), because PBKDF2 really isn't GPU resistant in the way bcrypt is. However, bcrypt can be implemented cheaply on ASICs with only 4 KB of memory.

If your choice of password hashing in constrained to NIST standards, which includes PBKDF2, then unfortunately, bcrypt, scrypt, and Argon2 are out of the question; just make sure to use it properly, which includes choosing a high iteration count based on your authentication load capacity. At that point, password storage is probably not the worst of your security concerns.

However, if you're not limited to NIST constraints, then use the others.

Acknowledgement

Thanks to Steve Thomas (@Sc00bzT) for our discussions on Twitter for helping me see this quirky behavior with sha256crypt and sha512crypt.

{ 7 } Comments

It's funny that you're recommending PBKDF2, but also warn against "10k iterations of salted SHA-256 etc.". PBKDF2 is exactly that - iterative hashing with a salt - and most implementations do in fact employ SHA-256.

PBKDF2 is not exactly "10k iterations of SHA-256". First, PBKDF2 is an arbitrary length output function. A user may request any arbitrary amount of data. The typical usecase is to request key material, so 16 bytes, 32 bytes, and 64 bytes are most common. However, you could request 50 bytes of data, or 33 bytes, or 400 kilobytes if you wanted. SHA-256 is a fixed length digest function, that always outputs 256-bits or 32-bytes of data.

Second, PBKDF2 has a pluggable architecture. Any cryptographic hashing primitive may be used. Common functions are MD5, SHA-1, SHA-256, and SHA-512. PBKDF2 is typically used with HMAC, although if the cryptographic hashing function supports keying, like BLAKE2, then HMAC is unnecessary. SHA-256 is a static function, without any ability to plug something else into it. It's a foundational cryptographic primitive. Bruce Schneier called hashing functions the work horse of cryptography.

Thirdly, PBKDF2 requires salts to prevent rainbow table attacks on the generated output. SHA-256 does not. This doesn't prevent you from prepending or appending salts to your input, but this is something that you have to manually add as part of your application, as SHA-256 doesn't support it natively.

MD5crypt() did not replace the traditional DES-derived UNIX crypt(), but rather an even worse stand-in which only existed because DES was under export-control from the USA at the time. We had the DEScrypt() source, we just could not distribute it without a DoD license.

I knew at the time that hardware implementations of DES were available, and from personal experience that you didn't really need them if you took the time to hand-tweak your assembly code, so DEScrypt was not particular desirable, even if we obtained an export-license.

The choice of MD5 was driven entirely by the source-distribution issue, MD5 was published in an RFC and licensed for any use (whereas the slower, and therefore more desirable MD2 was only licensed for email.) and there were no export-control on one-way algorithms.

The things I focused most on with MD5crypt increasing the runtime in a way which could not be trivially pipelined (ie: data dependence) and on improving the environment for crypt() implementations (ie: longer salt, longer passwords, longer stored results.)

The fact that the runtime depended on the length of the password was considered and ignored: I would be more than happy with the increased security if I could get people to use 8 or 10 char passwords, never mind 17 and up, instead of just eight.

The most important thing I did was the $1$ prefix, which allowed multiple password algorithms to coexist. I pointed out at the time, that allowed you to change the algorithm at any time, as long as you also supported the old algorithms until old passwords were changed (Best practice at the time was 3-6 month between forced password changes).

...and then people did the exact opposite, they all copied & pasted MD5crypt all through the dot-com madness until a researcher told me that he estimated most passwords in eCommerce and half of all passwords in the world were protected by MD5crypt.

As for the OpenBSD people trash-talking MD5crypt:

I never aspired to be or claimed to be a cryptographer, and the **only** reason I have ended up writing some rather consequential cryptographic source code, is that the real card-carrying cryptographers seldom do so and never in a timely fashion.

Bcrypt, scrypt and Aragon2 are without dispute superior to MD5crypt() on all metrics except the most important one: MD5crypt() were there in 1994, as open source and a readily usable software component, they were not.

So yes, I felt the OpenBSD people were a little bit too snotty when they came walzing in five years later, and pissing down on me from my own shoulders felt particular unfair: I paved the road they drove on.

Your first argument is that you could DOS an authentication server with huge passwords. Have you ever heard of that? I believe most servers have other software that you can DOS a lot more easily, but let's consider that and compare to the alternatives. Assume I was an attacker trying to DOS your authentication server. What I could do, depending on your hashing algorithm (numbers taken/calculated from your plots, assuming they are comparable):

sha512crypt: send a 4kb password that causes 0.3s execution time and 1 "process" on the server and costs me 4kb of traffic to transfer the password

PBKDF2-HMAC-SHA512: send 167 passwords with one byte each that causes 0.3s execution time on the server (167 * 0.0018s), needs 167 "processes" on the server (threads, tcp connections, database connections, whatever that server does while checking authentication), and costs me 167 bytes of traffic to transfer passwords

bcrypt: send 137 passwords with one byte each that causes 0.3s execution time and 137 processes on the server and costs me 137 bytes of traffic to transfer passwords

scrypt: send 7 passwords with one byte each, 0.3s execution time, 7 processes used and 7 bytes of traffic

argon2: send 8 passwords with one byte each, 0.3s execution time, 8 processes used and 8 bytes of traffic

Sure, modern algorithms are better and more secure. But they need more execution time for small passwords and thus, make it easier to brute force, not more difficult. The attacker could more easily send 7 one byte passwords to scrypt than one 4kb to sha512crypt. Usually the uplink of the attacker is a bottleneck, so sending 4kb passwords seems to be a bad idea. Plus, which real-world authentication system supports 4kb passwords?

"moderately large passwords from staff, where such limits may not be imposed, could create a CPU denial of service on busy authentication servers."
Staff would most likely first use VPN/SSH with ("large") key files to get into the internal network and then passwords of normal length.

This is only "dangerous" in theory. And even in theory, other hashing functions seem to be more attractive in terms of DOSing them.

The second argument you mention are timing attacks. I would give this the bigger concern, because it really affects security. But as you mention yourself, it does not make much difference as most real-world passwords fall into the same category. Furthermore, it is difficult to do these kind of timing attacks on a server without getting blocked due to brute-forcing.

So in summary: Yes, modern algorithms are better, but what you write does not explain why sha256crypt/sha512crypt are *dangerous*.

Post a Comment

Your email is never published nor shared.

Name

Email

Website

Comment

Wordpress Hashcash needs javascript to work, but your browser has javascript disabled. Your comment will be queued in Akismet!