Posted by Soulskill on Wednesday July 04, 2012 @04:38PM from the doing-it-with-style dept.

solardiz writes "A new community-enhanced version of John the Ripper adds support for GPUs via CUDA and OpenCL, currently focusing on slow-to-compute hashes and ciphers such as Fedora's and Ubuntu's sha512crypt, OpenBSD's bcrypt, encrypted RAR archives, WiFi WPA-PSK. A 5x speedup over AMD FX-8120 CPU per-chip is achieved for sha512crypt on NVIDIA GTX 570, whereas bcrypt barely reaches the CPU's speed on an AMD Radeon HD 7970 (a high-end GPU). This result reaffirms that bcrypt is a better current choice than sha512crypt (let alone sha256crypt) for operating systems, applications, and websites to move to, unless they already use one of these 'slow' hashes and until a newer/future password hashing method such as one based on the sequential memory-hard functions concept is ready to move to. The same John the Ripper release also happens to add support for cracking of many additional and diverse hash types ranging from IBM RACF's as used on mainframes to Russian GOST and to Drupal 7's as used on popular websites — just to give a few examples — as well as support for Mac OS X keychains, KeePass and Password Safe databases, Office 2007/2010 and ODF documents, Firefox/Thunderbird/SeaMonkey master passwords, more RAR archive kinds, WPA-PSK, VNC and SIP authentication, and it makes greater use of AMD Bulldozer's XOP extensions."

Sometimes it works. A few ages ago a friend gave me his old cellphone - one of those Samsung "Slim Line" ones - but at the moment it was uncharged; when I got it charged it turned out it was locked, I didn't have the password and couldn't get a hold of him to ask, so I figured 4 numbers couldn't be that hard and tried my best to guess. I couldn't. Fast-forward one week and I finally get around to asking him about the password. What was it? 8888. The worst part is that I did try just entering the same numb

I find it kind of odd that all of the analyses linked to in this article go on about SHA512-Crypt, BCrypt, SCrypt, etc, and the slideshow even talks about "Key Derivation Functions"... yet there doesn't seem to be any mention or comparison of PBKDF2-HMAC-SHA512 as a valid password-hashing key derivation function, despite its widespread use, and despite it being one of the core architectural components used in the design of SCrypt.

You make a valid point. I do intend to add a mention of PBKDF2 to a revised version of my presentation, and I am likely to use it or at least HMAC as a component if I design a new password hashing method - not so much because of actual need, but mostly to have an easy and convincing answer about cryptographic security.;-) However, in the context of this announcement PBKDF2 is arguably less relevant, and it is inferior to the alternatives being considered specifically in the GPU-friendliness aspect (it is more GPU-friendly than all three of SHA-crypt, bcrypt, scrypt). In scrypt, PBKDF2 is used (with SHA-256) to provide/demonstrate cryptographic security, but mostly not computational cost, whereas the analysis here is about the latter, under the assumption that all of the alternatives being seriously considered are sufficiently secure cryptographically.

This release of John the Ripper supports PBKDF2 on GPU as well - in the included WPA-PSK cracking code. The release announcement shows a 27x speedup over the also-included CPU code when going from FX-8120 CPU (8 threads) to HD 7970 GPU for WPA-PSK cracking (PBKDF2-HMAC-SHA-1), which clearly shows that it is very GPU-friendly. With SHA-512, it'd be a lot less GPU-friendly, but likely not even to the point of sha512crypt.
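For anyone wondering what the PBKDF2-HMAC-SHA-1 computation in WPA-PSK looks like, here is a minimal sketch in Python. It relies on the standard WPA/WPA2 key derivation (SSID as the salt, 4096 iterations, 256-bit output); the function name wpa_psk and the sample passphrase/SSID are just illustrative, and this is not JtR's code.

```python
import hashlib

def wpa_psk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit WPA/WPA2 pre-shared key from a passphrase and SSID."""
    # PBKDF2-HMAC-SHA-1, SSID used as the salt, 4096 iterations, 32-byte output.
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)

if __name__ == "__main__":
    # Illustrative inputs only.
    print(wpa_psk("password", "IEEE").hex())
```

Each candidate passphrase costs a cracker the same 4096 iterations, and every candidate is independent of the others, which is why this workload spreads so well across GPU threads.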

SRP is great, but it does not eliminate the need for better password hashing - rather, these things may/should be used together. It does not take breaking DH to merely probe candidate passwords against a stolen/leaked SRP verifier. The Wikipedia article you referenced says that "using of functions like PBKDF2 instead of H for password hashing is highly recommended", and they were referring to the password stretching aspect. Other properties of the hashing method are also relevant, just like they are to "regular" password hashes.

In fact, I complained [redhat.com] to Tom Wu about SRP's use of non-iterated SHA-1 in 2000, and I had an e-mail exchange on a similar topic in SPEKE context with David Jablon in 1998 or so. Since then (or at about that time), the need for heavy to compute underlying hashes even along with zero-knowledge password proofs became widely recognized. I am not really into the latter topic, but I did my little bit to influence that field in that minor aspect (and I'm sure many others did as well).

According to that link [nist.gov], PBKDF2-HMAC-SHA512 is valid when implemented correctly; as an example of a bad implementation, see that Microsoft blog post [msdn.com] about .NET 2.0 (we are at 4). A good place to start to understand the jargon is RFC 4868 [ietf.org], as it has almost all the links to the pertinent material.

BTW the Microsoft implementation failure is foremost a compliance and interoperability issue. We do not know about its security impact; we can only presume bad things, as we always should when faced with unknown qualities in the domain of computer security.

The design of PBKDF2, and the NIST publication you referenced, do not consider the difference in processing cost to defender vs. attacker, whereas that is precisely the aspect I've been focusing on in my analysis. PBKDF2 does nothing to bring the validation vs. cracking speed ratio [openwall.com] close to 1.0.

One? Sure. I probably keep 50 of them, under various usernames. The hard part is remembering them all, without some easy to guess formula. That is why passwords are often reused, which is far more dangerous than password weakness.

I have started non-reusing my passwords since I have implemented a "shared vault" - the user names/passwords are encrypted and hidden in a file that is shared across my various machines using a cloud mirroring service (e.g. Dropbox).

Dropbox security by itself is not terribly comforting, but the combination of my own memory-hard crypto inside the Dropbox system at least feels better. Having the "important" file shared means I have access to my secrets wherever I am, and it's harder to accidentally wipe out

For the unimportant accounts (forum logins, shopping carts), just let your browser store them and protect them with a master password. Firefox is okay in this, Internet Explorer is not (as it doesn't have a master password). Give those accounts a randomized mix of 10-20 letters/numbers. I use a tool called EPG (Extended Password Generator) to do quick random generation of passwords. But there are also Linux shell tricks you can do to generate stuff.
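If you'd rather not depend on a separate tool, a few lines of Python do the same job. This is only a sketch: the function name random_password and the default length of 16 are my own choices, and it has nothing to do with how EPG works internally.

```python
import secrets
import string

# Letters and digits only, matching the "letters/numbers" mix described above.
ALPHABET = string.ascii_letters + string.digits

def random_password(length: int = 16) -> str:
    """Return a random alphanumeric password drawn from the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

if __name__ == "__main__":
    print(random_password())    # 16 characters
    print(random_password(20))  # 20 characters
```

The secrets module is used instead of random because it draws from the operating system's cryptographically secure generator.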

The fact that not every password is likely to be cracked is precisely what makes password security audits with John the Ripper useful. If every password were getting cracked, there would be fewer legitimate uses for the tool.;-)

Memorizing one 16-digit mixed-case alphanumeric password is realistic, but it does not help you all that much unless it's a "master password" (e.g., used to access an encrypted password manager database or to generate other passwords from or to access an encrypted filesystem where you store other passwords in plaintext), because you'd have difficulty memorizing a large number of unique and dissimilar passwords of this kind. Either way, if you're developing a server application or administering a server where users can register with passwords (maybe as one of the authentication options, not necessarily the only one), it becomes sort of your responsibility to make your users' passwords less likely to be cracked, even if the server security is temporarily compromised (you should assume that this might happen). Note that many of your users' passwords might be weaker than you would have liked them to be, and you don't want to enforce too strict a password policy (as that's a tradeoff). This is where the choice of hashing method to use matters, letting you use a less strict password policy for the same level of security or/and resulting in fewer passwords getting cracked (even with no enforced policy, since some people will choose medium complexity passwords on their own).

On a serious note, entropy grows with length less than linearly, and you've provided a good example of that. This means that there's little point in using a passphrase this long. A replacement for yours could be: "cannon to R,L,F of them Volley'd and thunder'd" - perhaps about as easy (or as difficult) to memorize and recall reliably, likely roughly the same guessing entropy, but much shorter to type.

Sure, but fully going from a passphrase to a phrase-derived short password likely reduces guessing entropy (not the same thing as Shannon entropy, by the way; the latter is even more obviously reduced, but is less relevant). In fact, I think some of the passwords that JtR cracks in its incremental mode (which considers character frequencies) are actually built using the first-letter-of-each-word method. Indeed, many of those passwords will happen to use a subset of possible characters only - those that ar

Shameless plug: The crypto in StegaMail isn't unbreakable, but it is memory hard (you have to process the entire image to brute force guess passwords - encode in a big image and lots of memory is required), and I liken steganography to hiding your money in a false soda can in the fridge - hide in plain sight. Breakable? Sure. Likely to be broken in a mass trawl of data? Not.

Other than "playing spy" - hiding passwords is the best use I have found for steganography.

I am all for passphrases. We've been supporting them in our passwdqc [openwall.com] password/passphrase strength checking and policy enforcement tool (initially just a PAM module, then more) since I wrote it in 2000.

Implementation detail: when enforcing passphrase policy, we need to insist on some separators between words being present. passwdqc does, in order for the string to qualify as a passphrase rather than password. Apparently, Dropbox does not [dropbox.com], and I think that's a flaw. No wordlist can be comprehensive, and a separator-less passphrase is indistinguishable to a password/passphrase strength checker from a long and somewhat obscure dictionary word. Indeed, any passphrase (or a multi-word portion of it) can happen to be found in a dictionary (or on the web, etc.) as well - or just be reused by the user across multiple sites - but that's a somewhat different issue.

Why not just require a password of a minimum size (say 25 char) and a couple of punctuation chars, and a certain minimum entropy... and call it a 'passphrase'? Whether or not dictionary words are allowed, what matters is overall complexity (i.e. char sequences aren't repeated) and you don't need to reference a dictionary.

Yes, you don't need to reference a dictionary when you approve something as a passphrase. passwdqc does not reference a dictionary for that. However, you mentioned requiring "a couple of punctuation chars" - and this almost ensures at least 3 words (well, it could also be e.g. a single word mangled by inserting/replacing characters, which is why I said "almost"). passwdqc has a similar requirement, although it does not insist on punctuation specifically. For non-repeats, passwdqc insists on there being en

Quite why people think that 'correct horse battery staple' is more memorable, in particular because of the visual imagery it evokes, than something that isn't 4 simple words, I will never understand. You have a fashion, that's all, but you will grow out of it.

I've been using passphrases for about 12 years (and more than that if we count those passphrases on PGP and SSH keys as well), and I'm not growing out of it yet. I often use mixed-character-type passwords as well, and my phrases often use weird word separators, misspelled and/or partial words (less typing, same or better security if you do it right), different languages, etc. The number of words also varies (but with too few words other bits of complexity have to be introduced). For me, what is easier or harder to memorize varies depending on what kind of suitable idea I happen to have at a given time. Besides, the variety in password/phrase types buys me a few extra bits of entropy. Even an attacker who has read this comment or cracked a few of my passwords somewhere doesn't come up with one single pattern on password type that I use - because there are many. Thus, let your users choose between short but complicated passwords and longer but less complicated phrases. Similarly, let them choose between server-generated strings and user-chosen ones (the latter may be subject to policy enforcement). Our passwdqc [openwall.com] tool set (PAM module, library, program for use from scripts) gives all of these options by default (but they can be disabled in any combination...) For server-generated strings, passwdqc uses 3-word phrase-like ones, with non-whitespace separators (out of a set of 8) and random word capitalization by default - that's 47 bits, which is currently sufficient in most user authentication contexts when used along with bcrypt hashes. With 4 words and the same approach, it's 64 bits ("pwqgen random=64" will do that) - but that is rarely needed with a decent password hash. (It is reasonable for data encryption keys, though - plus some 20 bits of stretching with a decent KDF.)

Given that you commend only schemes which are not the one recommended by xkcd, I do not see any disagreement between us. In fact, given that you say "let them choose", we are in strong agreement. Different people will be most comfortable with different types of password, and a password that they're not comfortable with and is stored in a record in their cell phone, or is on a piece of paper in their wallet, is nowhere near as strong as one they can keep in their head. Most of my passwords have had a several-u

"Quite why people think that 'correct horse battery staple' is more memorable, in particular because of the visual imagery it evokes, than something that isn't 4 simple words, I will never understand. You have a fashion, that's all, but you will grow out of it."

People think it because studies have repeatedly and clearly shown it. It's the way the typical human brain operates.

I'm not sure I've seen any independent study which investigates such questions satisfactorily. (You may interpret that as [citation needed].) And looking in from the opposite direction, I've also yet to see someone build a 4-simple-English-word rainbow table to directly attack the claim of security. Given that rainbow tables have made password recovery over much larger search spaces possible, I think it's a worthwhile attack, even if purely from the theoretical standpoint (I mean, nobody actually uses 4 mod

And looking in from the opposite direction, I've also yet to see someone build a 4-simple-english-word rainbow table to directly attack the claim of security.

You don't need to count to (10^5)^4 = 10^20 to know that it's a big number, far greater than the search spaces currently achievable with rainbow tables. Barring a monumental flaw in the hash function, the decreased per-character entropy shouldn't make a difference. (Though I guess it depends on how many "simple english words" you consider there to be.)

Certainly, in the field of memory, I am prepared to believe I am far from the norm. I have an exceptionally poor memory for almost everything. During my academic career I could never remember high level theories or identities, and had to repeatedly derive them from basic principles before using them.

I see a 136GB rainbow table that claims to cover 99% of a 2^48 gamut. An alphabet of 100000 symbols is *waaay* bigger than what I consider "simple English words". 12 bits per word (4000 words) seems reasonable, and that comes in at 2^48 again. IIRC, xkcd itself quoted 2^44 from 4 simple words (so 2048 simple English words).
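The numbers being thrown around here are easy to check. The sketch below assumes each of the 4 words is picked uniformly and independently from the list, which is the best case for the defender; the function name phrase_bits is just for illustration.

```python
import math

def phrase_bits(wordlist_size: int, words_in_phrase: int = 4) -> float:
    """Bits of search space for a phrase of uniformly random, independent words."""
    return words_in_phrase * math.log2(wordlist_size)

if __name__ == "__main__":
    for size in (2048, 4000, 4096, 100_000):
        print(f"{size:>6} words in list: {phrase_bits(size):.1f} bits for 4 words")
```

That works out to 44.0 bits for a 2048-word list, about 47.9 for 4000 words, 48.0 for 4096, and about 66.4 for 100000. So the disagreement above is really about the size of the wordlist: each doubling of the list adds only 4 bits to a 4-word phrase.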

I would argue from that data, which as you say might be based on an unsupportable premise, that we should not consider the xkcd recommendation as being clearly safe. I don't believe I'm s

Your phrase evokes an image for you to remember. The image then later evokes a sequence of words. If you're lucky, they're the words in your phrase, in the right order. However, images do not distinguish synonyms.

I can't prove it, but I bet that for the first few times you (y'all, not you personally) wanted to make reference to the xkcd phrase, you ran off and reminded yourself of it by doing a Google search for "xkcd password" or similar. I know I did.

I fully expected a comment just like yours.:-) hashcat is in fact superior in many ways, but JtR is superior in many others. In the context of this story, since when does hashcat support sha512crypt and bcrypt on GPU? Last time I checked (just before releasing JtR 1.7.9-jumbo-6), it did not. I've just re-checked - as far as I can see, it still does not. So hashcat could not possibly be used for the comparison that this story is about, at this time.

My guess, based on recent hashcat user polls and atom's comments on the forums (yes, I sometimes skim over the topics), is that atom will in fact add support for sha512crypt on GPU soon (especially now that JtR has it, and hashcat "got to" compete and show a better speed, which it likely will) - in fact, even reusing our code is possible since we've BSD-licensed that portion, but I doubt that atom would do that. I am less certain about bcrypt. BTW, atom's expectation, stated on their forums [hashcat.net], was that sha512crypt would be only 2-3 times faster on GPU than it is on CPU. We achieved 5.5x, which is thus not bad. Admittedly, the CPU code could be rewritten to use SIMD and be roughly twice as fast - thereby bringing us to the 2-3x expectation.

Also, some of us prefer Open Source, even if in some aspects a given implementation is inferior at a given time. Besides the current preferences/beliefs, guess what happens in case at some point atom loses interest in further hashcat development and does not release the sources under an Open Source license - or if something bad happens (I hope not!) preventing him from being able to do that? So far, hashcat is only ~2.5 years old and it is proprietary. (And yes, I am very impressed by what atom did in just 2 years.) John the Ripper has been around since 1996 and it is Open Source. BTW, this difference also means that hashcat can freely borrow low-level implementation ideas from us if atom wanted to (although I think he's good enough on his own not to use this option), whereas hashcat's EULA (as of the last time I checked, which was a long while ago) prevents us from doing the same even via reverse-engineering if we wanted to (although apparently this is not enforceable in many jurisdictions or in case the person never accepted the EULA; no, we don't rely on that and we don't RE hashcat).

Anyhow, I don't think there would be any issue in having a hashcat-focused news story if you or someone else posts one at a right time.:-)

The real issue isn't so much the hashing algorithm used as it is the bloggers.

An adaptive algorithm (one that can be made to go slower by tuning the number of cycles) such as bcrypt, and as opposed to sha, is arguably much better.
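To show what "adaptive" means in practice, here is a rough sketch. Python's standard library has no bcrypt, so PBKDF2-HMAC-SHA-256 stands in purely to demonstrate a tunable cost parameter; it says nothing about GPU-friendliness, which is a separate question. The function names and iteration counts are made up for the example.

```python
import hashlib
import hmac
import os
import time

def hash_password(password: str, salt: bytes, iterations: int) -> bytes:
    """Salted, iterated hash; 'iterations' is the tunable cost knob."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute with the stored salt and cost, then compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt, iterations), expected)

if __name__ == "__main__":
    salt = os.urandom(16)
    # Raising the iteration count slows down each guess for defender and attacker alike.
    for iterations in (1_000, 10_000, 100_000):
        start = time.perf_counter()
        hash_password("correct horse battery staple", salt, iterations)
        print(f"{iterations:>7} iterations: {time.perf_counter() - start:.4f} s")
```

The salt and the iteration count have to be stored alongside the hash so the cost can be raised later for new hashes without breaking old ones.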

In the end, however, the real problem is the huge number of self-proclaimed experts with an opinion ranging from wrong to hopelessly wrong -- and a very big mouth. It wouldn't matter much if they were at least partially right (apart from the facts that, yes, do hash, and yes, do salt), but computer security is one of those fields where nearly no one actually understands what they're talking about. Don't believe me? Have the next security experts you meet explain encryption to you for a few good laughs.

I like to compare this to PHP/MySQL. The MySQL module has been deprecated in favor of MySQLi for so long I dare not even think about it. Yet, pretty much every tutorial/answer out there uses the mysql_*() set of functions including, of all places, Stack Overflow. So here we are, with newbie coders learning from horrific examples, and then viewing spaghetti code bases like WordPress as best practice. And then they write their own PHP/MySQL tutorial on their own blog, and then 5 years later some newbie reads it, perpetuating the vicious circle. The next step is the same PHP/MySQL newbie wanting to secure his app, learning from poor examples and advice, and propagating them in the exact same way.

Anyway, the "use sha to hash passwords" BS and the inane salting strategies that invariably accompany them are here to stay for a long time, because the Internet has a very, very, very, very long memory.

The fact of the matter, as the parent post points out, is that insecure password storage is a far larger issue: many, many sites just store the passwords in plaintext in a DB. If you're lucky they are bothering to use SHA1 on them first (without a salt). The website owner feeling smart adds salts but is still using SHA1 and a single round of hashing (cracking complexity... trivial). A real smart one decides he's going to use multi-round hashing, and perhaps even a stronger hash

The existing password hashing methods won't run on GPU well for user authentication, even when they do run well for cracking passwords. They lack sufficient parallelism within one hash computation. This is an issue I first raised in 1998, in pre-GPU context (it applies to recent CPUs as well, and the problem is getting worse with time).

A solution is to define a new password hashing method with sufficient (configurable) parallelism within one instance. We could then consider running it on GPU, unless it is GPU-unfriendly by other criteria. Do we really want to, though? GPUs in servers are not yet common, except in computing clusters. Their reliability may be lower than that of other typical server components. The drivers are currently relatively unreliable as well (although they may be reliable enough if running the same code, with no upgrades). Sure, computing clusters use them anyway, and get them to run reliably enough for their needs, but the extra hurdle and/or risk is there. Will we get embedded GPUs in typical servers soon? Will they be similar to current gamers' or HPC GPUs or not? This is not clear. Then there's Intel MIC, which delivers GPU-like performance, but is a lot closer to a CPU - it will require a lot of parallelism in the algorithm too, but it may run certain types of otherwise GPU-unfriendly code. Is this possibly a better target?

For current GPUs, a better strategy might be to make them inefficient - by using GPU-unfriendly hashes (for cracking, and for validation as well - as a side-effect).

We had a project [openwall.com] last summer to research these kinds of possibilities, focusing on use of FPGA boards in authentication servers. This could optionally buy us GPU-unfriendliness [openwall.com] (if we want to make things more difficult for attackers with GPUs, but not FPGAs, and for botnets, which almost surely will lack FPGAs). We even considered some moderate CPU-unfriendliness of the component that we'd put on FPGA. Specifically, we experimented with bcrypt on FPGA, as well as with much smaller Blowfish-like "non-crypto" cores (not actual Blowfish), so that we could hopefully fit hundreds or thousands of those per chip (and have them somewhat CPU-unfriendly as well). Yuri, our GSoC 2011 student working on this project, did have some of this implemented in an experimental fashion, and some of it even worked (on FPGA boards kindly provided by Pico Computing [picocomputing.com]), but an outcome of the summer project was that this would be time-consuming to bring to desired levels of performance and reliability. At that point, the project was put on hold.

A simpler and cheaper alternative (if there are only a handful of customers for this) may be to use dedicated servers, existing HSMs, or microcontrollers for just the password hashing. Of course, microcontrollers are super slow, so their only function would be to hold and apply a local parameter, with the rest of the hashing method implemented on the host's CPU and RAM. If dedicated servers are used, they would need to be separate from authentication servers - that is, they won't know usernames, won't have access to any database, and won't have any persistent storage except for the local parameter (and the OS and software, of course). They will accept password, salt, and parameters (such as the configurable per-hash processing and memory cost settings), and provide the hash. Thus, their attack surface would be minimal and they'd provide an extra layer of security against network-based attacks. We'd do this with FPGA boards as well, and we'd also have the greater/unusual computational complexity as a security layer (in case the local parameter or its backup copy is leaked/stolen), but well - using typical and pre-existing server hardware, drivers, etc. is just simpler and cheaper unless we start a new business and expect to have plenty of customers (although that might be possible).
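To make the dedicated hashing server idea concrete, here is a very rough sketch of the one function such a box might expose. Everything specific in it is an assumption for illustration: the name hash_request, the use of PBKDF2-HMAC-SHA-256 for stretching, applying the local parameter as an HMAC key, and the placeholder secret. The memory cost setting mentioned above is left out to keep it short.

```python
import hashlib
import hmac

# Hypothetical local parameter: a secret stored only on the hashing host,
# never in the authentication database.
LOCAL_PARAMETER = b"example-secret-kept-only-on-the-hashing-host"

def hash_request(password: bytes, salt: bytes, iterations: int,
                 local_parameter: bytes = LOCAL_PARAMETER) -> bytes:
    """Take password, salt, and cost; return the hash. No usernames, no database."""
    # Step 1: salted stretching with a configurable processing cost.
    stretched = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    # Step 2: apply the local parameter (assumed here to be an HMAC key).
    return hmac.new(local_parameter, stretched, hashlib.sha256).digest()

if __name__ == "__main__":
    print(hash_request(b"hunter2", b"per-user-salt", 10_000).hex())
```

With this split, a stolen hash database alone is not enough to test candidate passwords offline; the attacker also needs the local parameter from the hashing host.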

Reading your proposal for having an FPGA hold a critical part of the encryption process, I was reminded of the many places we have seen this idea before: dongles [wikipedia.org] (parallel port and USB), arcade games (slapstic from Atari [aarongiles.com], the CPS-2 "suicide batteries" [wikipedia.org], and NAOMI to name a few), SmartCards, and others.

Instead of using FPGA, I would use a cheaper PIC or Atmel device -- the security is in the algorithm implemented in the device. Having a copy of the hardware in the hand of an attacker, whether cheap or expensi

I've been curious how a hash looks with respect to various lengths of a password. Is a hashed 3-character password 3 characters long, i.e. does the hashed password itself indicate how long the password is? Is it filled to a particular length?
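A quick way to see this for yourself, using plain SHA-512 from Python's standard library only as an illustration (it is not a password hashing scheme by itself, and the function name digest_hex_length is made up):

```python
import hashlib

def digest_hex_length(password: str) -> int:
    """Length of the hex-encoded SHA-512 digest of the given password."""
    return len(hashlib.sha512(password.encode()).hexdigest())

if __name__ == "__main__":
    for password in ("abc", "a much longer passphrase with several words"):
        print(f"input length {len(password)}, digest length {digest_hex_length(password)}")
```

Both inputs give 128 hex characters (512 bits), so the digest itself does not reveal how long the password was.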