I remember a Security Now! episode a few years ago where they explained exactly how iteration increases entropy, and that the net effect was indeed cumulative rather than a single step up in entropy. It's very much the same thing as what they're talking about here with stretching password hashes. While the discussion was in a symmetric cryptography context (IIRC), the principles are all pretty much the same.

It's kind of funny how these exact same issues come up again and again in security. You'd think that people would learn their lessons by now...

Anyway, the articles were good and really well focused on that issue.

"Separate password breaches last week at LinkedIn, eHarmony and Last.fm exposed millions of credentials, and once again raised the question of whether any company can get password security right. "

Hmm, I hadn't heard of the ones at eHarmony and Last.fm. And interesting choice of phrase - "whether *any* company can get it right" (emphasis mine).

"Ptacek: The difference between a cryptographic hash and a password storage hash is that a cryptographic hash is designed to be very, very fast. ... Well, that’s the opposite of what you want with a password hash. You want a password hash to be very slow. "

Okay, so I can *almost* see a somewhat smaller, "less important" site like Last.fm making this mistake. (Although even they are too big for it.) But LinkedIn strikes me as different. In some ways, aside from a couple of annoying software features they have that lead to address book invasions, I respect LinkedIn more than any of the "recreational" social networks. LinkedIn serves a fairly high grade of professional - not that many fast food workers, etc. - so I would think that would be a very demanding clientele. Wouldn't anyone at that level have wanted LinkedIn to just spend $50,000 on a month's worth of consulting to review their overall practices? The security guy Brian Krebs talked to nailed it in twelve seconds. Add another $100,000 for a two-man security programming team for a year. Done (sorta).

Okay, here we go, further down: "Ptacek: At a certain point, the cost of migrating that is incredibly expensive. And securing Web applications that are as complex as LinkedIn is an incredibly hard problem."

So now we get a new question, that maybe someone *did* notice, but it got tabled as a migration cost issue. That's a whole different notion.

Edit 2: So okay, in the NY Times article, it seems my off-the-cuff guess wasn't so bad after all: "Mr. Grossman estimates that the cost of setting up proper password, Web server and application security for a company like LinkedIn would be a one-time cost of “a couple hundred thousand dollars.”"

First, I think this whole debacle is just more evidence that there is real value in core user management code projects that can be reused when building custom sites and that focus on getting things like this right -- which is exactly the kind of thing I hope to accomplish with my Yumps project.

I'm guessing most modern sites get the password thing mostly right. The most important thing is salting and hashing. Using a slow hash instead of a fast cryptographic hash is important, but not nearly as much as the core concepts of salting + hashing. Only a *really* sophisticated and dedicated hacker is going to be able to employ timing info to exploit the "mistake" of using a fast cryptographic hash. In fact, I think you could argue that you are about a trillion times more likely to be attacked by someone trying to crash your site by hitting it with requests that slow it down than by someone trying to exploit timing differences in password checks -- and so a slow password check might even hurt you there, unless you put an anti-hammering mechanism in place, which is actually a bit of work to get right.

Furthermore, a timing attack on passwords is likely to be pretty low on the list of exploits anyone would search for. Before you worry too much about that, I would worry about network traffic interception, forcing https login, and a bunch of other stuff. If you are building a site you think is so attractive that world-class hackers will be attempting timing attacks on your user passwords, you might want to reconsider the entire concept of allowing simple password logins, and implement additional checks with things like hardware tokens.

A meta issue, which is touched on by Tao above when he talks about the costs of migration, is building in a mechanism by which you can migrate passwords to new approaches. So don't just store a password hash and salt; store extra info like when the password was last changed, and the hashing algorithm/parameters used when it was stored. That way, if you decide to move from using 5000 rounds of Blowfish as your hash algorithm to 10000 rounds of SHA-512, you can identify which algorithm was used to store each user's password, and you won't break people's logins as you migrate them. (It looks like some of the modern password hashing algorithms are being clever and embedding this information in the hashed output, to make it easier to keep track of.) And have an automated system in place for forcing users to upgrade their passwords, etc.
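To make that concrete, here's a rough Python sketch of the idea. The field names, round counts, and the iterated-SHA-512 construction are all just placeholders I made up for illustration, not anything from the articles:

```python
import hashlib
import os
import time

def hash_password(password: bytes, salt: bytes, rounds: int) -> bytes:
    # Simple iterated salted hash, standing in for whatever scheme you use.
    digest = password
    for _ in range(rounds):
        digest = hashlib.sha512(salt + digest).digest()
    return digest

def store_password(password: str, rounds: int = 10000) -> dict:
    # Store the parameters alongside the hash so they can change later.
    salt = os.urandom(16)
    return {
        "algorithm": "iterated-sha512",
        "rounds": rounds,
        "salt": salt,
        "hash": hash_password(password.encode(), salt, rounds),
        "changed_at": time.time(),
    }

def verify_and_maybe_upgrade(record: dict, password: str,
                             current_rounds: int = 10000):
    # Verify using the parameters that were in effect when the hash was stored...
    computed = hash_password(password.encode(), record["salt"], record["rounds"])
    if computed != record["hash"]:
        return False, record
    # ...and, since we have the plaintext at login time, re-store with the
    # current parameters if the record is out of date.
    if record["rounds"] < current_rounds:
        record = store_password(password, current_rounds)
    return True, record

rec = store_password("hunter2", rounds=5000)
ok, rec = verify_and_maybe_upgrade(rec, "hunter2", current_rounds=10000)
```

The point is just that the stored record carries everything needed to verify it, so old and new schemes can coexist during a migration.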

As I posted elsewhere, my approach to updating my own personal database was to use double hashing. So, say I had the initial passwords stored as:

SHA1(password)

Now, to update them I could either use metadata like you say and update when they log in, OR double hash... that's right, hash the hash. The new algorithm would then become:

SHA2-512(SHA1(password))

...

or to be precise with salting,

SALT^SHA2-512(SALT^SHA1(password))

...

This is an easy way to update existing unbreached databases with new hashing algorithms. It also increases the computational complexity at the same time, and -- as an added benefit -- creates a 'unique' combination of algorithms that can serve to further protect you. Later, some day, if I need to strengthen the hashing again, I can continue to add additional algorithms, using a third or fourth round of hashing the hash of the original password.
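In code, the trick looks something like this (a Python sketch; the hex-string storage format is my assumption, and I've left the salting out for brevity):

```python
import hashlib

def old_scheme(password: str) -> str:
    # How passwords were originally stored: SHA1(password), as hex.
    return hashlib.sha1(password.encode()).hexdigest()

def upgrade_stored_hash(old_hash: str) -> str:
    # Runs on the stored hash only -- no plaintext needed, so the whole
    # database can be upgraded in one pass, without waiting for logins.
    return hashlib.sha512(old_hash.encode()).hexdigest()

def new_scheme(password: str) -> str:
    # What the login check computes from the submitted password from now on:
    # SHA2-512(SHA1(password)).
    return hashlib.sha512(old_scheme(password).encode()).hexdigest()

db = {"alice": old_scheme("correct horse")}
db = {user: upgrade_stored_hash(h) for user, h in db.items()}
assert db["alice"] == new_scheme("correct horse")
```

The key property is that `upgrade_stored_hash` and `new_scheme` agree: wrapping the old stored value gives exactly what a fresh login computation produces.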

It might be more flexible if you instead moved to using a prefix in the stored hash which contained the meta information.

So your original stored hash strings are:

SHA1(password)

and your new ones would be:

HASHVERSION_2:SHA2-512(SHA1(password))

The only change being that you would explicitly be storing some metadata that would make it easier for you to identify which users had upgraded their passwords, and make it easy to change schemes in the future.
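A minimal sketch of that dispatch-on-prefix idea (again Python, again unsalted for brevity; the HASHVERSION_2 prefix is the example name from above):

```python
import hashlib

def sha1_hex(password: str) -> str:
    # Legacy scheme: plain SHA1, stored without any prefix.
    return hashlib.sha1(password.encode()).hexdigest()

def v2_hash(password: str) -> str:
    # Upgraded scheme, tagged so the login code can recognize it.
    return "HASHVERSION_2:" + hashlib.sha512(sha1_hex(password).encode()).hexdigest()

def check(stored: str, password: str) -> bool:
    # Dispatch on the prefix: each stored value says how to verify it.
    if stored.startswith("HASHVERSION_2:"):
        return stored == v2_hash(password)
    # No prefix means the legacy plain-SHA1 format.
    return stored == sha1_hex(password)

assert check(v2_hash("secret"), "secret")
assert check(sha1_hex("secret"), "secret")
```

Adding a HASHVERSION_3 later is then just one more branch, and a simple scan for rows without the newest prefix tells you who still needs upgrading.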

I will say that it's a clever and neat idea you have there of running the extra new higher-security hash on the OLD_HASH rather than on the plaintext, so that you can in fact upgrade the entire database any time you upgrade your hash algorithm, rather than having to wait until each person next logs in.

Well, thanks. I dunno if someone has done it before or not, but it seemed the only way to do it without waiting for users to log in. Necessity is the mother of invention and such. I'm sure others have done the same.

While reading those articles, I noticed the premise of what I'm doing is very similar to what they suggest with, say, PBKDF2... that algorithm apparently iterates the hash in a similar fashion, X times. Now, they are not rehashing the *text representation of the hash*, and instead are rehashing the last iteration's raw output, but I think the result is similar if they increase the iterations of the hashing algorithm. Of course, they go through far more iterations, making it more secure... except that it is not clear whether they allow for multiple algorithms to be used.

Of course, PBKDF2's intention isn't to allow instant updating of a database, but to provide strong initial security.
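For what it's worth, Python's standard library exposes PBKDF2 directly, which makes the iterate-over-the-previous-output behavior easy to see and try (the iteration count here is just an example figure):

```python
import hashlib
import os

# Derive a key by iterating HMAC-SHA256 100,000 times over the password
# and salt; each round feeds on the previous round's raw output.
salt = os.urandom(16)
dk = hashlib.pbkdf2_hmac("sha256", b"password", salt, 100_000)

# The result is a fixed-size digest (32 bytes for SHA-256)...
assert len(dk) == 32
# ...and the same inputs always reproduce the same key, which is what
# makes it usable for verification.
assert dk == hashlib.pbkdf2_hmac("sha256", b"password", salt, 100_000)
```

Raising the iteration count is how you slow the function down as hardware gets faster, which is the "stretching" the articles talk about.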

In my case, I actually did change the database field, just to be certain of which password hashes were updated (since I didn't initially do it all at once, but later did). So, while not storing metadata, I did implicitly give an indication of which accounts had been updated to the new algorithm.

I figured SOMEONE had done this before, as it is the ONLY way to INSTANTLY update an entire database. Still, sometimes the most simple things are overlooked. I don't know whether he was mentioning this as a method to improve security or update a database, but still.

After reading this thread, I'm actually going to go ahead and take it another step further and add one or two more algorithms on top. And that's the beautiful thing: how far it can be extended (infinitely). So long as nobody gets access to the code, they also won't know the algorithm.

If my ailing memory serves, he was definitely remarking on the security side; I think it was in relation to the data. I don't recall speed considerations.

But heh, I think I also walked into the "obscurity" theme a while back - that's one of those topics where it protects against lower-level threats and is somewhat useful, but I learned that you have to assume the algorithm could be discovered.

Re: Mouser, you were talking about the damage level of a breach of LinkedIn. I think it's worse than it sounds, because it's a dangerous phishing opportunity. I am grumpy at LinkedIn because they *already* turbo-spam people's address books - I got two separate ones, and that's from normal accounts. Give attackers an hour logged in and all kinds of fun could happen.

If all the algorithms chosen are secure, it should be good... real good. I am not a cryptologist or mathematician, though. I think with each iteration it would grow in strength. Who knows, I may be wrong. Of course, the larger the digest size, the better.

You know what really pisses me off though, about hackers in general? It is *MUCH EASIER* to breach a site than it is to keep one secure. They think they are so smart for exploiting a site, etc... but they have the easier task in almost all cases. Of course, 99% of them are just using exploits discovered by other people, then think they are so brilliant for doing so.

Just like it is easier to DESTROY than it is to CREATE, true of everything ... same with security.

By layering on the same algorithm (or another one) you effectively increase the entropy each time you iterate the process.

That is what I thought. So as long as you don't throw a malfunctioning or insecure algorithm into the sequence - e.g. one that often hashes to 0 or something - you are good ;p. Myself, I have a policy of using *only* algorithms that produce at least a 512-bit digest. The exception is, of course, the first step in my sequence, which is SHA1, at only 160 bits.

Going on with my rant on hackers... part of the problem is how the media treats them. Calling them brilliant, etc... No, it takes brilliance to keep a server secure.

Right now, my #1 problem, and maybe mouser can sympathize, is not having the TIME to dedicate myself to constantly securing and monitoring my server. I have 10 different jobs, at least, here at my one man show, and web server admin is *definitely* a job in and of itself.

Part of the problem is that the media knows scary stories keep people interested (however bogus they tend to be). No one wants to hear the good news - unless it's on the level of a puppy being rescued from a mine shaft.

Indeed, Renegade is right, as always, but I wanted to comment on this when I got a chance, to elaborate on compression, since that is one field where I can claim expertise (being the author of more than one LZ77/LZSS derivative algorithm). Entropy in compression is different indeed, but similar too. In compression, it of course represents the minimum theoretical size you can squeeze the data into while it remains intact (reconstructable in decompression without loss).

In compression though, passing data through more than one compression algorithm does *not* improve entropy. In fact, it may decrease it.

Now, you can pass it through different pre-processing algorithms that re-arrange the data and THEN compress it, which improves compressibility, but most compression algorithms have these pre-processors built in. And those are not compression algorithms; they are pre-processing/re-arranging algorithms. For example, with PECompact, by making tweaks to x86 code before compression, the compression ratio can be improved by 20% in many cases, depending on the code (could be more, could be less). LZMA now has this pre-processor (known as BCJ2) built in. There are MANY more that target different types of data. By making these tweaks, you improve the chances for a 'match' in dictionary-based compression (where the compressor matches data it has already seen and emits a backward reference to that data, thereby saving space).

My POINT is to MAKE SURE that nobody misunderstands Renegade's accurate and wise comment as meaning they should pass their data through more than one compression algorithm. I *hate* seeing this - ZIPs inside of RARs, inside of ZIPs, etc. Absurd. Don't anybody do that, please.

Oops. Sorry about that. You're quite right. Successive compression doesn't guarantee size reduction, and in fact often results in larger files. I didn't clarify that properly and left it open to the wrong impression.