BK: I’ve heard people say, you know this probably would not have
happened if LinkedIn and others had salted the passwords — or added
some randomness to each of the passwords, thus forcing attackers to
expend more resources to crack the password hashes. Do you agree with
that?

Ptacek: That’s actually another misconception, the idea that the
problem is that the passwords were unsalted. UNIX passwords, and
they’ve been salted forever, since the 70s, and they have been cracked
forever. The idea of a salt in your password is a 70s solution. Back
in the 90s, when people broke into UNIX servers, they would steal the
shadow password file and would crack that. Invariably when you lost
the server, you lost the passwords on that server.

Ptacek doesn't really explain why this is the case--he only says that salt has not prevented this type of attack in the past.

My understanding is that salt can only prevent pre-computation of password hashes because of the space needed to store the precomputed hashes. But if you have compromised the system, you will have both the salt and the hash. So the time to dictionary attack the hash does not change significantly (just an extra concatenation of the salt to the dictionary word).

Of course, it depends on the size of the salt and its visibility. If you use the old Unix salts, there are only 4096 of them and they are stored along with the hashes, so an attacker can still compute rainbow tables and know which one to use. It makes the job harder, but still feasible with rainbow tables and a lot easier than brute force. Now if LinkedIn used 256 bit pseudo-random salts and did not store the salts or seeds in the database, that would have prevented rainbow tables and recovery/warning tools like leakedin.org
– Major Major, Jun 11 '12 at 19:56

8 Answers

Krebs follows up on this question, and Ptacek does clarify what he meant:

BK: Okay. So if the weakness isn’t with the strength of the cryptographic algorithm, and not with the lack of salt added to the hashed passwords, what’s the answer?

Ptacek: In LinkedIn’s case, and with many other sites, the problem is they’re using the wrong kind of algorithm. They use a cryptographic hash, when they need to use a password hash.

In the next couple of paragraphs, he also elaborates on the reasons for it. The long and the short of it is that SHA1, with or without salt, is far too fast to be used as a password hash. It is so fast that, when computed on a GPU or similar hardware, an attacker can test on the order of billions of candidate passwords per second. As he explains later in the interview, LinkedIn should have been using bcrypt, an adaptive hash that, with a suitable work factor, would have slowed brute forcing down to the order of tens of hashes per second.

tens of hashes per second on a GPU? If that was really true, systems logging in and creating hundreds of accounts per second (such as linkedin) would need several dozens of servers just to calculate the hash. Not really feasible. Even with blowfish, a GPU can dish out millions of hashes per second
– Razor, Jun 12 '12 at 8:31

@Razor - bcrypt lets you set a "work factor" to decide how much computing power it should take to calculate the hash, so you can decide what tradeoff between speed and security is right. (You can also bump that number up as computers get faster.) That said, if you have enough server power for your actual site to function, hashing passwords will just be a tiny part of that.
– Nathan Long, Jun 12 '12 at 10:28

Salt, dictionary attacks and rainbow tables

A salt massively helps against dictionary attacks in the common case of an attacker getting access to more than one password hash.

Without a salt, an attacker will sort all the stolen hashes. He will then hash the first word from the password dictionary and check whether the calculated hash is in his sorted list. With a bit of luck, that single hash computation already gives him access to multiple accounts.

But with a salt, he has to attack each account separately.
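The difference in attacker effort can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the account names, passwords and wordlist are made up, and plain SHA1 is used because it is the algorithm discussed in this thread, not because it is a good choice.

```python
import hashlib
import os

def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

# Hypothetical accounts and wordlist, invented for this sketch.
accounts = {"alice": b"password", "bob": b"letmein", "carol": b"password"}
wordlist = [b"123456", b"password", b"letmein", b"qwerty"]

def crack_unsalted(stolen, words):
    """stolen maps user -> hash. One hash per word is checked against every account."""
    users_by_hash = {}
    for user, digest in stolen.items():
        users_by_hash.setdefault(digest, []).append(user)
    cracked, hashes_computed = {}, 0
    for word in words:
        hashes_computed += 1
        for user in users_by_hash.get(sha1_hex(word), []):
            cracked[user] = word
    return cracked, hashes_computed

def crack_salted(stolen, words):
    """stolen maps user -> (salt, hash). Every account needs its own pass over the words."""
    cracked, hashes_computed = {}, 0
    for user, (salt, digest) in stolen.items():
        for word in words:
            hashes_computed += 1
            if sha1_hex(salt + word) == digest:
                cracked[user] = word
                break
    return cracked, hashes_computed

# Build the two "stolen" databases.
unsalted_db = {user: sha1_hex(pw) for user, pw in accounts.items()}
salted_db = {}
for user, pw in accounts.items():
    salt = os.urandom(16)  # a fresh random salt per account
    salted_db[user] = (salt, sha1_hex(salt + pw))

cracked_u, cost_u = crack_unsalted(unsalted_db, wordlist)
cracked_s, cost_s = crack_salted(salted_db, wordlist)
print("unsalted:", cost_u, "hash computations")
print("salted:  ", cost_s, "hash computations")
```

Both attacks recover the same weak passwords, but the unsalted attack costs one hash per dictionary word no matter how many accounts were stolen, while the salted attack's cost grows with the number of accounts.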

Rainbow tables are just a special case of dictionary attacks, in the sense that they have been done before the attack and are ready to use.

It's important to note: Sticking a constant random string to all passwords will render prebuilt rainbow tables useless. But it is still a bad idea because of the parallelism issue described earlier.

Speed, algorithms for documents vs. for passwords

In addition to not using a salt, LinkedIn used a hash algorithm which is very fast, and which can be executed on special hardware for even more speed. This special hardware is not exotic; it is part of common graphics cards.

2 billion per second using the Radeon HD 7970. A 6 [character] password [in] 500 seconds [...] with brute force.
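The arithmetic behind a figure like this is just keyspace divided by hash rate. Here is a rough sketch; the alphabet sizes are my assumptions, since the quote does not say which character set was used. (For what it is worth, 500 seconds at 2 billion hashes per second corresponds to 10^12 candidates, which is what an alphabet of about 100 symbols gives for 6 characters.)

```python
def brute_force_seconds(alphabet_size: int, length: int, hashes_per_second: float) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    keyspace = alphabet_size ** length
    return keyspace / hashes_per_second

RATE = 2e9  # hashes per second, the figure from the quote above

# Alphabet sizes are assumptions: lowercase, alphanumeric, printable ASCII.
for alphabet_size in (26, 62, 95):
    for length in (6, 8, 10):
        seconds = brute_force_seconds(alphabet_size, length, RATE)
        print(f"{alphabet_size:3d} symbols, length {length:2d}: {seconds:,.0f} s")
```

Each extra character multiplies the time by the alphabet size, which is why long passwords stay out of reach even at these speeds, while short ones fall in minutes.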

The original use case for hash algorithms was to sign documents. Therefore being fast was a design goal.

Modern hash algorithms for passwords are designed to be relatively slow. A simple way to make them slower is to apply the hash function repeatedly. ShaCrypt and BCrypt are such algorithms. They also take care to hinder parallel processing and to resist pre-image attacks on a single round.
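As a minimal sketch of the "repeat the hash many times" idea: neither bcrypt nor ShaCrypt is in the Python standard library, so this uses PBKDF2 (mentioned elsewhere in this thread) as a stand-in. The function names and the iteration count are my own choices for illustration, not a recommendation from the interview.

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000):
    """Return (salt, iterations, digest). The iteration count is an assumed value."""
    salt = os.urandom(16)  # per-password random salt, stored next to the digest
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest

def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and iteration count, then compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison

salt, iterations, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, iterations, digest))
print(verify_password("password", salt, iterations, digest))
```

The iteration count plays the role of a work factor: the legitimate server pays the cost once per login, while an attacker pays it for every guess, and the count can be raised as hardware gets faster.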

Scrypt takes this one step further: In addition to being slow, it requires lots of memory (e.g. 16 MB for the default configuration). Specialized hardware usually only has access to about 1 KB of fast internal memory. Access to main memory is slow. Therefore building a fast scrypt cracker in hardware gets expensive very quickly.
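For illustration, Python exposes scrypt as hashlib.scrypt when it is built against a sufficiently recent OpenSSL. The parameters below (n=2**14, r=8, p=1) are an assumption on my part, chosen because they give roughly the 16 MB memory figure mentioned above; the fixed salt is only there to keep the sketch short.

```python
import hashlib

# Assumed cost parameters; memory use is roughly 128 * N * R bytes.
N, R, P = 2**14, 8, 1
approx_memory_bytes = 128 * N * R

def scrypt_hash(password: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key; maxmem is raised so the chosen parameters are accepted."""
    return hashlib.scrypt(password, salt=salt, n=N, r=R, p=P,
                          maxmem=64 * 1024 * 1024, dklen=32)

# A fixed salt for demonstration only; real code would use os.urandom(16) per password.
demo_salt = b"0123456789abcdef"
key = scrypt_hash(b"password", demo_salt)
print(approx_memory_bytes // (1024 * 1024), "MiB per hash, approximately")
print(key.hex())
```

Every guess the attacker makes needs that much memory for the duration of the computation, which is what makes massively parallel cracking hardware expensive.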

Using a constant random string as a salt will render prebuilt rainbow tables useless (unless one of the rainbow tables happens to have used that same random string as its salt) but it is much less helpful than using a different salt for each password. Using a constant salt, an attacker can build a hash dictionary using that salt and compromise all accounts using a password in the dictionary in one pass. All accounts using "password" as the password would be discovered in one pass. Varying the salt would force the attacker to break each account separately, making the system much more secure.
– Major Major, Jun 11 '12 at 20:07

@MajorMajor, ah you got it the other way round. For the constant-"salt" case, I propose this: Sort the password database and use that one to look up the freshly calculated hashes. This is way more effective than building a dedicated rainbow table first, with many entries that are not in the password database anyway.
– Hendrik Brummermann♦, Jun 11 '12 at 20:19

Hendrik, what I wrote agrees with what you wrote, but I wanted to clarify your comment that using constant salts would render prebuilt tables useless. With tiny salts the tables might have been prebuilt anyway and with huge salts you still crack all accounts with the same password in one go. For example, if you have an account on the system then you can immediately find everyone else who is using the same password as you. I was, as always, concerned that a newbie might misunderstand a remark and end up doing something stupidly insecure.
– Major Major, Jun 11 '12 at 20:51

You are correct, however that doesn't change the fact that it is essential to use a salt. In this case attackers got hold of the hashed passwords, so they could either use a rainbow table or start a brute force or dictionary attack.

A rainbow table will get you all the passwords (up to the size and complexity in the table) in a very short space of time.

Likewise, a dictionary attack will get you the passwords which are in dictionaries.

Brute forcing will get you the short and simple passwords quickly, but the time taken to get the longer ones quickly becomes so prohibitive that users with long passwords are still relatively safe.

So a salt removes that first set of possibilities, forcing the attackers to use dictionary and brute force solutions - making the users safer.

What a salt does is it renders rainbow tables useless, which does slow down bruteforce attempts on the password.

Definition of rainbow table:

A rainbow table is a precomputed table for reversing cryptographic
hash functions, usually for cracking password hashes.

Without a salt, an attacker could easily take a pre-generated rainbow table containing millions of passwords and their hashes and compare it against the stolen password hashes.

With a salt, every password requires the attacker to generate an entirely new rainbow table.

It has no impact on dictionary attacks against a single hash - easy, obvious dictionary-based passwords like 'password' will be cracked easily with or without the salt.

A salt SHOULD be used however. Password cracking is all about time/effort. No password/hash is invincible. It is all about forcing the attacker to spend more time than he is willing to spend on your password tables.

"With a salt, every password requires the attacker to generate an entirely new rainbow table." then why bother with tables at all???
– curiousguy, Jun 11 '12 at 15:29

As far as John the Ripper (the software I use) is concerned, password cracking goes like this: take a list of candidate passwords, hash them, and compare against the known hash. This is practically no different from generating a rainbow table for each salted hash and comparing against it.
– Terry Chia, Jun 11 '12 at 15:35

So if you were using a CorrectHorseBatteryStaple style password, and the table was salted, your password would probably be safe, because they'd brute force all the "low hanging fruit" and probably not worry about that last 5% or so of the users that used the most secure passwords. Heck, hackers might be happy with brute forcing just 20% or so of the passwords.
– aslum, Jun 11 '12 at 16:31

@TerryChia "This is practically no different with generating rainbow tables for each salted hash and comparing it." Huh? This is entirely different. Generating tables requires lots of space, and is slower than directly trying out passwords.
– curiousguy, Jun 11 '12 at 17:17

@Ramhound "the first thing they do is attempt to string multiple dictionary words together." which password cracker are you talking about?
– curiousguy, Jun 12 '12 at 14:18

Salts prevent parallelism. Parallelism is generic in the whole space-time continuum; by this, I mean that it can be applied space-wise and time-wise:

Space-wise parallelism is when the attacker has several hashes to crack (that's the LinkedIn situation). With unsalted hashes, the attacker can hash one potential password and look the result up in the whole list of hashes he wants to crack.

Time-wise parallelism is when the attacker precomputes hashes of common passwords, into a big table (rainbow or not), to apply on hashes that he subsequently obtains.

What Ptacek meant was that:

Salts do only half the job; to really slow down the attacker, you need an inherently slow hashing process, like bcrypt.

When there are many users, at least some of the passwords will be so weak that they will be cracked, regardless of how much salted bcryptness you do.

So, while salts would have somewhat enhanced the situation for LinkedIn, they would not have saved them, only delayed and somewhat diluted the trouble. Using bcrypt or PBKDF2 would have further improved things, but not to the point that the breach could be totally ignored.

If an attacker exploits a system, and the salt is also compromised, the difference is that pre-computed rainbow tables will be useless. (Any particular password could be stored in a number of different ways, depending on salt size.) Rainbow tables being a trade-off between time and space, a lot more space would be needed.

Hope it helps. An example is salt usage in the shadow file in UNIX, described clearly here: Why shadow?

Ptacek doesn't actually say that a salt would not have prevented LinkedIn passwords from being cracked. I would have to argue that depending on the size of the salt it would have even protected the worst of the passwords.

SHA1's one weakness here is its vulnerability to a brute force attack. All you have to do is generate X hashes ahead of time and compare the SHA1 hash of a string to your generated list of hashes.

With a salt, depending on its size, generating that list ahead of time would take a very long time and a very large amount of disk storage, because you would have to generate the same list for every possible salt. At that point your only option is to try every single combination and hope you find a match. If you are able to generate a unique salt per user without storing the salt in a database, you are even more secure.

In other words... Instead of every single password being cracked... LinkedIn would only be looking at a very small percentage of their users' passwords being leaked (the worst of the worst).

Update

Where would you store the key?

If done the correct way you would not need to store it. Just generate a salt based on the username, the time the account was created, or a combination of the two, and you would have a unique salt for the given user.

So unless the source of your website itself was leaked, hackers wouldn't be able to generate it. You could even make this more secure by generating a new salt when the user changes the account's password.

"If you are able to generate a unique salt per user without storing the salt in a database" how could you?
– curiousguy, Jun 11 '12 at 15:28

@curiousguy: You could calculate salt from the username, registration date, or some other unchanging profile field, using an algorithm/key that's unknown to the attacker (i.e. not stored in the database).
– Jesse McGrew, Jun 11 '12 at 17:10

If you generate the salt from some user related data, you allow the attacker to attack multiple instances of the hash (for example on separate systems) for the cost of 1 attack. If you rely on your source code to stay secret, you enter the realms of security through obscurity.
– Jacco, Jun 12 '12 at 10:48

@curiousguy In the vast majority of cases I don't see how you can use a salt that you do not store. But if you are not keeping http logs and you operationally assume that your clients will always have the same IP address (neither of which is reasonable) then you could use the client's IP address as a salt. Without making some sort of unreasonable assumptions, I don't see how you can have a salt that is not stored on your system.
– emory, Jun 12 '12 at 14:25

Note that a "secret" component in password hashing (such as your server-specific key) is usually not called a salt, but a pepper (a salt is not considered secret).
– Paŭlo Ebermann, Jun 15 '12 at 22:29

In the most basic terms, the way I understand it, all that salting does is make the password longer.

The original password was 'password'. But before computing the hash, I appended the salt cGFzc3dvcmQ to the password to make passwordcGFzc3dvcmQ. What does that do? Well, suppose you happen to have a rainbow table of common passwords and their hashes,

password    hash
password    5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8

where you can see at a glance that SHA1('password')=5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8. Without a salt, looking up the original password from the hash is all too simple. We want the cracker to have to do some work for it.

So we go "Dyoh! We either prepended or appended THIS RANDOM STRING RIGHT HERE to the original password! Now your hash tables of common passwords and their hashes are useless"

Which they are. Because we complexified each password by the random string, basically "making the user's passwords better", and completely screwing up the hashes so they are completely uncommon now.

So salting makes it so each hash has to be cracked individually, because to do it, if the cracker has the salts, the cracker has to append/prepend the salt to each common password he tries. 2 different users could use the simple password password, but cracking that will be 2 entirely separate jobs for the cracker, because of the salt. So cracking all the passwords will take longer.
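Here is a small sketch of that point. It uses SHA1 only because that is the algorithm under discussion, and random 16-byte salts in place of the fixed example string above; the function name is made up.

```python
import hashlib
import os

def salted_sha1(password: bytes, salt: bytes) -> str:
    """Hash the password with the salt appended, as described above."""
    return hashlib.sha1(password + salt).hexdigest()

# The unsalted hash is the same for everyone and sits in every lookup table.
unsalted = hashlib.sha1(b"password").hexdigest()

# Two users with the same password but different random salts.
salt_a, salt_b = os.urandom(16), os.urandom(16)
hash_a = salted_sha1(b"password", salt_a)
hash_b = salted_sha1(b"password", salt_b)

print(unsalted)
print(hash_a)
print(hash_b)
```

The two salted hashes differ from each other and from the unsalted one, so a precomputed table does not match either of them, and the cracker has to redo the guessing work for each user's salt.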

The article here says that, because of the compute power available today, salted hashes are nevertheless trivial to crack.