(I did search on this topic, but I found no complete question/answer that addressed it, or even good portions of questions that might be relevant.)

I'm implementing a salt function for user passwords on my web page, and I'm wondering about some things.

A salt is an extension added to a password and then hashed, meaning the password is stored in the database as hash(password+salt). But where does one store the salt? A salt is simply to make rainbow tables "useless", right?

Couldn't an attacker just build a rainbow table, then convert all the hashes and remove the salt? After all, the salt is stored somewhere in the database. Therefore, if one can find the hashed passwords one should be able to find the corresponding salts. It seems to me that a salt only makes passwords longer, forcing the rainbow table to run longer.

So how does one maximise the effectiveness of a salt? I don't really see multiple security reasons for it, aside from making the passwords longer dynamically, but then again one could convert them to bits in a string instead.

Are my assumptions about how a salt works correct? If not, how should I store it and the salted passwords correctly?

Note: Salt does not make a brute force attack on a single entry harder than without it. But, if 2 users use the same password, the salt makes the hashes different.
–
SinisterMJMay 8 '13 at 15:12

The salt is normally added (salt + password); the other way around I've seen the "beginnings" of the hash be the same and only the tail end of it is changed by the salt. Also, the salt is added in such a way that its as if the user typed 3453535335325mypassword where 3453535335325 is a salt, so no, the salt can't be removed from the hash because it becomes prepended to the password. You'd have to unhash the password + salt first, but if you could do that, the hash would be pretty broken.
–
AndyMay 8 '13 at 16:27

1

I won't repeat the solid work in what others have said, just that GPUs have made ripping through salted hashes so easy that it makes rainbow tables look cumbersome. Use a real password hash like bcrypt, not MD5
–
Rich HomolkaMay 9 '13 at 2:50

2

"I'm implementing a salt function for user passwords on my web page, and I'm wondering about some things." Just use bcrypt! (There's nothing wrong with asking questions to understand salting, but don't implement it yourself in a production setting)
–
MaxMay 10 '13 at 7:54

1

@ThomasAndreèLian IF you want to learn, pick a known good algorithm like bcrypt and implement it yourself. This has two advantages - 1) You actually learn something useful. 2) You get test cases which you can use to verify if your implementation is broken.
–
Terry ChiaMay 10 '13 at 12:56

7 Answers
7

You are fundamentally correct that it is just making the password longer but there are some additional things that it does add. Also the average password is not that long or secure and adding length "for free" is rarely bad. The salt makes it so that an attacker can't hash "password" once and then look for every user that has a password of "password".

Instead, they would have to generate hashes of "password03j9834nfp-3028n408" and "passwordn0438wnvas89v5sne" and... This greatly increases the protection of identifying insecure passwords and still has a positive benefit on increasing the difficulty of finding secure passwords.

Where your understanding goes astray is when you say "Couldn't an attacker just build a rainbow table, then convert all the hashes and remove the salt?". There are a few problems here.

The first is that of collisions. A rainbow table is not guaranteed to produce the original input, it is only interested in producing AN input that produced the desired output. Since the salt will be added to whatever a user enters, the attacker must find the exact input that was used for the hash.

Since the exact input has to be found, the benefit of a rainbow table to leverage collisions is lost. Further, the size required to brute force all possible salts would be too large for pretty much any attacker to hold.

You also can't simply remove the salt from a hash without figuring out what the original input was. You would have to look for some value ending in the salt that corresponds to the given hash. This then requires a separate rainbow table to be produced for every salt value in the DB and this defeats the ability to pre-compute since it is infeasible to make a rainbow table for every possible salt.

So, it makes passwords longer AND guaranteed to be unique AND eliminates hash collisions, not 'just' making passwords longer ...
–
schroederMay 8 '13 at 15:37

1

Eliminating hashing collisions is a complete non-issue here. That matters in hash tables, and in hash tables you cannot append add random bits to the search keys just to eliminate collisions, because then you're changing the search keys. So "Bob" is not found in the hash table, because, silly you, you should have been searching for "Bob0z+dKl".
–
KazMay 8 '13 at 16:12

1

@schroeder - From a security perspective, a very long, very secure password is not going to be any easier to break than a simpler and salted password, at least as far as brute forcing is concerned. It offers a slight gain in terms of unconstrained input from a collision. That's why I said fundamentally correct. There are some intricacies that are gained, but the main point is extending the complexity to something that can't simply have all salt/password combinations built in to a rainbow table.
–
AJ HendersonMay 8 '13 at 16:52

1

@Kaz - hashing collisions are relevant in the other direction. Not for rainbow tables specifically, but if I find that hso8urn0 and password both resolve to a hash value of 12345678, then I can use hso8urn0 as the password. Since the salt is going to be added to my user input however, I have to find a collision that ends with the salt, which means the majority of possible collisions will no longer work.
–
AJ HendersonMay 8 '13 at 16:55

@AJHenderson I'm not taking you to task, I'm saying that you have value in your answer that you might be undermining with a single word in your intro.
–
schroederMay 8 '13 at 17:16

A rainbow table or a hash table is built by an attacker prior to an attack. Say I build a hash table containing all the hashes of strings below 7 characters for MD5. If I compromise your database and obtain list of hashes, all I have to do is lookup the hash on the table to obtain your password.

With a salt, you cannot generate a rainbow table for a specific algorithm prior to an attack. A salt is not meant to be secret, you store it alongside the hash in your database.

x = hash(salt+password) You will then store it in your database in the format of salt+x This renders rainbow tables and hash tables useless.

As usual don't roll your own, use bcrypt, scrypt or pbkdf2 which takes care of all the details including salting for you. See How to securely hash passwords?

A rainbow table is an optimization for reversing hashes by brute force. It works by a trade-off: you do a lot of precomputation to build a huge data structure, and you can then crack many hashes quickly.

A rainbow table only helps the crack hashes in the search space that it covers. Concretely, rainbow tables are built for plaintexts made of printable characters and [up to a certain length. The basic principle of adding a salt is that the plaintext that is hashed contains both the password and the salt; with the added salt, the search space becomes too large to build a rainbow table.

Thus, by adding a salt to the password, there is no way to amortize the cost of a brute force attack over many cracks. The attacker has to do all the work for each hash, starting only when he knows the salt.

Given a hash and a salt, there is no way to “remove the salt”. This is a basic property of a hash function: even if you know that two strings are related (e.g. you know that password is a substring of password+salt), it doesn't help you find hash(password) knowing hash(password+salt). There's no need to hide the salt from the attacker: it needs to be unique (and not derived from the password) but it doesn't need to be more secret than the hash. (You can add an additional secret salt, which is called a pepper, but it's only useful in a narrow set of circumstances.)

Salting the hash is only half the battle. Another part of securely hashing passwords is that the hash function must be slow, because that hurts the cracker a lot more than the verifier. Don't roll your own, use a method that has been vetted by experts. Use bcrypt or PBKDF2 or scrypt. Implementing password storage should not involve doing any cryptography yourself, call a library function.

For everything you wanted to know about hashing passwords, everything you didn't want to know about hashing passwords, and everything you didn't even know you might know about hashing passwords, read How to securely hash passwords?

Suggestion: Change "it needs to be unique" to "it needs to be unique for each password".
–
IsziMay 8 '13 at 14:48

@Iszi: Technically, having the same salt for two different passwords leaks some information too: namely, the fact that those passwords are different (since, if they weren't, the entire hash would be identical).
–
Ilmari KaronenMay 8 '13 at 16:02

1

@Iszi: Considering that it is extremely unlikely to ever generate the same random salt twice, and that if there is a salt collision you can just generate a new one for marginal cost, I don't think that's a very useful suggestion.
–
JorenMay 8 '13 at 16:21

1

@IlmariKaronen Since there's no way to tell whether the password is identical to another password out there (maybe on a different server), the server can only enforce global uniqueness (by generating a random salt), not per-password uniqueness. And as already noted by Ilmari Karonen a per-password uniqueness scheme could leak that two passwords are equal. So why would you want to bring in the notion of per-password uniqueness?
–
GillesMay 8 '13 at 16:43

1

@Iszi [Oops, I picked the wrong autocomplete target in my previous comment. Glad you found it anyway.] The salt needs to be unique across password databases as well. For example, if everyone used 1,2,3,… as salts, then it would only take a slightly bigger rainbow table to account for each hashed text having these extra digits, even though the salts would be unique per user on any given server.
–
GillesMay 8 '13 at 18:37

A unique salt solves one problem -- every account can't be attacked simultaneously in one giant brute-force attempt.

Let's say you tried building a rainbow table of all printable ASCII passwords that were 8 characters long1. That's 968 ~ 7.2 million billion (7.2 x 1015) possibilities. If you had a GPU that generates passwords at a billion per second, that will take roughly a month or so to crack (and at 200W at $0.10 per kWhr it's about $200 on average; ~$400 worst case for your electric bill).

So you find out the hashes of every account on linkedin from some sort of SQL injection or finding a harddrive with an old backup. You have hashes of a say a million users. Now if they are unsalted hashes, you can break the vast majority of these hashes in two months with one GPU for about ~$400 of electricity. If they are all salted uniquely and you want to break all million of them, it will now take $400 million dollars of electricity as each unique salt has to have its own independent attack. Now before you say, I'll build a bigger rainbow table that includes the salts, realize that at minimum a typical salt has say 5 hex characters (16**5) which will mean it will take a million times longer to create your rainbow table (e.g., you'll need to spend $400 million on the electricity to generate it).

Now add in key strengthening where instead of doing a hash once, you iterate the process N times (e.g., a three-time iterated hash function would be: hash(salt+hash(salt+hash(salt+pw)))). Now instead of being able to break a billion hashes per second, they can break a billion/N hashes per second, so all attacks will be N times more expensive. (A typical N is 5000; so to break a single hash would take about $2 million in electricity; the problem with super large N is that its more computational resources on your servers for verifying password attempts).

1 - This isn't the most effective way to crack passwords. Dictionary lists of millions of previously used passwords is much more effective, as many passwords are quite weak. I don't recommend people generating passwords themselves (typically very low entropy) or reusing passwords. Use a service like keepassx that allows you to create random passwords for each site and store in an encrypted file. Key strengthening + unique salt means that its infeasible to try and break any but the simplest hashes.

Rainbow tables are precomputed hash tables which are used to crack the passwords in a relatively quicker time because looking up a table is much faster than calculating a hash.

if one can find the hashed passwords one should be able to find the corresponding salts

The salt known to the attacker will not create a big problem if the attacker is trying to crack a single password. The salt will make it difficult and time consuming to crack a list of passwords because for every password the salt is different.

But where does one store the salt? A salt is simply to make rainbow tables "useless", right?

Salt is stored in the DB only and for every password you have a different random salt.Yes the salt makes it very difficult to use a rainbow table. The rainbow tables are generally created using common passwords that are used. The addition of a random salt makes it very difficult to be cracked by a rainbow table.

Couldn't an attacker just build a rainbow table, then convert all the hashes and remove the salt?

May be you mean to say building a rainbow table with known salts (before carrying out the real attack) because you cannot just remove a salt from the hash. Recomputing the table is not recommended because if you take into account all the known salts the size of the table will be too big.Remember space/time trade off. If you create the tables for one salt at a time you are wasting too much energy. The whole purpose of creating a rainbow table is to crack a password list and not a single password.

Another very important value addition that salting provides is no two users are going to end up with an identical password hash because the salts will be different. Imagine an attacker is able to crack one of the passwords and if salts are not used the attacker simply has to do a look up to find out which other users are using the same password.

There are a couple of factors involved in salting, and you're missing (at least) one of those.

They make rainbow tables useless. If someone looks in your password database and sees that a password is "5f4dcc3b5aa765d61d8327deb882cf99," they don't even need to build a "rainbow table," they can just google for that value and google will tell them that it's the md5 hash of "password". If the passwords are hashed before being put in your db, the "rainbow table" lookup mechanism will be useless. You don't need to store your salt in the db to make rainbow tables useless. Your salt can just always be "xxxx" or any other arbitrary string. If you google for 6a316e1fdac8a61d9c7a2ed1cba4a804 (the md5 hash of "xxxxpassword"), you don't get results.

If done properly, the salt generally means that an intruder needs both your database and your code in order to crack your passwords. You could have hashed your password as "xxxx" + password or pw.substring(0,2) + "xxxx" + pw.substring(2). The attacker doesn't know how you salted your passwords unless he has stolen your code too. Your web server often isn't running on the same box as your database, and with compiled programming languages your code might not even be available on your web server either. Regardless of what salts you store in your db, I would recommend also having a long, arbitrary, fixed salt stored outside the db that is additionally concatenated to the password before hashing.

Unique salts make cracking more expensive. As someone else pointed out, if you hash each dictionary word with a static salt ("xxxx"), then you can scan the whole password database for "6a316e1fdac8a61d9c7a2ed1cba4a804" looking for anyone who's password is "password." If each row uses a different salt, then you have to perform N hashes just to find out if one of N users has the password "password."

On the second point, beware of only using a single fixed simple salt without any other. A cracker could try hashing arbitrary strings + "password" until it found someone who's password was "password," and then would have essentially cracked your salt. In any large database of passwords there is bound to be at least one user who's password is some trivial password like "password."

“If done properly, the salt generally means that an intruder needs both your database and your code …” That's not a salt, that's a pepper. They serve different purposes, and the usefulness of the pepper is dubious. Regardless of whether you use a pepper or not, a unique salt is necessary. “With compiled programming languages your code might not even be available on your web server either” makes no sense: even with a compiled language, the pepper is present in the binary.
–
GillesMay 9 '13 at 8:46

@SLaks: Interesting, I'd never heard that name. Gilles: There's nothing "dubious" about the use of a "pepper." I've seen plenty of cases where companies were unable to maintain the security of their databases, while an intruder didn't gain access to the application code. A "pepper" doesn't have to be a string, but can be an algorithm. How you mix the salt into the password, the number of iterations by which you re-hash, etc, are all a kind of "pepper." Sure the algorithm or string is in the binary somewhere somehow, but it won't be easy to find it. It adds security beyond just salt.
–
BrianMay 10 '13 at 14:14

Firstly, why are you implementing low-level crypto code for a web page? You'd think this is a solved problem: you use a framework which uses libraries.

Secondly, the hash protect the passwords in case that the hashes leak out. Your site does not provide access to the password database, so this situation should ideally not even arise. The salts help to defend the password database as a whole in that event.

If the attacker is focused on a single password out of that database, then it doesn't make much difference. It only makes the job harder in the sense that the attacker can't just retrieve the password from a rainbow table by looking up the hash. The brute-forcing of a password with a salt is only slightly slowed down because a few more bytes are hashed which costs a few more machine cycles in the hashing rounds.

Once upon a time, Unix systems providing public access (such as campus systems for students) had their passwords cracked left and right. There were various reasons why this was easy, but the main problem was a simple one: the sensitive password hash information was kept in the same file as the other information fields about a user, and that file, /etc/passwd was world-readable. The fix is to put the sensitive information into a separate file, /etc/shadow. Presto: that ends the password cracking by script kiddies who have unprivileged, legitimate accounts on the system.

Similarly, nobody is going to be brute forcing your passwords unless you let them leak out.
If they leak out and someone is after a particular user's password, there is little that can be done to stop them.

Users who use the same passwords on different systems and don't change it for years and
years are always going to be vulnerable. You can salt it, pepper it and add ketchup; won't make a difference.

Salts do deter someone who wants to crack the password database as a whole and gather as many different passwords as possible. Salts mean that each password has to be brute-forced individually rather than just retrieved from pre-computed tables. If a salt has N bits, then, at least ideally, the attacker's rainbow table database has to be 2^N times larger. A 32 bit salt means you need a four billion times larger rainbow table database to just look up passwords.

If your site only has 20 users, then of course there are only up to 20 different salts. So what an attacker will do is take the password dictionary and hash each entry in 20 different ways with those salts.

So salts not only protect your database against rainbow table lookup, but also protect it from brute force cracking by about a factor of N, where N is the number of users. To brute force N passwords, the attacker has to do N times the work as brute forcing one. (Assuming the salt is wide enough: at least as large as the base 2 logarithm of N. If a salt is only, say, 12 bits wide, then there can only be 4096 different salts, no matter how many users there are.)

Without salts, this is not true. Brute forcing any number of passwords at the same time is as easy as brute forcing one, so the more users there are, the bigger the payoff per effort.