Just to point something about salt : if the salt is totally hidden and secured, to anyone outside it'll be part of the password. Since a good salt is also random bytes, no one will be able to tell "there are N bytes of password, M bytes of salt". The complexity is only reduced when the attacker have the salt, not just it's lenght.
–
woliveirajrMar 23 '12 at 11:52

I agree. The distinction in my question between password and salt is rather pointless - but I left it there in order to prevent comments like "hey, you should use a salt also". However, the complexity should be reduced if the attacker knows the exact length of the total password (M+N). I have updated the question to be more precise about this.
–
mhswendeMar 23 '12 at 11:59

1

Strange choice? It is one that I've seen in use pretty often, that's the only reason. And what does the size of 64bit+ come from? What's behind that figure?
–
mhswendeMar 23 '12 at 13:12

2

@CodeInChaos I'd wager that in actual fact, typically passwords are hashed with MD5, less so SHA1, even less so with salts, and even less so using PBKDF2. While they should be using the scheme you mentioned, chances are, they aren't. ;)
–
SteveMar 23 '12 at 18:01

3

@SteveS Agree. I'd even wager that plain text password storage is more common than any hashing more complex than MD5.
–
mhswendeMar 24 '12 at 20:44

5 Answers
5

Let's say they randomly chose alphanumeric (A-Za-z0-9 no symbols) for both salt and password; e.g., the sample space is (62)^M possible salts and (62)^N passwords.

Say they have a million GPUs in a farm at their disposal that can each generate a billion hashes a second (assuming a simple MD5 or SHA type hashes - bcrypt or PBKDF based hashes are much slower). So for a given salt, they can crack a 8 character password in 0.2 seconds (200 000 GPU-seconds), a 10 character password in 14 minutes (26 GPU-years), a 12-character password in 37 days (100 000 GPU-years), a 16-character in password in 1.5 million years (1.5 trillion GPU years).

Or another way to think about it; a typical GPU to crank out a billion hashes a second uses ~200 W. So if electricity costs you $0.10 per kWHr and neglecting start up costs, a GPU-hour costs $0.02. So an 8-character password costs ~$1, 10-character $4600, 12-character $18 million, 16 character - $260 trillion (the world's money supply is on the order of $10 trillion). (And this neglects the cost of buying/maintaining a million GPUs -- just electricity).

As for a rainbow table -- at some point it has to be constructed. So you want a complete rainbow table for all 4 character prefixes; and want to cover all passwords up to 8 characters (e.g., total of 12 char password) it will take 100 000 GPU years (37 days of a million GPUs) and cost about $18 million in electric bill.

Basically when M+N > ~12 for random alphanumerics it starts to become unfeasible (e.g., is unfeasible at M+N = 16).

EDIT: New summary table, where I list password length and type (e.g., 8 A-Za-z0-9 means 8 character uppercase + lowercase + numeric password).
Also included is a google application-specific password (16 random lowercase letters) though obviously google type password typically would have to be attacked online (not like an offline hash attacked by a GPU farm).

@AviD - There is a limit that even governments can get away with. I don't count those governments that don't hold free elections, those are the exception to the rules, in those cases even those rulers are limited to how much money they actually have.
–
RamhoundMar 30 '12 at 12:35

1

@Ramhound very offtopic, but I think a certain western government is proving that there are no real limits. Quick - how many zeros in a trillion? ;-)
–
AviD♦Mar 30 '12 at 13:11

1

The unfeasibility arises when the cost/effort of the brute force exceeds the value of potentially brute forcing. Hard to imagine any password worth trillions of dollars. Much simpler to imagine a nefarious gov't just coercing passwords out of people ( relevant xkcd ) or paying tens of millions of dollars to introduce subtle back doors (say by introducing to a compiled open-source compiler )/poor randomization/poor hashing/keyloggers/services that log password attempts expecting reuse), etc.
–
dr jimbobMar 30 '12 at 15:52

1

@AviD "a certain western government is proving that there are no real limits." Unless someone in the government has found new sources of energy (and does not want to share), there are physical limits.
–
curiousguySep 1 '12 at 9:15

I apologize ahead of time because this answer won't be a direct answer to your question, but I wanted to post it so that people know that the question you're asking is really only an academic one. The real answer to this question is: don't use sha1, use bcrypt. The sha1's and md5's of the world were designed to be fast and fast is not a good quality for hashing passwords. Bcrypt is designed to allow configuration of the amount of time it takes to hash a password, so you can slow down the hashing process just enough that users won't notice the difference, but it will take orders of magnitude more time for an attacker to generate a rainbow table or brute force a password. For more information on bcrypt and why you should use it, see this question.

In addition to all of this, bcrypt also includes a salting scheme. Because of the reasons mentioned above, you can be pretty certain that the 20 byte salts mentioned by woliveirajr will be viable well into the future.

Rainbow tables represent a trade-off between CPU time and storage, so in theory the answer to this question is unknowable. It depends on which side of the trade-off your attacker favours: if they've got lots of CPU time available then upon encountering a new salt value the attacker can start computing a new table, so it's effectively worthless (this extreme is equivalent to the attacker being able to brute force a key without using precomputed tables). If they've got lots of storage then they can precompute tables for any value of the salt, but it'll take them a long time so the bigger the better.

Of course, things that don't work in theory often work very well in practice. The real world injects itself at this point and tells us that for many attackers, both storage and CPU time are limited. The longer the salt, the more attackers do not have the resources to mount a successful attack. Accepting that relationship, then there's an inequality that provides an upper bound on your choice of salt size:

The total resources on your side required to work with the salted hashes must be less than the resources required to satisfy your application's other requirements.

What I'm interested in, basically, is what the upper bound on M+N is that can be bruteforced in practice, today, by a dedicated attacker (as per the scenarios). Perhaps the focus that I put on rainbow table was wrong.
–
mhswendeMar 23 '12 at 13:29

@mhswende - In practice today is a totally different question. Many users will choose low entropy/reuse passwords that are vulnerable to dictionary attacks. E.g., if a malicious attacker runs a large web app and has been collecting passwords for a few years (and have say a 10 million common passwords in your dictionary), then if you have the hash and it was something simple like md5/sha, and the password is something someone else has used in your large dictionary; it will take you under a second to crack with a modern GPU.
–
dr jimbobMar 24 '12 at 4:19

Yes, naturally, that's why I specifically pointed out that the passwords were considered random in the context of the question.
–
mhswendeMar 24 '12 at 20:30

There are rainbow tables, and there is brute force... these are two distinct things.

A rainbow table is a special case of a big precomputed table of hashed passwords. As such, a rainbow table can "invert" a hash, i.e. recover a matching password, only if, at some point during the table construction, that password was considered and hashed. Correspondingly, if a password can be found with a rainbow table, then it could have been found with a simple dictionary attack which would have cost no more than the table building. The "rainbow" does not change that; in fact, the rainbow thing actually makes the table more expensive to build, by a factor about 1.7 (that's because the table construction tends to consider and hash several times the same passwords, and that's rather unavoidable).

A consequence is that no rainbow table is worth the effort unless it can be applied at least twice. We use salts precisely to prevent that from happening. The salt can be viewed as making a variant of the hash function, each new salt implying a new variant. A precomputed table is worth anything only if it was precomputed with the same variant (the same salt) than the hash value which is to be attacked. If no salt value is used more than once, then the intelligent attacker will not waste his time building Kleenex rainbow tables; he will just run
a dictionary attack.

We consider that the attacker knows the salt. Why ? Because the server knows it, and the attack model is that the attacker could obtain a dump of the server database. Whatever the server knows, the attacker knows too. Thus, when attacking a hashed password, the salt length or contents do not matter (the attacker must include the salt value in his computations, but no salt length will make his task easier or harder).

Thus, we only have the password as line of defence. If we want to know how much the attack costs, then this becomes economics, and, as such, some complexity arises. In particular, we want to know if we are talking about an attacker who is after one very valuable password (say, the password which protects the main computer of the alien force which is about to obliterate Earth), or an attacker who makes a living out of breaking many passwords. In the latter case, the hardware costs become negligible with regards to power consumption.

If we take @jimbob's estimates, hardware which computes 109 hashes per second uses 200W worth of power, and power comes at $0.1 per kWh (note that power cost includes cooling: every Watt spent on computation also becomes heat, which must be somehow dissipated). This gives us 1.8*1014 hash values per dollar. From that, we obtain the following:

With $10K, an attacker can try 1.8*1018 hash values, which is more or less the number of possible passwords of 10 alphanumeric characters (uppercase, lowercase and digits).

With 683.7 billions of dollars (that's the total US military budget in 2010), an attacker could try about 1.23*1026 hash values, corresponding to about 14.5 alphanumeric characters. Let me add that this figure corresponds to the yearly output of about one hundred nuclear power plants, so that cracking effort would hardly be inconspicuous.

Conclusion: with 15 random alphanumeric characters, your passwords will resist even implausible enemies, even if you totally botched the hashing by using a single invocation of SHA-1, instead of using bcrypt or PBKDF2 with a high iteration count, as you should do. Note that this is valid only for random characters, not at all for the kind of characters you may come up with in the privacy of your brain. Human brains are not good at all at randomness.

The use of a salt with passwords makes rainbow tables almost useless for attacks in most instances. With password length P and a salt length of S you would need space P * S with naïve brute force look ups (adding a random salt you have to create S tables for each password). Rainbow tables reduce the space required to store the lookups for the password but they don't eliminate the multiplication by S (my Uni maths isn't good enough to give you the correct Big O notation; that can be an exercise for the reader :P).

The calculation can then be thought of as the length of your accepted character set (c) raised to the power of password length (p) multiplied by the number of possible salts (s).

(c^p) * s

Suffice it to say, this can grow to be a very large number very quickly.

Scenario 1

Could a government build a system suitably large to break a password + salt scheme using rainbow tables? This depends on the size of passwords, character set available and salt size. Above a certain size you start hitting physical limits (can you actually encode all the permutations on a device using the matter/energy at hand?). Governments often have greater resources than most institutions but even they can't exhaust these physical limits.

The only way to get around this would be to try and use intuition or prior knowledge to
reduce the search space. For a commonly used system, peoples' passwords will be below a certain length so only bother checking those. They also will probably only use a subset of the character set available. You can use these to concentrate what hashes are used to populate your rainbow table (the p and c). But you will always hit against the fact that you multiply by the random salt (s).

You do have a random salt don't you!? This is where people start getting worried about the entropy of random numbers. It is pretty hard to exploit when non-random data is used where real random data is needed but this is one of those situations where it comes into play. Consider the previous problem where you've managed to reduce the (p^c) by making intuitions about the nature of the password. If we know something about how 'random' the salts are (or can influence them in some way) we can start reducing the size of s. We may know (or can influence) the s to be between a certain set of values and therefore drastically reduce the permutations on the rainbow tables. If an attacker can get this below reasonable physical limits then a government (that was so inclined) could use rainbow tables to crack the passwords.

(I hope this has answered your related question as well)

Scenario 2

At current 2012 levels I would say that, unless you were incredibly lucky, the likelihood of you being able to crack password + salt hashes (ignoring technical weaknesses in your implementation) as incredibly slim. This assumes you use reasonable lengths for p, c and s.