Some websites, even Stack Overflow, ask for at least 1 digit and 1 uppercase character in the password. Does this really matter when the developer uses a password-hashing algorithm to store the passwords in the database?

I am building a project where security really matters, but the users stubbornly keep using weak passwords. I want to motivate them to use strong passwords by using rules that really matter and educating them about the reasons for such requirements.

Is it sufficient to prevent phone numbers, their own name or dictionary words? Why not?

@RowlandShaw I already specified in the question that you can prevent the user from using dictionary words, please read the question first :)
– Mr. Alien, Jan 18 '13 at 8:28

@Mr.Alien- A dictionary attack doesn't mean trying all words in the dictionary. It means using a precompiled list of common passwords to try to break into an account. The original term comes from trying all dictionary words, but these days it's usually used to mean any type of attack based on guessing frequent passwords in a brute-force manner. Disallowing dictionary words can help mitigate this, but it certainly won't prevent it.
– templatetypedef, Jan 18 '13 at 8:38

3 Answers

Regardless of hashing, the inherent weakness of passwords is that they are chosen and remembered by humans. Humans are not good at such jobs. They will choose and remember passwords from a rather small set of possible passwords, namely words which "make sense" one way or another. An attacker with a computer can try all "plausible passwords" at the speed of his computer, which can be devilishly fast at that job (up to several billions per second with an off-the-shelf gaming PC). This is called a dictionary attack.

Requiring the addition of a digit and a mix of uppercase/lowercase letters is an attempt to force human users to enlarge their set of possible passwords. There are more possible passwords consisting of a meaningful word + one digit, than possible passwords consisting of a meaningful word (exactly ten times as many, actually).

(Such rules often backfire. When asking for an extra digit, most users will add a '1' and add it at the end, which means no enlargement of the set of possible passwords at all; and the extra length incites users to choose a shorter word from a shorter list, thus actually reducing the size of the set of potential passwords.)
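The arithmetic in the last two paragraphs can be sketched in a few lines of Python. The word-list sizes below are made-up numbers chosen only for illustration, not measurements of real user behaviour.

```python
# Hypothetical list sizes, assumed purely to illustrate the arithmetic.
WORDS = 50_000        # assumed number of "meaningful words" users pick from
SHORT_WORDS = 5_000   # assumed number of shorter words users fall back to


def candidates(words: int, digit_choices: int) -> int:
    """Size of the set of 'word + one digit' passwords an attacker must try."""
    return words * digit_choices


any_digit = candidates(WORDS, 10)      # digit chosen freely: ten times as many
always_one = candidates(WORDS, 1)      # everybody appends '1': no enlargement
backfire = candidates(SHORT_WORDS, 1)  # shorter word plus '1': a smaller set

print(any_digit, always_one, backfire)
```

The rule only helps if users actually spread their choices over all ten digits; if they do not, the attacker's list is no longer than before.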

Hashing is a second-layer defence system, meant to thwart attackers who have partially breached the server and got a peek at the database of stored passwords. The size of the set of potential passwords is important regardless of hashing. When the attacker has access to the hashed password, he can just run an offline dictionary attack, where each "try" is just a matter of computing the hash function. This makes the task easier for the attacker (instead of having to talk to the server for each try, which is an online dictionary attack) but it does not change the core concept, which is that users should use passwords from a large set.

Complexity still adds security. @Thomas Pornin is correct in his answer, but I'd like to provide a different way to think about it.

Assume that we are hashing. How resilient to attack is each of the following?

1. Upper/lower/numbers/special symbols (about 84 potential values IIRC)
2. Upper/lower/numbers (62 symbols)
3. Hexadecimal numbers (16 symbols)
4. Decimal numbers (10 symbols)
5. Binary numbers (2 symbols)

Obviously case #5 is a reductio ad absurdum; everyone should accept that 2 symbols are insufficient. We want the symbol set to be as complex as we can make it without imposing an undue penalty on users or other system elements.
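To put rough numbers on that comparison, here is a minimal Python sketch. It assumes every character is picked uniformly at random, which human-chosen passwords are not, so treat the results as upper bounds. The 84 is the approximate figure quoted above, 62 is 26 + 26 + 10, and the function names and the 80-bit target are my own choices for illustration.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


def length_for_bits(alphabet_size: int, target_bits: float) -> int:
    """Shortest length that reaches target_bits with this alphabet."""
    return math.ceil(target_bits / math.log2(alphabet_size))


ALPHABETS = {
    "upper/lower/digits/symbols": 84,  # approximate figure from the list
    "upper/lower/digits": 62,          # 26 + 26 + 10
    "hexadecimal": 16,
    "decimal": 10,
    "binary": 2,
}

for name, size in ALPHABETS.items():
    # Characters needed to reach an (assumed) 80-bit target.
    print(f"{name}: {length_for_bits(size, 80)} characters for 80 bits")
```

A smaller alphabet is not unusable as such; it just needs a longer password to cover the same number of possibilities.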

Binary digits are fine, you just need a (much) longer password to compensate for the lack of symbols.
– CodesInChaos, Feb 27 '13 at 19:04

I assume you mean "Entropy adds security". Complexity is known to reduce security, in the general case. Regarding passwords, complexity is a (cheap) stand-in for entropy, which is where the strength of a password comes from.
– AviD♦, Feb 28 '13 at 9:18

Personally when looking for raw entropy, I find hexadecimal digits easiest to remember. Lower than that and the password gets too long, and beyond that there are too many symbols to remember. I use a 20-character hexadecimal passphrase as my master passphrase (80 bits of entropy). One can easily remember up to 128 bits, but not too many of them, which is why some form of password derivation is often desired.
– Thomas, Feb 28 '13 at 23:17

@CodesInChaos You're missing the point: requiring a larger selection of characters in the password makes each password of the same length harder to find. It also means that a rainbow table for attacking a dump of the passwords must be far larger, since it's guaranteed that it will not find anything by trying a smaller set of characters.
– Stephane, Jul 23 '13 at 9:39

@Stephane 1) My point is that you should simply require a longer password when the user doesn't use many different characters. There is no reason to require many different characters. My preferred way to validate passwords is estimating their quality and comparing to a threshold instead of having lots of silly rules that reject many strong passwords. 2) If rainbow tables are a relevant threat, the coder did something wrong. Typically it means that they forgot to use a salt.
– CodesInChaos, Jul 23 '13 at 10:06

There are three primary types of attack that can be done against hashes: brute-force attacks, dictionary attacks, and pre-computation attacks.

Brute-force attacks
A brute-force attack involves selecting a range of characters (e.g. lowercase and numbers) and computing the hash for every possible combination of those characters, for a range of password lengths. Each hash is compared against your target hash, and if it matches, the password has been found. For example, we might choose a-z A-Z 0-9 as our alphabet for passwords between 5 and 8 characters. Defending against such attacks is reliant on the computational cost of each hash operation, the alphabet needed to successfully attack the password, and the length of the password. Since modern technology allows for GPU-based acceleration of hashing, it is important to use a slow key-derivation function (e.g. PBKDF2 or bcrypt) instead of a single hash.
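For illustration, a toy version of such an attack might look like the Python sketch below. It assumes the target was stored as a single unsalted SHA-256 hash, which is exactly the kind of fast hash you should not use, and the function name `brute_force` and the tiny search space are my own choices to keep the example quick to run.

```python
import hashlib
import itertools
import string
from typing import Optional


def brute_force(target_hex: str, alphabet: str, max_len: int) -> Optional[str]:
    """Try every string over `alphabet` up to max_len characters."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            guess = "".join(combo)
            # One fast hash per guess: this is the step a slow KDF makes costly.
            if hashlib.sha256(guess.encode("utf-8")).hexdigest() == target_hex:
                return guess
    return None


# Assumed scenario: a leaked hash of a 3-character password.
target = hashlib.sha256(b"ab1").hexdigest()
found = brute_force(target, string.ascii_lowercase + string.digits, 3)
print(found)
```

The work grows as the alphabet size raised to the password length, so each extra character or each extra symbol class multiplies the attacker's cost, and a slow key-derivation function multiplies it again per guess.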

Dictionary attacks
Dictionary attacks involve running through a large list of pre-chosen words that are likely to be used as passwords. It is important to note that most dictionaries don't just include real dictionary words - they also include various pseudo-words and other values that are found in various database leaks and common password lists. These attacks are more efficient than brute-force attacks in general, because they focus on the kinds of passwords that humans choose rather than completely random values. Defending against such attacks almost entirely relies on not picking a common password or dictionary word.

Pre-computation attacks
Instead of computing hashes repeatedly and comparing them to the target hash, pre-computation attacks involve computing hashes for a set of chosen values (like a dictionary attack) and storing them in a file or database. Hash databases and rainbow tables are two common methods of doing this. This provides a very fast lookup of plaintext for any known hash, since it's just a case of looking up the hash in the index and returning the associated plaintext. This can be defended against by using a salt, i.e. a random value appended to the password before hashing. This makes computing rainbow tables for each possible salt value completely infeasible.
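As a rough sketch of salting, here is what storing and checking a password could look like with PBKDF2 from Python's standard library. Note that PBKDF2 takes the salt as a separate input rather than literally appending it to the password. The iteration count, salt length, and function names are assumptions for the example, not recommendations from this answer.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed cost factor; tune it for your own hardware
SALT_BYTES = 16       # assumed salt length


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = os.urandom(SALT_BYTES)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("PolynomialLovesBacon")
print(verify_password("PolynomialLovesBacon", salt, stored))
```

Because every user gets a different salt, two users with the same password end up with different digests, so a precomputed table built for one salt is useless for any other.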

So, why are complicated passwords important? It depends, really. If you're doing password hashing properly, using PBKDF2 or bcrypt with a reasonable cost factor, complexity beyond not using common passwords isn't actually that important. It's more important to avoid dictionary words and common passwords, and complex passwords do usually offer that kind of protection. However, choosing a long and unusual non-dictionary password that is memorable (e.g. PolynomialLovesBacon) works just as well. If you do password hashing incorrectly (e.g. salted SHA1) you need a much stronger password to remain safe, because GPUs can compute tens of billions of hashes per second.

Of course, you're going to have to deal with the human aspects. I think one of the best things you can do is warn users if they use a common password, by storing a list of the ~2000 most common ones (you can get lists of these online) and checking against them. As long as you're properly hashing passwords, most users should be reasonably safe even in the case of a database leak.
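A check like that can be very small. The sketch below uses a five-entry stand-in set where a real deployment would load one of the published lists, and the name `is_common_password` and the case-insensitive comparison are my own assumptions.

```python
# Tiny stand-in for a real list of the ~2000 most common passwords.
COMMON_PASSWORDS = frozenset(
    {"123456", "password", "qwerty", "letmein", "111111"}
)


def is_common_password(password: str, common=COMMON_PASSWORDS) -> bool:
    """True if the password, ignoring case, appears in the common list."""
    return password.casefold() in common


for candidate in ("Password", "PolynomialLovesBacon"):
    if is_common_password(candidate):
        print(f"{candidate!r}: warn the user, this is a very common password")
    else:
        print(f"{candidate!r}: not in the common list")
```

Run this at registration or password change, while you still have the plaintext, and tell the user why the password was flagged.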

Most of these attacks are based on the model of your site being hacked and your passwords stolen, e.g. via SQL injection, so it's important to adhere to secure coding practices and be aware of common vulnerabilities.