Description

The current instantiation of django.contrib.auth has a few issues that could be improved. The three primary ones dealt with in this patch are:

Increasing the size of the salt pool

Making available SHA-256 for enhances security

Making the selection of algorithms available in settings

The first, increasing the size of the salt pool, is based on decreasing the impact of a birthday paradox attack against the pool. The current approach uses a space of 165 (1,048,576) for all salts. While this would seem on the surface to be adequate, there is in-fact a 0.5 probability of 2 users having the same hash in any database of 1,206 or more users. More information on the probability can be ​found on Wikipedia. The patch changes the method used to calculate a salt to 10 random selections from printable characters, and increases the space to 2.18*1014 and creates a 0.5 probability situation around 447,656,038 at the cost of 5 bytes per entry.

The second issue is that SHA-1 has known collision issues, and so I've made a tiny patch to allow SHA-256 (a version of SHA-2) to be used. For this to be useful, however, I've refactored out the third item, and created a setting AUTH_CRYPTO_ALGORITHM that can override the default algorithm. This has a default setting of 'sha1' but can be changed by the user.

Finally, I've also factored out the process of upgrading a password in place in User.convert_password, and modified User.check_password to automatically upgrade users as they sign in.

Having the same salt, especially when related to a relatively small space for the salts to come from, reduces the cost of either a brute-force attack. As I noted in the original post, a 50% probability of identical hashes is expected with 1,206 accounts in the system. This, combined with the common use of simplistic passwords (​here and ​here) creates higher probability that seeing the hash will be enough to determine the password. Combining these two factors reduces the pre-calculation requirements for any form of brute-force attack.

SHA-1 has been shown to be breakable in no more than 263 iterations (​more information). In addition, at CRYPTO2006, a semi-selective collision attack was shown (​more information). NIST has announced the phase-out of the 160-bit variants of SHA-1 ​no later than 2010, and recommended movement to SHA-256 as a minimum.

I understand that this isn't a particularly interesting patch, but is something I had to do for a project I'm working on that has minimum requirements for how users are handled. There are some additional changes that I'll likely submit, and it is up to the Django team to decide whether they should be mainstream. Much of this is covered in the ​OWASP documents.

Please be careful with your terminology here: it's true that the salt is selected from a space of 165, but the eventual hash comes from a space of 1640, and the size of that space does not decrease if there's a collision in salts. The only way that identical hashes come up is if two users select identical plaintext passwords and also get a hash collision (which is something to be solved by password policies) or if an attacker manages to obtain the salt and the hash and generate a second string which causes a collision (and the stat I've seen for that is 269, not 263, and the resources needed to do it in reasonable times are still such that it's not practically feasible for the overwhelming majority of attack scenarios; generating rainbow tables from a suitable dictionary is likely to be a much more efficient use of the average attacker's time and is unaffected, as far as I can tell, by what you're proposing).

On the whole I'm happy with the idea of allowing more options for hashing and for expanding the space from which salts are selected, but again: please be very careful in how you describe this: There's a lot of misunderstanding about this stuff, and throwing in confused/confusing assertions doesn't help any.

I'm perfectly happy to maintain my separate code for expanding the salt space, since I have no choice but to expand it for my own uses, if it's not desirable to integrate it into the mainline. My point was that, unfortunately, in my experience, there is a relatively high likelihood (greater than 1%) of password collision, and given the relative likelihood of salt collision, it seemed an easy thing to fix.

BTW, if you follow the link, you'll see that the 263 research is "relatively new" in that it was announced last year at CRYPTO2006 by Adi Shamir (he of RSA) on behalf of Chinese researchers who were denied entry visas. At 263, SHA-1 represents approximately the complexity level that MD5 should have represented, were it not broken and relatively easy to find collisions (​see this paper on collisions).

I leave it up to the core to take whatever they want from the patch.

N.B. I wanted to add SHA-512, however the VARCHAR(128) column for the password isn't large enough to hold it, and I'm not sure how to address that. It's not a problem, but was just something I wanted to do for completeness. It would require a minimum of 146 characters to hold the algorithm, salt and hash.

I submitted bug #5787, which is a specific instance of this feature request.
I'm not a huge fan of trying to implement our own password storage system and I would much rather implement a known standard (or provide support to use known standards).
In the patch​ for #5787, I included some code that pulled the password format details out of the User model. I like that as it clarifies the abstraction between the hashing implementations and the User class.
My two cents:

Pull all the password formatting out of the User model

The convert_password addition is great, but I would rather see a second configuration setting than a magic number of 10 (for salt length)

I'd love to see the core team make a decision on this, as the alternative at the moment is to invalidate the user's passwords and manage them in the user's profile.

I'm not a big fan of the new setting. On top of that, there's a problem with supporting any hash schemes not in Python 2.3 (our lowest supported Python version): it means databases created with a different version of Python break when used under a lower one (Malcolm mentioned a similar concern in #5787).

This isn't academic -- I have to support applications that deploy under mixed Python versions in some cases.

So I think the only thing we can do here is increase the salt size. I think anyone who feels they need more security will have to implement a custom authentication backend; building this into Django is just to fraught with danger.

I'm marking this as wontfix to help keep it organized; please reopen a new ticket if you'd like to just submit the salt size increase.

Since all currently supported hashing algorithms for Django have some known cryptanalysis attacks (indeed, not _yet_ enough to crack the passwords, but I'm not too eager to wait for that) it might be a good idea to atleast consider some more advanced hashing algorithms.

Backwards compatibility doesn't have to be a problem, the ​hashlib module currently used by Django can be installed for Python 2.3 and 2.4 aswell which makes a value in the settings to select the algorithm (if you're not feeling like installing hashlib and you're running an older Python version you could go for sha1) a good option.

Would it be an option to make the encryption algorithm selectable? If so than I would be willing to write a patch for it.

Please do not reopen tickets that have been marked as wontfix. Read contributing.txt for what to do if you don't agree with the decision.

Note, however, that requiring the installation of extra third-party modules to run Django is something we work very hard to avoid whenever possible and, as you note, this isn't a security hole at the moment. Making databases dependent upon the Python version that encrypted the data is also not a good plan, since there are known cases of people needing to be able to work with different Python versions and the same data.

If the current situation a problem for you personally, you can already write your own auth backends to use whatever crypto system you like.