I understand that some KDFs bundle the salt with the output, such as bcrypt (modular crypt format).

In PKCS5_PBKDF2_HMAC (specifically, looking at OpenSSL implementation) with a single iteration, is the salt knowable/attackable?

I am staring at a KDF implementation that runs a single iteration of PBKDF2-HMAC over some data, using the user password as the salt.

Is the salt at risk of being discovered/is this a proper use of PBKDF2?

(The data being hashed is itself output from the same KDF, but with a proper iteration count.)

// hash_1 is not transmitted over the network; it is used for symmetric encryption
hash_1 = pbkdf2(salt=username, data=password, iterations=5000)
// hash_2 is transmitted over the network, as a password for auth
// is the salt vulnerable?
hash_2 = pbkdf2(salt=password, data=hash_1, iterations=1)
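
The same derivation as a runnable Python sketch. HMAC-SHA256, the 32-byte default output length, and the sample credentials are assumptions for illustration only; the application's real parameters are not stated here.

```python
import hashlib

def derive(username, password):
    # Assumption: HMAC-SHA256. The real application may use a different digest.
    pw = password.encode("utf-8")
    # hash_1: never sent over the network, used as the symmetric encryption key
    hash_1 = hashlib.pbkdf2_hmac("sha256", pw, username.encode("utf-8"), 5000)
    # hash_2: sent over the network as the authentication password;
    # the user password is passed in the salt position, with one iteration
    hash_2 = hashlib.pbkdf2_hmac("sha256", hash_1, pw, 1)
    return hash_1, hash_2

# Hypothetical credentials, for illustration only
hash_1, hash_2 = derive("alice@example.com", "correct horse")
print(hash_2.hex())
```

Note that neither output contains the salt bytes as such; each is just the fixed-length derived key.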

What is your application for the hash algorithm? It sometimes occurs (e.g. with passwords) that the hashes and salts are stored together, so compromising the hashes leads to compromising the salts.
– Steve Dodier-Lazaro Dec 7 '14 at 22:50

@SteveDL the salt is to my knowledge not stored anywhere else. My primary concern is that it is not stored in the output from the KDF
– az_ Dec 7 '14 at 23:07

Wait, the password is being salted with... the same password? That does not sound healthy, regardless of the KDF. In effect instead of KDF(password + salt) you're basically doing KDF(password + password), right?
– AviD♦ Dec 7 '14 at 23:16

@AviD It is not my application, but I am trying to determine whether its kdf implementation is proper. I have updated OP with example of the usage.
– az_ Dec 7 '14 at 23:30

pbkdf2 gets some, perhaps a lot, of its security from the large number of iterations used (and is future-proofed by making the number of iterations configurable). Using iterations=1 is going to greatly reduce the amount of compute effort required to find the original parameters, both salt and data.
– David Waters Dec 8 '14 at 1:30

2 Answers

My first instinctive response was DON'T EVER USE THE PASSWORD AS THE SALT VALUE. And that's still pretty much where I'm standing.

However, reading your post a little more carefully: no, the salt is not stored in the result from the pbkdf2 algorithm.

That said, the purpose of the salt is that it adds entropy (randomness) to the input data before hashing.

Salt is really not intended to be "secret."

The first glaring problem I see is the use of the username as the salt value in the first call to pbkdf2. That's totally predictable. Why is that not using a randomly generated salt value that is stored along with the username and the password hash? (The password itself should never be stored anywhere).

Anyway, it's safe to say the salt isn't stored or transmitted by the KDF because it is an input that is used to generate a one-way, mathematically non-reversible output value. The salt itself does not exist in the output.

But both usernames and passwords are rather poor choices for "salt" values because neither is sufficiently random (sufficiently safe from duplication). Passwords in particular are highly susceptible to duplication over a large set of accounts. Lots of people end up using same passwords.

But if you add randomly generated salt to identical passwords before hashing them, then the resulting hashes will be completely different from each other.
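
A minimal sketch of that effect, assuming Python's `hashlib.pbkdf2_hmac` with HMAC-SHA256, a 16-byte salt from `os.urandom`, and 5000 iterations. The `hash_password` helper and the sample password are made up for illustration.

```python
import hashlib
import os

def hash_password(password, salt, iterations=5000):
    # PBKDF2 takes the salt as a separate parameter rather than by concatenation.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

# Two accounts that happen to choose the same password
salt_a = os.urandom(16)
salt_b = os.urandom(16)
hash_a = hash_password("hunter2", salt_a)
hash_b = hash_password("hunter2", salt_b)

# With independent random salts the two hashes are expected to differ,
# while re-running with the same salt reproduces the same hash.
print(hash_a != hash_b)
print(hash_password("hunter2", salt_a) == hash_a)
```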

On the other hand, having identical (predictable!) salt values on multiple accounts or multiple data streams is bad news because it introduces a degree of predictability and potentially opens the door to exploits.

EDIT #2:

Hashes and salt... since there seem to be some additional questions and comments on this subject, the bottom line is that cryptographic hashing algorithms are deterministic (the same input into a cryptographic hashing algorithm always produces the same output).

If this were not true, then these algorithms would not be useful.

The issue from a security standpoint is that if you do not "salt" your input data (typically, passwords), then the hashes you store or transmit are no more unique than the original data.

So if you crack one hash (dictionary attacks or rainbow tables), then you've cracked every one that matches. This is exactly what led to (can I identify a major business social media site by name?) having something like 6 million accounts compromised a couple of years ago.

"Salt" was just a cutesy term back in the green-screen days for entropy (randomness). It's cute to think of "salting" your "hash" (putting salt on hashbrown potatoes or corned beef hash or whatever).

The purpose of the "salt" is to randomize the input (the password). If the salt is randomly generated, it will be completely unique for every password. Append it to the password, and all of a sudden every password is unique even if thousands of users all choose the same password.

The salt value itself is generally not meant to be a secret. It is often stored right alongside the hash value (or even appended to it), and the original data secret itself (the password) is never stored.

So when the user provides the secret, the salt is appended to the provided string the same way it was when the hash was originally created, fed into the hash algorithm, and if the result matches the stored hash you know the user provided the correct password.
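
That store-and-verify flow might look roughly like this in Python. It is a sketch under assumptions: HMAC-SHA256, 5000 iterations, and the helper names `create_record` and `verify` are all chosen for illustration, and with PBKDF2 the salt is passed as its own parameter instead of being appended to the password string.

```python
import hashlib
import hmac
import os

ITERATIONS = 5000  # assumed value; pick what your platform can afford

def create_record(password):
    # Generate a fresh random salt and keep it next to the hash.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest  # the password itself is never stored

def verify(password, salt, stored_digest):
    # Recompute with the stored salt and compare in constant time.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_digest)

salt, digest = create_record("s3cret")
print(verify("s3cret", salt, digest))
print(verify("wrong guess", salt, digest))
```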

What @Fleche is saying about usernames not being globally unique is true, and passwords aren't even very unique in small domains, let alone globally.

Using both the username and the password as salt this way might open you up to an attack that you haven't really anticipated. The username is predictable and not universally unique, even though it is unique in your domain. But then salt values are generally not secrets anyway. So you might be safe. But it feels a little sketchy.

Craig thanks for the in-depth explanation. In the example I provided, there is a guarantee that the salt (username) is unique, for hash_1. Since hash_2 is KDF output using hash_1, does it not transitively have the same property, so it doesn't matter if the password is duplicated? So, in effect, there is no additional weakness against dictionary attacks?
– az_ Dec 8 '14 at 2:40

@Alex uuueeeeayyaaahh, it makes me feel kinda squirmy. You're still hashing hash_1 with salt that is a non-random, non-unique (for all intents and purposes) value, and on top of that, the salt for the second pass is the input of the first pass. I mean, pbkdf2 with 5,000 passes is probably making this all moot. Maybe. But who knows what funny bugs might be lurking in there. I don't like it, but I can't definitively say it's horribly broken. I mean, where is the password coming from, anyway? You said it isn't being stored anywhere. Why not use random salt that is stored with the credentials?
– Craig Dec 8 '14 at 3:57

Or better, why not use random salt generated at the time the KDF is executed, and store the salt temporarily so it's possible to re-create this session authentication token (I presume that's what it is) if/when necessary? The token being emitted by the KDF isn't reversible, anyway. Offhand I can't think of a reason why using the password would confer an advantage, and there might be serious disadvantages.
– Craig Dec 8 '14 at 3:58

Alex, there still seem to be a lot of misunderstandings about salts. The usernames may be unique in your system, but they're certainly not unique on this planet. That means an attacker going after a different application may get your passwords for free. Usernames are also predictable, which means an attacker can calculate possible username-password combinations in advance. So again: What you call “salt” is no salt at all. The whole point of salting is to add a completely random additional input.
– Fleche Dec 8 '14 at 4:54

Thanks Fleche, Craig, your points are well taken. My main motivation was to see whether the local uniqueness of usernames would mitigate the lack of a salt, but that is clearly also inadequate. I've been looking at the security of an online password vault.
– az_ Dec 8 '14 at 5:32

A “salt” which is derived from the input data is no salt at all. In other words, this is an unsalted hash function which only takes a password and an iteration count and calculates the resulting hash. If the iteration count is constant, then the same password always yields the same hash.
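
To illustrate, here is a small Python sketch of a password-salted-with-itself derivation and why it behaves like an unsalted hash. The `self_salted` helper, the HMAC-SHA256 digest, the 5000 iterations, and the guess list are all assumptions made up for this example.

```python
import hashlib

def self_salted(password):
    # The "salt" is the password itself, so the output depends on the password only.
    pw = password.encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", pw, pw, 5000)

# An attacker can build this table once and reuse it against every user.
table = {self_salted(guess): guess for guess in ["123456", "password", "letmein"]}

# Any account using one of those passwords is found by a simple lookup.
leaked = self_salted("letmein")
print(table.get(leaked))  # letmein
```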

The whole point of the salt is that it's additional input. Contrary to popular belief, it's not secret in any way, but it must be sufficiently random.