
By re-hashing, you are limiting the input to the 16^32 possible values that a hash can take. For long enough passwords, that's fewer than the N^L (N=possible symbols; L=length) passwords that could have been used. So on purely mathematical grounds, there are fewer possible inputs, fewer possibilities for brute force, etc.

true. but the second (third, fourth, nth) hash does not have fewer possible outcomes than the first.

Originally Posted by djr33

The problem with this concern is that 16^32 is a huge number!

Exactly.

Originally Posted by djr33

And even if that's not "enough", then we have to concede that most passwords are less than 32 characters anyway.

Exactly.
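
To put rough numbers on the 16^32 versus N^L comparison, here is a small sketch. The function name and the three alphabets are just illustrative choices, and it assumes a 128-bit hash such as MD5 (16^32 = 2^128).

```php
<?php
// Smallest password length L at which N^L exceeds 2^bits, i.e. the point
// where the space of possible passwords outgrows the space of hash values.
function min_length_exceeding(int $symbols, int $bits): int
{
    // N^L > 2^bits  <=>  L > bits / log2(N)
    return (int) floor($bits / log($symbols, 2)) + 1;
}

$alphabets = [
    26 => 'lowercase letters',
    62 => 'letters and digits',
    95 => 'printable ASCII',
];

foreach ($alphabets as $n => $label) {
    printf(
        "%-20s N=%2d: more passwords than MD5 values from L=%d\n",
        $label,
        $n,
        min_length_exceeding($n, 128)
    );
}
```

For these alphabets the crossover lengths come out at 28, 22 and 20 characters, so for typical password lengths the hash space is the larger of the two.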

Originally Posted by djr33

the odds do go way up as you repeat more times that you would eventually happen to land on the right hash somewhere in the iterations. But the way for it to actually work is to land on the right hash at the end of the iterations. I don't see how having the right hash after 50 iterations would be at all helpful for you to find the right one after 100.

Exactly.
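
A minimal sketch of what "repeating" means here, assuming plain md5() applied in a loop (the function name iterated_md5 is made up for this example, and this is not a recommendation to store passwords this way):

```php
<?php
// Apply md5() repeatedly; each round hashes the previous round's hex digest.
function iterated_md5(string $input, int $rounds): string
{
    $hash = $input;
    for ($i = 0; $i < $rounds; $i++) {
        $hash = md5($hash);
    }
    return $hash;
}

$stored  = iterated_md5('password', 100); // the value a site would keep
$halfway = iterated_md5('password', 50);  // an intermediate value

// Only the final value is compared at login, so an intermediate value is
// not interchangeable with the stored one.
var_dump($stored === $halfway);

// Continuing the chain from the halfway point gives the stored value again.
var_dump($stored === iterated_md5($halfway, 50));
```

The second comparison is true by construction. The first shows that the value after 50 rounds is just another hash, which only becomes the stored value after 50 more rounds.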

Something else to consider: part of the reason that hashes are non-de-hash-able (or, like, whatever) is that they're lossy. Flip this bit, rotate these ones, log() that one, throw away every fifth value that is a power of two. (Random example, of course, not any real algorithm.)

James, I don't mind a couple topics in one, since it's all going in the same direction. But let us know if we're being distracting.

As for your test code, I really don't know what's going on there. It sounds like something may be going wrong with your server. Maybe check the configuration for crypt()... traq, any ideas?

Traq, three things to add. I've been doing quite a bit of reading (including getting stuck in a Wikipedia loop for a while):
1. I figured out what wasn't making sense earlier. Hash algorithms aren't perfect. http://en.wikipedia.org/wiki/Random_oracle
A "Random Oracle" would be perfect. But it doesn't (can't?) exist:

Originally Posted by Wikipedia

In cryptography, a random oracle is an oracle (a theoretical black box) that responds to every query with a (truly) random response chosen uniformly from its output domain, except that for any specific query, it responds the same way every time it receives that query. Put another way, a random oracle is a mathematical function mapping every possible query to a random response from its output domain.
...
No real function can implement a true random oracle.

This means that all hashing algorithms (that exist, potentially ever could exist) are biased in one way or another. Therefore, more iterations means more impact of that bias. I'd assume that asymptotically, the result of md5(md5(...)) would end up at the same/similar values regardless of the starting input. But... these are HUGE numbers. Doing two or three iterations won't hurt at all I don't think. And doing 100 or 1000 might not. Doing 1 billion might.
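
The shrinking effect can be seen on a toy scale. The sketch below is an assumption-heavy illustration: it uses a made-up 12-bit "hash" (the first three hex digits of md5()) so that every possible value can be enumerated. It says nothing precise about full MD5, where the numbers are astronomically larger.

```php
<?php
// Toy 12-bit "hash": the first three hex digits of md5(). Deliberately tiny
// so that the whole output space (4096 values) can be enumerated.
function toy_hash(string $input): string
{
    return substr(md5($input), 0, 3);
}

// Start from every possible 3-hex-digit value, apply the toy hash
// repeatedly, and count how many distinct values survive each round.
function surviving_values(int $rounds): array
{
    $values = [];
    for ($i = 0; $i < 4096; $i++) {
        $values[sprintf('%03x', $i)] = true;
    }

    $counts = [];
    for ($round = 1; $round <= $rounds; $round++) {
        $next = [];
        foreach (array_keys($values) as $value) {
            // Numeric-looking keys come back as integers, so cast to string.
            $next[toy_hash((string) $value)] = true;
        }
        $values = $next;
        $counts[$round] = count($values);
    }
    return $counts;
}

foreach (surviving_values(10) as $round => $count) {
    printf("after %2d round(s): %d distinct values\n", $round, $count);
}
```

The count can never go up from one round to the next, because each round can only map the surviving values onto themselves or onto fewer values. How fast it drops for a real 128-bit hash is a separate question.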

2. There's an important detail to be added to your point above:

Originally Posted by traq

Something else to consider: part of the reason that hashes are non-de-hash-able (or, like, whatever) is that they're lossy. Flip this bit, rotate these ones, log() that one, throw away every fifth value that is a power of two. (Random example, of course, not any real algorithm.)

There's no non-theoretical reason for knowing someone's actual password* (and you're right; that's hard). Lossy means we can't deterministically work out what the input was because there isn't enough information in the output to reconstruct it. But that doesn't really matter-- we're not trying to reconstruct the input; we're trying to reconstruct an input. Still, what you said probably applies, but at the most abstract level, there is enough information in the hash to reconstruct an input (via brute force) given what we know about it: it is some value X such that as an input to the algorithm it generates that hash. Even given the strongest brute force systems you can imagine and infinite time, you'd never be able to determine the original password-- it's lossy and there's no way to know; but you could find some password that works just as well.

In some sense, the lossiness is bad because then we're not using as much information as we could to make a complicated hash (and by coincidence there may be other passwords we could use equally well). On the other hand, the lossiness is a good thing because it obfuscates the original input and makes it harder to guess what's going on. Part of hashing well, I'd imagine, is not leaving any clues in the form of the hash that tell you about the algorithm or the input. Having a set length output for all inputs is useful in that.

(*The only practical reason to know someone's real password is so you know what they used-- perhaps to know more about them (what is their favorite pet's name, anyway!?) or more importantly to gain access to their other accounts if they use the same password. Having an algorithmically equivalent alternative only works with the same algorithm-- so if you want access to their email or bank accounts, you'd need to know the original password unless you're hoping the email or bank websites use the same exact algorithm as the one you hacked.)
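
Here is a sketch of "reconstructing an input" rather than "the input". It again assumes a made-up 12-bit toy hash (three hex digits of md5()) so the search finishes instantly; the function names and the example password are hypothetical.

```php
<?php
// Toy 12-bit "hash" so that brute force is trivial. Not a real algorithm.
function toy_hash(string $input): string
{
    return substr(md5($input), 0, 3);
}

// Try "0", "1", "2", ... until something produces the target hash.
// Returns null if nothing matches within $maxTries attempts.
function find_equivalent_input(string $targetHash, int $maxTries = 1000000): ?string
{
    for ($i = 0; $i < $maxTries; $i++) {
        $candidate = (string) $i;
        if (toy_hash($candidate) === $targetHash) {
            return $candidate;
        }
    }
    return null;
}

$original = 'my secret password';
$target   = toy_hash($original);
$found    = find_equivalent_input($target);

printf("target hash:      %s\n", $target);
printf("equivalent input: %s\n", var_export($found, true));
printf("same as original: %s\n", $found === $original ? 'yes' : 'no');
```

The input that is found passes the hash check but tells you nothing about the original string, which is the distinction made above.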

3. Aside from lookup tables (or "rainbow tables"), the weaknesses of MD5, SHA1 and other algorithms are due to collision attacks. That is, in less time than brute force, people have managed to find an input Y that has the same hash as another input X. This is problematic when it could be used as a method to create verifying but fake data (eg, a checksum for a download would match but the data would be off, and might contain a virus) or to verify something like a security certificate. However, it doesn't appear to actually directly relate to what we're doing here with passwords. It means it might take less time to come up with a password that works, but it doesn't sound like it actually "breaks" the algorithm for use in passwords as it is used here. But.... I'm not sure about that. Basically the descriptions focus on the attacks being used for other purposes and only hint at it maybe not being relevant for passwords. The lookup tables, to the extent that they do exist, are the real problem for these algorithms.
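
To see why finding "any two inputs that collide" is a different job from "an input matching one particular hash", here is a toy comparison. It assumes a made-up 16-bit hash (four hex digits of md5()) and hypothetical function names; it does not reproduce the real attacks on MD5 or SHA1, which exploit the internals of those algorithms.

```php
<?php
// Toy 16-bit "hash" (65536 possible values). Not a real algorithm.
function toy_hash16(string $input): string
{
    return substr(md5($input), 0, 4);
}

// Collision search: find ANY two different inputs with the same toy hash.
// Guaranteed to stop within 65537 tries, usually far sooner.
function find_collision(): array
{
    $seen = [];
    for ($i = 0; ; $i++) {
        $input = "msg$i";
        $hash  = toy_hash16($input);
        if (isset($seen[$hash])) {
            return [$seen[$hash], $input, $i + 1];
        }
        $seen[$hash] = $input;
    }
}

// Targeted search: find an input matching ONE given hash.
function find_match(string $targetHash, int $maxTries = 5000000): ?array
{
    for ($i = 0; $i < $maxTries; $i++) {
        $input = "try$i";
        if (toy_hash16($input) === $targetHash) {
            return [$input, $i + 1];
        }
    }
    return null;
}

[$a, $b, $collisionTries] = find_collision();
printf("collision: %s and %s after %d tries\n", $a, $b, $collisionTries);

$match = find_match(toy_hash16('stored password'));
if ($match !== null) {
    printf("targeted match: %s after %d tries\n", $match[0], $match[1]);
}
```

On a toy like this the collision typically turns up after a few hundred tries, while the targeted match needs tens of thousands. Logging in with a stolen hash is the second kind of problem.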

My posts are now almost off topic. If you want I can start a new thread. Continuing on with my current area of confusion:

I notice that:

Code:

<?php
$test="password";
$test=crypt($test,'saltt');
echo "$test";
?>

produces the same results as:

Code:

<?php
$test="password";
$test=crypt($test,'salt');
echo "$test";
?>

The salts need to be constructed in a specific way to implement particular hashing methods. "just a string" leads crypt() to use DES, which only cares about the first two characters (try "alt" and "always", "salt" and "safety"). The way you need to format the salt is complicated. I've used implementations built by others, but I'm still at the stage where trying to use crypt() directly frustrates me. But I'm gettin' there. That big blue box at the top of the crypt() man page bears reading a few dozen times.
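
A quick way to check the two-character behaviour for yourself. This is a sketch that assumes your PHP build falls back to standard DES for a plain-string salt, as described above:

```php
<?php
// With a plain string as the salt, crypt() uses standard DES, which reads
// only the first two characters of the salt.
$pairs = [
    ['salt', 'saltt'],
    ['salt', 'safety'],
    ['alt', 'always'],
];

foreach ($pairs as [$a, $b]) {
    $same = hash_equals(crypt('password', $a), crypt('password', $b));
    printf("%-6s vs %-6s => %s\n", $a, $b, $same ? 'same hash' : 'different hash');
}

// Salts that differ in their first two characters give different hashes.
var_dump(crypt('password', 'salt') === crypt('password', 'pepper'));
```

The two salt characters are also copied to the front of the resulting hash, which is how crypt() can find them again later.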

If I am understanding you correctly, when salting with crypt() I need only concern myself with the first two characters. There are ways to get crypt() to use more or fewer than two characters, or different characters, by formatting the salt differently, and you are not yet fully familiar with the crypt() syntax for doing that. Correct?

The crypt() function is weird. Look at some of the examples on the PHP page.

The "salt" parameter not only allows actually adding salt, but it also allows you to insert special instructions on how the algorithm (or which algorithm?) is used. There are also constants you can use to select which algorithm is the default.

This is why I said earlier that the crypt() function is one of the strangest or most difficult I've seen. It looks useful and in the end not all that complicated, but I don't get why these things aren't just easier-- give it three arguments and be done with it.

I have looked over that page several times, but I may have been skimming it as well. Several terms are new to me as security is less of a hobby for me than other aspects of coding. I'm getting there too, but slowly. I have to push myself harder to learn this than with other topics, but it is important that I learn at least a little more than I already know so that I can improve the security I have so that it is acceptable for the type of website I have.

The "salt" parameter not only allows actually adding salt, but it also allows you to insert special instructions on how the algorithm (or which algorithm?) is used. There are also constants you can use to select which algorithm is the default.

which algorithm. And actually, you can't use the constants to define which one - it's determined by the format of the $salt. The constants only allow you to check which algorithms are available on your system.
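
A sketch of both halves of that: the constants report availability, and the shape of the salt string picks the algorithm. The salt values below are fixed placeholders for demonstration only; real salts should be random and different for every password.

```php
<?php
// The constants only report availability (1 = supported on this build).
printf(
    "DES: %d, MD5: %d, Blowfish: %d, SHA-256: %d, SHA-512: %d\n",
    CRYPT_STD_DES,
    CRYPT_MD5,
    CRYPT_BLOWFISH,
    CRYPT_SHA256,
    CRYPT_SHA512
);

// The prefix of the salt string selects the algorithm.
$salts = [
    'standard DES' => 'sa',                            // two characters
    'MD5'          => '$1$examples$',                  // $1$ + up to 8 chars
    'Blowfish'     => '$2y$10$abcdefghijklmnopqrstuv', // $2y$ + cost + 22 chars
    'SHA-512'      => '$6$rounds=5000$examplesalt$',   // $6$ + optional rounds
];

$hashes = [];
foreach ($salts as $name => $salt) {
    $hashes[$name] = crypt('password', $salt);
    printf("%-12s %s\n", $name, $hashes[$name]);
}
```

Each result starts with the same prefix that selected the algorithm, so the stored hash records which method produced it.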

Originally Posted by james438

I have looked over that page several times, but I may have been skimming it as well. Several terms are new to me as security is less of a hobby for me than other aspects of coding. I'm getting there too, but slowly. I have to push myself harder to learn this than with other topics, but it is important that I learn at least a little more than I already know so that I can improve the security I have so that it is acceptable for the type of website I have.

As implied above, "it's not just you." It's weird. It's a heavily nuanced function that wasn't designed well in the first place and has been fiddled with a lot since then. I don't really like it, but it is the one you need to use.

which algorithm. And actually, you can't use the constants to define which one - it's determined by the format of the $salt. The constants only allow you to check which algorithms are available on your system.

Huh, ok. I did not get that out of the page. Thanks for clearing it up. It seemed weird that we'd have to use constants (but there are a few functions like that). Usually I can read a php.net page and generally understand it.

And I agree-- this looks like the best one to use, even if it's hard to use. And it's not that hard-- you just need to set up the salt once with your favorite algorithm, then you can cut and paste without worrying about the details.
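
Something like the following is the kind of "set it up once" wrapper I mean. It is only a sketch: the function names are made up, Blowfish with a cost of 10 is an assumed choice, and random_bytes() needs PHP 7 or later.

```php
<?php
// Build a Blowfish salt string: "$2y$", a two-digit cost, "$", then
// 22 characters from the alphabet ./A-Za-z0-9.
function make_bcrypt_salt(int $cost = 10): string
{
    $raw  = base64_encode(random_bytes(16));      // 24 chars from A-Za-z0-9+/=
    $salt = substr(strtr($raw, '+', '.'), 0, 22); // swap '+', drop the padding
    return sprintf('$2y$%02d$%s', $cost, $salt);
}

// Hash a new password with a fresh random salt.
function hash_password(string $password): string
{
    return crypt($password, make_bcrypt_salt());
}

// Check a login attempt against a stored hash.
function check_password(string $password, string $storedHash): bool
{
    // crypt() reads the algorithm, cost and salt back out of the stored hash.
    return hash_equals($storedHash, crypt($password, $storedHash));
}

$stored = hash_password('correct horse');
var_dump(check_password('correct horse', $stored));
var_dump(check_password('wrong guess', $stored));
```

Because the salt is random, hashing the same password twice gives two different strings, and both still verify.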