
jd writes "NIST has announced the round 1 candidates for the Cryptographic Hash Algorithm Challenge. Of the 64 entries submitted, 51 were accepted. Of those, in mere days, one has been definitively broken, and three others are believed to have been. At this rate, it won't take the couple of years NIST was reckoning to whittle down the field to just one or two. (In comparison, the European Union version, NESSIE, received just one cryptographic hash function for its contest. One has to wonder if NIST and the crypto experts are so concerned about being overwhelmed with work for this current contest, why they all but ignored the European effort. A self-inflicted wound might hurt, but it's still self-inflicted.) Popular wisdom has it that no product will have any support for any of these algorithms for years — if ever. Of course, popular wisdom is ignoring all Open Source projects that support cryptography (including the Linux kernel) which could add support for any of these tomorrow. Does it really matter if the algorithm is found to be flawed later on, if most of these packages support algorithms known to be flawed today? Wouldn't it just be geekier to have passwords in Blue Midnight Wish or SANDstorm rather than boring old MD5, even if it makes no practical difference whatsoever?"

I don't get why people misread what I'm saying. There are a number of established cryptographic techniques for 'breaking' an encryption or hashing algorithm. A tentative first step to finding a new algorithm is to, duh, make sure it's resistant to all the old techniques.

-I- never said you could prove the security of anything other than a one time pad, everyone loves to infer that though.

It's Schrödinger's cat: you might prove it secure by exhaustively mapping its function over the whole input range and then assessing the proximity of results and the pseudo-randomness of the mapping. The problem with that being that you then know how to recover any original text from a given encrypted result...

Actually, it's probably much better to have MD5, which is broken in understood ways, than Jo3#a$#, which is broken but we don't know how, where or why. There are fairly simple rules for MD5 (start phasing it out now; only use it in situations where you, not your adversary, in some way control the input) which make it possible to use in a relatively safe way. If you don't know in what way a hash is broken, you don't know how to avoid those problems. Having said that, SHA256 should probably be considered the minimum for a system that only needs to stay secure until something better has become available and been tested. As Mr Schneier says, "attacks only get better; they never get worse".

It's also no surprise that some hashes got broken. There are many entries, and they come from all types of cryptographers, from teenagers to aged experts, from the unknown to those known mostly by their initials (e.g. A, S or R). There was never much hope that all of them would be of good quality.

All hash functions, no matter how carefully reviewed by however many experts, are broken in unknown ways. The winner of the SHA-3 contest will be broken in unknown ways. It won't stop you using it once it's circulated and part of the "standard". You will, and you know you will. So your usage has nothing to do with whether people know where the breaks are; it has to do only with whether it is circulated. If Joe Cracker is so good at breaking hashes that we need fear for the safety of Skein or MD6, then I would…

If some new hash function was equally scrutinized and no attacks were discovered, you wouldn't say MD5 was better.

Agreed. But the point is that the new ones haven't been scrutinised yet.

The only security advantage it has over any other cipher is the far greater attention it has received.

Well, that's a big advantage. For example, a hash function could have a weakness that there are certain "special" values which are very bad but detectable. By always using the hash with salt and then trying again if a bad value…

Wouldn't it just be geekier to have passwords in Blue Midnight Wish or SANDstorm rather than boring old MD5, even if it makes no practical difference whatsoever?

s/geekier/stupid and irresponsible/

Let me guess, the submitter likes to enable all the useless bling effects on Compiz but never gets any work done, and has racing stripes on his Civic....

I went to Carnegie Mellon and took classes from a bunch of professors who were all freakin' geniuses, and here is the second most important lesson I learned about cryptography: NEVER DO IT YOURSELF. And a corollary to that is never use a cryptographic system someone else cooked up until it has been through the rigorous peer review that these hash functions will go through. This was an important lesson for a bunch of egotistical CMU students, and I hope the ones who were actually smart listened. (The first most important lesson is an old one: if you think cryptography is the solution to all your security problems, you don't understand cryptography or your security problems.)

"Whaa! But the ciphers we have now are already broken!!" The current hash functions that are "broken" like SHA-1 are not trivially broken, but broken in a sense that in some scenarios might make it somewhat easier to conduct either a pre-image attack (useful if you know somebody's password hash and want to make a password that will hit the same hash) or a collision attack (useful in some cases where you are trying to forge a messsage to match a digital signature.... but if the fake message has to contain lots of garbage bytes even a successful collision might not pass the smell test). "Somewhat Easier" does not mean you can do it on your iPhone, it just means that it might take a supercomputer 100 years instead of the heat death of the universe to do it. This is still very important, but it is a world apart from an algorithm that has never been tested... those could be blown wide open and cracked almost instantly with trivial computing power. To use a bad car analogy, just because a seat belt won't save your life in every car accident doesn't mean it's just as safe to strap plastic explosives to your gas tank and hook them up to a mercury switch detonator.

As for "open source" making these cryptographic models available quickly, I wasn't aware that text editors froze up and stopped you from writing code if it wasn't going to be open source. The reason commercial vendors won't jump on a new cryptographic protocol before they are validated is that their customers would (rightly) go ballistic and their credibility would be smashed. Fortunately for all of us the leaders of the open source community have a little more sense that you and you won't see any of these hashes in the Linux kernel or OpenSSL until they are at least in the final rounds of competition and there is some evidence that they have value. OSS has the advantage that its software implementation can be publicly validated and peer reviewed, but having your code opened up to the world is actually much MORE dangerous if you are just screwing around because you think a hash function has a badass sounding name. I'm glad Torvalds is in charge of Linux and not "jd".

If the rest of his post is so well-said (such as wondering why they can't mod down submissions), why am I able to go to the Firehose and, well, mod down submissions? Far as I'm concerned, if his argument is so easily broken, it can't be treated as the least bit reliable. True, you will find "bling" on my computer. It'll be in the form of kernel patches I've ported or "adjusted". Assuming you consider "bling" to mean anything that isn't strictly necessary but is great fun to exercise and which sometimes actually…

Let me guess, the submitter likes to enable all the useless bling effects on Compiz but never gets any work done, and has racing stripes on his Civic....

Reminds me of the manager who works a whole week on his presentation. 90% of the time is used for special effects. And then when he gives the presentation he realizes he needs to move on, but can't, because his nice slides force him to show every special effect he has put into them. Other people realise that Powerpoint is just a tool. Giving the presentation is what matters…

The whole point of the contest is to give all the candidates testing and scrutiny. Sure, I would currently choose one of the SHA-2 family (256, 384 or 512) for any current thing I was doing where it mattered. But I fully expect that in 2-5 years time I will instead choose one of the algorithms that was recently submitted to NIST.

I am disappointed though to not see Whirlpool [wikipedia.org] in the list.

And MD5 is just plain out broken, and there are alternatives that are better in every respect. If I had my way the algorithm…

You seem to be of the same type of people the GP was attacking. So you can either know you are safe or ahead of the game... For what are you using cryptography again? Showing off your tech or securing your data?

The Victorian "thief lock" is well-tested, has been around for ages, is well understood by experts, and is used by exactly no-one to secure their belongings. The high-end, high-quality locks that security experts rave about are, by comparison, barely tested by anyone, have had minimal serious testing, are probably not understood by many experts owing to IP laws, and are used by people serious about keeping their belongings. Which camp did you say you fall into, again?

The submitted example code for this contest uses a standardized API that is 99.99% the same as that used by mhash, libgcrypt and others. Likewise, submissions for the AES contest used a standardized API that was 99.99% the same as used by mcrypt, libgcrypt and others. The differences largely consist of the prefix used on the function names and restrictions on the naming of global variables. Search-and-replace should be almost sufficient. Adding some means of registering the function with the library (usually…
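For anyone who hasn't opened a submission package, the required interface looks roughly like this (a from-memory sketch of the NIST reference API; the contents of hashState are candidate-specific):

```c
/* Rough shape of the NIST SHA-3 candidate API. Every submission implements
   these calls, which is why porting into mhash/libgcrypt-style libraries
   is mostly search-and-replace on names. */
typedef unsigned char BitSequence;
typedef unsigned long long DataLength;   /* message length in bits */
typedef enum { SUCCESS = 0, FAIL = 1, BAD_HASHLEN = 2 } HashReturn;

typedef struct {
    unsigned char opaque[256];           /* candidate-specific state */
} hashState;

HashReturn Init(hashState *state, int hashbitlen);
HashReturn Update(hashState *state, const BitSequence *data,
                  DataLength databitlen);
HashReturn Final(hashState *state, BitSequence *hashval);
HashReturn Hash(int hashbitlen, const BitSequence *data,
                DataLength databitlen, BitSequence *hashval);
```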

It's whether the bits that come out have a truly random distribution, i.e. are unpredictable. The outputs of hash functions are weakened from theoretical (perfect) apparent entropy by the need to do things fast. The distributions are not perfectly uniform, and randomness is lost through the passes.

But both of these things can be completely solved, even with crappy systems, simply by using larger keys. The most important metric is (CPU+RAM) per bit of entropy. If you want something secure, do a triple Blowfish-Whirlpool.

You are absolutely correct, which means that the difference between one hash that is currently secure because nobody has found any weaknesses and another which also has no currently-known weaknesses is one of confidence that a weakness won't be found soon. SHA-1 has vulnerabilities which (should) reduce the confidence levels. MD5 is considered completely broken for such things as validating that a file is untampered with. But SHA-1 is likely used for classified data (which it should no longer be; it's no longer…

MD6 (similarity in name to MD5 is entirely intentional) looks very interesting:

Security: MD6 is by design very conservative. We aim for provable security whenever possible; we provide reduction proofs for the security of the MD6 mode of operation, and prove that standard differential attacks against the compression function are less efficient than birthday attacks for finding collisions. We also show that when used as a MAC within NIST recommendations, the keyed version of MD6 is not vulnerable to linear cryptanalysis. The compression function and the mode of operation are each shown to be indifferentiable from a random oracle under reasonable assumptions.

MD6 has good efficiency: 22.4-44.1M bytes/second on a 2.4GHz Core 2 Duo laptop with 32-bit code compiled with Microsoft Visual Studio 2005 for digest sizes in the range 160-512 bits. When compiled for 64-bit operation, it runs at 61.8-120.8M bytes/second, compiled with MS VS, running on a 3.0GHz E6850 Core Duo processor.

MD6 works extremely well for multicore and parallel processors; we have demonstrated hash rates of over 1GB/second on one 16-core system, and over 427MB/sec on an 8-core system, both for 256-bit digests. We have also demonstrated MD6 hashing rates of 375 MB/second on a typical desktop GPU (graphics processing unit) card. We also show that MD6 runs very well on special-purpose hardware.

While raw speed isn't great (the default single-threaded 32-bit md5sum in Linux can do 325 MB/s on a 2.4 GHz CPU), maybe its multi-core friendly design is the right way to do it right now. The original MD5 will probably not entirely disappear, because of its speed.

(OTOH if you're hashing SSL web traffic it's probably worse to have your hash bog down other CPUs that are busy with their own jobs)

I don't find the multi-core mode so useful; it's rare that I want to hash one very large file. More often I want to hash many things, which naturally parallelises.
In your SSL web traffic example, if your app isn't dealing with as many connections as it has cores, then you probably don't have to worry about performance, and if it is, then you already have one hash computation per core.

Yes, but not a major threat. Quantum computers allow the search time to be cut to the square root of the normal search time. This effectively halves the key length. A 256 bit key now takes only an average of 2^127 operations to find instead of 2^255.
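Spelled out, that arithmetic is just (back-of-the-envelope, for n = 256):

```latex
% Classical average search over an n-bit key tries half the keyspace;
% Grover's algorithm roughly square-roots that work.
T_{\mathrm{classical}} \approx 2^{n-1} = 2^{255},
\qquad
T_{\mathrm{Grover}} \approx \sqrt{2^{255}} = 2^{127.5} \approx 2^{127}.
```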

This isn't nearly so much of a problem though as for public key encryption schemes. For RSA, for example, quantum computing changes the time from super-polynomial but sub-exponential to being polynomial with a very low exponent.

MD6 is definitely a serious contender. It's very conservative and well researched. Its main rival is probably Skein at the moment, although there are a few others to consider. MD6 is, however, not as fast as some contenders, not as flexible as some, and its internal state is, I believe, larger, which makes it more of a pain on embedded and smart card processors. In all this, Skein beats MD6. Its parallel design uses a typical hash tree, which can be used with many other hash methods as well, although MD6 uses it in its main operation.

IIRC, Skein is getting about 6 cycles per byte in 512-bit mode on 64-bit platforms, which on a 2.4GHz dual-core CPU would yield a theoretical 800 MB/s in a parallel tree hashing mode, or 400 MB/s in standard mode. Apparently MD6 has a parallel mode also, and it's striking that both hash functions try to be minimalist by employing only three fundamental operations (AND, XOR, SHIFT for MD6; XOR, ADD, ROLL for Skein) and lots of rounds. It's odd that MD6 should be so much slower. Perhaps it hasn't been fully…

True, a good dose of salt will improve the hashes. (But it may lead to higher blood pressure.) My concern is that the vast majority of "hackers"/crackers are either skript kiddies or macho wannabes. They can exploit known weaknesses, they can use known techniques, they can download known black hat toolkits, but they will never discover or write anything worth a damn themselves.

If 99% of the risk comes from people with 1% of a functioning brain, it makes no sense not to take simple precautions that might…

I spent a few hours the other day looking over all of the submissions; Keccak and Skein are my favorite contributions. My criterion was "does the hash generate a fixed-length output, or is the hash capable of also being used as a stream cipher".

There are only four unbroken contributions that can generate arbitrarily long streams of numbers: Keccak [noekeon.org], LUX [tugraz.at], MeshHash [tugraz.at], and Skein [skein-hash.info]. Of these contributions, LUX and MeshHash, while not broken, already have cryptanalysis done against them that makes me a little uneasy about using them.

I prefer Keccak over Skein, for the simple reason that there is a bona-fide 32-bit variant of Keccak that can run quickly on 32-bit systems. Skein is designed to run well only on 64-bit systems. Part 5.4 of the Skein paper talks about the possibility of making a 32-bit variant of Skein, but they would need to come up with rotation and permutation constants and figure out how many rounds a 32-bit Skein variant would need. I would like to see Schneier, et al. (the team responsible for Skein) actually do this. Skein is more flexible than Keccak (I think Threefish is the first tweakable block cipher since the somewhat broken Hasty Pudding Cipher), and is faster on 64-bit systems, but I would like to see it run better on embedded and legacy systems.

A better overview: The SHA-3 Zoo [tugraz.at]. Did you look at Edon-R [tugraz.at]? It may not be the most flexible, but it's the fastest one, followed by Skein.
I agree with what Bruce Schneier wrote [schneier.com]: sort the algorithms based on performance and features, and then focus on the top 12.

My criterion was "does the hash generate a fixed-length output, or is the hash capable of also being used as a stream cipher".

Every hash function can be used as a stream cipher: you simply hash the password, then hash the resulting hash, and so on, and use each intermediate hash as a keystream block that you XOR with the cleartext stream to produce the ciphertext.

Of course, for this to be secure, the hashes must be indistinguishable from random strings, but I'd imagine that's a requirement for a good hash function…
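A minimal sketch of that construction in C, using OpenSSL's SHA256 as a stand-in for "a good hash" (the function name and block handling are mine; decryption is the same call, since XOR is symmetric):

```c
#include <openssl/sha.h>
#include <string.h>
#include <stddef.h>

/* Keystream block i+1 = H(block i), starting from H(key);
   each block is XORed into the buffer in place. Link with -lcrypto. */
void hash_stream_xor(const unsigned char *key, size_t keylen,
                     unsigned char *buf, size_t len)
{
    unsigned char state[SHA256_DIGEST_LENGTH];
    unsigned char next[SHA256_DIGEST_LENGTH];

    SHA256(key, keylen, state);                 /* first block: H(key) */
    for (size_t off = 0; off < len; off += sizeof state) {
        size_t n = len - off < sizeof state ? len - off : sizeof state;
        for (size_t i = 0; i < n; i++)
            buf[off + i] ^= state[i];           /* ciphertext = plaintext ^ keystream */
        SHA256(state, sizeof state, next);      /* next block: H(previous block) */
        memcpy(state, next, sizeof state);
    }
}
```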

The problem with this, of course, is that due to the Birthday Paradox, you will start creating a loop after (on average) sqrt(NUMBER_OF_POSSIBLE_HASH_OUTPUTS). For short messages, this is usually okay, but for long streams of "random" bytes, this is totally unacceptable.

The problem with this, of course, is that due to the Birthday Paradox, you will start creating a loop after (on average) sqrt(NUMBER_OF_POSSIBLE_HASH_OUTPUTS). For short messages, this is usually okay, but for long streams of "random" bytes, this is totally unacceptable.

Good point. I assumed that you'd loop after 2^hashlength, but of course even that has the same problem. I guess it just goes to show, once again, that cryptography should be implemented by real experts. :)
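One standard way around the cycle (my variant, not something proposed in the thread) is to derive each keystream block from the key plus a counter, so no output ever feeds back into the next input:

```c
#include <openssl/sha.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

/* Keystream block i = H(key || i): a colliding output can't collapse the
   stream into a repeating orbit, because the counter always advances. */
void ctr_keystream_block(const unsigned char *key, size_t keylen,
                         uint64_t counter,
                         unsigned char out[SHA256_DIGEST_LENGTH])
{
    unsigned char in[256];
    size_t n = keylen < sizeof in - sizeof counter
             ? keylen : sizeof in - sizeof counter;   /* bound long keys */

    memcpy(in, key, n);
    memcpy(in + n, &counter, sizeof counter);         /* append block index */
    SHA256(in, n + sizeof counter, out);
}
```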

Add dedicated hardware to an embedded system just so that it can perform hashing? Given a choice between that and picking a hash function that can run decently on 32-bit processors like ARM, MIPS and x86, I highly doubt the first option will be chosen.

If NIST can get slashdotted, we have far more serious problems than just hash functions being broken, and we should go back to being an agrarian culture. A far more likely outcome (and, IMHO, a better one) would be for the mailing list to explode in new members (a total of 51 + non-entering SHA3 Zoo contributors shouldn't be too hard to beat) asking totally obvious questions that weren't asked (because they were obvious) but should have been (because arguing a point is a superb way for the arguer to spot weaknesses)…

Popular wisdom has it that no product will have any support for any of these algorithms for years — if ever. Of course, popular wisdom is ignoring all Open Source projects that support cryptography (including the Linux kernel) which could add support for any of these tomorrow. Does it really matter if the algorithm is found to be flawed later on, if most of these packages support algorithms known to be flawed today?

It matters a lot. Say OpenSSL added all of these algorithms tomorrow. Some idiot developer (hint: go read DailyWTF) will build on top of them, and OpenSSL will then have to maintain backwards compatibility. A month from now, one of the algorithms gets broken completely — but because OpenSSL shipped with it, they can never take it back out.

The "popular wisdom" standard for proliferating a new algorithm is not how shiny it looks at first glance. Popular wisdom waits months or years until algorithms seem good enough. MD5 (or even MD4), SHA1 - all are good enough for some purposes (generally, when attacker does not control input). And if the attacker does control the input, the only sure solution is to send the whole thing - anyone believing otherwise needs to review the meaning of the word "hash". A secure hash is merely an irreversible hash with a very low risk of collision.

Even this article is mostly "security theater". There are very, very few uses of secure hashes where SHA1 (or even MD5, for that matter) is not good enough.

Err, well, no. They can take out features; they probably have in the past and they certainly will in the future. Linux is no different. Those who wrote code assuming InterMezzo or the kernel-based devfs would always be there are, well, victims of their own folly.

The same applies to hash functions. You have some mechanism for identifying which function you are using, and then you use it. Hard-coding is for wimps and fools, and is almost always the true cause of backwards incompatibility. Correctly-engineered…
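For instance, a minimal sketch of what "identifying which function you are using" can look like in C (the table and names are illustrative, not from any particular library):

```c
#include <string.h>
#include <stddef.h>

typedef void (*hash_fn)(const unsigned char *in, size_t len,
                        unsigned char *out);

struct hash_alg {
    const char *id;   /* identifier stored alongside each hash, e.g. "sha256" */
    hash_fn     fn;
};

/* Dispatch on the stored identifier instead of hard-coding one algorithm;
   retiring a broken hash then means removing a table entry, not breaking
   the storage format. */
hash_fn lookup_hash(const struct hash_alg *table, size_t n, const char *id)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].id, id) == 0)
            return table[i].fn;
    return NULL;      /* unknown or retired algorithm */
}
```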

The article is already out of date. The round 1 candidates were announced back on December 11. Since that time, 11 candidates have been broken. For the latest information, I recommend visiting the SHA-3 Zoo [tugraz.at].

Also, the article suggests that candidates will continue to be broken quickly, but I doubt this will happen. The weak hashes will be broken quickly, but there are likely to be many strong candidates which will not be broken during the contest. Other factors (speed, simplicity, etc.) will determine the ultimate winner.

I dunno. From the last time NIST updated its website to the last time the SHA-3 Zoo updated theirs, a whole bunch more functions got broken. And as the pool dwindles, the number of crypto experts studying each function increases, and the value of breaking a hash (both in itself and in terms of PR within crypto circles) rises. Sure, it won't be linear, but I don't expect the fall-off to happen for a while yet. If anything, the breakage might rise for a brief time as the holidays afford precious extra thinking time and a w…

Does it really matter if the algorithm is found to be flawed later on, if most of these packages support algorithms known to be flawed today?

does it matter? does it matter?? fuck me it fucking matters.

example 1

there's a type of encryption algorithm principle - the Feistel cipher - see http://en.wikipedia.org/wiki/Feistel_cipher [wikipedia.org] - where you perform one simple transform function as "round 1", then for rounds 2 and 3 you do a one-way hash function, and then for round 4 you do a simple transform function.

if the one-way hash is ever broken, your encryption cipher is also broken.

game over: any traffic that's ever been using that cipher can be decrypted.
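to make that concrete, here's a toy sketch of one balanced Feistel round in C, with a hash standing in as the round function (SHA256 via OpenSSL is my stand-in; real ciphers use their own round functions and key schedules). the structure shows why the cipher's security leans directly on the hash's:

```c
#include <openssl/sha.h>
#include <string.h>
#include <stdint.h>

/* One balanced Feistel round on a 256-bit block split into halves L and R:
   (L, R) -> (R, L ^ F(R, K)), with F = truncated SHA-256.
   Decryption runs the rounds in reverse order; F itself is never inverted,
   which is why a one-way hash can serve as the round function -- and why
   breaking the hash breaks any cipher built on it. */
static void feistel_round(uint8_t L[16], uint8_t R[16], const uint8_t K[16])
{
    uint8_t in[32], f[SHA256_DIGEST_LENGTH], oldR[16];

    memcpy(in, R, 16);
    memcpy(in + 16, K, 16);
    SHA256(in, sizeof in, f);        /* F(R, K) */

    memcpy(oldR, R, 16);
    for (int i = 0; i < 16; i++)
        R[i] = L[i] ^ f[i];          /* new R = old L ^ F(old R, K) */
    memcpy(L, oldR, 16);             /* new L = old R */
}
```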

example 2

your credit cards you carry around? the PIN number isn't stored on the card - but an MD5 hash of the PIN number *is* stored on the card (making replay attacks possible, believe it or not).

if MD5 is ever cracked...

game over: anyone can get your PIN number.

example 3

your peer-to-peer filesystem, your git source control system, they use one-way hashes to store an index of the data blocks. let's say that someone deliberately wants to break deployed systems, they work out what file chunks could end up being mapped to the same one-way hash...

game over: anyone can corrupt the database or the peer-to-peer filesystem by _deliberately_ making files or file chunks write to the same block.

i could go on with the list of examples - authentication systems that would fall over; internet bank systems that could be broken into - we _totally_ rely on one-way hashes working correctly.

it's important beyond _belief_ that these one-way hash functions work, so much so that i was staggered that the question even had to be asked as part of the article-announcement.

Wait, you haven't figured out that Slashdot is the compression function of the cryptographic hashes of an advanced extraterrestrial race (whose projections in our reality are, well, whatever you find most amusing)?

your credit cards you carry around? the PIN number isn't stored on the card - but an MD5 hash of the PIN number *is* stored on the card (making replay attacks possible, believe it or not).

A common fallacy. The standards for PIN generation on magnetic cards were developed long before MD5 became common. The 'IBM' methods (guess who makes the security processor at the other end of the ATM links?) are based on your PIN being a hash of your account number and a bank secret key, known only to the bank and the ATM. This lets the ATM work offline but not know your personal PIN.

Later they let your PIN be changed by also storing an "offset" between the "real" PIN and the CSP (customer-selected PIN) for each…
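As I understand the IBM-style scheme, it works roughly like the sketch below. Heavy caveats: the keyed hash here is SHA-256 purely to keep the sketch self-contained (real deployments use DES under a bank-held PIN derivation key), and all function names are mine:

```c
#include <openssl/sha.h>
#include <string.h>

/* "Natural" PIN: decimalize the leading bytes of a keyed hash of the
   account number. The ATM can recompute this offline, knowing only the
   bank key -- it never needs to store or learn your personal PIN. */
void natural_pin(const char *account, const unsigned char key[16], int pin[4])
{
    unsigned char buf[64], digest[SHA256_DIGEST_LENGTH];
    size_t alen = strlen(account);

    if (alen > sizeof buf - 16)
        alen = sizeof buf - 16;              /* bound the copy */
    memcpy(buf, key, 16);
    memcpy(buf + 16, account, alen);
    SHA256(buf, 16 + alen, digest);
    for (int i = 0; i < 4; i++)
        pin[i] = digest[i] % 10;             /* decimalization */
}

/* Customer-selected PINs: store only the digit-wise offset from the
   natural PIN; the ATM recomputes the natural PIN and adds the offset. */
void pin_offset(const int customer[4], const int natural[4], int offset[4])
{
    for (int i = 0; i < 4; i++)
        offset[i] = (customer[i] + 10 - natural[i]) % 10;
}
```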

MD5 is effectively broken, replay attacks against debit and credit cards are commonplace (banks lose millions to it yearly), and the use of it in something as commercially sensitive as a credit card is the height of stupidity. "If" it is ever broken doesn't apply, because enough weaknesses have been established to prove conclusively that a total breakage is just a matter of time. If it hasn't already happened. I doubt organized crime or the NSA file the hashes they can break with the Hash Function Lounge.

your credit cards you carry around? the PIN number isn't stored on the card - but an MD5 hash of the PIN number *is* stored on the card (making replay attacks possible, believe it or not).

I sure as hell hope not! If that were the case, anyone with a card reader could brute-force your PIN in under a second by taking the MD5 hash of all 4-digit numbers and comparing them to the hash that is supposedly on the card.

an MD5 hash of the PIN number *is* stored on the card (making replay attacks possible, believe it or not).

if MD5 is ever cracked...

game over: anyone can get your PIN number.

Bullshit and chips. Look, there are only 10,000 possible PINs; do you know how long that would take to brute-force? Hell, a complete rainbow table is only 156k. Even if salted, do you know how long it takes to hash 10,000 4-digit numbers?
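To put numbers on it, here's a sketch of that brute force using OpenSSL's one-shot MD5 (the function name is mine). 10,000 MD5s of 4-byte inputs finish in well under a millisecond on any modern CPU:

```c
#include <openssl/md5.h>
#include <stdio.h>
#include <string.h>

/* Recover a 4-digit PIN from its unsalted MD5 by trying all 10,000 values.
   Link with -lcrypto. */
int crack_pin(const unsigned char target[MD5_DIGEST_LENGTH])
{
    unsigned char digest[MD5_DIGEST_LENGTH];
    char pin[5];

    for (int i = 0; i <= 9999; i++) {
        snprintf(pin, sizeof pin, "%04d", i);
        MD5((const unsigned char *)pin, 4, digest);
        if (memcmp(digest, target, MD5_DIGEST_LENGTH) == 0)
            return i;                        /* PIN found */
    }
    return -1;                               /* not a 4-digit numeric PIN */
}
```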

Most credit cards do, or at least can, have a PIN so you can use them to withdraw cash from a cash machine (ATM). In the UK, and increasingly in the rest of the world, you now enter your PIN rather than signing a piece of paper when making purchases too.

Replying to myself here, but any algorithm that starts by encoding the hash size is bad as well, IMHO, and there are some examples of that in the SHA-3 Zoo. If you have e.g. base64-encoded XML, you may not know the full length before decoding, so you cannot hash at the same time.

Several points about this:
- DES was never algorithmically broken -- it was just designed with too small a key size. 3DES effectively doubles the key size to something reasonable. MD5, however, is actually broken -- it has algorithmic weaknesses that can be exploited. Thus, it's not an analogous case.
- We know a lot more about hash functions now than was known when MD5 was designed. From new attacks (e.g. multicollisions) to new design techniques (e.g. HAIFA), there's a lot more knowledge for cryptographers to draw on…

The fact that the open source community still uses MD5 just shows how slow it is to move to new technology.
MD5 is broken; it's trivial to collide. There are free alternative hashing algorithms. Stop using MD5, stop using MD5, STOP USING MD5!

The first third of the submission is interesting, relevant and sane. The rest, especially the question, is based on so much misunderstanding of the topic at hand that I just lack the time to point it all out. I suggest the OP re-thinks the effort of switching to new hashes _and maintaining the old ones_ for a second or twenty. That should be a good starting point for some revelations.

Not strictly true. Rainbow tables are only feasible for very small inputs -- like 8-character-or-less passwords. Salting makes the minimum input larger (much larger, since salts are usually full binary, whereas password characters are almost always drawn from a small subset of possible characters). Of course, rainbow tables are absolutely useless if what you're hashing is, say, an entire file for a digital signature.

Anyone who has access to a set of password hashes will break some of them quickly. Just make sure your system is robust despite that (i.e., make sure that you can't get to a given set of password hashes unless you can already get to everything accessible using every password in that set).

Humans choose short, weak passwords, and always will. Make your system OK with that. There are plenty of ways, from limiting retries to using physical tokens. 4-digit PINs *work* for ATM cards, because the PIN isn't the only…

Fraudulent transactions using your pin number aren't covered by the law

Of course they are! If you steal PIN numbers and withdraw other people's money, do you think the cops will just say "let him go, he's not covered"? It's still fraud!

What you mean is that the banks won't automatically reimburse you. They often will reimburse when it's shown to be a crime, but they are wary because of the large number of "same address" offences where the victim does not want to press charges.

You cannot secure websites or fora using 2 factor security. It's not reasonable. Therefore, the hash is all you have.

Well, if you just decide that's true, of course your security will be a joke. If the "fora" (forum is an English word now, pluralized normally) you care about are the comments for your blog, joke security is probably enough, but if there's risk of financial loss then it may be worth spending a little money to get it right.

My bank uses 2-factor security, and the RSA key thingy is free - the bank comes out ahead on not eating fraud losses, plus it's good marketing. There are also software-only 2-factor solutions…

Not only did I not say anything about other attacks, but hashing passwords is probably the second-least-interesting application of a hash algorithm. None of the attacks against common hashes (MD5, SHA1) are even applicable to passwords. (The least interesting application is its use as a checksum.)

What does slow down a guessing attack is increasing the computational requirements of generating the hash, as is done with multi-round PBKDF. Alternately, all guessing attacks are rendered useless by selecting passwords…
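On the multi-round PBKDF point, a sketch of that stretching in C via OpenSSL's PBKDF2 (the salt handling, function name and iteration count are illustrative; a real system must store the salt and iteration count alongside the hash):

```c
#include <openssl/evp.h>
#include <openssl/rand.h>
#include <string.h>

/* Salted, iterated hash: every guess now costs the attacker `iters`
   underlying hash computations, and the salt defeats precomputed tables.
   Link with -lcrypto. */
int hash_password(const char *pass, unsigned char salt[16],
                  unsigned char out[32])
{
    const int iters = 100000;

    if (RAND_bytes(salt, 16) != 1)        /* fresh random salt per password */
        return 0;
    return PKCS5_PBKDF2_HMAC(pass, strlen(pass), salt, 16,
                             iters, EVP_sha256(), 32, out);
}
```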

Wikipedia:"The ideal hash function has four main properties: it is easy to compute the hash for any given data, it is extremely difficult to construct a text that has a given hash, it is extremely difficult to modify a given text without changing its hash, and it is extremely unlikely that two different messages will have the same hash. These requirements call for the use of advanced cryptography techniques, hence the name."

The whole point of the exercise is to find an algorithm that can't be easily reversed, and that's far from impossible.

Besides, hashes are never completely broken; at most you can mount various collision attacks. You never get away with putting in arbitrary data.

Did you really need a link to explain that? I mean, the fact that I'm deriving a 16-byte hash from a multi-gigabyte file should be a pretty good indication that there's no way to turn it around. Otherwise we'd have some really cool compression algorithms.

True enough, but by the same token, the inverse of a hash can then be considered any synonym if you only need one of the possible inputs to generate the same output. If a hash is badly broken, then it may be possible to algorithmically produce an infinite series of synonyms given some seed value that is one of those synonyms. If it is horribly horribly beyond broken, you can also show that there are no synonyms that are not in that series.

I tried reading your comment carefully, honest. I even focused on words 7-10, "hashes relying on is", because they seemed to almost make sense. But I'm afraid the bottom line is that there just isn't enough English, sense, and/or cryptography in that paragraph.

There is reversible and there's reversible. If you can conclude any interesting properties of the input message from the output that counts as being broken from the standpoint of being reversible. One example would be if you could conclude the input's last few bits must've contained an equivalent number of 0s and 1s, or the input was one of infinitely many prime numbers.

Except that's totally irrelevant, because in most cases we don't CARE about regenerating the original input. You can usually create a useful attack simply by finding SOME input that generates the same output, and that's what you want to build a hash function to resist.

I'll ignore your misuse of the term 'reversible', others have explained it.

Rainbow tables are only feasible against poor implementations, e.g. the Windows SAM hashes. Even the stored LM2 hash is susceptible to a rainbow table that can fit on a dual-layer DVD for over 99% of the keyspace. The old crypt in Unix systems is similarly weak (though still not nearly as much). The MD5 crypt implementation in /etc/shadow would require about 10^73 yottabytes of rainbow table to achieve the same end in the same way.

In other words, a dictionary attack on the password space, rather than precomputed tables of hashes, remains the biggest threat to /etc/shadow. No application in its right mind would fail to use a similar strategy to remember how to prove client knowledge of a password.

MD5 is not sufficiently broken yet to induce panic. As far as I understand, there is no attack yet that has sufficient control over the colliding data to be of consequence.

Besides, what would your proposal be? The other logical class of cryptography would be two-way, which fundamentally provides no security in these instances. Passwords are hashed so a server can prove a password is valid without having to know the password. If it were two-way, the crypted data and the key would both have to be accessible, making it trivial to break if you gain the privilege to read the password file today. The other major application is download verification: enabling a small amount of data, distributed in a more trustworthy way, to validate data transmitted in the most expedient way, or to validate future transfers once trust is established.
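For what it's worth, the MD5 crypt under discussion is one call away on glibc (a minimal sketch; link with -lcrypt, and note the "$1$" prefix selects the salted, multi-pass MD5 scheme):

```c
#define _XOPEN_SOURCE
#include <unistd.h>     /* crypt() on glibc */
#include <stdio.h>

int main(void)
{
    /* Output looks like "$1$saltsalt$<hash>": the algorithm id, salt and
       hash are stored together, which is exactly what /etc/shadow holds. */
    printf("%s\n", crypt("hunter2", "$1$saltsalt$"));
    return 0;
}
```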

The MD5 crypt implementation in /etc/shadow would require about 10^73 yottabytes of rainbow table to achieve the same end in the same way.

Since the whole idea of /etc/shadow is that it is not readable by anyone besides root, rainbow tables would be of no use whatsoever against it. Well, I suppose you could use them as an optimized dictionary...

Besides, doesn't the use of salt prohibit the use of rainbow tables, or at least grow them beyond any feasibility? Or did you take that into account in those…

Both /etc/shadow and the SAM database should not be readable by users, correct. The assumption is that some offline attack or online exploit is leveraged first in either case. For example, a local hard disk from a workstation is extracted and the local administrator/root password cracked. Chances are high that the password is the same on other workstations that may be harder to mount an offline attack on.

I counted the salt in the 10^73 yottabytes. Which I agree is beyond any feasibility for presumably a long…

It does seem like that would be true, but in practice it's entirely possible that cracking hash functions (and block ciphers) is a computationally hard problem (in the "you can't do it" sense). The class of problem, in the general case, is NP-complete [wikipedia.org].