An elderly application was trampled to death today as everyone struggled to exit the Sha One theater after someone screamed that an unknown assailant had a knife. After the panic, there was no evidence of injuries from the alleged attack and police are still investigating the presence of an actual weapon.

It's a collision attack, which means you can make two files with the same SHA-1 hash in 2^52 operations (via the birthday paradox).

In the best possible application of this attack, you can make two files, one good and one incriminating, and get somebody to sign the good one.

The two changeling files are generated via a randomizing process, so generating meaningful text files really isn't possible; the files have to contain binary data with a 'random' appearance. Examples of this would be crypto keys, SSL certificates, stu

Actually, the birthday attack IS useful in the real world. The birthday attack has been successfully used to dupe Verisign into issuing an SSL certificate to an address of the attacker's choosing! The attackers simply created two CSRs with the same MD5 hash--one for their website (which they had signed) and another for their victim's site.

Had they been blackhats instead of security researchers, they could have performed flawless MITM against SSL sites.

Nice optimisation, but this sort of stuff doesn't belong in most programs. Compiler/machine optimisations should live in libraries where possible (ESPECIALLY in specialist, highly performance-hungry areas like bignum libraries), or be supported directly in the language/compiler itself. If you find some particular code to be an unacceptable bottleneck, or you're finished implementing the project and have fixed every bug and feel the universe isn't quite right without some new ones, then it's time to start making your

I've read that there are places where the people are so uneducated they often get their commas and periods mixed up. Probably some silly problem with typewriter keys or something, I guess. Anyhow, how could you expect those backward people, who don't even know the right symbol for a decimal point, to be able to do math?

Sheesh! Good thing you pointed out that bad math to the poor slob. I bet he's feeling pretty stupid right now.

If I understand the attack correctly, I think most real-world SHA-1 usage should be secure for the time being. From the looks of it, the researchers were able to reduce the time necessary to find two inputs that hash to the same digest. This is very different from finding a second input that hashes to a known digest. If that were the case, common hash applications like storing password digests or digital signatures would be vulnerable.
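To make that distinction concrete, here's a toy sketch (mine, not anything from the researchers' paper) that truncates SHA-1 so both searches finish quickly; the work ratio is the point, not the specific numbers:

```python
import hashlib
import itertools

def toy_hash(data: bytes, bits: int) -> int:
    """SHA-1 truncated to `bits` bits, so brute force is feasible for a demo."""
    return int.from_bytes(hashlib.sha1(data).digest()[:4], "big") >> (32 - bits)

def find_collision(bits: int):
    """Find ANY two inputs with the same digest: ~2^(bits/2) tries (birthday bound)."""
    seen = {}
    for i in itertools.count():
        msg = b"msg-%d" % i
        h = toy_hash(msg, bits)
        if h in seen:
            return seen[h], msg
        seen[h] = msg

def find_second_preimage(target: bytes, bits: int):
    """Match the digest of one FIXED input: ~2^bits tries, vastly more work."""
    goal = toy_hash(target, bits)
    for i in itertools.count():
        msg = b"alt-%d" % i
        if msg != target and toy_hash(msg, bits) == goal:
            return msg

a, b = find_collision(24)             # a few thousand hashes
c = find_second_preimage(b"hi", 16)   # tens of thousands, even at a smaller size
assert a != b and toy_hash(a, 24) == toy_hash(b, 24)
assert toy_hash(c, 16) == toy_hash(b"hi", 16)
```

The new SHA-1 result speeds up the first kind of search; nobody has shown a comparable shortcut for the second.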

See also: idioms. No one where I live, ditch digger or professional, would raise an eyebrow at that phrase. Might I suggest you find larger grammatical fish to fry, or perhaps resolve not to get worked up over regional slang?

What country do you live in? In the US, "off of" is a well-established phrase. A patient may move "off of" one drug to another, or an IT system may move "off of" a particular platform to a newer platform.

Really stupid question (not a cryptographer), but is there anything wrong with using multiple hash algorithms (hopefully none derived from one another)? Surely breaking two or more hashes simultaneously would be far harder?

E.g., MD5 is broken. But what if we use both MD5 and SHA-1?

Unfortunately not; once you've broken the strongest hash, the rest can be broken in polynomial time.

This has nothing to do with multiple hash algorithms. What you're referring to is that finding an n-way collision from a 2-way collision is polynomial time. That is, a 2-way collision is two documents with the same hash, and an n-way collision is n documents with the same hash.

Finding a pair of documents that have the same SHA1 hash doesn't help you find a pair of documents with the same MD5 hash. Indeed, none of the efficient-collision algorithms allow you to find collisions in both SHA1 and MD5 simultaneously. (Note that, as far as I know, there aren't even any efficient preimage attacks on MD5 or SHA1, only collision attacks.)

Finding a pair of documents that have the same SHA1 hash doesn't help you find a pair of documents with the same MD5 hash.

No, but if you can find one you can also find many, and in the many you can find an MD5 collision. From the link:

Joux then extended this argument to point out that attempts to increase the security of hash functions by concatenating the outputs of two independent functions don't actually increase their theoretical security. For example, defining H(x) = SHA1(x) || RIPEMD160(x) still gives you only about 160 bits of strength.

"||" is concatenation. The theoretical argument, in brief: with the Merkle-Damgaard construction (used in MDx, SHAx, RIPEMDx), it takes a factor k to find 2^k collisions, over finding just a single one. By finding many collisions in one hash function, we have enough messages that they're likely to collide in the second as well.
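Here's a toy illustration of Joux's point (my own sketch, using a deliberately weakened 16-bit Merkle-Damgaard hash, not anything from his paper): three cheap block-level collisions combine into 2^3 = 8 messages that all share one digest.

```python
import hashlib
from itertools import product

def compress(state: int, block: bytes) -> int:
    """Toy Merkle-Damgaard compression: 16-bit state from truncated SHA-1."""
    d = hashlib.sha1(state.to_bytes(2, "big") + block).digest()
    return int.from_bytes(d[:2], "big")

def toy_hash(blocks) -> int:
    state = 0
    for b in blocks:
        state = compress(state, b)
    return state

def collide_one_block(state: int):
    """Birthday-search two blocks taking `state` to the same next state (~2^8 tries)."""
    seen = {}
    i = 0
    while True:
        block = b"blk%05d" % i
        nxt = compress(state, block)
        if nxt in seen:
            return seen[nxt], block, nxt
        seen[nxt] = block
        i += 1

# k block collisions cost only k * 2^(n/2) work, yet yield 2^k colliding messages...
state = 0
choices = []
for _ in range(3):
    b1, b2, state = collide_one_block(state)
    choices.append((b1, b2))

messages = list(product(*choices))               # 2^3 = 8 distinct messages
assert len({toy_hash(m) for m in messages}) == 1
# ...enough candidates that a second, independent hash is likely to collide too.
```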

For most systems, think about it in terms of the application developer. Say you encrypt a piece of e-mail with a password-based secret key. (Some function k(p) turns a password p into an n-bit key.) If you supply the wrong password, what should the system do? Usually you want the strength of the system to rest on the cryptography and not on the inability to tell if a decryption was successful, so you simply include a marker to indicate whether the decryption was successful. (There are standard padding and encoding

You can't be "sure" about much in cryptography. If m is a message, and f and g are hash functions, you are assuming that you get more security from f(g(m)) than from f(m). You have to define what kind of security you are hoping for. With hash algorithms, you usually want: given m', it is difficult to find some m where h(m) = m'.

So to answer your question, using f(g(m)) would only be more secure than f(m) if f contains some vulnerability and no cryptanalytic vulnerability is introduced by combining f and g. If I don't have any information about either type of vulnerability, I would have no reason to assume that f(g(m)) is more secure than f(m). In this situation, I would stick with f(m) because it has been studied more and people have probably spent time trying to break it.

To put it another way, breaking f(g(m)) does not necessarily require breaking both f and g. The resulting algorithm that you get by composing f with g is still only one algorithm (just like SHA-1), and that is the only algorithm that needs to be broken.

I'm not so sure he's talking about applying one hash to the other's output, as much as performing both hashes on the same material and storing both results, also checking both results. Then you'd have to create a collision for both hashes in order to beat the system.

Yes, sorry. I meant to say you have message M, and you hash M with multiple algorithms, a(M), b(M),..., coming up with hashes A, B,... Now you have message N, and a(M) == a(N). It shouldn't follow (always, that is) that b(M) == b(N), especiall

An ideal cryptographic hash should require exponential (in the hash length) time to break. A "broken" hash, like MD5 or SHA-1, is one for which an algorithm exists that can find collisions in less than exponential time.

Combining hashes is the cryptographic equivalent of making a single, longer hash. If you combine broken hashes, you have a long broken hash, which would actually be far easier to break than an ideal cryptographic hash of the same length.

No; not by enough to make it worthwhile, possibly not at all. E.g. if you're willing to use 256 bytes of hash, you're much better off using SHA-256 than two 128-byte hashes.

Makes some sense, but the difference is, they only have to break one algorithm. He is looking at it like "they broke SHA-1, big deal, break another well-tested and secure hash algo". But I would say that the more different hashes you throw in, the better (though the longer validation process somewhat defeats the purpose of a hash, since it's meant to be a short validation).

"if you're willing to use 256 bytes of hash, you're much better off using SHA-256 than two 128-byte hashes."

Proof or citation please?

That said, SHA1 is 160-bit (not byte), and MD5 is 128-bit. So let's just have a proof that SHA256 is harder to crack than SHA1 + MD5 (where the hashes are used independently on the same data; e.g. not something like hash = sha1(md5(data)), but rather hash = concat(sha1(data), md5(data))).
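In hashlib terms, the two constructions being distinguished look like this (a sketch of the definitions only, not a security recommendation):

```python
import hashlib

def concat_hash(data: bytes) -> bytes:
    """Independent hashes, concatenated: 160 + 128 = 288 bits of output."""
    return hashlib.sha1(data).digest() + hashlib.md5(data).digest()

def nested_hash(data: bytes) -> bytes:
    """Composition instead: SHA-1 run over MD5's 16-byte digest."""
    return hashlib.sha1(hashlib.md5(data).digest()).digest()

msg = b"some document"
assert len(concat_hash(msg)) == 36   # 288 bits of output...
assert len(nested_hash(msg)) == 20
# ...though per Joux's argument, the concatenation's collision
# resistance stays near the stronger component's ~160 bits.
```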

"It was pointed out in the questions that another reason for concatenating hashes is not to try to increase the theoretical security, but for practical considerations in case one of them gets broken. This is probably why SSL, for example, used MD5 along with SHA1. That is still a valid reason."

Thus I'm still not sure that one would be better off using a single X-bit hash.

Meh, only for key diversification and PIN hashing. Both are pretty safe. Of course, that won't matter to people who are not too deep into cryptography. Once they hear it is unsafe they will try and migrate away.

So many major systems are secured with PK systems that depend on SHA-1 hashes now. If this can be broken, someone please put this to good use by making a collision that makes it possible for people to write homebrew code for the PS3 or 360.

I keep hearing about all these hash collisions and how I should be scared, but I wish I could at least get the good with the bad.

AFAICT collisions aren't much use for such things, unless you could make two apps, one malicious and one non-malicious, that collided, and somehow get the non-malicious one signed to high privileges. But if you are in a position to do that, you can just get code with a hidden malicious feature signed.

One specific thing that would really help would be if debian would make it a priority to do a complete job of packaging the relevant hash functions, along with bindings for popular languages. For instance, I have an open-source perl app [lightandmatter.com] that uses digital watermarks. The user can choose between SHA1 and Whirlpool. However, I want to keep my app simple for users to install, and the perl binding for Whirlpool hasn't been packaged for debian yet, so I've made SHA1 the default. A debian user who wants to use Whirlpool with my app has to jump through hoops, e.g., installing the perl module via CPAN. That's actually a real pain for a debian or ubuntu user, because CPAN and apt don't play nicely; you can get in all kinds of screwed-up states if you try to install half your perl modules using apt and half using CPAN.

TFA is talking about gpg. Well, realistically, the choice of hash function is not the bottleneck in terms of security when it comes to sending encrypted or signed email. The bottleneck is mainly just that it's too hard to use (it isn't built in to most GUI email clients), and in the case of encryption it also suffers from negative network effects -- there's no big benefit to me from using gpg encryption in my email unless the people I'm communicating with also use the technology. The world's best crypto doesn't do you any good if you don't use it because it's too much of a pain.

I think gpg is clearly a case where the perfect has been the enemy of the good. They've been so hung up on protecting the user against obscure vulnerabilities that they've ended up making the darn thing too hard for the vast majority of users. The docs, last time I checked, were basically written in Martian. I have a bachelor's degree in math, I program computers as a hobby, and I've read Schneier's Applied Cryptography. I'm not claiming that makes me a big expert on crypto, but it does put me out in front of the vast majority of the population. Well, I simply cannot figure out all the ins and outs of gpg. Okay, I could, but it would take more time than I'm willing to invest.

installing the perl module via CPAN. That's actually a real pain for a debian or ubuntu user, because CPAN and apt don't play nicely; you can get in all kinds of screwed-up states if you try to install half your perl modules using apt and half using CPAN.

IME, 'dh-make-perl' does a pretty good job of making .debs from CPAN packages.

Of course, this doesn't invalidate your actual point that the bindings should be in there in the first place.

According to x509(1) and ca(1), OpenSSL supports md2, md5, sha1, and mdc2 as options for message digests for certificates. Since MD2 and MD5 are already broken, and SHA1 is now suspect, that leaves just the relatively obscure MDC-2 [wikipedia.org].

They're like locks, they make getting in hard enough that most people will look for an easier target.

And they're unlike locks, in that a fruitful attack can occur many years afterwards. A lock need only supply protection for a specific period of time - if no bad guys get in during that period, then the security can be regarded as perfect no matter how insipid in design. In cyber-security, even "near-perfect" is as imperfect as "completely lacking" - at least for high priority targets with legacy value.

Well, we're then assuming that the encrypted file will remain where it is (and in its current form).. if the object behind the locked door was to remain there indefinitely, the same sort of disclaimers apply (the better technology gets, the easier it will be to open)..

Considering the time for even a modestly skilled person to get into most locks is < 5 minutes (home locks anyway), I would say that it is not perfect.

High-security locks take longer, as does high-security encryption. We only need to use algorithms that have an MTBF of around 1000 years today, and barring quantum breakthroughs you're pretty safe. I mean, how long does even the most sensitive data need to remain protected? 30 years?

I guess if you are a high-profile politician/activist, or a murderer a little

I mean how long does even the most sensitive data need to remain protected? 30 years?

The exact method for constructing an atomic bomb is still secret, though most nuclear physicists could give you a good surmise... The exact method for constructing a Teller-Ulam thermonuclear bomb is also still a secret that we wouldn't want to be generally available, and that's over 50 years old.

You do make a good point though, that "planned obsolescence" is a good feature to look for, particularly in a government encryption algorithm. The problem is that when attacks are found in these algorithms, they t

No, in proper cybersecurity, one should realize exactly that a particular protection mechanism is only likely to be secure for a limited period of time. There are even expected security times associated with different algorithms and key sizes.

This is why, for example, signing keys should only be valid for a limited period of time. It would be foolish to assume that it will remain secure forever, so it should be designed to become useless before it becomes insecure.

Right, almost secure isn't good enough for privacy, where you don't want others to know the contents of a file even after you have passed away. "Mostly secure" is good enough for multi-million dollar transactions that will be vapor soon after they are completed. As long as someone can't tamper with them during the transfer, it's OK.

A paradox in that an extremely valuable transaction may need less security than an almost worthless (to anyone else) file.

A filing cabinet with 50-year-old government secrets might need to be as physically secure now as it was 50 years ago. Whereas my expired and canceled credit card numbers, not so much.

Yes, but the value of physical security is cumulative. If you manage to protect government secrets for 50 years - even if this involves a $2 padlock and a footlocker - the security can be upgraded at any point to a higher level suitable for current threats. Cyber security on the other hand is only as good as its weakest expression over those 50 years.

If you manage to protect government secrets for 50 years - even if this involves a $2 padlock and a footlocker - the security can be upgraded at any point to a higher level suitable for current threats. Cyber security on the other hand is only as good as its weakest expression over those 50 years. Expose a rot13 copy of a file even one time and it doesn't matter if you later re-encrypt the file using the NSA's latest and greatest algorithm.

"Perfect security" varies according to what constitutes "perfect". For example, one-time pads can be considered perfect, if you can guarantee the security of the pads.

(One suggested method is to use a cosmological source of randomness and transmit, encrypted, the position of the radio source. Since any time delay will result in not having the same pad as the people on either side, the pad is essentially guaranteed secure even if the encryption is broken, provided it cannot be decrypted through the attack faste

Over-generalization is generally a bad idea. MD5 and SHA-1 were flawed. That does not mean all cryptographic hashes are flawed. One could conceive of a cryptographic hash long enough that a computer as large as the universe would be required to successfully attack it within the lifetime of the Sun.

A related illustration is that OTP can provably provide perfect encryption (this is not the same as hashing, however).

There are some for which no known attacks exist. SHA-256 and SHA-512, Whirlpool and Tiger are all pretty thoroughly-reviewed with no weaknesses uncovered. The NIST hash function competition is causing a great deal of new hash function research and we'll almost certainly get a bunch of great new hash functions out of it -- many of them not only secure, but significantly faster than SHA-1.

If you're thinking that "no known attacks" isn't good enough, keep in mind that's as good as it ever gets in cryptography, with the sole exception of the One Time Pad.

If you're thinking that "no known attacks" isn't good enough, keep in mind that's as good as it ever gets in cryptography

I think my math teacher would call that a "necessary but not sufficient" condition. I mean, anything can be without known attacks by virtue of never having been reviewed. Minimums should include:

1. Published algorithm, no "secret sauce" security by obscurity
2. Solid peer reviews by other cryptographers, definitely not just the vendor or their hirelings
3. Strong links to well-researched hard mathematical problems

Of course, nothing can guarantee that the NSA hasn't found some super-secret math thingie that'll cut through it like a knife through hot butter. But cryptography is also about eating your own dog food: if you don't use it for anything valuable, who'd trust it? You can't really keep that a secret, because you have to tell lots of people that this isn't really secure, or they'd use it as if it were. And if you do use it for your valuables, would you really leave that kind of backdoor for someone else to find? Again, it doesn't prove anything, but I think most modern crypto algorithms have no weaknesses known to anybody, and if one showed up it'd be just as big an OMG for those who made/approved it as for everyone else.

The crypto lounge and hash lounge have a list of known flaws/attacks/documented partial attacks in algorithms. It's a good place to start... well, to start being paranoid, anyway, as more than a few of the algorithms listed have a nice skull-and-crossbones by them.

This is usually only true for asymmetric (public-key) crypto. There are hash functions which are provably secure if some math problem is hard... pretty good math problems too (discrete log, not something "easier" like DDH).

But they're fabulously slow, not to mention they operate over prime-order fields instead of on bytes (this is a bigger problem for block ciphers than for hashes). We accept this slowness for public-key crypto because it seems to be required, and because only the header has to go through

If you're thinking that "no known attacks" isn't good enough, keep in mind that's as good as it ever gets in cryptography

I think my math teacher would call that a "necessary but not sufficient" condition. I mean, anything can be without known attacks by virtue of never having been reviewed. Minimums should include:

1. Published algorithm, no "secret sauce" security by obscurity
2. Solid peer reviews by other cryptographers, definitely not just the vendor or their hirelings
3. Strong links to well-researched hard mathematical problems

Absolutely correct, with the partial exception of #3. It would be nice, perhaps, if #3 applied, but most symmetric ciphers and secure hash algorithms rely on complexity rather than hard problems.

More like no attacks are feasible. If you have an N-bit document, an N-bit quantum computer can attack it. Since the largest quantum computer known is like 3 bits, you should be safe for a while.

There is no known quantum computer algorithm for attacking any of these hash functions. Shor's algorithm would allow a quantum computer to attack n-bit RSA with an n-bit quantum computer, but I've never heard of any counterpart for any of these hash algorithms, and it would be algorithm-specific.

Also, there may indeed be feasible attacks on the hash algorithms I mentioned. It's just no one knows of any, yet, and lots of very smart people have looked.

One Time Pad has no technical flaws, but it still has to be used correctly. I remember hearing that's how the US broke a Russian nuclear spy ring: the Russians got lazy with the one time pad, and the US cryptanalysts had enough info to see what was happening.

My basic point - if you fix the human side of all these encryption issues, you'll be plugging up a lot of holes. Don't expect a 'perfect security' you can set and forget.

Enigma was an approximation to a one-time pad (mostly because you could reconnect the plug boards as you liked and re-order the wheels as you liked), but was likewise cracked because of carelessness of use.

Enigma was a stream cipher, not an approximation to a one-time pad. The configuration of plug boards, wheel order and initial wheel positions made up the key.

Essentially, that's exactly what a stream cipher is. A OTP is just a random bit stream as long as the message which you XOR with the message. A stream cipher is a pseudo-random bit stream as long as the message which you XOR with the message.

But a stream cipher, being pseudo-random, is by definition not an OTP. The fact that the keystreams are generally used the same way does not make a non-OTP a kind of OTP.

Sorry if this seems like nitpicking, but it's not. From an information theoretic perspective, the difference between a strong PRNG and an OTP is enormous. In particular, it's large enough that the PRNG does not have an OTP's perfect security.
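The XOR mechanics both posters are describing can be sketched like this; the only difference between the two halves is where the bit stream comes from (the seed value is just illustrative):

```python
import os
import random

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

msg = b"attack at dawn"

# One-time pad: truly random key, as long as the message, never reused.
pad = os.urandom(len(msg))
ct = xor_bytes(msg, pad)
assert xor_bytes(ct, pad) == msg        # XOR with the same pad decrypts

# Stream cipher: the "pad" is PSEUDO-random, expanded from a short seed.
def prng_stream(seed: int, n: int) -> bytes:
    rng = random.Random(seed)           # toy PRNG, not cryptographically secure
    return bytes(rng.getrandbits(8) for _ in range(n))

ct2 = xor_bytes(msg, prng_stream(1234, len(msg)))
assert xor_bytes(ct2, prng_stream(1234, len(msg))) == msg

# Same mechanics, but the short seed is exactly what forfeits the OTP's
# information-theoretic ("perfect") secrecy.
```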

One Time Pad has no technical flaws, but it still has to be used correctly. I remember hearing that's how the US broke a Russian nuclear spy ring: the Russians got lazy with the one time pad, and the US cryptanalysts had enough info to see what was happening.

You're talking about the Venona project [wikipedia.org]. A very impressive bit of cryptanalytic work.

This is just an idea, but wouldn't it be better for hash algorithms to be slower since it'd make it harder to brute force?

You can always slow down a hash algorithm by iterating it, if you really want to. But there are lots of applications where they need to be fast, and unless the algorithm is flawed, brute force is simply impractical no matter how fast the hash is.

Finding a second preimage by brute force for a specific input and an unflawed 256-bit hash algorithm will require, on average, hashing about 2^256 trial inputs. That's an impossibly huge number.

Even if you just wanted to generate *any* collision, so you get a major helping
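Iterating a hash, as suggested above, is also the idea behind deliberate key stretching for password storage; a minimal sketch (the round count and salt are illustrative):

```python
import hashlib

def stretched(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Naive stretching: feed the digest back through SHA-256 `rounds` times."""
    h = hashlib.sha256(salt + password).digest()
    for _ in range(rounds):
        h = hashlib.sha256(h).digest()
    return h

# Every extra factor in `rounds` costs an attacker the same factor per guess,
# while a legitimate login only pays it once.
tag = stretched(b"hunter2", b"per-user-salt", 100_000)
assert tag == stretched(b"hunter2", b"per-user-salt", 100_000)  # deterministic

# In practice you'd reach for a vetted construction instead of rolling your own:
key = hashlib.pbkdf2_hmac("sha256", b"hunter2", b"per-user-salt", 100_000)
assert len(key) == 32
```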

That is not what secure means with regard to hash functions. Secure means that it is not feasible to construct a document which has the same hash value as a given document (pre-image attack) or to construct two documents which have the same hash value (collision attack). The complexity of these attacks is ideally such that simply enumerating documents is the fastest way (brute force). Reducing the number of documents which you have to try to find a match makes a successful attack more likely. The complexity which is deemed as breaking the hash function depends on the adversaries and time frames relevant to a particular application.

Hash functions have many uses, a hash function can be perfectly secure for one application but not for another. (In fact this is practically always the case for defects found in modern hash functions.)

Another big issue with a lot of current hash functions (at least MD5, and I think SHA1 too) is that their block-based nature means that if you can reasonably generate two colliding chunks of "random garbage", you can reasonably generate two files with a prefix you choose BEFORE generating the collision, followed by the two sets of "random garbage", followed by a suffix you choose AFTER generating the collision.

Combine that with a carefully selected file format (IIRC it can be done with pdf) and you can make two
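That prefix/suffix property follows from the chained (Merkle-Damgaard) structure. Here's a toy demonstration with a deliberately weak 16-bit hash of my own, since real MD5 collision blocks won't fit in a comment:

```python
import hashlib

def compress(state: bytes, block: bytes) -> bytes:
    """Toy Merkle-Damgaard step: 2-byte chaining value from truncated SHA-1."""
    return hashlib.sha1(state + block).digest()[:2]

def md_hash(msg: bytes) -> bytes:
    state = b"\x00\x00"
    for i in range(0, len(msg), 8):        # 8-byte blocks; inputs kept aligned
        state = compress(state, msg[i:i + 8])
    return state

prefix = b"CHOSEN--"                       # picked BEFORE finding the collision
state = compress(b"\x00\x00", prefix)

# Birthday-search two "random garbage" blocks that collide from that state.
seen = {}
i = 0
while True:
    block = b"g%07d" % i
    nxt = compress(state, block)
    if nxt in seen:
        g1, g2 = seen[nxt], block
        break
    seen[nxt] = block
    i += 1

suffix = b"SUFFIX!!"                       # picked AFTER finding the collision
m1, m2 = prefix + g1 + suffix, prefix + g2 + suffix
assert m1 != m2 and md_hash(m1) == md_hash(m2)
```

Once the chaining values agree, any shared suffix keeps them agreeing, which is exactly why the prefix and suffix can be chosen freely around the garbage blocks.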

As processing power increases, brute force is getting easier. But when you find a way of cutting down the brute force required, that is when something can become very weak very quickly.

Not really. As processing power increases, the internal state and number of rounds can increase as well, together with the complexity of calculations. This might be a problem if you have tiny processors, though (smart cards, some PDAs, etc.).

And you cannot cut down on brute force by doing something smart, because the definition of brute force is that you DON'T do anything smart (duh).

The problem with secure hash algorithms is more that it is impossible to prove that they are secure. You can try and use as many

I think this would imply P != NP. While P != NP is a reasonable thing to believe, it hasn't been proven yet, so if a hash function was proven to be perfectly secure, we would likely have a proof that P != NP.

Well... not exactly. If the hashes produced by a given hash function H have length O(1), then let n be the smallest number larger than the length of the longest hash. There are at most 2^n - 1 different output values: 1 + ... + 2^(n-1) = 2^n - 1. So run through 2^n distinct valid inputs and hash them; by the pigeonhole principle, two will be equal.
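The pigeonhole step is easy to watch happen with a deliberately tiny hash (just one byte of SHA-256, so only 256 possible outputs):

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """8-bit hash: the first byte of SHA-256."""
    return hashlib.sha256(data).digest()[0]

# 257 distinct inputs into 256 possible outputs: a collision is GUARANTEED,
# no cryptanalysis required.
seen = {}
collision = None
for i in range(257):
    msg = b"input-%d" % i
    h = tiny_hash(msg)
    if h in seen:
        collision = (seen[h], msg)
        break
    seen[h] = msg

assert collision is not None
a, b = collision
assert a != b and tiny_hash(a) == tiny_hash(b)
```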

I hate to break this to you, but any "well-documented means of turning any hash function into an encryption algorithm" *is* an algorithm specifically designed for encryption, so you're still breaking the rules.

Yup, you can use it to create a stream cipher quite easily. There's very little reason to do this, though, and stream ciphers have their share of problems (if only the complexity in comparison with block ciphers in CBC mode and their unavailability in crypto libraries). Using AES (or 3DES if AES is not available) in CBC mode is the normal way to do things.
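One common way to build that stream cipher is to run the hash in a counter mode; a sketch (my own construction for illustration, not a vetted cipher):

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Concatenate H(key || nonce || counter) blocks until `length` bytes exist."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt and decrypt are the same operation, as with any XOR stream cipher."""
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

ct = stream_xor(b"key", b"nonce-01", b"secret message")
assert stream_xor(b"key", b"nonce-01", ct) == b"secret message"
```

As with any stream cipher, never reusing a (key, nonce) pair is essential; two messages under the same keystream XOR together into plaintext-revealing garbage.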