I just want to ask how difficult it is to spoof the original MD5 sum (e.g., assume the md5sum itself is reachable through HTTPS!).

So we have:
- on the ubuntumirrorsrv: XY md5hash, and XZ ubuntu iso
- on mypc (the downloaded iso from ubuntumirrorsrv): XY md5hash, and XY! ubuntu iso.

So could the md5hash still be the same as the original one on the ubuntumirrorsrv if there were a "mitm" attack (e.g., by one of my ISPs) that modified the ubuntu iso by putting a trojan in it (+- a few MBytes)? How difficult could that be?

2 Answers

It would be difficult to the point where seriously suggesting that it is even remotely possible verges on lunacy.

There have been some demonstrations of theoretical attacks against MD5 wherein the "attacker" could create message data intended to yield a predetermined MD5 hash. But this is miles and miles away from adding a non-gibberish file to an ISO and having it give the same hash.

A much more likely attack scenario would be the MitM altering the page that lists the MD5sums before it gets to you so that you see the attacker's hash rather than the real one. However unlikely this may be, here are the hashes for your comparison:

Agreed: +1. A corrupted image that satisfies the md5 check is a lot more probable. A usable trojaned ISO that satisfies the md5 check? Not so much. And IF you can manage to do it... you need to be working for DoD. A lot of people misunderstand the md5 collision attacks that relate to key signing.
– hbdgaf, Jul 19 '11 at 13:30

@aking1012 Exactly my point. It'd be an enormous undertaking to find any dataset that matched the MD5sum exactly. But engineering a functional ISO with a malicious program included that matches the MD5sum approaches madness to even suggest.
– Andrew Lambert, Jul 19 '11 at 18:37

Agreed that eventual collision is inevitable...but I think it has to do with context and data size as well. Just thoughts. I would find a trojaned ISO with matching checksum an interesting point for examination if ANYONE EVER sees one.
– hbdgaf, Jul 20 '11 at 11:24

That would be called a second preimage. For a hash function h with an output of n bits, there are three kinds of attacks that we consider; for each, there exists a generic algorithm with a high cost, and the function is deemed secure if we cannot find any method which is faster than the generic algorithm. The attacks are:

Preimage: given x (an n-bit string), find m such that h(m) = x. The generic attack has average cost 2^n (expressed in evaluations of the function h over small inputs): it works by trying random messages m until hitting x (the "luck and pray" attack).

Second preimage: given m (a given string), find m' distinct from m and such that h(m) = h(m'). This is the case you envision here. The generic attack is again of cost 2^n and is similar to the preimage attack.

Collisions: find m and m', distinct from each other, such that h(m) = h(m'). The generic attack has cost 2^(n/2) (known as the birthday attack).
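To get a feel for the gap between the collision cost and the second preimage cost, here is a minimal sketch that runs both generic attacks against MD5 truncated to 16 bits. The truncation, the `BITS` constant, the message formats and the function names are my own choices for illustration; real MD5 has 128 bits and neither loop would finish on it.

```python
import hashlib
from itertools import count

BITS = 16  # assumption: tiny truncated output so both searches finish quickly


def h(message: bytes) -> int:
    """Return the first BITS bits of MD5(message) as an integer."""
    digest = hashlib.md5(message).digest()
    return int.from_bytes(digest, "big") >> (128 - BITS)


def find_collision():
    """Birthday search: find ANY two distinct messages with the same hash."""
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        m = b"msg-%d" % i
        x = h(m)
        if x in seen:
            return seen[x], m, i + 1
        seen[x] = m


def find_second_preimage(original: bytes):
    """Brute force: find a DIFFERENT message with the same hash as `original`."""
    target = h(original)
    for i in count():
        m = b"try-%d" % i
        if m != original and h(m) == target:
            return m, i + 1


original = b"stand-in for the ISO contents"  # hypothetical placeholder data
m1, m2, collision_tries = find_collision()
m3, preimage_tries = find_second_preimage(original)

print("collision found after", collision_tries, "hash evaluations")
print("second preimage found after", preimage_tries, "hash evaluations")
```

The collision search is expected to need a few hundred evaluations (around 2^8), the second preimage search tens of thousands (around 2^16). The exact counts depend on the fixed inputs, so I am not quoting them. An attacker swapping your ISO is in the second situation: the target hash is fixed in advance.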

MD5 has a 128-bit output. 2^128 evaluations is a very high cost and should provide adequate security (it is one billion billion times higher than what is technologically doable right now, even with a Google/Facebook-like budget). On the other hand, 2^64, while still very expensive (months of computation with thousands of computers), has already been demonstrated once (see distributed.net).
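As a rough back-of-the-envelope check of those orders of magnitude, the arithmetic can be sketched like this. The rate of 10^12 hash evaluations per second is purely an assumed figure for a large distributed effort, not a measured one; change it and the conclusions barely move.

```python
# Assumption: aggregate rate of a large distributed effort (hypothetical figure).
HASHES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_needed(exponent: int) -> float:
    """Years needed to perform 2**exponent hash evaluations at the assumed rate."""
    return 2**exponent / HASHES_PER_SECOND / SECONDS_PER_YEAR


print("2^64 evaluations :", years_needed(64), "years")
print("2^128 evaluations:", years_needed(128), "years")
```

At that assumed rate, 2^64 comes out at roughly seven months, which matches the "months of computation" above, while 2^128 lands on the order of 10^19 years.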

Moreover, a number of weaknesses have been found in MD5, allowing for a very efficient algorithm for generating collisions (with my PC I can generate one MD5 collision in 14 seconds on average -- using a single core). For that reason, MD5 is not considered secure anymore. But no shortcut for second preimages is currently known. The existence of weaknesses leading to easy collisions shows that the internal structure of MD5 is not "garbled enough" so we have reason to worry about preimage attacks which might be found in the near future. But, right now (July 2011), no such attack is publicly known.

So the answer to your question is that it would be overwhelmingly difficult to send you an altered ISO which would end up with the same MD5 hash as the original one. But the Ubuntu distributors would be well advised to begin publishing SHA-256 hashes too. Just in case.
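If a SHA-256 value is available, checking a download against it takes only a few lines. This is a minimal sketch using Python's standard `hashlib`; the function names `file_digest` and `matches` are mine, and the expected value has to come from a source you trust.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so a large ISO never has to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare the file's digest with a published hex digest (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

You would call something like `matches("ubuntu.iso", published_sha256)`, where both the file name and `published_sha256` are placeholders. Passing `algorithm="md5"` gives the same result as `md5sum`. Keep in mind this only tells you the file matches the hash you were shown, not that the hash itself is genuine.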

I was surprised when I looked for SHA1/256/512/etc. hashes of the ISOs and found none! One would think they would be doing those, at least, if not signing the file with a private key.
– Andrew Lambert, Jul 15 '11 at 23:16

Probably because this still wouldn't mitigate the most probable attack (which Amazed pointed out in his answer): Getting the MD5 from the same source, over the same channel as the file itself will not provide any authenticity. MD5 serves as nothing more than a checksum here.
– freddyb, Jul 16 '11 at 21:32


@Amazed, see the link from LanceBaynes. The sha* signatures exist, and have for several years. Plus the checksum files have been signed with GPG. Saying they don't care is completely incorrect. Your bug about requiring https for access to the signatures wouldn't even really help. If you are that paranoid, use GPG.
– Zoredache, Jul 20 '11 at 19:03