It's a HASH, for crying out loud. It's not meant to be provably perfect at identifying unique data streams.

It IS meant to be computationally infeasible to find two messages with the same hash. If someone has found a way to do that, MD5 has failed at its original purpose.
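
To make "two messages with the same hash" concrete, here is a minimal Python sketch of a birthday-style collision search. It assumes a toy hash, MD5 truncated to its first 3 bytes, purely so the search finishes in a moment. The names `truncated_md5` and `find_collision` are made up for illustration, and this says nothing about how collisions in full MD5 are actually found.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, nbytes: int = 3) -> bytes:
    """Toy hash (assumption for this sketch): first `nbytes` bytes of MD5."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes: int = 3):
    """Return two distinct messages with the same truncated hash.

    Every hash seen so far is remembered, so the search stops as soon as
    any two messages agree. No particular target hash is required.
    """
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        msg = b"msg-%d" % i
        tag = truncated_md5(msg, nbytes)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg


a, b = find_collision()
print(a, b, truncated_md5(a).hex())
```

Note that the search gets to pick both messages. That is the property at issue here, and it is a weaker requirement than forging a match for a message someone else chose.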

Until someone shows that you can (1) take any arbitrary data set M, (2) falsify it into a data set N, by (3) modifying a limited portion of M in an application-useful way and (4) adding less than a gigabyte of additional data, and (5) still have M and N hash to the same value H, I'll trust MD5, thanks.

That is because you are ignorant. Your argument amounts to "I can't imagine an attack, therefore there is no attack."