MD5 derives its round constants from the non-linear function sin(i) * pow(2,32). I plan to use cos(i) * pow(2,32) instead.

Instead of the original values of A, B, C, D, the four initial seeds (or states) that change additively during the processing of the input text, I am planning to start with values different from those given in MD5's RFC.

I would also replace the round functions F, G, H, I (in MD5's RFC) with others, such as those used in SHA or another hash.

I just want to know what the effect on the properties of MD5 would be.
I want to use the variant as a good hash function. I am not using the MD5 variant for authentication.

I have already created four variants using the above ideas and checked them with time inputs for 10 to 20 minutes, and they work fine. Am I doing this correctly?

I cannot advise against this strongly enough. Never roll your own crypto. Use a proper RNG, designed for this purpose. Windows exposes CryptGenRandom as part of its CSP, and Linux-like OSes have /dev/urandom.
– Polynomial, Oct 8 '12 at 12:49

Is this for a uni project of some sort? I can see you may get some extra marks for this but in practice it's pointless and is more likely to make holes in your security than reinforce it.
– Inverted Llama, Oct 8 '12 at 15:00

@InvertedLlama If I had rolled my own crypto in a uni project, my professors would have bitchslapped me to hell for it. Never ever roll your own crypto, full stop.
– Polynomial, Oct 8 '12 at 19:00

5 Answers
To answer the specific question of how your changes alter the characteristics of MD5, we must first restate what the MD5 security characteristics are. MD5 is a cryptographic hash function, so it is supposed to be resistant to collisions, preimages and second preimages. MD5 is not good at resisting collisions, since efficient methods for building collisions have been discovered since 2004. My own code (an implementation of Klíma's attack) produces, on average, one collision every 14 seconds, when it runs on a 2.4 GHz Intel Core2 CPU. As far as we know, MD5 seems to be strong against preimages and second preimages (a theoretical attack with cost 2^123.4 has been described, and that's better than the generic attack of cost 2^128, but not much better).

How do your variants fare? Although this depends a lot on the precise modifications you intend, we can say the following:

Changing the round constants, replacing the sine with a cosine, should have no influence whatsoever on the security. None of the published attacks exploits any special structure of the constants; and, indeed, the known collision attacks are differential cryptanalysis which is oblivious to actual constant values (at least for the attack's efficiency).

Changing the fixed IV should have no bearing either. When the first MD5 collision was publicized, the result was much decried because the researchers got the endianness wrong and thus computed the collision not on MD5, but on some other function which differs from MD5 precisely on the IV. Their method did not depend on specific IV characteristics, so a few days later they ran the code again, this time with the right IV, and produced the real first collision on MD5.

Altering the bitwise functions in the rounds will have an impact, since it will change the differential paths which collision attacks use. However, the functions that MD5 uses are not especially weak; there is no indication that any other choice would make the function stronger. There is, however, a good probability that other functions might make any MD5 variant substantially weaker.
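To make the first of these changes concrete, here is a minimal Python sketch of how the round constants can be derived, once with the sine as in RFC 1321 and once with the cosine as proposed in the question. The helper name round_constants is an assumption made for this example; the sketch only builds the two tables and says nothing about the security of either.

```python
import math

def round_constants(fn):
    """Return the 64 constants floor(|fn(i)| * 2**32) for i = 1..64.

    Hypothetical helper for illustration; fn is math.sin for standard MD5.
    """
    return [int(abs(fn(i)) * 2**32) & 0xFFFFFFFF for i in range(1, 65)]

sine_table = round_constants(math.sin)    # table used by standard MD5
cosine_table = round_constants(math.cos)  # variant table from the question

print(hex(sine_table[0]))    # 0xd76aa478, the first constant in RFC 1321
print(hex(cosine_table[0]))  # first constant of the cosine variant
```

Both tables are just fixed lists of 32-bit words, which is consistent with the point above that differential attacks do not care about their actual values.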

To sum up, among the changes that you suggest, there are some changes which should not change security, and some others which have a rather high probability of decreasing security quite a bit -- and beginning with a function which is already broken. That's an achievement, but not a very positive one.

It seems that you want to use your MD5 variant to generate randomness. It must be said that:

Using a hash function to build a PRNG yields poor performance. Hash functions like MD5 are very good at processing a lot of input data -- but for a PRNG, you want to spew a lot of output data, and hash functions usually suck at it. A properly optimized MD5 function, used to hash, e.g., successive values of a counter, might get you 100 MB/s worth of pseudo-random data, which is not bad, but any decent AES implementation on the same CPU will be almost twice as fast.

Using a hash function to build a PRNG yields poor security. To get proper randomness out of a hash function, you need to believe that the function is a random oracle. "Believe" is the right word: we already know that MD5 (or SHA-1 or SHA-256) is not a random oracle (the length extension attack is enough to show it, and it can even be proven that it is not ultimately possible, in a very mathematical way, for a concrete function to be a random oracle). To make a sturdy PRNG from a hash function, it is best to use a more elaborate (but more expensive) construction, namely HMAC_DRBG.

Regardless of how you extend an initial random seed into a long sequence of pseudo-random bits, obtaining that initial seed is a harder problem. Many have failed.
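As a rough illustration of the construction mentioned above (hashing successive values of a counter), here is a minimal Python sketch. The function name md5_counter_stream and the 8-byte big-endian counter encoding are assumptions made for the example; it shows the shape of the idea only and is not a recommended generator.

```python
import hashlib

def md5_counter_stream(seed: bytes, n_bytes: int) -> bytes:
    """Expand seed into n_bytes by hashing seed || counter for counter = 0, 1, 2, ...

    Illustration only: the output is exactly as unpredictable as the seed.
    """
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        # Each MD5 call yields only 16 bytes of output.
        out += hashlib.md5(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

stream = md5_counter_stream(b"example seed", 40)
print(stream.hex())
```

Note that one full hash computation is spent per 16 output bytes, and that anyone who learns or guesses the seed can reproduce the whole stream.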

Considering the flimsy and tenuous nature of the security properties of a hash function on which a hash-based PRNG relies, altering that hash function without any rational reason does not look like the best way to achieve security. It shall be noted that being a "perfect" hash function, with rock solid resistance against collisions and all kinds of preimages, does not make it good for a PRNG job. Being appropriate at being integrated in a PRNG is another security characteristic, that given hash functions may or may not exhibit; and it has been much less studied than resistance to collisions.

Your methodology sounds, at best, dubious. Things look as if you were throwing arbitrary tweaks at MD5, without any justification or rationality. This is more ritualistic than scientific. I have nothing against religions; however, while theosophical reasoning might give you good insights into Good and Evil, it is known to be inefficient at thwarting evildoers on an immediate basis. Jesus ensured redemption for us all (at least so goes the theory, according to the pope), but he still got nailed to the cross and died.

To sum up, your variants are unlikely to increase the security of MD5, while they are likely to decrease it; and the baseline is that MD5 is not strong to begin with; and even if it was strong as a hash function, it would not necessarily be strong as a PRNG.

You are basically taking the problem in the totally wrong direction; in fact, in several totally wrong directions simultaneously. Before even thinking about assessing the appropriateness of any given algorithm for a job, you should first define that job with precision. If the job is generating randomness (to be precise, generating bytes which appear random, aka "unpredictable" for outsiders -- a computer being the ultimate deterministic machine, it cannot be really random), then using a hash function is not the smartest idea ever. Even more so when using a hash function of questionable repute like MD5. And twisting the function internals in the hope of making the function "better" is akin to performing brain surgery with a soldering iron: at least, it will make the failure indisputably spectacular.

If you need randomness, use what your operating system provides (/dev/urandom, CryptGenRandom(), os.urandom, java.security.SecureRandom... depending on your OS and programming environment). The OS is better at it than you. Just let it do its job.
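In Python, for instance, relying on the OS can look like the short sketch below. os.urandom and the secrets module are part of the standard library; the variable names and the list of colours are only placeholders for the example.

```python
import os
import secrets

# 16 bytes straight from the operating system's CSPRNG.
key = os.urandom(16)

# A 32-character hex token built on the same OS source.
token = secrets.token_hex(16)

# Unbiased random selection from a sequence, e.g. for "select randomly" use cases.
pick = secrets.choice(["red", "green", "blue"])

print(key.hex(), token, pick)
```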

@Thomas Pornin: Thanks a lot... I have been waiting for so long... :)
– Grijesh Chauhan, Oct 12 '12 at 7:04

Actually, I didn't want to generate random numbers. I just made some modifications to MD5 for fun and learning, and wanted to check whether the variants I created affect the strength of MD5's properties and what the result would be. To check this I used random numbers for 10-20 minutes. As I expected, the changes were not noticeable. I was also aware that I shouldn't use a modified MD5 (I am not a mathematician). Initially I posted my question in the wrong way, so it got too many down-votes. I just wanted to say thanks: you answered what I wanted. Other people couldn't understand my changes.
– Grijesh Chauhan, Jul 8 '13 at 19:43

Never, ever, EVER roll your own cryptographic algorithm. Almost certainly, you don't have the skills required to do it securely. Instead, survey the available algorithms, determine which one meets your requirements, then BE SURE TO USE IT PROPERLY.

In this case, if you don't like your options from the crypto libraries available to you, search for a cryptographic PRNG (Pseudo Random Number Generator) and find a way to get it some pretty good seeds (no, current time is not a good seed, at least if it's the only seed).

Hash functions like MD5, SHA-1 and Keccak (SHA-3) are not good random number generators. They weren't designed for that purpose and do not pass the most basic of RNG tests. If you want to know more about RNG testing, a search for RNG TESTS will give you more information. Most secure RNGs are implemented on top of secure PRNGs and given good seeding. The seeding can be hashed, but hashing does nothing to increase the security of the seed or the PRNG. Seeds are measured in entropy, and entropy is equivalent to NON-GUESSABLE bits (information). However, the total number of bits doesn't increase entropy. That's why the current time isn't a very secure seed. Yes, it is unique (most of the time), but if outsiders can use good guesses to predict the seed time, then they can also predict the output of your PRNG!
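The point about time-based seeds can be sketched in a few lines of Python. The names token_from_seed and recover_seed, the fixed timestamp, and the one-hour search window are all assumptions made for the example; Python's random module stands in here for any deterministic PRNG seeded with the clock.

```python
import random

def token_from_seed(seed: int) -> int:
    """Derive a 64-bit "random" token from a seed (hypothetical victim code)."""
    return random.Random(seed).getrandbits(64)

def recover_seed(observed: int, approx_time: int, window: int = 3600):
    """Try every second within +/- window of approx_time until the token matches."""
    for candidate in range(approx_time - window, approx_time + window + 1):
        if token_from_seed(candidate) == observed:
            return candidate
    return None

secret_time = 1349700000                 # assumed Unix time used as the only seed
observed = token_from_seed(secret_time)  # what an outsider gets to see

# The outsider only knows the time to within roughly an hour.
guess = recover_seed(observed, secret_time + 1234)
print(guess == secret_time)
```

A seed drawn from a one-hour window holds only about 13 bits of entropy (7201 candidates), no matter how many bits the resulting token has.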

"Repetition should be rare" - I'm not sure what you mean by this, but you can either use really long output from your RNG (512 - 1024 bits) or perhaps you're trying to implement some other algorithm (like choosing cards from a deck).

MD5 is not a random number generator. Not only will it (or any variant you are likely to create) fail basic randomness tests, but it would still require a random input, and the output will only be as random as the input you provide.

Worse, say you modify MD5. Now you have n = MD5'(x). You still need a random x in order to get your "random" number. Where, exactly, will this input come from? You haven't solved the problem; you've simply moved it somewhere else.

@Stephen Touset: Thanks for the answer. But sorry, you misunderstood my question. I asked whether the changes I made would affect the properties of the original MD5 or not, and I thought they would not. I have already verified this for 10 to 20 minutes on random input, but I still want the opinion of others. MD5 is a hash function useful in authentication and data integrity, but many times it has been used as an RNG. I need a hash function that is different from the existing ones. The randomness is needed to select randomly, and it is well suited to the application. Please give your opinion on how the changes affect MD5's properties.
– Grijesh Chauhan, Oct 9 '12 at 4:48

What properties do you think it will not modify? Running a hash on 20 minutes of input is not even close to proof.
– Stephen Touset, Oct 9 '12 at 5:59

Yes, I can't trust 20 minutes of output. That is why I need your suggestion. Properties I require: a low collision rate, and quite different output hash values even for a single-bit change in the input... Also, I can't use the original one.
– Grijesh Chauhan, Oct 9 '12 at 6:07

@Chauhan - We are telling you that MD5 and SHA are not good solutions for generating a random number. The chances of a collision with SHA and MD5 are higher than you realize. Why don't you use a solution like a GUID, where a unique value is nearly guaranteed?
– Ramhound, Oct 9 '12 at 12:39

Why can't you use the original one? And why would you use MD5, which is utterly broken at this point?
– Stephen Touset, Oct 9 '12 at 15:58

If you need a unique hash function, the most common thing to do is to use your own unique salt with a standard algorithm - that way you don't risk accidentally making the hash weaker by messing around with its internal maths.
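One way to read this suggestion, sketched in Python: keep the algorithm standard and make the function unique through a secret value that only your application knows. The example below uses HMAC-SHA256 from the standard library as the keyed construction; that choice, the name my_hash, and the placeholder APP_SALT value are assumptions for illustration, not something the answer specifies.

```python
import hashlib
import hmac

# Placeholder value: in practice, generate this once from the OS RNG and keep it private.
APP_SALT = b"replace-with-your-own-random-bytes"

def my_hash(data: bytes, salt: bytes = APP_SALT) -> str:
    """Application-specific hash: a standard algorithm keyed with a unique salt."""
    return hmac.new(salt, data, hashlib.sha256).hexdigest()

print(my_hash(b"hello"))
print(my_hash(b"hello", salt=b"a different application"))
```

The two printed digests differ even though the input is the same, so each salt effectively defines its own hash function without touching the internals of the underlying algorithm.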