Short answer: yes, although the letter case of the output may vary. Hashes are usually displayed in hexadecimal, so they can be compared case-insensitively. Of course, when the digest is output in another format (such as the raw binary data, e.g. 128 'random' bits for MD5), case does matter. The underlying output will always be the same, though.
– Luc, Aug 8 '12 at 21:45

4 Answers

Hash functions are deterministic: same input yields the same output. Any implementation of a given hash function, regardless of the language it is implemented in, must act the same.

However, note that hash functions take sequences of bits as input. When we "hash a string", we actually convert a sequence of characters into a sequence of bits, and then hash it. There begins the trouble. Consider the string "café": among all the possible conversions to bits, all of the following are common:
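A few such conversions can be reproduced with a short Python sketch. The choice of encodings and of SHA-1 here is an assumption made for illustration, not necessarily the list the answer originally gave; the point is only that one visible string maps to several different byte sequences, and therefore to several different digests.

```python
import hashlib
import unicodedata

# "café" written with a precomposed é (U+00E9); the escape avoids
# depending on the encoding of this source file.
text = "caf\u00e9"

# Several plausible byte encodings of the same visible string.
encodings = {
    "latin-1": text.encode("latin-1"),
    "utf-8": text.encode("utf-8"),
    # NFD splits é into "e" plus a combining acute accent (U+0301).
    "utf-8 (NFD)": unicodedata.normalize("NFD", text).encode("utf-8"),
    "utf-16-le": text.encode("utf-16-le"),
    "utf-16-be": text.encode("utf-16-be"),
}


def spaced_hex(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


for name, data in encodings.items():
    print(f"{name:12} {spaced_hex(data):24} {hashlib.sha1(data).hexdigest()}")

# Every encoding produces a different byte sequence, hence a different digest.
digests = {hashlib.sha1(data).hexdigest() for data in encodings.values()}
```

Hashing the same bytes twice always gives the same digest; it is the step from characters to bytes that has to be pinned down explicitly.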

Or even nicer, using .NET's ASCII encoding: 63 61 66 3F (no idea how they came up with the ingenious idea of silently replacing characters the encoding doesn't support with ? by default)
– CodesInChaos, Mar 3 '13 at 21:11

Python (among other high-level languages) handles the problem you described very well: there is a strict distinction between bytes and characters.
– Smit Johnth, Jan 2 at 17:49

I was not sure whether the hash would be different for the same string if you hash it on Linux and then on Windows.
– eversor, Aug 8 '12 at 9:56

Yes it will be. The output of SHA1(x) will always be the same, no matter what OS or what library you use.
– Terry Chia, Aug 8 '12 at 9:59


The only possible difference between OSes will come if you're hashing something with new lines in it where it looks the same but isn't (i.e. you're hashing something that the user typed and hence they think it is the same because they pressed the same keys, but it isn't quite the same because Windows uses \r\n and Linux uses \n). Or badly written libraries, of course - I've seen some incorrect MD5s because they didn't zero-pad numbers and ended up with hashes that weren't 32 characters.
– IBBoard, Aug 8 '12 at 10:05
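The line-ending point in the comment above can be illustrated with a minimal Python sketch. The sample text and the `sha1_hex` helper are made up for the example; normalising `\r\n` to `\n` before hashing is one possible convention, not something the thread prescribes.

```python
import hashlib

# The same two lines as typed on Linux and on Windows.
unix_text = "line one\nline two\n"
windows_text = "line one\r\nline two\r\n"


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of a byte sequence as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


unix_digest = sha1_hex(unix_text.encode("utf-8"))
windows_digest = sha1_hex(windows_text.encode("utf-8"))

# The texts look identical on screen but are different byte sequences.
print(unix_digest == windows_digest)  # False

# Normalising line endings before hashing makes the inputs identical again.
normalized_digest = sha1_hex(windows_text.replace("\r\n", "\n").encode("utf-8"))
print(unix_digest == normalized_digest)  # True
```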


@IBBoard That isn't really a hashing issue. Given x, SHA1(x) will always output the same thing no matter what. Most current libraries work fine; I have not come across any that output wrong hash results so far.
– Terry Chia, Aug 8 '12 at 10:30


Agreed on "Given X, SHA1(X) is the same", but I was trying to warn that although "X1" may look like "X2" to make you think that X1==X2, whitespace (especially line endings) may differ. Wrong hashes that I've seen have normally come from people trying to wrap the Java MD5 code, which returns a byte array (IIRC) when they want a text string.
– IBBoard, Aug 8 '12 at 12:31
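The kind of wrapper bug described in these comments (turning a raw digest byte array into text without zero-padding) can be sketched in Python rather than Java. The variable names are illustrative, and the empty input is chosen only because its MD5 digest happens to contain bytes smaller than 0x10.

```python
import hashlib

# Raw 16-byte MD5 digest, comparable to the byte array a Java
# MessageDigest returns. The empty input is used because its digest
# contains bytes below 0x10, which is what exposes the bug.
raw = hashlib.md5(b"").digest()

# Buggy conversion: "%x" drops the leading zero of any byte below 0x10.
buggy_hex = "".join("%x" % b for b in raw)

# Correct conversion: always emit two hex digits per byte.
correct_hex = "".join("%02x" % b for b in raw)

print(len(buggy_hex), buggy_hex)      # shorter than 32 characters
print(len(correct_hex), correct_hex)  # exactly 32 characters

# The library's own hex output matches the zero-padded version.
print(correct_hex == hashlib.md5(b"").hexdigest())
```

The hash itself is correct in both cases; only the text rendering of the bytes is wrong in the buggy version.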

Yes, the exact same "byte sequence" will always yield the exact same digest value regardless of implementation (assuming it's a correct implementation!)

The key phrase is "byte sequence": this is always true for a byte sequence, but not always for a "string" as you wrote. Depending on many factors, the same string can be converted to different bytes on different systems. There is potential for whitespace or line-ending differences, or for encoding differences such as ASCII versus UTF-16.

Also, be aware that when you display the digest value, you run into similar issues. Different implementations might represent hexadecimal digits with either uppercase or lowercase letters, so a string equality test might fail.
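One way around the display issue is to compare digests as bytes rather than as text. The following is a minimal Python sketch; `same_digest` is a hypothetical helper name, and the uppercase digest stands in for the output of some other implementation.

```python
import hashlib

# Python's hexdigest() returns lowercase hex; another tool might
# print the same digest in uppercase.
digest_a = hashlib.sha1(b"hello").hexdigest()
digest_b = digest_a.upper()


def same_digest(hex_a: str, hex_b: str) -> bool:
    """Compare two hex-encoded digests by their underlying bytes."""
    # bytes.fromhex accepts both uppercase and lowercase hex digits.
    return bytes.fromhex(hex_a) == bytes.fromhex(hex_b)


print(digest_a == digest_b)             # naive string comparison
print(same_digest(digest_a, digest_b))  # comparison of the decoded bytes
```

Lowercasing both strings before comparing would work as well; decoding to bytes has the side benefit of rejecting strings that are not valid hex.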