At university we were told that it is a bad idea to implement a MAC by simply concatenating a key with the data to sign and running the result through a hash function (e.g. $s = \mathrm{hash}(k||\mathrm{data})$ or $s = \mathrm{hash}(\mathrm{data}||k)$). The next ideas presented were HMAC and CBC-MAC, which are a lot more complex (but standardized).

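Now I'm wondering what the security of the following "idea" would be (I'm sure that there are good reasons why it is not used, as it is simpler than HMAC or CBC-MAC): hash the data and then encrypt the hash under the key, i.e. $s = \mathrm{enc}_k(\mathrm{hash}(\mathrm{data}))$.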


It certainly has worse properties than HMAC, thanks to the low collision resistance of this scheme. In particular, 128-bit MACs, which are fine with HMAC, are too weak with your scheme.
– CodesInChaos, Jul 13 '12 at 13:33

You're also relying on the strength of the cipher for both encryption and authentication, so given a non-key-disclosing attack on the block cipher you could also inject packets. With HMAC you'd only be able to read the stream unless you could recover the key.
– Polynomial, Jul 13 '12 at 15:46


Note that hash(k||data) is secure for the next generation of hash functions, including all SHA-3 finalists.
– CodesInChaos, Jul 13 '12 at 19:10


I would also question whether HMAC is considerably more complex than the present idea. Is $Hash( K_1 | Hash (K_2 | Data))$ really more complex than $Encrypt( Hash( Data ))$?
– poncho, Jul 15 '12 at 21:11

I have a question. If my MAC is MAC = enc(hash(m)), with hash = SHA-256 and enc = AES in CBC mode, is it a secure MAC?
– pasgabriele, Oct 16 '13 at 20:46

2 Answers

No, in general, this is not secure, unless you make additional assumptions on the encryption method beyond the standard assumption of privacy.

To simplify things a bit, the assumption of privacy means that given a ciphertext $C$, the attacker has no information about what the plaintext might be. However, in your case, we don't really care if the attacker can figure out what the plaintext of the encryption function is; we also give him the data, and he can compute $hash(data)$ himself, should he care to.

What we are concerned with is (again, to simplify a bit) that an attacker, given a message M and a valid tag for that message, cannot come up with another message, and a valid tag for that message. Translating that into your proposal, if the attacker was given $M$, and $E(Hash(M))$, can he pick another message $M'$, and come up with $E(Hash(M'))$?

Well, for a lot of encryption methods, he can. For example, if we consider a block cipher in counter mode, the ciphertext is just the plaintext XORed with a keystream, so if you flip a bit in the ciphertext, the corresponding bit in the plaintext also flips. What that means is that if the attacker computes $E(Hash(M)) \oplus Hash(M) \oplus Hash(M')$, well, that turns out to be precisely $E(Hash(M'))$, and so the attacker has won.
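
To make the counter-mode forgery concrete, here is a minimal sketch. It assumes a toy keystream built from SHA-256 as a stand-in for a block cipher in counter mode (the names `keystream`, `tag` and `forge`, the key, and the nonce are all made up for illustration), and it assumes the verifier recomputes the tag with the same nonce. The attacker's function never touches the key.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || nonce || counter) blocks.

    This is only a stand-in for a block cipher in CTR mode, not a real cipher.
    """
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def tag(key: bytes, nonce: bytes, message: bytes) -> bytes:
    """The proposed MAC, E(Hash(M)), with E in (toy) counter mode."""
    digest = hashlib.sha256(message).digest()
    return xor(digest, keystream(key, nonce, len(digest)))


def forge(message: bytes, valid_tag: bytes, new_message: bytes) -> bytes:
    """Attacker's computation: E(Hash(M)) xor Hash(M) xor Hash(M'). No key needed."""
    h_old = hashlib.sha256(message).digest()
    h_new = hashlib.sha256(new_message).digest()
    return xor(xor(valid_tag, h_old), h_new)


key = b"secret key known only to the signer"  # hypothetical key
nonce = b"nonce-01"                            # hypothetical public nonce
m = b"pay Bob 10 dollars"
m_forged = b"pay Eve 10000 dollars"

t = tag(key, nonce, m)
t_forged = forge(m, t, m_forged)

# The verifier recomputes the tag for the forged message and compares.
forgery_accepted = t_forged == tag(key, nonce, m_forged)
print(forgery_accepted)
```

XORing the valid tag with $Hash(M)$ strips the digest and leaves the bare keystream, which the attacker then reapplies to $Hash(M')$.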

The additional property that we need to assume for the encryption method is nonmalleability; that is, given $M$ and the corresponding encryption $E(M)$, the attacker cannot modify the encryption so that it decrypts to any other specific message.

Of the standard encryption modes, well, ECB actually is nonmalleable, if (and this is a big if) the hash fits entirely within a single block output. Given that 128 bit hashes are vulnerable to collisions (and a hash collision would be another way of producing a forgery), this means using a nonstandard block cipher (for example, Rijndael with a 256 bit block size).

Authenticated encryption modes are also nonmalleable. However, this may be considered cheating; authenticated encryption modes work by effectively using a MAC internally; if the point of the exercise is to create a crypto primitive from other crypto primitives, well, this didn't do it.

It turns out that this is actually secure, up to the length of the block cipher, if $\mathrm{enc}(\cdot)$ is a secure block cipher (a pseudorandom permutation) and if $\mathrm{hash}(\cdot)$ is collision-resistant.

However, there is a catch. (You just knew there had to be one, didn't you?)

The catch is that typical block ciphers have too narrow a block width for this to be adequately secure. In other words, the catch arises when you try to work out the quantitative security level afforded by this construction.

For instance, let's say you use AES as your $\mathrm{enc}(\cdot)$ and SHA-1 truncated to 128 bits (to match AES's 128-bit block width) as your $\mathrm{hash}(\cdot)$. Well, then you only get 64-bit security: the construction is vulnerable to collision-finding attacks of complexity approximately $2^{64}$. After examining about $2^{64}$ messages, you expect to find a pair of messages $m,m'$ such that $\mathrm{hash}(m) = \mathrm{hash}(m')$, through a simple birthday argument. Since the tag depends on the message only through its hash, a valid tag for $m$ is then also a valid tag for $m'$.
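
The birthday attack can be tried at a scaled-down size. The sketch below assumes a toy 24-bit hash (SHA-1 truncated to 3 bytes), so a collision shows up after a few thousand messages instead of $2^{64}$. `toy_enc` is a hypothetical placeholder for the keyed encryption step. It is not a real block cipher, but any deterministic function of the digest behaves the same way here.

```python
import hashlib
from itertools import count

TRUNC_BYTES = 3  # toy 24-bit hash, so collisions are cheap to find


def small_hash(message: bytes) -> bytes:
    """SHA-1 truncated to TRUNC_BYTES bytes (a deliberately weak toy hash)."""
    return hashlib.sha1(message).digest()[:TRUNC_BYTES]


def toy_enc(key: bytes, block: bytes) -> bytes:
    """Hypothetical placeholder for enc(.): deterministic and keyed, not a real cipher."""
    return hashlib.sha256(key + block).digest()[:TRUNC_BYTES]


def mac(key: bytes, message: bytes) -> bytes:
    """The construction under discussion: enc(hash(message))."""
    return toy_enc(key, small_hash(message))


def find_collision():
    """Hash distinct messages until two share a digest (birthday search)."""
    seen = {}
    for i in count():
        m = b"message-%d" % i
        h = small_hash(m)
        if h in seen:
            return seen[h], m, i + 1
        seen[h] = m


m1, m2, tries = find_collision()
key = b"unknown to the attacker"  # hypothetical key; the search above never uses it

print(m1, m2, tries)
# Equal digests force equal tags, so a tag obtained for m1 is a forgery for m2.
print(mac(key, m1) == mac(key, m2))
```

The collision search is entirely offline: the attacker only needs the public hash function, and then asks for a single tag on one of the two colliding messages.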

To be secure against such attacks, you'd need a hash function whose output, and a block cipher whose block width, is at least 160 bits (which puts the birthday attack at roughly $2^{80}$ work). But few modern block ciphers support such a block width -- and this was especially the case when HMAC was defined.
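
As a rough worked calculation, using the standard birthday approximation (my assumption; the exact constant does not matter for the argument), the chance of a collision among $q$ messages hashed to $n$ bits is

$$\Pr[\text{collision}] \approx 1 - e^{-q^2/2^{n+1}}.$$

Setting $q = 2^{n/2}$ gives $1 - e^{-1/2} \approx 0.39$, so about $2^{n/2}$ messages already give a good chance of success: $2^{64}$ for $n = 128$, and $2^{80}$ for $n = 160$.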

Therefore, HMAC is a better fit for the common primitives typically available today.

There is a second reason why HMAC was defined. HMAC was designed to be robust: to minimize the assumptions it makes about the hash function. In particular, the HMAC construction was designed so that HMAC would have a chance of remaining a secure MAC construction, even if someone happens to discover collision attacks on the hash function.

This turned out to be a prescient design strategy. For instance, MD5-HMAC was widely used -- and then folks discovered feasible collision attacks on MD5. Fortunately, despite the fact that the collision-resistance of MD5 is totally broken, MD5-HMAC still appears to be secure: no one knows a way to break it. So, the designers of HMAC were pretty successful in making the HMAC construction resilient to certain kinds of failures of the hash function.

In contrast, your construction does not have that kind of resilience. This is a second reason why one might prefer HMAC over your construction.
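
For comparison, here is a minimal sketch of the HMAC construction itself, following the definition in RFC 2104 and instantiated with SHA-256 (the key and message are made-up examples). It is checked against Python's standard `hmac` module. In practice you would call the library rather than write this by hand.

```python
import hashlib
import hmac


def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """HMAC as defined in RFC 2104: H((K ^ opad) || H((K ^ ipad) || message))."""
    block_size = 64  # SHA-256 processes input in 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # overlong keys are hashed first
    key = key.ljust(block_size, b"\x00")    # then padded with zeros to one block
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner).digest()


key = b"example key"          # hypothetical key
message = b"example message"  # hypothetical message

manual_tag = hmac_sha256_manual(key, message)
library_tag = hmac.new(key, message, hashlib.sha256).digest()

# Constant-time comparison, as one would use when verifying a received tag.
tags_match = hmac.compare_digest(manual_tag, library_tag)
print(tags_match)
```

The key enters both the inner and the outer hash call, so an offline collision in the plain hash function does not translate directly into a forgery the way it does for $\mathrm{enc}(\mathrm{hash}(\cdot))$.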