Consider a common practically-collision-resistant hash function $\mathcal{H}$ (e.g. SHA-1, SHA-256, SHA-512, RIPEMD-160), perhaps based on the Merkle–Damgård construction as are the first three. We define a Message Authentication Code $\mathcal{C}$
$$(k,m) \mapsto \mathcal{C}(k,m)=\mathcal{H}(m||k)$$
where $||$ denotes concatenation, $k$ is a secret key (constant, or at least of fixed size), and $m$ is a message (possibly of variable length). Assume that an adversary can (iteratively) submit queries with $m_j$ and obtain $\mathcal{C}(k,m_j)$, and wants to obtain $k$ or otherwise compute $\mathcal{C}(k,m)$ for some $m$ different from every $m_j$.
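For concreteness, here is a minimal sketch of the construction using Python's `hashlib`. The function names `suffix_mac` and `verify` are only illustrative, and SHA-256 is picked as the default purely as an example.

```python
import hashlib
import hmac


def suffix_mac(key: bytes, message: bytes, hash_name: str = "sha256") -> bytes:
    """C(k, m) = H(m || k): the message is hashed first, the secret key last."""
    return hashlib.new(hash_name, message + key).digest()


def verify(key: bytes, message: bytes, tag: bytes, hash_name: str = "sha256") -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(suffix_mac(key, message, hash_name), tag)
```

With a 16-byte key and SHA-256, `suffix_mac(key, b"hello")` is simply `hashlib.sha256(b"hello" + key).digest()`.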

That MAC $\mathcal{C}$ is not trivially bad. In particular, if $\mathcal{H}$ were indistinguishable from a random function, as in the Random Oracle Model, $\mathcal{C}$ would be secure. And even though $\mathcal{H}$ may have the length-extension property, that property does not turn into a devastating attack on $\mathcal{C}$.

The least impractical generic attack that I see is this: if a collision were known for $\mathcal{H}$, with colliding messages of moderate and identical length, countless collisions for $\mathcal{C}$ could be deduced from it. Hence security is demonstrably no better than the collision resistance of $\mathcal{H}$ (for identical-length messages). We could assume that $k$ is half the size of the result of $\mathcal{H}$, and hope that the security is about $2^{69}$ (or is it $2^{57}$, or even $2^{52}$), $2^{80}$, $2^{128}$, $2^{256}$ hash rounds for SHA-1, RIPEMD-160, SHA-256, SHA-512 respectively.

What are the known attacks against $\mathcal{C}$ (better than the above), and their cost, for each of these common hashes?

Is there hope for an argument that an attack against $\mathcal{C}$ would turn into an attack of similar cost against $\mathcal{H}$, or hint of the contrary?

Update 2: I am aware that the construction considered is weaker than HMAC, and in particular is vulnerable to collisions on $\mathcal H$; I stated that, and that it is thus pointless to make the key wider, and hopeless to target security better than half the hash's size. I'm asking exactly what cryptanalytic attacks better than finding a collision on $\mathcal H$ there are. There is room for such an attack only by exploiting a weakness in the structure and/or the round function of a concrete $\mathcal H$.

2 Answers

One issue with this construction is described in section 6 of the original HMAC paper, "Keying hash functions for message authentication" by Bellare, Canetti and Krawczyk, where they note that finding a collision on $\mathcal H$, i.e. two inputs $x \ne x'$ such that $\mathcal H(x) = \mathcal H(x')$, directly yields a collision on $\mathcal C$ such that $\mathcal C(k,x) = \mathcal C(k,x')$ regardless of $k$. (Technically, this only works if the collision is internal, in the sense that $\mathcal H(x \| s) = \mathcal H(x' \| s)$ for any suffix $s$, but that's true for pretty much all known M-D hash collision attacks anyway.)

Of course, this issue is mostly irrelevant if $\mathcal H$ is assumed to be collision resistant. (Although it should be noted that, even for a perfect $n$-bit hash, a birthday attack can find a collision with only about $2^{n/2}$ evaluations, and that this collision can then be used to break $\mathcal C$ for any $k$.) However, given how hard achieving complete collision resistance seems to be compared to most other security properties asked of hash functions, immunity to collision attacks (which the HMAC construction provides, as long as the other security properties it depends on aren't compromised) is nothing to sneer at.
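To illustrate, here is a sketch with a toy Merkle–Damgård hash. Everything in it is an assumption made for the demo: the compression function is SHA-256 truncated to a 24-bit chaining value, so a birthday search finds an internal collision after roughly $2^{12}$ tries. It is not a model of any real hash, but it shows how one collision on the first block carries over to $\mathcal C(k,\cdot)$ for every key.

```python
import hashlib
from itertools import count

BLOCK = 8   # toy block size in bytes
STATE = 3   # toy chaining value size in bytes (24 bits)
IV = b"\x00" * STATE


def compress(state: bytes, block: bytes) -> bytes:
    # Toy compression function: truncated SHA-256 of state || block.
    return hashlib.sha256(state + block).digest()[:STATE]


def pad(length: int) -> bytes:
    # MD strengthening: 0x80, zero bytes, then the message length on 4 bytes.
    zeros = (-(length + 1 + 4)) % BLOCK
    return b"\x80" + b"\x00" * zeros + length.to_bytes(4, "big")


def toy_hash(message: bytes) -> bytes:
    data = message + pad(len(message))
    state = IV
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def suffix_mac(key: bytes, message: bytes) -> bytes:
    # C(k, m) = H(m || k) over the toy hash.
    return toy_hash(message + key)


def find_internal_collision():
    # Birthday search over one-block messages: look for x != x' whose
    # chaining values after the first block coincide.
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        state = compress(IV, block)
        if state in seen:
            return seen[state], block
        seen[state] = block


x, x_prime = find_internal_collision()
for key in (b"k", b"another key", bytes(range(12))):
    # The attacker never sees the key, yet the two tags always agree.
    assert suffix_mac(key, x) == suffix_mac(key, x_prime)
```

Since `x` and `x_prime` are both exactly one block long and leave the same chaining value, everything hashed afterwards (the key and the padding) is identical, so a tag obtained for one message is a valid forgery for the other.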

The MAC you created is what's commonly called a keyed hash function. The way you have done it has a couple of issues.

One is that you're hashing the message and then the key, but it's better to do the key and then the message.

The reason for that is that if someone finds a collision with your message, then they are going to end up with the same MAC. It is better to have the known-different data at the front of the construction, where it makes the most difference.

The other is the length extension attack. It's just a generalization of the above -- you want to reduce the chance that two messages of different lengths will end up making a collision.

If you assume a hash function that is immune to a length extension attack, then a keyed hash (with the key at the front) is as good as an HMAC.
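To see why length-extension immunity matters for the key-first variant, here is a sketch on a toy Merkle–Damgård hash (the compression function, block size and padding are assumptions made up for the demo, not any real hash). Given only a tag for $\mathcal H(k\|m)$ and the key length, the attacker resumes hashing from the tag and gets a valid tag for a longer message.

```python
import hashlib

BLOCK = 8    # toy block size in bytes
STATE = 16   # toy chaining value / digest size in bytes
IV = b"\x00" * STATE


def compress(state: bytes, block: bytes) -> bytes:
    # Toy compression function: truncated SHA-256 of state || block.
    return hashlib.sha256(state + block).digest()[:STATE]


def pad(length: int) -> bytes:
    # MD strengthening: 0x80, zero bytes, then the message length on 4 bytes.
    zeros = (-(length + 1 + 4)) % BLOCK
    return b"\x80" + b"\x00" * zeros + length.to_bytes(4, "big")


def absorb(state: bytes, data: bytes) -> bytes:
    # Run the compression function over block-aligned data.
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def toy_hash(message: bytes) -> bytes:
    return absorb(IV, message + pad(len(message)))


def prefix_mac(key: bytes, message: bytes) -> bytes:
    # Key-first keyed hash: H(k || m).
    return toy_hash(key + message)


def extend(tag: bytes, key_len: int, message: bytes, suffix: bytes):
    """Forge a prefix_mac tag without the key, from (tag, len(key), message)."""
    glue = pad(key_len + len(message))            # padding the signer's hash added
    forged_message = message + glue + suffix
    total_len = key_len + len(forged_message)
    forged_tag = absorb(tag, suffix + pad(total_len))  # resume from the known tag
    return forged_message, forged_tag


key = b"secretkey"
message = b"amount=10"
tag = prefix_mac(key, message)
forged_message, forged_tag = extend(tag, len(key), message, b"&admin=1")
assert prefix_mac(key, forged_message) == forged_tag
```

The same trick applied to a tag for $\mathcal H(m\|k)$ only gives the hash of $m\|k\|\text{padding}\|\text{suffix}$, which is a valid tag for that construction only if the extended input again ends with $k$, and the attacker does not know $k$.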

Skein has this property, and also combines with it the fact that it's built on a tweakable cipher with the tweak carrying deltas. That's Skein's one-pass MAC, and there's a description of it and the security proofs in the Skein papers (see www.skein-hash.info).

The length extension attack applies to $\mathcal C(k,m)=\mathcal H(k||m)$, not $\mathcal C(k,m)=\mathcal H(m||k)$. Yes, I'm aware that the latter is not more secure than half the hash size; this is stated in the question.
– fgrieu, May 29 '12 at 20:18

I'm not buying that with $\mathcal H$ immune to a length extension attack, $\mathcal H(k||m)$ is as good as HMAC; my understanding is that part of HMAC's revised security argument relies on having $k$ processed at both ends.
– fgrieu, May 29 '12 at 20:49

@fgrieu Skein uses a scheme similar to $H(k||m)$ as MAC, and I believe the paper contains some security proofs for this mode. You could compare that with the proof for HMAC, and check if they made any additional assumptions. I think most proofs in the Skein paper assume certain properties of the underlying block cipher.
– CodesInChaos♦, May 30 '12 at 10:30