Date: Mon, 18 Mar 2013 02:28:02 +0400
From: Solar Designer <solar@...nwall.com>
To: crypt-dev@...ts.openwall.com
Subject: Re: scrypt TMTO defeater
Here's another attack on this kind of TMTO defeater:
On Mon, Sep 10, 2012 at 06:27:44AM +0400, Solar Designer wrote:
> For context, here's Anthony Ferrara's comment that I am referring to:
>
> http://drupal.org/node/1201444#comment-4675994
>
> Here's an attack on Anthony's tradeoff defeater ("simply writing to V_j
> in the second loop"):
>
> Assume that we don't want to use the tradeoff to its fullest, but rather
> just halve our memory needs. Without the tradeoff defeater, this is
> accomplished by storing every other V_j in an array that is twice
> smaller than original. The odd-indexed V_j's are then recomputed as
> needed from the preceding values that we did store.
>
> Now, with the tradeoff defeater we can't recompute these values like
> that because the preceding value might have been modified by prior
> iterations of the second loop. Or can we? Actually, quite often we
> can: by the time the second loop terminates, it will have modified only
> about 63% of V elements. And only 39% when we're at N/2 iterations.
>
> Of course, "quite often" is not sufficient for proper computation of the
> entire scrypt, so we do need to store the entire V array (not just its
> half) somewhere.
... and then I went on to describe the hot/cold approach:
http://www.openwall.com/lists/crypt-dev/2012/09/10/3
However, since we only write into about 63% of V elements, we only have
to store this many. We may index them via a hash table (yes, this means
we'll need some more memory) and we'll also need to store some elements
of the original V array (as computed by SMix's first loop) - e.g., every
8th element. This way, we'll reduce our memory needs to about 76% of
original (not counting memory consumed by the hash table, which might
not be much) and pay for it by doing 8x more computation in SMix's
second loop, or 4.5x more computation total (not including hash table
lookups overhead). In some cases, this tradeoff may be worthwhile (when
the attacker is constrained to a device with a disproportionate amount
of computing power for the device's memory size and/or bandwidth).
Additionally, if the attacker runs multiple such instances in parallel
and (de)synchronizes them in a smart way, their average memory usage per
instance will be less than the 76% figure. If we consider SMix's second
loop only, it may be about 52% of original. If we consider both loops
and assume they take equal time to run, it may be 32% of original. So
the attacker would reduce memory needs by a factor of 3 with only a 4.5x
increase in computation. This may be worthwhile on even more devices
than the example above.
The (de)synchronization trick is also applicable to the original scrypt
(although it makes less of a difference there), where SMix's first loop
does not need all of the memory at once. Only SMix's second loop was
included in Colin's attack cost estimates, but for practical purposes I
like to consider both loops.
Alexander