Jim Jewett <jimjjewett at gmail.com> added the comment:
On Tue, Mar 6, 2012 at 1:05 PM, Guido van Rossum
Jim Jewett:
>>... If they're just adding new keys, or even deleting other
>> (==> NOT the one being looked up) keys, why should that
>> keep them from finding the existing, unchanged keys?
>> ... The chain terminates as soon as the dict doesn't resize; without
>> malicious intent, the odds of hitting several resizes in a row are so
>> miniscule that it probably hasn't even slowed them down.
> Now I'm torn. If we'd have this RuntimeError from the start, would we
> consider it a flaw in the dict implementation or a feature?
I would personally have considered it a flaw. Wrapping all dict
access just in case some other code added a weird key seems ugly and
inefficient.
> RuntimeError when changing a dict's size while iterating over it is
> definitely a feature (so as to allow the implementation to rehash
> everything upon insert/delete).
In that case, you've explicitly asked for the whole set (or at least a
snapshot); the RuntimeError indicates that you can't get it. Maybe
there should be a way to say "I don't need the set to be perfect", but
there isn't, and you aren't getting what you actually did ask for, so
an Error is appropriate.
But in this case, you're asking for a specific key, and the key is
there. It may even be a perfectly normal string, that just happened
to hash-collide with something some other code inserted.
The RuntimeError exposes a race condition to user code -- and it may
not be the code that stuck the oddball class in the dict in the first
place.
> Note that Victor's alternate fix (nomodify.diff) has the same problem
> -- it just raises RuntimeError in the mutating thread rather than in
> the lookup thread.
Right; I'm saying that raising an error is the wrong thing to do.
But there is an analogy to the hash-code change vs collision counting.
Breaking an application that got unlucky once is wrong. But if the
bad luck *continues* to the point where it would only occur one time
in a zillion, RuntimeError is better than a likely infinite loop.
(It really was a question initially, but I've now convinced at least myself.)
I would rather see something like replacing
349 else {
350 /* The compare did major nasty stuff to the
351 * dict: start over.
352 * XXX A clever adversary could prevent this
353 * XXX from terminating.
354 */
355 return lookdict(mp, key, hash);
with
349 else {
350 /* The compare did major nasty stuff to the
351 * dict: start over.
352 * XXX A clever adversary could prevent this
353 * XXX from terminating.
354 */
355 return lookdict_paranoid(mp, key, hash, 1);
where lookdict_paranoid is a near-copy of lookdict that adds a
paranoia parameter and replaces those same lines with
else {
/* The compare did major nasty stuff *again*. */
if (paranoia < PARANOIA_LEVEL) {
return lookdict_paranoid(mp, key, hash, paranoia+1);
}
/* Something is wrong; raise a RuntimeError. */
}
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14205>
_______________________________________