[Python-Dev] Next dict crusade

I'm still trying to work off the backlog of ignored dict ideas. Way back
here:
http://mail.python.org/pipermail/python-dev/2000-December/011085.html
Christian Tismer suggested using polynomial division instead of
multiplication for generating the probe sequence, as a way to get all the
bits of the hash code into play. The desirability of doing that is
illustrated by, e.g., this program:
def f(keys):
    from time import clock
    d = {}
    s = clock()
    for k in keys:
        d[k] = k
    f = clock()
    print "build time %.3f" % (f-s)
    s = clock()
    for k in keys:
        assert d.has_key(k)
    f = clock()
    print "search time %.3f" % (f-s)
# Excellent performance.
keys = range(20000)
for i in range(5):
    f(keys)

# Terrible performance; > 500x slower.
keys = [i << 16 for i in range(20000)]
for i in range(5):
    f(keys)
Christian had a very clever (cheap and effective) solution:

Old algorithm (multiplication):
    shift the index left by 1
    if index > mask:
        xor the index with the generator polynomial

New algorithm (division):
    if low bit of index set:
        xor the index with the generator polynomial
    shift the index right by 1
where "index" should really read "increment", and unlike today we do not
mask off any of the bits of the initial increment (and that's what lets
*all* the bits of the hash code come into play; there's no point to doing
this otherwise).
I've since discovered that it has a rare but fatal flaw: the new algorithm
can generate a 0 increment, while the old algorithm cannot.
Example: poly is 131 and hash is 145. Because we don't mask off any bits
in computing the initial increment, the initial increment is computed as
    incr = hash ^ (hash >> 3) ==
           145 ^ (145 >> 3) ==
           145 ^ 18 ==
           131 ==
           poly
So if we don't hit on the first probe, the new
    if low bit of index set:
        xor the index with the generator polynomial
    shift the index right by 1
business sets incr to 0, and the result is an infinite loop (0 is a fixed
point). I hate to add another branch to this. As is, the existing branch
in both the old and new ways is of the worst possible kind: it's taken half
the time, with a pseudo-random distribution. So there's not a
branch-prediction gimmick on earth it won't fool.
Note that there's no reasonable way to identify "bad values" for incr before
the loop starts, either -- there's really no way to tell whether incr mod
poly is 0 without a loop to do division steps until incr < poly (if incr <
poly and incr != 0, incr can never become 0, so there's no more need to test
after reaching that point). Such a "pre loop" would cost more than the
existing loop in most cases, as we usually get out of the existing loop
today on its first iteration.
But in that case, what am I worried about <wink>?
time-for-a-checkin-ly y'rs - tim