Talk Abstract: Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Carter and Wegman (STOC '77). Keys are viewed as consisting of c characters. We initialize c tables T1,…,Tc mapping characters to random hash codes. A key x=(x1,…,xc) is hashed to T1[x1] xor … xor Tc[xc]. While this scheme is not even 4-independent, we show that it provides many of the guarantees that are normally obtained via higher independence, e.g., Chernoff-type concentration, min-wise hashing for estimating set intersection, and cuckoo hashing. We shall also discuss a twist to simple tabulation that leads to extremely robust performance for linear probing with small buffers.
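
The scheme described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the speaker's implementation: the function name make_simple_tabulation, the 8-bit character width, the 32-bit hash codes, and the use of Python's random module as the source of random table entries are all assumptions made for the example.

```python
import random

def make_simple_tabulation(c=4, char_bits=8, hash_bits=32, seed=None):
    """Build a simple tabulation hash function over keys of c characters.

    Assumptions for illustration: keys are non-negative integers of
    c * char_bits bits, and character i is the i-th char_bits-wide chunk
    counted from the least significant end.
    """
    rng = random.Random(seed)
    table_size = 1 << char_bits
    # One table per character position, each mapping a character
    # to an independent random hash code.
    tables = [[rng.getrandbits(hash_bits) for _ in range(table_size)]
              for _ in range(c)]
    char_mask = table_size - 1

    def h(x):
        if not (0 <= x < (1 << (c * char_bits))):
            raise ValueError("key does not fit in c characters")
        code = 0
        for i in range(c):
            char = (x >> (i * char_bits)) & char_mask
            code ^= tables[i][char]  # T_i[x_i], combined with xor
        return code

    return h, tables
```

The remark that the scheme is not 4-independent can be seen directly from such a sketch: for two-character keys, the hash codes of (a0,b0), (a0,b1), (a1,b0), and (a1,b1) always xor to zero, so any three of them determine the fourth.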

Speaker Bio: Mikkel Thorup is Technology Consultant at AT&T Labs-Research, where he has been since 1998. He received his PhD from Oxford University in 1993. From 1993 to 1998 he was on the faculty of the University of Copenhagen. Thorup's main work is in Algorithms and Data Structures, and he is the editor of this area for the Journal of the ACM. Currently he also serves on the editorial boards of SIAM Journal on Computing, ACM Transactions on Algorithms, and the open access journal Theory of Computing. Thorup has more than a hundred refereed publications, and he is a co-inventor of the Smart Sampling Technologies that lie at the heart of AT&T's Scalable Traffic Analysis Service.

Thorup is a Member of the Royal Danish Academy of Sciences and Letters, a Fellow of the ACM, a Fellow of AT&T, and a Full Professor at the University of Copenhagen. He received the 2011 Mathematical Association of America (MAA) David P. Robbins Prize for the paper "Maximum Overhang" that solved a 150-year-old problem: How far can a stack of identical blocks hang over the edge of a table?