Your password

ICM 2018 survey

Together with Alex Andoni and Piotr Indyk, we have just uploaded on arXiv a survey on (the theory of) the high-dimensional nearest neighbor search problem (NNS), which accompanies Piotr’s talk at the upcoming International Congress of Mathematicians in Rio de Janeiro. At the same congress, Assaf Naor will give a plenary talk about dimension reduction; the corresponding paper will hopefully appear online soon.

In the survey, we touch upon both classic and new topics. Let me briefly go over the sections:

In the introduction, we define and motivate the problem and put it in the historical context.

We describe two classic frameworks based on dimension reduction and locality-sensitive hashing (LSH). In addition to that, we list known LSH families.

We survey some deterministic and Las Vegas data structures for the problem.

We cover recent line of work on the data-dependent LSH (where we tailor hash families to a dataset at hand), which constitutes the bulk of my thesis. Besides, we cover a beautiful and underappreciated algorithm for NNS over the $\ell_\infty$ metric.

We go over recent exciting developments for the closest-pair problem (an off-line version of NNS, which makes it somewhat easier), which are based on the polynomial method and fast matrix multiplication.

Finally, we talk about NNS for metrics beyond $\ell_1$, $\ell_2$ and $\ell_\infty$. There are three main tools on this front: metric embeddings (deterministic and randomized), NNS data structures for direct sums of “tractable” metric spaces, and (since very recently) randomized partitions obtained via estimates on nonlinear spectral gaps (featured in the previous post).