Anyone Interested in Data Anonymization or Differential Privacy Should Read This

Larry Flynt’s Hustler Club is in a fairly isolated location in Hell’s Kitchen, and no doubt experiences significant cab traffic in the early hours of the morning. I ran a query to pull out all pickups that occurred outside the club after midnight and before 6am, and mapped the drop-off coordinates to see if I could pinpoint individuals who frequented the establishment. The map below shows my results – the yellow points correspond to drop-offs that are closely clustered, implying a frequent customer.

The potential consequences of this analysis cannot be overstated. Go ahead, zoom in. You will see that the GPS coordinates are terrifyingly precise. Using this freely-obtainable, easily-created map, one can find out where many of Hustler’s customers live, as there are only a handful of locations possible for each point. Add a little local knowledge, and, well, it’s not rocket science. “I was working late at the office” no longer cuts it: Big Brother is watching.

Even without suspicions or knowledge of the neighborhood, I was able to pinpoint certain individuals with high probability.

Differential privacy — blurring the info — can work here. Note however, that for differential privacy to work, it must be applied with a very clear idea of the future uses that could be made of the data. This is often not at all easy.