Privacy-Preserving Outlier Detection: A Tutorial

Outlier detection is an important tool in risk modelling. When data are distributed across multiple locations and data privacy is a concern, we need privacy-preserving techniques for doing outlier detection. Linked here is a tutorial introduction to this topic that I recently prepared.

Privacy-preserving (PP) statistical algorithms can appear hard to approach at first because they sit at the intersection of data science and cryptography, both substantial topics in their own right. However, I find that once we have a good handle on a few primitives (oblivious transfer, secure scalar product, secure comparison, etc.), many PP algorithms become relatively easy to understand and implement.
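To give a flavour of what such a primitive looks like, here is a minimal sketch of a secure scalar product in the style of commodity-server protocols, where a trusted dealer hands out correlated randomness before the two parties exchange masked vectors. It is an illustration under stated assumptions rather than the construction used in the tutorial: the modulus P, the function names and the single-process simulation of Alice, Bob and the dealer are all my own choices, and a real deployment would need a network layer and a careful security analysis.

```python
import secrets

P = 2**61 - 1  # prime modulus (assumed for this sketch); all arithmetic is mod P


def dot(a, b):
    """Inner product of two equal-length vectors, reduced mod P."""
    return sum(x * y for x, y in zip(a, b)) % P


def rand_vec(n):
    """Vector of n uniformly random elements of Z_P."""
    return [secrets.randbelow(P) for _ in range(n)]


def secure_scalar_product(x, y):
    """Simulate the protocol in one process.

    Alice holds x, Bob holds y. Returns (u, v), where Alice ends up
    with u, Bob with v, and u + v = x . y (mod P).
    """
    n = len(x)

    # Dealer: correlated randomness with ra + rb = Ra . Rb (mod P).
    # Alice receives (Ra, ra); Bob receives (Rb, rb).
    Ra, Rb = rand_vec(n), rand_vec(n)
    ra = secrets.randbelow(P)
    rb = (dot(Ra, Rb) - ra) % P

    # Alice -> Bob: her vector masked by Ra.
    x_masked = [(xi + r) % P for xi, r in zip(x, Ra)]
    # Bob -> Alice: his vector masked by Rb.
    y_masked = [(yi + r) % P for yi, r in zip(y, Rb)]

    # Bob picks his output share v at random and sends t to Alice.
    v = secrets.randbelow(P)
    t = (dot(x_masked, y) + rb - v) % P

    # Alice strips the masks: u = x . y - v (mod P).
    u = (t - dot(Ra, y_masked) + ra) % P
    return u, v


u, v = secure_scalar_product([1, 2, 3], [4, 5, 6])
print((u + v) % P)  # shares recombine to 1*4 + 2*5 + 3*6
```

Neither share reveals the product on its own; only their sum does. Distance computations, which many outlier detection methods rely on, can be assembled from scalar products like this one.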

There are now practical PP algorithms for a range of problems, so data science practitioners should start paying attention.

The foundational technologies behind PP algorithms, including protocols for the important secure multi-party computation problem, are increasingly being used in conjunction with blockchain technology to produce potentially disruptive systems such as Enigma from MIT.