Biometrics and Privacy

I’ve written before on biometrics. If you’ve read those posts, you’ll know that I like biometrics in general. But earlier this year, a new wrinkle popped up in the field: a lawsuit accusing Facebook of capturing biometric information, specifically “faceprints,” without users’ consent. I don’t know how Facebook’s system works or what it captures from images; it may be capturing face geometries.

To me, a more likely scenario is the one described by Cathy Reisenwitz in an article on fee.org. She argues that Facebook uses deep learning to recognize faces and that tagging people in photos helps train the algorithm. That would be a very cost-effective way to do the training. Facebook has described their use of deep learning for facial recognition in a paper on their DeepFace project. Their goal appears to be to recognize any face in any photo.

I am a Facebook user, and I know they currently cannot recognize all people in all photos. They do suggest people in many photos, but sometimes they miss even some of my Facebook friends in my posts.

The DeepFace paper describes Facebook’s approach to the important parts of using deep learning for facial recognition. In the fourth part of a multi-part series, Adam Geitgey describes how machine learning is used for facial recognition. The article is a clear and excellent description of the process, so I won’t go into detail here; Geitgey does such a good job.

My concern here is with privacy. Google and others have facial recognition. Television series and movies routinely show facial recognition in action, and the method often depicted uses facial geometry: for example, the distance between the eyes, the distance from the lips to the chin, and other geometric characteristics. These measurements give a set of numbers that can be compared against a database of pictures to identify an individual, whether for authentication or identification. The data cannot be used to reconstruct an image of the face.
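To make the geometry idea concrete, here is a minimal sketch in Python. The landmark names, the particular distances, and the matching tolerance are all my own hypothetical choices, not anything a real system necessarily uses; the point is only that a face reduces to a small vector of numbers that can be compared against stored vectors.

```python
import math

def geometry_features(landmarks):
    """Build a toy feature vector from hypothetical facial landmarks.

    Distances are normalized by the eye-to-eye distance so the
    features don't depend on how large the face is in the photo.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    eye_dist = dist(landmarks["left_eye"], landmarks["right_eye"])
    return [
        dist(landmarks["lips"], landmarks["chin"]) / eye_dist,
        dist(landmarks["nose"], landmarks["chin"]) / eye_dist,
    ]

def match(features, database, tolerance=0.05):
    """Return the names whose stored features are all within tolerance."""
    return [
        name for name, stored in database.items()
        if all(abs(f - s) <= tolerance for f, s in zip(features, stored))
    ]
```

Notice that the feature vector is just two ratios; nothing in it lets you draw the face back, which is why geometry-based systems can claim the stored data is not an image.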

In a deep learning or machine learning approach, we might expect to use the same measurements. But we don’t. In fact, it doesn’t matter what points and measurements the network uses! As Geitgey points out, “It turns out that we have no idea. It doesn’t really matter to us. All that we care is that the [neural] network generates nearly the same numbers when looking at two different pictures of the same person.” That means that generating the numbers is a one-way process: we have no idea what they mean, and we cannot “undo” the process to reconstruct a face.
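The comparison step Geitgey describes can be sketched in a few lines. Here the network itself is treated as a black box; I just assume it emits a vector of numbers (an embedding) per face, and all the system ever asks is whether two vectors are close. The threshold value is an illustrative assumption, not one from the DeepFace paper.

```python
import math

def embedding_distance(a, b):
    """Euclidean distance between two face embeddings (opaque vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_person(a, b, threshold=0.6):
    """The only question the system answers: are the numbers close enough?

    The individual components have no human-readable meaning, and the
    original face cannot be reconstructed from them.
    """
    return embedding_distance(a, b) <= threshold
```

Nothing in this comparison cares what each number means, which is exactly the one-way property described above.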

So what’s the big deal? The big deal is that while this would make for potentially reliable authentication, it would also mean someone could use the technique (along with a large database and fast computers) to recognize anyone in a crowd. That would be good for law enforcement, because they could easily find bad guys, but less honorable people could use the same technology to find out where you or I are at any given time. That’s the issue.

In my next post, I’ll continue this discussion with another aspect of facial recognition.