People’s online photos are being used without consent to train face recognition AI

Photos of people’s faces are routinely taken from websites, without the subjects’ consent, to help develop face recognition algorithms, a report by NBC reveals.

In January, IBM released a data set of almost a million photos that had been scraped from the photo-sharing website Flickr and then annotated with details such as skin tone.

The company pitched this as part of efforts to reduce the (very real) problem of bias within face recognition. However, it didn’t get consent from the people pictured, and it’s almost impossible to get the photos removed.

IBM is far from alone. As companies scramble to improve their face recognition technology, they need access to vast numbers of images to feed their algorithms. Simply taking images that have already been uploaded to the internet is a very fast, but ethically questionable, way to do that.

Face recognition might be convenient for unlocking your phone, but it could be a powerful surveillance tool as well. Its use is expanding rapidly with virtually no oversight, leading to growing calls for the technology to be regulated.

Face recognition algorithms have a poor accuracy record when it comes to identifying non-white faces, and systems can misgender people on the basis of, for example, the length of their hair. One way to combat this is to add more images of, say, black women, or men with long hair, to the training data. But using people’s photos without first obtaining their consent will leave many feeling deeply uncomfortable.