How To Fool AI Into Seeing Something That Isn't There

Our machines are littered with security holes, because programmers are human. Humans make mistakes. In building the software that drives these computing systems, they allow code to run in the wrong place. They let the wrong data into the right place. They let in too much data. All this opens doors through which hackers can attack, and they do.

But even when artificial intelligence supplants those human programmers, risks remain. AI makes mistakes, too. As described in a new paper from researchers at Google and OpenAI, the artificial intelligence startup recently bootstrapped by Tesla founder Elon Musk, these risks are apparent in the new breed of AI that is rapidly reinventing our computing systems, and they could be particularly problematic as AI moves into security cameras, sensors, and other devices spread across the physical world. "This is really something that everyone should be thinking about," says OpenAI researcher and ex-Googler Ian Goodfellow, who wrote the paper alongside current Google researchers Alexey Kurakin and Samy Bengio.

Today, neural nets are quite good at recognizing faces and spoken words—not to mention objects, animals, signs, and other written language. But they do make mistakes—sometimes egregious mistakes. "No machine learning system is perfect," says Kurakin. And in some cases, you can actually fool these systems into seeing or hearing things that aren't really there.

As Kurakin explains, you can subtly alter an image so that a neural network will think it includes something it doesn't, and these alterations may be imperceptible to the human eye—a handful of pixels added here and another there. You could change several pixels in a photo of an elephant, he says, and fool a neural net into thinking it's a car. Researchers like Kurakin call these "adversarial examples." And they, too, are security holes.
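The flavor of the attack can be shown with a toy sketch. What follows is not the paper's ImageNet setup—it stands in a trained network with a tiny linear classifier and a four-number "image," all values invented for illustration—but the perturbation rule is the same idea behind Goodfellow's earlier fast gradient sign method: nudge every input a small, fixed amount in the direction that hurts the correct score.

```python
import numpy as np

# Toy stand-in for a trained classifier: a linear model whose score's
# sign decides between two classes ("elephant" vs. "car", say).
# All weights and pixel values here are invented for illustration.
w = np.array([0.5, -0.25, 0.75, -0.5])
x = np.array([0.2, 0.1, 0.3, 0.4])  # an "image" the model gets right

def predict(v):
    return "elephant" if v @ w > 0 else "car"

# Perturb every "pixel" by a small fixed step in the direction that
# lowers the correct class's score. Each value moves by only 0.2,
# yet the label flips.
eps = 0.2
x_adv = x - eps * np.sign(w)

print(predict(x))      # elephant
print(predict(x_adv))  # car
```

The perturbed input stays close to the original—no single value moves more than `eps`—which is the sketch's analogue of an image change too small for the human eye to notice.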

With their new paper, Kurakin, Bengio, and Goodfellow show that this can be a problem even when a neural network is used to recognize data pulled straight from a camera or some other sensor. Imagine a face recognition system that uses a neural network to control access to a top-secret facility. You could fool it into thinking you're someone who you're not, Kurakin says, simply by drawing some dots on your face.

Goodfellow says this same type of attack could apply to almost any form of machine learning, including not only neural networks but things like decision trees and support vector machines—machine learning methods that have been popular for more than a decade, helping computer systems make predictions based on data. In fact, he believes that similar attacks are already practiced in the real world. Financial firms, he suspects, are probably using them to fool trading systems used by competitors. "They could make a few trades designed to fool their competitors into dumping a stock at a lower price than its true value," he says. "And then they could buy the stock up at that low price."

In their paper, Kurakin and Goodfellow fool neural nets by printing an adversarial image on a piece of paper and showing the paper to a camera. But they believe that subtler attacks could work as well, such as the previous dots-on-the-face example. "We don't know for sure we could do that in the real world, but our research suggests that it's possible," Goodfellow says. "We showed that we can fool a camera, and we think there are all sorts of avenues of attack, including fooling a face recognition system with markings that wouldn't be visible to a human."

A Hard Trick to Pull Off

This is by no means an easy thing to do. But you don't necessarily need inside knowledge of how the neural network was designed or what data it was trained on to pull it off. As previous research has shown, if you can build an adversarial example that fools your own neural network, it may also fool others that handle the same task. In other words, if you can fool one image recognition system, you can potentially fool another. "You can use another system to craft an adversarial example," Kurakin says. "And that gives you a better chance."
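That transfer property can also be sketched in miniature. Here, two toy linear models stand in for two independently trained systems; the weights and input are invented for illustration. An example crafted against the attacker's own model flips the victim's prediction as well, without ever touching the victim's internals.

```python
import numpy as np

# Two toy "independently trained" linear models for the same task.
# Their weights differ, but they score the same features similarly.
w_attacker = np.array([2.0, -1.0, 0.5])  # model the attacker controls
w_victim = np.array([1.5, -0.8, 0.7])    # target model, internals unknown

x = np.array([0.2, 0.1, 0.1])  # both models classify this as positive

# Craft the perturbation using only the attacker's own model...
eps = 0.3
x_adv = x - eps * np.sign(w_attacker)

# ...and it flips the victim's prediction too.
print(np.sign(x @ w_attacker), np.sign(x @ w_victim))          # 1.0 1.0
print(np.sign(x_adv @ w_attacker), np.sign(x_adv @ w_victim))  # -1.0 -1.0
```

The transfer works here because the two models weight the input features in roughly the same directions—a simplified version of why adversarial examples carry over between real networks trained on the same task.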

Kurakin makes a point of saying these security holes are small. They are a problem in theory, he says, but in the real world, an attack is difficult to get right. Unless an attacker discovers the perfect pattern of dots to put on her face, nothing will happen. Nevertheless, this kind of hole is real. And as neural networks play a bigger and bigger role in the modern world, we must plug these holes. How? By building better neural networks.

That won't be easy, but the work is underway. Deep neural nets are meant to mimic the web of neurons in the brain. That's why they're called neural networks. But when it comes down to it, they're really just math on an enormous scale—layer upon layer of calculus. And this math is organized by humans, researchers like Kurakin and Goodfellow. Ultimately, they control these systems, and they're already looking for ways to eliminate these security holes.

One option, Kurakin says, is to incorporate adversarial examples into the training of neural networks, to teach them the difference between the real and the adversarial image. But researchers are looking at other options as well. And they're not quite sure what will work and what won't. As always, it is we humans who must get better.
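The training scheme Kurakin describes can be sketched, too. The snippet below is a minimal, hypothetical version—logistic regression on made-up data stands in for a deep network—but the loop has the right shape: at each step, generate perturbed copies of the training examples against the current model, then train on the clean and adversarial examples together.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary task: the label is simply the sign of the first feature.
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(5)
eps, lr = 0.1, 0.5

for _ in range(300):
    # Craft adversarial copies of the training set against the
    # current model (fast-gradient-sign-style perturbation)...
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
    X_adv = X + eps * np.sign(grad_x)
    # ...and take a gradient step on clean + adversarial data together.
    X_aug = np.vstack([X, X_adv])
    y_aug = np.concatenate([y, y])
    grad_w = X_aug.T @ (sigmoid(X_aug @ w) - y_aug) / len(y_aug)
    w -= lr * grad_w

clean_acc = ((sigmoid(X @ w) > 0.5) == (y > 0.5)).mean()
print(clean_acc)
```

The design choice is the one the researchers describe: the model sees the adversarial version of an input next to its true label, so it learns to give both the same answer.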