Yanny vs. Laurel: Sensory Illusions and AI Self-Driving Cars

Unless you’ve been living in a cave, you likely know about the recent craze over an audio clip that has created a social media frenzy and sparked a debate among both friends and foes alike. A short audio clip that was posted on Twitter had asked listeners to report whether they heard the word “Yanny” or the word “Laurel” when hearing the clip. Thousands upon thousands of replies seemed to suggest that the world is split into Yanny believers versus Laurel believers. At times, it has been an even split, while at other times the tide starts to go toward the Yanny side and the next moment it slides over to the Laurel side.

I’ve overheard people talking about this curious and mind-puzzling audio phenomenon while I’ve been in the line at Starbucks, while at the coffee machine in the office, while at the grocery store getting my weekly foodstuffs, and even while camping in the middle of the woods. If a tree were to fall, would it sound like Yanny or sound like Laurel? There are now zillions of memes online about the matter and numerous clever take-offs have been crafted in both audio and video formats. This seems reminiscent of the craze in 2015 that had everyone debating whether a picture of a striped dress was composed of the colors white and gold versus the colors of blue and black. It became an instant hit, it provoked lighthearted controversy, it lasted for a while as an intense focus of discussion worldwide, and eventually petered out.

Is all of this some kind of mass hysteria? Maybe the power of social media is such that it can cause people to go mad. Or, maybe people are so desperate for something novel or interesting that they latch onto these fads. The fascinating aspect is that people become so convinced that whatever they hear is the right choice that they are baffled anyone could be on the other side of the issue. There was a Yanny proponent the other day who insisted that the Laurel proponent they were arguing with was purposely trying to provoke a controversy and merely pretending to hear the word as Laurel. This Yanny fanatic was sure that there was a giant conspiracy going on, and that those on the Laurel side did not genuinely hear Laurel and were claiming they heard it for purposes of irking the rest of the world.

These kinds of debates also give scientists a moment in the spotlight. Some talking-head scientists right away suggested it was indeed all-in-their-mind as a psychological matter that showcased what people want to hear. In other words, if you wanted to hear Laurel, and you were presented with the audio clip, you’d think you heard Laurel, even though it was maybe actually being pronounced as Yanny. Likewise, if you wanted to hear Yanny, and then you heard the clip, you’d be convinced it said Yanny. This notion that people were being led down a primrose path did not seem to widely bear out though.

Yes, it’s true that people can often be seeded to think a certain way. Yes, people are known for becoming cognitively anchored to something, and it is often hard to get them to shift from their original anchor point. Those aspects, though, don’t seem to account for the rather large number of people involved in this social experiment of the Yannys versus the Laurels. Instead, the scientific explanation that seems the most plausible overall, and that accounts for the largest segment of those immersed in this controversy, is sensory illusion.

Here, in the sensory illusion explanation, we consider the nature of the sound clip and note that it is very short in length and of relatively poor audio quality. In that manner, it is readily open to interpretation. Were the audio clip longer, such as an entire sentence, you’d perhaps have a better chance of ascertaining what it says, and likewise if the quality were higher and more distinctive it might be less open to multiple interpretations. It just so happened to be short enough and ambiguous-sounding enough that the ear hears something that is not fully defined, and then the brain enters into the matter and tries to help clean up what was heard. This is akin to taking a blurry visual image and cleaning it up by refocusing the image and adding more pixels to it. Your brain is taking an ambiguous audio element and trying to make sense of it, doing so by internally polishing it and then trying to match what it heard to other sounds that it knows.

Have you ever been camping, and you looked off in the distance and saw a shape that maybe was a bear? Or, is it a human? Or, is it just a log that happens to be in the overall shape of a bear or a human? Or, maybe it’s Bigfoot. But, anyway, the point is that your eyes can be tricked by visual illusions in that you see something vaguely and then your brain tries to polish it and match it to things that you know. Being at a distance from the object, you only have scraps of visual cues to work with. Your brain takes whatever morsels are available and tries to make them into something usable.

Airline pilots are known to be susceptible to visual illusions. There are many famous cases of airplane pilots who looked at a landing strip and thought that it was wider than it actually was, or shorter in length than it actually was, or that it was more upsloping than it was, or more downsloping than it really was. There are all kinds of visual illusions that pilots are supposed to be on the watch for. One is called the black-hole illusion, and it typically occurs when there is a body of water prior to where a landing strip is. If you’ve ever looked out a plane window when landing at an airport near the water, and at nighttime, you’ve probably looked down and observed that the water looks entirely blacked out. Rather than visually perceiving it as a body of water, it nearly looks like a mysterious black hole, as though the earth didn’t exist there, and it was just wide open empty space.

Vection is an Illusion of Self-Motion

Sensory illusion can also include tricks of motion sensations. Sometimes, while sitting in bumper to bumper traffic, I’ll notice the lane of cars to my right proceed forward slowly, and the lane of cars to my left proceed forward slowly, while my lane is at a standstill. This combination of motions to either side can occasionally create an odd feeling or untoward sensation that’s called vection.

Vection is an illusion of self-motion. You believe that you are in motion, even though you are not. In the case of the cars around me, I at times perceive them as stopped, and I feel like my car is rolling backwards. It’s a weird thing when it happens. If you’ve never experienced it, when you do so, you’ll momentarily think the world’s gone crazy and be vexed as to how in the world your car could be rolling backwards just out of the blue. You might even reflexively stomp on the brakes of your car, doing so because your brain has told you to stop rolling backwards and the way to do so would be to bring the car to a halt. It’s one of those bizarre illusions, for sure.

I am guessing that you likely accept the notion that there are plenty of optical or visual illusions that we humans can experience. And, that you likely also accept the possibility of motion-based illusions that we humans can experience. The idea of audio illusions is a bit harder for most of us to accept. You are tempted to believe that whatever is heard, it is heard in the same way, by all. But, if you think about foreign languages, and when you hear a foreign language that you don’t quite know, I think you would agree that there are times when the foreign words are spoken that you might be unsure of exactly what you heard said. Your brain is trying to make sense of the sounds and at times it isn’t sure what the sound really was. This can apply, perhaps incredibly, even to sounds that we think we know, such as the Yanny and the Laurel debate.

I’d like to take this Yanny versus Laurel debate and use it for another purpose herein, namely to spark discussion about the dangers of sensory illusions for a subject of another kind, as I’ll explain in a moment.

What does this have to do with AI self-driving cars?

At the Cybernetic Self-Driving Car Institute, we are developing AI systems for self-driving cars, and are well aware of the dangers of sensory illusions that can impact the AI of a self-driving car. Auto makers and tech firms making such AI systems need to also be aware of the matter, and so does the general public that will be occupants in self-driving cars or otherwise be near to or around AI self-driving cars.

You might at first be bewildered by the possibility of sensory illusions being applicable to AI self-driving cars. The AI of a self-driving car is supposed to be automation that does not have the frailties of humans. Humans are the ones that are prone to sensory illusions, including the Yanny versus Laurel debate, and the examples I’ve mentioned about human pilots being susceptible to visual illusions while flying. Certainly, the dispassionate and robotic-like automation of an AI self-driving car would not be prone to these human limitations, you might insist. Indeed, many proponents of AI self-driving cars keep saying that the wonderment of AI self-driving cars is that we no longer need to worry about human drivers who at times drive under the influence or get distracted by looking at their phones while behind the wheel.

First, let’s clarify that a true self-driving car, considered a Level 5 self-driving car, is one in which the AI is supposed to be able to fully drive the car without any human intervention needed, and that Level 5 consists of AI that can drive in whatever manner a human could drive a car. At the levels less than 5, the human is still considered the driver of the car, even if the automation or AI is there doing driving too. In that sense, cars below Level 5 are still reliant upon humans, and so whenever the AI hands over the controls to the human driver, or whenever the human driver opts to take over the controls from the AI, we’re back in the realm of dealing with human susceptibility to sensory illusions.

But, I don’t want to distract away from the very important point here, specifically that the AI is also susceptible to sensory illusions.

If you need to sit down for a moment, now that I’ve mentioned this key aspect, please do so. Yes, in spite of the talk about how perfect the AI is going to be, the reality is that the AI and the self-driving car will also be susceptible to sensory illusions. It is going to happen. It has most certainly already happened. It is a danger. It is a known danger. It is something that the auto makers and tech firms aren’t necessarily talking about. It is something that needs to be of great concern to all, and we need to put in place as many protective measures against it as we can.

What kind of sensory illusion could an AI self-driving car be susceptible to? Lots.

There are numerous sensors on an AI self-driving car. There are cameras that capture still images and video images. There are radar devices and sonar devices. There might be LIDAR (light detection and ranging) devices, and so on. Each of these sensory devices is not perfect. Each of these sensory devices can be faulty. Each of these sensory devices can work as expected, and not be in any error condition, and yet nonetheless provide sensory data that is ambiguous. The AI needs to take the sensory data and make some logical sense out of it.

The sensors collect data about the real world around the self-driving car. This data is then usually transformed into something more amenable for the AI to deal with. For example, the data of an image might be very large in terms of the number of pixels, with some pixels being unspecified or being captured but with uncertainty about whether they are on or off. The raw data is so voluminous that the AI wouldn’t be able to fully inspect it per se, and thus the sensor software might transform and compress the data. As a result, what the AI is about to try to interpret is not necessarily what was originally captured.
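To make the point about transformation and compression concrete, here is a minimal sketch, assuming the "image" is just a small grid of brightness values and that the compression step is simple block averaging. The function name `downsample` and the sample grid are hypothetical illustrations of mine, not how any particular sensor software works.

```python
def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D brightness grid."""
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)) / factor ** 2
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


# Hypothetical 4x4 capture: a single bright pixel (100) on a dark background.
raw = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 100, 0],
    [0, 0, 0, 0],
]

# After 2x2 averaging, the bright spot is smeared into a much dimmer block.
compressed = downsample(raw, 2)
print(compressed)
```

In this toy case the one bright pixel is diluted to a quarter of its original brightness, so anything downstream that looks for a strong signal now has less to go on than the sensor originally captured.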

The transformed data is fed into the AI that’s running the self-driving car. The AI needs to figure out whether there’s a pedestrian in the image that was just captured. Similar to my earlier story about camping in the woods and whether you saw a bear or a human, the AI needs to try and guess from the image whether there’s a pedestrian standing in the street up ahead or not. Maybe it’s just a cone in the street. Maybe it’s a child. Maybe it’s a pedestrian but they are actually farther away than the image suggests. If the image is captured at nighttime, the darkness might make the shape of the pedestrian hard to fully distinguish. In short, the chances of a sensory illusion are quite substantial.
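One way to picture this guessing game is a classifier that produces a score for each candidate label, where two labels end up nearly tied. The sketch below is only an illustration under my own assumptions: the scores are made up, and `softmax`, `interpret`, and the `min_margin` threshold are hypothetical names rather than part of any real self-driving system.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def interpret(scores, min_margin=0.25):
    """Return the top label and whether the top two labels are too close to call."""
    probs = softmax(scores)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    ambiguous = (p1 - p2) < min_margin
    return best, ambiguous


# Hypothetical nighttime frame: pedestrian and cone score almost the same.
print(interpret({"pedestrian": 2.0, "cone": 1.9, "nothing": 0.1}))

# Hypothetical daytime frame: the pedestrian clearly dominates.
print(interpret({"pedestrian": 5.0, "cone": 1.0, "nothing": 0.0}))
```

The design choice worth noting is that the function reports the ambiguity rather than hiding it. An AI that only ever sees the top label has no way of knowing it was looking at its own version of Yanny versus Laurel.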

In fact, you might want to read my analysis of the Uber incident in Arizona, since it is possible that the self-driving car might have encountered a sensory illusion that led to it striking and killing the pedestrian who was walking a bicycle.

Is that a Tractor Up Ahead?

Another factor to keep in mind about the sensory illusion of an AI self-driving car involves the Machine Learning (ML) elements. Suppose we’ve used a Machine Learning approach such as an artificial neural network and trained the neural network on identifying cars based on images of cars. This is akin to having a neural network learn to identify cats in images, which the neural network might do by perhaps identifying that cats seem to have a certain kind of ear shape and they have whiskers. Thus, when an image of something is fed into the neural network, and if the image has what looks to be cat ears and cat whiskers, the neural network would report that it has found a cat.

Suppose we fed images of the rear ends of cars into a Machine Learning element such as a neural network, doing so in order that when the AI self-driving car takes a picture of a car up ahead, the neural network can take that rear end image of the car and try to figure out whether it is a car and what type of car it is. We might have thousands upon thousands of images of the rear ends of Ford cars, BMWs, and so on. They all are used to train the neural network. It is gradually tuned until it is able to somewhat reliably detect that the image contains a rear end of a car in it.

Now, let’s go back to the cat and let’s say I fed in an image of a cat that had no whiskers and whose ears were oddly shaped (unlike any normal cat). The neural network that had been trained on conventional cat images would be unlikely to identify that there’s a cat in the image. In that same manner, if the AI self-driving car is driving along a dirt road, and a tractor is up ahead, the self-driving car, upon inspecting the image of the object, might not be able to ascertain that it is a tractor. The rear end image of a tractor would look a lot different than the rear end image of a conventional car.
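As a rough illustration of the tractor problem, consider a check that asks how far a new observation sits from everything seen during training. This is a minimal sketch and not how a neural network works internally: I am assuming each rear end is boiled down to two hypothetical measurements (width and height in meters), and the names `centroid`, `is_unfamiliar`, and the `max_distance` threshold are invented for the example.

```python
import math


def centroid(samples):
    """Return the average point of a list of equal-length feature tuples."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(len(samples[0])))


def is_unfamiliar(sample, training_samples, max_distance):
    """Flag a sample that lies far from the center of the training data."""
    return math.dist(sample, centroid(training_samples)) > max_distance


# Hypothetical (width_m, height_m) measurements of conventional car rear ends.
car_rears = [(1.8, 1.4), (1.9, 1.5), (1.7, 1.4), (1.8, 1.5)]

tractor_rear = (2.5, 3.0)    # hypothetical: much taller than any training car
sedan_rear = (1.85, 1.45)    # hypothetical: close to the training examples

print(is_unfamiliar(tractor_rear, car_rears, max_distance=0.5))
print(is_unfamiliar(sedan_rear, car_rears, max_distance=0.5))
```

The tractor lands well outside the cluster of cars, which is the useful signal here: the system can at least say "this is unlike what I was trained on" rather than quietly concluding that nothing car-like is ahead.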

In this instance, I’m willing to include this example in the sensory illusion basket, even though we might all agree that the image of the tractor is, let’s say, unobscured and fully visually detected. Technically it is not really a sensory illusion as we might normally consider a sensory illusion. I am allowing it to be considered as such to point out that what the AI is looking for and what it finds can be two different things. Here, the AI is looking for whether a car is ahead, and it finds something that does not seem to be a car, but we would likely agree it essentially is a car in that it is a mode of transportation that acts like a car acts.

This scenario can play out in ways that are quite dangerous. The AI might assume that since it’s seemingly not a car ahead, maybe it can be ignored. Or, maybe it makes some other untoward assumption. Whatever action the AI decides to take regarding the self-driving car, there is now a heightened chance that any maneuvers might be poorly chosen ones. Some believe that perhaps the now famous case of the Tesla that slammed into the truck on a Florida highway in 2016 might have involved a sensory illusion issue. It was claimed that perhaps the side of the truck was perceived by the Autopilot automation as being the sky, and so the AI of the Tesla presumably did not think any object was up ahead.

One of the concerns about using Machine Learning techniques such as neural networks is that the neural network might “learn” aspects that don’t logically correspond to what we assume it has identified. For the cats example, suppose the neural network had found a pattern that suggested that cats all have brown fur (pretend that the images used for training purposes were predominantly of cats that had brown fur). In that case, anytime a cat image is shown later on, unless the cat also has brown fur, the neural network might report that the likelihood of a cat being in the image is low or nonexistent. There is a famous, often-retold story of an ML system that examined images of military tanks and was shown pictures of Russian tanks and United States tanks. The neural network seemed to be able to differentiate them. It turns out that the Russian tank photos had lots of graininess while the US tank photos did not, and the neural network patterned on the graininess of the images, and not on the actual distinctive features of a tank.
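The tank story can be mimicked with a deliberately tiny learner. The sketch below is my own toy illustration, not a neural network: it simply picks the single feature that best matches the labels in the training set. The data is invented, with feature 0 standing for "photo is grainy" and feature 1 standing for a hypothetical genuine design cue, and `best_single_feature` is a made-up helper name.

```python
def best_single_feature(samples, labels):
    """Pick the feature index whose value most often equals the training label."""
    def accuracy(i):
        hits = sum(1 for s, y in zip(samples, labels) if s[i] == y)
        return hits / len(samples)
    return max(range(len(samples[0])), key=accuracy)


# Invented training set. Each sample is (grainy_photo, design_cue).
# Label 1 photos are all grainy; label 0 photos are all clean.
# The genuine design cue is right only 6 times out of 8.
train_x = [(1, 1), (1, 1), (1, 0), (1, 1), (0, 0), (0, 0), (0, 1), (0, 0)]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]

chosen = best_single_feature(train_x, train_y)
print("feature the learner latched onto:", chosen)

# A clean (non-grainy) photo of a label 1 tank: the design cue says 1.
clean_photo_of_class_1 = (0, 1)
prediction = clean_photo_of_class_1[chosen]
print("prediction for a clean photo of a label 1 tank:", prediction)
```

Because graininess lines up perfectly with the labels in the training set, the learner latches onto it and then misclassifies the first clean photo it meets. The training accuracy looked flawless the whole time, which is exactly what makes this kind of illusion hard to spot.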

Returning to the Yanny versus Laurel debate, it’s a fun topic and allows us all to enjoy kidding each other about what we hear and what we think we hear. Fortunately and interestingly, though, it also brings up the importance of sensory illusions. Humans are susceptible to sensory illusions. AI self-driving cars are also susceptible to sensory illusions. Let’s not delude ourselves into thinking that AI self-driving cars are some kind of perfection. We need to develop the AI’s capabilities so that it can catch its own susceptibility to sensory illusions. Unlike the rather idle consequences of whether you hear Yanny or Laurel in that now ubiquitous audio clip, sensory illusions for AI self-driving cars can have decisive life-and-death consequences. It’s a serious matter, and I assure you that’s no illusion.