Humans come equipped with a multitude of useful faculties to control a car: we have eyes and ears to sense the world around us; a fast-paced brain to process those inputs; and, for the most part, a strong sense of memory which allows us to drive the many roads we know well with great confidence. But there’s a world of difference between the way we see the world and the way a computer does. “You could program a car to drive simply by putting in the rules straight from a DMV handbook,” suggests Katelin Jabbari from Google’s self-driving car team. “But that doesn’t account for 99 percent of the things we encounter on the roads. How softly should I brake? How quickly should I take a turn?” Recreating those skills using digital sensors and silicon chips is possible. It just isn’t easy.

An autonomous car needs three basic skills to drive itself around: first, it has to understand where it is; second, it has to figure out what the safest route is to its next location; and third, it must know how to move to that next spot. It’s exactly what we all do as we drive: locate, perceive, move. Fortunately, digital technology has rendered the third stage trivial. Power steering systems are easily motorized, as you’ll know if you’ve seen a car park autonomously.

So the physical act of driving—pressing pedals and turning the steering wheel—was always going to be easy. The problem is getting cars to master those first two skills. Essentially it means training computers to mimic the way human brains perceive physical space as they move through it.

The Prior Map

“I’m often asked, ‘Why not just use GPS?’” Will Maddern quips. He’s a researcher with the University of Oxford’s Mobile Robotics Group, which has developed Robot Car. “This is actually a very sensible question: the GPS satellite constellation is a stunning engineering achievement.”

The problem is that GPS has some major limitations. First, it doesn’t work without a clear view of the sky, so navigating tunnels, indoor car parks or even forests is ruled out. More importantly, it has a resolution on the order of meters, which is certainly not enough to safely navigate a car through city streets, where mere centimeters can mean the difference between safety and a collision.

“We often use GPS to kick-start our localization algorithms,” admits Maddern, “but we do not rely on it to tell us where in the road lane the car is.” Instead, self-driving cars learn to identify their position on the road in a completely different way.

Combined LIDAR and stereo camera point-cloud map of Oxford (Image: University of Oxford)

“Two things need to happen,” explains Sridhar Lakshmanan, a Professor of Engineering who specialises in image processing and computer vision for autonomous vehicles at the University of Michigan. “The maps need to become more accurate and the registration to them needs to get more accurate.”

Rather than relying on the kind of maps that your in-car navigation system uses, the autonomous cars developed by the likes of Google and Oxford University rely on rich, 3D maps of the road known as prior maps. They look a little like the image just above. Then, they use on-board sensors to compare what the car sees at any point in time to what it has stored away in its memory. “Prior maps allow the car to have much better understanding of where it is in the world before it sees real-time data,” explains Jabbari. “So it knows what should be happening, it can see what’s actually happening in real time, and it can make a judgement about what it should do.”

The need for a map introduces a natural limitation on the way the cars are used, though. “The first thing we have to do before we can drive autonomously is map the roads,” admits Jabbari. Google and others build those maps piece-by-piece, using the same cars they deploy for autonomous driving to record the world that surrounds them in glorious detail—they’re already loaded with the kit that’s needed to make the maps, after all.

The Real World

And autonomous cars really are dripping with exotic sensing equipment. Google’s cars rely heavily on laser and radar scanners. A Velodyne 64-beam laser sits atop their vehicles, spinning merrily like a police car’s flashing light. You can see them on the roof of the cars in the image below.

Velodyne 64-beam lasers atop Google’s self-driving cars. (Image: AP)

Those systems, referred to as LIDAR because of their laser light-radar genre mashing, scan 1.3 million points each time they spin round, in concentric waves that begin eight feet from the car. They’re capable of identifying a fourteen-inch object a hundred and sixty feet away. The radar system has twice that range; it’s less precise, but should in theory allow the car to see through rain, sleet and snow. The cars being developed in Oxford, meanwhile, use a series of stereo cameras and LIDAR sensors. Others, from the AutoNOMOS Labs in Germany to Stanford’s Center for Automotive Research, use a combination of similar sensors. Regardless of the exact sensors on the vehicle, the goal is the same: to suck up as much information about the surroundings of the car as possible.

But why bother making maps at all when you have such amazing real-time data from the sensors? “Not all autonomous vehicle developers agree that prior maps are important, or even necessary, for autonomous navigation,” says Maddern. “A number of researchers and manufacturers take the view that a vehicle does not need to know exactly where it is in order to react to the environment around it — by following lanes, obeying traffic signs and traffic lights and reacting appropriately to other cars and road users, they suggest that a car should perform at least as well as a human driving on an unfamiliar road.”

The folks at Google disagree with that notion, though. “If you’re only relying on real-time data, there’s a lot more processing that needs to be done,” explains Jabbari. “Prior maps make it easier... it’s an added layer of safety and understanding.” She likens having a prior map to driving in a familiar neighborhood: you always know the roads like the back of your hand, wherever you are.

Right now, at least, it seems that the biggest strides forward are being made by those who embrace the prior map. That’s not to say that the purely real-time approach is unwise; just that it presents a much more difficult task. While it could ultimately prove to be the best approach, it currently looks unlikely to be first to market.

Even with an on-board map, an autonomous car is always gathering fresh information about how the world looks—especially in different conditions, such as snow, wind, rain or dark. Maddern explains that the car being developed in Oxford is always creating new maps, building “multiple experiences of the same location under different conditions.” That way, the car has many “memories” of the same route, and can access the one most appropriate to current driving conditions at any given moment.
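As a rough illustration of how such a store of “memories” might be queried, here is a minimal Python sketch. The `Experience` record, the condition tags and the overlap score are all assumptions made for this example; the article does not describe how the Oxford system actually chooses between experiences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """One stored drive along a route (hypothetical record for illustration)."""
    route_id: str
    conditions: frozenset  # tags such as {"night", "rain"}
    map_name: str


def overlap(a, b):
    """Jaccard similarity between two sets of condition tags."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pick_experience(experiences, route_id, current_conditions):
    """Return the experience on this route whose tags best match current conditions."""
    current = frozenset(current_conditions)
    candidates = [e for e in experiences if e.route_id == route_id]
    if not candidates:
        return None  # the route has not been mapped yet
    return max(candidates, key=lambda e: overlap(e.conditions, current))


# Made-up memories: two drives down one street, one drive down another.
memories = [
    Experience("high_street", frozenset({"day", "dry"}), "high_street_day"),
    Experience("high_street", frozenset({"night", "rain"}), "high_street_wet_night"),
    Experience("ring_road", frozenset({"night", "rain"}), "ring_road_wet_night"),
]
best = pick_experience(memories, "high_street", {"night", "rain", "wind"})
print(best.map_name)
```

The `None` result for an unmapped route mirrors the limitation Jabbari describes: a road has to be mapped before the car can drive it autonomously.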

Google’s self-driving car (Image: Google)

Still, generating the first round of prior maps is often cited as a barrier to the widespread roll-out of autonomous cars. But “Google is really good at mapping,” points out Jabbari. Combine Google’s maps with Maddern’s idea that cars will develop a database of experiences, and autonomous cars could develop a solid memory base of their environments. And, unlike humans, all they need to do is connect to a data network to get the latest version.

Seeing the World Through Car Eyes

Even with a perfect memory, the car still needs to compare its mental maps with what it’s actually seeing. How that’s done depends on the sensors aboard the car. In the case of a laser scanner, like the one on the roof of Google’s car, the hardware generates a millimeter-accurate 3D representation of its surroundings that’s called a point cloud. Because point clouds are so accurate, well-known algorithms can align the two maps, prior and present, as closely as possible. Once they line up, it is straightforward to identify with sub-centimeter precision where the car is on the road.
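The article does not name those algorithms. One widely used family is iterative closest point (ICP), and the Python sketch below shows a bare-bones 2D version under that assumption: pair each scanned point with its nearest map point, solve for the rotation and shift that best fits the pairs, and repeat. The function names, the toy L-shaped map and the pose values are all made up for illustration; a real system works in 3D on millions of points.

```python
import math


def transform(points, theta, tx, ty):
    """Rotate 2D points by theta (radians), then shift them by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def fit_rigid(src, dst):
    """Least-squares rotation and shift mapping paired points src onto dst."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(p[0] for p in dst) / n, sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        # Work with both point sets centred on their own means.
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def nearest(point, cloud):
    """Brute-force nearest neighbour; real systems use a spatial index."""
    return min(cloud, key=lambda q: (q[0] - point[0]) ** 2 + (q[1] - point[1]) ** 2)


def icp(scan, prior_map, iterations=20):
    """Estimate the pose (theta, tx, ty) that lays the live scan over the prior map."""
    pose = (0.0, 0.0, 0.0)
    for _ in range(iterations):
        moved = transform(scan, *pose)
        matches = [nearest(p, prior_map) for p in moved]
        pose = fit_rigid(scan, matches)
    return pose


# Toy prior map: an L-shaped kerb line sampled every metre (made-up data).
prior_map = [(float(x), 0.0) for x in range(6)] + [(0.0, float(y)) for y in range(1, 4)]
true_pose = (0.05, 0.10, -0.08)  # heading (rad) and position (m) of the car in the map

# Simulate what the car would see: the map expressed in the car's own frame.
c, s = math.cos(true_pose[0]), math.sin(true_pose[0])
scan = [(c * (x - true_pose[1]) + s * (y - true_pose[2]),
         -s * (x - true_pose[1]) + c * (y - true_pose[2])) for x, y in prior_map]

estimate = icp(scan, prior_map)
print("estimated pose:", estimate)
```

The recovered pose is the car's heading and position in the map frame. ICP only converges from a reasonable starting guess, which is one reason a coarse GPS fix remains useful for kick-starting localization.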

But 3D LIDAR sensors remain very costly, have high power requirements and many moving parts, says Maddern. That’s why many cars, such as the one being developed in Oxford, use normal digital cameras to compare reality with the car’s map. Some of these are stereo, others monocular, but none are as accurate as LIDAR. That means no easy algorithmic comparisons.

Instead of comparing the prior and real-time maps pixel by pixel, says Maddern, the computer will identify “interest points”, such as corners, edges and other features. These interest points in both maps are abstracted into “patches” which are compared. Maddern says the system works pretty well, “even when the patches are different sizes or are under different illumination conditions.”
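One standard way to compare two image patches so that overall brightness and contrast changes do not matter is zero-mean normalized cross-correlation (ZNCC). The Python sketch below uses it as a stand-in: the article does not say which descriptor or score Robot Car uses, the patch values are invented, and this version assumes equal-size grayscale patches, so it does not cover the differing patch sizes Maddern mentions.

```python
import math


def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-size grayscale patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b):
        raise ValueError("patches must contain the same number of pixels")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    norm = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if norm == 0:
        return 0.0  # a flat patch carries no structure to match
    return sum(x * y for x, y in zip(da, db)) / norm


def best_match(live_patch, map_patches):
    """Return (index, score) of the prior-map patch that correlates best."""
    scores = [zncc(live_patch, p) for p in map_patches]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Two made-up 3x3 patches from the prior map: a corner and a vertical edge.
corner = [[10, 10, 10], [10, 200, 200], [10, 200, 200]]
edge = [[10, 10, 200], [10, 10, 200], [10, 10, 200]]
# The same corner seen in dimmer light: every pixel scaled and offset.
dim_corner = [[0.5 * v + 30 for v in row] for row in corner]

index, score = best_match(dim_corner, [edge, corner])
print(index, round(score, 3))
```

Because the score subtracts each patch's mean and divides by its spread, the dimmed corner still matches the stored corner far better than it matches the edge.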

For the most part, it works. The video below, which shows Robot Car’s fisheye-lens monocular camera footage being matched to a prior map, demonstrates as much. And the lure of cheap sensors is obvious: adding them to a car costs only thousands of dollars, rather than tens or hundreds of thousands. But the technique can struggle when it’s faced with motion blur, lens flare and inclement weather conditions. Given that cost is a common criticism of autonomous vehicles, it’s not surprising that “solving these challenges... is a very active research area,” as Maddern puts it.

Monocular camera footage is matched to prior map data by Robot Car (Image: University of Oxford)

Object Recognition

All these techniques, though, are simply a means for the car to match up one image to another to allow it to locate itself. Crucially, human drivers then go on to process the same information so that they know how to act on it. So what researchers are trying to do is make cars that recognize and then respond to features in their surroundings. Basically, we’re talking about cars developing an equivalent to human perception.

“That is the holy grail of autonomous vehicles,” admits Lakshmanan. It’s an “attentiveness to unexpected obstacles, whether it be traffic barrels laid out today, a pedestrian crossing in front, or a car making a sudden lane change.” And it’s exactly what cars need if they’re “to adapt to immediate changes in their surroundings.”

Perception, of course, has varying degrees of complexity. At its simplest, teaching a car to recognize objects is similar to teaching Google’s Image Search to know the differences between a teapot and a cat. Provide it with enough labelled images of the two to learn from, and it will eventually get the hang of things.

This is pretty much what Google’s done so far with its cars, too. “We’ve taught it to understand categories of things. So it recognizes pedestrians and cyclists and vehicles, even particular types of vehicles, like police cars and school buses,” explains Jabbari. And based on each object’s range of movements, Google also trains cars to predict what future actions to expect. “If it sees a cyclist putting their left hand up, it’s probably an indication that they’re going to turn left or merge into the left lane,” explains Jabbari. “It allows us to adjust [the car’s] behavior.”
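The learn-from-labelled-examples idea can be illustrated with a toy nearest-centroid classifier in Python. This is not Google’s method, which the article does not describe. The two features (an object’s height in meters and speed in meters per second) and every number below are invented purely to show the principle of averaging labelled examples and assigning a new object to the closest average.

```python
import math
from collections import defaultdict


def train_centroids(labelled_examples):
    """Average the feature vectors seen for each label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in labelled_examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid lies closest to the new feature vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical training data: (height in m, speed in m/s) with a label.
training = [
    ((1.7, 1.4), "pedestrian"), ((1.6, 1.2), "pedestrian"), ((1.8, 1.5), "pedestrian"),
    ((1.8, 5.5), "cyclist"), ((1.7, 6.0), "cyclist"), ((1.9, 4.8), "cyclist"),
    ((1.5, 13.0), "vehicle"), ((1.6, 11.0), "vehicle"), ((3.0, 12.0), "vehicle"),
]
centroids = train_centroids(training)
print(classify(centroids, (1.75, 5.2)))
```

More labelled examples refine the centroids, which is the sense in which such a system “gets the hang of things”; real perception stacks use far richer features and models than this.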

LIDAR data being used by Robot Car to classify other road users. (Image: University of Oxford)

Ultimately, though, the car needs to be able to figure all this out for itself, with no teaching required. It’s this level of perception that represents the single biggest hurdle for self-driving cars. “The ability to emulate the human brain... that’s going to be, if I may say, the secret sauce, the thing that allows one company to get to the market first,” explains Lakshmanan.

Until an autonomous car can, say, recognize and veer around a child running out into the road all by itself, there’s always going to be resistance to these cars on the streets. But the sheer weight of technological and algorithmic advance suggests that won’t be too long. Driving a car might be easy; it just turns out that perfecting it is quite another matter.