Friday, May 11, 2007

DARPA Urban Challenge: Obstacle avoidance - Isolating important cues

I have had requests from several parties to give an overview of the types of obstacle avoidance schemes that might be most promising. Right now, we (Pegasus) are still evaluating some of these, so this entry should not be construed as part of our algorithm unveiling entries but rather as a general overview we did a while back. It is important to realize that the main contribution of this entry is really about defining a hardware + software solution for localizing the cues that will later be learned/categorized. One large component of an obstacle avoidance solution is the machine learning/statistical device used to rapidly identify these cues as problematic for the autonomous vehicle or not. This is not unlike the human cortex (see reference section [1] [2] [3]).

In the case of the Urban Challenge, we are facing not only stopped obstacles but moving ones as well. The moving obstacles have behaviors that one needs to learn as well. In other words, in the case of vision, a lot of the work boils down to producing a small number of cues/features from a very large set of data (the pixels of an image). In some areas of computer science this is called dimensionality reduction.
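To make the idea of dimensionality reduction concrete, here is a minimal sketch using principal component analysis (PCA), one standard technique among many. It is not something we are claiming to use on the vehicle. The `samples` data and the `top_principal_component` helper are made up for illustration: each sample has three numbers, and we squeeze it down to a single feature along the direction of largest variance.

```python
def top_principal_component(rows, iterations=100):
    """Return (mean, unit direction of largest variance) via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[i] for r in rows) / n for i in range(d)]
    centered = [[r[i] - mean[i] for i in range(d)] for r in rows]
    # Covariance matrix of the centered data (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


# Hypothetical data: 3-D points lying almost exactly along the line (1, 2, 2).
samples = [(t + 0.1 * (-1) ** t, 2 * t, 2 * t - 0.1 * (-1) ** t)
           for t in range(-5, 6)]

mean, direction = top_principal_component(samples)

# One feature per sample instead of three raw numbers.
features = [sum((s[i] - mean[i]) * direction[i] for i in range(3))
            for s in samples]
```

With real images the raw vectors would have thousands of entries instead of three, and one would typically keep more than one component, but the principle is the same.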

Stereo-imaging:

The fly algorithm is a robust stereo algorithm that runs a genetic algorithm in real time (yes, there is such a thing as a real-time genetic algorithm!). It has been tried on cars, specifically to avoid people and other objects. The initial thesis on the algorithm is in French. Improvements since the thesis have focused on the car driving experience.

There are also numerous commercial solutions, listed and discussed one by one by the folks at V2_lab. I found this entry pretty revealing about the state of affairs with regard to stereovision; you have to look at the comment section:

For most stereo matching algorithms the Firewire cameras produce higher quality uncompressed images that do not wreak havoc on sensitive feature detectors. I use the Unibrain Fire-I board camera http://www.unibrain.com/index.html with the CMU 1394 Digital Camera API (Windows), which gives you very complete control and works with just about any Firewire camera, because they all use the same standard interface. http://www.cs.cmu.edu/~iwan/1394/ . When I read the technical reports for the 2005 DARPA Grand Challenge, almost every report showed pictures of vehicles equipped with stereo pairs of cameras, but at the race just about all of them had been removed, presumably because of basic issues such as camera synchronization.

Monocular evaluation of the distance field:

There are two approaches that caught our attention:

(With regard to monocular information, we should not forget the excellent Mono-SLAM: this website has a Matlab implementation of SLAM using monocular vision. There is a timely thesis on the subject here, which appears to use two cameras, each running the MonoSLAM algorithm.)

A Random Lens Imager is a hardware implementation of a totally new concept in data processing known as compressed sensing (don't ask anybody around you about it because it is too new). It needs only one camera, but much of the work goes into the calibration.
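For readers who have not met compressed sensing, here is a minimal sketch of the core idea: a signal with only a few nonzero entries can be recovered from far fewer random measurements than its length. Everything in it is an assumption made for illustration. The random ±1 matrix stands in for whatever response the calibration of a real imager would provide, the sizes (128 unknowns, 64 measurements, 3 nonzeros) are arbitrary, and orthogonal matching pursuit (OMP) is just one common recovery algorithm, not necessarily the one used with the Random Lens Imager.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = s / aug[r][r]
    return z


def omp(columns, y, k):
    """Orthogonal matching pursuit; columns[j] is column j of the sensing matrix."""
    support, coeffs = [], []
    residual = y[:]
    for _ in range(k):
        # Pick the unused column most correlated with the current residual.
        candidates = [j for j in range(len(columns)) if j not in support]
        best = max(candidates, key=lambda j: abs(dot(columns[j], residual)))
        support.append(best)
        # Least-squares fit on the chosen columns (normal equations).
        gram = [[dot(columns[a], columns[b]) for b in support] for a in support]
        coeffs = solve(gram, [dot(columns[a], y) for a in support])
        residual = [yi - sum(c * columns[j][i] for c, j in zip(coeffs, support))
                    for i, yi in enumerate(y)]
    return dict(zip(support, coeffs))


rng = random.Random(0)
n, m = 128, 64  # signal length, number of measurements (assumed sizes)
columns = [[rng.choice((-1.0, 1.0)) / m ** 0.5 for _ in range(m)]
           for _ in range(n)]

truth = {5: 3.0, 40: -2.0, 97: 1.5}  # hypothetical sparse signal
y = [sum(v * columns[j][i] for j, v in truth.items()) for i in range(m)]

recovered = omp(columns, y, k=3)
```

Here `recovered` maps the index of each nonzero entry to its estimated value. In a real imager the measurements are noisy and the matrix has to be measured during calibration, which is where much of the work goes.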