Jam Session – Deep Learning & Robotics

Vito Carleone uses some basic C++ for the embedded programming and a lot of bash scripting to smooth out the pipeline. I’ll be putting that code (along with the deep learning code) on GitHub soon (the plan is by the end of March). The base robot kit actually comes with some code samples to get you started. I found the sample code didn’t work all that well out of the box, but with a bit of modification, it was pretty good. The repo will also contain some gists that might help if you want to try setting up your own AWS deep learning environment.

For the heart of the project, everything is Python, which is what most people use these days in both industry and academia for deep learning (and machine learning in general). For this sort of work, there are some amazing open source Python libraries. Keras is the high-level library I use. It allows you to describe a neural network in very few lines of code. Under the hood, it uses Google’s TensorFlow library (or you can use other backend libraries instead), which in turn works with NumPy arrays for the low-level data manipulation. I also use TensorFlow directly for some stuff, and I think it’s a really good idea to learn TensorFlow (or another mid-level library) first before using Keras. You can even use NumPy to implement the math for the deep neural networks from scratch, which is a useful exercise but not very practical (it would be like rolling your own JS framework every time you make a new website).
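To make the "from scratch" exercise a bit more concrete, here is a minimal sketch of a one-hidden-layer network written with nothing but NumPy. The function names, the layer sizes, the learning rate, and the toy XOR data are all made up for illustration; none of it is taken from the robot's code, and a library like Keras would hide almost all of this behind a couple of calls.

```python
import numpy as np


def init_params(n_in, n_hidden, n_out, seed=0):
    """Small random weights and zero biases for a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, X):
    """Return the hidden activations and the network output for inputs X."""
    h = np.tanh(X @ params["W1"] + params["b1"])
    y_hat = h @ params["W2"] + params["b2"]
    return h, y_hat


def mse(y_hat, y):
    """Mean squared error over every output element."""
    return float(np.mean((y_hat - y) ** 2))


def gradients(params, X, y):
    """Backpropagation by hand: gradient of the MSE for each parameter."""
    h, y_hat = forward(params, X)
    d_out = 2.0 * (y_hat - y) / y.size                     # dLoss/dy_hat
    d_hidden = (d_out @ params["W2"].T) * (1.0 - h ** 2)   # through tanh
    return {
        "W1": X.T @ d_hidden,
        "b1": d_hidden.sum(axis=0),
        "W2": h.T @ d_out,
        "b2": d_out.sum(axis=0),
    }


def train(params, X, y, lr=0.1, steps=2000):
    """Plain full-batch gradient descent; updates params in place."""
    for _ in range(steps):
        grads = gradients(params, X, y)
        for name in params:
            params[name] -= lr * grads[name]
    return params


# Toy data (XOR), chosen only because it is tiny and needs a hidden layer.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

params = init_params(n_in=2, n_hidden=4, n_out=1)
loss_before = mse(forward(params, X)[1], y)
train(params, X, y)
loss_after = mse(forward(params, X)[1], y)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Even at this size you have to derive and debug the gradients yourself, which is exactly the part the higher-level libraries take off your hands.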

If you just want to play around with some cool computer vision stuff, check out Darknet. You can get up and running with some basic object detection etc. quite quickly on your computer just using your CPU. For the really fun stuff, you’ll need to use a GPU. I use AWS g2 instances for the fancier stuff, and there are some prebuilt AMIs that you can use as a point of departure.

To learn the libraries above and the associated techniques, there are some great courses out there:

For machine learning in general, I particularly enjoyed this University of Washington course, although the data manipulation library it uses (called GraphLab Create) is not the industry standard, so you’d also want to find some tutorials on pandas. The APIs for GL Create and pandas are really similar, but the naming conventions are different.

For deep learning in particular, Andrew Ng has a fantastic set of courses on Coursera. My one criticism is that he handles most of the real-world challenges for you (i.e. collecting good data, and converting it into usable structures). That makes the course fun because you can move fast, but it also makes it quite hard to apply what you’ve learned to real problems later.

For applying deep learning specifically to self-driving cars, a lot of what I showed in the presentation is adapted from the deep learning module of the self-driving car nanodegree. It’s more expensive than the Coursera courses, and is also way harder, but totally worth it if you have some spare money and time.

A free alternative is this MIT course. My experience of the MIT online materials is that they’re harder to learn from than courses that are specifically designed to be online, like Coursera and especially Udacity.

Finally, here’s the NVIDIA paper that the CNN I used comes from. I think it’s fairly approachable for non-specialists, so it’s worth taking a look at even if you haven’t studied deep learning in depth.