Pete Warden

Pete is the CTO of Jetpac Inc, a start-up focused on analyzing billions of public photos. He's been a recipient of an NSF grant for his computer vision work, worked on image processing at Apple for five years, and has published a number of popular open source data analysis projects and O'Reilly books. He is writing a book on deep learning and blogs at https://petewarden.com. He is @petewarden on Twitter.

The culture we’ve built has turned from a liberating revolution into a repressive incumbency.

Editor’s Note: 2014 has been a contentious year in the world around technology. I started a few times this week to write about it, but everything I wrote felt like a footnote to this clear piece from Pete Warden, which Pete has given us permission to repost. While I hope we can avoid the “nuking the entire thing from orbit” option, he lays out how things have changed — without changing enough — as nerd culture has reached new heights of influence. — Simon St. Laurent

My first girlfriend was someone I met through a MUD, and I had to fly 7,000 miles to see her in person. I read a paper version of the Jargon File at 15, and it became my bible. Just reading its descriptions of the Internet, I knew it was world-changing, even before the web, and as soon as I could, I snuck into the local university computer labs with a borrowed account to experience the wonder of Usenet, FTP, and Gopher. I chose my college because Turing had once taught there, and the designer of the ARM chip would be one of my lecturers. My first job out of college was helping port the original Diablo to the first PlayStation, and I spent five years writing games. I’ve dived deep into GPU programming. I’ve worked for almost two decades at both big tech companies and start-ups. I’ve spent countless hours writing about coding for the pure love of it. I’m a grown man who still plays Dungeons and Dragons!

Step-by-step instruction on training your own neural network.

When I first became interested in using deep learning for computer vision I found it hard to get started. There were only a couple of open source projects available, they had little documentation, were very experimental, and relied on a lot of tricky-to-install dependencies. A lot of new projects have appeared since, but they’re still aimed at vision researchers, so you’ll still hit a lot of the same obstacles if you’re approaching them from outside the field.

In this article, and the accompanying webcast, I’m going to show you how to run a pre-built network, and then take you through the steps of training your own. I’ve listed the steps I followed to set up everything toward the end of the article, but because the process is so involved, I recommend you download a Vagrant virtual machine that I’ve pre-loaded with everything you need. This VM lets us skip over all the installation headaches and focus on building and running the neural networks.

Announcing a new series delving into deep learning and the inner workings of neural networks.

When I first ran across the results in the Kaggle image-recognition competitions, I didn’t believe them. I’ve spent years working with machine vision, and the reported accuracy on tricky tasks like distinguishing dogs from cats was beyond anything I’d seen, or imagined I’d see anytime soon. To understand more, I reached out to one of the competitors, Daniel Nouri, and he demonstrated how he used the Decaf open-source project to do so well. Even better, he showed me how he was quickly able to apply it to a whole bunch of other image-recognition problems we had at Jetpac, and produce much better results than my conventional methods.

I’ve never encountered such a big improvement from a technique that was largely unheard of just a couple of years before, so I became obsessed with understanding more. To be able to use it commercially across hundreds of millions of photos, I built my own specialized library to efficiently run prediction on clusters of low-end machines and embedded devices, and I also spent months learning the dark arts of training neural networks. Now I’m keen to share some of what I’ve found, so if you’re curious about what on earth deep learning is, and how it might help you, I’ll be covering the basics in a series of blog posts here on Radar, and in a short upcoming ebook.

If you’re like me, you probably want to see it for yourself, to understand by experimenting. There are several great open-source solutions out there, but they’re largely aimed at academics and back-end engineers, with a lot of dependencies and a steep learning curve. The current state of software makes it look like the technology is destined to remain in data centers running on high-end machines.

The good news is that the code for running deep belief doesn’t have to be complex, and by expanding beyond data centers, the networks are going to add some spookily effective AI capabilities to all sorts of devices. You can check out a demo of a full deep belief network running in JavaScript and WebGL in real time on Jetpac’s website, and we have a tiny-footprint C version optimized for mobile ARM devices that runs the same full 60-million-connection network on everything from smartphones to Raspberry Pis, completing in under 300ms on an iPhone 5S.

There's a lot of new ground to be explored in large-scale image processing.

Jetpac is building a modern version of Yelp, using big data rather than user reviews. People are taking more than a billion photos every single day, and many of these are shared publicly on social networks. We analyze these pictures to discover what they can tell us about bars, restaurants, hotels, and other venues around the world — spotting hipster favorites by the number of mustaches, for example.

Treating large numbers of photos as data, rather than just content to display to the user, is a pretty new idea. Traditionally it’s been prohibitively expensive to store and process image data, and not many developers are familiar with both modern big data techniques and computer vision. That meant we had to cut a path through some thick underbrush to get a system working, but the good news is that the free-falling price of commodity servers makes running it incredibly cheap.

HubSpot has found the sweet spot between data, education and customer loyalty.

HubSpot's location (near Boston) and its target market (small businesses) may keep it under the radar of Silicon Valley, but the company's approach to data products and customer empowerment is worthy of attention.

It's time to accept and work within the limits of data anonymization.

Because we now have so much data at our disposal, any dataset with a decent amount of information can be matched against identifiable public records. To keep datasets available, we must acknowledge that foolproof anonymization is an illusion.
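
To make the matching idea concrete, here is a minimal sketch of how a dataset stripped of names might still be linked to a public list through shared attributes. Everything in it is an assumption for illustration: the `link_records` helper, the field names (`zip`, `birth_year`, `sex`), and all of the records are invented, and real linkage work is considerably messier than an exact-match join.

```python
# Toy linkage sketch: every record and field name here is invented for illustration.
def link_records(anonymized, public, keys):
    """Pair each anonymized row with a named public record when exactly one
    public record shares the same values for all of the given keys."""
    # Index the public records by their combination of key values.
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)

    matches = []
    for row in anonymized:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches.append((candidates[0]["name"], row))
    return matches


# A "de-identified" dataset: names removed, other attributes kept.
anonymized = [
    {"zip": "02139", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "94110", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
]

# A hypothetical public list that carries names alongside the same attributes.
public = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "94110", "birth_year": 1982, "sex": "M"},
    {"name": "Carl Example", "zip": "94110", "birth_year": 1982, "sex": "M"},
]

reidentified = link_records(anonymized, public, ["zip", "birth_year", "sex"])
for name, row in reidentified:
    print(name, "->", row["diagnosis"])
```

In this toy data the first row is tied to a single named person, while the second stays ambiguous because two public records share its attributes. The more attributes a dataset keeps, the fewer rows stay ambiguous in that way.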
