Machine Learning for Hackers

"Machine Learning for Hackers" is a new book from O'Reilly Media by Drew Conway and John Myles White. A "hacker", here, is "someone who likes to solve problems and experiment with new technologies", and "Machine Learning" is usually thought of as a black-box, algorithmic approach to producing predictions or classifications from data.

This book, however, takes a pleasingly statistical approach to real-life prediction and classification problems. Rather than merely providing a "cookbook" approach to, say, building a "who to follow" recommendation system for Twitter, it takes the time to explain the methodology behind the algorithms and gives the reader a better basis for understanding why these methods work (and, equally importantly, how they can go wrong).

An analysis of author Drew Conway's Twitter network, classified by topic area favored by each Twitter user.

The book assumes familiarity with command-line scripting, programming, and algorithms in general. It does, however, give a gentle introduction to the R programming language, which is used to implement all of the examples. (The R scripts and associated data are also available for download.) In fact, this section also does double duty as an introduction to some of the basics of statistical thinking (moments, distributions, visualization, etc.), which is a very welcome addition in a "machine learning" book. The book is also rich with data visualizations (mostly created with the ggplot2 package), which not only help explain the algorithms but also serve as a useful demonstration in their own right of the value of data visualization in the data modeling process.

Machine Learning for Hackers is available for purchase now in hardcopy or digital format from the link below. I recommend it to any programmer who needs to generate predictions or classifications from data. Using R and learning more about the statistical techniques behind the methods will help you create better data hacking applications in the long run.