Grant Ingersoll, founder of the Mahout project, talks with Robert about machine learning. The conversation begins with an introduction to machine learning and the forces driving its adoption. Grant explains the three main use cases, similarity metrics, supervised versus unsupervised learning, and the use of large data sets. He also provides a brief history of the Mahout project and the connection between Mahout and Hadoop. The remainder of the episode dives into the three main use cases: recommendations, clustering, and classification. Grant and Robert discuss each use case, illustrating it with examples and a typical algorithm. Recommendation is a technique for identifying items that a user would like to buy, use, or otherwise consume based on the preferences of similar users. Clustering is the partitioning of the data set into a small number of sets of similar items. Classification is the assignment of new items to a small number of existing sets.
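
To make the recommendation use case concrete, here is a minimal sketch in Python of recommending items based on the preferences of similar users. It is not Mahout's implementation or anything shown in the episode: the names `jaccard`, `recommend`, and `likes`, the toy listening data, and the choice of Jaccard overlap as the similarity metric are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, likes, top_n=3):
    """Rank items the target user has not seen, weighted by how similar
    the users who like those items are to the target user."""
    seen = likes[target]
    scores = {}
    for user, items in likes.items():
        if user == target:
            continue
        sim = jaccard(seen, items)
        if sim == 0:
            continue  # users with nothing in common contribute nothing
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical data: each user maps to the set of genres they listen to.
likes = {
    "ann": {"jazz", "blues", "soul"},
    "bob": {"jazz", "blues", "funk"},
    "cat": {"metal", "punk"},
    "dan": {"jazz", "soul", "funk", "latin"},
}

print(recommend("ann", likes))
```

With this toy data, ann overlaps with bob and dan but not with cat, so the call returns funk first (liked by both similar users) and then latin. A real system would typically use ratings or play counts rather than plain sets, and a distributed implementation once the data no longer fits on one machine.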

When you were talking about ways machine learning is used, I immediately thought of lastfm.com. It’s a really cool website that tracks what music you’re listening to and makes suggestions for you based on what other people who listen to the same music like.