Sessions at Strata 2012 about MapReduce and Collaborative Filtering on Wednesday 29th February

Collaborative filtering is a method of making predictions about a user’s interests based on the preferences of many other users. It’s used to make recommendations on many Internet sites, including LinkedIn. For instance, the “Viewers of this profile also viewed” module on a user’s profile shows other profiles that were covisited with it. This “wisdom of the crowd” recommendation platform, built atop Hadoop, is applied to many kinds of entities on LinkedIn, including jobs and companies, and is a significant driver of engagement.

During this talk, I will build a complete, scalable item-to-item collaborative filtering MapReduce flow in front of the audience. We’ll then get into some performance optimizations, model improvements, and practical considerations: a few simple tweaks can yield an order-of-magnitude performance improvement and a substantial increase in clickthroughs over the naive approach. This simple covisitation method gets us more than 80% of the way to the more sophisticated algorithms we have tried.
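To make the covisitation idea concrete, here is a minimal sketch of what a naive item-to-item flow might look like, written as two map/reduce passes in plain Python. It is not the flow built in the talk: the in-memory `run_job` helper stands in for Hadoop, and every function name and the sample data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def run_job(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


def map_views(record):
    """Job 1 map: key each (user, item) view by the user."""
    user, item = record
    yield user, item


def reduce_pairs(user, items):
    """Job 1 reduce: emit each unordered pair of distinct items one user viewed."""
    for a, b in combinations(sorted(set(items)), 2):
        yield (a, b), 1


def identity_map(record):
    """Job 2 map: pass each (pair, 1) record through unchanged."""
    yield record


def sum_counts(pair, ones):
    """Job 2 reduce: total the covisits for one item pair."""
    yield pair, sum(ones)


def covisit_counts(views):
    """Chain the two jobs: views -> item pairs -> covisit counts."""
    pairs = run_job(views, map_views, reduce_pairs)
    return dict(run_job(pairs, identity_map, sum_counts))


def recommend(counts, item, k=3):
    """Return up to k items most often covisited with `item` (ties by name)."""
    scores = defaultdict(int)
    for (a, b), n in counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return sorted(scores, key=lambda other: (-scores[other], other))[:k]


# Hypothetical sample data: (user, viewed item) records.
views = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"),
]
counts = covisit_counts(views)
print(recommend(counts, "A"))  # ['B', 'C']
```

The pair-emitting reducer grows quadratically with the number of items a single user has viewed, which is one plausible reason a naive version would need the kind of performance tweaks the talk mentions.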