Tuesday, February 01, 2011

YouTube uses Amazon's recommendation algorithm

In a paper at the recent RecSys 2010 conference, "The YouTube Video Recommendation System" (ACM), eleven Googlers describe the system behind YouTube's recommendations and personalization in detail.

The most interesting disclosure in the paper is that YouTube has switched from their old recommendation algorithm based on random walks to a new one based on item-to-item collaborative filtering. Item-to-item collaborative filtering is the algorithm Amazon developed back in 1998. Over a decade later, it appears YouTube found a variation of Amazon's algorithm to be the best for their video recommendations.
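For readers unfamiliar with the algorithm, here is a minimal sketch of item-to-item collaborative filtering: count, for each pair of items, how many users touched both, then normalize by popularity (cosine similarity over binary user vectors). This is a toy illustration, not Amazon's or YouTube's actual implementation; real systems add weighting, thresholds, and heavy engineering for scale.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def item_to_item_similarities(user_histories):
    """Build an item-item similarity table from per-user item sets."""
    item_counts = defaultdict(int)   # how many users touched each item
    pair_counts = defaultdict(int)   # how many users touched both items in a pair
    for items in user_histories:
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
    # Cosine similarity over binary user vectors:
    # sim(a, b) = |users(a) ∩ users(b)| / sqrt(|users(a)| * |users(b)|)
    return {
        (a, b): both / sqrt(item_counts[a] * item_counts[b])
        for (a, b), both in pair_counts.items()
    }

# Tiny hypothetical example: three users' watch histories.
histories = [{"v1", "v2"}, {"v1", "v2", "v3"}, {"v2", "v3"}]
sims = item_to_item_similarities(histories)
```

At serving time, the precomputed table makes recommendation a cheap lookup: for each item a user has touched, fetch its most similar items.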

Other notable tidbits in the paper are what the Googlers have to do to deal with noisy information (noisy video metadata and user preference data), the importance of freshness for videos (much like news), that they primarily used online measures of user satisfaction (like CTR and session length) when competing different recommendation algorithms against each other and tuning each algorithm, and the overall improvement (about 3x) that recommendations showed over simple features that just surfaced popular content.

Some excerpts from the paper:

Recommending interesting and personally relevant videos to [YouTube] users [is] a unique challenge: Videos as they are uploaded by users often have no or very poor metadata. The video corpus size is roughly on the same order of magnitude as the number of active users. Furthermore, videos on YouTube are mostly short form (under 10 minutes in length). User interactions are thus relatively short and noisy ... [unlike] Netflix or Amazon where renting a movie or purchasing an item are very clear declarations of intent. In addition, many of the interesting videos on YouTube have a short life cycle going from upload to viral in the order of days requiring constant freshness of recommendation.

To compute personalized recommendations we combine the related videos association rules with a user's personal activity on the site: This can include both videos that were watched (potentially beyond a certain threshold), as well as videos that were explicitly favorited, “liked”, rated, or added to playlists ... Recommendations ... [are the] related videos ... for each video ... [the user has watched or liked after they are] ranked by ... video quality ... user's unique taste and preferences ... [and filtered] to further increase diversity.
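The candidate generation and ranking described in that excerpt can be sketched roughly as follows. The `related` table, `quality` scores, and the watched-video filter are my stand-ins for the association rules, video quality signal, and post-filtering the paper describes; the paper's real ranking also folds in user taste and diversity filtering, which this toy version omits.

```python
def recommend(seed_videos, related, quality, watched, top_n=5):
    """Expand the user's watched/liked videos via the related-videos
    table, drop already-seen videos, and rank by a quality score."""
    candidates = set()
    for video in seed_videos:
        candidates.update(related.get(video, []))
    candidates -= set(watched)
    return sorted(candidates, key=lambda v: quality.get(v, 0), reverse=True)[:top_n]

# Hypothetical related-videos table and quality scores.
related = {"a": ["b", "c"], "b": ["c", "d"]}
quality = {"b": 0.9, "c": 0.5, "d": 0.7}
recs = recommend(["a", "b"], related, quality, watched=["a", "b"])
```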

To evaluate recommendation quality we use a combination of different metrics. The primary metrics we consider include click through rate (CTR), long CTR (only counting clicks that led to watches of a substantial fraction of the video), session length, time until first long watch, and recommendation coverage (the fraction of logged in users with recommendations). We use these metrics to both track performance of the system at an ongoing basis as well as for evaluating system changes on live traffic.

Recommendations account for about 60% of all video clicks from the home page ... Co-visitation based recommendation performs at 207% of the baseline Most Viewed page ... [and more than 207% better than] Top Favorited and Top Rated [videos].

For more on the general topic of recommendations and personalization on YouTube, please see my 2009 post, "YouTube needs to entertain".

By the way, it would have been nice if the Googlers had cited the Amazon paper on item-to-item collaborative filtering. Seems like a rather silly mistake in an otherwise excellent paper.

Update: To be clear, this was not intended as an attack on Google in any way. Googlers built on previous work, as they should. What is notable here is that, despite another decade of research on recommender systems, despite all the work in the Netflix Prize, YouTube found that a variant of the old item-to-item collaborative filtering algorithm beat out all others for recommending YouTube videos. That is a very interesting result and one that validates the strengths of that old algorithm.

We might be talking about different things here. Are you talking about related items? Showing a list of similar items for an item? Like association rules? That's different than collaborative filtering (which uses the behavior of similar users to find items a user might like). Not sure what you are saying, could you clarify? What exactly did you see at Firefly?

Hi, Daniel. Yes, and, if we're looking back to early references, a 1992 paper out of Xerox PARC on a system called Tapestry is also earlier work (though that paper describes filtering information much more than recommendations; it depends where you draw the line). Sue Dumais also has some work in the early 1990s on information filtering that might be applicable, again, depending on what you consider collaborative filtering and recommender systems.

But item-to-item recommendation algorithms are not the same as what are described in those papers. There certainly was earlier work in recommender systems before Amazon, but this particularly efficient and effective algorithm for recommender systems was invented at Amazon.

Item-to-item recommendation is merely replacing the term frequency vector with a purchase frequency vector and using the same cosine (or similar) similarity measure to compute a similarity matrix for documents (or items). In addition, it is highly correlated with market basket analysis or association rule analysis. (To be clear, what YouTube used is association rule analysis as their scoring function, rather than cosine vector similarity in the item-to-item recommendation setting.) If these things are made clear, Amazon may be the first to "APPLY" the technique in the item recommendation domain commercially, but will never be the first to invent such a standard textbook technique, even though a capable lawyer is able to construct a successful patent application from manipulated and invented terminology.

@m, Amazon was the first to use this technique either academically or commercially. And, yes, new inventions often seem head-slappingly obvious -- why didn't I think of that? -- in retrospect, as you seem to be saying, though they clearly were not obvious beforehand. PageRank, for example, seems obvious in retrospect, just an application of graph propagation. That does not make that work any less important.

@Dinesh, in my read, the core algorithm is the same, but there is a little twiddle to the measure of similarity in the first part, and there is a bigger post-processing layer that does a bit of additional reordering. The key point here is that this algorithm has endured for more than a decade (and through all the work of the Netflix Prize), and it keeps proving to be efficient and effective.

Item-item collaborative filtering is not a fundamentally new algorithm compared to the original collaborative filtering algorithm proposed in Resnick et al.'s paper.

Also, Amazon is not the first commercial implementation of collaborative filtering. I believe that was Net Perceptions.

There are two implementations of collaborative filtering, both based on the same user-item matrix. Let users be the rows, and items be the columns.

User-user collaborative filtering finds rows which are similar, yielding similar users - then if user B is similar to user A, we can recommend items consumed by user A to user B. This was the original method in Resnick et al. and used by Net Perceptions.

Item-item collaborative filtering finds columns which are similar, yielding similar items - then if item A and item B are similar, we show item B to users who consumed item A. This is what Amazon uses, as do most commercial applications today.

Item-item collaborative filtering has been shown to be more effective, because sparsity of data makes conclusions about user-user similarity rather suspect.

Statistically, item-item collaborative filtering is an implementation of a clustering algorithm, with the formula used to determine similarity of two different item column vectors acting as the similarity measure between items that is needed for clustering [an example is the cosine of the two vectors]. The advantage of using collaborative filtering for clustering rather than traditional methods like k-means or even SVD is that the former does not require iteration over the set of columns; it is [invariably] a one-pass algorithm, at the expense of reduced yet acceptable optimality.
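The row/column contrast in the comment above can be sketched in a few lines: user-user similarity compares rows of the user-item matrix, item-item similarity compares columns, using the same cosine measure either way. The matrix below is a made-up toy example.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Rows are users, columns are items (1 = consumed). Hypothetical data.
matrix = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

# User-user: compare rows.
user_sim = cosine(matrix[0], matrix[1])
# Item-item: compare columns (transpose the matrix).
cols = list(zip(*matrix))
item_sim = cosine(cols[0], cols[1])
```

With many more users than items consumed per user, the column vectors are far denser than any pair of row overlaps, which is the sparsity argument for preferring item-item similarity.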

In general though, all collaborative filtering methods suffer from typical problems listed in many academic papers, like the cold start problem, popular item dominance, etc. They really work best in massive scale systems [both # of users and # of items]. This is why it did not work that well at Netflix, which has a small number of items [in the 100Ks rather than 100s of millions at Amazon and YouTube], so they needed better methods like the ensemble approach that won the Netflix prize.

Having said this, it is truly a shame if the YouTube paper did not refer to the Amazon work. Running a collaborative filtering system with 100s of millions of users and items [imagine the size of the matrix] before we had tools like Hadoop, HBase and Cassandra is an engineering marvel, and Amazon blazed the trail in this. Wonder who the reviewers for the YouTube paper were ...

Thanks, Jeyendran, I think we mostly agree. The only spot I would differ is that I think that, before Amazon's item-to-item collaborative filtering, we had an algorithm that was not scalable or efficient, so it was not useful at scale. As you said, no one uses the original collaborative filtering algorithm anymore, but people are still using item-to-item collaborative filtering.

Hi there, I found your article very informative. Recently, I have been working on an expertise recommendation system for (collaborative) software development. Can you suggest some literature on data mining techniques for this? I would really appreciate it. ^_^