Google Uses Amazon’s Algorithm For YouTube’s Recommendation Engine

Greg Linden reports that Google has switched the algorithm they use for YouTube’s recommendation engine from their own to a variation of Amazon’s algorithm that was designed in the late 90s.

This is an interesting move, given that Google has the manpower and smarts to build a fairly good recommendation engine on its own. Yet here it opts for an algorithm Amazon designed in 1998? Of course, the best algorithms are enhancements on top of previous algorithms. Google’s own search algorithm is light-years beyond where it started with the original PageRank patent.

Here is a relevant snippet from Google’s RecSys 2010 paper:

Recommending interesting and personally relevant videos to [YouTube] users [is] a unique challenge: Videos as they are uploaded by users often have no or very poor metadata. The video corpus size is roughly on the same order of magnitude as the number of active users. Furthermore, videos on YouTube are mostly short form (under 10 minutes in length). User interactions are thus relatively short and noisy … [unlike] Netflix or Amazon where renting a movie or purchasing an item are very clear declarations of intent. In addition, many of the interesting videos on YouTube have a short life cycle going from upload to viral in the order of days requiring constant freshness of recommendation.

To compute personalized recommendations we combine the related videos association rules with a user’s personal activity on the site: This can include both videos that were watched (potentially beyond a certain threshold), as well as videos that were explicitly favorited, “liked”, rated, or added to playlists … Recommendations … [are the] related videos … for each video … [the user has watched or liked after they are] ranked by … video quality … user’s unique taste and preferences … [and filtered] to further increase diversity.
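As an aside from the quoted text, here is a minimal Python sketch of the idea in that excerpt: count which videos are watched together, then gather the related videos for everything a user has already watched or liked. It is an illustration under stated assumptions, not the paper's implementation. The function names (`build_covisitation`, `recommend`), the toy session data, and the use of raw co-visit counts as the score are all made up for this example, and the ranking by video quality, user taste, and diversity that the excerpt mentions is left out.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_covisitation(sessions):
    """Count how often each pair of videos appears in the same session.

    Assumption: a session is just a list of video ids.
    """
    co_counts = defaultdict(Counter)
    for session in sessions:
        # Each unordered pair of distinct videos counts once per session.
        for a, b in combinations(sorted(set(session)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def recommend(co_counts, seed_videos, top_n=3):
    """Collect related videos for every seed and rank them.

    Assumption: the score is the summed raw co-visit count; ties are
    broken alphabetically so the output is deterministic.
    """
    seen = set(seed_videos)
    scores = Counter()
    for seed in seed_videos:
        for related, count in co_counts.get(seed, {}).items():
            if related not in seen:  # never recommend what was already watched
                scores[related] += count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [video for video, _ in ranked[:top_n]]


# Hypothetical watch sessions, invented for this example.
sessions = [
    ["cats", "dogs", "birds"],
    ["cats", "dogs"],
    ["dogs", "fish"],
    ["cats", "birds"],
]
co_counts = build_covisitation(sessions)
recs = recommend(co_counts, ["cats"])
print(recs)  # ['birds', 'dogs']
```

Here the seed set is the user's watch history. A fuller version would also have to deal with the noise and freshness problems the paper describes, which this sketch ignores.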

To evaluate recommendation quality we use a combination of different metrics. The primary metrics we consider include click through rate (CTR), long CTR (only counting clicks that led to watches of a substantial fraction of the video), session length, time until first long watch, and recommendation coverage (the fraction of logged in users with recommendations). We use these metrics to both track performance of the system at an ongoing basis as well as for evaluating system changes on live traffic.
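Again outside the quoted text, three of the metrics named in that excerpt are simple enough to sketch in Python. This is a rough illustration only. The function names, the toy numbers, and especially the `min_fraction=0.5` cutoff for a "long" click are assumptions: the excerpt says only "a substantial fraction of the video" and gives no threshold.

```python
def click_through_rate(impressions, clicks):
    """CTR: clicks divided by the number of recommendations shown."""
    return clicks / impressions if impressions else 0.0


def long_ctr(impressions, watch_fractions, min_fraction=0.5):
    """Long CTR: only clicks that led to a substantial watch are counted.

    watch_fractions holds, for each click, the fraction of the video
    that was watched. min_fraction is an assumed cutoff.
    """
    long_clicks = sum(1 for f in watch_fractions if f >= min_fraction)
    return long_clicks / impressions if impressions else 0.0


def coverage(users_with_recs, logged_in_users):
    """Coverage: fraction of logged-in users who have recommendations."""
    if not logged_in_users:
        return 0.0
    return len(set(users_with_recs) & set(logged_in_users)) / len(set(logged_in_users))


# Hypothetical numbers, invented for this example.
impressions = 200
watch_fractions = [0.9, 0.1, 0.6, 0.05]  # one entry per click

ctr = click_through_rate(impressions, len(watch_fractions))
lctr = long_ctr(impressions, watch_fractions)
cov = coverage({"a", "b", "c"}, {"a", "b", "c", "d"})
print(ctr, lctr, cov)
```

With these made-up numbers, only half of the clicks count toward long CTR, which shows why the paper tracks it separately from plain CTR when interactions are short and noisy.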

Recommendations account for about 60% of all video clicks from the home page … Co-visitation based recommendation performs at 207% of the baseline Most Viewed page … [and more than 207% better than] Top Favorited and Top Rated [videos].