formerly cdixon.posterous.com

why netflix’s recommendations are more interesting than amazon’s

I have always found Netflix’s recommendations to be more interesting than Amazon’s. When you buy a Bruce Willis movie on Amazon, they often recommend other Bruce Willis movies. Netflix, on the other hand, is much more likely to suggest movies I wouldn’t have thought of or even heard of. Why is this?

Generally, recommendation systems can be tuned either to simply please users (by emphasizing popular items) or to wow them (by emphasizing less popular items).
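One way to picture that tuning knob is a single parameter that discounts popular items when ranking. The sketch below is only an illustration under my own assumptions: the function `rank_items`, the parameter `alpha`, the log-based discount, and the sample numbers are all hypothetical, and nothing here is known to be what Amazon or Netflix actually does.

```python
import math

def rank_items(scores, popularity, alpha):
    """Rank item ids by predicted relevance, discounted by popularity.

    scores: dict item -> predicted relevance for one user (made-up values)
    popularity: dict item -> how many users already rented or bought it
    alpha: 0 ranks on relevance alone; larger values push popular items down
    """
    def adjusted(item):
        # log(2 + n) stays positive even for items nobody has touched yet
        return scores[item] / (math.log(2 + popularity[item]) ** alpha)
    return sorted(scores, key=adjusted, reverse=True)

# Hypothetical catalog: relevance is similar, popularity differs wildly.
scores = {"blockbuster": 0.9, "cult_classic": 0.8, "obscure_gem": 0.7}
popularity = {"blockbuster": 100000, "cult_classic": 2000, "obscure_gem": 50}

safe = rank_items(scores, popularity, alpha=0.0)  # "please the user"
bold = rank_items(scores, popularity, alpha=1.0)  # "wow the user"
```

With `alpha=0.0` the blockbuster comes first; with `alpha=1.0` the order flips and the obscure title leads, even though the underlying relevance scores never changed.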

I think the difference is due not to a difference in expertise but to a difference in goals. Amazon sells products and Netflix rents them (at least in the case of physical DVDs). Amazon wants an algorithm that simply optimizes sales, and showing a popular item is more likely to lead to a sale.

Netflix has a different challenge. They have an inventory of DVDs. If everyone is renting the most popular movies, they have to stock tons of those movies, and the “long tail” movies sit idle in their warehouse. They want an algorithm that gets people to go down the tail and watch movies that would otherwise go unwatched, thereby keeping more of their inventory in circulation. Hence the algorithm is tuned toward “going out on a limb,” which makes the recommendations more interesting.

Zeidel: that was a great program and certainly helped Netflix. But I’m assuming Amazon has such massive resources that they could match Netflix algorithmically if they wanted to, so I’m assuming the “conservativeness” of their algorithm is by choice.

Netflix also has the luxury of focus. Their algorithm is designed solely for movies. My guess is AMZN focuses their recommendations on items that are much higher margin than movies (which fits with your sale vs rent argument).

Fair. One might say that internal resources do not match the wisdom (expertise) of the crowd. That said – what are your thoughts on long tail recommendations now that their model has moved to streaming?

Move to streaming – they would have less economic incentive to “go out on a limb” (assuming streaming costs are the same for all movies; I don’t know). But maybe they’ll keep the algorithm that way just because people like it? Who knows.

Interesting times. Don’t know if you saw this post this morning, but it outlines the economic model. As Netflix moves from DVDs to streaming, the invoice from the Hollywood studios has grown from $229MM three months ago to $1.2B: http://bit.ly/bPALyi
Good post – thanks for the fodder.

The Netflix Prize helped a great deal on accuracy; however, those tuning parameters are still something that can be decided.
I collaborated with Liang Xiang (one of the top solo contestants on the Netflix Prize) to later win the GitHub contest. He published a paper based on our work, where we varied how conservative the algorithm was based on how eclectic a person’s tastes were.
For new users, Netflix can recommend popular movies, building trust. Later, with a better profile, they can go out on a limb and suggest less popular fare. The tuning parameter doesn’t have to be global.
Too often, rec engines are seen as a pure mathematical / numeric optimization problem when there’s a huge strategy component to choosing the behaviour we want to encourage. I think you hit the nail on the head when emphasizing Netflix’s trying to delight vs. Amazon just trying to sell.
A last example: there are a couple dozen movies that accounted for a massive amount of error on the Netflix Prize (~15% of the remaining IIRC). Although it was termed the Napoleon Dynamite problem, the #1 movie for aggregate error in that data set was What Women Want. When the goal is the best customer experience, you could remove those items, stop recommending them to new users or just warn customers.
Speaking of rec engines, any idea why Apple is so bad at them?
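The point in this comment that the tuning parameter need not be global can be sketched as a per-user setting. To be clear, this is not the method from the paper mentioned above, which I have not reproduced. The function `user_alpha`, the eclecticism measure (one minus the share of the user's most-watched genre), and the `min_history` threshold are all hypothetical choices made for illustration.

```python
from collections import Counter

def user_alpha(history_genres, max_alpha=1.0, min_history=20):
    """Return a per-user 'adventurousness' setting between 0 and max_alpha.

    history_genres: one genre label per title the user has rated.
    New users get a value near 0 (stick to popular picks and build trust);
    users with long, varied histories get a value closer to max_alpha.
    """
    if not history_genres:
        return 0.0
    # How much do we trust the profile? Ramps from 0 to 1 with history size.
    confidence = min(1.0, len(history_genres) / min_history)
    # Hypothetical eclecticism measure: 0 if everything is one genre.
    top_share = Counter(history_genres).most_common(1)[0][1] / len(history_genres)
    eclecticism = 1.0 - top_share
    return max_alpha * confidence * eclecticism

new_user = user_alpha(["action", "drama"])
action_fan = user_alpha(["action"] * 40)
eclectic_fan = user_alpha(["action", "drama", "comedy", "horror"] * 10)
```

Here the brand-new user and the single-genre fan both end up with conservative settings, while the long-time viewer with varied tastes gets the most adventurous one.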

I think it’s apples and oranges. Key difference: Netflix makes more money the longer you stay a member, Amazon makes more money the more stuff you buy. And I posit that the way they are measuring the performance of the algorithms is sensitive to this key difference in their business models, and this leads to the more “conservative” nature of Amazon’s recommendations.
I would imagine that Amazon is measuring the performance of its recommendations by conversion to additional sales. Obscure, less popular choices, though interesting, are not as likely to generate revenue. The only thing that matters to the algorithm team is revenue generated from the recommendations they are showing.
Netflix recommendations aren’t a direct cross-sell but are tightly woven into the experience of using the product. Softer measures like customer satisfaction (which probably ends up influencing lifetime value and $$$s at some point) are probably driving its optimization. So instead of measuring immediate conversion they’re taking a longer view on cust sat, experience, and LTV.
Any PA/DM algorithm will grow in the direction you incent it to. It’s not a question of cleverness but of which direction you’re optimizing towards.

I think ecomm sites, particularly those that own inventory and fulfillment, have much the same challenge as Netflix: long tail inventory that sits idle and unproductive in warehouses. Netflix’s recommendation approach could be a data-based solution (in addition, obviously, to sales) to proving inventory winners and losers. Like this comment: “They want an algorithm that gets people to go down the tail and watch movies that would otherwise go unwatched, thereby keeping more of their inventory in circulation.”
This can 1) identify new inventory winners that have been overlooked in the past and potentially raise them to top sellers/renters and 2) prove losers via non-action from users seeing the recommendation, which could lead to eliminating unprofitable inventory. Reminds me of Fred’s tweet/comment a few weeks back about how his non-action on Twitter’s recommended followers should be as strong a signal as those he does choose to follow.
Not sure about Amazon, but maybe the Netflix rec approach could play a role in other ecomm sites that would normally stick with the Amazon model.

Netflix also customizes movie ratings (stars) based on what you’ve watched/liked in the past, i.e. the ratings for Iron Man could be different for you and me. I would think that Amazon (or any other retailer) would benefit from adding such wrinkles to their recommendation/rating systems.

While Netflix’s tendency to “go out on a limb” and create demand for the tail is understandable and attractive in many situations, I wish they wouldn’t so blatantly hide the newer movies. I am dissatisfied with their recommendations and turn to other sites, like Amazon, to discover more recent releases. Netflix is even worse at helping you discover recent options on demand, probably because the older stuff costs less. I’m always amazed by the way people refer to Netflix’s recommendation engine – they’re clearly optimizing for their bottom line over an ideal customer experience, and because of this I would switch to a reasonable competitor in a heartbeat.

If you don’t like Netflix recommendations you can always turn to alternative sites like http://Filmaster.com (which has an open source movie recommendation algorithm similar to Netflix’s but with no goal to deceive you) or Criticker (a similar site, not open source though).

Was just discussing this today. I think another reason is the nature of the algorithm. Amazon is generating recommendations based on common subjects: people, topics, places, etc. (nouns). Netflix is more sensitive to the nature of storytelling, and looks at what *connects* those subjects. In other words, Netflix categorizes content by identifying more “meta” metadata, like the nature of the story arc (ex: ironic tragedies), the type of narrative (nerds overcome jocks), or the theme (hipster romance). Amazon’s algorithm makes more sense for them because it applies to a range of products, from sweaters to toasters to films, while Netflix has the luxury of optimizing for movies–visual stories.

Hi Chris. Nate and I interviewed Robert Bell and Chris Volinsky of the BellKor team at AT&T’s R&D lab in upstate NY for the book project earlier this year. The Netflix algorithms use matrix models that figure out what the best dimensions to describe movies are and what distinguishes movies and tastes from one another. The model automatically figures out what dimensions to put people on, and then to put movies on. The most basic dimension uses linear modeling to declare that if someone likes one movie, they will not like another. The second dimension separates movies orthogonally, like with rom coms on one side and Eraserhead on the other. They do this 50 dimensions deep, and after the 10th dimension there aren’t really any words in the English language to describe the movies that occupy those spaces, but the computer’s numerical definition of those spaces can match them to their likely viewers.
I don’t know about the stocking thing; it probably has something to do with the recommendations. But I think what they are far more concerned with is showing people things they like and not showing them things they won’t like, because the ultimate goal is getting them to subscribe for another year. Rate stuff you love and also stuff you hate and it will keep turning up delightfully weird selections for you.
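The “matrix models” described in this comment can be illustrated with a toy latent-factor model: every user and every movie gets a short vector of learned numbers (the “dimensions”), and a predicted rating is the dot product of the two. This is only a sketch under my own assumptions, not BellKor’s actual system: the training method (plain stochastic gradient descent), the two dimensions instead of 50, the hyperparameters, and the made-up ratings are all hypothetical.

```python
import random

def predict(P, Q, u, i):
    """Predicted stars = dot product of user vector and movie vector."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def factorize(ratings, n_users, n_items, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn k-dimensional vectors for users (P) and movies (Q).

    ratings: list of (user_index, movie_index, stars), observed ratings only.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, stars in ratings:
            err = stars - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Nudge both vectors to shrink the error, with mild regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(P, Q, ratings):
    """Root mean squared error over the given ratings."""
    total = sum((stars - predict(P, Q, u, i)) ** 2 for u, i, stars in ratings)
    return (total / len(ratings)) ** 0.5

# Made-up data: movies 0-1 are rom coms, movies 2-3 are Eraserhead-like.
# Users 0-1 love rom coms, users 2-3 love the weird stuff.
full = [[5, 5, 1, 1], [5, 5, 1, 1], [1, 1, 5, 5], [1, 1, 5, 5]]
hidden = {(0, 1), (0, 3)}  # ratings user 0 has not entered yet
observed = [(u, i, full[u][i]) for u in range(4) for i in range(4)
            if (u, i) not in hidden]

P, Q = factorize(observed, n_users=4, n_items=4)
liked = predict(P, Q, 0, 1)     # unseen rom com for a rom com fan
disliked = predict(P, Q, 0, 3)  # unseen Eraserhead-like movie for the same user
```

Nobody tells the model what the two dimensions mean; it only sees ratings. On this toy data it should still score the unseen rom com well above the unseen Eraserhead-like title for user 0.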

It’s my understanding that Amazon uses an item-to-item collaborative filtering technique, which would take the Bruce Willis movie and try to find other items like it (more Bruce Willis movies). Basically, you decide how similar two items are and populate a matrix based on these relationships. Based on the excellent summary by arikia above, it seems that Netflix uses a combination of an item-based approach and a user-based approach, which turns up those “delightfully weird selections.”
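The item-to-item idea described here (decide how similar two items are, then fill a matrix with those similarities) can be sketched in a few lines. This is a simplified illustration, not Amazon’s production algorithm: the cosine measure over sets of buyers, the function names `item_similarities` and `similar_items`, and the purchase data are my own assumptions.

```python
import math
from collections import defaultdict

def item_similarities(purchases):
    """Build a sparse item-item similarity table.

    purchases: dict user -> set of item ids that user bought.
    Returns dict (item_a, item_b) -> cosine similarity of their buyer sets.
    """
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    sims = {}
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:  # store only pairs with at least one shared buyer
                sims[(a, b)] = overlap / math.sqrt(len(buyers[a]) * len(buyers[b]))
    return sims

def similar_items(item, sims, n=2):
    """Return up to n items most similar to the one being viewed."""
    candidates = [(score, b) for (a, b), score in sims.items() if a == item]
    return [b for score, b in sorted(candidates, reverse=True)[:n]]

# Made-up purchase histories.
purchases = {
    "ann": {"Die Hard", "Armageddon"},
    "bob": {"Die Hard", "Armageddon", "Pulp Fiction"},
    "cat": {"Die Hard", "Pulp Fiction"},
    "dan": {"Eraserhead", "Pulp Fiction"},
}
sims = item_similarities(purchases)
recs = similar_items("Die Hard", sims)
```

Note that the lookup uses only the item on screen and nothing about the person viewing it, which is why someone looking at one Bruce Willis movie is shown more Bruce Willis movies while Eraserhead never comes up.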

I think you also have to consider the data that Amazon and Netflix have available to train their algorithms on. Amazon has past purchasing behaviour, Netflix has past renting/streaming behaviour. Perhaps people are more adventurous in what they rent than what they buy?

I think the difference with recommendation systems has more to do with context and data than with goals.
Amazon’s data for recommendations probably looks very different from Netflix’s. Netflix has a large number of ratings per user. They have the list of movies the user has rented, but it’s more important to know what the user thought of the movies they rented, not just that they rented the movie.
Amazon, on the other hand, may have much less information per user. I haven’t purchased as many movies from Amazon as I have rented from Netflix. More importantly, even of the items I have purchased from Amazon, I have rated very few. A big part of the reason I have rated so few is that, in the past, you had to write a review to rate. I think you’re allowed to rate without writing a review now, but I don’t know when that was enabled and the functionality is buried far from where the average star rating of the item is shown – you can’t just click on the stars. The percentage of users willing to write a review is vastly different from the percentage of users willing to rate an item. For example, on our site (http://www.discovereads.com), virtually *every* user on the site has rated books, and the average number of books rated per user is quite high. Contrast that to the normal expected 1-3% of users who write reviews (this seems pretty consistent across industries), and among those users, there’s probably a very small percentage who write more than a handful of reviews. If you have very little preference data per user, it’s really not possible to do the kind of personalization that the Netflix prize focused on. You fall back on averages of things, such as “average ratings”, “on average users who bought this also bought that”, etc. But if you look at various data sets, on average, only 30% of users who rate an item agree with the average rating. The vast majority of the time, the average rating is wrong (for the user viewing it).
And if you look at Amazon product pages that say “Customers who looked at this bought…” you’ll see that very often the most popular item recommended has a single digit percentage associated with it.
There’s also a significant context difference (what is known at the time a recommendation must be made).
To get recommendations on Netflix, you have to log in. Netflix knows exactly who you are and, in the majority of the cases, has a long list of ratings you’ve entered. I imagine that a good percentage of the time, Amazon doesn’t even know who you are when they need to recommend items to you. This percentage is probably going down as more and more people buy more frequently from Amazon (and are remembered by the site) but, of course, in the earlier days the standard scenario was most likely that Amazon didn’t know who was browsing until they decided to check out.
If you’re building a recommendation system to handle the scenario where you don’t know who you’re recommending to, then you can’t draw on a deep understanding of that user’s tastes. You’re forced to do your best with the single item the user is looking at as your context, or you use that session’s click stream (which is noisy) as a guide.
Now, rather than say “Amazon’s recommendations and Netflix’s recommendations are both good in their own respects” I’ll plant my flag firmly on Netflix’s side. Disclosure: the taste engine we built for Discovereads was in part inspired by the Netflix prize (one of the people who worked on our taste engine was part of the team that narrowly lost the Netflix prize), so feel free to take this with a grain of salt. But Discovereads’ users overwhelmingly prefer our recommendations over Amazon’s, so I take that as a bit of validation of my argument.
With an item-item similarity recommender, when the user is looking at a movie they don’t like, your only recommendations are for other movies that are similar to this movie they don’t like.
How likely is it that the user also won’t like those similar movies? Of course, if you knew that particular user’s tastes, you might be able to recommend a movie that’s similar, yet different so that the user would actually like the movie. (Note: this is more of an issue with product categories where how “good” an item is is more subjective, such as with books, movies, and music).
Also, it’s important to be right about how much a specific user will like a specific item. As I noted, it’s normal for the average rating to be right for this user only 30% of the time. So, most of the time that an average star rating tells you you’ll like something, you won’t, and vice versa. We should expect better of recommendation systems today.
In the end, the goal of a good recommendation system (other business considerations aside) is to match users with things that they will love – usually things that they would not have easily found on their own without the help of the recommendation system. Of course, Netflix has had a business consideration that made them take their output and skew it away from newly released blockbusters. But when that consideration is removed, I believe their approach will still produce more “interesting” recommendations because we don’t all equally love all the same blockbuster movies.
After all of that, I will agree on at least one case where Amazon and Netflix have different goals because of retail vs. rental: cross-sell / bundled purchases. In this case, of course, they’re trying to make it so convenient to buy that other thing you’ll also need that creativity and “interestingness” doesn’t play much of a factor. But I don’t think that’s really what we’re talking about in this discussion.
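The claim in this comment that the average rating matches an individual user’s rating only about 30% of the time is the kind of number that can be measured on any ratings data set. The sketch below shows one possible way to do so. The definition of “agree” (a user’s stars equal the item’s average rounded to whole stars), the function name `average_agreement`, and the sample ratings are my assumptions; the 30% figure is the commenter’s and is not reproduced here.

```python
from collections import defaultdict

def average_agreement(ratings):
    """Share of individual ratings that equal their item's rounded average.

    ratings: list of (user, item, stars) tuples with whole-star ratings.
    Note: Python's round() sends exact .5 values to the nearest even integer.
    """
    by_item = defaultdict(list)
    for _, item, stars in ratings:
        by_item[item].append(stars)
    rounded_avg = {item: round(sum(vals) / len(vals)) for item, vals in by_item.items()}
    hits = sum(1 for _, item, stars in ratings if stars == rounded_avg[item])
    return hits / len(ratings)

# Made-up ratings: a polarizing title and a broadly liked one.
sample = [
    ("u1", "polarizing", 5), ("u2", "polarizing", 5), ("u3", "polarizing", 1),
    ("u4", "polarizing", 1), ("u5", "polarizing", 3),
    ("u1", "crowd_pleaser", 4), ("u2", "crowd_pleaser", 4), ("u3", "crowd_pleaser", 5),
]
share = average_agreement(sample)
```

On this sample the polarizing title averages three stars although only one of its five raters gave it three, so the overall agreement is 3 of 8 ratings.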

Chris: Predictive analytic approaches to modeling demand are far more elegant than suggested by your post. The current leading edge is the use of spiking neural networks to deduce semantic relationships. There are two general approaches: using auto-generated ontologies or using holo-semantic pattern recognition. Neither approach is discussed above, and it is unlikely that either Netflix or Amazon would ever disclose their use. I know that Overstock, eBay, Wal-Mart and CoinStar/RedBox are all experimenting with these technologies. I will be posting blog entries on these semantic web technologies in the near future. Unfortunately, working and consulting cuts into my time to write blogs. As implied by arikia, Netflix is most likely using a matrix method based on singular value decomposition (SVD) on a portfolio of dimensions of preferences that are most commonly associated with movies. It is important to note that Netflix’s contest was flawed from the outset: they required that the solution be algorithmic. This reflects that they just don’t get semantics and machine learning technologies. Rather, they would have been far better served by asking for a solution, not an equation. eBay’s work is really interesting. They are generating 50TB per day of user data at a time when sales are lagging. They have all the incentives in the world to create a predictive behavioral model for purchases and preferences. My understanding is they have a staff of more than 200 working on a suite of solutions ranging from traditional BI data mining and CART (classification and regression trees) to more cutting edge approaches. Overall, I’m disappointed that more people aren’t looking at the math and the limitations of relational databases to solve problems associated with “big data.”
Thanks for posting and raising awareness of this issue. I encourage you to look at some of the more interesting solutions beyond the household names of Amazon and Netflix.
Apple did… and bought Siri for more than $200 million, all to get their hands on some cool semantic technology.