Background: The Netflix Prize is an ongoing open competition for the best collaborative filtering algorithm that predicts user ratings for films, based on previous ratings. The competition is held by Netflix, an online DVD-rental service, and is open to anyone (with some exceptions). The grand prize of $1,000,000 is reserved for the entry that bests Netflix's own algorithm for predicting ratings by 10%.

Let's see, $1,000,000 split 7 ways gives us $142,857.14 each. Say taxes take half; now you are down to only $71,428.57 each. Unless one of them kills all of their partners like in The Dark Knight, that ain't much of a prize.

Well, it was for AT&T. No, they don't want the prize money; they're donating it to charity. But what they do have now is an algorithm that can be turned into a commercial product or service. The individual researchers may not have had money as their primary motivator, but their employer sure as hell did.

Well, just like the Ansari X Prize didn't cover the costs of developing and launching a suborbital rocket, the Netflix Prize isn't really meant to be a large enough prize to fully fund the development of a new recommendation algorithm. The purpose of the prize is to stimulate interest and get people started. The real reward will come when they turn their algorithm into commercialized software - the rewards from making such a thing applicable outside of Netflix could be large indeed.

The X-Prize was designed to encourage the creation of a vehicle that would demonstrate the feasibility of a new market. It was backed up with a whole lot of market research which showed that people would happily sign up for a flight on such a vehicle. The anticipated business plan that it was trying to encourage was:

1. Build a vehicle that is very reusable and can put passengers into space.
2. Win the prize and get the PR.
3. Take bookings and reservation fees to fund the next flight.
4. Fly the first passengers.

If this was your only job, you would have to pay rent/utilities on the building where you work in addition to rent/utilities on the building where you live. You also have to pay your own health insurance, dental/vision, and retirement/401K.

The above should cut it down to a bit above (if not below, once you factor in an office) the average starting salary of an engineer ($55K, last I checked).

Now that the 10% barrier has been reached, people have 30 days to submit their final results. At the end of the 30 days, whoever has the best result wins.

That's true, but like the story title indicates, the prize may have been achieved. From the contest rules:

The RMSE for the first "quiz" subset will be reported publicly on the Site; the RMSE for the second "test" subset will not be reported publicly but will be employed to qualify a submission as described below. The reported RMSE scores on the quiz subset

"As of the submission by team "BellKor's Pragmatic Chaos" on June 26, 2009 18:42:37 UTC, the Netflix Prize competition entered the "last call" period for the Grand Prize. In accord with the Rules, teams have thirty (30) days, until July 26, 2009 18:42:37 UTC, to make submissions that will be considered for this Prize. Good luck and thank you for participating!"

That actually makes a lot of sense... Remove the wildcards and you'll be able to get a much more accurate result for everyone else. You might suffer a bit when you're recommending to people who like movies that are absolutely terrible, but you'd make up for it by not factoring that into the equation at all. This is, of course, assuming that only a few people actually watched those movies.

Not controversial in terms of dealing with abortion or gun control, but controversial in the sense that some people found the movie totally stupid, while other people found it really funny.

Movies like Napoleon Dynamite are genre edge conditions: people who apparently agree on everything else about movies in general encounter a movie like this one and suddenly, dramatically differ in their opinion of it, in completely unpredictable ways.

I might be grouped with folks who enjoy flicks about identity, man vs. man, those who aren't easily offended, etc. But there doesn't seem to be as clear a way to find a group of people who find aggression offensive, which is basically the driving theme of Fight Club. Perhaps given enough negative ratings it could be possible, but even though I've clicked 'Not Interested' on all the Disney movies, they keep suggesting I want their latest direct-to-DVD crapfest, so I'm left to assume they're rating mostly based on positive ratings.

> Perhaps given enough negative ratings it could be possible, but even though I've clicked 'Not Interested' on all the Disney movies, they keep suggesting I want their latest direct-to-DVD crapfest, so I'm left to assume they're rating mostly based on positive ratings.

A co-worker gets almost no recommendations at all from Netflix, and customer service told him that they generate recommendations based on ratings of 4 or 5 (though you'd think that the recommendations that they do generate would have to filter through similar movies that you've rated at 0). He was told to rate the movies that he likes higher in order to fix it, but that's never really accomplished anything as he has several hundred movies in the 4-to-5 range and maybe a dozen recommendations total.

I'm pretty sure that the Disney/children's movie recommendation flood that most everyone seems to be getting is driven by parents who don't actually love those movies, but are rating those movies on behalf of their children. That causes a weird connection to movies that they themselves enjoy, and it makes it seem like the same audience is enjoying both types of movie. They need to have an "I'm a parent" flag somewhere to help them sort that out.

> They need to have an "I'm a parent" flag somewhere to help them sort that out

I'd love to have family member tags in the queue. Associating an age with them would be fine. They have a thing where you can have separate queues for different family members, but it's a separate login IIRC, and a real pain to manage cohesively. I'd much rather just have one queue, and have it send discs in a round-robin for thing1, thing2, and the folks.

it isn't bad movies that are the problem, taste in bad movies can still be uniform

Also, there are people like my wife and me who share the Netflix account since we watch most of the movies together, but have different tastes in movies: I HATE '30s and '40s dramas and she loves them, so she will get them sometimes, and I love B-grade sci-fi and horror movies and she isn't the biggest fan of them.

You can generalize that to any movie with an action hero, and a baby. Any movie with a pseudo-star (Hilton, Spears, Madonna, etc.). And any movie with Uwe Boll or similar people.

On a more serious note: I think the best way to improve recommendations, is to first relate the IMDB rating to the IQ of the rater. I found that more intelligent people do not like movies with a simple plot, because it bores them, and less intelligent people do not like movies with a complex, subtle plot, because they don't get it.

You're being elitist and silly. Perhaps you could do some research, but I have seen no evidence that interest in simple vs. complex plots has anything to do with intelligence. Certainly the type of plot one likes in a movie is something reasonable to consider. But assuming a relationship to IQ or EQ from that is silly.

I published a paper using Netflix data. (Yeah, that group [slashdot.org].)

It's certainly cool that they beat the 10% improvement, and it's a hell of a deal for Netflix, since it would have cost them more than the prize money to hire the researchers outright. The interesting question is whether or not this really advances the field of recommendation systems.

The initial work definitely did, but I wonder how much of the quest for the 10% threshold moved the science, as opposed to just tweaking an application. Recommender systems still don't bring up rare items, and they still have problems with diversity. None of the Netflix Prize work addresses these problems.

1.) Rare could also be defined as unpopular. Trying to recommend unpopular movies is problematic. Is the computer program going to be able to discern under-rated (Glengarry Glen Ross) from just crap (Ishtar)?

2.) Suggesting "My Little Pony" when I usually rent black and white Samurai movies could be called diversity. Do you want a program that recommends things that are different or things that are similar?

> Trying to recommend unpopular movies is problematic. Is the computer program going to be able to discern under-rated (Glengarry Glen Ross) from just crap (Ishtar)?

That is indeed an interesting question, and I think it's what the grandparent meant when he pointed out Netflix's contest didn't really address it. The performance measure Netflix used was root mean squared error, so every prediction counts equally in determining your error. Since the vast majority of predictions in the data set are for frequently-watched films, effectively the prize was focused primarily on optimizing the common case: correctly predict whether someone will like or not like one of the very popular films. Of course, getting the unpopular films right helps too, but all else being equal, it's better to make even tiny improvements to your predictions of films that appear tons of times in the data set than to make considerable improvements to less popular films' predictions, because the importance of getting a prediction right is in effect weighted by the film's popularity.

You could look at error from a movie-centric perspective, though, asking something like, "how good are your recommender algorithm's predictions for the average film?" That causes you to focus on different things, if an error of 1 star on Obscure Film predictions and an error of 1 star on Titanic predictions count the same.
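To make the difference concrete, here's a toy sketch (all counts and ratings invented for illustration) comparing the contest's global RMSE, where every prediction counts once, against a movie-centric average of per-film RMSEs, where each film counts once:

```python
import math

# Toy predictions: (movie, actual_rating, predicted_rating). "Titanic" shows
# up far more often than "Obscure Film", mimicking the data set's skew.
predictions = (
    [("Titanic", 4, 3.9)] * 1000       # popular film: small 0.1-star misses
    + [("Obscure Film", 5, 4.0)] * 10  # rare film: large 1.0-star misses
)

def rmse(pairs):
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))

# Contest-style global RMSE: the popular film's 1000 rows dominate, so the
# big errors on the rare film barely move the score.
global_rmse = rmse([(a, p) for _, a, p in predictions])

# Movie-centric view: compute RMSE per film, then average, so each film
# carries equal weight regardless of popularity.
by_movie = {}
for movie, a, p in predictions:
    by_movie.setdefault(movie, []).append((a, p))
per_movie_rmse = sum(rmse(v) for v in by_movie.values()) / len(by_movie)

print(round(global_rmse, 3))     # ~0.14: dominated by Titanic's tiny misses
print(round(per_movie_rmse, 2))  # 0.55: Obscure Film now counts equally
```

Under the global metric, being off by a full star on every Obscure Film prediction barely registers; under the per-movie average it doubles the score.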

True, but not quite. If we were talking about plain mean error I would agree. But as it stands, not every error counts the same: making a serious error (guessing 1 instead of 5) costs 16 times what making a small error (guessing 4 instead of 5) costs.

With popular movies you usually have enough data to come decently close with guesses. Sure, you can optimize them more to get even closer with the guesses. But it is a minor profit. On the other hand, you don't have much data on the less popular movies, so

That's true, but since there's not a huge range in ratings, that root-squaring doesn't have nearly as big an effect as the many orders of magnitude difference in popularity. I don't recall the exact numbers offhand, but I think the top-10 movies, out of 17,500, account for fully half the weight.

> 1.) Rare could also be defined as unpopular. Trying to recommend unpopular movies is problematic. Is the computer program going to be able to discern under-rated (Glengarry Glen Ross) from just crap (Ishtar)?

You know what? I actually like Ishtar. I really do. The blind camel, and the line "We're not singers! We're songwriters!" get me every time.

So really, the even harder problem is to know when to buck your friends and go with the outlier. It's hard, because kNN methods work pretty well, and t
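For reference, here is a minimal user-based kNN predictor of the kind alluded to above. The data and names are invented for illustration, and real Netflix Prize entries layered many refinements (baselines, shrinkage, latent-factor features) on top of this basic idea:

```python
import math

# Toy rating data: user -> {movie: stars}. Entirely made up.
ratings = {
    "alice": {"Seven Samurai": 5, "Yojimbo": 5, "Titanic": 2},
    "bob":   {"Seven Samurai": 5, "Yojimbo": 4, "Ishtar": 2},
    "carol": {"Titanic": 5, "Ishtar": 4},
}

def cosine_sim(a, b):
    """Cosine similarity over the movies both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    na = math.sqrt(sum(a[m] ** 2 for m in common))
    nb = math.sqrt(sum(b[m] ** 2 for m in common))
    return dot / (na * nb)

def predict(user, movie, k=2):
    """Predict a rating as a similarity-weighted average over the k most
    similar users who rated the movie; None if no usable neighbors."""
    neighbors = [
        (cosine_sim(ratings[user], ratings[other]), ratings[other][movie])
        for other in ratings
        if other != user and movie in ratings[other]
    ]
    neighbors.sort(reverse=True)
    top = [(s, r) for s, r in neighbors[:k] if s > 0]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Alice's neighbors disagree about Ishtar (2 vs. 4 stars), so the method
# splits the difference -- exactly the "buck your friends" problem: the
# weighted average can't tell which neighbor is the one to trust.
print(predict("alice", "Ishtar"))
```

Note that when the "right" answer is the outlier neighbor, averaging over the neighborhood is precisely what hurts you.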

So... What does this mean in real-world analysis? What does the score represent? Since the score shown seems to be smaller-is-better, does this mean that 85+% of the movies recommended won't be attractive to the target, and less than 15% would be found interesting?
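Not quite: the score is RMSE measured in stars, not a percentage of bad recommendations. A score in the roughly 0.85-0.86 range reported for the leaders means a typical prediction lands within about 0.86 stars of the rating the user actually gives, on the 1-to-5 scale. A toy sketch with invented numbers:

```python
import math

# Actual star ratings vs. a model's predictions (all values made up).
actual    = [4, 3, 5, 2, 4]
predicted = [3.2, 3.5, 4.4, 2.9, 4.1]

# RMSE: square each miss, average, take the square root. The result is in
# the same units as the ratings themselves (stars).
errors = [a - p for a, p in zip(actual, predicted)]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(round(rmse, 3))  # typical miss, in stars
```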

Does anyone find Netflix recommendations any good anyway? I used http://criticker.com/ [criticker.com] for quite a while and was very happy with the recommended stuff. Recently switched to http://filmaster.com/ [filmaster.com] (which is a free service) and it's equally good, even though both probably use a pretty simple algorithm compared to Netflix.

The problem is the bell curve. There aren't a lot of 5 star movies out there, and I've seen them. There are a lot of 3 star films, but my life is short and I don't want to spend a lot of time on movies I merely "like".

In fact, it's not really a bell curve. I rarely provide 1-star or 2-star ratings simply because it's not at all difficult for me to identify a film I'm going to truly hate. I don't have to waste two hou

To me it's essential to read a few reviews (even very short ones like "too sentimental" or "powerful and original") by people I know to have a movie taste similar to mine. That's why I like the two websites I mentioned above -- they tell you exactly whose opinions you should care about and whose you can safely ignore.
On the other hand, http://jinni.com/ [jinni.com] and http://clerkdogs.com/ [clerkdogs.com] have some pretty cool ideas as well - they suggest movies based on your mood and your taste. Still in beta though, so it does no

The key advantage Netflix has over other services is that it's right there. They know what you watch and you don't have to go searching.

Of course you still have to go through the back catalog and talk about all the things you've already seen, which runs into the hundreds for somebody who likes movies. That pain is essentially the same with any service.

But going forward, Netflix can present you the opportunity to rate a movie more easily. It's a small user-interface thing, but significant.