Grand Prize awarded to team BellKor’s Pragmatic Chaos

It is our great honor to announce the $1M Grand Prize winner of the Netflix Prize contest as team BellKor’s Pragmatic Chaos for their verified submission on July 26, 2009 at 18:18:28 UTC, achieving the winning RMSE of 0.8567 on the test subset. This represents a 10.06% improvement over Cinematch’s score on the test subset at the start of the contest. We congratulate the team of Bob Bell, Martin Chabbert, Michael Jahrer, Yehuda Koren, Martin Piotte, Andreas Töscher and Chris Volinsky for their superb work advancing and integrating many significant techniques to achieve this result.
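For readers new to the metric, the 10.06% figure follows directly from the RMSE definition and the baseline score. A minimal sketch in Python (Cinematch's starting test-subset RMSE of 0.9525 is the widely reported baseline, not stated in this announcement):

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over paired predictions and true ratings."""
    assert len(predicted) == len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted))

def improvement(baseline, score):
    """Percent improvement over a baseline RMSE, as used for the 10% threshold."""
    return 100.0 * (baseline - score) / baseline

# Cinematch's test-subset RMSE at the contest start (widely reported): 0.9525
print(round(improvement(0.9525, 0.8567), 2))  # -> 10.06
```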

The Prize was awarded in a ceremony in New York City on September 21st, 2009. We will post on this forum a video of the presentation the team delivered about their Prize algorithm. In accordance with the Rules, the winning team has prepared a system description consisting of three papers, which we make public below.

Team BellKor’s Pragmatic Chaos edged out team The Ensemble with the winning submission coming just 24 minutes before the conclusion of the nearly three-year-long contest. Historically the Leaderboard has only reported team scores on the quiz subset. The Prize is awarded based on teams' test subset score. Now that the contest is closed we will be updating the Leaderboard to report team scores on both the test and quiz subsets.

To everyone who participated in the Netflix Prize: You've made this a truly remarkable contest and you've brought great innovation to the field. We applaud you for your contributions and we hope you've enjoyed the journey. The Netflix Prize contest is now closed.

We will soon be launching a new contest, Netflix Prize 2. Stay tuned for more details.

The winning team’s papers submitted to the judges can be found below. These papers build on, and require familiarity with, work published in the 2008 Progress Prize.

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

congrats to everyone. does anyone have any video of the award ceremony? here are 4 links that popped up immediately for me today & I hope to see other coverage also. and many thanks to netflix for continuing the contest.

according to these articles, the new contest will focus on a shorter time period [over in 1.5 yr] and work with users with sparse ratings, attempting to improve rating estimation based on demographic data such as location, gender, movies rented, etcetera. and it's still a $1M overall prize. can't really ask for anything better than that.

and I hope that from this coverage, other companies see the power of collaborative statistics/data mining and come up with their own contests, or maybe even approach Netflix, which has excellent experience/infrastructure in the area.

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

A well-run competition with one notable exception: the KDD dataset was allowed to be used in solutions (via a forum response by Prizemaster to a question posed in a post), but it was not included in the data directory that people downloaded, nor was its existence broadcast to all the contestants by email.

Those who knew about this dataset and used it in their solutions had an unfair advantage over those who didn't know. The difference between winning and losing could have been just that, given the photo finish that we saw.

It is unfortunate that the fairness of such a major competition was compromised by something that obvious.

as to the quote that "this contest proves that most collaborations fail" by so-and-so [sorry, I don't have his name handy]... maybe it was just a throwaway comment, but I must respectfully and strongly disagree.

every aspect of the competition arguably contributed to the solution, collaboration in particular. collaboration demonstrably had the effect of pushing the bar ever upward.

the contest can be seen as an evolution in collaboration. the early results were mainly by individuals, think Simon Funk. then team members began to coalesce and combine, and in the end, entire teams coalesced and recombined. anyone who thinks collaboration was not at the very heart of this competition is just very mistaken, I'd say. publishing results is also a method of collaboration among teams. and we see Netflix collaborating with conferences, etcetera.
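for context, the "Simon Funk" approach mentioned above was incremental matrix factorization trained by stochastic gradient descent over only the observed ratings. a minimal sketch (the hyperparameter values and helper names here are illustrative, not Funk's exact recipe):

```python
import random

def funk_svd(ratings, n_users, n_items, k=10, lr=0.005, reg=0.02, epochs=100):
    """Minimal Funk-style SVD: approximate the ratings matrix as U @ V.T,
    learned by SGD over observed (user, item, rating) triples only --
    missing entries are simply never visited."""
    U = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)  # regularized SGD step
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def predict(U, V, u, i):
    return sum(uf * vf for uf, vf in zip(U[u], V[i]))
```

one design note: Funk famously trained one latent feature at a time to convergence; updating all k features per rating, as above, is the more common textbook variant and gives the same flavor of model.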

the key words for this are "symbiosis" and "feedback loop". the more feedback loops, the better. the scoreboard is a feedback loop. publishing results is a feedback loop. the online forum is a feedback loop. the two conferences are feedback loops. teams combining is a feedback loop. media coverage is yet another feedback loop. and "collaboration" is tightly coupled with all of that.

what is true is that "most contestants fail" -- to win first place. but that was known prior to the competition. it's an intrinsic property of all competitions in general, not of the netflix prize in particular. also, winning 1st place is of course not the only good reason to participate.

I will take an analogy from biology that some might laugh or cringe at, but which I would argue is highly meaningful. there is only one egg and millions of sperm. only a single sperm fertilizes the egg, and its superiority over the 2nd-best sperm is surely so close to nothing as to be negligible.

therefore a contest among humans is similar to the fertilization of the egg, a mechanism that has been tested and reused by evolution across uncountable species over maybe hundreds of millions of years.

and of course the internet is a massive, naturally specialized realm/field for this to play out.

This is not about the new competition. It is about the one that just ended. These data from 2006 (after the qualifying set dates) have been there for some time, and Netflix allowed their use by the contestants in the competition, but the contestants were not notified of the existence of such data.

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

statistician wrote:

This is not about the new competition. It is about the one that just ended. These data from 2006 (after the qualifying set dates) have been there for some time, and Netflix allowed their use by the contestants in the competition, but the contestants were not notified of the existence of such data.

Is this really true? There was an extra year of data available, and only people who looked into these KDD conferences enough, or read these forums enough, knew about it? That's a real bombshell if so -- it sort of made KDD people a privileged elite. Pretty unfair. I sure could have used that data. It's no excuse that it was posted on the internet -- What were we supposed to do, google "additional Netflix Prize dataset" occasionally just in case something turned up somewhere on the internet?

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

DandA wrote:

statistician wrote:

This is not about the new competition. It is about the one that just ended. These data from 2006 (after the qualifying set dates) have been there for some time, and Netflix allowed their use by the contestants in the competition, but the contestants were not notified of the existence of such data.

Is this really true? There was an extra year of data available, and only people who looked into these KDD conferences enough, or read these forums enough, knew about it? That's a real bombshell if so -- it sort of made KDD people a privileged elite. Pretty unfair. I sure could have used that data. It's no excuse that it was posted on the internet -- What were we supposed to do, google "additional Netflix Prize dataset" occasionally just in case something turned up somewhere on the internet?

I knew about the extra data, and tried my best to incorporate it (in several different ways). But it never lowered the RMSE of any of my methods noticeably, nor did it improve blended results. In some cases it actually made things worse. Eventually I stopped using the extra data altogether. Perhaps this denotes a lack of cleverness on my part, but I've heard similar reports from other contestants.

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

Regarding the KDD’07 data – it’s much ado about nothing. Given the negligible size of that data and its quality (no numerical ratings), there was not much hope that it would matter. We (the winning team) did not use it at all.

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

YehudaKoren wrote:

Regarding the KDD’07 data – it’s much ado about nothing. Given the negligible size of that data and its quality (no numerical ratings), there was not much hope that it would matter. We (the winning team) did not use it at all.

-Yehuda

I have to contradict my teammate, although it doesn't change the conclusion. The KDD'07 data contains 7804 ratings (with numerical values), which represent less than 0.01% of the training data. This additional data contains no dates, which is very inconvenient when used with the more accurate time-dependent models.

PragmaticTheory experimented with this extra data, and one set in the Grand Prize solution makes use of it (bk3-c50x). Our observation was that adding the extra data did not measurably improve the model, nor did it contribute to the blend.
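For scale, a quick sanity check on that fraction (assuming the published Netflix training set size of 100,480,507 ratings, a figure not stated in this thread):

```python
TRAINING_RATINGS = 100_480_507  # size of the published Netflix training set
KDD_RATINGS = 7_804             # extra numerical ratings in the KDD'07 data

fraction = 100 * KDD_RATINGS / TRAINING_RATINGS
print(f"{fraction:.4f}%")  # -> 0.0078%, comfortably under the 0.01% quoted above
```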

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

YehudaKoren wrote:

Regarding the KDD’07 data – it’s much ado about nothing. Given the negligible size of that data and its quality (no numerical ratings), there was not much hope that it would matter.

Hi Yehuda,

First, congratulations on your win. This is a great accomplishment. Regardless of whether you won by 20 minutes or a full percentage point, everyone will agree that your team and co-teams have put the most quality effort in this competition, and it is poetic justice that you guys came out on top.

There is absolutely no blame or diminished value for your teams because of the KDD data. The blame lies squarely on Netflix, who undoubtedly made a naive mistake in an otherwise highly professional competition.

As you can see from The Ensemble's web page, it would have taken less than a basis point of performance to give them the additional 0.0001 to win according to the 4-digit rounding rule. I have no idea if they have used the >7500 KDD labeled points or not, nor if that would have made a difference or not, but any way you look at it, Netflix made a sophomoric mistake by allowing the KDD data to be used without informing all the contestants about it via email.

Re: Grand Prize awarded to team BellKor’s Pragmatic Chaos

statistician writes: "I have no idea if they [The Ensemble] have used the >7500 KDD labeled points or not."

I was a member of The Ensemble, and had not heard of the KDD dataset till I read this thread on 10-30-2009. It is highly unlikely that any of our members took advantage of it.

statistician writes: "everyone will agree that your team [BellKor etc.] and co-teams have put the most quality effort in this competition."

I agree that BellKor etc. maintained the highest level of quality over the entire course of the competition, but what about the quality of The Ensemble's effort in the last month (or so) - during our team's brief existence? The quality of that effort was amazing. We were reducing the RMSE so quickly that our Quiz RMSE would have been .8552 (.0001 less than our Quiz-winning .8553) 2 hours after the close of the competition (if we had been allowed to submit then). In the end, we were beaten by the "one-submission-in-24-hours" rule. But this was a "winner takes all" competition, and we all knew the rules, so I add my congratulations to Yehuda and all the BellKor team(s) for their Test-winning performance.