02 October 2006

If you feel like you have the world's greatest recommender system, you should enter the Netflix challenge for improving their movie recommendations. In addition to the possibility of winning a lot of money and achieving fame, you also get a data set for this task that is an order of magnitude larger than anything available to date. (Note that in order to win, you have to improve on their system's performance by 10%, which is a steep requirement.) I'll offer an additional reward: if you do this using NLP technology (by analysing movie information, rather than just the review matrix), I'll sweeten the pot by $10.

10 comments:

Anonymous said...

What a neat project: try to help people enjoy movies more, and have a chance to win money at the same time! When Greg Linden posted about this on his blog today (October 2), he offered links to two possible sources of supplementary data for participants: IMDb's mass download interface and Amazon Web Services. If anyone's interested, his blog is at http://glinden.blogspot.com/

Yes, you could crawl other information sources, but, apart from the legal/licensing problem, there is already a *lot of* information available in the data set. This is basically a collaborative filtering challenge. The Netflix baseline score was achieved without using other information, and I'm sure that most participants will do the same. This is not an easy problem.