Algorithm Needed; $25,000 Reward

It’s not quite the Netflix Prize, which paid $1 million to whoever could improve that company’s Cinematch recommendation algorithm by 10 percent, but there’s a new competition designed to predict magazine sales at newsstands. It’s being run by Hearst and the Direct Marketing Association. Rules are here; the top prize is $25,000.

Yeah, especially since Hearst can learn from all the applicants. A clever framework for solving a problem can be more valuable than a winning solution, and Hearst gets to keep all the applicants’ submissions.

If you’re not allowed to include any external data sources, aren’t many contestants going to end up with the same answer?

Additionally, they’re holding data back? That’s like saying “tell me the seasonal batting averages of these players” while providing only the data from games against the Pittsburgh Pirates, then testing the model on games against the Yankees and wondering why it wasn’t accurate.

Nate, the question is whether they can predict future sales. In order to do this, they have to test the models against data. They can’t simply test against the same data that was used to build the model, because this would be akin to knowing the results of the future data, making the model itself unnecessary.

Also, presumably, the data provided will be a random sample of the full data. Your example assumes that they are going to hold back specific types of transactions whereas that would make little sense from a modeling perspective.

It is common practice to build models from a sample and test against another random sample. This is how you validate your model.
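For anyone who wants to see the idea concretely, here is a minimal sketch of holdout validation in Python. Everything in it is an assumption made for illustration: the data are synthetic (not the contest data), the model is a plain least-squares line, and the names split_holdout, fit_line and mean_abs_error are hypothetical helpers, not anything from the contest rules.

```python
import random


def split_holdout(records, holdout_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (train, holdout)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    cut = len(shuffled) - n_holdout
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Ordinary least squares fit of y = a + b * x on (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_abs_error(model, pairs):
    """Average absolute gap between predicted and actual y values."""
    intercept, slope = model
    return sum(abs(y - (intercept + slope * x)) for x, y in pairs) / len(pairs)


# Synthetic stand-in for sales data: y grows with x, plus random noise.
noise = random.Random(42)
data = [(x, 50 + 3 * x + noise.gauss(0, 5)) for x in range(100)]

# Build the model on one random sample, then score it on the withheld sample.
train, holdout = split_holdout(data)
model = fit_line(train)
train_error = mean_abs_error(model, train)
holdout_error = mean_abs_error(model, holdout)

print(f"error on training sample: {train_error:.2f}")
print(f"error on withheld sample: {holdout_error:.2f}")
```

The error on the withheld sample is the number that matters: the model never saw those rows, so it is a fairer estimate of how well the model would do on future data. Because the split is random, the withheld rows resemble the training rows, which is the point made above about not holding back one specific type of transaction.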

Nate (#3), it’s about using the sources that you’re given so that the solution to the problem can be replicated by the companies themselves. Additionally, data is withheld because the withheld data is used to check the algorithm’s predictive accuracy. The same was done with the Netflix Prize.

Netflix paid a ton of cash because the algorithm was *worth* that amount to them (not to mention the publicity value). I’m pretty sure Hearst and the DMA could scrape together double this from their couch cushions and lunch martini budget alone.

I hope they get a grand total of one applicant from a struggling grad student who scribbled some idea on a napkin and sent it in.