On the intersection of technology, media, politics, culture, and everyday life

July 30, 2009

The nail-biting "end" to the epic Netflix Challenge has been widely covered over the last few days. Cited as an example of prize economics and the crowdsourcing of innovation, the contest drew over 50,000 contestants. Catching particular attention were the characteristics of the top teams and how they were formed (BellKor--a cross-company, international group of statisticians, machine learning experts, and computer engineers--and the Ensemble--a last-minute merger of other former teams). Similar to the instances of collective intelligence that have occurred in pop culture (e.g., those highlighted in Convergence Culture among fans of Survivor), communities formed to collaborate closely on Netflix's complicated and apparently tantalizing problem.

The Netflix Challenge had a big prize, a clear question, and a clear end goal for contestants. In contrast, human computational games usually have small incentives and indirect end goals, but they too have been applied to improving search relevance. One of the first popular human computational games was the ESP game, where people tackled image recognition. Leveraging the same premise (that the more than 200 million hours per day spent on games could be used to solve tricky problems), Microsoft Research wondered how a game could be used to answer the question: "Given a web page, what queries will effectively find this web page?"

What they came up with was Page Hunt. Players are shown a random web page and must guess a query that would bring up that page in the top few results on a search engine (Live Search). If their query, once run, returns the page's URL among the top results, players get points. Microsoft in turn got data elicited from players, providing metadata to improve algorithms for pages, for query alterations/refinement, and for identifying ranking issues.
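
To make the scoring idea concrete, here is a minimal sketch in Python of how a round might be scored: run the player's query, look at the top few result URLs, and award points if the target page is among them. Everything specific here is an assumption for illustration only, not Page Hunt's actual rules: the function names, the top-5 cutoff, the 100-point maximum, and the 10-point drop per rank are all hypothetical.

```python
def normalize(url):
    """Crude URL normalization for comparison (hypothetical helper)."""
    url = url.strip().lower()
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    return url.rstrip("/")


def score_query(ranked_urls, target_url, top_n=5, max_points=100, step=10):
    """Return points for one guess.

    ranked_urls: result URLs for the player's query, best first.
    target_url:  the page the player was shown.
    The point values and cutoff are assumed, not taken from the game.
    """
    target = normalize(target_url)
    for rank, url in enumerate(ranked_urls[:top_n], start=1):
        if normalize(url) == target:
            # Higher rank earns more points; rank 1 earns max_points.
            return max_points - step * (rank - 1)
    return 0  # target page not found in the top results
```

With these assumed values, a query that puts the target page at rank 2 would score 90, and a query that never surfaces the page in the top five would score 0. The (query, page) pairs from successful guesses are the kind of data a game like this collects as a side effect.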

For example, using bitext matching to learn from their game data, they found top-scoring query alterations (iht = International Herald Tribune, jlo = Jennifer Lopez, capital city airport = Kentucky Airport, etc.). Another finding was that as the length of a URL (in characters) increases, the page becomes harder to find.

The Netflix Challenge attracted brainiacs who persisted for nearly three years. Perhaps as a hint that shorter games with smaller incentives can also spur productive mass activity, Page Hunt was fun enough (designed with time limitations, points + bonus points, a leaderboard, and taboo queries/constraints) that 47% of sessions were from people who played twice or more, and 16% were from those who played five times or more. A research paper on that experiment in using games to improve search is available here.