
I am interested in the use of search data to predict and forecast real-world events. One example I have mentioned here before is Google Flu Trends, which uses the volume of searches for flu-related topics to detect and track flu outbreaks early.

I thought it might be interesting to see whether I could tell anything about likely election outcomes from the volume of searches related to the Republican and Democratic parties. I compared the search volumes for “Republican Party”, “Democratic Party”, and “Tea Party” during October 2010, and looked at the same data for October 2008 (leading up to the last presidential election). Interestingly, the major party with the lead in October searches came out the winner in both cases. The Tea Party search volume needs to be explained though – if the voting followed the search volume completely, then we’d all be speaking Tea Party-ese now.

October 2010: More searches for “Republican Party” (the red line) than for “Democratic Party” (the blue line)

October 2008: More searches for “Democratic Party” (the blue line) than for “Republican Party” (the red line)
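For anyone who wants to repeat the comparison with their own exported numbers, here is a minimal sketch of the idea in Python. Everything in it is an assumption made for illustration: the function name `leading_term` is hypothetical, and the weekly values are made-up placeholders, not the data behind the charts above. It averages each term's search volume over the month and reports which term leads, with an option to leave out terms (such as “Tea Party”) that are not major parties.

```python
from statistics import mean


def leading_term(volumes, exclude=()):
    """Return (term, average volume) for the term with the highest average.

    volumes maps a search term to a list of volume readings for the period.
    Terms listed in exclude are ignored.
    """
    averages = {
        term: mean(series)
        for term, series in volumes.items()
        if term not in exclude and series
    }
    if not averages:
        raise ValueError("no search terms left to compare")
    best = max(averages, key=averages.get)
    return best, averages[best]


# Placeholder weekly values for illustration only, NOT real search data.
october_2010 = {
    "Republican Party": [40, 45, 50, 62],
    "Democratic Party": [35, 38, 41, 50],
    "Tea Party": [70, 80, 85, 100],
}

# Leader across all three terms, then among the two major parties only.
print(leading_term(october_2010))
print(leading_term(october_2010, exclude={"Tea Party"}))
```

Running the same function on an October 2008 series would give the second half of the comparison. Averaging is only one possible choice here; a peak value or a total over the month would be just as easy to swap in.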

I KNOW, ELECTIONS ARE ACTUALLY MORE COMPLICATED THAN THAT

One glaring weakness of my half-hour exploration into election forecasting is that it is hard to imagine prospective voters searching mainly using party names. It is far more likely that candidates’ names and words relating to major issues would be the search terms of interest for predicting election outcomes. That, however, is more work than I would do for a blog post. I encourage anyone reading this to take up the gauntlet and pursue the more detailed view. Let me know how that comes out!

WHAT ABOUT THE YELLOW LINE?

Another factor that would have to be dealt with in building a real live election forecasting tool using search data would be the curiosity factor. People don’t just use search engines to research their voting interests – they also use them to satisfy their curiosity about topics (and political parties) that are currently in the public eye. That complicates the forecasting problem a bit. How can you tell idle curiosity from actual voting interest? I will have to mull that over…