Let’s take, for example, Google Flu Trends (mostly/entirely because it was just the subject of a major analysis in Science). Google’s algorithm tracks how often people search for very obvious things like “flu symptoms” and slightly less obvious things like “Robitussin,” then uses those frequencies to predict where there is about to be a rash of flu cases. Futuristic thing tells the future!

The only problem is that Google Flu Trends has been extremely wrong for a very long time now. Last year, the tracker’s estimates of flu prevalence were more than double the Centers for Disease Control and Prevention’s official numbers. Before that, the numbers were similarly disparate.

In fact, Google Flu Trends overestimated flu prevalence in 100 of the 108 weeks starting in August 2011, according to the Science analysis by researchers at Northeastern University and Harvard University.

“The problems we identify are not limited to Google Flu Trends. Research on whether search or social media can predict x has become commonplace and is often put in sharp contrast with traditional methods and hypotheses,” lead author David Lazer argues. “We are far from a place where they can supplant more traditional methods or theories.”

That’s been true so far. In fact, “lagged” CDC data (the agency’s own figures from a few weeks earlier), which has been available for many years, is a much more accurate predictor of flu outbreaks. It’s unclear whether we even need Google Flu Trends when traditional models are already working fine.

“Because a simple lagged model for flu prevalence will perform so well, there is little room for improvement on the CDC data for model projections,” Lazer writes. “If you are 90 percent of the way there, at most, you can gain that last 10 percent.”
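To make “a simple lagged model” concrete, here is a minimal sketch in Python. It is not the model from the Science analysis: the function names, the two-week lag, and the weekly figures are all assumptions made up for illustration. The idea is just that each week’s forecast is whatever was reported a couple of weeks earlier.

```python
def lagged_forecast(reported, lag=2):
    """Forecast each week as the value reported `lag` weeks earlier.

    Returns forecasts for weeks lag, lag+1, ..., len(reported)-1.
    """
    return [reported[i - lag] for i in range(lag, len(reported))]


def mean_absolute_error(predicted, actual):
    """Average size of the miss, ignoring direction."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Invented weekly flu-prevalence percentages, NOT real CDC data.
cdc = [1.0, 1.2, 1.5, 2.0, 2.6, 3.1, 3.0, 2.4]
lag = 2  # assumed reporting delay, in weeks

predicted = lagged_forecast(cdc, lag)
actual = cdc[lag:]
error = mean_absolute_error(predicted, actual)

print(f"Mean absolute error of the lagged forecast: {error:.2f} points")
```

Any fancier predictor, search-based or otherwise, would have to beat that error figure on real data to justify itself, which is the bar Lazer is describing.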

And it’s hard to go that last 10 percent if you have no idea how the thing works. Science relies on transparency in order to be critiqued and improved upon, which is something that a private company like Google is highly unlikely to ever provide, because it’d allow imitators access to proprietary information.

“Science is a cumulative endeavor, and to stand on the shoulders of giants requires that scientists be able to continually assess work on which they are building,” Lazer wrote. Google won’t tell us exactly which search terms it’s using, and one of the few scientific papers Google engineers have published on the subject was extremely vague.

“The few search terms offered in the papers do not seem to be strongly related with either GFT or the CDC data,” Lazer wrote. “We surmise that the authors felt an unarticulated need to cloak the actual search terms identified.”

In that paper, the authors say Google has various categories of relevant search terms, all of them related to disease treatment, symptoms, or complications. Meanwhile, the CDC says that things such as school closures and high school basketball game cancellations are more accurate indicators of a flu outbreak. It’s unclear whether Google is taking search terms related to those into account.

Big data isn’t always going to be perfect, and eventually we’ll settle into a system where we know when it works and when it doesn’t. Right now, it appears that Google Flu Trends is a failing experiment that, while not necessarily doing harm, isn’t necessarily doing what the company designed it to do, which is to save lives.

These are the only search queries we know Google uses. Image: PLOS One

“Early detection of a disease outbreak can reduce the number of people affected,” Google says. “Our up-to-date influenza estimates may enable public health officials and health professionals to better respond to seasonal epidemics and pandemics.”

Lazer says we have to get to a point where we’re using big data to complement what we already have, rather than using it as a cure-all. He calls it “big data hubris,” which is the “implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.”

With Google Flu Trends, it appears we have a classic case of that. It has been a noble experiment so far, but maybe it’s time to revamp it—and to take the results of other big data experiments with a healthy dose of skepticism. Let’s hope you picked some up at the pharmacy, along with the Theraflu you probably didn’t end up needing.