Google Flu Trends Does Not Match CDC Data

A new study reports that Google's tool does not actually match up with the Centers for Disease Control and Prevention's confirmed flu infection data, which means that it may not tell public health officials what they think it does.

If this year's swine flu scare taught us anything, it was that the behavior of a pandemic is impossible to predict. That's why accurate, real-time flu surveillance information is crucial—if we know what a virus did yesterday, we at least have an inkling of what it might do tomorrow. Experts have heralded Google Flu Trends, a tool launched in 2008 that estimates influenza activity based on flu-related search trends, as the first system to provide flu surveillance data with a lag of just one day—a big improvement over the U.S. Centers for Disease Control and Prevention's one- to two-week delay. But a new study presented at the annual meeting of the American Thoracic Society in New Orleans on May 14 reports that Google's tool does not actually match up with the CDC's confirmed flu infection data—which means that it may not tell public health officials what they think it does.

Justin Ortiz, a pulmonary and critical care doctor at the University of Washington School of Medicine, and his colleagues compared data from Google Flu Trends from 2003 to 2008 with data from two government networks: the CDC's influenza-like-illness surveillance network, which reports the proportion of people who visit a physician with flu-like symptoms such as fever, fatigue and cough; and the CDC's virologic surveillance system, which reports the proportion of people who visit a physician and actually have lab-confirmed influenza. Ortiz found that while Google Flu Trends matched up extremely well with the CDC's influenza-like-illness data, it did not accurately predict lab-confirmed influenza activity.
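One common way to ask whether two weekly series "match up" is to compute a correlation coefficient between them. The sketch below is an illustration only: the article does not say which statistic Ortiz's team used, the Pearson coefficient is an assumption, and every number in it is made up rather than taken from Google or the CDC.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly values, invented for illustration; not study data.
search_estimate = [1.0, 1.5, 2.8, 4.1, 3.6, 2.2, 1.3]  # search-based estimate
ili_visits = [1.1, 1.6, 2.9, 4.0, 3.4, 2.1, 1.2]       # flu-like-symptom visits
lab_confirmed = [0.5, 0.6, 0.9, 3.8, 4.2, 3.9, 1.0]    # lab-confirmed influenza

r_ili = pearson(search_estimate, ili_visits)
r_lab = pearson(search_estimate, lab_confirmed)
print(f"search vs. influenza-like illness: r = {r_ili:.2f}")
print(f"search vs. lab-confirmed flu:      r = {r_lab:.2f}")
```

In this invented example the search-based series tracks the symptom-based series closely and the lab-confirmed series less so, which is the kind of gap the study describes.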

In other words, Google Flu Trends is better at monitoring nonspecific respiratory illnesses—bad colds and other infections, like SARS, that seem like the flu—than it is at monitoring flu itself. And that's a big distinction, because "not only are there a number of infections that can cause influenza-like illnesses that aren't actually due to influenza virus, but not always is influenza an influenza-like illness," Ortiz explains. This year, up to 40 percent of people with pandemic flu did not have "influenza-like illness" because they did not have a fever. "Influenza-like illness is neither sensitive nor specific for influenza virus activity—it's a good proxy, it's a very useful public-health surveillance system, but it is not as accurate as actual nationwide specimens positive for influenza virus," he says.
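"Sensitive" and "specific" have precise meanings here. Sensitivity is the share of true flu cases that the symptom definition catches, and specificity is the share of non-flu cases it correctly excludes. The following sketch works through the arithmetic. The 60-of-100 split mirrors the "up to 40 percent" figure above, while the counts for patients without flu are hypothetical and chosen only to make the example run.

```python
def sensitivity(true_pos, false_neg):
    """Share of people who have flu and also meet the symptom definition."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of people without flu who do not meet the symptom definition."""
    return true_neg / (true_neg + false_pos)

# Out of 100 lab-confirmed flu patients, suppose 40 have no fever and so
# do not count as "influenza-like illness" (the article's upper figure).
flu_with_ili = 60
flu_without_ili = 40

# Hypothetical: of 100 patients without flu, 30 still have flu-like
# symptoms from colds or other infections. These counts are invented.
no_flu_with_ili = 30
no_flu_without_ili = 70

sens = sensitivity(flu_with_ili, flu_without_ili)
spec = specificity(no_flu_without_ili, no_flu_with_ili)
print(f"sensitivity: {sens:.0%}")
print(f"specificity: {spec:.0%}")
```

Under these assumptions, a signal that follows influenza-like illness perfectly would still miss four in ten flu cases and would count many patients who never had the virus.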

To Google, these findings come as no surprise, because Flu Trends was not designed to monitor confirmed flu infections. "If you have a daughter who is ill, you wouldn't be able to tell the difference between flu and influenza-like illness when you're searching online," explains Matt Mohebbi, Google Flu Trends' lead engineer. And it's just as important to monitor clusters of symptoms as it is to monitor specific infections, he says. "There are many cases where it makes more sense to look at the clinical [data] in combination with the virologic [data] in order to get a full picture of what's going on."

Still, if Google Flu Trends is not flu-specific, it could cause confusion. Doctors using Google Flu Trends might, for instance, overstock flu vaccines or assume that the respiratory infections they see in their patients are due to influenza when they are not. In addition, since its algorithm is based on human behavior, it sometimes falters. Compared with other years, Google's 2003 data did not match up as well with the 2003 CDC influenza-like-illness data, possibly because that flu season began early, was particularly intense, and was covered heavily in the media, causing a disproportionate number of healthy people to use Google for flu-related searches. Google's data "are likely to be sensitive to factors that modify human behavior but which are not related to true disease rates," says Elena Naumova, director of the Tufts University Initiative for the Forecasting and Modeling of Infectious Diseases. The data may not represent the entire population, either. "Chances are that older adults are not searching in Google in the same proportion as a younger, more Internet-bound, generation," she says. Yet the elderly are, of course, at high risk for influenza.

Google Flu Trends might, however, provide some unique advantages precisely because it is broad and behavior-based. It could help keep track of public fears over an epidemic—people like to Google the things that scare them—and it also could help health officials monitor other respiratory illnesses. For example, New York City surveillance data has indicated a sudden surge in influenza-like illness, but not in confirmed flu infections. "What they were able to glean from this was that there was something that was going on that was not influenza-related, but was causing influenza-like illness," Mohebbi says.

The concept behind Google Flu Trends could be useful outside the public health realm, too. In August 2009, Google launched Insights for Search, an experimental tool that uses search trends to show how people's interests change over time and geography. The tool has been used to monitor unemployment, mortgage and foreclosure rates, for instance, and companies can use it to gauge, in real time, what their consumers want. "It's really exciting," Mohebbi says. "They call it predicting the present."
