Big data could help battle flu outbreaks

Updated: 2014-06-09 00:43

By Shan Juan (China Daily)

The Chinese Center for Disease Control and Prevention is working with online search giant Baidu to tap its huge user database to help forecast flu outbreaks.

Gao Fu, deputy director of the center, said a data-crunching prediction tool might be available as early as this year to track flu outbreaks nationwide. "Big data will play a major role in safeguarding and improving public health."

The tool is expected to give the public estimated flu epidemic levels, boost prevention measures and help authorities respond in a more targeted way, said Lai Shengjie, a researcher at the CDC's infection prevention and control department.

Previous reports said that Baidu's search engine had more than 100 million users a day, making it the most popular site of its kind in the country.

Users often search for flu-related information such as symptoms and medications when they fall ill, and data gleaned from such searches can become available faster than data collected through traditional hospital-based flu surveillance, Lai said. Search keywords such as "cold" and "fever", alone or in combination, can be used to gather useful data.
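
As a rough illustration of how keyword matches might be turned into a warning signal, the sketch below counts daily queries that mention a flu-related keyword and flags days that rise well above the recent baseline. It is not the CDC's or Baidu's method. The function names, the seven-day window and the threshold are assumptions made for this example, and only the keywords "cold" and "fever" come from the article.

```python
from statistics import mean, stdev

# Keywords named in the article; a real list would be longer (assumption).
FLU_KEYWORDS = ("cold", "fever")


def count_flu_queries(queries):
    """Count queries that mention at least one flu-related keyword."""
    return sum(any(k in q.lower() for k in FLU_KEYWORDS) for q in queries)


def detect_surges(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count exceeds the trailing baseline.

    A day is flagged when its count is more than `threshold` standard
    deviations above the mean of the previous `window` days. Both
    parameters are illustrative defaults, not values from the project.
    """
    surges = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline does not
        # flag every tiny increase.
        sigma = max(stdev(baseline), 1.0)
        if daily_counts[i] > mu + threshold * sigma:
            surges.append(i)
    return surges


if __name__ == "__main__":
    sample = ["fever medicine", "weather today", "Cold and FEVER"]
    print(count_flu_queries(sample))
    print(detect_surges([100, 102, 98, 101, 99, 103, 100, 250]))
```

A simple threshold like this reacts to any spike in searches, including ones driven by news coverage rather than illness, which is one reason such a signal would need checking against surveillance data.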

"There might be an outbreak following a surge of related search queries. Global positioning devices can then be used to locate any such outbreak and its real-time movements," he said.

Other search data involving population movements, weather conditions and geographical factors can also help with flu predictions through scientific analysis, Gao said. "These are gaps that the data can help fill to supplement the traditional flu monitoring system."

US search giant Google launched a similar predictive tool in 2008 called Google Flu Trends, "which inspired us to create a Chinese one", Lai said.

Gao added that other medical conditions and issues such as digestive tract diseases, food poisoning, smoking control and infectious outbreak response might be included in the project.

But some experts questioned the credibility of such analysis, saying huge datasets do not necessarily guarantee validity.

Last year, the science journal Nature reported that Google Flu Trends had overestimated peak flu levels during the 2012 Christmas season by 50 percent.

Lai said the project will adjust and fine-tune its estimates by repeatedly comparing the data-generated trends with traditional flu surveillance results.
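
One simple way to picture this kind of comparison is a least-squares line that maps a search-based index onto the rate reported by traditional surveillance, refitted whenever new surveillance figures arrive. The sketch below is only an assumption about how such a calibration could work. The function names and the numbers in the usage example are invented for illustration and do not come from the project.

```python
def fit_line(search_index, surveillance_rate):
    """Fit surveillance_rate ~ slope * search_index + intercept by least squares."""
    n = len(search_index)
    if n < 2 or n != len(surveillance_rate):
        raise ValueError("need two or more paired observations")
    mean_x = sum(search_index) / n
    mean_y = sum(surveillance_rate) / n
    sxx = sum((x - mean_x) ** 2 for x in search_index)
    if sxx == 0:
        raise ValueError("search index values must not all be identical")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(search_index, surveillance_rate)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_rate(search_value, slope, intercept):
    """Convert a new search-index value into an estimated surveillance rate."""
    return slope * search_value + intercept


if __name__ == "__main__":
    # Made-up weekly values, used only to show the calling pattern.
    index = [1.0, 2.0, 3.0, 4.0]
    reported = [2.0, 4.0, 6.0, 8.0]
    slope, intercept = fit_line(index, reported)
    print(estimate_rate(5.0, slope, intercept))
```

Refitting on each new batch of surveillance results is what lets the estimate drift back toward reality when search behavior changes.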

Other factors affecting the validity of the analysis, such as the rapidly increasing use of smartphones, the rural-urban gap in Internet access and the division of users among different search engines, will all be considered to increase accuracy, he said.