Gordon Crovitz: Why 'Big Data' Is a Big Deal - WSJ.com

The column cites examples from a new book, "Big Data: A Revolution That Will Transform How We Live, Work, and Think," written by Oxford scholar Viktor Mayer-Schönberger and Kenneth Cukier, data editor of The Economist:

Until recently, big data made for interesting anecdotes, but now it has become a major source of new knowledge. Google is better than the Centers for Disease Control at identifying flu outbreaks. Google monitors billions of search terms ("best cough medicine," for example) and adds location details to track outbreaks. When Wal-Mart analyzed correlations using its customer data and weather, it found that before storms, people buy more flashlights but also more Pop-Tarts, even though marketers can't establish a causal relationship between weather and toaster pastries.

Technology researchers in Canada analyzed premature births, tracking more than 1,000 data points per second. They shocked doctors by showing that when vital signs are unusually stable, that correlates with a serious fever 24 hours later. Physicians now prevent fevers through treatment though causation remains a mystery.

Data scientists working for New York City analyzed hundreds of data points to predict where owners were illegally subdividing houses and apartments, which leads to overcrowding and raises the risk of serious fires. By tracking data, including foreclosure proceedings and reports of rodents, inspectors were able to filter complaints so efficiently that they found dangerous conditions 70% of the time they inspected, an increase from 13%.

Air travelers can now figure out which flights are likeliest to be on time, thanks to data scientists who tracked a decade of flight history correlated with weather patterns. Credit scores predict who needs reminders to take medicine. Publishers use data from text analysis and social networks to give readers personalized news.

Using big data to improve health care is one of the biggest opportunities, but current laws make it hard to mine even data aggregated from many patients. If we had electronic records of Americans going back generations, we'd know more about genetic propensities, correlations among symptoms, and how to individualize treatments.

It's interesting to see the evolution from "business intelligence" in the 1990s to "big data" now.

The tools have gotten more sophisticated, but the strategy of data mining has been around for a generation.