Should the big data revolution be feared? Professor says it’s all in the numbers

Like the physical universe, the digital universe is expanding rapidly: a steady stream of YouTube clips, financial transactions, sensor data, and much more that is projected to double every two years through 2020.

By itself, this growing, global data pool is unremarkable, but recent improvements in statistical and computational methods have resulted in machine-learning systems that can make sense of large data sets, providing useful insights to industries as diverse as retailing, health care, transportation, engineering and education.

For industry leaders and scientists, the harnessing of big data promises increased efficiency, new discoveries and informed decision-making backed by statistical evidence.

Critics of the data revolution, however, question whether people can trust automated systems that vastly exceed human-level decision-making abilities. In short, will big data lead to big trouble?

In the right hands, big data’s potential for good -- in the form of healthier, happier and more effective lives -- vastly exceeds its potential pitfalls, says Jennifer Neville, an associate professor at Purdue with a joint appointment in the Departments of Computer Science and Statistics.

“Data analytic methods can be technically challenging to understand, but it’s important to realize that these methods aren’t being developed in a vacuum,” Neville says. “It takes human input for these systems to make rational decisions that can be applied to solve problems. Essentially, they’re only doing what our math tells them to do.”

The conference called "Dawn or Doom: The new technology explosion" will take place Sept. 18 on Purdue's West Lafayette campus. It's free and open to the public. More information about the conference is available at www.purdue.edu/dawnordoom.

Neville will be pulling back the curtain on the computational and algorithmic framework used to develop big data tools during a lecture titled “Are we too smart for our own good? How large-scale machine learning systems can vastly exceed human-level decision-making abilities” at a Purdue conference called “Dawn or Doom: The new technology explosion.”

The Dawn or Doom conference is being held Thursday, Sept. 18, on the Purdue West Lafayette campus and is free and open to the public.

In addition to an overview of the mathematical models that drive machine-learning methods, Neville will draw from the history of machine learning, a subfield of artificial intelligence, to explain how its methods have evolved since researchers first tackled AI at a seminal conference at Dartmouth College in 1956.

“Machine learning is the part of AI that has worked,” Neville says. “We’re a long ways away from creating machines that can think like people, but we’ve found success in targeted ways when we combine a lot of data with mathematical tools and computing power to find correlations we can use to make predictions.”

Today, those successes manifest themselves in beneficial, albeit hardly earth-shaking, technologies that detect credit card fraud, filter spam from email inboxes and recommend movies on Netflix. The next wave of big data applications promises personalized medical treatment, smart infrastructure that responds to its environment, and more efficient water use in agriculture, to name just a few examples.

On the flip side, critics have raised concerns that data collected by public and private entities, including governments, could have less celebrated consequences -- such as the infringement of individual privacy.

“We have to think about the policy implications,” Neville says. “There’s no reason to be fearful if people own their own data, have a basic understanding of how these models work and have a mechanism to challenge decisions.”