Using Data to Influence Behavior and Predict Outcomes

Whether you're a gambler at a casino in Las Vegas or a patient on the active arm of a clinical trial, no knowledge is more coveted than knowing what's going to happen next.

Of course, no one can know with certainty what the future holds—there are far too many variables, known and unknown—but that's not really the goal of predictive analytics anyway. Speaking in the keynote slot at Duke University's Fifth Annual Technologies & Consumer Healthcare Conference, predictive analytics guru Eric Siegel defined his chosen field as "technology that learns from experience—i.e., data—to predict the outcome or behavior of individuals." Sounds promising, but when it comes to human health, biology often behaves badly, despite the best of intentions.

The famous baseball statistician Bill James, who brought scientific analysis and big data to bear on the sport back in the 1970s, began his project by obsessively studying box scores in an attempt to understand why some teams win and others lose. Despite James's undying interest in hard numbers and percentages as tools for understanding and predicting the game, he always stresses the anomalous factors, and the need to wed traditional player statistics with the more ethereal and less quantifiable characteristics the players embody. Phenomena like luck, the effects of playing at home or away, and clutch performances in the bottom of the ninth inning turn out to be pretty unpredictable.

This isn't an attempt to debunk predictive analysis as a marketing tool and a potential route to better health outcomes. The ROIs are written on the walls. But the dramatic increase in the number of people wearing biometric sensors, paired with all of the "listening" or spying campaigns being conducted on social media platforms, to name just two small streams in the flood of new and accessible data, has made certain commercial enterprises increasingly confident about the degree to which they can influence and then predict an individual's behavior.

Achieving that predictive capability, described at conferences as "the holy grail" or, in Siegel's parlance, "the golden egg," is starting to make the question of what technology can accurately predict about people less interesting than what it still cannot.

At any rate, Siegel got around to admitting that predictive analysis is "not necessarily [about] predicting individual outcomes," but is more about segmenting risk levels. The process begins with a crucial first step: preparing the data so that two time frames are juxtaposed, the historical data and the present-day data that companies would like to be able to predict. Siegel says the relationship between past data and present data is analogous to the relationship between present data and future data. In other words, a model that learns to predict the present from the past can then be applied to the present to predict the future. Once the data is prepped (no small task in the context of patient information), a decision tree can take root.
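The two-time-frame preparation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the visit log, the field names, and the cutoff date are all hypothetical, and nothing here comes from Siegel's talk. Everything before the cutoff becomes an input, and what happened after the cutoff becomes the outcome to predict.

```python
from datetime import date

# Hypothetical visit log: (patient_id, visit_date, was_hospitalized)
visits = [
    ("p1", date(2012, 3, 1), False),
    ("p1", date(2012, 9, 15), True),
    ("p1", date(2013, 2, 10), True),
    ("p2", date(2012, 5, 20), False),
    ("p2", date(2013, 6, 5), False),
    ("p3", date(2012, 11, 2), True),
]

# Assumed dividing line between the "past" and the "present"
CUTOFF = date(2013, 1, 1)


def build_training_rows(visits, cutoff):
    """Summarize each patient's past and pair it with a later outcome."""
    rows = {}
    for patient_id, day, hospitalized in visits:
        row = rows.setdefault(
            patient_id,
            {
                "past_visits": 0,
                "past_hospitalizations": 0,
                "hospitalized_after_cutoff": False,
            },
        )
        if day < cutoff:
            # Historical data: becomes the model's input
            row["past_visits"] += 1
            row["past_hospitalizations"] += int(hospitalized)
        elif hospitalized:
            # Present-day data: becomes the outcome to be predicted
            row["hospitalized_after_cutoff"] = True
    return rows


training_rows = build_training_rows(visits, CUTOFF)
for patient_id, row in sorted(training_rows.items()):
    print(patient_id, row)
```

Each resulting row pairs a summary of the past with an outcome that was still unknown at the cutoff, which is the shape of data a predictive model needs to learn from.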

The decision tree, a basic predictive analytics model, begins with a top-line value (<12 months of progression-free survival, say) and then groups and subgroups individuals through a series of yes/no questions. At the bottom of the tree, you end up with groups of people who have different risk percentages and are likely to have very different outcomes, even if, in this hypothetical example, they all have cancer.
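To make the idea of risk segments concrete, here is a minimal Python sketch of a two-question tree. The patient records, the age threshold, and the choice of questions are invented for illustration. In practice a learning algorithm selects the questions from the data, whereas here they are fixed by hand so the grouping is easy to follow.

```python
from collections import defaultdict

# Hypothetical records: (age, had_prior_treatment, progressed_within_12_months)
patients = [
    (72, True, True),
    (68, True, True),
    (70, False, True),
    (66, False, False),
    (45, True, True),
    (50, True, False),
    (38, False, False),
    (41, False, False),
]


def leaf_for(age, had_prior_treatment):
    """Route one patient through two yes/no questions to a leaf label."""
    age_group = "65+" if age >= 65 else "under 65"  # question 1 (assumed split)
    treatment = "prior treatment" if had_prior_treatment else "no prior treatment"  # question 2
    return f"{age_group}, {treatment}"


def risk_by_leaf(patients):
    """Return the percentage of patients in each leaf who progressed."""
    counts = defaultdict(lambda: [0, 0])  # leaf -> [progressed, total]
    for age, had_prior_treatment, progressed in patients:
        leaf = leaf_for(age, had_prior_treatment)
        counts[leaf][0] += int(progressed)
        counts[leaf][1] += 1
    return {leaf: 100.0 * hit / total for leaf, (hit, total) in counts.items()}


segments = risk_by_leaf(patients)
for leaf, risk in sorted(segments.items()):
    print(f"{leaf}: {risk:.0f}% progressed within 12 months")
```

The output is a risk percentage for each group rather than a verdict on any one patient, which matches Siegel's point that the exercise is about segmenting risk levels.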

The proposition being made is that extensive patient data and transparent clinical drug information can bring personalized medicine closer to home for many patients. And it could also upend traditional treatment pathways and protocols, since no two people are exactly alike. Siegel said predictive analytics at the patient's bedside is "inevitable," although it might be five years out, or 20. That's because the culture is lagging, not the technology. "We have to learn to trust the machine," said Siegel.

During his presentation, Siegel cited examples of pharma companies that have dabbled in clinical predictive analysis (GSK has experimented with predicting clinical trial enrollment, Pfizer with predicting health outcomes), but Siegel himself hasn't yet fully examined the healthcare space. That will change in 2014; next October, Siegel's Predictive Analytics World organization will host its inaugural healthcare-focused conference in Boston.

Prediction has come a long way since Nostradamus. Today's predictive analysis isn't concerned with causality, for two reasons. One, it's often impossible to determine; and two, it's largely irrelevant. What matters are the correlations that begin to emerge once the datasets grow large enough. The owners of those datasets, or the people and machines that have the best access to them, are in a position of power that will only increase. Toward the beginning of his keynote, Siegel told attendees "your experience today depends on how organizations and companies treat you." The most promising and unsettling thing about that statement is that it's probably true.