Big Data Hype (and Reality)

The potential of “big data” has been receiving tremendous attention lately, and not just on HBR’s site. With interest in the topic growing exponentially, it has been the focus of countless articles and perhaps too many meetings and conferences.

But to the extent that big data will have big impact, it might not be in the classic territory addressed by analytics. Most applications of data mining and analysis have been, at heart, attempts to get better at prediction. Decision-makers want to understand the patterns in the past and the present in order to anticipate what is most likely to happen in the future. As big data offers unprecedented awareness of phenomena, particularly of consumers’ actions and attitudes, will we see much improvement on the predictions of previous-generation methods? Let’s look at the evidence so far, in three areas where better prediction of consumer behavior would clearly be valuable.

Film ratings. As a company that thrives when people consume more content, Netflix routinely serves up personalized recommendations to customers based on their feedback on films they’ve already viewed. This is a prediction challenge; Netflix must venture an informed guess that, if someone gave a certain rating to movie a, they will rate movie b similarly. Famously, five years ago, the company launched a competition to improve on the Cinematch algorithm it had developed over many years. It released a record-large (for 2007) dataset, with about 480,000 anonymized users, 17,770 movies, and user/movie ratings ranging from 1 to 5 (stars). Before the competition, the error of Netflix’s own algorithm was about 0.95 (using a root-mean-square error, or RMSE, measure), meaning that its predictions tended to be off by almost a full “star.” The Netflix Prize of $1 million would go to the first algorithm to reduce that error by just 10%, to about 0.86.
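To make the error measure concrete, here is a minimal sketch of how an RMSE figure is computed and what a 10% reduction from roughly 0.95 works out to. The five ratings and predictions are invented for illustration (they are not Netflix data), and the function name rmse is just a label chosen for this example.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical star ratings and predictions, made up for this sketch.
actual_stars = [4, 3, 5, 2, 1]
predicted_stars = [3.5, 3.0, 4.0, 3.0, 2.0]

# Approximate baseline quoted in the article, and the prize threshold
# implied by a 10% reduction of that baseline.
baseline_rmse = 0.95
prize_target = baseline_rmse * (1 - 0.10)

print("toy RMSE:", round(rmse(predicted_stars, actual_stars), 3))
print("prize target:", round(prize_target, 3))
```

Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones, which is part of why shaving the last few hundredths off the score is so hard.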

Within just two weeks, several teams had beaten the Netflix algorithm, though only by very small amounts. After that, progress was surprisingly slow. (See the chart.)

[Chart: Netflix Prize Competition Progress]

It took about three years before the BellKor’s Pragmatic Chaos team managed to win the prize with a score of 0.8567 RMSE. The winning algorithm was a very complex ensemble of many different approaches, so complex that it was never implemented by Netflix. With three years of effort by some of the world’s best data mining scientists, the average prediction of how a viewer would rate a film improved by less than 0.1 star.

Customer attrition. Now consider the bane of wireless service providers: the churn in their customer bases. If predictive analytics drawing on big data could accurately point to who in particular was about to jump ship, direct marketing dollars could be efficiently deployed to intervene, perhaps by offering those wavering customers new benefits or discounts. Analysts measure how accurate the list of potential churners is by using a measure called “lift.” Let’s say, for example, that a wireless provider has a churn rate of 2% per month. If an algorithm can learn indicators of customer defection, and generate a list of the subscribers most likely to leave, and 8% of those subsequently do leave, then this list has a lift of 4 (because the method produced a list with four times more defectors than a random sampling would have). Such a list would be very valuable, given the costs of the marketing and inducements it would save. But still, it is 92% wrong. With the benefit of big data, will marketers get much better prediction accuracy?
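The lift arithmetic in this example can be written out as a short sketch. The list size of 1,000 subscribers is an assumption made only so the numbers are concrete, and the function name lift is a label for this illustration, not a reference to any particular analytics package.

```python
def lift(churners_in_list, list_size, base_churn_rate):
    """Churn rate inside a targeted list divided by the overall churn rate."""
    if list_size <= 0 or base_churn_rate <= 0:
        raise ValueError("list_size and base_churn_rate must be positive")
    list_churn_rate = churners_in_list / list_size
    return list_churn_rate / base_churn_rate


# Figures from the example: 2% monthly churn overall, and 8% of the
# targeted list actually leaves. The list size of 1,000 is hypothetical.
list_size = 1000
churners_in_list = 80
base_churn_rate = 0.02

example_lift = lift(churners_in_list, list_size, base_churn_rate)
miss_rate = 1 - churners_in_list / list_size  # share of the list that stays

print("lift:", round(example_lift, 2))
print("share of list that did not churn:", round(miss_rate, 2))
```

Lift is a ratio, so a high value says the list is much better than random selection without saying that most people on it will actually leave.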

A study that Brij Masand and I conducted would suggest the answer is no. We looked at some 30 different churn-modeling efforts in banking and telecom, and surprisingly, although the efforts used different data and different modeling algorithms, they had very similar lift curves. The lists of top 1% likely defectors had a typical lift of around 9-11. Lists of top 10% defectors all had a lift of about 3-4. Very similar lift curves have been reported in other work. (See here and here.) All this suggests a limiting factor to prediction accuracy for consumer behavior such as churn.

Web advertising response. Finally, let’s turn to the challenge of predicting the click-thru rate (CTR%) of an online ad — clearly a valuable thing to get right, given the sums changing hands in that business. We should exclude search advertising, where the ad is always related to user intent, and focus on the rates for display ads.

The average CTR% for display ads has been reported as low as 0.1-0.2%. Behavioral and targeted advertising have been able to improve on that significantly, with researchers reporting up to seven-fold improvements. But note that a seven-fold improvement from 0.2% amounts to 1.4% — meaning that today’s best targeted advertising is ignored 98.6% of the time.
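The arithmetic behind that last figure is simple enough to check in a few lines. This is only a sketch of the calculation: the one million impressions is a hypothetical volume chosen for illustration, and the function names are invented for this example.

```python
def targeted_ctr(base_ctr, improvement_factor):
    """Click-through rate after a multiplicative improvement."""
    return base_ctr * improvement_factor


def expected_clicks(impressions, ctr):
    """Expected number of clicks for a given number of ad impressions."""
    return impressions * ctr


base_ctr = 0.002          # 0.2%, the upper end of the reported average
improvement_factor = 7    # the best reported improvement from targeting

best_targeted_ctr = targeted_ctr(base_ctr, improvement_factor)
ignored_share = 1 - best_targeted_ctr

# Hypothetical campaign size, used only to put the rates in absolute terms.
impressions = 1_000_000

print("targeted CTR (%):", round(best_targeted_ctr * 100, 2))
print("ignored (%):", round(ignored_share * 100, 2))
print("clicks per million impressions:",
      round(expected_clicks(impressions, best_targeted_ctr)))
```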

What are we to conclude from these three areas — all of them problems with fine, highly motivated minds focused on them? To me, they suggest that the randomness inherent in human behavior is the limiting factor to consumer modeling success. Marginal gains can perhaps be made thanks to big data, but breakthroughs will be elusive as long as human behavior remains inconsistent, impulsive, dynamic, and subtle.

Activities that are governed by physics and precise laws like the force of gravity can be predicted to an amazing degree. Think of the predictions that allowed NASA’s rover Curiosity to shoot a fantastically complex landing and end up only 1.5 miles from its target — after a journey of 350 million miles. But when an activity is driven by consumers’ whims, no amount of ingenuity can produce the ability to know what will happen. Predictive analytics can figure out how to land on Mars, but not who will buy a Mars bar.

Big data analytics can improve predictions, but the biggest effects of big data will be in creating wholly new areas. Google, for example, can be considered one of the first successes of big data; its growth suggests how much value can be produced. While analytics may be a small part of its overall code, Google’s ability to target ads based on queries is responsible for over 95% of its revenue. Social networks, too, will rely on big data to grow and prosper. The success of Facebook, Twitter, and LinkedIn depends on their scale, and big data tools and analytics will be required for them to keep growing.

We can expect big data to have transformative effects in other areas, too. Location analytics and location-based services such as foursquare come to mind. So does healthcare, where big data will drive progress in personalized medicine.

Finally, big data will see its biggest and most important application in the realm of artificial intelligence. IBM Watson has beaten the best human players at Jeopardy. Apple’s Siri has been conversing, with some success, with millions of people. Google has made significant steps towards AI with its Knowledge Graph. Google Now, Google’s answer to Siri, can learn from user behavior to anticipate its users’ requests. By 2020, all of these will be vastly more capable thanks to the growing ability to make sense of big data and learn.

So you should expect big data to have big impact. And you can bet that it will help machines interact more usefully with our unstructured, changing, and sometimes downright confused human ways. But if you’re counting on it to make people much more predictable, you’re expecting too much.

Gregory Piatetsky-Shapiro is a co-founder of the KDD (Knowledge Discovery and Data Mining) conferences, president of KDnuggets, which provides analytics and data mining consulting, and editor of the www.KDnuggets.com analytics and data mining portal.