Why Websites Still Can’t Predict Exactly What You Want

It should be a golden age for personalization online, and the predictive algorithms that drive it. Data scientists, supported by the stunning growth in the gathering and processing of so-called big data, can extract patterns from massive stores of browsing and sales data in order to predict our likes and dislikes and tailor marketing experiences to us.

The predictive algorithms are often built to impress, designed over years by teams of PhDs from top universities. In 2006, Netflix tapped these brainiacs by offering a million dollars to anyone who improved its home-grown algorithm to predict movie ratings. It took three years before a combined team of seven researchers reached the target of a ten-percent improvement in prediction. The winning entry was not one algorithm but an ensemble of over one hundred algorithms, bearing exotic names such as restricted Boltzmann machines and singular value decomposition. Big data flexed its muscles.

But there is a problem. Personalization still isn’t that good. Consumers still talk about it mostly when it’s laughably bad. Just this week, my co-worker groaned, “Facebook emailed me a list of my ex-boyfriends, suggesting that I friend them.” It wasn’t the first time. Another office mate chimed in, “Once I bought a TV from Amazon, they started suggesting other TVs for me to buy. Do they really think I am buying another?” Not to feel left out, I contributed my own head scratcher: while I was playing a YouTube video of classical music, cartoon roaches crawled over the player, advertising extermination services. I finally puzzled out that I had just read a news item about an infestation at the famous “cronut” bakery.

Oh, and there’s this: Netflix engineers long ago abandoned the famous algorithms that won the Netflix prize when they learned that “the additional accuracy gains did not seem to justify the engineering effort needed to bring them into a production environment.”

Example: Somewhere, Amazon stores all my purchases, including about 10 pairs of shoes I’ve bought from the site. Yet Amazon still can’t filter search results and recommend American size 7.5 shoes to me. How does it not know that? Another: I sign into my bank’s website several times a month to pay bills. The checks always go to two credit card companies, a utility, and my cellphone provider, in that order. In five years, the bank’s computers haven’t figured out this simple pattern. Travel sites, too, could easily learn from my past tickets that I fly economy class with the fewest stops, and that I depart from JFK instead of LGA if the flight time forces me to commute to the airport during rush hour.

All that data and still an underwhelming result. What’s happening here? It seems that it’s a matter of how the companies position personalization. They regard it as a tool for upselling: they want to push us out of our comfort zone, to buy new things, and to buy more things. To achieve that goal, the companies can’t just look at one’s historical browsing or purchase patterns. Instead, data scientists look for traits in similar customers. When you position personalization this way, you build algorithms that are based on finding variables.

But the examples I cited above, about the shoes, the bank, and the travel, are based on invariables such as physical attributes (the size of your feet), cyclical life events (paying bills), or habits (brand loyalty). The beauty of such measures is that the chance of matching user to need is much higher than it is when predicting needs based on variant behaviors.

Invariable personalization is less like an upsell and more like great customer service, a soft-sell technique that builds loyalty with customers.

I’d be much happier if Amazon just got rid of the eight ancillary products I may or may not like and instead just eliminated shoes that aren’t close to my size from my search results. It’d be nice if my bank pre-populated the expected bill payments and simply asked me to confirm the action.

Recently, online grocer FreshDirect took a promising step toward this personalization model. If you search for a product you have purchased in the past, FreshDirect lists those items first, labeling them “Your Fave.” When I look for “water,” Poland Spring shows up at the top of the list; if I search “Poland Spring,” the computer knows my standard order of a six-pack of one-gallon containers. That’s the kind of personalization I can get behind.
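To show how little machinery this takes, here is a minimal Python sketch of the idea. It is not FreshDirect's actual code: the function name, the sample data, and the choice to order favorites by how often they were bought are my own assumptions for illustration.

```python
from collections import Counter


def rank_with_favorites(results, past_purchases):
    """Move products the customer has bought before to the top of a search.

    results: product names matching the search, in the site's default order.
    past_purchases: product names this customer bought, one entry per purchase.
    Returns a list of (product, is_favorite) pairs.
    """
    counts = Counter(past_purchases)  # a missing product counts as 0
    # sorted() is stable, so products with equal counts keep the default order.
    ranked = sorted(results, key=lambda product: -counts[product])
    return [(product, counts[product] > 0) for product in ranked]


# Hypothetical purchase history and search results for the query "water".
history = ["Poland Spring", "Poland Spring", "Bananas", "Poland Spring"]
search_results = ["Evian", "Fiji", "Poland Spring", "Smartwater"]

ranked = rank_with_favorites(search_results, history)
for product, is_favorite in ranked:
    print(product, "(Your Fave)" if is_favorite else "")
```

Nothing here looks at other customers. The only input is one person's own order history, which is the whole point.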

But then, why aren’t more companies doing it?

Here’s my take: it’s too simple. The required engineering effort is modest: the computer examines only an individual’s past transactions to tailor the experience. One can hardly call that an algorithm, and the technique is too trivial to include in a course on machine learning. Algorithms and machine learning, automation, massive parallel processing: these are the concepts by which data science has defined itself. (See, for example, “Data Science and Prediction” by Vasant Dhar, Communications of the ACM, December 2013.)

In short, these kinds of easy wins aren’t sexy enough for data scientists. And maybe they fear their effort would go unnoticed if we can get better personalization without teams of PhDs spending three years to create hundreds of algorithms.

Data scientists are vital to the future economy and advanced algorithms are an extremely important part of their work. But from a market-facing perspective, simplicity and quick wins should be part of the data science toolbox. Even Netflix has come around to this. Just yesterday it was reported that the company will drastically cut back on the number of recommendations it makes. As documented by Alexis Madrigal in The Atlantic earlier this year, Netflix has transitioned from predicting how you would rate a movie to predicting which genres of movies you like. On my blog, I celebrated this smart strategic shift as exactly the kind of simple, customer-centric approach more companies should be taking. Even though the latter approach is, from a data perspective, a benign challenge, the recommendations are more logical, accurate, and useful.

For marketers who’ve been waiting years for the ultimate prediction machine, my advice is: stop waiting. Move away from the hard sell and start wowing customers with world-class service. With personalization, simple is the new sexy.

Kaiser Fung is founder and CEO of Principal Analytics Prep, a next-generation data science bootcamp based out of the HBS Startup Studio. He directed the MS in Applied Analytics at Columbia University, and is the creator of Junk Charts, a blog devoted to the critical examination of data visualization in the mass media. His latest book is NumberSense: How to Use Big Data to Your Advantage. He holds an MBA from Harvard Business School, and degrees from Princeton and Cambridge Universities, and was an analytics leader at Vimeo, SiriusXM Radio and American Express.