The Netflix Factor: Big Data and the Future of Video


4 months ago

On February 1, 2013, the world experienced the definitive example of how big data is shaping the future of what we watch, what we listen to, and what we talk about weeks later.

It was the night Netflix premiered its series House of Cards, starring Kevin Spacey as Francis Underwood—a ruthless politician who would stop at nothing to conquer the dark political world of Washington, D.C.

And we watched it—one glorious binge-worthy episode at a time.

Gone are the golden days of prime time television, when millions of families would gather around the televisions in their living rooms, and tune in to the latest network sitcom. Gone, too, are the days of discovering new music on over-the-air radio. Today, the streaming platforms we rely on attract hundreds of millions of users. Sophisticated algorithms make it possible to track user data, and provide the platforms with a unique and valuable window into viewers’ trends and habits—information that can be used to make further content recommendations, and to guide the creation of new content, manufactured not necessarily for originality, but specifically for popular appeal.

Could this algorithm-based approach to content creation have a reverse effect on otherwise innovative content? Could it mean we are doomed to live in a ‘pop culture’ echo chamber, at the cost of creativity and originality?

Or could algorithms actually forge new connections with genres and experiences we might not otherwise have found?

Gut Instinct

Gaining insights into customers’ preferences and habits and then crafting a product or service in response to user behavior is not a new process. In fact, it’s been the basis of most business practices for decades. Before the introduction of the Internet, however, many of these insights were an accumulation of loose facts. In many cases, executive decisions were based on intuition, and heavily influenced by personal experience and bias. This ‘shoot from the hip’ approach, led by pure instinct, proved to be risky for companies in the entertainment industry.

Historically, from pitch to pilot, launching a new TV series involved jumping through multiple hoops and making decisions based on past show ratings and ‘gut feeling.’ Inevitably, hundreds (possibly thousands) of TV series were cancelled within weeks of their premiere because of low viewership and poor reviews. With no access to hard data on real viewer behavior and preferences, executives had to rely on their own perceptions, and often made bad guesses that didn’t align with what the market wanted. These misinformed series, which were months or even years in the making and vastly expensive to develop and market, were launched at a huge cost to the networks. Ditto for movie studios and record labels.

With the dot-com boom of the late 1990s, the Internet would disrupt, and redefine, just about everything we did, and how we did it—including the way we watched movies and listened to music.

Driven by Data

When Netflix came on the scene in 1997, co-founders Reed Hastings and Marc Randolph simply wanted to offer DVD movies for rent, through an online subscription-based model. The concept quickly gained traction and, despite some early growing pains, Hastings and Randolph were successful in keeping the young company afloat as the dot-com bubble burst.

In 2007, with high-speed Internet becoming the new standard, Netflix repackaged its DVD rental platform as a streaming media service, allowing users to watch whatever they wanted, whenever they wanted—instantly, and without the headache of having to return DVDs.

The platform was revolutionary—and not only because it could beam high-quality video directly to users on-demand, without the need for physical media and a logistically complicated mail system. It also offered Netflix executives something that television networks and movie studios did not have: the ability to collect highly accurate data, by tracking viewership behavior across millions of users.

“Netflix employs sophisticated algorithms to power its recommendation engine,” explains Greta Hsu, Ph.D., a professor at the University of California, Davis, who has done extensive research on the influence of data related to consumer behavior. “By looking at user metadata for everything from niche-genre selections to the color patterns for box art, the Netflix system can identify movies or TV shows that are similar to what a user currently enjoys, and offer personalized recommendations based on this. Netflix keeps track of users’ search, browsing, and viewing patterns, utilizing a wide variety of data points to infer a user’s preferences.”
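
To make the idea of matching titles on shared metadata concrete, here is a minimal sketch in Python. It is not Netflix's actual system: the catalog, the titles, the tags, and the function names `jaccard` and `recommend` are all hypothetical, and simple tag overlap stands in for the far richer signals Hsu describes.

```python
# Hypothetical catalog: every title and tag here is invented for illustration.
CATALOG = {
    "Capitol Games": {"political", "drama", "dark"},
    "Senate Nights": {"political", "drama", "dark", "thriller"},
    "Courtroom Shadows": {"drama", "dark", "legal"},
    "Beach Laughs": {"comedy", "family"},
}


def jaccard(a, b):
    """Overlap between two tag sets, from 0.0 (nothing shared) to 1.0 (identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(watched, catalog, top_n=2):
    """Rank unwatched titles by how closely their tags match what was watched."""
    # Pool the tags of everything the viewer has already watched.
    liked = set().union(*(catalog[title] for title in watched))
    scores = {
        title: jaccard(liked, tags)
        for title, tags in catalog.items()
        if title not in watched
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda title: (-scores[title], title))[:top_n]


print(recommend(["Capitol Games"], CATALOG))
# prints ['Senate Nights', 'Courtroom Shadows']
```

A viewer who watched the dark political drama is steered toward the titles that share the most tags with it, while the family comedy scores zero. A real engine would presumably weigh many more data points, such as search and browsing patterns, but the principle of scoring similarity to what a user already enjoys is the same.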

Because it owns the insight into the binge-watching behaviors of millions of subscribers around the world, Netflix has been successful in disrupting the traditional process of ‘green-lighting’ new TV series and films, simply by looking at what people are already watching, and how they watch it—with no gut instinct required.

But can you really quantify the popularity of art before it’s been created?

In 2011, Netflix successfully outbid major television networks for the rights to the U.S. version of House of Cards. The deal involved 26 episodes across two seasons, at an estimated cost of $4 million to $6 million per episode. By spending over $100 million on a series that was yet to be created, Netflix executives had effectively stacked all their chips on black, but they knew exactly what they were doing.

Using combined data collected across their platform, they were able to determine several things: that a large percentage of viewers had enjoyed watching the British version of House of Cards; audiences had a strong liking for Kevin Spacey; and they had no problems watching director David Fincher’s film The Social Network from start to finish. From this hard data, which represented a significant portion of their user base, Netflix executives were able to make a well-informed decision. The $100 million bet on their first original series would pay off big for audiences—and, of course, for them.

“The novelty of House of Cards lies less with its themes than with its delivery system, Netflix, which has more than 27 million subscribers to its streaming service in the United States,” said a New York Times article just days before the 2013 series premiere. “On Friday, Netflix will release all 13 episodes of the show, calculating that many of its customers prefer to watch their favorite series in supersize marathon sessions.”

Fans binge-watched the entire season over that first weekend, discussions about the show made their way to the water cooler, and Netflix attracted millions of new subscribers in the following weeks, effectively covering its bet on the whole series. What’s more, the company generated interest in the show across its existing subscriber base by using ten different trailers, each edited to match one of ten different viewing preferences. This helped to retain existing subscribers and add value to the Netflix brand.

Ultimately, Netflix proved that the days of ‘shooting from the hip’ were over—provided you had the data.

“Big data has always been used to generate predictions of what kinds of thing a firm should invest in, but the effect has expanded to many other industries,” explains Professor Hsu.

“Take, for example, the algorithm-based approaches to recruiting that were vividly captured in Michael Lewis’ book (and the subsequent film) Moneyball. Before the introduction of algorithm-based recruiting, baseball scouts were heavily influenced by personal experience and intuition about what characteristics defined a high-performing baseball player. But intuition is heavily influenced by bias. By utilizing instead the vast array of available data about baseball players’ performance, recruiters could identify which characteristics were strongly associated with high performance, and select players accordingly. This algorithm-based approach helped identify players who were undervalued by the larger market because they didn’t fit common stereotypes of what a star baseball player looked like. It allowed franchises to make better-informed investments in their recruitment.”

But big data’s influence on digital content isn’t limited to films and TV series; the music industry is also undergoing a data-fueled revolution of its own.

Tuning the Perfect Pitch

Founded just two years after Hastings and Randolph created Netflix, Shazam was established by a group of friends who wanted to realize their dream of creating a solution for real-time ambient music recognition. Their product, a mobile app that allows users to ‘listen’ to a few seconds of a song with their mobile device, and receive instant identification for discovery purposes, has radically changed the future of music—for better or for worse.

Like Netflix, the company has access to real-time data on the listening behaviors of millions of users around the world: what they’re listening to, when they’re listening, and where. This might seem trivial at first, but when the facts are seen through the lens of ‘the search for the next great sound’, it’s a whole new focus for the record labels. Shazam users might be discovering something new about a song they like, but Shazam is learning something very profitable about its users.

“Algorithm-based predictive tools have the ability to detect emerging trends, identify obscure songs that have the elements of a popular hit but lack a major star or backing, and enable firms to understand when the popularity of an offering or genre is starting to decline,” explains Hsu. “It allows them to study variation in the broader market landscape, to distinguish between segments of customers based on their preferences and decisions, and develop customized recommendations for specific customers.”

And that’s just one method of using music-based data. Tour managers, tasked with scheduling promotional tours for breakout artists, can select specific geographical locations and venues, and make stops only where ticket sales are likely to be high, based purely on listener behavior.

“Firms can increasingly take advantage of the vast amount of available information, to understand what to invest in, how customer preferences are evolving, and how best to reach targeted customers,” adds Hsu.

Ever wondered why a particular musician skipped over your city on the last tour? Big data predictions could be your answer.

A Prediction Engine

In the never-ending sea of free and paid content, as more customer preference data becomes available, algorithm-based approaches to content creation will become increasingly important. The most innovative brands and content creators will blend hard data with original ideas to create and deliver content that is tailor-made for users but doesn’t feel as though it were created by a robot.

Let’s face it: in the age of ‘Netflix and Chill’, people want to discover new content all the time, and recommendation engines can be the pulsing life force that keeps a streaming content platform fresh. But at what point might we be letting go of truly innovative concepts in favor of content tailored to the popular vote?

“There is a potential danger in big data, in that highly innovative offerings that don’t currently resemble existing popular offerings may go overlooked,” explains Hsu.

“Consider the film The Blair Witch Project, which took the marketplace by storm in 1999. It helped popularize a new sub-genre of ‘found footage’ horror. Could algorithms have predicted that popularity? If production studios were basing their investments only on patterns detected in existing popular films, the answer may likely have been ‘No’.”