Can Big Data Prevent Hollywood’s Blockbuster Flops?

In June, iconic director Steven Spielberg predicted that Hollywood was on the verge of an “implosion” in which “three or four or maybe even a half-dozen megabudget movies are going to go crashing to the ground.” The resulting destruction, he added, could change the film industry in radical and possibly unwelcome ways.

Welcome to the Boom.

In the weeks since Spielberg’s prediction, six wannabe blockbusters have cratered at the North American box office: “R.I.P.D.,” “Turbo,” “After Earth,” “White House Down,” “Pacific Rim,” and “The Lone Ranger.” These films featured big stars, bigger explosions, and top-notch special effects—exactly the sort of summer spectacle that ordinarily assures a solid run at the box office. Yet for whatever reason, all of them failed to draw in the massive audiences needed to earn back their gargantuan budgets (although—like “Waterworld” and some other epic flops—they might eke out a small profit after years of digital-download sales and cable reruns).

Could salvation lie in Big Data analytics? Google has suggested that, by crunching aggregate search queries and other datasets, movie studios can predict how well their films will perform ahead of time—perhaps with enough leeway to at least turn a few box-office bombs into moderate successes.

“By understanding how and what they are searching for, we can uncover unique insights into moviegoers’ awareness and intent,” reads the introduction to Google’s whitepaper on the matter. Search query patterns and paid clicks, it added, “can help us in the quest to quantify ‘movie magic,’ and ultimately predict box office performance.”

But how would such an analytics model actually work? As with so many other things these days, people turn to online resources to figure out which movies to see—they watch trailers, read reviews, and scan what their friends are saying on social networks. Google data suggests that a correlation exists between movie-related searches and opening-weekend box-office performance—lots of people type in “Iron Man” on weekends when a new “Iron Man” movie is opening, for example.

“During traditionally slow periods in the box office, generic non-title keywords over-index,” the whitepaper suggested, “signaling moviegoers’ (a) general curiosity and lesser awareness of films being released during this period, and (b) broadening their consideration set to multiple titles.”

After analyzing 99 films and crunching the data through a simple linear regression model, Google found that 70 percent of the “variation in box office performance can be explained with search query volume.”
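To make that "variation explained" figure concrete, here is a minimal sketch of a simple linear regression and its R^2 value in plain Python. The numbers are invented for illustration; they are not Google's data, and the variable names (`search_volume`, `opening_gross`) are assumptions rather than anything taken from the whitepaper.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Hypothetical films: relative search volume vs. opening gross ($ millions).
search_volume = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
opening_gross = [12.0, 25.0, 20.0, 48.0, 41.0, 70.0]

slope, intercept = fit_simple_regression(search_volume, opening_gross)
r2 = r_squared(search_volume, opening_gross, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

An R^2 of 0.70 would mean the fitted line accounts for 70 percent of the spread in opening grosses across the sample, with the remaining 30 percent left unexplained by search volume.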

That’s pretty good, considering that a perfect model would explain 100 percent of that variation. Throw in data from people clicking on paid movie ads, along with theater count and “franchise status” (i.e., whether the film has a prebuilt audience, such as a James Bond or Pixar flick), and the model’s explanatory power rises from 70 percent to 92 percent.
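The effect of adding predictors can be sketched the same way. The example below fits one regression on search volume alone and another on search volume, theater count, and franchise status, then compares the two R^2 values. All figures are made up, and the solver is a bare-bones normal-equations approach chosen to keep the example dependency-free; it is not how Google built its model.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_r_squared(features, ys):
    """Fit least squares with an intercept and return the in-sample R^2."""
    rows = [[1.0] + list(r) for r in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    preds = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical films (all values invented for illustration).
search = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
theaters = [2.0, 2.5, 2.2, 3.5, 3.0, 4.0, 3.2, 4.2]   # thousands of screens
franchise = [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # 1 = prebuilt audience
gross = [10.0, 18.0, 15.0, 52.0, 30.0, 75.0, 38.0, 90.0]

r2_search_only = fit_r_squared([[s] for s in search], gross)
r2_full = fit_r_squared(list(zip(search, theaters, franchise)), gross)
print(f"search only: {r2_search_only:.2f}, with extra predictors: {r2_full:.2f}")
```

One caveat worth keeping in mind: R^2 measured on the same films used to fit the model can never go down when predictors are added, so a jump like this does not by itself prove the bigger model will forecast unreleased films any better.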

The whitepaper also offers an equation for a film’s longer-term performance, which combines such metrics as previous weekend performance, Rotten Tomatoes audience score, theater count, and search-ad clicks; search query volume apparently fades as a significant factor beyond a film’s first weekend.

With regard to predicting how well a movie will perform before it opens, trailers are apparently the other key factor. “Trailers remain one of the most influential sources throughout the decision process to see a movie,” the whitepaper continued. “In fact, we found that trailers are the most searched for category of information upon discovery of a new film.”

Indeed, four weeks in advance of a release, trailers factored mightily in Google’s predictions of box-office success: “Coupling a film’s t-4 trailer related search query volume with its franchise status and seasonality metrics yields a regression model predicting box office results with R^2 at a nearly perfect 94 percent.” For those who create and market movies, that data suggests that pumping more information about a film online (particularly visual information, such as clips and trailers) ahead of release is a solid way of ensuring eventual box-office success.

Or is it? All the movies that tanked this summer came with massive marketing budgets, and all dutifully issued clips and trailers on a regular basis—for example, “The Lone Ranger” had ten trailers, clips, and behind-the-scenes looks posted to iTunes and YouTube ahead of its release date.

Google’s equations also can’t solve for a key variable: the movies themselves. By the time audiences become aware of a blockbuster on the release slate, the film’s already been scripted, cast, shot, and edited; untold millions of dollars have been spent; the marketing campaign has already been plotted. Even if data analytics can predict a bomb in the making, there’s often little a studio can do about it; not every production has the money or time for the substantial reshoots that saved Brad Pitt’s “World War Z.”

In other words, analytics can help studios refine their rollout strategy for new films—but the bulk of box-office success ultimately comes down to the most elusive and unquantifiable of things: knowing what the audience wants before it does, and a whole lot of luck.

Author Bio

Nick Kolakowski has written for The Washington Post, Slashdot, eWeek, McSweeney's, Thrillist, WebMD, Trader Monthly, and other venues. He's also the author of "A Brutal Bunch of Heartbroken Saps" and "Slaughterhouse Blues," a pair of noir thrillers.