When Netflix wanted to create a hit TV show, it turned to data. By analyzing its viewers’ habits, Netflix found that its customers particularly liked Kevin Spacey, director David Fincher, and political thrillers. In part because of these interests, Netflix brought the three together to create House of Cards, and thus far the results have been tremendous.

Having binged our way through Season 2 of House of Cards, we in the entertainment world now turn our attention to the Oscars, and particularly, the race for Best Picture. In doing so, perhaps we could take a page from Netflix’s book. Perhaps, using data about movies and the relationships between them, we can identify a perfect cocktail of movie attributes (PG-13-rated biopics about celebrities, or heart-wrenching World War II stories directed by Steven Spielberg, or anything related to Michael Bay) that strikes every Best Picture nerve. Perhaps, just like a hit TV series, a Best Picture can be engineered.

Using data collected from Rovi, I explored attributes that define Best Pictures. In addition to finding characteristics that frequently appear in nominees for Best Picture, I also looked for what made candidates for Best Picture stand out from other Oscar nominees. For example, it’s well known that most Best Picture nominees are dramas, but hundreds of dramas are released every year. Maybe there are rare, niche genres that only produce a few films a year—but always get noticed by the Academy.

Furthermore, concluding “we should make a drama” isn’t very instructive. Why not sketch out the entire movie, complete with a plot, themes and tones, and a cast and crew?

The following does exactly that. I first define the rough outline and plot of the movie, then cast it and pick a crew. Finally, I build a model to blend together movie titles and synopses from Oscar-nominated films. The result is the ultimate Frankensteinian Oscar-bait, and like Frankenstein, it could be a triumph or an abomination.

Sorting out the Basics

When engineering a Best Picture, a few elements are essential. First, popular opinion about dramas is correct—dramas do much better than other genres. Fifty percent of movies nominated for any Oscar are dramas, compared to 75 percent for Best Picture nominees. However, there’s a stronger bias for crime dramas and biographical films, suggesting that some specialization could be beneficial.

On the other end of the spectrum, comedy, science fiction, action, and horror movies are all under-represented in Best Pictures. Out of the roughly 550 movies of these types nominated for Oscars, fewer than 30 were nominated for Best Picture.

Second, the film shouldn’t be a sequel. Though sequels do well at the box office (for example, Transformers 2 and 3; The Matrix 2 and 3; Spider-Man 2 and 3; Batman 2 and 3; Pirates of the Caribbean 2, 3, and 4; Twilight 2, 3, 4, and 5; and Harry Potter 2, 3, 4, 5, 6, 7, and 8), they rarely impress Oscar voters. Only three of over 80 sequels nominated for Oscars were nominated for Best Picture.

While the Academy shuns sequels for Best Picture, it actually favors adapted writing over original writing. Over 50 percent of the nominees for Best Adapted Screenplay (or a past variant of the award) were also nominated for Best Picture, but less than 40 percent of the nominees for Best Original Screenplay garnered Best Picture nominations.

Constructing a Plot

This gives us the basic outline of a film: it should be a well-written, adapted biopic about crime. But other studies have already confirmed these discoveries. Surely we can be more specific. Fortunately, Rovi provides detailed classifications of films’ tones and moods, and identifies specific characteristics and keywords related to their plots. Using these attributes, I developed a more well-defined Best Picture.

The table shows the themes and plot elements most over-represented in Best Picture nominees (it excludes characteristics that only appear in very few films). Broadly, Best Picture nominees should take on sweeping themes and be bittersweet and compassionate. They should never be goofy, silly, or campy.

Though the tone of the films is best in a minor key, they should end triumphantly. Light, just-for-fun adrenaline rushes struggle (apologies to Michael Bay).

Regarding the specifics of the plot, movies about cross-cultural relations and forbidden loves are strong performers. The best cultural differences to explore are those that emerge from economic inequality: stories about class differences and about servants and employees are among the Academy’s favorites. Films about social injustices, particularly those that address racism and mental illness, are well-received, though movies that touch on injustices done to Native Americans may be ignored. Other important characteristics to avoid are kidnapping, self-referential movies about filmmaking, pregnancy, and evil aliens (apologies to Michael Bay).

Though these plotlines are solid bets, they’re all also well-trodden—what if we want to try something a little edgier? Though they’re infrequently made (and excluded from the table above), films about wheelchairs and farm life have played well to Best Picture voters. A movie about the struggles of a wheelchair-ridden farmer, perhaps?

Finally, the movie should present these difficult and complex themes in depth and without censorship. Oscar-nominated films are an average of 114 minutes long, Best Picture nominees are an average of 130 minutes long, and Best Picture winners are an average of 142 minutes long. Moreover, the film should be R-rated: 40 percent of Oscar-nominated films are R-rated, compared to 50 percent of Best Picture nominees.

Casting a Best Picture

Bad acting can ruin a film. Can good acting make a movie a Best Picture?

Though it doesn’t appear critical, casting good actors certainly helps: About 60 percent of films nominated for best actor or best supporting actor were nominated for Best Picture. Sadly, the gendered “actor” isn’t incidental—only around 45 percent of films recognized for outstanding performances by actresses are nominated for Best Picture.

When filling the roles, we may be inclined to turn to the greats like Meryl Streep, Tom Hanks, and Jack Nicholson. These three, along with Harrison Ford, Dustin Hoffman, Robert De Niro, and Leonardo DiCaprio, have been in a number of films nominated for Best Picture, but they’ve also been in many great films that weren’t nominated for Best Picture. When casting the film, especially if it’s on a budget, we want actors and actresses who collect Best Picture nominations as efficiently as possible.

To round out the supporting cast, Billy Boyd—a poor man’s John Cazale—is an obvious first choice. All four of the Oscar-nominated movies in which Boyd has had parts (all three Lord of the Rings films and Master and Commander) were nominated for Best Picture. After Boyd, Shane Rimmer and Peter Cellier are among the best men, while Miranda Otto and Talia Shire stand out among the women.

(As an aside, this film has run into a minor issue at this point. The Academy indirectly told us that we should focus on issues about race. However, the Oscars have historically favored white men. Maybe Oscar-nominee The Last Samurai found the solution to this dilemma—make the hero of the oppressed race…Tom Cruise.)

Finding a Worthy Crew

A quality crew is perhaps even more important than the cast. Seventy-five percent of films nominated for Best Director were also nominated for Best Picture, which is the strongest overlap Best Picture nominees have with any other Oscar category.

To lead the crew, Martin Scorsese is a clear choice for the director. He has more Best Picture nominations than any other director, and has collected them with amazing efficiency. Beyond Scorsese, Norman Jewison, Ang Lee, and James Brooks would all be strong choices. On the other side of the coin, Tim Burton and Michael Bay have been very successful at having their films nominated for Oscars, but not as Best Pictures (apologies to Michael Bay).

The “Moneyball” picks for the crew are editor Thelma Schoonmaker and cinematographer Robert Richardson. The two have been involved in a total of 27 Oscar-nominated films, 15 of which were nominated for Best Picture. Moreover, films nominated for best editing and best cinematography were also nominated for Best Picture at rates above 50 percent, suggesting these two may provide the most bang for their buck of any cast or crew member. Notably, Schoonmaker typically works with Scorsese, so she may be riding his coattails, or he may be riding hers.

When looking for other crew members, we should focus on sound over visual elements. Forty-two percent of films nominated for Best Score received Best Picture nominations. (Interestingly, the rate was only 14 percent for films nominated for Best Song.) By contrast, the overlaps between Best Picture nominees and Best Costumes, Best Makeup, and Best Visual Effects nominees are among the weakest of any Oscar category, excluding those dedicated to particular genres like documentaries or shorts.

Finally, the Weinstein brothers easily top the list as the best candidates to bankroll the movie. Like Scorsese, they’ve been remarkably successful in getting films they produce nominated for Best Picture, and have done so with impressive efficiency.

Bullet to the Dark Side

Based on this outline, 12 Years a Slave appears to be the clearest Oscar-bait among this year’s Best Picture nominees. It’s a dramatic biopic; crime is central to the plot (though one of the chief crimes is kidnapping); it was nominated for Best Directing and Best Editing; and the arc of the plot—a hopeless social injustice, accented with cross-cultural relationships, breaks the audience’s spirit before ending in poignant salvation—fits the mold perfectly.

But we can do better. The cast could be improved. At “only” 134 minutes, it could be longer. And imagine if Scorsese had directed it.

What film would be better? Using a model to blend titles and plot synopses of all the Oscar nominees over the last five decades, I generated several potential Best Picture-worthy titles and plot summaries. Among the randomly generated plots and titles produced by the model, the two titles and six plots below best fit the themes and tones recommended by the analysis above (I paired the synopses with the titles they best matched). Some of the results could be clear winners; others, however, might need some of that Schoonmaker magic…

Suggested Title 1: Bullet to the Dark Side

Possible Plot 1: A portrait of a newlywed couple who are reunited in the Afghan mountains.

Possible Plot 2: A ‘50s housewife and a disgraced cop team up to exact revenge upon her one-time lover.

Suggested Title 2: Hurt Me the Hidden World

Possible Plot 1: A suicidal former Union soldier ends up joining a Sioux tribe. He then takes up arms to defend them when they become entangled with Russian mobsters in London.

Possible Plot 2: A farmer tries to woo a wealthy uncle, meets and falls for an agnostic Roman soldier during WWII.

Possible Plot 3: A rich playboy who escapes from prison to reunite their divorced dad poses as an eccentric teacher at an unconventional brothel.

So to all the aspiring writers and filmmakers in the world, you now know what to do. The path before you is clear. Six outstanding movies are practically written. The necessary themes and plot twists are known. All that’s left to do is assemble the right cast and crew, and collect the inevitable hardware.

Data

I collected data via the Rovi Cloud Services API. While Rovi provides an impressive amount of data on each film, the dataset still has a few holes, most notably regarding the awards each film was nominated for (data on Best Picture nominees is complete). Additionally, Rovi provided no data on about 30 of the 2,900 films that were nominated for an Oscar over the last fifty years. The list of Oscar winners was collected from the Academy Awards Database.

To determine the top attributes in a Best Picture, I found which attributes were most over-represented in Best Pictures relative to all Oscar nominees. Unless otherwise noted, when finding top themes and plot elements, I only considered those attributes that appeared in at least 10 of the nearly 3,000 Oscar-nominated movies.
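
As a rough illustration of the over-representation calculation described above, here is a minimal Python sketch. The exact metric isn't spelled out in this article, so the ratio used here (an attribute's share among Best Picture nominees divided by its share among all Oscar nominees) is an assumption. The function name `overrepresentation`, the `attributes` and `best_picture` field names, and the toy films are likewise hypothetical and do not reflect Rovi's actual data format.

```python
from collections import Counter


def overrepresentation(films, min_count=10):
    """Rank attributes by (share among Best Picture nominees) /
    (share among all Oscar nominees). Assumed metric, for illustration."""
    n_all = len(films)
    n_bp = sum(1 for film in films if film["best_picture"])
    if n_all == 0 or n_bp == 0:
        return []

    all_counts = Counter()
    bp_counts = Counter()
    for film in films:
        attrs = set(film["attributes"])  # count each attribute once per film
        all_counts.update(attrs)
        if film["best_picture"]:
            bp_counts.update(attrs)

    scores = []
    for attr, count in all_counts.items():
        if count < min_count:  # skip attributes that appear in too few films
            continue
        share_all = count / n_all
        share_bp = bp_counts[attr] / n_bp
        scores.append((attr, share_bp / share_all))

    # Highest ratio first: a ratio above 1 means over-represented.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores


# Toy data (made up); min_count is lowered because the sample is tiny.
toy_films = [
    {"title": "A", "attributes": ["drama", "biography"], "best_picture": True},
    {"title": "B", "attributes": ["drama"], "best_picture": True},
    {"title": "C", "attributes": ["comedy"], "best_picture": False},
    {"title": "D", "attributes": ["drama", "comedy"], "best_picture": False},
]
ranked = overrepresentation(toy_films, min_count=1)
# biography scores 2.0, drama about 1.33, comedy 0.0
```

On the real dataset, the default `min_count=10` would mirror the 10-film threshold mentioned above.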

While comparing Best Picture nominees to other Oscar nominees rather than all films introduces some bias (Oscar nominees aren’t a perfect sample of all movies), it has benefits as well. The dataset is restricted to movies that had some degree of critical or popular success, excludes made-for-TV movies, and largely focuses on American films (few foreign films are nominated for Best Picture).

The title and synopsis mash-ups were randomly generated using a Markov n-gram model trained on a dataset of all Oscar nominees. Because the set of Best Picture nominees is small, creating n-gram models using only Best Picture titles and synopses unfortunately isn’t possible.
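
The mash-up generator could look something like the following sketch of a word-level Markov n-gram model. The article doesn't specify the details, so the order (two words of context), the start and end markers, the function names `train` and `generate`, and the toy synopses are all assumptions made for illustration.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical padding markers


def train(texts, order=2):
    """Map each run of `order` words to the words observed right after it."""
    model = defaultdict(list)
    for text in texts:
        tokens = [START] * order + text.split() + [END]
        for i in range(len(tokens) - order):
            state = tuple(tokens[i:i + order])
            model[state].append(tokens[i + order])
    return model


def generate(model, order=2, max_words=40, rng=None):
    """Random-walk the model from the start state until END or max_words."""
    rng = rng or random.Random()
    state = (START,) * order
    words = []
    while len(words) < max_words:
        choices = model.get(state)
        if not choices:  # unseen state (e.g. empty training data)
            break
        nxt = rng.choice(choices)  # frequent continuations are picked more often
        if nxt == END:
            break
        words.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(words)


# Toy synopses (made up), standing in for the Oscar-nominee synopses.
toy_synopses = [
    "A farmer falls for a soldier during the war.",
    "A soldier returns home to a farmer during the harvest.",
    "A disgraced cop falls for a housewife during the war.",
]
toy_model = train(toy_synopses, order=2)
mashup = generate(toy_model, order=2, rng=random.Random(0))
```

Every three-word window in a generated sentence comes from some training synopsis, which is why the output reads as locally plausible but globally strange. It also shows why a small training set is a problem: with few synopses, most states have only one continuation, and the model mostly reproduces its inputs.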

To anyone who is interested in this work and would like to explore other methods for characterizing Best Picture nominees, I’m happy to share all my analysis. The conclusions presented here are a simple start to figuring out what makes a Best Picture; like so many other analyses, this one could be greatly strengthened by others’ data and others’ ideas.