Predicting Blockbusters With Wikipedia: New Research May Predict Whether A Movie Will Flop Or Be A Hit

Researchers may be able to solve the box-office mystery with a simple mathematical model. New research has created a way to predict the success of a movie a month before it is released in theaters.

Movies have long development periods, so the new model, developed by researchers from Oxford University, the Central European University at Budapest and Budapest University of Technology and Economics, may not prevent box-office flops like “The Lone Ranger” or “R.I.P.D.,” but it could bring some insight, and scientific rigor, to predicting a film’s success.

Researchers turned to Wikipedia, something every college student is told not to do, and explored the popularity of 312 American films released between 2009 and 2010. Instead of using the information on these pages, researchers studied the number of edits, human editors, page views and user diversity, which together served as a measure of the buzz around a particular film. Researchers eliminated article creation time and length from the model as they did not show any correlation to box-office success. Values were given to each of these factors, which led to the creation of a mathematical model. The researchers then compared the Wikipedia buzz with the film’s opening-weekend box-office gross and determined there was a strong correlation between the two.
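To make the comparison described above concrete, here is a minimal Python sketch that measures the correlation between a single buzz signal and opening-weekend gross, then fits a straight line to it. The function names and every number in it are invented for illustration; this is not the researchers’ data or their actual model, which combined several Wikipedia signals rather than one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented numbers, for illustration only (not data from the study).
page_views = [12_000, 45_000, 80_000, 150_000, 300_000]
opening_gross = [1.1e6, 4.0e6, 9.5e6, 16.0e6, 31.0e6]

r = pearson(page_views, opening_gross)
slope, intercept = fit_line(page_views, opening_gross)

# Estimate the opening gross of a hypothetical film with 100,000 page views.
predicted = slope * 100_000 + intercept
print(f"correlation: {r:.3f}")
print(f"predicted opening gross: ${predicted:,.0f}")
```

A correlation close to 1 would mean the buzz signal rises and falls with the opening gross, which is the kind of relationship the researchers reported.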

In a lot of ways, using Wikipedia to measure general interest makes sense, as it could be the beginning of a potential filmgoer’s research into a movie. The researchers applied their model to movies that had already been released to determine its accuracy.

The model was able to accurately predict the success of “Inception,” “Alice in Wonderland,” “Toy Story 3” and “Iron Man 2” but was not accurate in determining the lack of success for “Animal Kingdom” or “Never Let Me Go,” which was an adaptation of Kazuo Ishiguro’s novel.

According to the researchers, the model was able to predict the box-office gross of six films to within 1 percent of the actual gross, 23 movies had a 90 percent accuracy, and 70 movies had a prediction accuracy of 70 percent or higher. All told, the model had an accuracy of 77 percent, and the researchers say that’s much higher than the models currently used by marketing firms, which have an accuracy of about 57 percent.
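The article does not say how accuracy was calculated. One plausible reading, assumed here purely for illustration, is one minus the relative error between predicted and actual gross. The sketch below uses that assumed definition, hypothetical function names and invented figures to show how films could be counted against thresholds like those quoted above.

```python
def relative_accuracy(predicted, actual):
    """Assumed definition: 1 minus the relative error, floored at 0."""
    return max(0.0, 1.0 - abs(predicted - actual) / actual)


def count_at_least(pairs, threshold):
    """Count (predicted, actual) pairs whose accuracy meets the threshold."""
    return sum(1 for p, a in pairs if relative_accuracy(p, a) >= threshold)


# Invented (predicted, actual) grosses in millions of dollars, not study data.
pairs = [(99.5, 100.0), (46.0, 50.0), (15.0, 20.0), (30.0, 10.0)]

for threshold in (0.99, 0.90, 0.70):
    n = count_at_least(pairs, threshold)
    print(f"{n} of {len(pairs)} films at {threshold:.0%} accuracy or better")
```

Under this definition, a prediction “within 1 percent of the actual gross” corresponds to an accuracy of 99 percent or higher.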

Online data could be used, in theory, to predict the box office and other social events. Taha Yasseri, from the Oxford Internet Institute at the University of Oxford, said Wikipedia edits require some effort and are a better gauge than Twitter activity, which can be influenced by marketing firms.

Yasseri said in a statement, “We were able to demonstrate how we can use socially generated online data to predict a lot about future human behavior. The predicting power of the Wikipedia-based model, despite its simplicity compared with Twitter, is that many of the editors of the Wikipedia pages about the movies are committed moviegoers who gather and edit relevant material well before the release date.”

The next step for researchers will be testing the model on yet-to-be-released movies. Further improvements, like refining the model to include other variables, such as controversy, could also lead to better accuracy. The research was published in the journal PLOS ONE.