Monday, July 1, 2013

Networks of the "Sight & Sound" film polls

There are at least three things wrong with "best of" lists: (i) there is rarely any clear idea of what "best" is supposed to mean; (ii) the list is of arbitrary length (eg. Top-10 only); and (iii) the ranking does not reflect the differences in the original scores.

A good example of all three problems is the "Greatest Films Poll" produced every decade by Sight & Sound magazine, which lists the Top 10 films as voted by selected film critics. You will find dozens of web sites that reproduce these lists (which started in 1952), with arguments and counter-arguments about the films that have appeared on the various lists. Much of the commentary on the latest poll is summarized here.

As far as point (i) is concerned, since the lists are compiled from the voting of film commentators, rather than film makers or the general viewing public, this clearly defines "best" as having something to do with the things that appeal most to critics. For example, Scott Tobias notes that: "Since its inception, the poll has championed films that have dramatically altered the [film] landscape", rather than films that the general public enjoys. We cannot change this bias, and for the sake of the argument, we will accept the critics' point of view in this blog post (ie. we are going to look at films that are challenging, or that changed the way things are done).

However, we can easily address the other two problems in a quantitative way. For point (ii) we do not need to truncate the list at an arbitrary number like 10, and for point (iii) we can use the original votes because (to one extent or another) they are available on the web. In this post, I use a network to investigate the patterns in the data from the votes cast in all seven of the Sight & Sound polls to date.

The poll

Sight & Sound magazine solicits a list of their "top 10 films" from each of a number of critics. [Note that all each critic does in order to vote is write a list of 10 films, and send it in.] For example, in 2012 the magazine solicited lists from 1000 commentators and received 846 lists in reply. However, the numbers of lists available for the polls were only 119-145 in 1982-2002 and 31-35 in 1952-1972. The number of times each film appears in any of the critics' lists is summed, and the total is used to produce the Top-10 list for that particular poll (ie. the total equals the number of critics who listed the film in their Top 10). However, the magazine's Top-10 list rarely contains precisely 10 films, due to ties in the voting.
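The counting procedure described above can be sketched in a few lines of Python. This is only an illustration: the function names `tally` and `top_n_with_ties` are hypothetical, and the ballots are invented and shortened to three films each. It shows why a "Top N" list built this way can contain more than N films when votes are tied at the cut-off.

```python
from collections import Counter

def tally(ballots):
    """Count how many ballots list each film (one vote per film per ballot)."""
    counts = Counter()
    for ballot in ballots:
        # A set guards against a film being repeated on a single ballot.
        counts.update(set(ballot))
    return counts

def top_n_with_ties(counts, n=10):
    """Return (film, votes) pairs for the top n, keeping every film tied with the n-th."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [(film, votes) for film, votes in ranked if votes >= cutoff]

# Invented ballots, for illustration only (real ballots list 10 films).
ballots = [
    ["Vertigo", "Citizen Kane", "Tokyo Story"],
    ["Vertigo", "Tokyo Story", "The Searchers"],
    ["Citizen Kane", "Vertigo", "Sunrise"],
    ["The Searchers", "Sunrise", "8½"],
]

counts = tally(ballots)
top2 = top_n_with_ties(counts, n=2)

# Share of critics who listed each film, which is the figure that matters
# when interpreting the ranking.
share = {film: votes / len(ballots) for film, votes in counts.items()}
```

With these invented ballots, the "Top 2" list contains five films, because four films are tied for second place.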

It is important to emphasize how these data can be interpreted, because most of the media reports get it badly wrong. To say, as many of the media did, that "In the 2012 poll the critics voted Vertigo the best movie of all time" is wrong, because the vast majority of the critics (77%) said that this film doesn't even belong in the top 10. And yet it is listed as the no. 1 film, because more critics (23%) put it on their list than did so for any other film. Similarly, 91% of the critics said The Searchers should not be in the top 10, and yet it is ranked no. 7. So, the rank order of the films is simply that — a rank order; it does not tell you how many critics think highly of each film.

The data

I tried to compile all of the available data for the Critics' Poll (not the separate Directors' Poll), and I ended up using the following sources:

The 1972 data are somewhat incomplete, and the 1982 data are somewhat doubtful, but I could not find multiple copies of these lists to cross-check them.

Most films get very few votes in any one year. For instance, in 2012 the 5th placed film was listed by only 11% of the critics. Clearly, rank order means little below that point, as the order of the films then involves splitting hairs. Part of the problem here is that most of the film directors have the critics' votes spread across several of their films, so that each of their films receives only a few votes even though the director actually accumulates many votes for their whole body of work. This does not seem to happen for the top-ranked films, where one film is "chosen" as the outstanding one, and almost all of that director's votes go to that single film.

Moreover, most films do not appear regularly in the polls. For example, of the 94 films that appeared at least once in the top 30 across all of the polls, only 4 appeared in all 7 polls, and 50 of the films appeared in the top 30 only once.

The analysis

In order to make the data comparable between polls, I have had to break the dataset up into two overlapping subsets, because of the paucity of data available in the polls from the early years:

data for all seven polls, which consist of the vote scores only for movies that appeared in the top c. 30 ranking (a total of 94 films) — the number of films varied from 27-33 between polls depending on tied votes, except 1972 for which I could find data only for the top 23 films;

data for the 1982, 1992, 2002 and 2012 polls only, which consist of the vote scores for movies that appeared in the top c. 130 ranking (a total of 232 films) — the number varied from 125-130 between polls depending on tied votes.

In the latter case, 53 of the 232 films appeared in all 4 years, while 95 films appeared in one poll only.

For the technical details of the analysis, I normalized the data within each poll, and I then calculated the similarity of the poll results using the Steinhaus dissimilarity (which ignores "negative matches", as discussed in a previous blog post). A Neighbor-net analysis was then used to display the between-film similarities as a phylogenetic network. So, films that are closely connected in the network are similar to each other based on their poll results, and those that are further apart are progressively more different from each other.
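For readers unfamiliar with the Steinhaus dissimilarity, here is a minimal Python sketch. It assumes that "normalized" means dividing each film's vote count by the number of ballots in that poll, which the text above does not specify, and the function names and vote counts are invented for illustration. The Steinhaus measure is the same quantity that is often called the Bray-Curtis dissimilarity: one minus twice the sum of the element-wise minima, divided by the sum of both profiles. The Neighbor-net step itself is not reproduced here.

```python
def normalize(votes, n_ballots):
    """Convert raw vote counts to the fraction of ballots listing each film.

    Assumption: this is one plausible normalization, not necessarily the
    one used for the networks in this post.
    """
    return {film: count / n_ballots for film, count in votes.items()}

def steinhaus_dissimilarity(x, y):
    """Steinhaus (Bray-Curtis) dissimilarity between two score profiles.

    x and y map item names to non-negative scores. Items that score zero
    (or are absent) in both profiles add nothing to either sum, so
    "negative matches" are ignored.
    """
    keys = set(x) | set(y)
    shared = sum(min(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # two empty profiles, treated as identical by convention here
    return 1.0 - 2.0 * shared / total

# Invented vote counts for two hypothetical polls.
poll_a = normalize({"Film A": 20, "Film B": 10}, n_ballots=100)
poll_b = normalize({"Film B": 30, "Film C": 10}, n_ballots=200)

d_ab = steinhaus_dissimilarity(poll_a, poll_b)
```

For these invented numbers the dissimilarity is roughly 0.6. The same function can be applied to films rather than polls, by giving each film a profile of its scores across the polls. Two profiles with no items in common have a dissimilarity of 1, and identical profiles have a dissimilarity of 0.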

Comparison of the polls

When comparing all 7 polls, the important point turns out to be that only 4 / 94 films appeared in all 7 polls, while 50 / 94 films appeared in only one of the polls. This means that there is very little consistency between the polls.

The network illustrates this by arranging the years in an anti-clockwise circle. So, the 1952 poll shares a lot of films with the 1962 poll, which shares films with the 1972 poll, and so on around the circle. Thus, the 1952 poll shares little with the 2012 poll. Therefore, we can conclude that film preference changes through time for the critics. This is partly because the critics change (ie. they come and go, as critics), and also because the films available change, with new films appearing constantly. This can be investigated further by looking at networks of the films themselves, which I do next.

The seven polls for 1952-2012

The relationships among the films, as shown in the network, are strongly determined by which polls they appeared in, rather than by where they were ranked in those polls. That is, the locations of the films in the network are determined most by their presence / absence in each of the seven polls.

Clearly, how many polls a film can appear in is determined by when the film was made, as well as when it achieved critical appreciation. There are 17 films that have appeared in the Top 30 list for every poll after they first got onto the list.

Four films made it into all 7 polls: Citizen Kane; La Règle du Jeu (The Rules of the Game); Battleship Potemkin; Passion de Jeanne d'Arc (The Passion of Joan of Arc)
Two films missed the first poll only: Sunrise: A Song of Two Humans; L'Avventura
Four films missed the first two polls only: Vertigo; The Searchers; 2001: A Space Odyssey; 8½
Two films missed the first three polls: Seven Samurai; Singin' in the Rain
One film missed the first four polls: À Bout de Souffle (Breathless)
Four films missed the first five polls: Rashomon; The Godfather; The Godfather Part II; Au Hasard Balthazar
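The criterion used for the groups above (a film stays in the Top 30 for every poll after it first gets onto the list) can be expressed as a simple presence / absence check. The sketch below is illustrative only; the function names are hypothetical, and the example sets of years are taken from the descriptions in this post rather than from the full dataset.

```python
POLLS = [1952, 1962, 1972, 1982, 1992, 2002, 2012]

def stayed_after_entry(years_present, polls=POLLS):
    """True if the film appears in every poll from its first appearance onward."""
    present = [year in years_present for year in polls]
    if not any(present):
        return False
    first = present.index(True)
    return all(present[first:])

def polls_missed_before_entry(years_present, polls=POLLS):
    """Number of polls held before the film first made the list."""
    present = [year in years_present for year in polls]
    return present.index(True)

# Presence patterns as described in the text of this post.
citizen_kane = set(POLLS)                        # all 7 polls
vertigo = {1972, 1982, 1992, 2002, 2012}         # missed the first two polls only
the_general = {1962, 1972, 1982, 1992, 2002}     # missed the first and last polls
enfants_du_paradis = {1952, 1982}                # appeared in two polls only

persistent = {
    "Citizen Kane": stayed_after_entry(citizen_kane),
    "Vertigo": stayed_after_entry(vertigo),
    "The General": stayed_after_entry(the_general),
    "Les Enfants du Paradis": stayed_after_entry(enfants_du_paradis),
}
```

Here Citizen Kane and Vertigo pass the check, while The General and Les Enfants du Paradis do not, because each of them dropped out of at least one poll after first appearing.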

Other films have come and gone from the Top 30, which determines their position in the network. For example, The General did not make the first or last polls but did make all of the ones in between; and Ugetsu Monogatari missed the first one and the final two. Others have come and gone sporadically, notably Greed, City Lights, Bicycle Thieves, Intolerance, and Wild Strawberries (Smultronstället). The Gold Rush and La Grande Illusion (The Grand Illusion) made only the first three lists, and Zéro de Conduite, The Childhood of Maxim Gorki, Monsieur Verdoux, and Earth made the first two lists only — these films have clearly fallen from favour.

Les Enfants du Paradis (Children of Paradise) has a strange position in the network because it appeared only in the 1952 and 1982 Top 30 lists. Most oddly, Tokyo Story, L'Atalante, and Pather Panchali made the 1962, 1992, 2002 and 2012 lists (except Pather Panchali in 2012) but none of these appeared in the 1972 or 1982 Top 30 lists. Perhaps this relates to the poor quality of the data that I have for those two polls.

Anyway, there is little evidence here of consistency regarding how appealing any particular film is to the critics. I won't go so far as to say there are fads through time, but clearly the idea of "best films" is not a particularly constant thing across decades.

The four polls for 1982-2012

The four-poll dataset includes many more films for which there are data available. This dataset shows much stronger clustering of the films, because there are far fewer possible patterns of relationship across only 4 polls instead of 7. I have labelled only the Top 10 films from the most recent poll (2012), since that is the current "best" list.

Of the 232 films, 53 appeared in the Top 130 list for all four polls, 13 films missed only the first poll, and 8 missed only the last poll. The remaining films have appeared sporadically, with 95 of them appearing only once.

The top 15 films stand out from the others based on their average score across the four polls, and they form the cluster in the bottom-right corner of the network. That is, the network suggests that we should have a Top-15 list (not a Top-10 list). In order of average critic scores, the films are:

1. Citizen Kane
2. La Règle du Jeu
3. Vertigo
4. Tokyo Story
5. 2001: A Space Odyssey
6. Battleship Potemkin
7. The Searchers
8. Sunrise: A Song of Two Humans
9. 8½
10. Singin' in the Rain
11. Seven Samurai
12. Passion de Jeanne d'Arc
13. L'Atalante
14. L'Avventura
15. The General

Note that this list, and its order, is based on all four polls. Consequently, it reflects each film's assessment over 40 years, rather than merely its current popularity. Note, also, that the top 2 films (Citizen Kane, and La Règle du Jeu) also appeared in all 7 polls (see above), indicating that they have been consistently appreciated by the critics for 70 years, with only Battleship Potemkin and Passion de Jeanne d'Arc as competitors. Of the directors involved in these four films, probably only Orson Welles is well-known to the general public, possibly because of the age of the films (only four of the Top-15 have been made in my lifetime!).

In the list there are 6 American films, 3 French, 2 Japanese, 2 Italian, 1 Russian, and 1 American-British collaboration. Of the 15 films, 7 were nominated for at least one Academy Award, but only 4 of them won one. Very few of them were particularly popular with the public when they were first released.

As noted above, no director appears more than once in this list, as one of their films has been singled out by the critics. These are not necessarily the directors' best films, but are more likely to be their most radical ones, in the sense of having influenced other filmmakers and the critics. The big losers in this sense are those directors who have many films in the list, none of which scores well on its own. For example, Jean-Luc Godard has 8 nominated films, as does Luis Buñuel, while Robert Bresson has 7 films, Howard Hawks, Satyajit Ray, and Charles Chaplin all have 6, and Ingmar Bergman has 5, as also does the team of Michael Powell & Emeric Pressburger.

Finally, the graph below illustrates the changing nature of the critics' choices. It shows the fate of 9 of the top 10 films from the list over the four polls. The top five films in the most recent poll (2012) have all changed consistently in score across the four polls, with two decreasing (Citizen Kane, La Règle du Jeu) and three increasing (Vertigo, Sunrise: A Song of Two Humans, and 2001: A Space Odyssey). The most obvious connection between Citizen Kane and Vertigo (the former and current No. 1, respectively) is that Bernard Herrmann wrote the music for both movies, and his music has been considered to play an important dramatic role in both cases.
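Saying that a film has "changed consistently" across the four polls amounts to saying that its score moves in the same direction at every step. A minimal Python sketch of that check follows; the function name `classify_trend`, the film labels, and the scores are all invented for illustration and are not the poll data.

```python
def classify_trend(scores):
    """Label a sequence of poll scores as increasing, decreasing, or mixed.

    A trend counts as consistent only if every poll-to-poll step moves
    in the same direction.
    """
    if len(scores) < 2:
        return "mixed"
    steps = [later - earlier for earlier, later in zip(scores, scores[1:])]
    if all(step > 0 for step in steps):
        return "increasing"
    if all(step < 0 for step in steps):
        return "decreasing"
    return "mixed"

# Invented percentage scores for the 1982, 1992, 2002 and 2012 polls.
example_scores = {
    "Film rising": [5.0, 9.0, 14.0, 23.0],
    "Film falling": [30.0, 28.0, 24.0, 19.0],
    "Film wavering": [10.0, 12.0, 9.0, 11.0],
}

trends = {film: classify_trend(s) for film, s in example_scores.items()}
```

Any film labelled "mixed" by this check would be one whose standing with the critics has gone up and down between polls, rather than drifting steadily in one direction.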

Conclusion

Those of you who are interested in the top-rated movies might like to look at Bill Georgaris' list from February 2013:

2 comments:

". . . Scott Tobias notes that: 'Since its inception, the poll has championed films that have dramatically altered the [film] landscape', rather than films that the general public enjoys."

There is a straightforward measure for the "best" film enjoyed by the public: admission ticket sales at movie theaters (which unlike ticket sales revenue, doesn't have to be adjusted for inflation). A real-world "vote of one's wallet."

"Gone With the Wind" sold roughly 200 million movie theater admission tickets. In comparison, "Avatar" sold 97 million tickets.

"What is the one film on your favorites list that you wish you had made?"

Fuhgettabout the dubious distinction of whether it "dramatically altered the [film] landscape."

We see these polls querying artists all the time, when they declare "Damn, I wish I had (i) written, (ii) composed, (iii) painted, (iv) sculpted, (v) filmed, (vi) recorded that work of art."

Those heartfelt declarations of admiration, awe, even professional jealousy carry more weight with me . . . a guy who lives in "The Film Capital of the World" and has worked in the entertainment industry.

I suspect that the changes over time are due to film professors retiring, and new tenures choosing different films to show in their classes. Hence the likes of 'Bicycle Thieves' and 'Wild Strawberries' simply haven't been seen by the voters, since they weren't hailed as masterpieces in the classes they attended. This is probably even more skewed when considering non-English language films.