Defining the average screenplay, via data on 12,000+ scripts

A byproduct of that research was that I had a large number of data points on a whole bunch of screenplays. This allowed me to look at what the average screenplay contains.

Hopefully, this research will prove useful to writers, producers and directors looking to understand what a typical screenplay looks like, and will provide a benchmark against which they can assess their own work.

All of these scripts were reviewed by professional script readers, either as a part of a screenplay competition or to create a script report. The vast majority of these scripts will not have been produced into movies yet and a large number of the screenwriters will still be at entry level, rather than professional writers. That being said, within the dataset are scripts which have won awards, been optioned by established producers and been written by professionals and Hollywood stars.

In this article, I’ll share what the typical feature film screenplay contains with regard to seven topics:

Number of pages

Level of swearing

Gender-skewed genres (and who writes female characters)

Number of speaking characters

Number of scenes

Locations and times of day

Age of primary characters

1. Number of pages

The median length across all of our scripts was 106 pages. However, there was a broad spectrum of lengths, with 68.5% of screenplays running between 90 and 120 pages long. As the chart below shows, there are spikes on round numbers; namely pages 90, 100, 110 and 120.

Horror scripts are the shortest, with an average page count of 98.6, while the longest were Faith scripts at an average of 110.0 pages.

2. Level of Swearing

Warning: The charts contain uncensored uses of bad words. If this is not your thing, then skip to the next sub-section.

Almost four out of five scripts contained the word ‘s**t’, with two-thirds featuring ‘f**k’ and just under one in ten using the word ‘c**t’.

Although ‘s**t’ appears in more scripts than ‘f**k’, when a ‘f**k’ does appear it tends to be used more frequently than ‘s**t’. Among the scripts that use them, ‘s**t’ appears an average of 13.2 times, ‘f**k’ 23.9 times and ‘c**t’ 2.1 times.

Unsurprisingly, the swear words were not spread equally across all scripts. I developed a swearing score, based on the frequency of the three swear words I tracked, awarding a ‘1’ for each use of ‘s**t’, ‘1.17’ for ‘f**k’ and ‘8.51’ for ‘c**t’.
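The scoring described above is simple enough to sketch in code. The weights (1, 1.17 and 8.51) are taken from the article; the counting approach below is my own assumption about how such a score might be computed, not the author's actual implementation. (Fair warning: the code necessarily contains the uncensored words.)

```python
import re

# Weights from the article's swearing score: 1 per 's**t',
# 1.17 per 'f**k' and 8.51 per 'c**t'.
WEIGHTS = {"shit": 1.0, "fuck": 1.17, "cunt": 8.51}

def swearing_score(script_text: str) -> float:
    """Weighted count of the three tracked swear words in a script."""
    text = script_text.lower()
    score = 0.0
    for word, weight in WEIGHTS.items():
        # \w* also catches inflected forms such as plurals or '-ing' endings.
        score += weight * len(re.findall(rf"\b{word}\w*", text))
    return score
```

For example, a script containing one ‘s**t’ and one ‘f**k’ would score 2.17 under this scheme.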

Comedies are the sweariest, beating Action and Horror scripts by a tiny margin (Comedy scores 42.8, Action scores 42.5 and Horror scores 41.8). The genres featuring the lowest levels of swearing are Family (1.2), Animated (1.3) and Faith-based scripts (2.8).

Only sixteen scripts used ‘c**t’ without also using either ‘s**t’ or ‘f**k’ at least once.

3. Gender-skewed genres (and who writes female characters)

I have written at length in the past about gender inequality in the film industry, and so I won’t discuss the topic in detail here. However, it is interesting to note how the gender split changes between different genres of scripts in the dataset.

The most male-dominated genres are Action (in which 8.4% of writers were women), Sci-Fi (14.1%) and Horror (14.5%). Women were best represented within Faith (47.2% female), Family scripts (41.5%) and Animated (39.1%).

An interesting finding in last week’s research was that when we look at the scores given by readers, there seems to be an advantage to writing in a genre dominated by another gender.

For example, Action is male-dominated but is also a genre in which female writers outperform their male counterparts by the second-largest margin. Likewise, Family films written by men received higher ratings than those by women.

My reading is that when it’s harder to write a certain genre (either due to internal barriers like conventions or external barriers like prejudice) the writers who make it through are, by definition, the most tenacious and dedicated. This means that in a genre where there are few women (such as Action) the writers that are there tend to be better than the average man in the same genre.

As well as tracking the gender of the writers, I also looked at the gender of the major characters of each script (where it was possible to do so).

In all but one genre, female screenwriters were more likely to create female leading characters. This was particularly pronounced in Historical films, where only 39% of leading characters in male-penned scripts were female, compared with 74% in scripts written by women.

This neatly illustrates one of the many reasons why gender inequality within the film industry can have negative outcomes. As well as basic fairness and equal opportunities, we also have to consider what characters we are seeing in movies. Culture can be defined as the stories we tell ourselves about ourselves, and so an overly-male writing community is likely to lead to a culture which overemphasises the plight of male characters, thereby undervaluing female characters, stories and perspectives.

4. Number of speaking characters

The dataset allowed me to look at the number of unique characters who speak in each script, from our principal hero/heroine right through to background characters with single perfunctory lines.

Historical scripts have the greatest number of speaking characters (an average of 45.7) and Horror scripts have the fewest (25.8). Sadly, I was unable to track how many of those characters were still alive by the final page.

5. Number of Scenes

The average script has 110 scenes – just over one scene per page. Action scripts have the greatest number of scenes (an average of 131.2 scenes) with Comedies having the fewest (just 98.5).

6. Locations and times of day

Each scene heading starts with an indication as to whether the scene takes place inside (“INT” for interior), outside (“EXT” for exterior) or a hybrid (“INT/EXT”).

Across all scripts, 60.2% of scenes are interiors, 38.9% are exteriors and 0.9% are hybrid locations.

Westerns are mostly set outside, with 64.4% of their scenes taking place in exterior locations. At the opposite end of the scale, we see 65.2% of Comedy scenes taking place indoors.

Something that will make producers wince is that the average location only appears in 1.5 scenes.

58.3% of scenes take place during the day and 41.7% take place at night. Perhaps unsurprisingly, Horror scripts are much more likely to be set at night (56.5% of scenes) whereas Historical scripts are the most nyctophobic, with only 28.9% taking place at night.

7. Age of primary characters

Where a specific age was given, the average age of the top five characters across all the scripts is 31.8 years old.

The character who speaks most often is typically a little younger (average age: 28.3) and as we move down to characters who speak less frequently the age increases slightly. The average age of the fifth most frequently-speaking character is 35.4.

The median age is 30 years old, with 15.4% of all characters being listed as exactly 30.

Notes

Today’s research is riding on the coat-tails of my ‘Judging Screenplays By Their Coverage’ report and so comes with the same notes, definitions and caveats.

I would suggest either reading last week’s article or the full 67-page report for details. This is particularly relevant to explain our methodologies on complicated topics such as gender.

Lovely, so helpful for big-picture thinking! Re #4: I think the title of the chart is meant to say “per script”, rather than “per scene”. But I would be quite interested to know how many characters per scene there are, broken down by genre.

Really nice, Stephen! Loved the casual, out-of-the-box way of thinking about story development that a quantitative study like yours reveals. Wonderful to check and make associations.

Would you do a study of settings by genre? By settings I mean locales: they could be foreign, close to the main action or far from it. In geography, people now distinguish ‘origin locales’ from ‘attraction locales’ (tourist attractions, like the Madame Tussauds wax museum) as destination locales. Incidentally, tourist spots in movies are a huge add-on that helps finance films in various countries (such as Italy), because each tourist place depicted in a movie correlates directly with an increase in tourism.

There are also ‘intermediate’ or ‘transition’ locales. In game design there is ‘level design’: route design for the user/player to move across the game geography from starting point to end point. In the detective genre, for instance, screenwriting schools tend to teach that the detective investigates a clue that leads from locale A to locale B.

Hats off to you. Simply amazing! And of course, some academic myths fall: the use of V.O.; the overall importance of elements such as format, hook, originality, structure, theme, pacing, and even conflict; etc. Having a comparison between non-produced and produced scripts would be great too, and I bet it would break even more myths.

So when you refer to “basic fairness and equal opportunities” (above), given that ScreenCraft is one of the biggest entry-level conduits, do you think that male entrants outnumbering female entrants by over 3 to 1 could be having a negative impact on gender representation in the film industry? Or do you still feel this is down to ‘unconscious bias’ (from your report “Gender Inequality and Screenwriters”)?

I wonder if this data has affected your thought process on this issue?

I’d say that there are a few things to unpack here. Firstly, yes, it’s a valid data point which shows the proportion of people of each (self-reported) gender who submit to ScreenCraft script competitions and script reports. It would be nice to have comparable data for other script competitions and report suppliers before we extrapolate this data across the whole industry. I’ve not seen anything to suggest this is a skewed dataset, but I’m often surprised by data so try not to presume.

Secondly, this new research tracks actions taking place at an early stage of the industry journey, as opposed to, say, writing for big-budget Hollywood movies, and so we can conclude that the hand of the industry (biased or not) is much weaker here. Throughout the gender report research, we saw that as the industry got more involved (i.e. bigger budgets, more prestigious shows/films, etc.), female representation dropped. Therefore, the script data adds further evidence to the belief that men and women do not have the same preferences in all situations. (N.B. I don’t think this is a controversial idea, though it’s not one I have subscribed to before.)

It also adds evidence to the idea that if we instantly and magically removed all bias, the industry wouldn’t end up at exactly 50:50. Quotas, targets and goals of increasing gender representation are not primarily about getting to some magic, Platonic ideal number. They are about combating decades of entrenched beliefs and normalising something which was once rare or unlikely. As we showed in the gender report, there is a vicious cycle whereby if one class of people is rare, they are seen as a risky choice and, in a risk-averse industry, are much less likely to get hired. To break this still-perpetuating cycle, we need to increase representation in the short term and change perceptions of certain classes of people (such as women).

It’s worth noting that this script data cannot prove that bias is wholly absent. These people will all have been influenced by the perceptions of the industry, and the vast majority will have had guidance and support, either formal or informal. It’s just that we would expect any such bias to be weaker here than further into the heart of the industry.

Finally, another reason that it’s important that we have fair representation among key creatives is that a relatively small number of people have a huge degree of control over our culture. Movies, and the characters shown in them, have a massive influence on how we all see the world. As this new report shows, female writers are much more likely to write about the lives of female characters than male writers are. Therefore, it matters if one group of people has a disproportionate effect on the stories we get to see and hear.

This extends way beyond just gender. We were not able to measure other aspects of the writers, such as class, race, socio-economic status, etc. We couldn’t fully measure age, but we did get an indication: ScreenCraft told us that the average age of writers was 32, and our research found that the average age of lead characters was 28. So for both of the factors we have – gender and age – we can see that writers write what they know. This is not in itself a problem (and is arguably a good route to factually and emotionally true stories) but it does underline the need for diverse storytellers.

Thank you for the question. The main thing I am seeking to create on this site is fact-driven debate. There are no ideas or beliefs that are beyond challenge, and new data should be used to update our understanding.

S

P.S. I am talking in simple terms about two genders here, but only because the data we had on writers self-reported as either male or female, and the character data could only detect male/female skewed names.

My point is that in your gender equality report, for the phase 1 entry-into-the-industry stage you used applicants to, and students of, film and screenwriting courses. For the latter measure, female applicants were 43% and successful applicants 39%. This is what you took to be representative of the whole industry-entry stage.

If you are measuring, say, female representation among doctors (the profession, not the show, ha), then this approach – studying the gender balance of, for example, medical students – would indeed make sense, because it is mandatory to study to become a doctor. But screenwriting is different: as your own report noted, only a minority enter screenwriting through formal education.

So my challenge to you was this: is it not more accurate for phase 1 entry level to be measured by the gender balance of people actually sending their scripts out to others, of which competitions form a significant part?

The basic thrust of your gender inequality report was that the percentage of female screenwriters starts off close to 50% and depletes the higher up the industry you go. Your conclusion was that ‘unconscious bias’ is a significant factor.

However, your theory only holds water if you believe that the 43% represents the whole of stage 1 entry level. If you were to use, say, the 23.7% from your report (which is broadly in line with the available data from the Black List, BBC Writersroom and the Nicholl Fellowship) to represent stage 1, then it looks like career phases 2 and 3 are no longer broadly out of kilter with phase 1.

Also, you say that “It’s worth noting that this script data cannot prove that bias is wholly absent.” But at least in the ScreenCraft data it seems to be absent, considering female screenwriters average slightly higher scores (though this does not necessarily have an impact on the gender of winners – do you have data on that?).

I totally follow your argument and I’m not saying that you’re wrong. Your suggestion is a fair claim and one which does follow from the numbers.

What I would say is that there is no one place we can go to get an objective number which once and for all tells us what the true intent of new entrants is. Every number is a proxy, and open to interpretation. On the one hand, the film school numbers are great because they cover a long time series and span many schools and courses. On the other, you’re totally right that film school is not a requirement to enter the industry, and isn’t even the most common route in. So, far from perfect.

I would contend that these ScreenCraft numbers are not obviously a better proxy. They come from one source (whose bias, or lack of it, we cannot know) and are no more a required or common element of career progression than film schools. Different, but not automatically better or worse.

To be clear, I do agree with you that this is an indicator, and one which runs counter to the theory suggested by the film school data. The gender report brought together a large number of data points, and the argument did not hang solely on the film school data.

In answer to your point about the dataset having no bias, I’m not sure we can be so certain. There may be other differences between how men and women respond to the nature of script competitions which need to be factored in. This could affect both the propensity to apply in the first place and the willingness to keep going after repeated rejections. This isn’t an argument I’m making, just an example of how bias can take many forms. This particular example would be more a matter of wider gender perceptions than something the film industry is doing. (Although it may, of course, be something the industry wants to take into account if it wants more diverse new talent reaching the big leagues.)