Stat Trek II - The Wrath of Cant

Recently the State of the Union speeches were (once again) scrutinized for their low "reading levels". In an awesome-looking figure, The Guardian succinctly demonstrated that the grade levels of these annual addresses have steadily declined over the past century. In simpler terms: the speeches appear to be getting "dumber". This may be due in part to changing American dialects, as well as an evolving style of public speaking. Perhaps "reading level" tests don't always map well to the spoken word, and they certainly don't capture the rhythmic or rhetorical value of oration.

Last year I looked at the "readability score" for the Star Trek movie franchise. This entailed finding the closed-caption transcripts for all 11 films and sending them through an online Flesch-Kincaid Readability calculator. Such calculators use simple formulas based on the number of syllables per word and the number of words per sentence to assign an approximate US grade level to the text. Basically, long sentences with big words equate to higher grade levels.

For part 2 in this series, I have examined the reading level for every episode of the Star Trek television franchise! Check it out...

I scripted the grade level calculations using the Natural Language Toolkit (nltk) in Python. This made cleaning the text and breaking it up into words and sentences trivial, and I was able to run it on all the closed-caption transcripts (over 600 of them!) in a matter of minutes. The grade levels are systematically higher than those found for the Star Trek films, but this may well be because the two programs count things like syllables differently.
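The script itself isn't shown here, so the following is only a rough sketch of what the clean-and-split step could look like. It assumes SubRip-style (.srt) caption files, which is a guess about the format, and it uses standard-library regexes so it runs without any downloads. The function names are made up for illustration; nltk's `sent_tokenize` and `word_tokenize` would be the more robust choice for the splitting.

```python
import re

# Assumption: captions look like SubRip (.srt) files with counters and timing lines.
TIMESTAMP = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)
CUE = re.compile(r"\[[^\]]*\]|\([^)]*\)")  # sound cues such as [ALARM BLARING] or (sighs)
TAG = re.compile(r"<[^>]+>")               # styling tags such as <i>...</i>


def clean_captions(raw):
    """Return only the spoken text of a caption file, joined into one string."""
    kept = []
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines, numeric cue counters, and timing lines.
        if not line or line.isdigit() or TIMESTAMP.search(line):
            continue
        line = CUE.sub("", TAG.sub("", line)).strip()
        if line:
            kept.append(line)
    return " ".join(kept)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def split_words(sentence):
    """Words are runs of letters, optionally with internal apostrophes."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", sentence)
```

A naive splitter like this will stumble on abbreviations ("Lt. Cmdr. Data"), which is one reason a trained tokenizer is the better choice for real transcripts.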

The data themselves are cool to look at, and three trends in readability stood out to me:
1) TNG rises over its run
2) DS9 and VOY decrease over their runs
3) DS9 had the lowest scores of any series
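
One simple way to put a number on "rises" and "decreases" is the least-squares slope of grade level against episode order. This is just a sketch of that idea, not necessarily how the trends above were identified, and `trend_slope` is a made-up name.

```python
def trend_slope(scores):
    """Least-squares slope of score versus episode index (0, 1, 2, ...).

    A positive slope means grade level tends to rise over the run,
    a negative slope means it tends to fall.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two episodes")
    mean_x = (n - 1) / 2          # mean of the indices 0..n-1
    mean_y = sum(scores) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(scores))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


# Made-up toy numbers for illustration only, not values from the plots.
rising = [4.0, 4.5, 5.0, 5.5]
falling = [6.0, 5.0, 4.0]
print(trend_slope(rising) > 0, trend_slope(falling) < 0)
```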

To understand why these trends occur, it's important to consider exactly how this score is computed, as well as the impact of scoring television with such indexes. The Flesch-Kincaid grade level index is defined as (from Wikipedia):

    grade level = 0.39 * (total words / total sentences) + 11.8 * (total syllables / total words) - 15.59

There are only two components to the score: the average length of sentences (in words) and the average length of words (in syllables). Since this is television, virtually all of the words are dialog, which typically uses shorter sentences. People also don't frequently use 6-syllable words in casual conversation... except in Star Trek!
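
To make the two components concrete, here is a minimal sketch of the Flesch-Kincaid grade level calculation. The coefficients are the standard published ones. The syllable counter is a crude vowel-group heuristic of my own choosing (real calculators differ, which is exactly why two programs can disagree), and `count_syllables` and `fk_grade` are illustrative names.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade(text):
    """Flesch-Kincaid grade level of a block of text."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("need at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


plain = "Fire the big gun now."
babble = "Recalibrate the multiphasic deflector array immediately."
print(fk_grade(plain) < fk_grade(babble))
```

The two sample lines are nearly the same length in words, so almost all of the gap between them comes from syllables per word. Very short, simple lines can even score below zero.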

You see, Star Trek has long been in the business of making science and math cool and accessible. In science we tend to use big words to describe things, often by combining obscure or foreign-sounding words. That's just part of the scientific jargon. (Fun fact: the very first class I took in college was titled "Bio-scientific Etymology".) Star Trek naturally imitates this, and characters regularly engage in "technobabble", verbal texture meant to be both descriptive and dense. This adds to the whole futuristic je ne sais quoi, helping to remind the viewer that these people are indeed spacefaring badasses.

So my hypothesis is then: in the case of Star Trek, and maybe sci-fi as a whole, the so-called reading grade level score is driven by the amount of technobabble in a script. In this scenario, we could explain the three observed trends by the nature of the stories being told. For example, consider just Deep Space Nine's slow reading level decay:

Around Season 3 (~1995) the "Dominion War" storyline became a (the?) central part of the DS9 plot. As the "War" raged on, the need for tech-heavy scripts with fancy science-talk declined. We instead had loads of (albeit enjoyable) episodes dealing with religion and faith, with war and death, with conflict and humanity. This makes total sense for the extremely character-driven, Old West-style story being told about a frontier space station/town.

By contrast, The Next Generation and Voyager were both traveling, ship-based shows. The action came from the crew's ability to outwit and out-think new obstacles every week. This in turn often required complex schemes for re-wiring/hacking their ship and pushing the laws of physics to their apparent limits.

For the curious: the highest grade level episode was "Inter Arma Enim Silent Leges", a timely Latin phrase from Cicero, usually rendered in English as "In times of war, the law falls silent". The lowest grade level episode was "It's Only a Paper Moon", probably because of all the singing...

Lastly, given the scrutiny that POTUS has received for his "low-grade speech(es)", I wondered how fans actually felt about the wide range of reading levels in Star Trek. I matched fan ratings (1=worst, 10=best) of DS9 episodes to the readability scores shown above:

The takeaway: people don't seem to care much. In other words, higher readability scores don't necessarily translate to more enjoyment. This finally brings me to the title for this article: The Wrath of Cant. Jargon will elevate your reading score, but you still have to tell a good story!
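
For anyone who wants to put a number on "don't seem to care much", one option is the Pearson correlation between per-episode ratings and grade levels. This is a sketch of that check, not a result from the data above, and `pearson_r` is an illustrative name. A value near 0 would back up the eyeball impression, while values near +1 or -1 would indicate a strong linear relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Given a list of fan ratings and a matching list of grade levels in the same episode order, `pearson_r(ratings, grade_levels)` returns a single value between -1 and 1.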