
Idea Density and the Decline of Modern Literature

How has literature changed over the millennia? We like to think that it has evolved over time and that we are currently the most educated and sophisticated readers and writers that have ever existed. But this may not be the case. The dataset here demonstrates that the “idea density” of literary works has fallen since the Middle Ages. Idea density is the ratio of the number of “ideas” to the number of words in a given sentence. A straightforward introduction to the concept is available here. A more comprehensive description is given here.
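As a rough illustration of the ratio, the sketch below counts "ideas" by part of speech. It assumes that verbs, adjectives, adverbs, prepositions and conjunctions each count as one idea, which is a common approximation in automated scoring; the article does not say which rule set or software was actually used. The tag set, the function name idea_density and the example sentence are all hypothetical, and the tags are assigned by hand rather than by a tagger.

```python
# Coarse part-of-speech tags treated as "ideas" in this sketch.
# This set is an assumption, not the rule set of any particular tool.
PROPOSITION_TAGS = {"VERB", "ADJ", "ADV", "ADP", "CCONJ", "SCONJ"}


def idea_density(tagged_words):
    """Return ideas / words for a list of (word, tag) pairs."""
    if not tagged_words:
        raise ValueError("need at least one word")
    ideas = sum(1 for _, tag in tagged_words if tag in PROPOSITION_TAGS)
    return ideas / len(tagged_words)


# Hand-tagged illustration sentence (hypothetical, not from the corpus).
sentence = [
    ("The", "DET"), ("old", "ADJ"), ("dog", "NOUN"), ("slept", "VERB"),
    ("quietly", "ADV"), ("under", "ADP"), ("the", "DET"), ("table", "NOUN"),
]

density = idea_density(sentence)
print(f"{density:.3f}")  # 4 ideas over 8 words
```

Here four of the eight words (old, slept, quietly, under) count as ideas, giving a density of 0.5, which is in the same range as the corpus scores discussed in this article.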

Why should it matter that idea density has fallen over time? The reason is that the concept has been linked to a number of cognitive phenomena. If the average idea density of literary works has fallen then it suggests that certain cognitive elements of literature have fallen and modern readers are reading works that lack important elements of the experience of literature.

The most prominent application of the concept of idea density was in the study of the cognitive history of a group of nuns. It was found that the nuns in the cohort who wrote essays with high idea densities in their youth were significantly less likely to develop dementia in later life. A high idea density in early life was also associated with greater verbal fluency in later life as well as better performance on several other tests of cognitive ability. The idea here is that idea density is closely related to some underlying verbal skill or set of abilities.

To tie the idea density into the literary context, consider the literary output before and after the Axial Age (approximately 800 BCE – 200 CE). It has been established that, prior to the Axial Age, literary works tended to depict individuals as being driven by cause and effect, largely at the instigation of gods who were considered to be physically present in the external world. Individuals were little more than objects who were compelled to act in the same way an object is “compelled” to fall to earth. There was little reflective thought and introspection was largely absent. The Axial Age marked the emergence of “consciousness” in literature such that individuals were depicted as being able to stop and reflect on their mental states and the influence of their mental states on their actions. The point here is that if there was such a significant shift in this kind of cognition, then idea density is an ideal way of detecting it. Using a dataset of prominent prose fiction works derived from the list used in Steven Moore’s The Novel, An Alternative History: Beginnings to 1600 (see note below on construction of the corpus), we can see that idea density did indeed jump significantly at the time of the Axial Age. Taking the start of the Axial Age as 800 BCE, we can see that literary works prior to this time had an average idea density of 0.517 while those post 200 CE had an average idea density of 0.544. This difference is highly significant (p<.001). Thus, the kinds of changes that are seen as pivotal in the evolution of literary culture are very strongly tracked by the idea density measure.

What this tells us is that the thing we generally like about literature – its ability to enable access to another individual’s thought and experience – can be tracked by idea density. This means that it can be used to compare literary works. Let us consider Proust, who has a reputation for highly dense and difficult writing. The explanation for this is that he is examining the contents of the mind and their associations in minute detail. This is reflected in the idea density of his Swann’s Way of 0.561. The average for the corpus is 0.539 and the standard deviation is 0.0256, so Swann’s Way is a little under one standard deviation (about 0.86) above the corpus average. On the other hand, Kipling’s adventure story, Captains Courageous, scores lower at 0.531, about 0.3 standard deviations below the average. It seems that describing “boys’ own” adventures requires a lower level of ideation than musing about a madeleine.
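The comparison with the corpus average can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (the corpus mean, the standard deviation and the two scores). The function name z_score is an illustrative choice.

```python
CORPUS_MEAN = 0.539   # corpus average quoted in the text
CORPUS_SD = 0.0256    # corpus standard deviation quoted in the text


def z_score(score, mean=CORPUS_MEAN, sd=CORPUS_SD):
    """Distance of a score from the corpus mean, in standard deviations."""
    return (score - mean) / sd


scores = {"Swann's Way": 0.561, "Captains Courageous": 0.531}
z = {title: z_score(score) for title, score in scores.items()}

for title, value in z.items():
    print(f"{title}: {value:+.2f} SD")
```

On these figures Proust sits about 0.86 standard deviations above the mean and Kipling about 0.31 below it, so the two works are a little over one standard deviation apart.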

The interesting thing for the modern writer (and reader) is how idea density has changed over time. Figure 1 shows the idea density for each of the works in the corpus by year/century.

The highest scoring text in the corpus is The Elegy of Lady Fiammetta (chapter 1 only) written in 1342 by Boccaccio which scores 0.602. The novel depicts a love affair between a noblewoman and a merchant. It was described by John Addington Symonds as “… the first attempt in any literature to portray subjective emotion exterior to the writer”. Although a love story, it is no mere depiction of sensuality but an exemplification of interiority. It is to 50 Shades of Grey as Harold Pinter is to Harold Robbins.

Interestingly, no modern novel comes close to its score. Swann’s Way and Beckett’s The Unnamable score the highest in the cohort for the last 200 years at 0.561. The highest score for a “modern” (1800-) novel is Children of the World by Heyse, who won the Nobel Prize for literature in 1910. It is interesting to note that Joyce’s Ulysses scores at a “pre-Axial” 0.498. The most obvious reason for this is that its subject matter is extremely sensual and concrete. The depiction of olfactory, visual, tactile and auditory phenomena infuses the work, crowding out the interior. It is also worth noting that Beckett was a protégé of Joyce and yet they went in quite different directions in terms of their dealings with the interior.

But why should any of this matter? Surely the important thing is enjoyment! The fact is that the brain reacts quite differently to “literature” in comparison to popular fiction. A study by Kidd and Castano (Kidd, D.C. & Castano, E. (2013). ‘Reading literary fiction improves Theory of Mind’. Science, 342, 377-380) demonstrated that “literary” fiction has a very different effect on the brain than “popular” fiction. The study involved assigning participants to one of two groups. One group read a short story by Chekhov or an excerpt from a book that had won the National Book Award while the other group read popular fiction by such authors as Danielle Steel. After reading, it was found that those in the literary fiction group scored significantly higher on tests of empathy, social perception and emotional intelligence.

Thus, the difference between literature and popular fiction may not be qualitative. That is, it may not be that popular fiction lacks a quality which literary fiction possesses. Rather, it may be that literature exists on a continuum such that those works which have a greater level of some measure, such as idea density, are more literary.

To put these observations into a modern context, consider Figure 2 which shows the average idea density for literary works by century. Each column represents the average idea density of the literary works produced in that century.
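The per-century averaging behind a chart of this kind can be sketched as follows. The example uses only the five scores quoted in this article, so its averages are not the values plotted in Figure 2, which draws on the full corpus. The publication years other than 1342 are supplied here for illustration and do not come from the article, and the helper name century is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def century(year):
    """Map a signed year (negative = BCE) to (century number, era)."""
    if year == 0:
        raise ValueError("there is no year 0")
    number = (abs(year) - 1) // 100 + 1
    return (number, "BCE" if year < 0 else "CE")


# (title, year, idea density); scores are those quoted in the article,
# years after 1342 are assumptions added for this sketch.
works = [
    ("The Elegy of Lady Fiammetta", 1342, 0.602),
    ("Captains Courageous", 1897, 0.531),
    ("Swann's Way", 1913, 0.561),
    ("Ulysses", 1922, 0.498),
    ("The Unnamable", 1953, 0.561),
]

# Collect the scores for each century, then average each group.
scores_by_century = defaultdict(list)
for _title, year, score in works:
    scores_by_century[century(year)].append(score)

by_century = {key: mean(vals) for key, vals in scores_by_century.items()}

for (number, era), average in by_century.items():
    print(f"century {number} {era}: {average:.3f}")
```

With so few titles the result is only a toy: the three twentieth-century works average 0.540, while the fourteenth and nineteenth centuries are each represented by a single score.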

In the sixth century BCE, the Deuteronomic History appeared, which arrested a declining trend in idea density from approximately the 1400s to the 800s BCE. Thereafter, the average level rose apart from a small fall in the 300s CE. The high point was in the 400s CE which, interestingly, represents the end stages of the Roman Empire. During the ‘dark ages’, as education levels and time devoted to literary pursuits declined, idea density of literary works fell from the high point of the 400s CE. Even in the 14th Century, when Boccaccio was writing, the general level did not rise to the heights of the 400s CE. However, the thing that should concern modern readers is that the quality of the work they are reading has fallen since that time. Certainly the levels are higher than they were in the 19th Century and in the 15th Century. But we tend to like to think that we have come a long way since the Middle Ages and are vastly more sophisticated than the writers of the Roman Empire. Apparently not.

Note on the Corpus:
The basic list of narratives (prose fiction) was drawn from Moore’s list of novels discussed in The Novel: an Alternative History – Beginnings to 1600 (2011). This source provides a list of 235 prominent works of narrative fiction from 2000 BCE to 1600 CE. Not all novels from this source were included in the data set. The main reason for this is that to be included in the final data set, the novel had to be available in digital form. Project Gutenberg (www.gutenberg.org) was invaluable in providing access to a large number of the texts in question. However, a large number of works had to be excluded due to unavailability of digital versions. Furthermore, this sample represents the “Western” instantiation of the Axial Age. The same cognitive changes that occurred during the Axial Age in the West and Middle East occurred in the Far East. The focus of this study has been the West, the Middle East and India so that a line of development can be traced to the modern novel. China has been excluded because there was not a great deal of influence from this country on the development of narrative fiction in the West.
Although Moore’s list of novels continues until 1600, and thereby provides a good sample of narrative texts, there are several issues about the data set that need to be highlighted:
Non Axial Cultures: Several cultures represented in Moore’s list are not relevant to discussions of the Axial Age so these have not been included. Thus, Japanese, Tibetan and Amerindian novels have been excluded.

Where an appropriate date of publication was not available the document was excluded. The Arabian Nights, for example, consists of collected tales originally written/published over the period 800 – 1600. Thus, it is not possible to assign a date.

Almost all novels in early modern English have been excluded because the version of English used is not amenable to the method of computer aided text analysis being used. An exception is the version of Malory’s Morte d’Arthur (1497) used in the dataset, which is from the 1897 “Temple Classics” edition and as such is rendered into modern English. Beware the Cat by William Baldwin (1593) was included because a ‘modern’ version was available. However, several other early modern English novels were excluded due to the text and language in the digital versions not being amenable to analysis using the software.

Moore’s list was supplemented by novels discussed in The Cambridge Companion to European Novelists (Bell 2012). Another group of novels was taken from novels written by novelists who had won the Nobel Prize for Literature. Thus, Moore’s list is extended using highly prominent novels ending in 1955 with Beckett’s Molloy Trilogy. It should be noted that the caveats associated with the selection from Moore’s list also apply to the selection from these other two sources.

An important question is whether the above criteria for exclusion could result in any systematic bias in relation to the variable being examined. The answer would seem to be that the above exclusions do not relate directly to the variables under consideration. The exclusion of documents that were not available in digital form, for example, would not introduce bias because we can assume that those that are chosen to be transformed into machine readable format have been chosen because they are considered important and interesting. This is so over the whole period under consideration, so any diachronic pattern in idea density can be seen to be due to actual changes not associated with the selection of texts.