The riddle of literary quality

Brief summary of the project
The Riddle of Literary Quality investigates the differences between novels written in or translated into Dutch that readers consider to be high literature and novels that have not been given this stamp of quality. We analyze the textual attributes that can be discerned in low-level and high-level patterns. Our working hypothesis is that correlations exist between formal qualities and readers’ opinions. These correlations cannot be discerned with the naked eye, but might become visible using computational methods. It could be the case that books described as ‘literary’ by readers have a more complex sentence structure on average, or that novels considered to be ‘not very literary’ contain a smaller number of different words. We aim to reveal the connection between literary judgments and patterns of textual commonalities or deviations.

For the readers’ opinions we draw on the results of a large online survey we conducted in 2013: Het Nationale Lezersonderzoek (The National Reader Survey). Almost 14,000 respondents went through our list of 401 novels and indicated which of these they had read, and for a subset of these gave their opinion as to how literary they found the book and how they rated its general quality. They also explained one of their scores, and provided general information about themselves and the type of reader they are. The results of this survey are used as a reference for our textual analysis of the same 401 novels. This unique combination means that we can offer some very concrete and interesting results. We can solve a significant part of the riddle.
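The two example hypotheses mentioned above (sentence complexity and the number of different words) can be approximated with very simple measures. The following is a minimal sketch, not the project's actual method: it assumes a crude regex tokenizer and a naive sentence splitter, uses mean sentence length as a rough stand-in for sentence complexity, and uses the type-token ratio as a rough stand-in for vocabulary size. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Return lowercased word tokens (assumption: a crude regex tokenizer)."""
    return re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())


def split_sentences(text):
    """Naively split on ., ! or ? followed by whitespace (assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def type_token_ratio(text):
    """Number of distinct words divided by the total number of words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(text):
    """Average number of word tokens per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


sample = "The cat sat. The cat ran away quickly!"
ttr = type_token_ratio(sample)          # 6 distinct words / 8 words
avg_len = mean_sentence_length(sample)  # (3 + 5) words / 2 sentences
```

Note that the type-token ratio drops as a text gets longer, so comparing whole novels of different lengths would require computing it over samples of equal size.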

One of the surprises we encountered is the role that author gender plays. The first thing that stands out in the mean literary ratings is that the top ten contains only male authors and the bottom contains only female authors. In general, female authors are seen as less literary. By examining the comments on highly appreciated novels, we found that respondents describe novels by female authors differently from novels by male authors, even if we only examine comments by female respondents. Female authors are judged on content, whereas male authors are judged on more abstract concepts such as style and structure. This is a good starting point for research on the texts of these novels: do they actually differ?

Besides gender, genre plays a big role in the appreciation. In fact, gender and genre seem to be intertwined. The literary thriller, for instance, is referred to as a ‘female’ genre. Most of these thrillers are written by female authors and they get low to mediocre ratings, but the ones written by men are rated as more literary. Although stylistic analysis does not differentiate the literary thrillers from the thrillers, there is actually a significant difference in themes, the amount of dialogue and the way characters communicate non-verbally. As suspected, in the thrillers a lot of themes with regard to suspense play an important role, whereas these “suspense themes” are not prominent in the literary thrillers (in both the originally Dutch ones and the translations). Instead, the interaction between characters, as well as their self-development, is prominent in the literary thrillers and not so much in the thrillers. With regard to this particular textual characteristic, the literary thrillers do in fact lean towards the literary novels.

The biggest success so far has been an experiment with a predictive model of literary ratings, using a combination of lexical and syntactic features. The former consist of word counts, the latter of recurring phrases and syntactic fragments extracted from the corpus. Models trained on these textual features and on metadata such as genre and author gender explain 72% of the variance in the literary ratings of the corpus; this means that less than 30% of the variance in the literary judgments is random or can only be explained by other factors. For a given novel, its mean rating (on a scale of 1-7) can be predicted with an expected error of 0.5 using a model trained on other novels; for example, if the predicted rating is 6.5, the actual rating tends to fall between 6 and 7. The results indicate that genre is a strong predictor. However, textual features still make a significant additional contribution, and the model predicts differences within genres as well. Within the textual features we can see that word counts alone are strong predictors, but combining these with rich syntactic features can still improve the results. In sum, we found that literary ratings can be explained to a large extent from the text itself: literary judgments are not arbitrary, and there is an intrinsic literariness to texts that are rated as literary.
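To make the two figures above concrete, the sketch below shows how an explained-variance score and a typical prediction error can be computed from actual and predicted ratings. It is an illustration under assumptions, not the project's evaluation code: it takes "variance explained" to be the coefficient of determination (R squared) and the "expected error" to be the root mean squared error, and the function names and toy ratings are invented for the example.

```python
import math


def r_squared(actual, predicted):
    """Share of the variance in `actual` accounted for by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the ratings."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values on the 1-7 scale, invented for illustration only;
# they are not taken from the survey or the model.
actual_ratings = [2.0, 4.0, 6.0]
predicted_ratings = [2.5, 4.0, 5.5]

explained = r_squared(actual_ratings, predicted_ratings)
typical_error = rmse(actual_ratings, predicted_ratings)
```

For the scores to reflect prediction on unseen novels, as described above, each predicted rating would have to come from a model that was trained without that novel, for instance through cross-validation.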

Future directions
We are preparing several follow-up projects. For a start, we want to look at plot structure, which is extremely difficult to establish automatically. We would like to find out, for example, whether plot development in novels that are considered to be highly literary differs from that in genre fiction, and if so, in which ways. Another plan in development is to explore the influence editors have on the end product that publishers put on the market. For that, we would need to compare manuscripts as they were submitted to a publisher with the published versions. There may be a difference, for instance, in how editors edit a text depending on the genre they choose for it, that is, how they ‘label’ a manuscript. Another idea we would like to test is how writers deal with formal literary conventions during the actual writing process. This can be researched using keylogging software; a pilot experiment has been done with four young Dutch authors (see http://literatuurmuseum.nl/verhalen/vier-schrijvers). A grant proposal for follow-up research has just been submitted.