Limits of the Bing, AFINN, and NRC Lexicons with the Tidytext Package in R

I am currently working on a conference paper, “Digital Analysis of Dickensian Characters and the Language of Emotion,” which I am scheduled to present at the 2018 Dickens Society Symposium in Tübingen, Germany, this summer. The paper applies computational sentiment analysis to Dickens’ novels. I chose R for this work because it offers the Tidytext and Syuzhet[i] packages, which support sentiment analysis based on the “AFINN,” “Bing,” and “NRC” lexicons.[ii]

My research focuses mostly on which words Dickens used rather than on how those words relate to one another within sentences. This makes it less risky to analyze literary texts with a big data tool, because I can sidestep many of the drawbacks of text mining packages. Nonetheless, while working on the sentiment analysis of Dickens’ works, I found limitations in the Tidytext package similar to the ones Joanna Swafford discovered in the Syuzhet package when analyzing literary texts at the sentence level. The overlap is not surprising, since both packages use the “Bing,” “AFINN,” and “NRC” lexicons.

Classification of gender and appellation.

The Bing and AFINN lexicons treat “miss” as a negative word, since it can be used as a verb. The NRC lexicon, however, does not classify “miss” as negative. If the “Miss” in “Miss Amy” were scored as a negative word in Little Dorrit, the sentiment analysis would be inaccurate. If the analysis were based solely on word counts, the distinction between the title and the verb would not matter, but for sentiment scoring it does: because of the appellation “Miss,” “miss” is one of the most frequent words in many novels. This is a problem in the tidytext package rather than in the lexicons. The tidytext package needs a function for recognizing gendered titles such as “Miss.”
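Until such a function exists, one workaround is to remove the title before tokenizing. The following is only a minimal sketch under my own assumptions: the two sentences are invented stand-ins rather than quotations, `strip_title` is a hypothetical helper of mine (not part of tidytext), and the rule it applies (drop “Miss” only when a capitalized word follows) is a rough heuristic. Only the Bing lexicon is used because it ships with tidytext.

```r
library(dplyr)
library(tidytext)

# Invented sentences standing in for lines of a novel (not quotations).
text_df <- tibble(
  line = 1:2,
  text = c("Miss Amy was happy to see her father.",
           "I shall miss this gloomy prison.")
)

# Hypothetical helper: remove "Miss" (with or without a period) only when
# it is followed by a capitalized word, so the lowercase verb is untouched.
strip_title <- function(x) {
  gsub("\\bMiss\\.?\\s+(?=[A-Z])", "", x, perl = TRUE)
}

scored <- text_df %>%
  mutate(text = strip_title(text)) %>%
  unnest_tokens(word, text) %>%                     # one lowercased word per row
  inner_join(get_sentiments("bing"), by = "word")   # keep only lexicon words

scored
```

With this step, the “Miss” in “Miss Amy” never reaches the lexicon, while the verb in the second sentence is still scored as negative. The heuristic would wrongly drop a sentence-initial verb “Miss” followed by a proper noun, so the results should be spot-checked against the text.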

The classification of the NRC lexicon.

Words in the NRC lexicon often appear in more than one category. As you can see in the chart above, “abandon” is listed in three categories and “abandoned” in four. This overlap often causes difficulties during sentiment analysis. In the sentiment analysis chart for Dickens’ Little Dorrit based on the NRC lexicon, “mother” ranks first in the “joy,” “negative,” and “sadness” categories, whereas the Bing and AFINN lexicons do not classify “mother” as an emotional word at all. The word “mother” should not be considered an emotional word, but the NRC lexicon treats it as one.
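To see how much overlap a lexicon contains, the categories can be counted per word. The sketch below uses a small hand-made table, `nrc_toy`, which I built only to mirror the counts mentioned above for “abandon” and “abandoned”; it is not the lexicon itself. With the real lexicon, `get_sentiments("nrc")` could take its place, which I understand requires the textdata package and a one-time download.

```r
library(dplyr)

# Hand-made stand-in for a slice of an NRC-style lexicon (an assumption for
# illustration): one row per word-category pair.
nrc_toy <- tibble(
  word = c(rep("abandon", 3), rep("abandoned", 4)),
  sentiment = c("fear", "negative", "sadness",
                "anger", "fear", "negative", "sadness")
)

# How many categories does each word belong to?
categories_per_word <- nrc_toy %>%
  count(word, name = "n_categories")

# One way to avoid counting a word several times: keep only the
# positive/negative polarity rows and ignore the emotion categories.
polarity_only <- nrc_toy %>%
  filter(sentiment %in% c("positive", "negative"))

categories_per_word
polarity_only
```

Restricting the lexicon to its polarity rows makes it easier to compare with Bing, at the cost of discarding the emotion categories that make NRC distinctive.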

The classification of human names in the NRC lexicon.

Human names should not be classified as emotional words. Surprisingly, I found that “John” is classified as a negative word in the NRC lexicon; the Bing and AFINN lexicons do not include it.
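A simple guard against this is to remove character names from the lexicon before scoring. This is a sketch under stated assumptions: `lexicon_toy` is a two-row stand-in I made up for an NRC-style lexicon, the name list is a hypothetical one that a researcher would compile for the novel at hand, and the sentence is invented.

```r
library(dplyr)
library(tidytext)

# Made-up two-row stand-in for an NRC-style lexicon.
lexicon_toy <- tibble(
  word = c("john", "abandoned"),
  sentiment = c("negative", "negative")
)

# Hypothetical list of character names, lowercased to match the tokens.
character_names <- tibble(word = c("john", "amy", "arthur"))

# Drop every lexicon entry that is also a character name.
lexicon_clean <- lexicon_toy %>%
  anti_join(character_names, by = "word")

# Invented sentence, tokenized into one lowercased word per row.
tokens <- tibble(text = "John Chivery felt abandoned.") %>%
  unnest_tokens(word, text)

scored <- tokens %>%
  inner_join(lexicon_clean, by = "word")

scored
```

Here only “abandoned” is scored, and “john” no longer counts as a negative word. The trade-off is that a name list has to be maintained for each novel.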

The three lexicons were created by different teams of developers, each aiming at general-purpose sentiment analysis. Because the lexicons are not tailored to literary analysis, they raise issues when applied to literary texts. This is especially true of the NRC lexicon, which seems to have been developed separately from the Bing and AFINN lexicons. For digital humanists who use all three lexicons in the same project, maintaining a uniform research methodology could become problematic.

Since the Tidytext package shares the “Bing,” “AFINN,” and “NRC” lexicons with the Syuzhet package, both packages have the same limitations for sentiment analysis. So far, there are very few alternatives to choose from. To mitigate these issues, we need to tailor our code to the purpose of our research. To advance the field of sentiment analysis, these packages should also be updated routinely to incorporate feedback from experts like Swafford.

Saif Mohammad and Peter Turney. “Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon.” In Proceedings of the NAACL-HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, June 2010, Los Angeles, California. See: http://saifmohammad.com/WebPages/lexicons.html