Crystalace - Sarcasm Detection and Beyond

Detecting sarcasm is among the toughest natural language understanding problems in AI, and it is receiving increasing research interest in computational linguistics and NLP. While recent studies have recognized the link between sarcasm and sentiment and have proposed various techniques for detecting sarcasm, none has directly and systematically studied the impact of sarcasm detection on sentiment analysis.

Sarcasm is a complex communication phenomenon. It is often expressed in words that are positive in their literal sense but carry a negative emotional connotation.

Sentiment analysis, also known as opinion mining, is a popular field that studies the feelings and opinions expressed in user-generated social media content. Sarcasm detection is closely related to sentiment analysis but is a distinct task. As a classification task, the primary objective of sentiment analysis is to determine whether a message is positive, negative, or neutral. In contrast, the objective of sarcasm detection is to determine whether a message is sarcastic or not.

To illustrate, let us look at two short text examples.

Example 1. Love my new phone! Only that the battery runs out very fast.
Example 2. Love my new phone that runs out battery so fast!

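Both examples open with the same positive phrase, but only Example 2 is sarcastic: Example 1 pairs genuine praise with a literal complaint, whereas Example 2 uses the praise itself to mock the battery life. A system that reads Example 2 literally would treat it as positive.

Failure to recognize sarcasm may lead to miscommunication (see examples of misinterpreted sarcastic tweets). For social media analytics and communication, the risk is amplified by the sheer volume and velocity of potentially sarcastic expressions that are mistaken for positive ones.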

Our Innovation
To capture discriminative and explainable sarcasm features, we designed a feature model based on a review and synthesis of related fields such as natural language processing, linguistics, psychology, speech and communication, and neuroscience.

The figure below presents an overview of the proposed sarcasm detection method, which we call "Crystalace".

To train and evaluate our sarcasm classifier, we downloaded the annotated tweet dataset from Riloff et al. (2013), pre-processed the tweets, and trained a linear SVM classifier using our feature model. Our method achieved an F1-score of .60, an improvement of .09 over the best condition reported in Riloff et al.'s original study. Based on these results, we trained the final Crystalace sarcasm classifier on the full dataset.
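For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below shows how it can be computed for a binary sarcasm classifier. It assumes the score is reported for the sarcastic class; the function name, label strings, and toy labels are illustrative and are not taken from Crystalace or from the Riloff et al. data.

```python
def f1_score(gold, predicted, positive="sarcastic"):
    """Return (precision, recall, F1) for the class named by `positive`."""
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels invented for illustration only.
gold = ["sarcastic", "sarcastic", "sarcastic", "not", "not", "not"]
pred = ["sarcastic", "sarcastic", "not", "sarcastic", "not", "not"]
precision, recall, f1 = f1_score(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 penalizes both missed sarcastic tweets and false alarms, it is a common choice when the sarcastic class is much smaller than the non-sarcastic one.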

To the best of our knowledge, no major sentiment analysis system developed to date has incorporated the capability to recognize sarcasm. In our latest research, we designed a sarcasm-detection-enhanced sentiment analysis system, which we call "CrystalNest", and evaluated its performance. The results on the official SemEval-2017 Task 4A-4D test data provide evidence for the value of embedding sarcasm detection in sentiment analysis systems.

Detecting Sarcastic Five Star Reviews
Many online review sites allow users to give a star rating to a product, a service, or an employer. Sometimes, however, what people write does not match the rating they give, whether intentionally or not. We applied Crystalace to 15 Amazon product reviews and detected eight sarcastic reviews that were actually marked with four or five stars.

World "Cyber Sarcasm" Profile
How could users leverage sarcasm detection in business cases? Let's start by exploring a simple question: which countries are the most sarcastic? We depicted a world sarcasm profile based on a collection of tweets.

A recent finding about sarcasm is that it is strongly associated with creativity. In a study published in Organizational Behavior and Human Decision Processes, researchers tested a novel theoretical model in which both the construction and the interpretation of sarcasm lead to greater creativity because they activate abstract thinking. Following a simulated sarcastic conversation or after recalling a sarcastic exchange, both sarcasm expressers and recipients reported more conflict but also demonstrated enhanced creativity. It would therefore be very interesting to extend this simple world sarcasm profile analysis into a sociolinguistic study.

Emotion Intensity (EI) Lexicon

The Emotion Intensity (EI) Lexicon is a tab-delimited list of 3,204 emotion-related English words, common emoticons, and Internet slang terms, each labelled in two dimensions: strength and intensity. The lexicon was built for general-purpose emotion feature extraction and could therefore be useful for other NLP tasks or for behavior prediction research.
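A tab-delimited lexicon of this kind can be read with Python's standard csv module. The sketch below is only an illustration: the header names (word, strength, intensity), the column order, the value scale, and the sample rows are all assumptions, so check them against the downloaded file before use.

```python
import csv
import io

# Hypothetical sample rows; the real file's columns and scales may differ.
SAMPLE = (
    "word\tstrength\tintensity\n"
    "love\t0.8\t0.6\n"
    "awful\t0.9\t0.7\n"
    ":)\t0.5\t0.3\n"
)


def load_lexicon(handle):
    """Read a tab-delimited lexicon into {entry: (strength, intensity)}."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {
        row["word"]: (float(row["strength"]), float(row["intensity"]))
        for row in reader
    }


def emotion_features(tokens, lexicon):
    """Average strength and intensity over the tokens found in the lexicon."""
    hits = [lexicon[tok] for tok in tokens if tok in lexicon]
    if not hits:
        return 0.0, 0.0
    n = len(hits)
    return sum(s for s, _ in hits) / n, sum(i for _, i in hits) / n


lexicon = load_lexicon(io.StringIO(SAMPLE))
features = emotion_features("love my new phone :)".split(), lexicon)
```

To read the real file, replace the io.StringIO sample with open(path, encoding="utf-8", newline=""). Quoting is disabled so that emoticons containing quote characters are read literally.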

Complete a simple registration to receive a link to download the lexicon.

Tweet Pre-Processing (TweetCrystalizer) Script

Social media content such as tweets contains a virtually unlimited variety of non-standard expressions, such as user-created hashtags (e.g., #shitnooneeversay), misspelt or elongated words (e.g., greaaat, awwww), and unusual expressions or Internet slang (e.g., lolz, SMDH). This makes direct processing difficult. We developed a Tweet Pre-Processing (TweetCrystalizer) script that converts a tweet into normalized text. We have found this module helpful in enhancing the efficacy of subsequent analysis.
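To make the idea of normalization concrete, here is a minimal sketch of the kind of steps such a script might perform. It is not the TweetCrystalizer implementation: the function name, the order of the steps, and the two-entry slang map are assumptions chosen for illustration.

```python
import re

# Tiny illustrative slang map; a practical one would be far larger.
SLANG = {"lolz": "laughing out loud", "smdh": "shaking my damn head"}


def normalize_tweet(text):
    """Return a lowercased, roughly normalized version of a tweet."""
    text = text.lower()
    # Drop the '#' but keep the hashtag body as an ordinary token.
    text = re.sub(r"#(\w+)", r"\1", text)
    # Collapse runs of three or more identical characters down to two,
    # so "greaaat" becomes "greaat" and "awwww" becomes "aww".
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Expand slang tokens that appear in the map; leave the rest unchanged.
    tokens = [SLANG.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


normalized = normalize_tweet("Greaaat #phone lolz")
```

Collapsing to two characters rather than one keeps legitimate double letters (as in "good") intact; mapping the remaining elongated forms back to dictionary words would need a spelling step that this sketch leaves out.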

Complete a simple registration to receive a link to download the Python script.

We are considering releasing Crystalace as an API and/or SDK. If you have such a need, letting us know will help speed this up.