design pattern-based features that indicate the presence of discriminative patterns as extracted from a large sarcasm-labeled corpus. To allow the classifiers to spot generalized patterns, these pattern-based features take real values corresponding to three situations: exact match, partial overlap and no match.
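The three-way scoring above can be sketched as a single real-valued feature per pattern. The scheme below (best token overlap over a sliding window) is an illustrative assumption, not the authors' exact formulation:

```python
def pattern_feature(pattern, tokens):
    """Real-valued feature for one extracted pattern: 1.0 on an exact
    match, a fractional score on partial overlap, 0.0 on no match.
    (Illustrative scoring; the original weighting may differ.)"""
    pat = pattern.split()
    best = 0.0
    # slide a window of the pattern's length over the sentence tokens
    for i in range(len(tokens) - len(pat) + 1):
        window = tokens[i:i + len(pat)]
        overlap = sum(p == t for p, t in zip(pat, window))
        best = max(best, overlap / len(pat))
    return best
```

A classifier can then consume one such score per mined pattern, so a near-miss of a discriminative pattern still contributes a non-zero signal.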

González-Ibáñez et al. [2011] use sentiment lexicon-based features. In addition, pragmatic features like emoticons and user mentions are also used.

Riloff et al. [2013] use a set of patterns, specifically positive verbs and negative situation phrases, as features for a classifier (in addition to a rule-based classifier).

Liebrecht et al. [2013] introduce bigrams and trigrams as features.

Reyes et al. [2013] explore skip-gram and character n-gram-based features.

Maynard and Greenwood [2014] include seven sets of features. Some of these are maximum/minimum/gap of intensity of adjectives and adverbs, max/min/average number of synonyms and synsets for words in the target text, etc. Apart from a subset of these, Barbieri et al. [2014a] use frequency and rarity of words as indicators.

Buschmeier et al. [2014] incorporate ellipsis, hyperbole and imbalance in their set of features.

Joshi et al. [2015] use features corresponding to the linguistic theory of incongruity. The features are classified into two sets: implicit and explicit incongruity-based features.

Ptáček et al. [2014] use word-shape and pointedness features given in the form of 24 classes.

Rajadesingan et al. [2015] use extensions of words, number of flips, and readability features, in addition to others.

Hernández-Farías et al. [2015] present features that measure semantic relatedness between words using WordNet-based similarity.

Liu et al. [2014] introduce POS sequences and semantic imbalance as features. Since they also experiment with Chinese datasets, they use language-typical features like the use of homophony, use of honorifics, etc.

Mishra and Bhattacharyya [2016] conduct additional experiments with human annotators whose eye movements are recorded. Based on these eye movements, they design a set of gaze-based features such as average fixation duration, regression count, skip count, etc. In addition, they also use complex gaze-based features based on saliency graphs, which connect words in a sentence with edges representing saccades between the words.
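The simpler gaze features named above can be computed directly from a fixation sequence. The definitions below are illustrative (a fixation is assumed to be a `(word_index, duration_ms)` pair; the paper's exact formulations may differ):

```python
def gaze_features(fixations, num_words):
    """Basic gaze features from a list of (word_index, duration_ms)
    fixations over a sentence of num_words words. Illustrative only."""
    # average fixation duration
    avg_fix = sum(d for _, d in fixations) / len(fixations)
    # regression count: saccades that move back to an earlier word
    regressions = sum(1 for (a, _), (b, _) in zip(fixations, fixations[1:])
                      if b < a)
    # skip count: words that were never fixated
    fixated = {w for w, _ in fixations}
    skips = num_words - len(fixated)
    return {"avg_fixation": avg_fix,
            "regression_count": regressions,
            "skip_count": skips}
```

The saliency-graph features mentioned above would additionally treat consecutive fixations as weighted edges between words; that is omitted here.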

LEARNING ALGORITHMS

González-Ibáñez et al. [2011] use SVM with SMO and logistic regression. A chi-squared test is used to identify discriminating features.

Reyes and Rosso [2012] use Naive Bayes and SVM. They also show the Jaccard similarity between labels and the features.

Riloff et al. [2013] compare rule-based techniques with an SVM-based classifier.

Liebrecht et al. [2013] use the balanced winnow algorithm in order to determine high-ranking features.

Reyes et al. [2013] use Naive Bayes and decision trees for multiple pairs of labels among irony, humor, politics and education.

Bamman and Smith [2015] use binary logistic regression.

Wang et al. [2015] use SVM-HMM in order to incorporate the sequential nature of output labels in a conversation.

Liu et al. [2014] compare several classification approaches including bagging, boosting, etc., and show results on five datasets.

Contrary to Liu et al. [2014], Joshi et al. [2016a] experimentally validate that for conversational data, sequence labeling algorithms perform better than classification algorithms. They use SVM-HMM and SEARN as the sequence labeling algorithms.

DEEP LEARNING

Joshi et al. [2016b] use similarity between word embeddings as features for sarcasm detection. They augment their features with scores based on the similarity of word embeddings of the most congruent and most incongruent word pairs, and report an improvement in performance. The augmentation is key, because they observe that using these features alone does not suffice.
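The most-congruent/most-incongruent pair scores can be sketched as the maximum and minimum pairwise cosine similarity over the words of a sentence. The toy embeddings below are assumed for illustration; a real system would use pretrained vectors:

```python
import math

def cosine(u, v):
    # cosine similarity between two dense vectors
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def incongruity_features(words, emb):
    """Max/min pairwise embedding similarity over word pairs: the
    minimum captures the most 'incongruent' pair in the sentence.
    (Sketch under assumed toy embeddings.)"""
    sims = [cosine(emb[a], emb[b])
            for i, a in enumerate(words)
            for b in words[i + 1:]
            if a in emb and b in emb]
    return {"max_sim": max(sims), "min_sim": min(sims)}
```

A low minimum similarity between two words of the same sentence is what signals a potentially sarcastic incongruity.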

Amir et al. [2016] present a novel convolutional network-based architecture that learns user embeddings in addition to utterance-based embeddings. The authors state that this allows them to learn user-specific context. They report an improvement of 2% in performance.

Ghosh and Veale [2016] use a combination of a convolutional neural network and an LSTM, followed by a DNN. They compare their approach against a recursive SVM, and show an improvement in the case of the deep learning architecture.

RULE-BASED

Veale and Hao [2010] focus on identifying whether a given simile (of the form ‘ as a ’) is intended to be sarcastic. They use Google search in order to determine how likely a simile is. They present a 9-step approach where, at each step/rule, a simile is validated using the number of search results. A strength of this approach is that they present an error analysis corresponding to multiple rules.

Maynard and Greenwood [2014] propose that hashtag sentiment is a key indicator of sarcasm. Hashtags are often used by tweet authors to highlight sarcasm; hence, if the sentiment expressed by a hashtag does not agree with the rest of the tweet, the tweet is predicted as sarcastic. They use a hashtag tokenizer to split hashtags made of concatenated words.
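The disagreement rule above can be sketched with a simple lexicon-based tone check. The lexicons below are toy assumptions, and hashtag splitting is assumed to have been done already by the tokenizer:

```python
POSITIVE = {"love", "great", "happy", "yay"}   # toy lexicon (assumed)
NEGATIVE = {"hate", "delayed", "terrible"}     # toy lexicon (assumed)

def tone(tokens):
    # crude sentiment: +1 positive, -1 negative, 0 neutral/mixed
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos > neg) - (neg > pos)

def hashtag_incongruity(body_tokens, hashtag_tokens):
    """Predict sarcasm when the hashtag's sentiment disagrees with the
    sentiment of the rest of the tweet (sketch of the rule above)."""
    # product is -1 exactly when the two tones are opposite and non-neutral
    return tone(body_tokens) * tone(hashtag_tokens) == -1
```

For example, a negative tweet body paired with a positive hashtag such as #sohappy would fire the rule, while an agreeing pair would not.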

Bharti et al. [2015] present two rule-based classifiers. The first uses a parse-based lexicon generation algorithm that creates parse trees of sentences and identifies situation phrases that bear sentiment. If a negative phrase occurs in a positive sentence, the sentence is predicted as sarcastic. The second algorithm aims to capture hyperbole by checking whether an interjection and an intensifier occur together.

Riloff et al. [2013] present rule-based classifiers that look for a positive verb and a negative situation phrase in a sentence. The set of negative situation phrases is extracted using a well-structured, iterative algorithm that begins with a bootstrapped set of positive verbs and iteratively expands both sets (positive verbs and negative situation phrases). They experiment with different configurations of rules, such as restricting the order of the verb and the situation phrase.
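One harvesting pass of such a bootstrapping loop can be sketched as follows. This is a deliberate simplification: phrases immediately following a known positive verb in sarcasm-labeled sentences are collected as candidate negative situation phrases, while the scoring and ordering constraints of the full algorithm are omitted:

```python
def harvest_negative_situations(corpus, positive_verbs, phrase_len=2):
    """One simplified bootstrapping pass over sarcasm-labeled sentences
    (lists of tokens): harvest the phrase that immediately follows a
    known positive verb as a candidate negative situation phrase."""
    situations = set()
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            if tok in positive_verbs and i + 1 < len(tokens):
                situations.add(" ".join(tokens[i + 1:i + 1 + phrase_len]))
    return situations
```

A symmetric pass would then use the harvested situation phrases to propose new positive verbs, alternating until the sets stabilize.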

FUTURE DIRECTIONS

(1) Implicit sentiment detection & sarcasm: Based on past work, it is well-established that sarcasm is closely linked to sentiment incongruity [Joshi et al. 2015]. Several related works exist for the detection of implicit sentiment in sentences, as in the case of ‘The phone gets heated quickly’ v/s ‘The induction cooktop gets heated quickly’. This will help sarcasm detection, following the line of semi-supervised pattern discovery.

(4) Culture-specific aspects of sarcasm detection: As shown by Liu et al. [2014], sarcasm is closely related to language/culture-specific traits. Future approaches to sarcasm detection in new languages will benefit from understanding such traits, and from incorporating them into their classification frameworks.

Joshi et al. [2016] show that American and Indian annotators may have substantial disagreement in their sarcasm annotations; however, this results in only a non-significant degradation in the performance of sarcasm detection.

(5) Deep learning-based architectures: Very few approaches have explored deep learning-based architectures so far. Future work that uses these architectures may show promise.

(2) Incongruity in numbers: Joshi et al. [2015] point out how numerical values convey sentiment and, hence, are related to sarcasm. Consider the example of ‘Took 6 hours to reach work today. #yay’. This sentence is sarcastic, as opposed to ‘Took 10 minutes to reach work today. #yay’.
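A toy rule for the commute example above could flag a stated duration that exceeds a commonsense expectation. Both the regex and the threshold below are assumptions for illustration, not a method from the cited work:

```python
import re

def numeric_incongruity(text, expected_max_hours=1):
    """Flag a sentence when a stated duration in hours exceeds a
    commonsense expectation (threshold is an assumed parameter)."""
    m = re.search(r"(\d+)\s*hours?", text)
    return bool(m) and int(m.group(1)) > expected_max_hours
```

A real system would need world knowledge per context (commutes, deliveries, queues) rather than a single fixed threshold, which is precisely why this direction remains open.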

(3) Coverage of different forms of sarcasm: As described earlier, there are four species of sarcasm: propositional, lexical, like-prefixed and illocutionary sarcasm. We observe that current approaches are limited in handling the last two forms: like-prefixed and illocutionary sarcasm. Future work may focus on these forms of sarcasm.