"...low performance of human coders in the classification task of sarcastic tweets suggests that gold standards built by using labels given by human coders other than tweets’ authors may not be reliable."

Not only is it difficult to build an accurate, representative sample, but the complexity of sarcasm can also lead algorithms to overfit their training data.

Potential solution - Context-aware weighting

Let's go back to context.

One of the most promising solutions for training machines to identify sarcasm is a method constructed by David Bamman and Noah A. Smith that looks for context around tweets.

Rather than just analyzing tweets on their own, the model constructed by Bamman and Smith also looks at attributes of the author (author features), attributes of the intended recipient of a tweet (audience features), and attributes of the responses to potentially sarcastic tweets (response features).

On their sample data set, Bamman and Smith found that each additional feature set incrementally improved the accuracy of the model.

With all feature sets included, accuracy rose from a baseline of 75.4% to 85.1%.
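To make the idea of adding feature sets one at a time more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the four extractor functions, the field names (`text`, `author_history`, `addressee`, `reply`) and the toy features themselves are all assumptions made up for illustration. It only shows how context features can be layered on top of tweet-only features before they are handed to a classifier.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]

# All extractors below are hypothetical stand-ins, not the features from the study.
def tweet_features(ex: dict) -> Features:
    # Words of the tweet itself (the tweet-only starting point).
    return {f"word={w}": 1.0 for w in ex["text"].lower().split()}

def author_features(ex: dict) -> Features:
    # Words the author has used in earlier tweets.
    return {f"hist={w}": 1.0 for w in ex.get("author_history", [])}

def audience_features(ex: dict) -> Features:
    # Whether the tweet is addressed to a specific person.
    return {"has_addressee": 1.0 if ex.get("addressee") else 0.0}

def response_features(ex: dict) -> Features:
    # Words of a reply to the tweet, if there is one.
    return {f"reply_word={w}": 1.0 for w in ex.get("reply", "").lower().split()}

FEATURE_SETS: List[Tuple[str, Callable[[dict], Features]]] = [
    ("tweet", tweet_features),
    ("author", author_features),
    ("audience", audience_features),
    ("response", response_features),
]

def extract(ex: dict, n_sets: int) -> Features:
    """Combine the first n_sets feature sets, prefixing keys to avoid collisions."""
    combined: Features = {}
    for name, extractor in FEATURE_SETS[:n_sets]:
        for key, value in extractor(ex).items():
            combined[f"{name}:{key}"] = value
    return combined

example = {
    "text": "Great, another Monday",
    "author_history": ["yeah", "right"],
    "addressee": "@friend",
    "reply": "lol so true",
}

for n in range(1, len(FEATURE_SETS) + 1):
    names = [name for name, _ in FEATURE_SETS[:n]]
    print(names, len(extract(example, n)))
```

In an actual experiment, each of these growing feature dictionaries would be used to train and evaluate a separate model, so that the gain from every added layer of context can be measured on its own.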

One very important thing to note is that the data sample used in this study was created using self-identified sarcastic tweets that included the hashtags #sarcastic or #sarcasm.

The hashtags were used to label the tweets as sarcastic but were removed from the text before testing.
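The hashtag-based labeling described above can be sketched in a few lines of Python. This is an illustration only, not the pipeline used in the study: the function name `label_and_strip` and the exact matching rule (case-insensitive, whole hashtag only) are assumptions.

```python
import re
from typing import Tuple

# Matches #sarcasm or #sarcastic, in any letter case, as a whole hashtag.
SARCASM_TAG = re.compile(r"#sarcas(?:m|tic)\b", re.IGNORECASE)

def label_and_strip(tweet: str) -> Tuple[str, bool]:
    """Return the tweet without the marker hashtag, plus its sarcasm label."""
    is_sarcastic = SARCASM_TAG.search(tweet) is not None
    cleaned = SARCASM_TAG.sub("", tweet)
    # Collapse the whitespace left behind by the removed hashtag.
    cleaned = " ".join(cleaned.split())
    return cleaned, is_sarcastic

print(label_and_strip("Love waiting in line #sarcasm"))
print(label_and_strip("Nice day today"))
```

Stripping the hashtag matters because a model that can still see it only has to learn to spot the tag, not the sarcasm.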

Bringing context into the equation is probably the best available approach to getting more reliable results. Yet the question remains whether the success of this model can be replicated on less overtly sarcastic datasets.

Though promising approaches exist, fundamental problems remain

"We found that automatic classification can be as good as human classification; however, the accuracy is still low. Our results demonstrate the difficulty of sarcasm classification for both humans and machine learning methods."

Sure, when somebody identifies something as sarcastic with a hashtag or an explicit statement, it's easy to deal with. The problem is when it isn't stated explicitly.

Given all that we now know, it makes perfect sense why software providers with a sentiment analysis component would tell you that their software can detect sarcasm. Right?... #Sarcasm

The bottom line is this: Using natural language processing to detect sarcasm on the internet still has a long way to go and may never be particularly reliable.