The Impact of Ambiguity in Social Listening and Analytics

There are many forms of ambiguity in social media posts, the most common being sarcasm. It is sometimes confused with irony, or the two terms are used interchangeably. Here is a definition of the two terms from stackexchange.com:

“Irony is used to convey, usually, the opposite meaning of the actual things you say, but its purpose is not intended to hurt the other person. Sarcasm, while still keeping the "characteristic" that you mean the opposite of what you say, unlike irony it is used to hurt the other person.”

For the purposes of this blog post, irony and sarcasm present the same problem when trying to automatically annotate a post with a sentiment or an emotion. The author of a social media post may write something positive about a brand, e.g. “I love the new flavour”, but if it is sarcastic then the post is really negative; and vice versa, e.g. “don’t you hate this ice cream flavour?”.

DigitalMR’s claim to fame, since 2014, is that its R&D focus on solving the problem of low accuracy in automated sentiment analysis in any language has produced a solution – listening247 – that delivers over 80% sentiment and semantic precision (precision is one of the accuracy metrics in big data analytics). The reason it is not, and cannot really be, 100% is ambiguity. The consequence of ambiguity in this context is that humans will not agree amongst themselves about the sentiment of a sarcastic or ironic post. Some will think it is positive, some will think it is negative, and in some cases others will think it is neutral for the brand mentioned in the post – i.e. the sentiment is directed not at the brand but at something else (see Fig.1). It follows that we cannot expect an algorithm to produce a result that everyone agrees with in such a case. In our research, we have found that on average 10%-30% of posts about a category contain some form of ambiguity. In the example below, 43% was the highest level of agreement among 30 market research practitioners; this is why 80% precision is an excellent result for automated sentiment analysis.
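To make the precision metric concrete, here is a minimal sketch of how per-class precision can be computed: of the posts a model tagged with a given sentiment, the fraction that a human curator tagged the same way. The posts and labels are invented for illustration and do not come from listening247.

```python
def precision(predicted, actual, label):
    """Of the posts the model tagged with `label`,
    what fraction did the human curator also tag with `label`?"""
    tagged = [(p, a) for p, a in zip(predicted, actual) if p == label]
    if not tagged:
        return 0.0
    correct = sum(1 for p, a in tagged if p == a)
    return correct / len(tagged)

# Illustrative model output vs. human annotations for six posts
model_labels = ["pos", "pos", "neg", "neu", "pos", "neg"]
human_labels = ["pos", "neg", "neg", "neu", "pos", "pos"]

# The model tagged 3 posts "pos"; the human agreed on 2 of them
print(precision(model_labels, human_labels, "pos"))  # 0.666...
```

Note that precision is judged against a human reference; when humans themselves disagree on ambiguous posts, the ceiling for any model drops below 100%.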

Figure 1: Manual Sentiment Curation of an Ambiguous Tweet (Base n=30)
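A simple way to quantify the disagreement in Figure 1 is to take the majority vote across curators and report the share of curators who chose it. The vote split below is invented to mirror the 43% top agreement reported in the text, not taken from the actual study data.

```python
from collections import Counter

# Hypothetical votes from 30 curators on one ambiguous tweet
votes = ["positive"] * 13 + ["negative"] * 10 + ["neutral"] * 7

counts = Counter(votes)
majority_label, n = counts.most_common(1)[0]
agreement = n / len(votes)

print(f"majority label: {majority_label}, agreement: {agreement:.0%}")
# majority label: positive, agreement: 43%
```

With only 43% agreement, even a perfect model would be “wrong” in the eyes of most of the curators, whichever label it picks.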

Some of you are already aware that DigitalMR uses machine learning to annotate sentiment in an automated way. Machine learning implies that there is an algorithm, or a combination of algorithms, trained on a training dataset to create a model that does the job. There is one scenario in which 100% sentiment precision can be expected: if supervised machine learning is used (as opposed to semi-supervised or unsupervised), humans create the training dataset manually. If only one human is responsible for creating the training dataset, then the model will use only that person’s judgement to annotate posts for sentiment. In such a case, because only one person has to agree with the sentiment annotated by the model, if that same person is the judge of the model’s precision then 100% precision is achievable – that person will not disagree with herself.
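The supervised setup described above can be sketched in a few lines: a curator’s labelled posts become the training dataset, and the resulting model can only mirror that curator’s judgement. The word-counting classifier and the tiny dataset below are purely illustrative; they are not listening247’s actual algorithm.

```python
from collections import defaultdict

def train(dataset):
    """Count how often each word appears under each sentiment label."""
    word_label_counts = defaultdict(lambda: defaultdict(int))
    for text, label in dataset:
        for word in text.lower().split():
            word_label_counts[word][label] += 1
    return word_label_counts

def predict(model, text):
    """Score each label by summing its word counts; highest score wins."""
    scores = defaultdict(int)
    for word in text.lower().split():
        for label, count in model[word].items():
            scores[label] += count
    return max(scores, key=scores.get) if scores else "neutral"

# Training dataset curated by a single (hypothetical) human annotator
curated = [
    ("i love the new flavour", "positive"),
    ("great ice cream", "positive"),
    ("i hate this flavour", "negative"),
    ("awful taste", "negative"),
]

model = train(curated)
print(predict(model, "love this ice cream"))  # positive
```

Because the model is trained exclusively on one person’s annotations, judging its output against that same person guarantees agreement by construction, which is exactly why single-curator precision figures can be misleadingly high.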

Needless to say, when machine learning is used for automated sentiment analysis, the identification of sarcasm/irony is, by definition, a solved problem. “Why?” you may ask. Because a human curator (the person who creates the training dataset) understands sarcasm and irony, and more often than not will detect them and annotate a post accordingly.