Abstract

Firms are turning to social media analytics to learn what people are really saying about the firm or its products. With the huge amount of buzz created online about anything and everything, social media has become ‘the’ platform for understanding what the public as a whole is saying about a particular product. The process of converting all this talk into valuable information is called Sentiment Analysis: identifying and categorizing a piece of text as positive or negative so as to understand the sentiment of its author. This chapter takes the reader from basic text analytics techniques such as word clouds, commonality clouds, dendrograms and comparison clouds to advanced classification algorithms such as k-nearest neighbour, Naïve Bayes and Support Vector Machine.

Introduction

The Web 2.0 model allows content to flow freely, letting users communicate with one another and express their ideas and opinions on the Internet; the classic example is Facebook. Hundreds of online communities, groups and pages have been set up by experts in fields such as physical training and cooking, among many others. These communities have large followings, whose members follow the topics the experts share. Other examples include IMDb, where critics share their reviews of movies online. Twitter is a popular platform on which people share their opinions in a limited number of words; these short posts are called tweets. Everyone from celebrities to politicians is active on Twitter, sharing opinions, and this makes Twitter a great place to analyse sentiment.

Sentiment Analysis means analysing a piece of text to understand what it really means for a company. Many of us read movie reviews before we go to watch a movie, and those reviews sometimes shape our attitude towards it. In this case, sentiment analysis can be used to understand whether users have a positive or a negative inclination towards a movie, and from that the likely fate of the movie can be judged. Such is the power of Sentiment Analysis. This is why marketers invest heavily in assessing the sentiment of customers towards their products; the results of these analyses can be used to shape further marketing plans.

Just as there are various mediums through which users express their sentiments, there are different ways in which those sentiments can be analysed. The basic technique uses text analytics to analyse words on the basis of their occurrence. For now, let us assume that the sentiment of a user is directly proportional to the number of good or bad words used in a sentence. This assumption makes the analyst’s work easy, but the technique suffers from a problem of ambiguity. Consider the following tweets posted online: “Hillary is not good”, “Hillary might not be a good candidate for President Elections”, “Hillary is good”, and so on. The word ‘good’ appears in every tweet, so according to the technique the overall sentiment of the people towards Hillary must be positive, but this is not actually so. The first tweet is negative, the second is neutral and the third is positive, making the overall sentiment neutral. The ambiguity arises because the technique works only on the frequency of words and does not consider other factors, such as the negation ‘not’. To resolve this, there are advanced classification algorithms that work on different criteria and make the overall task of classification more reliable.
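The ambiguity can be reproduced with a minimal sketch of the word-counting technique. Python is used here only for illustration and is not necessarily the tool used in the rest of the chapter. The word lists POSITIVE_WORDS and NEGATIVE_WORDS and the helper functions are hypothetical and deliberately tiny; a real analysis would rely on a full sentiment lexicon.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists (an assumption for illustration).
POSITIVE_WORDS = {"good"}
NEGATIVE_WORDS = {"bad"}

# The three example tweets from the text.
tweets = [
    "Hillary is not good",
    "Hillary might not be a good candidate for President Elections",
    "Hillary is good",
]

def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def naive_score(text):
    """Number of positive words minus number of negative words."""
    tokens = tokenize(text)
    positives = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negatives = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return positives - negatives

# Word frequencies across all tweets, as a word cloud would count them.
word_counts = Counter(token for tweet in tweets for token in tokenize(tweet))

# Per-tweet scores and the overall score under the frequency assumption.
scores = [naive_score(tweet) for tweet in tweets]
total = sum(scores)

print(word_counts["good"])  # 3
print(scores)               # [1, 1, 1]
print(total)                # 3
```

Every tweet receives a score of +1 because the counter sees ‘good’ and ignores ‘not’, so the overall score is positive even though only the third tweet actually is.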

This chapter covers two applications of Social Media Analytics. The first uses text analytics to determine the sentiments of people in the US towards the presidential election between Trump and Hillary. The second belongs to advanced analytics: it uses algorithms such as k-nearest neighbour, Support Vector Machine and Naïve Bayes to classify the sentiments of students towards the services provided by an educational institution as positive, negative or neutral, and then uses the best classification method to build a model using logistic regression.