
Sentiment analysis may employ machine learning techniques.
One often-applied method is the naïve Bayes classifier, where the algorithm is trained on a labeled data set.
The Python package NLTK includes a classic sentiment analysis data set (movie reviews) as well as general machine learning methods for sentiment classification. Some of the earliest papers on this approach are probably
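
As a rough illustration of the naïve Bayes approach, the following sketch trains a bag-of-words classifier with add-one smoothing on a labeled data set. It is written in plain Python rather than with NLTK, and the training sentences, the function names `train_naive_bayes` and `classify`, and the labels `pos`/`neg` are all made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_texts):
    """Count label frequencies and per-label word frequencies."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_texts:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Smoothed log likelihood of the word given the label.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only.
training_data = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a dull and boring film", "neg"),
    ("terrible acting and a boring story", "neg"),
]
model = train_naive_bayes(training_data)
print(classify("a great and moving story", model))
print(classify("dull and terrible", model))
```

A realistic experiment would use a much larger labeled corpus, such as the movie review data set mentioned above, and would evaluate on held-out texts.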

Another approach is to use a word list where each word has been scored for positivity/negativity or sentiment strength. There exist several word lists: ANEW is the oldest and has around 1,000 words, AFINN is newer and has around 2,500, while labMT has over 10,000 words scored.
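
The word list approach can be sketched in a few lines. The miniature word list `toy_word_scores` below is hypothetical: its words and scores are invented for illustration and are not taken from ANEW, AFINN, or labMT. The function `score_text` is likewise just one possible way to combine word scores, here by summing them.

```python
import re

# Hypothetical miniature word list; the scores are invented for illustration.
toy_word_scores = {
    "good": 3,
    "great": 3,
    "happy": 3,
    "bad": -3,
    "terrible": -3,
    "sad": -2,
}


def score_text(text, word_scores):
    """Sum the scores of all words in the text that appear in the word list."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(word_scores[word] for word in words if word in word_scores)


print(score_text("A great film with a happy ending", toy_word_scores))  # 6
print(score_text("Terrible plot and bad acting", toy_word_scores))  # -6
```

Summing favors long texts; dividing by the number of matched words is a common alternative when texts differ much in length.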

One way to extend word lists is to use word co-occurrence or a word ontology such as WordNet.[3] The method may go back to 1957.[4]
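
One simple way to picture the co-occurrence idea is to give an unscored word the mean score of the seed words it appears with in the same sentence. The sketch below does only that; the function `extend_word_list`, the `min_cooccurrences` parameter, the seed scores, and the toy corpus are all assumptions made for illustration, not a description of the methods in the cited works.

```python
from collections import defaultdict


def extend_word_list(seed_scores, sentences, min_cooccurrences=2):
    """Score unseen words by the mean score of the seed words they co-occur with."""
    cooccurring_scores = defaultdict(list)
    for sentence in sentences:
        words = set(sentence.lower().split())
        seeds_present = [seed_scores[w] for w in words if w in seed_scores]
        for word in words:
            if word not in seed_scores:
                cooccurring_scores[word].extend(seeds_present)

    extended = dict(seed_scores)
    for word, scores in cooccurring_scores.items():
        # Require some minimum evidence before assigning a score.
        if len(scores) >= min_cooccurrences:
            extended[word] = sum(scores) / len(scores)
    return extended


# Toy seed list and corpus, invented for illustration only.
seed = {"good": 2.0, "bad": -2.0}
corpus = [
    "good superb acting",
    "superb and good",
    "bad dreadful plot",
    "dreadful and bad",
]
extended = extend_word_list(seed, corpus)
print(extended["superb"], extended["dreadful"], extended["and"])
```

In this toy corpus the function word "and" co-occurs with both seeds and ends up with a score of zero; a practical version would likely filter out such words and use far more text.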

Sentiment analysis may use word lists annotated for arousal and for valence, i.e., whether the words are positive or negative.
Some word lists are listed and commented on in section 7.3 of the Pang/Lee monograph.
Some of the word lists are:

Linguistic Inquiry and Word Count (LIWC)[18]: commercial ($90) word lists with a computer program to extract basic counts and ratios. Contains dictionaries for English, German, Spanish, Dutch, and Italian. Extracts around 60 different word categories, including "positive emotions" and "negative emotions". The program can be purchased; the site also allows texts to be analyzed one by one.

Performance of a sentiment analysis system may depend on the corpus and on the annotation.
Annotation may be a sentiment strength for each text or a categorical variable: 2-class (positive/negative), 3-class (positive/negative/neutral), or 4-class (positive/negative/both/neutral).[32]
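
To make the annotation schemes concrete, the sketch below maps strength values to 3-class and 4-class labels and computes accuracy against a set of annotations. The thresholds, the function names, and the example values are hypothetical choices for illustration; the cited work may define the classes differently.

```python
def to_three_class(strength, neutral_band=0.5):
    """Map a signed sentiment strength to positive/negative/neutral."""
    if strength > neutral_band:
        return "positive"
    if strength < -neutral_band:
        return "negative"
    return "neutral"


def to_four_class(positive_strength, negative_strength, threshold=0.5):
    """Map separate positive and negative strengths to one of four classes."""
    has_positive = positive_strength > threshold
    has_negative = negative_strength > threshold
    if has_positive and has_negative:
        return "both"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


def accuracy(predicted, annotated):
    """Fraction of texts where the predicted label equals the annotation."""
    matches = sum(p == a for p, a in zip(predicted, annotated))
    return matches / len(annotated)


# Invented strengths and annotations for four texts.
predicted = [to_three_class(s) for s in [1.2, -0.1, -2.0, 0.7]]
annotated = ["positive", "neutral", "negative", "neutral"]
print(predicted)
print(accuracy(predicted, annotated))  # 0.75
print(to_four_class(1.0, 2.0))  # both
```

The same system output can score differently under different schemes, which is one reason reported performance figures are hard to compare across corpora.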

Human performance in sentiment analysis has been reported to be 82-90%.[33]