ABSTRACT

Products reviews are one of the major resources to determine the public
sentiment. The existing literature on reviews sentiment analysis mainly
utilizes supervised paradigm, which needs labeled data to be trained on and
suffers from domain-dependency. This article addresses these issues by
describes a completely automatic approach for sentiment analysis based on
unsupervised ensemble learning. The method consists of two phases. The first
phase is contextual analysis, which has five processes, namely (1) data
preparation; (2) spelling correction; (3) intensifier handling; (4) negation
handling and (5) contrast handling. The second phase comprises the unsupervised
learning approach, which is an ensemble of clustering classifiers using a
majority voting mechanism with different weight schemes. The base classifier of
the ensemble method is a modified k-means algorithm. The base classifier is
modified by extracting initial centroids from the feature set via using
SentWordNet (SWN). We also introduce new sentiment analysis problems of
Australian airlines and home builders which offer potential benchmark problems
in the sentiment analysis field. Our experiments on datasets from different
domains show that contextual analysis and the ensemble phases improve the
clustering performance in term of accuracy, stability and generalization
ability.