Content-based prediction of temporal boundaries for events in Twitter

Book Title: Proceedings of the Third IEEE International Conference on Social Computing

Date: October 09, 2011

Abstract: Social media services like Twitter, Flickr and YouTube publish high volumes of user-generated content as a major event occurs, making them a potential data source for event analysis. The large volume and noisy content of social media make automatic preprocessing essential. Intuitively, the event-related data falls into three major phases: the buildup to the event, the event itself, and the post-event effects and repercussions. We describe an approach to automatically determine when an anticipated event started and ended by analyzing the content of tweets using an SVM classifier and a hidden Markov model. We evaluate the approach by predicting event boundaries on Twitter data for a set of events in the domains of sports, weather and social activities.
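
As an illustration of how a per-window classifier and a hidden Markov model can be combined to recover the three phases, the following is a minimal sketch and not the authors' implementation. It assumes that tweets have already been grouped into time windows and that some classifier (for example an SVM with probabilistic outputs) has produced a score triple (before, during, after) for each window. The function names, the `stay` parameter, and the example scores are all hypothetical. The sketch runs Viterbi decoding over a left-to-right chain, so the decoded sequence can only move forward from "before" to "during" to "after".

```python
import math

PHASES = ("before", "during", "after")


def decode_phases(scores, stay=0.9):
    """Viterbi decoding over a left-to-right three-state chain.

    scores: list of (p_before, p_during, p_after), one triple per time
            window; hypothetical per-window classifier outputs.
    stay:   assumed probability of remaining in the current phase.
    """
    n = len(PHASES)
    neg_inf = float("-inf")
    eps = 1e-12  # avoids log(0) for zero scores

    def log_trans(i, j):
        if i == j:
            # the last phase is absorbing, earlier phases persist with `stay`
            return 0.0 if i == n - 1 else math.log(stay)
        if j == i + 1:
            return math.log(1.0 - stay)
        return neg_inf  # no skipping ahead and no going back

    # Assumption: the sequence always starts in the "before" phase.
    best = [math.log(scores[0][0] + eps), neg_inf, neg_inf]
    back = []
    for obs in scores[1:]:
        new_best, pointers = [], []
        for j in range(n):
            cands = [best[i] + log_trans(i, j) for i in range(n)]
            i_best = max(range(n), key=lambda i: cands[i])
            pointers.append(i_best)
            new_best.append(cands[i_best] + math.log(obs[j] + eps))
        best = new_best
        back.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n), key=lambda j: best[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [PHASES[s] for s in path]


def boundaries(path):
    """Return (start, end) window indices, or None where a phase is absent."""
    start = path.index("during") if "during" in path else None
    end = path.index("after") if "after" in path else None
    return start, end


# Made-up scores; window 4 is a noisy window that looks like "before".
example_scores = [
    (0.8, 0.1, 0.1),
    (0.7, 0.2, 0.1),
    (0.3, 0.6, 0.1),
    (0.1, 0.8, 0.1),
    (0.6, 0.3, 0.1),
    (0.1, 0.8, 0.1),
    (0.1, 0.2, 0.7),
    (0.1, 0.1, 0.8),
]
example_path = decode_phases(example_scores)
example_start, example_end = boundaries(example_path)
```

Taking the highest score in each window independently would label window 4 as "before" in this example, which is not a valid phase ordering. The left-to-right constraint keeps that window in "during", so the event start falls at window 2 and the event end at window 6.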