Title

Author

Defense Date

2014

Document Type

Thesis

Degree Name

Master of Science

Department

Computer Science

First Advisor

Carol Fung

Abstract

Twitter generates the majority of its revenue from advertising. Third parties pay to have their products advertised on Twitter through tweets, accounts, and trends. However, spammers can use Sybil (fake) accounts [21] to advertise without paying. Sybil accounts are highly active on Twitter, running advertising campaigns on behalf of their clients [5], and they aggressively try to reach a large audience to maximize their influence. Accounts controlled by the same master tend to behave similarly: most of their spam tweets include a shortened URL to trick users into clicking it, and because they share resources, they tend to tweet about the same trending topics to attract a larger audience. Some Sybil accounts, however, avoid spamming aggressively in order to evade detection [22], which makes traditional spam detectors ineffective against low-spamming Sybil accounts. In this thesis, I investigate additional criteria for measuring the similarity between Twitter accounts. I propose an algorithm that defines the correlation among accounts by examining their tweeting habits and content. Given accounts already labeled by spam detectors, this approach can detect hidden accounts that are closely related to the labeled accounts but go undetected by traditional spam detection approaches.
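
The idea of relating unlabeled accounts to labeled ones can be illustrated with a minimal sketch. It is not the algorithm proposed in the thesis, which the abstract does not specify. It assumes, purely for illustration, that each account is summarized by the set of shortened URLs and the set of hashtags it has tweeted, that similarity is a weighted Jaccard overlap of those sets, and that a fixed threshold decides what counts as "closely related". The function names, the feature keys (`urls`, `hashtags`), the weight, the threshold, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical feature layout: account name -> {"urls": {...}, "hashtags": {...}}
Features = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def account_similarity(x: Features, y: Features, url_weight: float = 0.5) -> float:
    """Weighted mix of URL overlap and hashtag overlap (the weight is an assumption)."""
    return (url_weight * jaccard(x["urls"], y["urls"])
            + (1.0 - url_weight) * jaccard(x["hashtags"], y["hashtags"]))


def find_hidden_accounts(accounts: Dict[str, Features],
                         labeled: Set[str],
                         threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return unlabeled accounts whose best similarity to any labeled
    account reaches the threshold, most similar first."""
    hits = []
    for name, features in accounts.items():
        if name in labeled:
            continue  # already flagged by the upstream spam detector
        best = max(
            (account_similarity(features, accounts[s])
             for s in labeled if s in accounts),
            default=0.0,
        )
        if best >= threshold:
            hits.append((name, best))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up sample data, for illustration only.
accounts = {
    "spam_a": {"urls": {"bit.ly/x1", "bit.ly/x2"}, "hashtags": {"#deal", "#win"}},
    "quiet_b": {"urls": {"bit.ly/x1", "bit.ly/x2"}, "hashtags": {"#deal"}},
    "user_c": {"urls": {"example.com/blog"}, "hashtags": {"#coffee"}},
}
labeled = {"spam_a"}  # accounts a spam detector has already labeled

hidden = find_hidden_accounts(accounts, labeled)
```

In this toy data, `quiet_b` shares both shortened URLs and one of the two hashtags with the labeled account, so it is flagged, while `user_c` shares nothing and is not. A real system would likely need richer signals, such as posting times or text similarity, and a threshold tuned on labeled data.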