Classification of Malicious Domain Names Using Support Vector Machine and Bi-Gram Method

Every day, millions of domains are registered, and some of them are related to malicious activities. Recently, domain names have been used to operate malicious networks such as botnets and other types of malicious software (malware). Studies have revealed that it is challenging to keep track of malicious domains by Web content analysis or human observation because of the large number of domains. Legitimate domain names usually consist of English words or other meaningful sequences and are easy for humans to understand, whereas malicious domains are generated randomly and either do not include meaningful words or are otherwise unreadable. Recently, a method has been proposed to classify malicious domain names; it used many features from DNS queries, including some textual features.
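
To make the contrast between readable and randomly generated names concrete, the following is a minimal sketch of how bi-gram features might be extracted from a domain name. It assumes character-level bi-grams taken from the first label of the name only; the function names `char_bigrams` and `bigram_vector` and the example vocabulary are hypothetical and are not taken from the paper.

```python
from collections import Counter


def char_bigrams(domain: str) -> Counter:
    """Count overlapping character bi-grams in a domain name.

    Assumption: only the first dot-separated label is used
    (e.g. "google" in "google.com"); this is a simplification.
    """
    label = domain.lower().split(".")[0]
    return Counter(label[i:i + 2] for i in range(len(label) - 1))


def bigram_vector(domain: str, vocabulary: list[str]) -> list[int]:
    """Map a domain to a fixed-length count vector over a given bi-gram vocabulary."""
    counts = char_bigrams(domain)
    # Counter returns 0 for bi-grams that do not occur in the domain.
    return [counts[bigram] for bigram in vocabulary]


if __name__ == "__main__":
    vocabulary = ["an", "na", "zz"]  # hypothetical, tiny vocabulary
    print(bigram_vector("banana.org", vocabulary))  # [2, 2, 0]
```

Fixed-length vectors of this kind could then serve as input to a classifier such as a support vector machine. Names built from English words tend to reuse common bi-grams, whereas randomly generated names tend to spread their counts over rare ones.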