ABSTRACT

Botnets, as one of the most aggressive threats, has used different techniques, topologies, and communication protocols in different stages of their lifecycle since 2003. Hence, identifying botnets has become very challenging specifically given that they can upgrade their methodology at any time. Various detection approaches have been proposed by the cyber-security researchers, focusing on different aspects of these threats. In this work, 5 different botnet detection approaches are investigated. These systems are selected based on the technique used and type of data used where 2 are public rule–based systems (BotHunter and Snort) and the other 3 use machine learning algorithm with different feature extraction methods (packet payload based and traffic flow based). On the other hand, 4 of these systems are based on a priori knowledge while one is using minimum a priori information. The objective in this analysis is to evaluate the effectiveness of these approaches under different scenarios (eg, multi-botnet and single-botnet classifications) as well as exploring how a system with minimum a priori information would perform. The goal is to investigate if a system with minimum a priori information could result in a competitive performance compared to systems using a priori knowledge. The evaluation is shown on 24 publicly available botnet data sets. Results indicate that a machine learning–based system with minimum a priori information not only achieves a very high performance but also generalizes much better than the other systems evaluated on a wide range of botnet structures (from centralized to decentralized botnets).