Data Analytics of Hacker IRC Information

By Jiakai Yu

Cyber security is a challenging research problem, especially given the exponential growth of information technologies. As individuals, businesses, and governments rely heavily on cyber infrastructure for advanced information services and applications, cyber attacks have intensified in number, complexity, and impact. The main cyber security challenge lies on the human side: the technologies that hackers can exploit to commit cybercrime keep growing and advancing. Traditional cyber security techniques and tools have focused on cyber infrastructure, protocols, and applications. For example, vulnerability scanning tools (e.g., OpenVAS) analyze the vulnerabilities in computers, networks, and applications. In contrast, little work has been done to protect against cybercriminals by focusing on the human side, that is, on their cognitive behaviors and goals. Specifically, designing a system to protect our cyber infrastructure and information services against cyber adversaries is one of the unfulfilled tasks highlighted in a 2011 report on cyber security published by the National Science and Technology Council (NSTC). By focusing on the behavior of cybercriminals and understanding their goals, we can build comprehensive protection techniques and algorithms against them.

Recent reports on the behavior of hacker groups that use Internet Relay Chat (IRC) forums have given cyber security experts noteworthy data. Investigation of hacker IRC information has helped uncover cybercriminal operations and their near-future activities, and has enabled proactive cyber security measures against cyber attacks; for example, researchers were able to recognize botnet administrators, track the spread of malicious tools and skills, and identify key members of hacker communities. The goal of our research is to expand the capabilities available to collect and analyze hacker IRC information. In this project, we are developing an automated approach to collect information about hackers and to understand their behaviors and goals. IRC forums have been widely used by hackers to exchange data and tools and to train novice hackers. We present our approach to implementing an automated framework that uses several bots to collect IRC messages from malicious forums and analyze them. A resilient botnet mechanism is used to ensure complete IRC data collection. In addition, we developed an intelligent hacking language module based on Stanford CoreNLP to analyze hacker activity. Our experimental results show that our botnets can be used to effectively monitor, analyze, and predict hacker activities and goals.
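
To make the collection step concrete, the following is a minimal Python sketch of how a collector bot might turn raw IRC protocol lines into records for later analysis. It is an illustration under stated assumptions, not the framework described above: the names `IrcMessage`, `parse_irc_line`, `to_record`, and `pong_reply` are hypothetical, the line layout follows the general IRC message format (optional prefix, command, parameters, optional trailing text), and networking, reconnection, and the CoreNLP stage are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IrcMessage:
    """One parsed IRC protocol line (hypothetical helper type)."""
    prefix: Optional[str]  # sender, e.g. "nick!user@host"; None if absent
    command: str           # e.g. "PRIVMSG", "PING"
    params: List[str]      # middle parameters, then trailing text if any


def parse_irc_line(line: str) -> IrcMessage:
    """Split a raw IRC line into prefix, command, and parameters."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # A leading ':' marks the sender prefix, which ends at the first space.
        prefix, _, line = line[1:].partition(" ")
    if " :" in line:
        # Everything after the first " :" is a single trailing parameter.
        head, _, trailing = line.partition(" :")
        params = head.split()
        params.append(trailing)
    else:
        params = line.split()
    if not params:
        raise ValueError("empty IRC line")
    command = params.pop(0).upper()
    return IrcMessage(prefix, command, params)


def to_record(msg: IrcMessage) -> Optional[Dict[str, str]]:
    """Convert a channel message into a flat record; ignore other commands."""
    if msg.command != "PRIVMSG" or msg.prefix is None or len(msg.params) < 2:
        return None
    nick = msg.prefix.split("!", 1)[0]
    return {"nick": nick, "target": msg.params[0], "text": msg.params[1]}


def pong_reply(msg: IrcMessage) -> Optional[str]:
    """Build the PONG a bot must send to stay connected after a PING."""
    if msg.command != "PING" or not msg.params:
        return None
    return "PONG :" + msg.params[-1] + "\r\n"
```

In a sketch like this, each bot would feed every received line through `parse_irc_line`, answer keep-alive requests with `pong_reply`, and pass the output of `to_record` to storage and to the language analysis stage. Running several such bots against the same channel is one plausible way to reduce gaps when a single connection drops, though the actual resilience mechanism is not specified here.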