scikit-learn, machine learning and cybercrime attribution

Summary

The scikit-learn library is a rapidly growing open source toolkit for machine learning in Python. It allows practitioners and researchers to apply machine learning in a variety of applications and is used by companies worldwide. Developed by programmers from around the world, the project has a large (and increasing) number of machine learning algorithms and a very useful set of utility functions, and it has also spawned a set of detailed tutorials. Written in Python with the aid of NumPy, SciPy and Cython, the library is feature-rich, fast and extensible.

In this talk, I will introduce scikit-learn, giving an overview of the library, its features and how to use it for a number of different applications. Next, I will talk about some of the tutorials that are actively being developed for learning machine learning and scikit-learn, and about how to contribute. Through this, I'll introduce some key machine learning concepts and how you can apply them to a variety of tasks. The focus will be on practical uses, rather than theoretical advances.

To end the presentation, I'll give a brief overview of the research I perform at the Internet Commerce Security Laboratory (University of Ballarat) in cybercrime attribution, where I work with our industry partners to disrupt cybercrime. While direct network-based attribution can be very difficult, indirect methods based on criminal profiling may assist in stopping crimes such as phishing or online fraud. I'll walk through some of our results in identifying the size and scope of the operations behind some of these attacks.