Real-time threat intelligence using Hadoop

Now that you are familiar with Hadoop and big data, you might ask: “Who uses Hadoop for real-time cyber-security?”

One example is McAfee Global Threat Intelligence (GTI), a product from McAfee (part of Intel) that collects data from millions of sensors worldwide and correlates it to provide real-time reputation scoring and threat intelligence. If you are a McAfee customer and need reputation scores for suspected “bad actors” on the internet, you can deploy a GTI proxy appliance at your site and have every McAfee endpoint there query the GTI application in the cloud through the proxy. The GTI application runs on a Hadoop cluster. Access to real-time threat intelligence helps McAfee endpoint products deliver more effective cyber-security.

Another example is IpTrust (Endgame Systems), a cloud-based service whose reputation scoring system collects data, runs it through MapReduce, and then hands the results to Cassandra (a distributed NoSQL database management system) running over the Hadoop Distributed File System (HDFS). Apparently they have a good business model, as their customers include HP and IBM. Why use Hadoop? Because if your goal is to mine millions or billions of log files for signs of botnet activity, it is hard to find a more scalable platform than open-source Hadoop.
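To make the "collect, run through MapReduce, store the result" idea concrete, here is a minimal sketch of the MapReduce pattern in plain Python. It is not IpTrust's or McAfee's actual pipeline, and it does not use the Hadoop API. The log format, the function names and the fan-out threshold are all assumptions made up for illustration: a source address that contacts many distinct destinations is flagged as a possible botnet member.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical log format (an assumption, not taken from any real product):
#   "<timestamp> <src_ip> <dst_ip> <dst_port>"
SAMPLE_LOG = [
    "1300000000 10.0.0.5 192.0.2.1 445",
    "1300000001 10.0.0.5 192.0.2.2 445",
    "1300000002 10.0.0.5 192.0.2.3 445",
    "1300000003 10.0.0.9 192.0.2.1 80",
    "1300000004 10.0.0.9 192.0.2.1 80",
    "garbage line",
]


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Map step: emit one (src_ip, dst_ip) pair per well-formed log line."""
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip malformed records
        _, src_ip, dst_ip, _ = fields
        yield src_ip, dst_ip


def shuffle(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[str]],
                 fanout_threshold: int = 3) -> Dict[str, int]:
    """Reduce step: keep sources that reached at least
    `fanout_threshold` distinct destinations (an illustrative heuristic)."""
    suspicious: Dict[str, int] = {}
    for src_ip, dst_ips in grouped.items():
        distinct = len(set(dst_ips))
        if distinct >= fanout_threshold:
            suspicious[src_ip] = distinct
    return suspicious


def find_suspicious_sources(lines: Iterable[str],
                            fanout_threshold: int = 3) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a single machine."""
    return reduce_phase(shuffle(map_phase(lines)), fanout_threshold)


if __name__ == "__main__":
    print(find_suspicious_sources(SAMPLE_LOG))
```

On a real cluster the map and reduce steps would run in parallel across many nodes and the grouping would be handled by the framework, which is where the scalability comes from. The single-machine version above only shows the shape of the computation.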

In conclusion, if you are a security vendor deploying a cloud-based reputation scoring service and you need to process and store far more data than traditional databases can handle, you should consider Hadoop as the foundation for your solution.