Use ELK to Visualise Security Data: IPTables and KippoSSH Honeypot

Among the countless possible use cases where ELK can help save the day, displaying security-relevant data is certainly a very interesting one. In this blog post, using a virtual machine sitting in the cloud, we're going to show how to quickly set up a clustered instance of Elasticsearch to visualise firewall and honeypot data sources, namely IPTables and KippoSSH, focusing on the ELK-relevant configuration bits.

KippoSSH is a medium-interaction honeypot capable of recording plenty of information about the attacker, including recordings of interactive TTY sessions; for the purpose of this blog post we'll leave that latter piece of information aside and focus on making sense of some brute-force data.
Starting from the live raw data, we have two kinds of logs: IPTables entries for denied connections, and KippoSSH entries for the SSH login attempts made against the honeypot.

We'll need to tell Logstash where the logs are located and what to do with them; below is our Logstash config file ($LOGSTASH_HOME/config/logstash.conf). Notice the three different sections: Input, Filter, Output:
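
A minimal sketch of such a config follows; the file paths, the event type names and the KIPPO_LOGIN_FAILED pattern name are placeholders to adapt to your own setup (IPTABLES_DENIED is the pattern we define further down):

input {
  file {
    # assumed location of the syslog file receiving the IPTables messages
    path => "/var/log/syslog"
    type => "iptables"
  }
  file {
    # assumed location of the KippoSSH log file
    path => "/home/kippo/kippo/log/kippo.log"
    type => "kippo"
  }
}

filter {
  if [type] == "iptables" {
    grok {
      # our custom definitions live in the pattern file under $LOGSTASH_HOME/patterns
      patterns_dir => "./patterns"
      match => [ "message", "%{IPTABLES_DENIED}" ]
    }
  }
  if [type] == "kippo" {
    grok {
      patterns_dir => "./patterns"
      # KIPPO_LOGIN_FAILED is an illustrative pattern name
      match => [ "message", "%{KIPPO_LOGIN_FAILED}" ]
    }
  }
}

output {
  if [type] == "iptables" {
    elasticsearch { host => "localhost" protocol => "http" index => "logstash-os" }
  } else {
    elasticsearch { host => "localhost" protocol => "http" index => "logstash-honey" }
  }
}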

Notice in the above output section that both index names start with the logstash- prefix.

This is in order to leverage the logstash-* index template mapping, which gives us both 'analyzed' and 'not_analyzed' versions of the fields, so that we can correctly draw our dashboards. See more on this aspect here.
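
If you're curious about what that template actually contains, you can pull it from the cluster once Logstash has pushed it (logstash is the default template name used by the Logstash 1.x Elasticsearch output):

$ curl -XGET 'localhost:9200/_template/logstash?pretty'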

Now let's take a closer look at the Grok definitions we will be using.

We can get our raw data recognised and parsed by adding the 4 lines below to a pattern file in $LOGSTASH_HOME/patterns:
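
For illustration, the definitions look something like the following; IPTABLES_DENIED is the pattern we reference from the Logstash config, while the Kippo pattern names, the field names and the exact layouts are assumptions based on typical IPTables and Kippo log lines:

IPTABLES_DENIED %{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} kernel:.*SRC=%{IP:src_ip} DST=%{IP:dst_ip}.*PROTO=%{WORD:protocol}.*DPT=%{INT:dst_port}
KIPPO_PREFIX %{TIMESTAMP_ISO8601:timestamp} \[%{DATA:service},%{INT:session},%{IP:src_ip}\]
KIPPO_LOGIN_FAILED %{KIPPO_PREFIX} login attempt \[%{USERNAME:username}/%{DATA:password}\] failed
KIPPO_LOGIN_SUCCESS %{KIPPO_PREFIX} login attempt \[%{USERNAME:username}/%{DATA:password}\] succeeded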

Here we are just telling Grok about these new definitions, which we referenced in the main Logstash config file earlier. Grok will match the event types we have configured and use the regular expressions above to extract our fields of interest.

You can see that some definitions are based on already existing patterns - e.g. IPTABLES_DENIED reuses the Grok patterns SYSLOGTIMESTAMP, HOSTNAME and IP. This is one of the key strengths of Grok, saving you from writing and rewriting regex after regex.

Using good online regex tools can help speed up this task. Grok Debugger is a great tool you can rely on to construct and debug Grok patterns. Some other resources worth mentioning for pure regex testing are regex101.com and regextester.com.

See Grok Debugger in action below:
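
For example, pasting a made-up denied-connection line together with the IPTABLES_DENIED pattern sketched above would give back the extracted fields, roughly:

Input line:
Feb  4 21:13:35 myhost kernel: IN=eth0 OUT= SRC=203.0.113.45 DST=198.51.100.7 PROTO=TCP SPT=51234 DPT=22

Extracted fields:
timestamp: Feb  4 21:13:35
hostname:  myhost
src_ip:    203.0.113.45
dst_ip:    198.51.100.7
protocol:  TCP
dst_port:  22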

Now we're good to index as much data as we like (and our hardware can handle), so we'll move to the next layer of the ELK stack.

Store the data and make it searchable: Elasticsearch

Setting up an Elasticsearch cluster is straightforward. In this demo we set up two Elasticsearch nodes on a single host; in some specific scenarios you could also do this in production, provided you have enough capacity on a single machine. See here for more details.

We could leave the default settings and get started by:

extracting the elasticsearch tar archive

launching each instance as a daemon (as sketched below)

and these instances would just talk to each other like good old friends using multicast zen discovery and form a cluster with no configuration needed!
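
Concretely, with an Elasticsearch 1.x tarball (the version below is just an example), that boils down to something like:

$ tar -xzf elasticsearch-1.4.2.tar.gz
$ elasticsearch-1.4.2/bin/elasticsearch -d
$ elasticsearch-1.4.2/bin/elasticsearch -d

The -d flag starts the node as a daemon; launching the binary a second time gives us the second node, which binds to the next free HTTP and transport ports (9201 and 9301).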

As we'd like to have a bit more control over the cluster behaviour, we proceed to amend some defaults for each of the 2 nodes. For example, we set node 1 to use:

#Set a name for each node, for readability and manageability
node.name: "node-1"
#Disable multicast discovery, you never know what happens in a network you don't own
discovery.zen.ping.multicast.enabled: false
#Configure an initial list of master nodes in the cluster, we know who we are
discovery.zen.ping.unicast.hosts: ["localhost:9301"]
#Set a cluster name; elasticsearch is a nice default name, but we'd rather set our own here
cluster.name: joinus
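
Node 2 gets the mirror-image settings; a sketch, assuming node 1 is listening on the default transport port 9300:

#Same idea on the second node, pointing back at node 1
node.name: "node-2"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["localhost:9300"]
cluster.name: joinus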

And that is it!

Cluster 'joinus' is ready to accept data, make it searchable and resilient, and do all the magic Elasticsearch does for us.
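
A quick way to check that the two nodes have actually found each other is to ask for the cluster health; the response should report "cluster_name" : "joinus" and "number_of_nodes" : 2:

$ curl -XGET 'localhost:9200/_cluster/health?pretty'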

We can now create two indexes in our Elasticsearch cluster by issuing:

$ curl -XPUT 'localhost:9200/logstash-os?pretty'

and

$ curl -XPUT 'localhost:9200/logstash-honey?pretty'

For each of these requests, we will receive the below answer, saying index creation was successful:

{
"acknowledged" : true
}

We could also tell Elasticsearch explicitly how to interpret our fields (is this an integer or a date or a string), but for this simple demo we're happy to let Elasticsearch automatically determine the field types.
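
For completeness, an explicit mapping would just be a PUT request; a sketch, where the iptables type name and the field names come from our illustrative Grok pattern rather than from anything Elasticsearch requires:

$ curl -XPUT 'localhost:9200/logstash-os/_mapping/iptables?pretty' -d '
{
  "iptables" : {
    "properties" : {
      "src_ip"   : { "type" : "ip" },
      "dst_port" : { "type" : "integer" }
    }
  }
}'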

Open my eyes, show me what you've got: Kibana

While we impatiently wait for Kibana 4 to go GA, let's use Kibana 3 to plot some data.

We have now left this running for a while to allow the bad guys to feed us with some events and make our dashboards nice and pretty.
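
To check that events are actually flowing in, a quick count against each index will do:

$ curl -XGET 'localhost:9200/logstash-os/_count?pretty'
$ curl -XGET 'localhost:9200/logstash-honey/_count?pretty'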

Notice our document types are showing up in the 'Document Types' panel. On the left-hand side we also have a list of the available searchable fields:

Now that we have validated the data looks good, we can go ahead and start building our first dashboard from scratch!

Let's go one step back, choose option 3, 'Blank Dashboard', and set a title for our new dashboard.

And point it towards the index 'logstash-os' where we are storing IPTables events (by default the '_all' indexes are set to be queried).

Then let's add a row to our empty 'Denied Connections' dashboard.

Finally, let's add our first panel and choose panel type 'Terms' to show some nice aggregations.

Notice that as you type the first character into the Field form, Kibana will show you all the possible fields matching the string as you type it. We also have fields available here with the .raw extension; this is because we are leveraging the logstash index mapping template to get 'not_analyzed' versions of the fields.

We now choose 'src_ip' for this panel and set up some more options: we want to see the top 20 IPs, sorted by count, using bars as the visualisation format.

Just hit 'Save'. Et voilà!

You know, Kibana rocks!
Repeat the above steps and have fun!

We have now plenty of data to leverage, manipulate and visualise from many different angles.

We can ask all the questions we want: just add a few queries to slice the data as you please; each panel can display all or a selection of the queries.
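
Kibana 3 queries use the Lucene query string syntax, so filters along these lines work (the field names are the ones from our illustrative Grok patterns):

dst_port:22
username:root AND NOT password:123456
src_ip:"203.0.113.45"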

Then we can plot beautiful charts giving us lots of insight on the bad guys trying to access our host:

Likewise, for our KippoSSH honeypot data.

You can click on any dashboard, drill down and start your investigations: just ask the questions, ELK will give you the answers - easy peasy!