Going beyond Wireshark: experiments in visualising network traffic

Introduction

At NVISO Labs, we are constantly trying to find better ways of understanding the data our analysts are looking at. This ranges from our SOC analysts looking at millions of collected data points per day all the way to the malware analyst tearing apart a malware sample and trying to make sense of its behaviour.

In this context, our work often involves investigating raw network traffic logs. Analysing these often takes a lot of time: a 1 MB network traffic capture (PCAP) can easily contain several hundred packets (and most captures are much larger!). Going through these network events in traditional tools such as Wireshark is extremely valuable; however, these tools are not always the best way to quickly understand, at a higher level, what is actually going on.

In this blog post we want to perform a series of experiments to try and improve our understanding of captured network traffic as intuitively as possible, by exclusively using interactive visualisations.

A screen we are all familiar with – our beloved Wireshark! Unmatched capabilities to analyse even the most exotic protocols, but scrolling & filtering through events can be daunting if we want to quickly understand what is actually happening inside the PCAP. The screenshot shows a 15 KB sample containing 112 packets.

For this blog post we will use this simple 112-packet PCAP to experiment with novel ways of visualising and understanding our network data. Let’s go!

Experiment 1 – Visualising network traffic using graph nodes

As a first step, we simply represent all IP packets in our PCAP as unconnected graph nodes. Each dot in the visualisation represents the source of a packet. A packet being sent from source A to destination B is visualised as a dot travelling from A to B. This simple principle is highlighted below. For our experiments, the time dimension is normalised: packets travelling from A to B are visualised in the order in which they were sent, but we ignore the time elapsed between packets for now.

IP traffic illustrated as interactive nodes.

This visualisation already allows us to quickly see and understand a few things:

We quickly see which IP addresses are most actively communicating with each other (172.29.0.111 and 172.29.0.255)

It’s quickly visible which hosts account for the “bulk” of the traffic

We see how interactions between systems change as time moves on.
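The two ideas behind this first experiment, normalised time and "who talks to whom the most", can be sketched in a few lines of Python. This is only an illustration, not the code behind the visualisation: the packet tuples are made-up sample data (only the IP addresses come from this post), and the function names `normalise_time` and `busiest_pairs` are our own. Parsing the PCAP itself is left out.

```python
from collections import Counter

# Hypothetical sample data: (timestamp in seconds, source IP, destination IP).
# In practice these tuples would come from a PCAP parser.
packets = [
    (0.00, "172.29.0.111", "172.29.0.255"),
    (0.02, "172.29.0.111", "172.29.0.255"),
    (5.40, "172.29.0.111", "68.87.71.230"),
    (5.45, "68.87.71.230", "172.29.0.111"),
    (9.10, "172.29.0.111", "172.29.0.255"),
]


def normalise_time(packets):
    """Replace timestamps by a step index: the order is kept, durations are dropped."""
    ordered = sorted(packets, key=lambda p: p[0])
    return [(step, src, dst) for step, (_, src, dst) in enumerate(ordered)]


def busiest_pairs(packets):
    """Count packets per unordered pair of hosts, most active pair first."""
    counts = Counter(frozenset((src, dst)) for _, src, dst in packets)
    return counts.most_common()


events = normalise_time(packets)
pairs = busiest_pairs(packets)

for pair, count in pairs:
    print(" <-> ".join(sorted(pair)), count)
```

Treating a pair as unordered means A to B and B to A count towards the same total, which matches the "most actively communicating with each other" reading above.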

A few shortcomings of this first experiment include:

We have no clue on the actual content of the network communication (is this DNS? HTTP? Something else?)

IP addresses are made to be processed by a computer, not by a human; adding context to make them easier to classify by a human analyst would definitely help.

Experiment 2 – Adding context to IP address information

By using basic information to enrich the nodes in our graph, we can aggregate all LAN traffic into a single node. This lets us quickly see which external systems our LAN hosts are communicating with:

By doing this, we have improved our understanding of the data in a few ways:

We can very quickly see which traffic is leaving our LAN, and which external (internet facing) systems are involved in the communication.

All the internal LAN traffic is now represented as 1 node in our graph; this can be interesting in case our analyst wants to quickly check which network segments are involved in the communication.
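One way to picture this aggregation is to relabel every address before drawing the graph. The sketch below is an assumption on our side: it treats "LAN" as the RFC 1918 private ranges, and the names `LAN_NETWORKS`, `node_label` and `aggregate_lan` are purely illustrative. A real deployment would more likely use its own list of internal subnets.

```python
import ipaddress
from collections import Counter

# Assumption: "LAN" means the RFC 1918 private IPv4 ranges.
LAN_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def node_label(ip):
    """Return "LAN" for internal addresses, the address itself otherwise."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in LAN_NETWORKS):
        return "LAN"
    return ip


def aggregate_lan(flows):
    """Count (source, destination) edges after collapsing LAN hosts into one node."""
    return Counter((node_label(src), node_label(dst)) for src, dst in flows)


# Hypothetical flows using addresses mentioned in this post.
flows = [
    ("172.29.0.111", "172.29.0.255"),
    ("172.29.0.111", "68.87.71.230"),
    ("68.87.71.230", "172.29.0.111"),
]
edges = aggregate_lan(flows)
print(edges)
```

LAN-internal traffic ends up as a self-loop on the "LAN" node, so it is still counted but no longer clutters the picture.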

However, we still face a few shortcomings:

We still don’t really have a clue on the actual content of the network communication (is this DNS? HTTP? Something else?)

We don’t know much about the external systems that are being contacted.

Experiment 3 – Isolating specific types of traffic

By applying simple visual filters to our simulation (just like we do in tools like Wireshark), we can make a selection of the packets we want to investigate. The benefit of doing this is that we can easily focus on the type of traffic we want to investigate without being burdened with things we don’t care about at that point in the analysis.

In the example below, we have isolated DNS traffic; we quickly see that the PCAP contains communication between hosts in our LAN (remember, the LAN dot now represents traffic from multiple hosts!) and 2 different DNS servers.

When isolating DNS traffic in our graph, we clearly see communication with a non-corporate DNS server.
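Conceptually, such a filter is just a predicate applied to every packet. Below is a minimal sketch, assuming DNS queries can be recognised by destination port 53; the `Packet` class, the helper names and the sample data are hypothetical, and 192.0.2.53 is a documentation address standing in for the corporate DNS server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    protocol: str  # "udp" or "tcp"
    dst_port: int


def is_dns(pkt):
    # Assumption: a DNS query is any UDP/TCP packet sent to port 53.
    return pkt.protocol in ("udp", "tcp") and pkt.dst_port == 53


def apply_filter(packets, predicate):
    """Keep only the packets the analyst currently cares about."""
    return [p for p in packets if predicate(p)]


def dns_servers(packets):
    """List the distinct servers that receive DNS queries."""
    return sorted({p.dst for p in apply_filter(packets, is_dns)})


# Hypothetical sample: two DNS queries, one broadcast, one HTTP request.
sample = [
    Packet("172.29.0.111", "192.0.2.53", "udp", 53),
    Packet("172.29.0.111", "68.87.71.230", "udp", 53),
    Packet("172.29.0.111", "172.29.0.255", "udp", 138),
    Packet("172.29.0.111", "203.0.113.10", "tcp", 80),
]
print(dns_servers(sample))
```

Because the filter is a plain function, swapping `is_dns` for another predicate (HTTP only, one host only) changes the selection without touching the rest of the pipeline.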

Once we notice that the rogue DNS server is being contacted by a LAN host, we can change our visualisation to see which domain name is being queried by which server. We also conveniently attached the context tag “Suspicious DNS server” to host 68.87.71.230 (the result of our previous experiment). The result is illustrated below. It also shows that we are not limited to showing relations between IP addresses; we can, for example, illustrate the link between DNS servers and the hosts they query.

We clearly see the suspicious DNS server making requests for 2 different domain names. For each request made by the suspicious DNS server, we see an interaction from a host in the LAN.
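The change of relation (server to domain name instead of IP to IP) and the context tag can both be expressed as a small lookup step. The sketch below is illustrative only: the query tuples and domain names are invented placeholders, and `CONTEXT_TAGS`, `label` and `domains_by_server` are names we made up for this example.

```python
from collections import defaultdict

# Context attached by the analyst; the tag text is the one used in this post.
CONTEXT_TAGS = {"68.87.71.230": "Suspicious DNS server"}

# Hypothetical DNS queries: (client, DNS server, queried domain name).
queries = [
    ("172.29.0.111", "68.87.71.230", "example.com"),
    ("172.29.0.111", "68.87.71.230", "example.org"),
    ("172.29.0.111", "192.0.2.53", "example.net"),
]


def label(ip):
    """Append the context tag to an address when one is known."""
    tag = CONTEXT_TAGS.get(ip)
    return f"{ip} ({tag})" if tag else ip


def domains_by_server(queries):
    """Group the queried domain names under the (labelled) server that handled them."""
    grouped = defaultdict(set)
    for _client, server, name in queries:
        grouped[label(server)].add(name)
    return {server: sorted(names) for server, names in grouped.items()}


for server, names in domains_by_server(queries).items():
    print(server, "->", names)
```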

Even with larger network captures, we can use this technique to quickly visualise connectivity to suspicious systems. In the example below, we can quickly see that the bulk of all DNS traffic is being sent to the trusted corporate DNS server, whereas a single host is interacting with the suspicious DNS server we identified before.
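The same split can also be computed directly from the list of DNS queries. This is a rough sketch under our own assumptions: `TRUSTED_DNS` is a hypothetical allow-list holding a placeholder address for the corporate DNS server, and the query list is generated sample data rather than a real capture.

```python
from collections import Counter

# Assumption: the analyst maintains a list of trusted resolvers.
TRUSTED_DNS = {"192.0.2.53"}  # placeholder for the corporate DNS server


def summarise_dns(queries):
    """Return query counts per server and the clients of every untrusted server."""
    per_server = Counter(server for _client, server in queries)
    outliers = {}
    for client, server in queries:
        if server not in TRUSTED_DNS:
            outliers.setdefault(server, set()).add(client)
    return per_server, outliers


# Hypothetical capture: 20 LAN hosts use the trusted server, one does not.
queries = [(f"172.29.0.{i}", "192.0.2.53") for i in range(10, 30)]
queries.append(("172.29.0.111", "68.87.71.230"))

per_server, outliers = summarise_dns(queries)
print(per_server.most_common())
print(outliers)
```

The counts give the "bulk of the traffic" view, while `outliers` points straight at the host that deserves a closer look.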

So what’s next?

Nothing keeps us from entirely changing the relation between two nodes. We are used to visualising packets as going from IP address A to IP address B, but more exotic visualisations are possible (think about the relations between user-agents and domains, query lengths and DNS requests, etc.). In addition, there is plenty of opportunity to add more context to our graph nodes (link an IP address to a geographical location, correlate domain names with Indicators of Compromise, use whitelists and blacklists to more quickly distinguish baseline vs. other traffic, etc.). These are topics we want to explore further.
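To make the idea of "changing the relation" a bit more concrete: if every event is a record with named fields, an edge can be built from any two of them. The sketch below is speculative and not part of any released tool; the field names, the sample records, the `IOC_DOMAINS` set and both helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical indicator list; a real one would come from a threat intel feed.
IOC_DOMAINS = {"example.org"}


def relation(records, left, right):
    """Count edges between any two fields, skipping records that lack either one."""
    return Counter(
        (rec[left], rec[right]) for rec in records if left in rec and right in rec
    )


def flag_iocs(edges, known_bad=IOC_DOMAINS):
    """Return the edges whose right-hand node appears on the indicator list."""
    return {edge: count for edge, count in edges.items() if edge[1] in known_bad}


# Hypothetical HTTP events extracted from a capture.
http_events = [
    {"user_agent": "curl/8.0", "domain": "example.com"},
    {"user_agent": "curl/8.0", "domain": "example.org"},
    {"user_agent": "Mozilla/5.0", "domain": "example.com"},
    {"user_agent": "Mozilla/5.0", "domain": "example.com"},
]

edges = relation(http_events, "user_agent", "domain")
print(edges)
print(flag_iocs(edges))
```

The same `relation` call works for IP to IP, server to domain or user-agent to domain, which is what makes swapping the visualised relation cheap.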

Going forward, we plan on continuing to share our experiments and insights with the community around this topic! Depending on where this takes us, we plan on releasing part of a more complete toolset to the community, too.

Making our lab rats happy, 1 piece of cheese at a time!

Squeesh out!

About the author

Daan Raman is in charge of NVISO Labs, the research arm of NVISO. Together with the team, he drives initiatives around innovation to ensure we stay on top of our game; innovating the things we do, the technology we use and the way we work form an essential part of this. Daan doesn’t like to write about himself in third-person. You can contact him at draman@nviso.be or find Daan online on Twitter and LinkedIn.

Informative Eye candy…very cool
Maybe a draggable time bar at the bottom and link the dots to clickable summary packet info? Or a couple of configurable or auto (top 4 types plus other) shapes to ID port (sq-dns, tri-ssh, cir-https, etc). Can’t wait to see where you guys take it!!!

Hi there! Thanks for the response! Very interesting approach, the packet summary is already implemented (not covered in the blog post :-)) & the “time slicing” is also being worked on – two very useful features indeed! An additional challenge we are working on: abstracting on much higher volumes of events (where displaying _all_ info over time no longer makes sense & correlating information or visualizing on higher level signals is more interesting). Thanks so much for the encouraging words & for reaching out! Best wishes, Daan

Hi Branson – thanks a lot for the feedback & the link; I’m going to check out the tool. I watched the video and the concept is indeed similar in visualising the data; 1 thing I really want to focus on however is making it “more than eye candy” and focusing 100% on the interactiveness of the tools. Allowing an analyst to always grasp what’s on the screen & manipulate whatever is visible is vital to this. We have a long way to go to achieve all that but you can be assured our updates will be shared here over time!

Very interesting! I would love to see something like this that also employs sound, like good old https://sourceforge.net/projects/peep/ Peep: The Network Auralizer — conveying additional data via sound could help consume more complex detail quickly. I vaguely recall a background sound of a stream that would increase volume as traffic flow increased, then certain nature sounds indicating certain services being accessed or dropped. (frogs, birds, etc.) Cool idea.

Hi Johannes
Thanks for the comment! We don’t have the tool available at the moment, as the PoC is still quite rough around the edges; however, we do have plans for a shared version once we know a bit more which direction to take this, and there is a bit more polish. When we do, you will for sure read about it on this blog!
Daan