Fosdem 2016, part 4: what netflow tells us

Now in part 4 we can combine the IP-to-MAC address tables with the user agents captured by NBAR2 and exported using netflow. The result of all this logging is a list of MAC addresses, the IPs a particular MAC address was using at a certain time, and the user agents we saw for this IP during this time period. This data was loaded into a sqlite database for easy manipulation and querying.
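To give an idea of what such a database could look like, here is a minimal sketch using Python's sqlite3 module. The table name, the column names and the sample rows in the usage note are assumptions made up for this illustration, not the schema or data that was actually used.

```python
import sqlite3

def build_db(rows):
    """Load (mac, ip, seen_at, user_agent) tuples into an in-memory database.

    The 'sightings' table and its columns are hypothetical names.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sightings (
            mac        TEXT NOT NULL,  -- MAC address of the device
            ip         TEXT NOT NULL,  -- IPv4 or IPv6 address it used
            seen_at    TEXT NOT NULL,  -- ISO 8601 timestamp
            user_agent TEXT            -- NULL when no user agent was seen
        )
        """
    )
    conn.executemany("INSERT INTO sightings VALUES (?, ?, ?, ?)", rows)
    return conn

def user_agents_for(conn, mac):
    """Return the distinct user agents seen for one MAC address."""
    cur = conn.execute(
        "SELECT DISTINCT user_agent FROM sightings "
        "WHERE mac = ? AND user_agent IS NOT NULL "
        "ORDER BY user_agent",
        (mac,),
    )
    return [row[0] for row in cur]
```

Calling build_db() with a handful of tuples such as ("aa:bb:cc:00:00:01", "2001:db8::1", "2016-01-30T10:00:00", "ExampleAgent/1.0") and then user_agents_for() with that MAC address returns the list of user agents seen for that device.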

What can we learn from this?

At Fosdem 2016 we saw 9711 unique MAC addresses. Of these we can, using the user agent, classify 7623, or 78%. The others either did not send HTTP traffic or used non-informative user-agent strings like “Apache-HttpClient/UNAVAILABLE (java 1.4)” or my favourite: “(null)/(null) ((null))”. When we talk about ‘devices’ below, we in fact mean ‘MAC addresses’, as we assume that every device has only one MAC address, which does not change. However, the data suggests that this is not true, as we will discuss later.

Of these 9711 MAC addresses we saw that 4451 (45%) were only active on the IPv6-only wifi, 1948 (20%) were only active on the legacy dual-stack wifi, which provided both IPv4 and IPv6 connectivity, and 3152 (32%) were active on both networks. This seems to indicate that almost half of the people were happy to be on the IPv6-only network and remain there.
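The split into these three groups boils down to set arithmetic on the MAC addresses seen on each network. A minimal sketch, assuming we already have one collection of MAC addresses per network (the function and key names are made up for this example):

```python
def split_by_network(v6_network_macs, dualstack_macs):
    """Split MAC addresses by the network(s) they were seen on.

    v6_network_macs: MACs seen on the IPv6-only wifi
    dualstack_macs:  MACs seen on the legacy dual-stack wifi
    """
    v6 = set(v6_network_macs)
    ds = set(dualstack_macs)
    return {
        "ipv6_only": v6 - ds,       # never seen on the legacy network
        "dualstack_only": ds - v6,  # never seen on the IPv6-only network
        "both": v6 & ds,            # seen on both networks
    }
```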

Of these 9711 MAC addresses we can attribute 9568 to known manufacturers. The rest seem fake, especially MAC addresses like d2deadbeef02 or deadbabecaff. Some people told me that they generate a new MAC address for their device every time it boots, and it seems that a few people are doing this.
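One hint that a MAC address was made up rather than assigned by a manufacturer is the ‘locally administered’ bit, the second least significant bit of the first octet. The sketch below checks that bit. It is only a heuristic offered as an illustration, and not necessarily how the manufacturer attribution above was done (the function name is hypothetical).

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the locally administered bit of a MAC address is set."""
    # Accept both bare hex (d2deadbeef02) and separated notations.
    digits = mac.replace(":", "").replace("-", "").replace(".", "")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    first_octet = int(digits[:2], 16)
    # Bit 0x02 of the first octet distinguishes locally administered
    # addresses from universally administered (vendor-assigned) ones.
    return bool(first_octet & 0x02)
```

Both examples mentioned above, d2deadbeef02 and deadbabecaff, have this bit set.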

First the big reveal: from the data we can see 4686 unique smartphone devices on the Fosdem network over the whole weekend. These are all the devices that identify themselves as either iOS iPhones, Firefox OS, Symbian, Blackberry, MeeGo, Jolla, Tizen, Windows Phone or Android non-tablets.

One thing to note about the analysis of the HTTP user agents is that the data is very noisy. A large proportion of Android devices, for example, also claim to be running iOS, and one MAC address even sends traffic claiming to be 19 different operating systems. I resorted to manually manipulating the data to try to improve its quality. For example, I took the drastic step of deciding that no device not made by Apple (as determined by the MAC address) can run Darwin/iOS or Mac OS X.
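The Apple rule can be written down as a simple filter. This is only a sketch of the idea: the vendor string "Apple" and the operating system labels are assumptions for this example, and the real clean-up was done by hand.

```python
# Operating systems that, under this rule, only Apple hardware may claim.
APPLE_ONLY_OS = {"iOS", "Darwin", "Mac OS X"}

def clean_os_claims(vendor, claimed_os):
    """Drop Apple operating systems claimed by non-Apple hardware.

    vendor:     manufacturer derived from the MAC address (hypothetical label)
    claimed_os: operating system names seen in the user agents of one device
    """
    if vendor == "Apple":
        return set(claimed_os)
    return {name for name in claimed_os if name not in APPLE_ONLY_OS}
```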

So this analysis is less of a science and more educated guessing.

Besides some devices having an identity crisis, we also noticed that some people used a lot of IP addresses. One particular Android phone used 77 IPv4 addresses, and a few BlackBerry devices used up to a thousand IPv6 addresses over the course of the two-day event.
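Counting addresses per device is straightforward once the (MAC, IP) pairs are available. A minimal sketch, assuming the pairs come as plain strings (the function name is made up); parsing with the standard ipaddress module ensures that different spellings of the same IPv6 address are counted once.

```python
import ipaddress
from collections import defaultdict

def addresses_per_mac(pairs):
    """Count distinct IPv4 and IPv6 addresses per MAC address.

    pairs: iterable of (mac, ip) strings
    Returns {mac: (number_of_ipv4_addresses, number_of_ipv6_addresses)}.
    """
    seen = defaultdict(lambda: {4: set(), 6: set()})
    for mac, ip in pairs:
        addr = ipaddress.ip_address(ip)  # normalises the notation
        seen[mac][addr.version].add(addr)
    return {mac: (len(v[4]), len(v[6])) for mac, v in seen.items()}
```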

If we look at the distribution of detected operating systems:

We can see that Android clearly dominates, but notice that the second largest group is ‘unknown’.

Let’s group the operating systems together in ‘families’ and ignore the unknowns:

Fosdem is clearly an Android/Apple/Linux world.
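Grouping operating systems into families is essentially a lookup table plus a counter. The mapping below is a made-up example to show the idea; the actual list of operating systems and the family each one was assigned to are not reproduced here.

```python
from collections import Counter

# Hypothetical mapping from detected operating system to family.
FAMILY = {
    "Android": "Android",
    "iOS": "Apple",
    "Mac OS X": "Apple",
    "Darwin": "Apple",
    "Ubuntu": "Linux",
    "Debian": "Linux",
    "Fedora": "Linux",
    "Windows 7": "Windows",
    "Windows 10": "Windows",
    "FreeBSD": "BSD",
}

def family_counts(os_per_device):
    """Count devices per family, ignoring anything not in the mapping."""
    counts = Counter()
    for os_name in os_per_device:
        family = FAMILY.get(os_name)
        if family is not None:  # unknowns are skipped
            counts[family] += 1
    return counts
```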

An interesting question was: can Android work on an IPv6-only network? With the information we collected, we can now check which Android versions were used on the IPv6-only network segment. First we can check what versions of Android we detected across the board:

We seem to have a representation of almost every possible version… If we then look at the IPv6-only statistics, we notice that almost all versions below 5 are missing or have only a few examples left:

This information is based on the adjacencies on the router for the IPv6-only network, even if we identified the nature of that MAC address later on the dual-stack network. So no successful connection was needed, just getting IPv6 working enough to get an adjacency on the router.

People who did not connect to the IPv6-only network at all were primarily running the older versions:

But still a lot of people with modern versions went to both the legacy and the IPv6-only network:

For reference, the iOS version distribution shows that most iOS devices seem to be running the more recent releases.

With this in mind we might want to enable the firewall on the router next year, as quite a number of people are using older operating systems that might not be secure on an open wifi network directly connected to the internet.

Given that we now know what operating system was running on a particular MAC address, and we know when we saw this MAC address, we can plot the operating system family distribution over time:

The sudden decline on Sunday was me clearing the adjacency table as it seemed ‘too full’. Turns out it was normal and recovered quickly.

We can then look at the mobile Linux family section in more detail:

There were some people claiming that the FreeBSD people would come on Sunday, but the data does not show this:

A similar view for the Windows users, who seemed equally represented on Saturday and Sunday:

Finally we can check if people fled to the legacy network after hitting the IPv6 network first with Android:

But it seems that the people who could be detected remained on the IPv6-only network; the green section declines gradually at the end of the day, like in all the other graphs.
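For those who want to run this kind of check on their own data, here is a rough sketch. It assumes a list of (MAC, timestamp, network) sightings with ISO 8601 timestamps and the made-up network labels "ipv6-only" and "dualstack"; it is an illustration of the idea, not the query behind the graph.

```python
def fled_to_legacy(sightings):
    """Return MACs first seen on the IPv6-only network and later on dual-stack.

    sightings: iterable of (mac, seen_at, network), where seen_at is an
    ISO 8601 timestamp string and network is "ipv6-only" or "dualstack"
    (hypothetical labels).
    """
    first_network = {}
    fled = set()
    # ISO 8601 timestamps in the same format sort correctly as strings.
    for mac, seen_at, network in sorted(sightings, key=lambda s: s[1]):
        if mac not in first_network:
            first_network[mac] = network
        elif first_network[mac] == "ipv6-only" and network == "dualstack":
            fled.add(mac)
    return fled
```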

In fact, graphing the number of IPv4 versus IPv6 addresses in use is boringly similar to the previous graphs.
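The numbers behind time-based graphs like the ones above can be produced by bucketing the sightings per hour and counting distinct MAC addresses per family. A minimal sketch, assuming ISO 8601 timestamps and a MAC-to-family lookup (the function and parameter names are made up, and the one-hour bucket size is an arbitrary choice for this example):

```python
from collections import defaultdict
from datetime import datetime

def families_over_time(sightings, family_of):
    """Count distinct MAC addresses per OS family per hour.

    sightings: iterable of (mac, seen_at) with ISO 8601 timestamp strings
    family_of: dict mapping a MAC address to its OS family; MACs without
               an entry are skipped
    """
    buckets = defaultdict(lambda: defaultdict(set))
    for mac, seen_at in sightings:
        family = family_of.get(mac)
        if family is None:
            continue
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(seen_at).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[hour][family].add(mac)
    return {
        hour: {fam: len(macs) for fam, macs in buckets[hour].items()}
        for hour in sorted(buckets)
    }
```

The resulting dictionary, ordered by hour, can be handed to any plotting tool to draw a stacked graph.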
