This is the 165th article in the Spotlight on IT series. If you'd be interested in writing an article for the series on the subject of backup, security, storage, virtualization, mobile, networking, wireless, DNS, or MSPs, PM Eric to get started.

Unlike most Spotlight on IT articles, this one isn’t so much based on a story as it is on a question, one I frequently see in the Spiceworks Community: “How do I monitor bandwidth usage?” That is the broad form of the question, and it comes with many nuances and caveats. Before picking a tool, it’s important to determine what you really need to accomplish.

Let’s start by looking back at this article by Jay6111, which focuses on remembering and using the OSI model to diagnose and resolve network issues. Knowing the OSI model is not only important to troubleshooting network issues; it’s also an important thing to understand when approaching monitoring network activity.

My brush with the elusive “network is slow” issue

I know what it’s like to have the elusive “network is slow” issue or need to have insight into the goings-on in the network.

When I started my last IT position, we had a T1 Internet connection, and the Internet started to become really, really, really slow without an increase in users. Our first step was to go to the ISP’s usage stats, as we didn’t have any of our own. All their stats could show, though, was that our downstream bandwidth was saturated. My internal infrastructure consisted of a Juniper router for the T1 Internet connection, six unmanaged switches, and two routers for a dedicated P-t-P (point-to-point) T1 connection. What I had for resources were one managed Linksys switch in storage, a handful of hubs, a few old PCs and servers, and, of course, my brains.

I first researched monitoring tools. My budget was zero, and so I found what I think were the best tools freely available — some of them are still better than many paid options. (I’ll dive into which tools I deployed and why in just a minute. There were many others that didn’t make the cut.)

The second step I took was setting up a PC and server. I made the server a proxy using Squid and Linux, and I pushed out a GPO to have all web traffic go through the proxy. On the PC, I installed promiscuous-monitoring tools and data-collecting tools. I also installed a second NIC in the PC. I first tried to set up the managed Linksys switch so I could port mirror the uplink to the Internet router, but the model we had didn’t support it so I ended up using a 10Mb hub between the switch’s LAN and the router.

What did I gain from my insight? First, SNMP showed that the Internet router’s T1 port matched the data stats provided by the ISP. I also learned that the LAN port stats did not match the WAN or ISP stats.

It was eventually discovered that zombie Adobe update processes on our XP machines were requesting update traffic from Adobe and the router was dropping these packets as they came down. Blocking adobe.com resolved that particular issue and LAN and WAN stats normalized and matched. But this didn’t fully solve the issue.

While it was better, the stats showed we were using more than 70% of our bandwidth during operational hours, which caused higher latency and slow network speeds. What the promiscuous monitoring and proxy monitoring showed me was that no one was “abusing” the system. We weren’t streaming videos or music; it was all business traffic (well, 99% of it). People weren’t even wasting time on Facebook or the like. This helped us justify changing ISPs and connection type, which ultimately saved us about $200 a month and increased our speed to 15 down and 3 up. So, I had to work with what we had, but I was able to gain valuable insight into the network by using my knowledge of networking and a little hard work.

Monitoring and the OSI model

What we want to monitor may fall at different levels of the OSI model, which means we need to know exactly what we want to monitor in order to know which layer our tools have to function at. In addition to knowing the layer we want to monitor, we need to know a little about the protocols in each layer.

Layer 1, the physical layer, really is electrical signals on the line, so we don’t have to mess with that.

Layer 2, the data link layer, is MAC addresses and Ethernet frames (or another protocol if using something other than Ethernet).

Layer 3, the network layer, is logical addressing via the Internet Protocol (IP).

Then you have layer 4, the transport layer. This is where TCP and UDP reside in TCP/IP networking.

We then have layer 5. This is the last layer in the TCP/IP model we use, and it maps to the top three layers of the OSI model. We just call this the application layer; HTTP, SSH, FTP, etc. all reside here. (As a side note, in mappings between the OSI and TCP/IP models you will often see the physical and data link layers merged into one, giving only 4 layers instead of the 5 I’ve described.)

Now that we know where the protocols are in the model, what do we need to know about them?

Well, for Ethernet we need to know that all traffic is encapsulated in Ethernet frames, and that Ethernet only carries the traffic. If we capture these frames, we see everything going on. There is no security in this layer; it’s all clear and readable. However, the encapsulated information may itself be encrypted.

Additionally, Ethernet itself is a broadcast medium, and frames will go everywhere unless stopped by a circuit of some kind. That’s why switches are an important advancement in networking over hubs. With a hub, the traffic is broadcast and every node on the hub or segment sees it. A switch, on the other hand, creates virtual circuits by examining the MAC addresses in the frames: each frame is forwarded only to the port that leads to its destination MAC address. A key takeaway from this paragraph is that in order to use promiscuous monitoring, which looks at all traffic, you need to place the monitor in a physical location that allows it to see all the traffic.
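To make the hub-versus-switch difference concrete, here is a toy model in Python. It is only a sketch: the port numbers and MAC addresses are made up, and a real switch does far more (aging, VLANs and so on). It does show why a monitor hanging off an ordinary switch port stops seeing other people's traffic once the switch has learned where everyone is.

```python
# Toy model of hub vs. switch forwarding. Port numbers and MACs are made up.

def hub_forward(in_port, ports):
    """A hub repeats every frame out of every port except the one it came in on."""
    return [p for p in ports if p != in_port]


class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port where that MAC was last seen

    def forward(self, in_port, src_mac, dst_mac):
        # Learn which port the sender is attached to.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Destination not learned yet: flood, just like a hub would.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits on the arrival port; nothing to do
        return [out_port]


ports = [1, 2, 3, 4]
switch = LearningSwitch(ports)
first = switch.forward(1, "aa:aa", "bb:bb")  # bb:bb unknown, so flooded
reply = switch.forward(2, "bb:bb", "aa:aa")  # aa:aa known, sent to port 1 only
later = switch.forward(1, "aa:aa", "bb:bb")  # bb:bb known now, port 2 only
print(hub_forward(1, ports), first, reply, later)
```

A monitor on port 4 sees only the first, flooded frame. That is the reason for a mirror port or, as in my case, a hub on the uplink.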

In layer 3 we have the logical addressing of nodes via the Internet Protocol. This is how we generally interact with networking: IP addresses, subnet masks, etc. This layer makes logical sense to use, and it is likely how we will want to view our network usage statistics. We are downright unlikely to make much sense of identifying everything by MAC addresses, so we will want software that can at least identify traffic to and from nodes based upon the logical addresses. A good thing to remember about this layer is that there can be some inherent security via encryption, such as IPsec.
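As a small illustration of working with logical addresses, the sketch below uses Python's standard ipaddress module to label a flow by direction relative to the LAN. The subnet 192.168.1.0/24 and the sample addresses are assumptions for the example, not values from my network.

```python
import ipaddress

# Assumed example LAN subnet; substitute your own.
LAN = ipaddress.ip_network("192.168.1.0/24")


def direction(src, dst, lan=LAN):
    """Label a flow by whether its endpoints are inside the LAN subnet."""
    src_inside = ipaddress.ip_address(src) in lan
    dst_inside = ipaddress.ip_address(dst) in lan
    if src_inside and dst_inside:
        return "internal"
    if src_inside:
        return "upload"
    if dst_inside:
        return "download"
    return "transit"


print(direction("192.168.1.20", "93.184.216.34"))  # LAN host sending out
print(direction("93.184.216.34", "192.168.1.20"))  # traffic coming back in
```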

In layer 4 we have the transport protocols, which are very closely tied to layer 5: TCP and UDP use port numbers to carry the application protocols of the software in layer 5. Ports are how we can (generally) identify the protocol being used in network communication, primarily because there is a standard set of ports used for certain application protocols. For instance, it can generally be assumed that traffic on port 80 is HTTP, traffic on port 443 is HTTPS, port 21 is FTP, and port 22 is SSH. We wouldn’t know the protocol with 100% assurance without analyzing the captured packets and seeing what’s encapsulated in them, but since the ports are standard they are a very good gauge of the protocol being used.

Fun fact: One of the main ways that Spiceworks detects unmanaged cloud services is by identifying the service based upon the port number being used for network communication. So, what does all this mean? Well, we’ll want to have some sort of breakdown as to the type of TCP/UDP traffic in order to know what applications are using our bandwidth.
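Here is a rough sketch of that kind of breakdown in Python. The port table holds only the four ports mentioned above, and the flow records (source port, destination port, bytes) are invented for the example. As noted, a port number is a good guess at the protocol, not proof.

```python
from collections import Counter

# Only the ports named in the article; a real tool would use a fuller table.
WELL_KNOWN_PORTS = {21: "FTP", 22: "SSH", 80: "HTTP", 443: "HTTPS"}


def guess_protocol(src_port, dst_port):
    """Guess the application protocol from whichever side uses a known port."""
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "other"


def breakdown(flows):
    """Total the bytes per guessed protocol."""
    totals = Counter()
    for src_port, dst_port, nbytes in flows:
        totals[guess_protocol(src_port, dst_port)] += nbytes
    return totals


# Made-up flow records: (source port, destination port, bytes).
flows = [
    (51000, 443, 1200),   # client request to an HTTPS server
    (443, 51000, 48000),  # the server's reply
    (51001, 80, 900),
    (51002, 8080, 300),   # not in the table, so counted as "other"
]
totals = breakdown(flows)
print(totals.most_common())
```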

Lastly, we have layer 5. While tied to layer 4 by port number and the underlying transmission mechanism, this is still a beast of its own if we want to know more specific information. For example, if you want to see what websites are visited or the usernames of who did something, this is the layer we have to look at.

Wow, let’s take a deep breath. Are you still with me? I know this is a lot of text, but I hope you’ll understand: it was created with the -vv switch.

What are the different types of monitoring?

There are three main types of monitoring:

Proxy – This type of monitoring is at the application and transport layer for the most part. The information gleaned from a proxy will reflect this higher-level information best. You may be able to get more specific usage stats on things like data used per website. This will also let you go as far as identifying network usage by username.

Data collection – This type of monitoring relies on mostly layer-3 devices reporting via a protocol like SNMP, NetFlow or SFlow. The actual device records the data and then a collector application queries the device(s) for this information.

Promiscuous – This type of monitoring is exactly what it sounds like: It gets around. This monitoring actually works at layer 2 of the model and captures every Ethernet frame that it sees whether intended for it or not.

For proxy monitoring, we know that it must operate at layer 5, so how do we capture information on this layer to monitor it? Realistically, the only way to do this is to literally proxy the data flow. All application communication (at least the types we want to monitor) must pass through our proxy; it’s like a man in the middle.

So how do we accomplish this? We have two primary types of proxies — inline and not inline.

An inline proxy has all traffic funnel to one point to go in or out of a controlled network — there is no way around it. Routers that have built-in Internet usage stats use a form of this, since all traffic runs through it in and out of your network. (Most of your dedicated Linux gateway distributions also do this if the feature is enabled.)

A proxy that is not inline requires that you instruct the application on clients to go through the proxy in order to communicate with other nodes. Most commonly this is seen in web browsing. If you go to settings in your browser you should find a place to set proxy addresses for web browsing as well as some other protocols.

An inline proxy can also be what we call a “transparent proxy,” which is the most common form of inline proxy. It’s called transparent because the end user generally has no idea the proxy exists, as their normal operations are not affected.

In addition to information gathering, proxies are a great way to block undesired traffic as well, especially in combination with some form of content filtering. The 800lb. gorilla of proxy servers is probably Squid, as it’s cross platform, open source (free), well documented, and the base for many other devices and software that proxy traffic. That’s not to say other good proxies don’t exist and don’t have their place. K9 proxy is another good one for Windows, especially for use on a single machine for blocking purposes.
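To show what per-user and per-site stats from a proxy can look like, here is a small sketch that totals bytes from lines in Squid's native access.log layout (time, elapsed, client, result/status, bytes, method, URL, user, hierarchy, type). The sample lines are invented, and the sketch assumes the default log format. A custom logformat would need different field positions.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Return the host part of a logged URL."""
    if "://" in url:
        return urlsplit(url).hostname or url
    return url.rsplit(":", 1)[0]  # CONNECT requests are logged as host:port


def summarize(lines):
    """Total bytes per user (or client IP when no user is logged) and per host."""
    by_user = defaultdict(int)
    by_host = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 10 or not fields[4].isdigit():
            continue  # skip anything that is not a native-format entry
        client, size, url, user = fields[2], int(fields[4]), fields[6], fields[7]
        who = client if user == "-" else user
        by_user[who] += size
        by_host[host_of(url)] += size
    return dict(by_user), dict(by_host)


# Invented sample entries in the native layout.
sample = [
    "1700000000.123    110 192.168.1.20 TCP_MISS/200 2925 GET "
    "http://www.example.com/index.html alice DIRECT/93.184.216.34 text/html",
    "1700000001.456     95 192.168.1.21 TCP_TUNNEL/200 51200 CONNECT "
    "mail.example.org:443 - HIER_DIRECT/203.0.113.5 -",
    "1700000002.789     40 192.168.1.20 TCP_HIT/200 1075 GET "
    "http://www.example.com/logo.png alice NONE/- image/png",
]
by_user, by_host = summarize(sample)
print(by_user)
print(by_host)
```

User names only appear when the proxy authenticates users. Otherwise the field is a dash and the sketch falls back to the client IP address.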

In data collection, as I stated, the devices themselves support a protocol like SNMP; they collect the data for short periods and provide it to the collectors that query for it. This doesn’t really exist, per se, at one of the layers. SNMP specifically reports the data at a layer-2 level, based upon network port usage. Protocols like NetFlow and its derivatives are able to provide some higher-level information, like port and protocol traffic types. There is almost no end to collectors for this information, and SNMP is not limited to network data stats.
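A collector typically turns two readings of an interface's octet counter into a utilization figure. The sketch below shows that arithmetic only; it does not do the SNMP polling. It assumes a 32-bit counter such as ifInOctets, a nominal T1 rate of 1.544 Mbps, and made-up counter readings taken 300 seconds apart.

```python
T1_BPS = 1_544_000  # nominal T1 line rate in bits per second


def utilization(octets_then, octets_now, seconds, link_bps=T1_BPS, counter_bits=32):
    """Fraction of the link used between two readings of an octet counter.

    The modulo handles a single counter wrap between the two readings.
    """
    delta_octets = (octets_now - octets_then) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / seconds
    return bits_per_second / link_bps


# Made-up readings, five minutes apart.
plain = utilization(0, 40_530_000, 300)
# Same amount of traffic, but the 32-bit counter wrapped in between.
wrapped = utilization(2 ** 32 - 30_000_000, 10_530_000, 300)
print(f"{plain:.0%} {wrapped:.0%}")
```

On faster links a 32-bit counter can wrap more than once between polls, which this sketch cannot detect. Polling more often or using the 64-bit counters (ifHCInOctets), where the device supports them, avoids that.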

A good open source tool for SNMP collection and graphing is Cacti (cross-platform, PHP-based). For NetFlow and SFlow (as well as promiscuous monitoring) ntop (Linux only) is a good tool.

What are the pitfalls of data collection monitoring?

The primary one is what traffic you’re seeing. As there is no address information, what you generally get is an aggregate data total per port, so if you just want to know Internet traffic on a per-device basis, this is not the ideal tool. Also, if you don’t have collection down to the last leg, the stats cannot be tied to a specific device. Take a daisy-chained unmanaged five-port switch, for instance: the traffic stats for the port it is connected to will show the aggregate data stats for all devices on the unmanaged switch.

Lastly, as I said, promiscuous monitoring resides in layer 2, capturing Ethernet frames. Of course, this does us little good in and of itself, so the tools that are used in promiscuous monitoring operate at higher layers of the model, taking the Ethernet frames and reading the information encapsulated in them. This type of monitoring can provide a range of information. Two of my favorite tools for use in this type of monitoring are BandwidthD (cross platform PHP based) and iftop (Linux command line only). BandwidthD provides me nice, pretty graphs on an originating IP basis (with reverse DNS lookup for friendly names) with a rough breakdown of traffic types, such as FTP and HTTP. iftop, on the other hand, gives me an almost real-time look at traffic on the network. It shows me originating and receiving IP addresses (with reverse DNS lookup) as well as the ports being used for traffic (with common port identification) and how much data is being used upstream and downstream.
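To give a feel for what "reading the information encapsulated in them" means, here is a minimal sketch that decodes the Ethernet, IPv4 and TCP/UDP headers of one raw frame using only Python's standard library. The sample frame is hand-built with made-up addresses. A real monitor would read frames from a capture interface instead, which is not shown, and this is not how BandwidthD or iftop are implemented.

```python
import socket
import struct


def parse_frame(frame):
    """Decode Ethernet II -> IPv4 -> TCP/UDP headers; return None for anything else."""
    if len(frame) < 14:
        return None
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:  # only IPv4 is handled in this sketch
        return None
    ip = frame[14:]
    if len(ip) < 20:
        return None
    header_len = (ip[0] & 0x0F) * 4  # IHL field, in 32-bit words
    protocol = ip[9]                 # 6 = TCP, 17 = UDP
    src_port = dst_port = None
    if protocol in (6, 17) and len(ip) >= header_len + 4:
        # TCP and UDP both start with source port, then destination port.
        src_port, dst_port = struct.unpack("!HH", ip[header_len:header_len + 4])
    return {
        "src_mac": ":".join(f"{b:02x}" for b in src_mac),
        "dst_mac": ":".join(f"{b:02x}" for b in dst_mac),
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "ip_bytes": struct.unpack("!H", ip[2:4])[0],  # IPv4 total length
        "protocol": protocol,
        "src_port": src_port,
        "dst_port": dst_port,
    }


# Hand-built sample: Ethernet header + 20-byte IPv4 header + 20-byte TCP header.
ethernet = bytes.fromhex("aabbccddeeff") + bytes.fromhex("112233445566") + struct.pack("!H", 0x0800)
ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                   socket.inet_aton("192.168.1.20"), socket.inet_aton("93.184.216.34"))
tcp = struct.pack("!HH", 51000, 443) + bytes(16)
info = parse_frame(ethernet + ipv4 + tcp)
print(info)
```

Summing ip_bytes per src_ip or per dst_port over many frames is, roughly, how per-host and per-protocol totals can be built up.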

One tool in this category that I’m sure everyone in IT has heard of, and that I want to touch on, is Wireshark. Personally, I don’t recommend this for normal monitoring of your network traffic. First, it collects and stores the actual Ethernet frames, and the size of the files can add up quickly. Second, it’s not designed for monitoring; it’s designed for diagnosis and troubleshooting. This is a great tool that provides some in-depth analysis of the traffic (I recommend everyone try it at some point to at least see a traffic capture), but it’s the wrong tool for long-term monitoring purposes.

What tools you use to monitor your network will depend on your needs, your infrastructure and your resources.

I hope this is valuable to you and helps you better understand how to monitor your network traffic. Shameless plug: Here are some how-to articles I’ve done in the Spiceworks Community that may be of use:

Unfamiliar with Linux? Here’s a follow-up to the previous how-to on installing software to make use of your Linux machine. Much of the software mentioned can easily be installed after reading this how-to.

I hope to get some time to execute a setup like this in our office. We just installed a new Lync phone system, deployed the server in Dallas with offices in Houston and Maine with another coming in Texas soon. There are intermittent quality issues that everyone points to the network as the problem. This type of monitoring would pinpoint the problem wouldn't it?

We have HP DL360s with quad network cards in them using port spanning from our switches, at our 3 main sites. We use Colasoft Capsa as the sniffer software (there is a free version which is great). These machines are invaluable for troubleshooting network issues!!

It would help you see what's going on and where you might have bandwidth issues, but it will still take some investigating. I'm sure you have already, but make sure your equipment is configured properly with QoS to prioritize the voice traffic. Also keep an eye on your Internet connection(s) usage when you use monitoring; you might need bigger pipes. If you are using it primarily for internal communication, you may also want to look at some form of dedicated circuits between the sites.

The article explained quite a bit, as another reader said, what, where and why. But it's missing a key component: How. You never explain how you utilized the tools or provide examples or any information on what is presented by the tools or how to read the data from the different tools described. No examples, few links to good how-tos, and no diagnosis of real data on any of the suggested tools.

It would be tremendously helpful to know if there's something specific you look for in one tool or another, how to use it, or even provide a link to a useful getting started guide for each tool. Sure I can search for guides, but it would be nice to know what you look for and any guides you found useful since you had success with the project...

This sentence is confusing: "For NetFlow and SFlow (as well as promiscuous monitoring) ntop (Linux only) is a good tool."

I've also read your other articles and you leave out important details. For example, why are 2 NICs needed for iftop? Why don't you want an IP on the 2nd iftop NIC? What do you connect it to? etc... I was able to ascertain the answers to these questions, but my point is your articles, while helpful, could use some polish... That said, thanks for taking the time to write them.

KyferEz, I didn't want to get into such minute details that many felt bogged down in extraneous information. I also didn't feel that if you were trying to monitor your network traffic that you would need a detailed outline of how to interpret the gathered data. I would think that analysis of collected data in and of itself could be an article of its own, I see it as a separate task from how to do the actual monitoring.

I suppose I could have done a better job providing background information for each of the tools. I can say that I didn't think any of these tools required in-depth knowledge to set up. But perhaps I'm making assumptions I shouldn't be.

I am sorry about the quoted sentence. I did think it was a little confusing after reading the posted article myself, so please allow me to clarify it. A free tool that works well to collect and present NetFlow and SFlow data is ntop; ntop can also be used for promiscuous monitoring. Of all the tools I mentioned, this one probably provides the most depth, and it is one that would require some explanation of data interpretation, even for those who might know what they are looking at. One catch, though, is that ntop is a Linux-only tool, and unless additional advanced configuration is done, data is not permanent or stored long term, which is very bad for historical data collection. This was not one of the tools I use regularly, and it is certainly not an easy one to implement for those without some experience behind them. It's also not as ideal for a data usage overview, because the interface is a lot more cluttered with data than the other applications I have mentioned.

Lastly, to address not mentioning why 2 NICs should be used with iftop: you are correct, that is something I left out of that How-To, though I noticed I did mention it in the BandwidthD How-To. That said, if you look at the diagram attached in the how-to, I feel it makes the reason for 2 NICs pretty self-evident when you see that one is for monitoring and one is for the LAN connection.

Despite some oversights, I hope the write-ups have still been helpful for some.

Alex, the writeup was certainly helpful, and I agree that creating a writeup all its own for an analysis of the collected data would be a worthwhile post.

You are correct, some of the tools are somewhat self-explanatory and surely most of us reading the post will understand the data, however, your explanation and interpretation is likely to assist readers in further developing their skills and it could bring some insight into something that they otherwise might not have considered.

Thanks again

Maybe I'll see if I can find some time to analyze and explain the data collected from tools like this in more depth.

Will need to flag this one for further reading (just had time to browse it quickly), but one thing that might confuse people is that you seem to be starting by describing it as the OSI model but then shifting more into using the TCP/IP model and its layers. Apologies if I've misread that or missed something.

I used the OSI model as it's the theoretical model used in the design of modern networks. TCP/IP was actually around before that and really served partly as the model for the OSI model. But you have to shift from the theoretical OSI to the actual TCP/IP to really discuss the real-world application of monitoring in our current TCP/IP networking world. I also used the OSI model to start because it's the same model Jay referenced in his article. But for practicality I had to transition and map OSI to TCP/IP.
