Why Netflow, What is it, and Why it’s Important to Security

Many of us have been working in technology so long that we’ve become jaded, and we readily dismiss what appear to be new yet minor advancements. Back in the 1990s, we were focused on the emergence of the web server, web site creation, and the dot-com boom. We barely took notice when Cisco released a router utilizing a new technique called Netflow. It quickly flopped as a method of routing and gave way to Cisco Express Forwarding, but as is often the case, it didn’t die. The IPFIX standard is essentially Netflow v10, since it followed Netflow v9 and was derived from it. So why has a technology that failed so early on gone through nine subsequent major versions? Primarily because network administrators wanted to know how their infrastructure was being utilized, and Netflow morphed into a network accounting mechanism. The real benefit, though, has come in the past five or so years, as solutions to mitigate advanced persistent threats (APTs) have become critical. Packet capture is fine, but the volume of data retained can be crushing, and often all you need is the intelligence captured in a Netflow record. So what are Netflows, and why are they important?

In simple terms, a Netflow is a record of a connection between two systems during a period of time: essentially a trail of digital footprints summarizing who talked to whom, when, over what protocol and ports, and how much was sent. Netflow records, in fact, are so trusted that they are admissible in court as evidence. So what is a Netflow record? While Netflow v9 and IPFIX, the formalized standard derived from it, are the latest versions, both are pretty complex to digest this early on, so we’ll look at the most commonly used version, Netflow v5. There are two parts to a Netflow v5 export: the flow header and the flow records (a single export packet carries one header followed by up to 30 records). The flow header contains mainly time stamp information recording when the export was generated, down to the nanosecond, as well as some basic information about the device that generated it. The record is where the real data lives. It contains the source and destination IP addresses, the next hop router address, the SNMP indexes of the input and output interfaces, the number of packets in the flow, the total number of Layer 3 bytes in the flow, the flow’s start and end times, the TCP/UDP source and destination ports, the protocol type, the type of service, and the cumulative TCP flags seen during the flow. Records are generated when a flow is finished, and whatever is exporting the flows can also be configured to generate records periodically for flows that are still open. A flow is considered finished when a TCP session termination (a FIN or RST) is seen, or when a predetermined amount of time has passed since the last packet in the flow was received. So what can these flow records be used for?
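To make the header and record layout concrete, here is a minimal Python sketch of decoding one Netflow v5 export datagram. The byte layout follows the published v5 format (a 24-byte header followed by 48-byte records); the function name `parse_v5` and the dictionary keys are my own illustrative choices, not part of any standard library or collector.

```python
import ipaddress
import struct

# NetFlow v5 layouts in network byte order; "x" marks padding bytes.
HEADER = struct.Struct("!HHIIIIBBH")                 # 24 bytes
RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")  # 48 bytes


def parse_v5(datagram):
    """Split one v5 export datagram into a header dict and a list of record dicts."""
    (version, count, sys_uptime_ms, unix_secs, unix_nsecs,
     flow_sequence, _engine_type, _engine_id, _sampling) = HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"expected NetFlow v5, got version {version}")
    if len(datagram) < HEADER.size + count * RECORD.size:
        raise ValueError("datagram is shorter than its record count implies")

    header = {
        "count": count,                  # number of records that follow
        "sys_uptime_ms": sys_uptime_ms,  # exporter uptime at export time
        "unix_secs": unix_secs,          # export time, whole seconds
        "unix_nsecs": unix_nsecs,        # export time, residual nanoseconds
        "flow_sequence": flow_sequence,
    }

    records = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        records.append({
            "src_ip": str(ipaddress.IPv4Address(f[0])),
            "dst_ip": str(ipaddress.IPv4Address(f[1])),
            "next_hop": str(ipaddress.IPv4Address(f[2])),
            "input_if": f[3],    # SNMP index of input interface
            "output_if": f[4],   # SNMP index of output interface
            "packets": f[5],
            "bytes": f[6],       # total Layer 3 bytes in the flow
            "first_ms": f[7],    # exporter uptime at first packet
            "last_ms": f[8],     # exporter uptime at last packet
            "src_port": f[9],
            "dst_port": f[10],
            "tcp_flags": f[11],  # cumulative OR of TCP flags
            "protocol": f[12],
            "tos": f[13],
        })
    return header, records
```

In practice you would feed this the payload of each UDP datagram received from the exporter. The sketch ignores the AS number and prefix mask fields that also sit in each record, and it does nothing about sampling or lost datagrams.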

Have you ever watched a crime drama and seen the police pull the suspect’s E-ZPass toll records or onboard GPS history? They are looking at real-world flow data to determine whether the suspect could have been at the scene of the crime at the time it occurred. In the cyber world, Netflow data is the equivalent. It will tell you when an attacker connected with your system, how they connected, and how much data they stole. It will not tell you what specifically they stole, but knowing the time, how it was taken, and how much was stolen is often enough of a clue to enable you to explore other system-specific records or logs to determine exactly what was taken. Looking at patterns of Netflows in near-real time (remember that records are only exported when a flow finishes, or periodically for long-lived flows) is a new technique for catching attackers before they can make off with your enterprise’s crown jewels.
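As a rough illustration of answering "when, how, and how much" from flow data, the sketch below summarizes the flows between a suspected attacker and one of your hosts. It assumes the flows have already been decoded into dictionaries; the keys (`src_ip`, `dst_ip`, `dst_port`, `protocol`, `bytes`, `start`, `end`) and the function name are hypothetical. Keep in mind that Netflow records are unidirectional, so data leaving your host shows up in records whose destination is the attacker.

```python
def summarize_contact(flows, attacker_ip, victim_ip):
    """Summarize when, how, and how much for traffic between two hosts."""
    pair = {attacker_ip, victim_ip}
    matched = [f for f in flows if {f["src_ip"], f["dst_ip"]} == pair]
    if not matched:
        return None
    return {
        # When: earliest start and latest end across all matching flows.
        "first_seen": min(f["start"] for f in matched),
        "last_seen": max(f["end"] for f in matched),
        # How: (protocol, port) pairs the attacker connected to on the victim.
        "services": sorted({(f["protocol"], f["dst_port"])
                            for f in matched if f["dst_ip"] == victim_ip}),
        # How much: bytes sent from the victim toward the attacker.
        "bytes_to_attacker": sum(f["bytes"] for f in matched
                                 if f["dst_ip"] == attacker_ip),
    }
```

The byte count is only a starting point. It tells you how much left the host, and you still need host logs to work out what it was.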

Netflow data can be used for a great many things, from balancing loads across a cloud application or infrastructure to tracking cyber crime or cyber warfare in near-real time. So how might you use Netflow to look for a cyber attack? One method is to use the dark space within your company’s Internet-facing and internal corporate address space to set up decoy systems. The decoys could be simple stand-alone Raspberry Pi systems configured to run Apache, Nginx, Memcached, MySQL, and so on, and loaded with useless dummy data. These systems should typically see no traffic, because they serve no purpose to your employees or customers. Mark my words, though: once they exist, they will soon be discovered and interrogated. By monitoring the Netflows into these systems, you can take the source addresses you see there and use them as the key to search all the flows into your production systems and desktops. You could also proactively use the data from these systems to strengthen both your perimeter and endpoint defenses.
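A minimal sketch of that decoy workflow might look like the following. It assumes flows are already available as dictionaries with hypothetical `src_ip` and `dst_ip` keys, and that you know which address ranges hold your decoys; the function names and the example range are made up for illustration.

```python
import ipaddress


def _in_any(ip_text, networks):
    """Return True if the address falls inside any of the given networks."""
    ip = ipaddress.ip_address(ip_text)
    return any(ip in net for net in networks)


def sources_touching_decoys(flows, decoy_ranges):
    """Collect every source address that sent traffic to a decoy."""
    nets = [ipaddress.ip_network(r) for r in decoy_ranges]
    return {f["src_ip"] for f in flows if _in_any(f["dst_ip"], nets)}


def production_flows_from(flows, suspects, decoy_ranges):
    """Find flows from known decoy-touchers into non-decoy systems."""
    nets = [ipaddress.ip_network(r) for r in decoy_ranges]
    return [f for f in flows
            if f["src_ip"] in suspects and not _in_any(f["dst_ip"], nets)]
```

You would first call `sources_touching_decoys(flows, ["10.99.0.0/24"])` and then pass the resulting set to `production_flows_from` to see what else those sources have been talking to. The same set of suspects is what you could feed into firewall or endpoint block lists.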

Another method would be to look for Netflow patterns within your enterprise that are indicative of a cyber attack. All of the common penetration testing tools can be profiled to yield the distinctive pattern of Netflows they generate against a system they are preparing to attack. Often these tools work like a common home burglar doing reconnaissance before a job. If you can catch the burglar during his recon effort, you can thwart his attack. Netflows provide you with this data.
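One of the simplest reconnaissance patterns to spot is a port scan: a single source touching many different ports on the same target. The sketch below is only a crude heuristic, not a profile of any particular tool. The flow dictionary keys, the function name, and the default threshold of 20 ports are all assumptions you would need to tune against your own traffic.

```python
from collections import defaultdict


def find_port_scanners(flows, min_ports=20):
    """Flag (source, target) pairs where the source hit many distinct ports."""
    ports_seen = defaultdict(set)
    for f in flows:
        ports_seen[(f["src_ip"], f["dst_ip"])].add(f["dst_port"])
    # Keep only pairs at or above the threshold, with their port counts.
    return {pair: len(ports) for pair, ports in ports_seen.items()
            if len(ports) >= min_ports}
```

Run over a sliding window of recent flows, this gives you a short list of source and target pairs worth a closer look. A real profile would also weigh packet counts, byte counts, and TCP flags per flow.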
