Flow records are metadata (information) about each IP conversation (a collection of related packets) that traverses a device such as a router, switch, or host. If a device is configured for it, a flow record (data about a flow) can be collected in a cache and exported to a specified destination (e.g. Kentik Detect) at a specified interval. For example, if IP 1.1.1.1 is sending packets to 2.2.2.2 through a flow-enabled device, information about that conversation can be collected in a flow record that includes the following basic flow fields:

Four primary protocols exist for flow data. Kentik Detect accepts all four protocols. The protocol used to send flow data to Kentik from a given device should be chosen based on what that router/switch supports and handles most efficiently. Where multiple protocols are supported, Kentik recommends them in the following order:

sFlow: designed by the sFlow.org consortium as a statistical monitoring tool for networks; configurable through SNMP.

Flow sampling means exporting a flow record for only one in every X flows. When X is 1, flow is unsampled (a flow record is generated for every flow); when X is 10,000, a flow record is generated for one out of every 10,000 flows. As a result, there’s an inverse relationship between the sampling rate (the ratio of total flows to sampled flows) and the resolution: a lower sampling rate (e.g. 100) gives higher resolution than a higher rate (e.g. 10,000).
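The relationship between X and the number of exported records can be illustrated with a minimal Python sketch. It assumes a deterministic "every X-th flow" sampler and a simple scale-up estimate (sampled total multiplied by X); real devices use a variety of sampling algorithms, and the function names here are hypothetical, not part of any Kentik or vendor API.

```python
def sample_one_in_x(flow_bytes, x):
    """Keep one flow out of every x (deterministic 1-in-X sampling).

    flow_bytes is a list of per-flow byte counts (illustrative data only).
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    return flow_bytes[::x]


def estimate_total_bytes(sampled_flow_bytes, x):
    """Estimate the unsampled total by scaling the sampled total by x."""
    return sum(sampled_flow_bytes) * x


# 10,000 identical flows of 100 bytes each (made-up traffic).
flows = [100] * 10_000

unsampled = sample_one_in_x(flows, 1)    # X = 1: every flow is exported
sampled = sample_one_in_x(flows, 100)    # X = 100: one record per 100 flows

estimated = estimate_total_bytes(sampled, 100)
```

With uniform traffic like this the estimate matches the true total exactly; with real, uneven traffic the estimate carries a statistical error that grows as X increases.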

While it’s tempting to assume that accuracy requires setting each device to the lowest possible rate (e.g. 1), testing by Kentik and others has established that even when the sampling rate is high it’s possible to measure small “needle-in-a-haystack” traffic flows with accuracy that is adequate for all common use cases (see our blog post Accuracy in Low-Volume Flow Sampling).

Flow sampling makes far more efficient use of the resources devoted to collecting, transporting, ingesting, and storing flow records, so much more network infrastructure can be covered for a given resource expenditure. Specifically, flow sampling is recommended (regardless of the size of your operation) for the following reasons:

- Sampling reduces the device cycles required for processing and collection of flow.
- Sampling makes it easier to see flow data in real time, because when flow is unsampled some routers hold the flow data for minutes or longer before sending it to the collector.
- Sampling is especially important during unexpected high traffic volumes. During an attack or high PPS event, for example, a router that may otherwise be able to handle unsampled flow can be overwhelmed by the sheer volume of data and may stop processing flow. That can cause a lack of visibility at precisely the moment when it is most critical to be able to collect and analyze traffic data.

For a given network device, the ideal sampling rate is low enough to capture critical information but high enough to handle traffic peaks efficiently. Optimal rates vary by the device role, the desired resolution of the flow record dataset (which depends on use case), and the total active throughput of the device. The following table provides recommended sampling rate ranges for individual devices, based on Kentik’s analysis of hundreds of devices in live production accounts:

Sampling rate (flows per sample), by max device throughput

| Device role | Resolution | 1 Gbps | 10 Gbps | 100 Gbps | 1 Tbps |
|---|---|---|---|---|---|
| Edge/Internet-facing | Standard | N.A. | 3,000 - 7,000 | 8,000 - 10,000 | 11,000 - 15,000 |
| Edge/Internet-facing | Enhanced | N.A. | 2,000 - 4,000 | 5,000 - 7,000 | 8,000 - 10,000 |
| Data Center and Core | Standard | 400 - 800 | 1,000 - 1,500 | 10,000 - 20,000 | 25,000 - 50,000 |
| Data Center and Core | Enhanced | 200 - 400 | 500 - 800 | 5,000 - 14,000 | 15,000 - 30,000 |

Notes:
- Device vendors use a variety of algorithms (random, consistent, etc.) for flow sampling; Kentik Detect works with all such algorithms in current use.
- Please consult with Kentik support for answers to questions or for help calculating the proper sampling rate for your unique network environment.
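For scripted device configuration, the recommended ranges above could be encoded as a simple lookup. The following Python sketch is illustrative only: the dictionary keys (`"edge"`, `"core"`, `"1G"`, and so on) and the function name are hypothetical labels chosen for this example, and the values are copied from the table.

```python
from typing import Optional, Tuple

# Recommended sampling-rate ranges (low, high), copied from the table above.
# Keys are hypothetical labels; "N.A." cells are simply omitted.
RECOMMENDED_RATES = {
    ("edge", "standard"): {"10G": (3000, 7000), "100G": (8000, 10000), "1T": (11000, 15000)},
    ("edge", "enhanced"): {"10G": (2000, 4000), "100G": (5000, 7000), "1T": (8000, 10000)},
    ("core", "standard"): {"1G": (400, 800), "10G": (1000, 1500),
                           "100G": (10000, 20000), "1T": (25000, 50000)},
    ("core", "enhanced"): {"1G": (200, 400), "10G": (500, 800),
                           "100G": (5000, 14000), "1T": (15000, 30000)},
}


def recommended_range(role: str, resolution: str, throughput: str) -> Optional[Tuple[int, int]]:
    """Return the (low, high) recommended range, or None where the table says N.A."""
    return RECOMMENDED_RATES[(role, resolution)].get(throughput)


core_100g = recommended_range("core", "enhanced", "100G")
edge_1g = recommended_range("edge", "standard", "1G")
```

Here `core_100g` is `(5000, 14000)` and `edge_1g` is `None`, mirroring the N.A. cell for 1 Gbps edge devices.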

Depending on how a given device is configured, flow may be created by examining traffic at either of the following points:

- Ingress: as traffic comes into an interface.
- Egress: as traffic exits an interface.

Kentik recommends that you enable flow on all interfaces and configure all devices for ingress flow creation only. NetFlow was originally designed for this scenario; it has since been extended to allow egress flow creation as well, to handle special cases such as compression and VPN services.

Enabling ingress flow creation on all interfaces gives you a full picture of all traffic traversing the router. Kentik Detect can, for example, show the traffic that left a given interface by grouping all flows that were destined for that interface when they entered the router through its other interfaces.
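The grouping idea can be sketched in a few lines of Python. This is not Kentik Detect's implementation; it only assumes that each ingress-created flow record carries an input interface, an output interface, and a byte count, and the field names (`input_if`, `output_if`, `bytes`) and interface names are hypothetical.

```python
from collections import defaultdict


def egress_bytes_by_interface(ingress_flows):
    """Derive per-interface egress volume from ingress-only flow records.

    Each record was created when traffic entered the router, but it also
    names the interface the traffic was destined to leave through.
    """
    totals = defaultdict(int)
    for flow in ingress_flows:
        totals[flow["output_if"]] += flow["bytes"]
    return dict(totals)


# Made-up records from a three-interface router.
ingress_flows = [
    {"input_if": "ge-0/0/0", "output_if": "ge-0/0/2", "bytes": 1500},
    {"input_if": "ge-0/0/1", "output_if": "ge-0/0/2", "bytes": 500},
    {"input_if": "ge-0/0/2", "output_if": "ge-0/0/0", "bytes": 700},
]

egress = egress_bytes_by_interface(ingress_flows)
```

In this example `egress["ge-0/0/2"]` is 2000 bytes, even though no flow was ever created on egress at that interface.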

Enabling both ingress and egress flow creation may result in the same traffic being counted twice. Kentik Detect lets you verify that flows are not being double-counted: whenever you view traffic for an individual interface, it reports both the flow traffic and the SNMP traffic (use the device/interface tab and select traffic, or use the Data Explorer and filter by interface). When the flow traffic on an interface (or set of interfaces) is compared to the SNMP-recorded interface traffic, the two metrics should be within 20 percent of one another.

Note: Kentik Detect does not currently remove duplicate flows resulting from enabling both ingress and egress flow creation (it will in the future).
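The flow-versus-SNMP comparison described above can also be scripted once both values have been exported. The sketch below is an assumption-laden illustration: it measures the difference relative to the SNMP figure (the text does not specify which value is the baseline), and the function name and the sample numbers are hypothetical.

```python
def within_tolerance(flow_bps, snmp_bps, tolerance=0.20):
    """Return True if flow-derived traffic is within `tolerance` of SNMP traffic.

    Assumption: the difference is measured relative to the SNMP value.
    """
    if snmp_bps == 0:
        return flow_bps == 0
    return abs(flow_bps - snmp_bps) / snmp_bps <= tolerance


# Made-up interface readings in bits per second.
healthy = within_tolerance(flow_bps=900_000_000, snmp_bps=1_000_000_000)
suspect = within_tolerance(flow_bps=2_000_000_000, snmp_bps=1_000_000_000)
```

A flow figure roughly twice the SNMP figure, as in the second reading, fails the check and is the pattern you would expect if the same traffic were counted on both ingress and egress.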