Kentik provides a free NetFlow proxy agent called “kproxy” that enables data collected from network devices to be sent securely to Kentik Detect. The machine running kproxy does not handle traffic itself; instead, kproxy collects flow records (NetFlow v5/v9, IPFIX, and sFlow) and SNMP data, encrypts them locally, and forwards them to Kentik. A single instance of the kproxy executable can relay flow for multiple routers and switches. To distribute traffic and load, kproxy can be run on multiple servers across the network.

In addition to the collection and encryption of device data, kproxy also performs the following functions:

Rate limiting and resampling to keep maximum flows per second (FPS) within applicable plan limits (see About Plans).

kproxy connects to any customer devices sending NetFlow telemetry to Kentik Detect, and also to Kentik Detect itself to send the encrypted data as well as to receive configuration information.

In order to send traffic to Kentik Detect, kproxy will build, for each NetFlow-sending device, two HTTPS sessions, one for flow traffic and the other for SNMP. All traffic is sent to Kentik Detect using such HTTPS sessions, with a Kentik Detect flow ingest server certificate.

SNMP is converted into JSON format and NetFlow/sFlow/IPFIX is converted to a Kentik proprietary binary format.

When used for encryption, kproxy pulls information from the Kentik system to determine which routers it will talk to. The routing of flow and SNMP to kproxy is enabled with the following steps:

Create a device in the Kentik Detect portal (Devices » Add Device; see Adding a Device) for each router that you want to send flow from:
- Set a device’s Name and Description.
- Set the device type as “Router” (even though you are sending through the agent).
- Set the Device IP as the IP of the router that the agent will see the flow being received from. You may enter multiple IPs, comma separated, if there is the possibility that flow may source from multiple IPs for that router. Private IPs are acceptable.
- Set Device SNMP IP to the router IP that the agent will poll for SNMP.
- Set Flow Type to the type of flow that the router is configured to export.
- Set Sample Rate to the rate at which the router is set to sample.
- Save (Add) the device.

Check the system clock and time zone settings on the server, which must be accurate to within a minute for kproxy to function correctly. Kentik recommends that hosts running kproxy use Network Time Protocol (NTP).
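The one-minute tolerance can be checked with a small script before kproxy is started. The following is a minimal sketch and is not part of kproxy: the function name clock_within_tolerance is hypothetical, and how you obtain the reference timestamp (for example from an NTP-synchronized host) is left as an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of kproxy): succeed if two Unix timestamps
# differ by no more than a tolerance in seconds (default 60, i.e. one minute).
clock_within_tolerance() {
  local_epoch=$1      # e.g. the output of: date -u +%s
  reference_epoch=$2  # epoch seconds from a trusted time source
  tolerance=${3:-60}
  skew=$((local_epoch - reference_epoch))
  # Take the absolute value of the skew.
  if [ "$skew" -lt 0 ]; then
    skew=$((0 - skew))
  fi
  [ "$skew" -le "$tolerance" ]
}

# Example: a server that is 30 seconds behind the reference is acceptable.
if clock_within_tolerance 1700000000 1700000030; then
  echo "clock OK"
else
  echo "clock skew too large"
fi
```

On a live server the first argument would typically be "$(date -u +%s)".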

Determine which of your organization’s users to use for the authentication that enables kproxy to talk to Kentik servers. The designated user may be any user that has been configured in the Users section of the Kentik Detect portal. You may wish to create a user specifically for agent/flow authentication so that this functionality is not tied to a user that is later deactivated (e.g. the person leaves your organization). You’ll need the user’s e-mail address and your organization’s KDE password, available in the User Profile.

Determine the IP that kproxy will bind to on your server in order to receive flow.

Determine the port that your server will accept flow on (i.e. where you will point your routers to).

Run kproxy, specifying the arguments described in kproxy Command Line. You’ll likely want to test it initially from the command line and then place it into one of your startup scripts so that it begins on boot. If not placed in the background, the agent will run in the foreground and adhere to standard kill/end signals (e.g. run “nohup kproxy +options &” if running from the command line and exiting shell).

On the kproxy Packages page, confirm that a version exists for your Linux distribution and version.

Click Installation in the sidebar at left.

On the resulting Installation Instructions page, use the Bash Scripts column to choose the tab corresponding to the package type (deb, rpm, node, python, or gem) that you need for your distribution of Linux.

At the top of the resulting tab, click the Copy button to copy the quick install cURL to the clipboard.

Run the quick install cURL in Terminal. The package script will run, downloading and installing the packages.

Once installed, kproxy must be kept up to date to ensure correct performance. For best results, Kentik recommends using your OS’s package manager to upgrade existing instances of kproxy automatically. If using the package manager is not possible, then you can follow the steps below to perform a manual upgrade.

Get your API Token:
- In the Kentik Detect portal, click your user name at the right of the main navbar, then choose My Profile from the drop-down menu.
- On the My Profile page, choose Authentication from the sidebar at left.
- On the Authentication page, copy the contents of the field in the API Token pane at bottom.

Create an environment variable named KENTIK_API_TOKEN whose value is the token you copied from the portal Authentication page.

Note: Once you have an environment variable for the token, do not include the token in the command line with the api_token parameter. Storing the token as an environment variable prevents it from being exposed via a process (ps) command and is consistent with security best practices.

Set up kproxy using the following command (for parameter definitions, see kproxy CLI Arguments):

kproxy -api_email=api_email -host=interface_ip

Notes:
- If your organization is registered with Kentik in the EU you must set the -region argument to eu.
- If kproxy fails to launch, add the -verbose flag and try again so that you can provide the output to support@kentik.com in order to facilitate troubleshooting.
- Use -h to return a list of arguments.
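The steps and notes above can be combined into a short launch script. This is a minimal sketch under stated assumptions: the token, email address, and interface IP are placeholders, the -region argument is assumed to follow the same -name=value form as the other arguments, and the script only prints the command (a dry run) rather than executing kproxy.

```shell
#!/bin/sh
# Placeholder values for illustration only; substitute your own.
KENTIK_API_TOKEN="replace-with-your-token"   # read by kproxy from the environment
export KENTIK_API_TOKEN
API_EMAIL="flow-agent@example.com"           # hypothetical Kentik user
INTERFACE_IP="10.0.0.5"                      # hypothetical IP that kproxy binds to
REGION="eu"                                  # leave empty unless registered in the EU

# Assemble the command line; the token is deliberately not included.
KPROXY_CMD="kproxy -api_email=$API_EMAIL -host=$INTERFACE_IP"
if [ -n "$REGION" ]; then
  KPROXY_CMD="$KPROXY_CMD -region=$REGION"
fi

# Dry run: print the command instead of executing it.
echo "$KPROXY_CMD"
```

Because the token lives only in the environment, it does not appear in the printed command or in the process list once kproxy is started.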

The command line arguments below are used when configuring kproxy as a NetFlow proxy agent.

-port (optional): Set the port to listen on. If omitted, kproxy defaults to listening on port 9995.

-dns (optional): The address, expressed as IP:Port (e.g. 127.0.0.1:5353), of an alternate (private) DNS server that is to be used for reverse DNS lookups instead of Kentik’s default server. At ingest the IP addresses in the flow records from this kproxy are looked up on the alternate DNS server and the returned host names are stored in KDE as Custom Dimensions. These source and destination Hostname dimensions can be used for group-by and filtering in queries.

Notes:
- This parameter is distinct from Custom DNS, which affects only the display of hostnames in the portal, doesn’t involve the creation of custom dimensions, and specifies an alternate DNS server for flows from all sources rather than just from an individual instance of kproxy.
- If an alternate DNS server is specified with this argument, the hostname returned for a given IP address won’t be checked again for 24 hours.
- Kentik looks up host names from alternate servers on a “best effort” basis; if the rate of flow records from a given device is high then it may not be possible to resolve all of the IP addresses in those flows to host names.

-site_id (optional): The ID of a Kentik-registered site (see About Sites). If a different site ID is specified for two or more instances of kproxy, then a device sending flow to Kentik via one such instance will be able to use the same Sending IP address (see Device General Settings) as a device sending flow via another instance (see IP Overloading).

-proxy_host (optional): Sets the IP that kproxy will use for child processes; default is 127.0.0.1. The kproxy parent process spawns a child process to handle flow from each individual combination of device and time-slice.

-healthcheck (optional): The IP to use for the healthcheck service; default is 127.0.0.1.

-auto_update (optional; defaults to false): If specified as true then once every 24 hours kproxy will check for a new version and update itself.

-syslog_config (optional): Specifies the path to a JSON configuration file that defines a schema which determines how KDE will, at ingest, parse the flow data contained in any syslog that is sent to this instance of kproxy (see kproxy Syslog Parsing).

-bootstrap_devices (optional): A comma-separated list of device ids (e.g. “12002,1221”) for which this instance of kproxy should initiate a child process without first waiting to receive flow. Used in scenarios such as using kproxy on a given device for SNMP only (no flow collection).

-tee_kproxy (optional): In addition to the IP address for KDE ingest, all received flow data will be sent to the IP address:port combination (if any) specified with this argument.

Example: -tee_kproxy=10.10.1.1:4545 will send flow data to port 4545 at the IP address 10.10.1.1.

Note: Available for kproxy 7.4 and higher.

If you choose not to store the api_token as an environment variable (step 2 above), despite the risk of exposure via a process (ps) command, your command line will need to include the token using the following additional parameter:

-api_token: A Kentik-generated string that kproxy will use to authenticate a registered user (must be the same user as for -api_email).

The following arguments are specified only when using kproxy to receive streaming telemetry data from devices and forward it to KDE (see SNMP and Streaming Telemetry). Leave these arguments unspecified in all other scenarios:

-st_dialout_listener (required for ST on Junos or Cisco): Set to auto to enable receipt by kproxy of streaming telemetry data collected and sent either by devices running Junos or by Cisco IOS-XRv 9000 routers.

-st_udp_bind (required for ST on Junos): Specifies the IP address and port on which kproxy should listen for streaming telemetry data collected and sent as UDP by devices running Junos (version 18.4R2.7 or later).

-st_tcp_bind (required for ST on Cisco): Specifies the IP address and port on which kproxy should listen for streaming telemetry data collected and sent as TCP by devices such as Cisco IOS-XRv 9000 routers (version 6.2.3 or later).

The combination of above arguments used to enable Streaming Telemetry with Kentik varies depending on the data source:

To enable ST via kproxy from devices running Junos:
- Always use st_dialout_listener, set to auto.
- Also use st_udp_bind, typically set to 0.0.0.0:9555.

To enable ST via kproxy from Cisco IOS-XRv 9000 routers:
- Always use st_dialout_listener, set to auto.
- Also use st_tcp_bind, typically set to 0.0.0.0:9555.

To enable ST from both Junos devices and Cisco IOS-XRv 9000 routers to the same instance of kproxy:
- Always use st_dialout_listener, set to auto.
- Also use both st_tcp_bind and st_udp_bind, both typically set to 0.0.0.0:9555.

Note: Setting the binding arguments above to the Kentik-recommended value of 0.0.0.0:9555 enables kproxy to listen on the specified port on all IPs on the server.
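The three combinations above can be captured in a small helper that returns the streaming telemetry arguments for a given data source. This is only a sketch: the function name st_args and the source-type keywords (junos, cisco, both) are hypothetical, and the arguments are assumed to take the same -name=value form as the other kproxy arguments.

```shell
#!/bin/sh
# Hypothetical helper: print the streaming telemetry arguments for a source type.
# The bind address 0.0.0.0:9555 is the typical value given in the lists above.
st_args() {
  bind="0.0.0.0:9555"
  case "$1" in
    junos) echo "-st_dialout_listener=auto -st_udp_bind=$bind" ;;
    cisco) echo "-st_dialout_listener=auto -st_tcp_bind=$bind" ;;
    both)  echo "-st_dialout_listener=auto -st_tcp_bind=$bind -st_udp_bind=$bind" ;;
    *)     echo "unknown source type: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the arguments for each combination.
for source_type in junos cisco both; do
  echo "$source_type: $(st_args "$source_type")"
done
```

The output of st_args would be appended to the rest of the kproxy command line for the instance that receives streaming telemetry.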

To enable faster, more efficient setup of kproxy, especially when you need to deploy many instances, the kproxy installation process includes a kproxy service definition that can be run with Linux’s systemd system and service manager:

To use kproxy for flow encryption, run the default version of the service definition (replace placeholders with actual authentication credentials).

To take advantage of other kproxy features, also edit the service definition to specify the desired kproxy behavior using the arguments covered in kproxy CLI Arguments.

The kproxy install process results in installation of two systemd-related files:

/etc/systemd/system/kproxy.service

/etc/default/kentik.env.sample

To run the service definition:

Copy the config file at /etc/default/kentik.env.sample to /etc/default/kentik.env

In the new /etc/default/kentik.env service definition file, replace the following placeholders with actual values associated with a Kentik-registered user:
- KENTIK_API_TOKEN=XXX_FILL_ME_IN: The API token from the user’s Authentication Page.
- KENTIK_API_EMAIL=foo@example.com: The user’s email address.

To use this kproxy instance for tasks other than flow encryption, you can make additional edits in the service definition file, specifying the desired behavior using the arguments in kproxy CLI Arguments.

Choose the Linux command corresponding to how you want kproxy to start:
- To start manually, use systemctl start kproxy
- To start upon boot, use systemctl enable kproxy

To get logs (confirm that kproxy is running) run the following Linux command:

journalctl -u kproxy
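The copy-and-edit steps above can be scripted when many instances need to be deployed. The following sketch assumes that the sample file contains the two placeholder lines exactly as shown above (KEY=value, one per line); the helper name fill_kentik_env is hypothetical, and the demonstration works on a temporary directory rather than /etc/default so that it is safe to run anywhere.

```shell
#!/bin/sh
# Hypothetical helper: copy the sample env file and fill in the two placeholders.
# Arguments: sample path, destination path, API token, API email.
# Assumes the token and email contain no "|" or "&" characters (sed-special here).
fill_kentik_env() {
  sed -e "s|^KENTIK_API_TOKEN=.*|KENTIK_API_TOKEN=$3|" \
      -e "s|^KENTIK_API_EMAIL=.*|KENTIK_API_EMAIL=$4|" \
      "$1" > "$2"
}

# Demonstration on a temporary copy instead of the real files in /etc/default.
workdir=$(mktemp -d)
printf 'KENTIK_API_TOKEN=XXX_FILL_ME_IN\nKENTIK_API_EMAIL=foo@example.com\n' \
  > "$workdir/kentik.env.sample"
fill_kentik_env "$workdir/kentik.env.sample" "$workdir/kentik.env" \
  "example-token" "flow-agent@example.com"
cat "$workdir/kentik.env"

# On a real host the remaining steps would be (as root):
#   systemctl enable kproxy    # start on boot
#   systemctl start kproxy     # start now
#   journalctl -u kproxy       # confirm that kproxy is running
```

On a real host the first two arguments would be /etc/default/kentik.env.sample and /etc/default/kentik.env.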

Note: A local config file should be used to specify SNMP settings only when customer information security policies prohibit the configuration of SNMP settings in the Kentik Detect portal.

By default, the SNMP configuration (SNMP IP and SNMP Community) for a given device that sends flow to Kentik Detect (e.g. router; see Supported Device Types) is learned by kproxy from that device’s settings in the Kentik Detect portal (see Device IP & SNMP Settings). There may be circumstances, however, in which it is necessary (e.g. for security compliance) not to specify SNMP settings for a given router in the portal. In this case it is possible instead to specify the settings through kproxy configuration, using the optional -snmp_file command line argument to direct kproxy to get that information from a local config file.

When SNMP is configured with an external file, the required SNMP parameters are set from the values in that file. These values are described below:

device_id (required): The Kentik-assigned ID of the device.

snmp_comm (required): The device’s SNMP community.

snmp_ip (required): The IP address that should be used to poll the router.

minimize_snmp (required when device type is router):
- If false (standard), interface counters will be polled every 5 minutes and interface descriptions every 30 minutes.
- If true (minimized), interface counters won’t be polled and interface descriptions will be polled every 6 hours.

The required settings are stored in the config file as JSON key/value pairs. The following example shows a local configuration file for two devices using SNMP V2, with minimize_snmp set to true for the second device:

The following tips may be useful in debugging issues related to the use of kproxy:

Our article on Router Configuration will guide you through the general setup of routers to work with Kentik Detect.

The IPs allowed in the Agent tile on the Admin » Access Control page (see Access Control) must include the public IP of the server running kproxy. If the server is behind a NAT gateway you can get its public IP by running wget -qO- ifconfig.co on the server.

If the kproxy command line argument -metrics was set to stderr then you will receive a checkpoint every minute that indicates how much flow you are receiving from the router. If that count is not increasing then there is an issue between your router and kproxy: the router configuration, the kproxy configuration, or the communication between them.

It may take 2-3 minutes for the agent to download flow templates and begin to process flow. You can expect to receive errors (“[ipfix_parse_msg] no template for 256, skip data set”) for the first few minutes, after which the errors should stop.

Errors will be logged in stdout by default, but if the -syslog flag was used in the kproxy command line then instead they are logged in syslog (see kproxy Command Line for details).

A kproxy debug and health-check port is opened by default on the loopback address (127.0.0.1), one port higher than the configured flow ingest port (e.g. for the default flow port of tcp/9995, the health-check port is tcp/9996). For further information, contact support@kentik.com.

In addition to its role as a collector and encryptor of flow data from routers and other network infrastructure, kproxy can also act as a parser for syslogs, enabling log data to be collected and stored in KDE alongside data from other Kentik-supported sources. This capability enables us to support monitoring and analytics in Kentik Detect on a variety of information that may not be available directly from flow records. It works by matching patterns in the syslog text and assigning the resulting values to fields in a flow record that is ingested into KDE. Once the syslog data is stored in KDE, queries can filter and group-by on the dimensions corresponding to the fields populated from syslogs.

Because there is no universal standard for the structure of syslogs, Kentik enables the syslog parsing capability via a JSON configuration file that you use to tell us what patterns you want to match. If the path to such a file is specified with the syslog_config argument in the command line of a given kproxy instance (see kproxy Proxy Agent Arguments) then every syslog file collected by that instance will be evaluated for matches with the patterns defined in the config file.

A different set of patterns may be specified for each of the devices that send syslogs to a given instance of kproxy.

As shown in the commented configuration file example below:

The patterns object allows you to specify custom (non-default) patterns that may be used to evaluate incoming syslogs from any device.

The devices object identifies a device by its Kentik Device ID (generated when the device is onboarded; see Device List Common Columns) and determines how patterns are applied to that device.

The types array in the devices object lists the patterns (default and custom) to apply for that device.

{
  "patterns": {
    /* A set of custom patterns to match (not needed when using only default patterns). */
    "IRCUSER": "\\A@(\\w+)",
    "IRCBODY": ".*",
    "IRCMSG": "%{IRCUSER:user}.* : %{IRCBODY:message}"
  },
  "devices": {
    /* A collection of objects that each represent the parsing settings for one device. */
    "1001": {
      /* An object, identified by Kentik device ID, containing the parsing settings for an individual device. */
      "skip_default": false, /* Determines whether matching of default patterns should (true) or should not (false) be skipped during parsing. */
      "types": [
        /* An array of patterns to match for this device. If a pattern is not a default, it must be listed in the patterns object above. */
        "%{IRCMSG}", /* A pattern defined in the patterns section above. */
        "%{SHOREWALL}" /* A default pattern. */
      ]
    }
  }
}

Note: To use the above example as a configuration template, remove all comments.
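As a starting point, the example above with its comments removed can be written to a file from the shell. This is a sketch only: the temporary file location is an arbitrary choice, the device ID 1001 is the one from the example, and the final line prints the -syslog_config argument (a dry run) rather than starting kproxy.

```shell
#!/bin/sh
# Write a comment-free version of the example configuration to a temporary file.
# The quoted 'EOF' keeps the backslashes in the patterns exactly as written.
config_file=$(mktemp)
cat > "$config_file" <<'EOF'
{
  "patterns": {
    "IRCUSER": "\\A@(\\w+)",
    "IRCBODY": ".*",
    "IRCMSG": "%{IRCUSER:user}.* : %{IRCBODY:message}"
  },
  "devices": {
    "1001": {
      "skip_default": false,
      "types": ["%{IRCMSG}", "%{SHOREWALL}"]
    }
  }
}
EOF

# Guard against leftover comments, which are not valid JSON.
if grep -q '/\*' "$config_file"; then
  echo "config still contains comments" >&2
else
  echo "kproxy -syslog_config=$config_file"   # dry run: argument to add to the command line
fi
```

In practice the file would be saved at a permanent path of your choosing and that path passed to -syslog_config.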

As noted earlier, the purpose of parsing syslogs is to extract useful information that can be stored in the flow records of KDE. When a string in a parsed line of syslog is matched via the Grok configuration described above, the way that data is incorporated into a flow record depends on the factors described in the following topics.

The matched value will be entered into a standard KDE column (see Main Table Columns) in the following situations:

The match is on a custom pattern that includes a label that matches a KDE column name. For example, if a pattern is defined as “(%{MY_PATTERN:patternlabel})” then Kentik would attempt to match the label “patternlabel” to a KDE column name. If a column by that name is not found then an attempt will be made to match the pattern as described in UDR Column Mapping.

The match is on one of the default patterns for which there is both a label and a Kentik-defined mapping from the label to a KDE column. These currently include the following (more coming soon):
- label “bytes” to column InBytes;
- label “packets” to column InPkts;
- label “clientip” to column DstIP;
- label “localip” to column SrcIP.
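The default mappings listed above amount to a simple lookup from label to column. The sketch below expresses that lookup in shell purely for illustration; it is not Kentik's implementation, the function name standard_column is hypothetical, and the label hostname is an invented example of a label with no default mapping.

```shell
#!/bin/sh
# Sketch of the default label-to-column mappings listed above.
# Prints the KDE column for a label, or returns non-zero if no mapping is listed.
standard_column() {
  case "$1" in
    bytes)    echo "InBytes" ;;
    packets)  echo "InPkts" ;;
    clientip) echo "DstIP" ;;
    localip)  echo "SrcIP" ;;
    *)        return 1 ;;
  esac
}

# "hostname" is a made-up label used to show the no-mapping case.
for label in bytes clientip hostname; do
  if column=$(standard_column "$label"); then
    echo "$label -> $column"
  else
    echo "$label -> no default mapping"
  fi
done
```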

If neither of the Standard Column Mapping situations applies, then the matching, if any, on the contents of a syslog line will result in data being entered into a UDR (custom) column (see Universal Data Records) rather than a standard KDE column:

The labels of the patterns for which there are matches in a given syslog line will be ordered (ascending alphabetical).

In the flow record for each syslog line, both the label and the matching data for each matched pattern will be assigned in order to one of the 11 available UDR fields, str00 through str10. For example, if the first (alphabetically) pattern matched in a given syslog line is specified in the config file as “(%{MY_PATTERN:patternlabel})” then the UDR field str00 would be populated with “patternlabel = ” followed by the text that matched the pattern MY_PATTERN.
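The ordering and assignment just described can be illustrated with a few lines of shell. This is a sketch of the logic, not Kentik's code: the function name assign_udr_fields is hypothetical, labels are sorted in byte order as an approximation of “ascending alphabetical,” and matches beyond the 11 available fields are simply ignored here, which is an assumption since the behavior in that case is not specified above.

```shell
#!/bin/sh
# Sketch of the ordering and assignment described above (not Kentik's code).
# Input: one "label<TAB>matched text" pair per line on stdin.
# Output: the UDR field each pair would land in, str00 through str10.
assign_udr_fields() {
  LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
    awk -F '\t' 'NR <= 11 { printf "str%02d: %s = %s\n", NR - 1, $1, $2 }'
}

# Two matches from a hypothetical syslog line, given in arbitrary order.
printf 'user\talice\nmessage\thello world\n' | assign_udr_fields
```

In this example the label message sorts before user, so it is assigned to str00 and user to str01.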

Once data from a syslog line has been ingested into a KDE flow record it is available for group-by and filtering in Kentik Detect queries. The method used to access the data depends on how it was assigned to the fields of the flow record:

UDR dimensions: If the matched data was put into a UDR KDE column (str00 through str10) via either UDR Column Mapping or Patternless Mapping, then it can be accessed via the UDR dimension corresponding to that column. These dimensions are named Field00 through Field10.

Note: When filtering for data ingested via UDR Column Mapping you can use the label name that was inserted at the start of the field during ingest.

The setup workflow for using syslog data in Kentik Detect is as follows:

Register with Kentik Detect the devices that will be sending syslogs to kproxy (see kproxy Setup). The Type drop-down on the General tab of the Add Device dialog should be set to “Generic Syslog.”

Create a configuration file that defines the patterns that you want to match and conforms to the structure described in Syslog Parsing Configuration:
- If you are trying to ingest data that corresponds to an existing KDE column, see Standard Column Mapping.
- If you are trying to ingest data for which there is no existing KDE column, see UDR Column Mapping.

Assign the configuration file to a given instance of kproxy using the -syslog_config command line argument (see kproxy Proxy Agent Arguments).

Once the devices begin sending syslogs to kproxy, the patterns defined in the configuration file will be used to match the contents of each syslog line and to map matching data to the fields of flow records.

When the flow records are ingested into KDE the syslog-derived data can be accessed via group-by and filtering on Syslog Data Dimensions.