statsd is a system to collect data from any application. Applications send metrics to it, usually via non-blocking UDP communication, and statsd servers collect these metrics, perform a few simple calculations on them, and push them to backend time-series databases.

There is a plethora of client libraries for embedding statsd metrics into any application framework. This makes statsd quite popular for custom application metrics.

netdata is a fully featured statsd server. It can collect statsd formatted metrics, visualize them on its dashboards, stream them to other netdata servers or archive them to backend time-series databases.

Since statsd is embedded in Netdata, you have a statsd server on every server that runs Netdata. Applications can therefore send their metrics to localhost:8125, which gives you a distributed statsd implementation.

Netdata statsd is fast. It can collect more than 1,200,000 metrics per second on modern hardware (more than 200 Mbps of sustained statsd traffic) using 1 CPU core. Strictly speaking it is double-threaded rather than single-threaded: one thread collects metrics and another one updates the charts from the collected data.

Netdata fully supports the statsd protocol. All statsd client libraries can be used with Netdata too.

Gauges

The application sends name:value|g, where value is any decimal/fractional number. statsd reports the latest value collected and the number of times it was updated (events).

The application may increment or decrement a previous value by setting the first character of the value to + or -. As a consequence, the only way to set a gauge to an absolute negative value is to first set it to zero.
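The sign rule above can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not netdata's implementation; `apply_gauge` is a hypothetical helper name.

```python
def apply_gauge(current, value):
    """Return the new gauge value after receiving `value` (the text after ':')."""
    if value[0] in "+-":
        # A leading sign means "adjust the previous value".
        return current + float(value)
    # No sign: the value replaces the previous one.
    return float(value)


gauge = 0.0
for update in ["10", "+5", "-3"]:
    gauge = apply_gauge(gauge, update)
print(gauge)  # 12.0

# To reach an absolute -4, reset to zero first, then decrement.
for update in ["0", "-4"]:
    gauge = apply_gauge(gauge, update)
print(gauge)  # -4.0
```

Sending `-4` alone would have produced 8.0 here, which is why the reset to zero is needed.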

Sampling rate is supported (check below).

When a gauge is not collected and the setting is not to show gaps on the charts (the default), the last value will be shown, until a data collection event changes it.

Counters and Meters

The application sends name:value|c, name:value|C or name:value|m, where value is a positive or negative integer giving the number of events that occurred. statsd reports the rate and the number of times it was updated (events).

:value can be omitted and statsd will assume it is 1. |c, |C and |m can be omitted and statsd will assume it is |m. So, the application may send just name and statsd will parse it as name:1|m.

For counters use |c (etsy/statsd compatible) or |C (brubeck compatible), for meters use |m.
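To make the defaulting rules concrete, here is a minimal Python sketch of a parser that fills in the missing parts. `parse_metric` is a hypothetical name, and the sketch deliberately ignores the optional sampling-rate suffix.

```python
def parse_metric(line, default_value="1", default_type="m"):
    """Split 'name[:value][|type]' into (name, value, type), applying defaults."""
    head, _, metric_type = line.partition("|")
    name, _, value = head.partition(":")
    return name, value or default_value, metric_type or default_type


print(parse_metric("requests"))      # ('requests', '1', 'm')
print(parse_metric("requests:3"))    # ('requests', '3', 'm')
print(parse_metric("requests:3|c"))  # ('requests', '3', 'c')
```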

Sampling rate is supported (check below).

When a counter or meter is not collected and the setting is not to show gaps on the charts (the default), zero will be shown, until a data collection event changes it.

Timers and Histograms

The application sends name:value|ms or name:value|h, where value is any decimal/fractional number. statsd reports min, max, average, sum, 95th percentile, median and standard deviation, plus the total number of times it was updated (events).

For timers use |ms, for histograms use |h. The only difference between the two is the units of the charts (timers report milliseconds).

Sampling rate is supported (check below).

When a timer or histogram is not collected and the setting is not to show gaps on the charts (the default), zero will be shown, until a data collection event changes it.

Sets

The application sends name:value|s, where value is anything (number or text; leading and trailing spaces are removed). statsd reports the number of unique values sent and the number of times it was updated (events).

Sampling rate is not supported for Sets. value is always considered text.

When a set is not collected and the setting is not to show gaps on the charts (the default), zero will be shown, until a data collection event changes it.
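The set semantics can be illustrated with a small Python sketch. `StatsdSet` is a hypothetical class, and the sketch does not model flush intervals; it only shows that values are treated as text, trimmed, and counted once.

```python
class StatsdSet:
    """Tracks unique values (as trimmed text) and the number of updates."""

    def __init__(self):
        self.values = set()
        self.events = 0

    def add(self, value):
        # Values are always treated as text; surrounding spaces are removed.
        self.values.add(str(value).strip())
        self.events += 1

    def report(self):
        return len(self.values), self.events


users = StatsdSet()
for value in [" alice ", "alice", "bob", 42, "42"]:
    users.add(value)
print(users.report())  # (3, 5)
```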

The application may append |@sampling_rate, where sampling_rate is a number from 0.0 to 1.0, to have statsd extrapolate the value and predict the total for the whole period. So, if the application reports to statsd a value for 1/10th of the time, it can append |@0.1 to the metrics it sends to statsd.
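As a rough illustration of the extrapolation, the sketch below divides the received value by the sampling rate, which is the usual statsd convention for counters. That netdata does exactly this for every metric type is an assumption; `extrapolate` is a hypothetical helper.

```python
def extrapolate(value, sampling_rate):
    """Scale a sampled value up to an estimate for the whole period."""
    if not 0.0 < sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be greater than 0.0 and at most 1.0")
    return value / sampling_rate


# A counter increment of 1 reported with |@0.1 stands for about 10 events.
print(extrapolate(1, 0.1))   # 10.0
print(extrapolate(4, 0.5))   # 8.0
```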

netdata listens for both TCP and UDP packets. For TCP, though, it is important to always append \n to each metric. netdata uses this to detect whether a metric is split across multiple TCP packets. On disconnect, even the remaining buffer (not terminated with \n) is processed.
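A minimal Python sketch of a TCP sender that follows this rule is shown below. The function names are hypothetical; host and port default to the localhost:8125 address mentioned earlier.

```python
import socket


def frame_for_tcp(metrics):
    """Terminate every metric, including the last one, with a newline."""
    return "".join(metric + "\n" for metric in metrics).encode("ascii")


def send_tcp(metrics, host="localhost", port=8125):
    """Open a TCP connection, send all metrics, and close the connection."""
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(frame_for_tcp(metrics))


print(frame_for_tcp(["myapp.requests:1|c", "myapp.load:0.7|g"]))
```

Calling `send_tcp([...])` requires a statsd server listening on the given address, so it is not invoked in the sketch itself.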

When sending multiple packets over UDP, it is important not to exceed the network MTU (usually 1500 bytes minus a few bytes for the headers). netdata will accept UDP packets up to 9000 bytes, but the underlying network will not exceed MTU.
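One way to stay under the MTU is to batch metrics into datagrams of bounded size. The sketch below assumes the common statsd convention of separating several metrics in one UDP packet with \n, and an IPv4 payload budget of 1472 bytes (1500 minus 20 bytes of IPv4 header and 8 bytes of UDP header). The function names are hypothetical.

```python
import socket

MAX_PAYLOAD = 1472  # 1500-byte MTU - 20 (IPv4 header) - 8 (UDP header)


def batch_for_udp(metrics, max_payload=MAX_PAYLOAD):
    """Group newline-separated metrics into datagrams of at most max_payload bytes."""
    datagrams, current = [], b""
    for metric in metrics:
        line = metric.encode("ascii")
        if len(line) > max_payload:
            raise ValueError("a single metric exceeds max_payload")
        candidate = line if not current else current + b"\n" + line
        if len(candidate) > max_payload:
            # The current datagram is full: emit it and start a new one.
            datagrams.append(current)
            current = line
        else:
            current = candidate
    if current:
        datagrams.append(current)
    return datagrams


def send_udp(metrics, host="localhost", port=8125):
    """Send the batched datagrams without waiting for any reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for datagram in batch_for_udp(metrics):
            sock.sendto(datagram, (host, port))
    finally:
        sock.close()
```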

controls if statsd will be enabled for this netdata. The default is enabled.

default port = 8125 controls the port statsd will use. This is only the default, since the next setting also allows defining ports.

bind to = udp:localhost tcp:localhost is a space separated list of IPs and ports to listen to. The format is PROTOCOL:IP:PORT - if PORT is omitted, the default port will be used. If IP is IPv6, it needs to be enclosed in []. IP can also be * (to listen on all IPs) or even a hostname.
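The PROTOCOL:IP:PORT format can be illustrated with a small Python parser. This is a sketch of the format as described above, not netdata's parser; it assumes the protocol is always present, and the function names are hypothetical.

```python
def parse_bind(entry, default_port=8125):
    """Parse one 'PROTOCOL:IP[:PORT]' entry into (protocol, ip, port)."""
    protocol, _, rest = entry.partition(":")
    if rest.startswith("["):
        # IPv6 literal: [addr], optionally followed by :port
        address, _, tail = rest[1:].partition("]")
        port = tail[1:] if tail.startswith(":") else ""
    else:
        address, _, port = rest.partition(":")
    return protocol, address, (int(port) if port else default_port)


def parse_bind_list(setting, default_port=8125):
    """Parse the whole space separated 'bind to' value."""
    return [parse_bind(entry, default_port) for entry in setting.split()]


print(parse_bind_list("udp:localhost tcp:localhost"))
# [('udp', 'localhost', 8125), ('tcp', 'localhost', 8125)]
print(parse_bind("tcp:[::1]:9125"))  # ('tcp', '::1', 9125)
```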

decimal detail = 1000 controls the number of fractional digits in gauges and histograms. netdata collects metrics using signed 64 bit integers, and their fractional detail is controlled using multipliers and divisors. This setting is used as the multiplier that converts all collected values to integers, and it is also set as the divisor, so that the final data will be a floating point number with this fractional detail (1000 = X.0 - X.999, 10000 = X.0 - X.9999, etc).
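The multiplier/divisor mechanism can be sketched as follows. Whether netdata truncates or rounds when converting to an integer is not stated here, so the truncation below is an assumption; the function names are hypothetical.

```python
DECIMAL_DETAIL = 1000


def to_stored(value, detail=DECIMAL_DETAIL):
    """Multiply by the detail and keep an integer (truncation is an assumption)."""
    return int(value * detail)


def to_displayed(stored, detail=DECIMAL_DETAIL):
    """Divide by the same detail to recover the fractional value."""
    return stored / detail


stored = to_stored(1.2345)
print(stored)                # 1234
print(to_displayed(stored))  # 1.234
```

With a detail of 1000, anything beyond the third fractional digit is lost, which matches the X.0 - X.999 range given above.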

Each metric gets its own private chart. This is the default and does not require any configuration (although there are a few options to tweak).

Synthetic charts can be created, combining multiple metrics, independently of their metric types. For this type of charts, special configuration is required, to define the chart title, type, units, its dimensions, etc.

Private charts are controlled with create private charts for metrics matching = *. This setting accepts a space separated list of simple patterns (use * as wildcard, prepend a pattern with ! for a negative match, the order of patterns is important).

So to render charts for all myapp.* metrics, except myapp.*.badmetric, use:
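The configuration value that belongs here is not shown. Assuming that the first matching pattern decides the result (which is one reading of "the order of patterns is important"), a value such as `!myapp.*.badmetric myapp.*` would express the rule. The Python sketch below approximates that matching logic; it is not netdata's simple pattern implementation, and `simple_pattern_match` is a hypothetical name.

```python
import re


def simple_pattern_match(patterns, name):
    """Return the verdict of the first matching pattern; False if none match."""
    for pattern in patterns.split():
        negative = pattern.startswith("!")
        if negative:
            pattern = pattern[1:]
        # Translate '*' into a regex wildcard, escaping everything else.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, name):
            return not negative
    return False


setting = "!myapp.*.badmetric myapp.*"
print(simple_pattern_match(setting, "myapp.api.hits"))       # True
print(simple_pattern_match(setting, "myapp.api.badmetric"))  # False
print(simple_pattern_match(setting, "other.metric"))         # False
```

Under this assumption, reversing the order to `myapp.* !myapp.*.badmetric` would let `myapp.*` match first and the negative pattern would never take effect.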

The memory mode of the round robin database and the history of private metric charts are controlled with private charts memory mode and private charts history. The default for both settings is to use the global netdata settings. So, you need to edit them only when you want statsd to use different settings compared to the global ones.

If you have thousands of metrics, each with its own private chart, you may notice that your web browser becomes slow when you view the netdata dashboard (this is a web browser issue we need to address in the netdata UI). So, netdata has a protection to stop creating charts when max private charts allowed = 200 (soft limit) is reached.

The metrics above this soft limit are still processed by netdata and will be available to be sent to backend time-series databases, up to max private charts hard limit = 1000. So, between 200 and 1000 charts, netdata will still generate charts, but they will automatically be created with memory mode = none (netdata will not maintain a database for them). These metrics will be sent to backend time series databases, if the backend configuration is set to as collected.

Metrics above the hard limit are still collected, but they can only be used in synthetic charts (once a metric is added to a chart, it will be sent to backend servers too).
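The two limits can be summarized with a small decision function. The exact boundary behavior (for example, whether the 200th chart is still a normal chart) is an assumption, and `private_chart_treatment` is a hypothetical name.

```python
def private_chart_treatment(existing_charts, soft_limit=200, hard_limit=1000):
    """Describe how a new metric is treated, given the private charts already created."""
    if existing_charts < soft_limit:
        return "normal chart"
    if existing_charts < hard_limit:
        # Chart exists, but netdata keeps no database for it.
        return "chart with memory mode none"
    # Still collected, but only usable in synthetic charts.
    return "no private chart"


for count in (10, 500, 5000):
    print(count, private_chart_treatment(count))
```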

Using the above configuration myapp should get its own section on the dashboard, having one chart with 2 dimensions.

[app] starts a new application definition. The supported settings in this section are:

name defines the name of the app.

metrics is a netdata simple pattern (space separated patterns, using * for wildcard, possibly starting with ! for negative match). This pattern should match all the possible statsd metrics that will be participating in the application myapp.

gaps when not collected = yes|no, enables or disables gaps on the charts of the application, when metrics are not collected.

memory mode sets the memory mode for all charts of the application. The default is the global default for netdata (not the global default for statsd private charts).

history sets the size of the round robin database for this application. The default is the global default for netdata (not the global default for statsd private charts).

[dictionary] defines name-value associations. These are used to rename metrics when they are added to synthetic charts. Metric names are also defined at each dimension line. However, with the dictionary, dimension names can be declared globally for each app, and it is the only way to rename dimensions when using patterns. Of course, the dictionary can be empty or missing.

Then, you can add any number of charts. Each chart should start with [id]. The chart will be called app_name.id. family controls the submenu on the dashboard. context controls the alarm templates. priority controls the ordering of the charts on the dashboard. The rest of the settings are informational.

You can add any number of metrics to a chart, using dimension lines. These lines accept 6 space separated parameters:

the metric name, as it is collected (it has to be matched by the metrics = pattern of the app)

the dimension name, as it should be shown on the chart

an optional selector (type) of the value to show (see below)

an optional multiplier

an optional divider

optional flags, space separated and enclosed in quotes. All the external plugins DIMENSION flags can be used. Currently the only usable flag is hidden, to add the dimension, but not show it on the dashboard. This is usually needed to have the values available for percentage calculation, or use them in alarms.

So, the format is this:

dimension = [pattern] METRIC NAME TYPE MULTIPLIER DIVIDER OPTIONS

pattern is a keyword. When set, METRIC is expected to be a netdata simple pattern that will be used to match all the statsd metrics to be added to the chart. So, pattern automatically matches any number of statsd metrics, all of which will be added as separate chart dimensions.

TYPE, MULTIPLIER, DIVIDER and OPTIONS are optional.
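To show how the fields of a dimension line fit together, here is a Python sketch that splits the value of a dimension line using shell-style quoting. It is not netdata's parser. The default TYPE of last comes from the text; the defaults of 1 for MULTIPLIER and DIVIDER are assumptions, and `parse_dimension` is a hypothetical name.

```python
import shlex


def parse_dimension(value):
    """Parse '[pattern] METRIC NAME TYPE MULTIPLIER DIVIDER OPTIONS'."""
    tokens = shlex.split(value)
    is_pattern = bool(tokens) and tokens[0] == "pattern"
    if is_pattern:
        tokens = tokens[1:]
    if not tokens:
        raise ValueError("a dimension line needs at least a METRIC")
    # Assumed defaults for the fields that were left out.
    defaults = ["", "", "last", "1", "1", ""]
    fields = tokens + defaults[len(tokens):]
    metric, name, dim_type, multiplier, divider, options = fields[:6]
    return {
        "pattern": is_pattern,
        "metric": metric,
        "name": name,
        "type": dim_type,
        "multiplier": int(multiplier),
        "divider": int(divider),
        "options": options.split(),
    }


print(parse_dimension("pattern 'myapp.api.*.200' '' last 1 1"))
print(parse_dimension("myapp.metric1 m1 events 1 1 'hidden'"))
```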

TYPE can be:

events to show the number of events received by statsd for this metric

last to show the last value, as calculated at the flush interval of the metric (the default)

Then for histograms and timers the following types are also supported:

min, show the minimum value

max, show the maximum value

sum, show the sum of all values

average (same as last)

percentile, show the 95th percentile (or any other percentile, as configured at statsd global config)

median, show the median of all values (i.e. sort all values and get the middle value)

When a dimension has a non-empty NAME, that name is looked up at the dictionary.

If the above lookup gives nothing, or the dimension has an empty NAME, the original statsd metric name is looked up at the dictionary.

If any of the above succeeds, netdata uses the value of the dictionary, to set the name of the dimension. The dimensions will have as ID the original statsd metric name, and as name, the dictionary value.

So, you can use the dictionary in 2 ways:

set dimension = myapp.metric1 '' and have at the dictionary myapp.metric1 = metric1 name

set dimension = myapp.metric1 'm1' and have at the dictionary m1 = metric1 name

In both cases, the dimension will be added with ID myapp.metric1 and will be named metric1 name. So, in alarms you can use either of the 2 as ${myapp.metric1} or ${metric1name}.
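The lookup order described above can be written as a short Python function. The fallback when neither lookup succeeds (use NAME if set, otherwise the metric name) is an assumption, and `dimension_name` is a hypothetical helper.

```python
def dimension_name(metric, name, dictionary):
    """Resolve the displayed dimension name using the dictionary lookup order."""
    if name and name in dictionary:
        return dictionary[name]       # 1. non-empty NAME found in the dictionary
    if metric in dictionary:
        return dictionary[metric]     # 2. original statsd metric name found
    return name or metric             # assumed fallback when nothing matches


# The two ways of using the dictionary give the same result.
print(dimension_name("myapp.metric1", "", {"myapp.metric1": "metric1 name"}))
print(dimension_name("myapp.metric1", "m1", {"m1": "metric1 name"}))
```

Both calls print `metric1 name`; the dimension ID stays `myapp.metric1` in either case.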

Keep in mind that if you add the same statsd metric to a chart multiple times, netdata will append TYPE to the dimension ID, so myapp.metric1 will be added as myapp.metric1_last or myapp.metric1_events, etc. If you add the same metric with the same TYPE to a chart multiple times, netdata will also append an incremental counter to the dimension ID, i.e. myapp.metric1_last1, myapp.metric1_last2, etc.

The above will add dimensions named 200, 400 and 500 (yes, netdata extracts the wildcarded part of the metric name - so the dimensions will be named with whatever the * matched). You can rename the dimensions with this:

Note that we added the NAME get. to the dimension line. This is prefixed to the wildcarded part of the metric name, to compose the key for looking up the dictionary. So 500 became get.500, which was looked up in the dictionary to find the value 500 cannot connect to db. This way we can have different dimension names for each of the API methods (i.e. get.500 = 500 cannot connect to db while post.500 = 500 cannot write to disk).
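The key composition described here can be sketched in Python. The pattern `myapp.api.get.*` is a hypothetical example, the fallback to the bare wildcard part when the dictionary has no entry is an assumption, and `pattern_dimension_name` is not a netdata function.

```python
import re


def pattern_dimension_name(pattern, metric, name_prefix, dictionary):
    """Name a pattern-matched dimension: NAME + wildcard part, looked up in the dictionary."""
    regex = "(.*)".join(re.escape(part) for part in pattern.split("*"))
    match = re.fullmatch(regex, metric)
    if match is None:
        return None  # the metric does not belong to this dimension line
    wildcard = "".join(match.groups())
    key = name_prefix + wildcard
    return dictionary.get(key, wildcard)


dictionary = {"get.500": "500 cannot connect to db"}
print(pattern_dimension_name("myapp.api.get.*", "myapp.api.get.500", "get.", dictionary))
print(pattern_dimension_name("myapp.api.get.*", "myapp.api.get.200", "get.", dictionary))
```

The first call prints `500 cannot connect to db`; the second prints `200`, since no dictionary entry exists for `get.200`.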

To add all API methods to a chart, do this:

[ok_by_method]
...
dimension = pattern 'myapp.api.*.200' '' last 1 1

The above will add get, post, del and all to the chart.

If all is not wanted (a stacked chart does not need the all dimension, since the sum of the dimensions provides the total), the line should be:

If you send just one value to statsd, you will notice that the chart is created but no value is shown. The reason is that netdata interpolates all values at second boundaries. For incremental values (counters and meters in statsd terminology), if you send 10 at 00:00:00.500, 20 at 00:00:01.500 and 30 at 00:00:02.500, netdata will show 15 at 00:00:01 and 25 at 00:00:02.

This interpolation is automatic and global in netdata for all charts, for incremental values. This means that for the chart to start showing values you need to send 2 values across 2 flush intervals.
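The numbers in the example above are what plain linear interpolation at whole-second boundaries produces. The Python sketch below reproduces them; it is an illustration of the idea, not netdata's interpolation code, and `interpolate_at_seconds` is a hypothetical name.

```python
import math


def interpolate_at_seconds(samples):
    """Linearly interpolate (timestamp, value) samples at whole-second boundaries.

    `samples` must be sorted by timestamp (in seconds).
    """
    points = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        second = math.floor(t0) + 1  # first whole second after t0
        while second <= t1:
            fraction = (second - t0) / (t1 - t0)
            points.append((second, v0 + (v1 - v0) * fraction))
            second += 1
    return points


samples = [(0.5, 10), (1.5, 20), (2.5, 30)]
print(interpolate_at_seconds(samples))      # [(1, 15.0), (2, 25.0)]
print(interpolate_at_seconds(samples[:1]))  # []
```

A single sample yields no points at all, which is why a chart needs two values before it starts showing data.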

(although this is required for incremental values, netdata allows mixing incremental and absolute values on the same charts, so this little limitation [i.e. 2 values to start visualization], is applied on all netdata dimensions).

(statsd metrics do not lose their first data collection due to interpolation anymore - fixed with PR #2411)