
The prometheus manager module provides a Prometheus exporter to pass on Ceph performance counters
from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport
messages from all MgrClient processes (mons and OSDs, for instance)
with performance counter schema data and actual counter data, and keeps
a circular buffer of the last N samples. This plugin creates an HTTP
endpoint (like all Prometheus exporters) and retrieves the latest sample
of every counter when polled (or “scraped” in Prometheus terminology).
The HTTP path and query parameters are ignored; all extant counters
for all reporting entities are returned in text exposition format.
(See the Prometheus documentation.)
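
As a rough illustration of the behaviour described above, the following Python sketch serves a fixed set of samples in text exposition format and ignores the request path and query string. It is not the module's actual code: the sample data, the render helper, the ExporterHandler class and the serve function are all hypothetical, label values are not escaped, and the server binds only the default IPv4 wildcard address.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "latest sample" table: (metric name, labels, value).
LATEST = [
    ("ceph_osd_op_r", {"ceph_daemon": "osd.0"}, 1024.0),
    ("ceph_osd_op_r", {"ceph_daemon": "osd.1"}, 2048.0),
]


def render(samples):
    """Render samples as text exposition lines (no escaping, no timestamps)."""
    lines = []
    for name, labels, value in samples:
        if labels:
            label_text = ",".join(
                '%s="%s"' % (key, val) for key, val in sorted(labels.items())
            )
            lines.append("%s{%s} %s" % (name, label_text, value))
        else:
            lines.append("%s %s" % (name, value))
    return "\n".join(lines) + "\n"


class ExporterHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path (including any query string) is deliberately ignored:
        # every scrape returns every known sample.
        body = render(LATEST).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9283):
    """Start the sketch exporter; blocks until interrupted."""
    HTTPServer(("", port), ExporterHandler).serve_forever()
```

Calling serve() and then fetching any path on that port should return the same two sample lines.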

By default the module will accept HTTP requests on port 9283 on all
IPv4 and IPv6 addresses on the host. The port and listen address are both
configurable with ceph config-key set, with keys
mgr/prometheus/server_addr and mgr/prometheus/server_port.
This port is registered with Prometheus’s registry.

The names of the stats are exactly as Ceph names them, with
illegal characters ., - and :: translated to _,
and ceph_ prefixed to all names.
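
The naming rule above can be sketched in a few lines of Python. The function name promethize and the second and third counter names are illustrative assumptions, not necessarily what the module uses; the sketch only encodes the translation stated in the text.

```python
def promethize(ceph_name):
    """Translate a Ceph counter name into a Prometheus-legal metric name."""
    # Replace "::" first so it becomes a single underscore.
    name = ceph_name.replace("::", "_")
    for illegal in (".", "-"):
        name = name.replace(illegal, "_")
    return "ceph_" + name


print(promethize("osd.op_r"))
print(promethize("rocksdb.submit-latency"))
print(promethize("throttle::msgr"))
```

Note that distinct Ceph names can collapse to the same Prometheus name under this rule (for example "a.b" and "a-b"), which is worth keeping in mind when reading the exported names back.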

All daemon statistics have a ceph_daemon label such as “osd.123”
that identifies the type and ID of the daemon they come from. Some
statistics can come from different types of daemon, so when querying
e.g. an OSD’s RocksDB stats, you would probably want to filter
on ceph_daemon starting with “osd” to avoid mixing in the monitor
rocksdb stats.
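
To make the filtering concrete, here is a small Python sketch that keeps only the samples whose ceph_daemon label belongs to one daemon type. The sample data and the helper name from_daemon_type are hypothetical. In a Prometheus query the same effect would typically come from a regular-expression label matcher on ceph_daemon.

```python
# Hypothetical scraped samples: (metric name, labels, value).
samples = [
    ("ceph_rocksdb_get", {"ceph_daemon": "osd.0"}, 10.0),
    ("ceph_rocksdb_get", {"ceph_daemon": "osd.12"}, 7.0),
    ("ceph_rocksdb_get", {"ceph_daemon": "mon.a"}, 3.0),
]


def from_daemon_type(samples, daemon_type):
    """Keep samples whose ceph_daemon label is '<daemon_type>.<id>'."""
    prefix = daemon_type + "."
    return [
        sample
        for sample in samples
        if sample[1].get("ceph_daemon", "").startswith(prefix)
    ]


osd_only = from_daemon_type(samples, "osd")
for name, labels, value in osd_only:
    print(name, labels["ceph_daemon"], value)
```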

The cluster statistics (i.e. those global to the Ceph cluster)
have labels appropriate to what they report on. For example,
metrics relating to pools have a pool_id label.

See the Prometheus documentation for more information about constructing
queries.

Note that for this mechanism to work, Ceph and node_exporter must agree
about the values of the instance label. See the following section
for guidance about how to set up Prometheus in a way that sets
instance properly.

See the Prometheus documentation for full details of how to add
scrape endpoints: the notes
in this section are tips on how to configure Prometheus to capture
the Ceph statistics in the most usefully-labelled form.

This configuration is necessary because Ceph reports metrics
from many hosts and services via a single endpoint, and some of those
metrics (such as pool statistics) relate to no physical host.

By default, Prometheus applies an instance label that includes
the hostname and port of the endpoint that the series came from. Because
Ceph clusters have multiple manager daemons, this results in an instance
label that changes spuriously when the active manager daemon changes.

Set a custom instance label in your Prometheus target configuration:
you might wish to set it to the hostname of your first monitor, or something
completely arbitrary like “ceph_cluster”.
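
One way to attach a fixed instance label is through Prometheus file-based service discovery, which reads target groups from JSON or YAML files. The Python sketch below generates such a target group. The host names, the helper name ceph_target_groups and the choice of file-based discovery are assumptions for illustration; the scrape job still has to reference the generated file in its own configuration.

```python
import json


def ceph_target_groups(mgr_hosts, port=9283, instance="ceph_cluster"):
    """Build one target group that pins the instance label for all managers."""
    return [
        {
            "targets": ["%s:%d" % (host, port) for host in mgr_hosts],
            # A label set here replaces the default host:port instance value,
            # so the label stays stable when the active manager changes.
            "labels": {"instance": instance},
        }
    ]


# Hypothetical manager host names.
groups = ceph_target_groups(["mgr-a.example.com", "mgr-b.example.com"])
print(json.dumps(groups, indent=2))
```

The printed JSON can be written to a file and listed under the scrape job's file-based discovery settings.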

Counters and gauges are exported; currently histograms and long-running
averages are not. It’s possible that Ceph’s 2-D histograms could be
reduced to two separate 1-D histograms, and that long-running averages
could be exported as Prometheus’ Summary type.

Timestamps, as with many Prometheus exporters, are established by
the server’s scrape time (Prometheus expects that it is polling the
actual counter process synchronously). It is possible to supply a
timestamp along with the stat report, but the Prometheus team strongly
advises against this. This means that timestamps will be delayed by
an unpredictable amount; it’s not clear if this will be problematic,
but it’s worth knowing about.