Admin

Alerting

This guide assumes you have a basic understanding of the Prometheus resource and have read the getting started guide.

Besides the Prometheus and ServiceMonitor resources, the Prometheus Operator also introduces the Alertmanager resource, which lets you declaratively describe an Alertmanager cluster. Before deploying an Alertmanager cluster, it is important to understand the contract between Prometheus and Alertmanager.

Prometheus' configuration includes so-called rule files, which contain the alerting rules. When an alerting rule triggers, Prometheus fires that alert against all Alertmanager instances on every rule evaluation interval. The Alertmanager instances communicate with each other about which notifications have already been sent out. You can read more about why these systems have been designed this way in the High Availability scheme description.
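With that contract in mind, an Alertmanager cluster can be declared with a resource along the following lines. This is a sketch rather than a definitive manifest: the name example matches the one used later in this guide, while the apiVersion and the replica count of three are assumptions that depend on your operator version and availability needs.

```yaml
# Sketch of an Alertmanager resource; apiVersion and replicas are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example        # referenced later as alertmanager-example
spec:
  replicas: 3          # more than one instance for high availability
```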

The Alertmanager instances cannot start up unless a valid configuration is given. A minimal example configuration can send notifications to a non-existent webhook: it does not actually do anything useful, but it allows the Alertmanager to start up. Read more about how to configure the Alertmanager in the upstream documentation.
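A configuration of this kind might look like the sketch below. The webhook host alertmanagerwh and port 30500 are placeholders that are not expected to resolve, and the grouping and timing values are illustrative assumptions, not recommendations.

```yaml
# Minimal Alertmanager configuration sketch; all values are illustrative.
global:
  resolve_timeout: 5m
route:
  group_by: ['job']      # group alerts that share the same job label
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'webhook'    # default receiver for every alert
receivers:
- name: 'webhook'
  webhook_configs:
  # Placeholder URL: nothing is assumed to be listening here.
  - url: 'http://alertmanagerwh:30500/'
```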

Save this Alertmanager configuration in a file called alertmanager.yaml and create a Secret from it using kubectl.

The Alertmanager instance can only pick up its configuration if the Secret follows a specific naming convention: it looks for a Secret named alertmanager-{ALERTMANAGER_NAME}. In this example the Alertmanager is named example, so the Secret must be named alertmanager-example and the configuration file inside it alertmanager.yaml.
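The naming rule can also be seen in a declarative form of the Secret. The manifest below is a sketch of roughly what creating the Secret from the file with kubectl produces; it embeds a shortened, hypothetical configuration under the alertmanager.yaml key and could be applied with kubectl apply -f.

```yaml
# Sketch of the Secret the Alertmanager named "example" looks for.
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-example   # alertmanager-{ALERTMANAGER_NAME}
type: Opaque
stringData:
  # The key must be alertmanager.yaml; the content is a shortened placeholder config.
  alertmanager.yaml: |
    route:
      receiver: webhook
    receivers:
    - name: webhook
      webhook_configs:
      - url: http://alertmanagerwh:30500/
```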

To view the web UI, expose the Alertmanager via a Service of type NodePort. Once the Service is created, the web UI is accessible via any node's IP on port 30903.
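A Service that exposes the web UI on node port 30903 might be sketched as follows. The Service name alertmanager-example, the port name web, the service port 9093, and the selector label alertmanager: example are assumptions about how the operator labels and names things; check them against the pods in your cluster.

```yaml
# Sketch of a NodePort Service for the Alertmanager web UI.
apiVersion: v1
kind: Service
metadata:
  name: alertmanager-example
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30903      # port opened on every node
    port: 9093           # assumed Alertmanager web port
    protocol: TCP
    targetPort: web      # assumed name of the container port
  selector:
    alertmanager: example  # assumed pod label set by the operator
```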

This is now a fully functional, highly available Alertmanager cluster, but no alerts are fired against it yet. Let's set up Prometheus instances that will actually fire alerts to our Alertmanagers.
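A Prometheus resource for this purpose might look like the sketch below. The apiVersion, the namespace default, the replica count, and the Service name alertmanager-example with port name web are assumptions; the ruleSelector labels follow the labelling convention described later in this guide.

```yaml
# Sketch of a Prometheus resource that sends alerts to the Alertmanager Service.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
spec:
  replicas: 2                    # assumed; every replica fires the same alerts
  alerting:
    alertmanagers:
    - namespace: default         # assumed namespace of the Service
      name: alertmanager-example # must match the Service name
      port: web                  # must match the Service port name
  ruleSelector:
    matchLabels:
      role: prometheus-rulefiles
      prometheus: example
```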

This configuration specifies a Prometheus that discovers all of the Alertmanagers behind the Service created earlier. The name and port fields under alertmanagers must match those of that Service for discovery to work.

Prometheus rule files are held in ConfigMaps. The ConfigMaps from which rule files are mounted are selected with a label selector field called ruleSelector in the Prometheus object. All top-level files that end with the .rules extension are loaded.

The best practice is to label the ConfigMaps containing rule files with role: prometheus-rulefiles as well as the name of the Prometheus object, prometheus: example in this case.
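A ConfigMap following this labelling convention might look like the sketch below. The ConfigMap name, the file name example.rules, and the alert ExampleAlert are hypothetical. The rule body uses the Prometheus 2.x YAML rule syntax, which is an assumption; Prometheus 1.x uses a different rule syntax, so adapt it to the version you run.

```yaml
# Sketch of a rule file ConfigMap selected by the ruleSelector labels.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-example-rules   # hypothetical name
  labels:
    role: prometheus-rulefiles
    prometheus: example
data:
  # The key ends in .rules so that the file is loaded.
  example.rules: |
    groups:
    - name: example
      rules:
      - alert: ExampleAlert
        expr: vector(1)   # always returns a sample, so the alert always fires
```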

The Alertmanager web UI now shows a single active alert, even though every Prometheus instance is firing it. Configuring the Alertmanager further allows custom alert routing, grouping and notification mechanisms.