Integrating Prometheus alerts and events with Sysdig Monitor

Prometheus alerts: Sysdig ♥ Prometheus (part II)

If you already use (or plan to use) Prometheus alerts and events for application performance monitoring in your Docker / Kubernetes containers, you can easily integrate them with Sysdig Monitor via the Alertmanager daemon. In this post we will showcase that integration.

Prometheus provides its own alerting system using a separate daemon called Alertmanager. What happens if you already have a Prometheus monitoring infrastructure for APM and plan to integrate the Sysdig container intelligence platform? (Or the other way around.)

They can work together without any migration or complex adaptation efforts. Actually, there is a lot to be gained from combining the application-specific custom Prometheus monitoring that your developers love with the deep container and service visibility provided by Sysdig.

These two contexts together add more dimensions to your monitoring data. To illustrate what we mean: you can easily detect that the MapReduce function on your backend container is taking longer than usual because kubernetes.replicas.running < kubernetes.replicas.desired, which means that horizontal container scaling is failing and, thus, the container that fired the alarm is receiving an order of magnitude more work.

Metrics Exporters, Prometheus, Alertmanager & Sysdig integration

Docker monitoring scenario

To put things in context, let's assume that you already have a Docker environment instrumented with Prometheus:

It could be Swarm, Kubernetes, Amazon ECS… whatever you're using, our integration with Prometheus works the same way.

Simplifying a bit: you have several exporters that emit metrics, and the Prometheus server aggregates them and checks the alert conditions; if those conditions are met, it sends the configured alert to Alertmanager. Alertmanager is in charge of alert filtering, silencing and cooldown times, and also of sending the alert notifications to its receivers: email and Slack chat in our example.
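
To picture this starting point, here is a minimal sketch of what such a pre-existing Alertmanager configuration might look like. The SMTP host, addresses and Slack details are illustrative placeholders, not values from this post:

```yaml
# alertmanager.yml - a minimal sketch of a pre-existing setup
route:
  receiver: team-notifications   # default receiver for every alert

receivers:
  - name: team-notifications
    email_configs:
      - to: 'ops@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'
        channel: '#alerts'
```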

One of the available receivers for Alertmanager is a webhook. This method boils down to HTTP POSTing a JSON data structure, and its simplicity and standard format provide a lot of flexibility to integrate any pair of producer / consumer software.
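
For reference, the JSON that Alertmanager POSTs to a webhook receiver follows a well-known structure; the concrete alert names, hosts and timestamps below are just illustrative:

```json
{
  "version": "4",
  "groupKey": "{}:{alertname=\"ExporterDown\"}",
  "status": "firing",
  "receiver": "sysdig-webhook",
  "groupLabels": { "alertname": "ExporterDown" },
  "commonLabels": { "alertname": "ExporterDown", "severity": "warning" },
  "commonAnnotations": { "description": "Prometheus exporter down" },
  "externalURL": "http://alertmanager:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": { "alertname": "ExporterDown", "instance": "pythonmetrics:5000" },
      "annotations": { "kubernetes_namespace_name": "example-app" },
      "startsAt": "2017-08-23T10:21:42.000Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://prometheus:9090/graph?..."
    }
  ]
}
```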

Accordingly, this is what we want to deploy:

A new webhook Alertmanager receiver that retrieves the JSON, reformats it to fit the Sysdig API and uploads the alert data to Sysdig Monitor. It is really just that. The interesting bit is that you don't have to modify the monitoring / alerting infrastructure you already have: it's just a new data output.
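
On the Prometheus side, nothing special is needed either. As a reference, a minimal configuration sketch for this scenario might look like the following (the pythonmetrics exporter and the ports are illustrative):

```yaml
# prometheus.yml - a minimal sketch for this scenario
global:
  scrape_interval: 15s

rule_files:
  - /etc/prometheus/alert.rules.yml   # alerting rules, shown below

scrape_configs:
  - job_name: pythonmetrics           # our example metrics exporter
    static_configs:
      - targets: ['pythonmetrics:5000']

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```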

This configuration requires that the Prometheus container is able to resolve the pythonmetrics and alertmanager hostnames. Don't worry much about it: using the docker-compose file we provide below, everything should work out of the box.
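
The next ingredient is the Prometheus alert rule itself. Here is an illustrative sketch (the alert name, expression and annotation values are assumptions for this example; note the underscores in the annotation names, which we come back to below):

```yaml
# alert.rules.yml - an illustrative Prometheus alerting rule
groups:
  - name: example
    rules:
      - alert: ExporterDown
        expr: up{job="pythonmetrics"} == 0
        for: 1m
        annotations:
          description: "The pythonmetrics exporter cannot be scraped"
          kubernetes_namespace_name: "example-app"   # '_' becomes '.' on the Sysdig side
```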

Nothing too surprising here: it has a name, a description, a condition to fire expressed in the internal Prometheus language and a time window over which to evaluate it.

We are going to use the annotations as the scope for our Sysdig alert. Our webhook script will also translate '_' characters to '.' (dots are not valid in Prometheus annotation names), so we can use exactly the same names as native Sysdig alerts.
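
A minimal sketch of such a webhook receiver, written in Python with Flask, could look like the following. Treat it as a sketch rather than the exact script from this post: the /alert route, the port and the severity value are illustrative assumptions, and the events endpoint and payload shape are based on the public Sysdig Monitor events API:

```python
# webhook.py - a minimal sketch of the Alertmanager -> Sysdig relay
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

SYSDIG_EVENTS_URL = "https://app.sysdigcloud.com/api/events"
SYSDIG_TOKEN = os.environ["SYSDIG_TOKEN"]  # Sysdig Monitor API token


@app.route("/alert", methods=["POST"])
def alert():
    payload = request.get_json()
    # Alertmanager may batch several alerts in one notification
    for fired in payload.get("alerts", []):
        # Translate '_' back to '.' so annotation names match native Sysdig ones
        tags = {key.replace("_", "."): value
                for key, value in fired.get("annotations", {}).items()}
        event = {
            "event": {
                "name": fired["labels"].get("alertname", "prometheus-alert"),
                "description": tags.pop("description", ""),
                "severity": 4,  # medium severity, illustrative choice
                "tags": tags,   # annotations become the Sysdig event scope
            }
        }
        requests.post(
            SYSDIG_EVENTS_URL,
            json=event,
            headers={"Authorization": "Bearer " + SYSDIG_TOKEN},
        )
    return jsonify(status="ok")


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

Alertmanager POSTs the whole notification to this endpoint, and the script fans out each individual alert as a custom event in Sysdig Monitor.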

Alertmanager & webhook receivers – Prometheus alerts integration

The third piece of this puzzle is the Alertmanager; you can read about its configuration here. In particular, you can use the different routes and receivers in the routing tree to filter and classify the alerts and, for example, deliver alerts from different parts of your infrastructure to separate Sysdig teams.

In this example we are just going to configure a default webhook receiver. A minimal alertmanager.yml sketch, assuming our relay script is reachable at the hypothetical webhook:8080 address, could look like this:
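
```yaml
# alertmanager.yml - default route pointing at our webhook relay
route:
  receiver: sysdig-webhook

receivers:
  - name: sysdig-webhook
    webhook_configs:
      - url: 'http://webhook:8080/alert'
```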