Application Monitoring

A clear overview of an infrastructure's health requires a mix of cluster monitoring and application monitoring. While most Kubernetes clusters have similar monitoring needs, the applications running as workloads on Kubernetes all differ. Custom applications expose custom metrics and therefore have their own monitoring needs.

Instrumentation

For a Prometheus server to collect metrics from an application, that application must be instrumented. Each target must expose an HTTP endpoint that the Prometheus server requests when collecting metrics. Client libraries exist for many languages; they transform in-memory metric values into the text format Prometheus expects.
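To make the idea of an instrumented endpoint concrete, here is a minimal sketch that uses only the Python standard library instead of an official client library. It keeps a counter in memory and renders it in the text format on a /metrics path. The `code` label name, the port, and all function names are assumptions for illustration; a real application would normally use a client library.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory counter values keyed by HTTP status code (hypothetical data model).
_counts = {}
_lock = threading.Lock()


def record_request(code):
    """Increment the counter for one handled request with the given status code."""
    with _lock:
        _counts[code] = _counts.get(code, 0) + 1


def render_metrics():
    """Render the in-memory counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    with _lock:
        for code in sorted(_counts):
            # The label name "code" is an assumption, not taken from the sample app.
            lines.append('http_requests_total{code="%s"} %d' % (code, _counts[code]))
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the rendered metrics on /metrics and 404 on every other path."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Block forever, serving metrics on the given port (port is an assumption)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `record_request("200")` from the application's request path and running `serve()` in a background thread would give a Prometheus server something to scrape.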

Prometheus requires access to the Kubernetes API to discover Pods, so it needs sufficient RBAC permissions. Tectonic Monitoring ships a ClusterRole that grants these permissions. All that is needed, therefore, is a ServiceAccount and a ClusterRoleBinding that binds the ServiceAccount to that ClusterRole.
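The two objects can be sketched as plain data. The following example builds a ServiceAccount and a ClusterRoleBinding as Python dictionaries and wraps them in a List that could be printed as JSON and piped to kubectl. The ServiceAccount name and the ClusterRole name used in the usage note are placeholders, not the names Tectonic Monitoring actually ships, and the `rbac.authorization.k8s.io/v1` API version is an assumption that may need adjusting for older clusters.

```python
import json


def rbac_manifests(namespace, service_account, cluster_role):
    """Build a ServiceAccount and a ClusterRoleBinding tying it to cluster_role."""
    account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": service_account, "namespace": namespace},
    }
    binding = {
        # Assumed API version; older clusters may only serve a beta version.
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": service_account},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
    }
    return [account, binding]


def as_list_json(manifests):
    """Wrap the manifests in a v1 List and serialize them as JSON."""
    doc = {"apiVersion": "v1", "kind": "List", "items": manifests}
    return json.dumps(doc, indent=2)
```

For instance, `print(as_list_json(rbac_manifests("frontend", "prometheus-frontend", "example-cluster-role")))` would emit JSON suitable for `kubectl apply -f -`, once `example-cluster-role` is replaced with the name of the shipped ClusterRole.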

As part of Tectonic Monitoring, an Alertmanager cluster is deployed in the tectonic-system namespace. Alerts generated by every Prometheus instance are meant to be sent to that Alertmanager cluster, so all Prometheus objects should have the same alerting section as above.

Alerting

In the Prometheus object described above, the ruleSelector field selects the ConfigMaps from which .rules files are loaded. For a ConfigMap to be loaded by that Prometheus object, it must therefore carry the prometheus=frontend label.

This alert fires if more than ten HTTP requests have resulted in a 404 response code. Note that the threshold and other characteristics of this alerting rule may not reflect your needs; it is merely an example.
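The firing condition can be illustrated outside Prometheus. The sketch below is not the alerting rule itself; it only mirrors the "more than ten 404 responses" condition in Python over a list of samples. The sample representation (a labels dictionary plus a value) and the `code` label name are assumptions.

```python
def count_404(samples):
    """Sum the values of all samples whose assumed "code" label equals "404"."""
    return sum(value for labels, value in samples if labels.get("code") == "404")


def alert_fires(samples, threshold=10):
    """Return True when strictly more than `threshold` requests returned 404."""
    return count_404(samples) > threshold
```

With this sketch, exactly ten 404 responses do not fire the alert; the eleventh does.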

The sample application exposes the http_requests_total metric, which is incremented on every HTTP request. To generate some data, make sample requests. For example, run kubectl proxy, then request http://localhost:8001/api/v1/proxy/namespaces/frontend/services/example-app:web/ to generate responses with a 200 response code, and http://localhost:8001/api/v1/proxy/namespaces/frontend/services/example-app:web/err to generate responses with a 404 response code.
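Sending these requests by hand gets tedious, so a small script can help. The following sketch uses the Python standard library to request either URL repeatedly and collect the status codes. It assumes kubectl proxy is already running on localhost:8001; the function names and the timeout are illustrative choices.

```python
import urllib.error
import urllib.request

# Proxy URL of the sample application's service, as given in the text.
BASE = "http://localhost:8001/api/v1/proxy/namespaces/frontend/services/example-app:web/"


def target_url(error=False):
    """Return the URL expected to answer 200, or the /err URL expected to answer 404."""
    return BASE + ("err" if error else "")


def send_requests(n, error=False):
    """Send n GET requests and return the list of HTTP status codes observed."""
    codes = []
    for _ in range(n):
        try:
            with urllib.request.urlopen(target_url(error), timeout=5) as resp:
                codes.append(resp.status)
        except urllib.error.HTTPError as exc:
            # urllib raises on 4xx/5xx responses; the status code is still recorded.
            codes.append(exc.code)
    return codes
```

Calling `send_requests(11, error=True)` would be expected to push the 404 count past the example threshold of ten, provided the proxy and the sample application are reachable.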