Configuring Datadog on Qubole

If the Datadog settings are entered at the account level, Datadog is started on non-Hadoop clusters as well. So, until we support all cluster types, we should recommend that customers configure Datadog at the cluster level only.

Our Datadog program works as follows:

Whenever the cluster is started from the UI, we create a dashboard and a set of alerts for that cluster. The dashboard is named "Cluster #id Dashboard". Unfortunately, there is currently no feedback that the Datadog dashboard has been created, and no link to it; this is something we will add in the future. To view the dashboard, the customer has to go to the dashboards list on the Datadog website and open the dashboard named "Cluster #id Dashboard".

Configuration:

We re-create the dashboard every time the cluster is started, so changes made directly to it are not preserved. If a user wants to change the dashboard, we advise them to clone it and make the changes in the clone.

The alerts are configured to send email to the addresses in the account notification list.

Once the cluster starts, a cron job on the master polls Ganglia every 4 minutes and pushes the collected metrics to Datadog. The list of metrics sent to Datadog can be extended by adding new metrics to the file /etc/metrics/ganglia_metrics_file.csv.
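The push half of that job can be pictured roughly as follows. This is only a sketch under assumptions: the note does not say how the metrics reach Datadog, so the example assumes Datadog's v1 "submit metrics" JSON body, and the function name, host name, and sample metrics are hypothetical.

```python
import time


def build_series_payload(metrics, host, now=None):
    """Build a request body in the shape of Datadog's v1 "submit metrics" API.

    metrics: dict of metric name -> numeric value, as polled from Ganglia.
    host: host name attached to every point (hypothetical, e.g. the master).
    now: optional Unix timestamp; defaults to the current time.
    """
    ts = int(now if now is not None else time.time())
    return {
        "series": [
            {
                "metric": name,
                "points": [[ts, float(value)]],
                "type": "gauge",
                "host": host,
            }
            for name, value in sorted(metrics.items())
        ]
    }


# Illustrative values only; these are not taken from a real cluster.
payload = build_series_payload({"cpu_user": 12.5, "bytes_in": 3}, "master-host")
```

Under the same assumption, the resulting body would be serialized to JSON and posted to Datadog's /api/v1/series endpoint along with the account's API key.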

Ganglia needs to be enabled for Datadog to work. Qubole does not enable Ganglia from the backend automatically when Datadog is enabled.

Master if it's a master-only metric. Empty if it's an aggregated metric such as cpu_report/memory_report.

/End hadoop1 specifics

Custom metrics:

We also want to support services that do not send metrics directly to Ganglia.

For this, we have a separate cron job (backed by "/etc/metrics/custom_metrics_file.csv") in which users can specify commands to be run periodically, with the output of each command sent to Ganglia under a well-defined name.

For example, if one of the lines in the file is:

"active", "echo 1", "int8"

We send a metric to Ganglia every 2 minutes with the metric name "custom.active" and the value "1" (the output of "echo 1"), and declare the datatype of the metric as int8.
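The handling of one such line can be sketched as below. It is a minimal illustration, not the actual cron job: the function names are hypothetical, and it assumes the metric is handed to Ganglia through the gmetric command-line tool, which the note does not confirm.

```python
import csv
import subprocess


def parse_custom_metric_line(line):
    """Split one line of the custom metrics file into (name, command, datatype)."""
    # skipinitialspace allows the quoted fields to be separated by ", ".
    name, command, datatype = next(csv.reader([line], skipinitialspace=True))
    return name, command, datatype


def run_command(command):
    """Run the user-supplied shell command and return its trimmed output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def gmetric_args(name, value, datatype):
    """Build a gmetric invocation (assumption: gmetric is the transport)."""
    return ["gmetric", "--name", "custom." + name,
            "--value", value, "--type", datatype]


name, command, datatype = parse_custom_metric_line('"active", "echo 1", "int8"')
args = gmetric_args(name, run_command(command), datatype)
print(args)  # the argument list a scheduler could pass to subprocess.run
```

The sketch only builds the argument list and does not call gmetric, so it can be run on a machine without Ganglia installed.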