OpenStack monitoring with Dynatrace is now GA

We’re happy to announce the General Availability (GA) of OpenStack monitoring with Dynatrace. This brings our long Early Access Program (EAP), which began in February 2017, and our analysis of customer requirements to a close. The Dynatrace OpenStack monitoring solution is GA as of Dynatrace version 1.162 and OneAgent version 1.161.

This blog post shows you how to get the most value out of Dynatrace monitoring when you use the OpenStack cloud to provision infrastructure components.

Enable full-stack monitoring of OpenStack environments easily

The structure of this article follows the discovery path, from application performance and availability monitoring, through the monitoring of underlying services, all the way to the supporting infrastructure and its management. In true Dynatrace fashion, we’ve placed our bets on ease of deployment and automation of discovery. How much you learn about your OpenStack-managed environment depends only on how deep you want to drill down and, in particular, on where you place OneAgents within your monitored environment.

Let’s explore the different levels of OpenStack awareness in the following steps, each of which provides additional insight into your OpenStack infrastructure:

Step 1: OneAgent on VMs managed by OpenStack—awareness of the OpenStack as an orchestration layer

Full-stack monitoring of applications with Dynatrace is only possible if you deploy full-stack OneAgents on important hosts in your environment, specifically those that host your applications’ services and resources.

When OneAgent is deployed on virtual machines operated by OpenStack, you can take advantage of the core Dynatrace APM capabilities: zero-configuration detection of applications, services, and problems, along with root cause analysis. In addition, Dynatrace identifies OpenStack as the enterprise cloud technology in use and provides information about the OpenStack compute node.

Smartscape analysis shows you how your VMs interact with each other and gives you an understanding of the vertical dependencies between your application components—virtual machines, processes, and services.

Step 2: OneAgent on OpenStack compute nodes

If needed, OneAgents can also be deployed on OpenStack compute nodes. In such cases, we recommend configuring OneAgents for cloud infrastructure-only monitoring mode. Compute nodes typically run no injectable technologies to monitor, and this mode reduces the number of host units that OneAgents consume.

When deployed on compute nodes, OneAgents provide valuable insight into the existence and resource allocation of VMs managed by OpenStack, as well as their availability, responsiveness, associated worker processes, I/O operations, and more.

Additionally, all OpenStack services running on the compute node are properly discovered and measured for availability and resource consumption.

Step 3: OneAgent on OpenStack controller nodes—awareness of services and their resource utilization for important OpenStack services

When OneAgents are deployed on OpenStack controller nodes, it’s possible to detect and monitor the remaining OpenStack services—those that are not typically found on compute nodes but are important elements of OpenStack.

Dynatrace provides out-of-the-box alerting on resource allocation and availability for these processes.

Step 4: Deep insight into OpenStack via plugins

Under the hood, OpenStack relies on several popular technologies that Dynatrace OneAgents can also monitor through their respective plugins. These technologies include RabbitMQ, MySQL, HAProxy, and Memcached. The plugins require additional configuration (namely, access to these services’ APIs) but, in return, provide technology-specific measurements.
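To give a feel for what "access to these services’ APIs" can mean in practice, here is a minimal Python sketch that reads node statistics from the RabbitMQ management HTTP API. It is an independent illustration, not the Dynatrace plugin itself. It assumes the management plugin is enabled and reachable at a base URL you supply, that the credentials you pass are valid, and that the node objects expose the `fd_used`, `fd_total`, `mem_used`, and `mem_limit` fields, which you should verify against your RabbitMQ version. The function names are hypothetical.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, List, Optional


def fetch_nodes(base_url: str, user: str, password: str) -> List[Dict[str, Any]]:
    """GET <base_url>/api/nodes with HTTP basic auth (base_url is an assumption)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/api/nodes",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def usage_ratio(node: Dict[str, Any], used_key: str, total_key: str) -> Optional[float]:
    """Return used/total for a node, or None when either value is missing or unusable."""
    used, total = node.get(used_key), node.get(total_key)
    if not isinstance(used, (int, float)) or not isinstance(total, (int, float)):
        return None
    if total <= 0:
        return None
    return used / total


def summarize(nodes: List[Dict[str, Any]]) -> Dict[str, Dict[str, Optional[float]]]:
    """Map each node name to its file descriptor and memory usage ratios."""
    return {
        node.get("name", "unknown"): {
            "fd": usage_ratio(node, "fd_used", "fd_total"),
            "memory": usage_ratio(node, "mem_used", "mem_limit"),
        }
        for node in nodes
    }
```

A call such as `summarize(fetch_nodes("http://localhost:15672", user, password))` would return one entry per RabbitMQ node. The host and port shown are the management plugin defaults and may differ in your deployment.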

To illustrate the challenges involved in monitoring the technologies that support OpenStack, here’s a problem we ran into within our own OpenStack environment. The RabbitMQ process in the example below was launched using the default file descriptor limit of 1024. Once this limit was exceeded, RabbitMQ stopped accepting new connections. This resulted in a Connectivity problem.

We wouldn’t have known about this problem if it weren’t for the RabbitMQ-specific measurements that Dynatrace provides. All details are included in the same view, so there is no need to use multiple tools to get the complete picture.
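The file descriptor limit described above can also be inspected directly on a Linux host. The following Python sketch is an illustration only and does not reflect how Dynatrace collects this data. It assumes a Linux `/proc` filesystem and sufficient permissions to read another process’s entries. The function names are hypothetical.

```python
import os
from typing import Optional, Tuple


def parse_soft_open_files_limit(limits_text: str) -> Optional[int]:
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits content."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft = line.split()[3]  # columns: Max open files <soft> <hard> <units>
            return None if soft == "unlimited" else int(soft)
    return None


def fd_saturation(open_fds: int, soft_limit: Optional[int]) -> Optional[float]:
    """Return the share of the soft limit in use, or None if there is no usable limit."""
    if not soft_limit:
        return None
    return open_fds / soft_limit


def inspect_process(pid: int) -> Tuple[int, Optional[int], Optional[float]]:
    """Read the open descriptor count and soft limit for a process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        limit = parse_soft_open_files_limit(handle.read())
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, limit, fd_saturation(open_fds, limit)
```

With a soft limit of 1024, a saturation value approaching 1.0 is the point at which a broker like the one in this example would start refusing new connections.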

Step 5: Log Analytics

Dynatrace comes with a powerful Log Analytics module that can be applied to monitor OpenStack services. When configured, it picks up symptoms of problems specific to OpenStack and takes them into account during root-cause analysis.

In the example below, the Log viewer has uncovered numerous warnings in the keystone.log file indicating that the authentication process has been failing.
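Outside of Dynatrace, a rough version of the same signal can be approximated by counting log lines per severity level. The Python sketch below is a simplified illustration, not the Log Analytics implementation. It assumes each log line contains one of the standard Python logging level names as a separate word, which you should check against the log format configured for your OpenStack services. The function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Standard Python logging level names, matched as whole words.
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def count_levels(lines: Iterable[str]) -> Counter:
    """Count log lines by the first severity level name found on each line."""
    counts: Counter = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def warning_share(counts: Counter) -> float:
    """Return the fraction of leveled lines that are warnings (0.0 if none)."""
    total = sum(counts.values())
    return counts["WARNING"] / total if total else 0.0
```

To try it, open the log file and pass the file object to `count_levels`. The location of keystone.log depends on your installation. A sudden rise in the warning share is only a hint that something is wrong; the cause still has to be correlated with host and process metrics.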

In this particular case, the root cause of these problems was related to memory saturation on the controller node. As illustrated below, the memory was indeed exhausted: it had reached almost 100% saturation.

Note further down in the Processes section that all OpenStack services running on the controller are listed. You can click any of these processes to analyze its connections and understand its relationship to other processes.

The Log Analytics module is fully configurable. Below are a dozen example configurations that can be easily changed and adapted to your local OpenStack environment. They were tested to work with older versions of OpenStack, so some updates might be required for more recent releases.

Potential improvements and further steps

When we defined the original scope of the OpenStack monitoring EAP, we developed a number of specific plugins for OpenStack services. The goal of these plugins was to provide additional insight into specific metrics for Keystone, Horizon, and Glance. They are currently not part of the out-of-the-box solution, but they can be retrofitted and included in OneAgents, with some effort required to expose their respective configurations.

The data provided by these plugins can also be analyzed by Dynatrace AI and taken into account during root-cause analysis. It can likewise be used for alerting and for integrations with external services.

We want to hear from you

We’re always happy to receive your feedback and ideas. Reach out to us via Dynatrace Answers, Dynatrace Support, or your Dynatrace representative to share your thoughts with us. Please let us know how you are using OpenStack infrastructure monitoring by Dynatrace in your environment and how we can make it even better.

Bartek is Senior Technical Product Manager at Dynatrace. He's worked on APM and ALM products for over 17 years. His background is in software development, but he's also passionate about user experience, process optimization, and delivering value to customers. Outside of work, Bartek enjoys travel, photography, and jogging, and is CEO of a retro gaming and computing society that organizes public events and exhibitions. He is the proud father of three daughters.