End-to-end tracing for additional message queues with OneAgent SDK

Message queues have become an important building block of modern architectures as they enable asynchronous communication between distributed applications. Producers can add requests to the queue without waiting for them to be processed. Consumers process messages only when they are available. Queues also help increase reliability as they make the data persistent, reducing errors that occur when different parts of your system go offline.

We’re committed to providing out-of-the-box support for the most popular messaging systems. Still, we sometimes receive requests from users for messaging systems that aren’t widely used and therefore aren’t yet supported. To meet this demand and enable end-to-end tracing of any queue or messaging system, we’ve now extended the OneAgent SDK for Java, .NET, and C/C++. Support for this feature in the OneAgent SDKs for Node.js and Python will follow.

Request tracing for any messaging system, end to end

At Dynatrace we use SuperDump, an open source project created internally, to automatically analyze crash dumps generated in our test systems. SuperDump speeds up the first assessment of a crash dump by automatically preparing the analysis. Our developers can then quickly determine if the issue is already known or if they need to act upon it. This makes SuperDump an important tool for our R&D, and one that needs to be monitored accordingly.

SuperDump uses Hangfire to schedule tasks such as downloading, analyzing, and processing the crash dumps. We’ve been monitoring SuperDump with OneAgent for a long time. Christoph Neumüller, the Dynatrace engineer maintaining this project, decided to use the OneAgent SDK to get visibility into Hangfire queues. Here’s the short snippet he added to his .NET code to monitor the download queue end to end.

Once Christoph redeployed his code, the download queue instantly appeared in Smartscape, our environment topology visualization tool. This information is placed in the context of your overall environment. Smartscape builds an interactive map showing how everything in the environment is interconnected, and you can clearly see which services are sending messages to the queue and which ones are processing the queue.

Of course, the download queue also appears in the service flow. You can see below that 564 requests were sent to that queue and that the average queue time is 27.8 ms.
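To make the idea behind these numbers more tangible, here is a minimal, self-contained sketch of how a message can be traced through a queue: the sender tags each message before enqueuing it, and the receiver reads the tag to link its work back to the sender and to work out how long the message waited. This is not the OneAgent SDK API and not Christoph’s snippet. The class `TracedQueue`, the header names, and the service names are all hypothetical and exist only for illustration.

```python
import queue
import time
import uuid
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Message:
    payload: str
    # Hypothetical headers that carry the trace context next to the payload.
    headers: dict = field(default_factory=dict)


class TracedQueue:
    """Toy in-process queue that records sender, receiver, and queue time."""

    def __init__(self, name, clock=time.monotonic):
        self.name = name
        self._clock = clock          # injectable so the timing can be tested
        self._queue = queue.Queue()
        self.queue_times_ms = []     # one entry per received message
        self.links = []              # (trace_id, producer, consumer)

    def send(self, producer, payload):
        # Producer side: tag the outgoing message before it is enqueued.
        headers = {
            "trace_id": uuid.uuid4().hex,
            "producer": producer,
            "enqueued_at": self._clock(),
        }
        self._queue.put(Message(payload, headers))
        return headers["trace_id"]

    def receive(self, consumer):
        # Consumer side: read the tag, link the work to the sender,
        # and record how long the message sat in the queue.
        msg = self._queue.get_nowait()
        waited_ms = (self._clock() - msg.headers["enqueued_at"]) * 1000.0
        self.queue_times_ms.append(waited_ms)
        self.links.append(
            (msg.headers["trace_id"], msg.headers["producer"], consumer)
        )
        return msg

    def average_queue_time_ms(self):
        # Average over everything received so far; 0.0 if nothing yet.
        return mean(self.queue_times_ms) if self.queue_times_ms else 0.0


if __name__ == "__main__":
    q = TracedQueue("download")
    q.send("web-frontend", "dump-1.zip")      # hypothetical service name
    q.receive("download-worker")              # hypothetical service name
    print(q.links)
    print(q.average_queue_time_ms())
```

The `links` list plays the role of the sender and receiver relationships shown in the topology view, and `average_queue_time_ms` corresponds to the average queue time metric. A real agent would carry the tag across process boundaries in the message itself rather than in shared memory.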

Best of all, the data captured by OneAgent is automatically taken into account in the analysis, including AI-based root-cause analysis. For example, when you analyze the response time on any service page, you’ll see all important queue interactions along with how much time each contributes to the overall response time.

What’s next?

Stay tuned for the upcoming availability of this feature in OneAgent SDKs for Node.js and Python.

The general availability (GA) of the OneAgent SDK will be announced later this year.

We’d love to hear your feedback! The OneAgent SDK is available on GitHub. All user contributions (from additional language bindings for the C/C++ SDK, to the reporting of minor defects, issues, or typos) are welcome. The best way for you to do this is via the GitHub repository issue tracker. You can also comment on our roadmap thread in AnswerHub.