Advanced RabbitMQ Support Part II: Deeper Insight into Queues

Introduction

The most important and critical elements of any RabbitMQ installation are the queues. Queues retain messages specific to different use cases across industrial sectors such as telecommunications, financial systems, automotive, and so forth. Queues, and their adherence to AMQP, are essentially “why” RabbitMQ exists. Not only do they retain messages until consumption; internally, they also implement some of the most complex mechanisms for guaranteeing efficient message propagation through the fabric, while catering for additional requirements such as high availability, message persistence, regulated memory utilisation, and so forth.

So queues are, in general, the main focal point of any RabbitMQ installation. This is why RabbitMQ users and support engineers often find themselves carrying out regular checks on queues, as well as ensuring that their host Rabbit nodes are precisely configured to guarantee efficient message queueing operations. Typical questions that arise from RabbitMQ users and support engineers are:

… how many messages are in Queue “A”?

… how many messages are pending acknowledgement in Queue “K”?

… how many consuming clients are subscribed to Queue “R”?

… how much memory is Queue “D” using?

… how many messages in Queue “F” are persisted on disk?

… is Queue “E” alive?
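Most of the questions above correspond to standard queue info items reported by `rabbitmqctl list_queues`: `messages`, `messages_unacknowledged`, `consumers`, `memory`, `messages_persistent` and `state`. The following is a minimal sketch, not part of WombatOAM, which assumes that tab-separated output for exactly these columns has already been captured with any header lines stripped. The sample rows and the `parse_list_queues` helper are made up for illustration.

```python
# Info items as named by `rabbitmqctl list_queues`, one per question above.
COLUMNS = ["name", "messages", "messages_unacknowledged", "consumers",
           "memory", "messages_persistent", "state"]


def parse_list_queues(output):
    """Parse tab-separated rows into {queue_name: {info_item: value}}."""
    queues = {}
    for line in output.strip().splitlines():
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            continue  # skip rows that do not have the expected columns
        row = dict(zip(COLUMNS, fields))
        name = row.pop("name")
        # Every item except `state` is an integer counter (memory is in bytes).
        queues[name] = {k: (v if k == "state" else int(v)) for k, v in row.items()}
    return queues


# Made-up sample output for two queues (assumption, not real data).
SAMPLE = (
    "A\t120\t15\t3\t148000\t100\trunning\n"
    "K\t0\t0\t1\t34000\t0\trunning\n"
)

stats = parse_list_queues(SAMPLE)
print(stats["A"]["messages"])   # how many messages are in Queue "A"?
print(stats["K"]["state"])      # is Queue "K" alive?
```

This kind of manual check works for a handful of queues, but quickly becomes tedious, which is where a dedicated monitoring agent helps.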

Within RabbitMQ, the implementation of a queue combines multiple aspects: the behaviour specification governing its operation (internally, what is known as the backing queue behaviour), the transient/persistent message store components, and, most importantly, the queue process dictating all the logic and mechanics of queueing. From these, a number of attributes exist which indicate the current state of the queue. Some of these queue attributes are illustrated below:

Fig 1: RabbitMQ Queue Attributes

WombatOAM

As of WombatOAM 2.7.0, the WombatOAM-RabbitMQ plugin ships with an additional agent, the RabbitMQ Queues agent. This agent has been designed and developed to monitor and acquire metrics specific to queues, and to present them to RabbitMQ users in a user-friendly manner. Two modes of operation are supported:

Dynamic operation: Queues existing on the monitored node whose names match a user-defined regex are dynamically loaded by WombatOAM for monitoring.
Static operation: Specific queues are configured and monitored as defined in the WombatOAM RabbitMQ configuration.

Configuration

The manner in which this agent operates and presents metrics depends solely on how it has been configured.

1. Dynamic operation

Dynamic monitoring of queues is configured by defining a match specification against which queue names are matched, together with the desired attribute/metric to collect from each matched queue. For example, to monitor the memory usage of all queues, the following configuration may be defined in the wombat.config file:

This captures all queues on the node being monitored and presents the memory metric for each of them.
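The selection step of dynamic mode can be sketched in Python as follows. This is only an illustration of regex-based matching and is not the wombat.config syntax. The `select_queues` helper and the `AUDIT.LOG` queue name are hypothetical, and whether WombatOAM applies a full or a partial match is an assumption here (the sketch uses a full match).

```python
import re


def select_queues(queue_names, pattern):
    """Return the queue names that fully match the user-defined regex."""
    regex = re.compile(pattern)
    return [name for name in queue_names if regex.fullmatch(name)]


# Queue names present on the monitored node (AUDIT.LOG is made up).
queues = ["SERVICES.QUEUE", "EVENTS.QUEUE", "AUDIT.LOG"]

# A catch-all pattern selects every queue, as in the "all queues" example.
all_queues = select_queues(queues, r".*")

# A narrower pattern only selects queues whose names end in ".QUEUE".
dot_queue = select_queues(queues, r".*\.QUEUE")

print(all_queues)
print(dot_queue)
```

Because the match is evaluated against whatever queues exist on the node, queues declared later that match the pattern would be picked up without changing the configuration.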

Fig 2: RabbitMQ Dynamic Queue Metrics

2. Static operation

In static mode, users explicitly specify the queues and the corresponding attributes/metrics they would like to monitor in wombat.config. A complete static configuration entry consists of the Queue Name, Virtual Host, and the Attribute being measured. For example, to monitor the number of messages, the number of consumers and the memory utilisation of SERVICES.QUEUE, and the number of messages only for EVENTS.QUEUE, a user may specify the following configuration in the wombat.config file:

Configuring static queues is extremely important if you have mission-critical queues for which you need continuous visibility of metrics such as message counts and memory usage.
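To make the shape of a static entry concrete, the sketch below models each entry as a (queue name, virtual host, attribute) triple in Python. It is not the wombat.config syntax. The `StaticEntry` type and the `attributes_by_queue` helper are hypothetical, and the default virtual host `/` is an assumption, since the text does not say which virtual host the two queues live in.

```python
from typing import NamedTuple


class StaticEntry(NamedTuple):
    queue: str       # Queue Name
    vhost: str       # Virtual Host
    attribute: str   # Attribute being measured


# One entry per (queue, attribute) pair from the example above.
ENTRIES = [
    StaticEntry("SERVICES.QUEUE", "/", "messages"),
    StaticEntry("SERVICES.QUEUE", "/", "consumers"),
    StaticEntry("SERVICES.QUEUE", "/", "memory"),
    StaticEntry("EVENTS.QUEUE", "/", "messages"),
]


def attributes_by_queue(entries):
    """Group the monitored attributes by (vhost, queue)."""
    grouped = {}
    for entry in entries:
        grouped.setdefault((entry.vhost, entry.queue), []).append(entry.attribute)
    return grouped


monitored = attributes_by_queue(ENTRIES)
print(monitored[("/", "SERVICES.QUEUE")])
print(monitored[("/", "EVENTS.QUEUE")])
```

Unlike dynamic mode, nothing is monitored unless it is listed explicitly, which is what makes static entries a good fit for a small set of mission-critical queues.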

The following illustrates an example of static mode:

Fig 3: RabbitMQ Static Queue Metrics

Taking “things” further!

Coupling our discussion of monitoring queues with the discussion in Part I of this series on advanced alarming for RabbitMQ operations, imagine how many alarming cases we could cover by defining alarms specific to queue metrics.

WombatOAM provides us not only with useful metrics, but also with a huge spectrum of alarming cases we could handle. Imagine how useful the following alarms would be:

“an alarm which, when triggered, would send your team email notifications indicating that the number of messages in your most critical SERVICE.QUEUE has just reached the 500 000 message limit, without messages being consumed”

Or:

“an alarm configured to issue email notifications when the number of consuming clients falls below a certain minimum permissible number, indicating there’s a critical service-affecting problem on the client end”

or even more interesting:

“an alarm and email notification issued when a queue’s individual memory usage exceeds a certain cap value, beyond which would be an indication of one or more problems manifesting in the cluster.”

Defining such alarms could be as simple as configuring the following in wombat.config as illustrated here.

Fig 4: RabbitMQ Queue Alarms
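The logic behind the three alarms described above can be sketched as simple threshold rules over queue metrics. This Python sketch is not the wombat.config alarm syntax and does not model email delivery. The rule layout, the alarm identifiers, the `raised_alarms` helper, and the consumer and memory thresholds are all hypothetical; only the 500 000 message limit comes from the text.

```python
import operator

# (queue, attribute, comparison, threshold, alarm id): hypothetical rule layout.
RULES = [
    ("SERVICE.QUEUE", "messages", operator.ge, 500_000, "queue_backlog"),
    ("SERVICE.QUEUE", "consumers", operator.lt, 2, "too_few_consumers"),
    ("SERVICE.QUEUE", "memory", operator.gt, 1_000_000_000, "queue_memory_high"),
]


def raised_alarms(samples, rules=RULES):
    """Return the ids of all rules whose threshold condition currently holds."""
    alarms = []
    for queue, attribute, compare, threshold, alarm_id in rules:
        value = samples.get(queue, {}).get(attribute)
        if value is not None and compare(value, threshold):
            alarms.append(alarm_id)
    return alarms


# Made-up metric sample: the backlog has just hit the limit, the rest is healthy.
samples = {
    "SERVICE.QUEUE": {"messages": 500_000, "consumers": 5, "memory": 20_000_000}
}
print(raised_alarms(samples))
```

Each rule is evaluated independently, so a single sample can raise several alarms at once, for example a growing backlog together with a drop in consumers.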

Conclusion

So with these capabilities in mind, imagine the total number of queue-specific metrics attainable for monitoring on WombatOAM. The number can be immense, limited only by the number of queues you have running and the number of attributes you have configured for monitoring. All this depends on your configuration. To be precise, a total of 16 attributes are configurable per queue on WombatOAM, meaning that 16 x N queue-specific metrics are attainable for N queues (Wow!). So imagine a RabbitMQ installation with ~50 or more queues: that’s ~50 x 16 = a staggering 800 metrics!

Since the number of available queue metrics has the potential to be extremely large, WombatOAM also provides the ability to order queues as desired. The rate at which metrics are acquired is also configurable. If you wish to reduce the frequency at which metrics are gathered (which is recommended when you have an extremely large number of queues and queue metrics configured), you can do so by simply updating the configuration.
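A quick back-of-the-envelope calculation shows why the collection rate matters as the number of queues grows. The Python sketch below uses the 16 attributes per queue stated above; the helper names and the collection intervals are hypothetical values chosen for illustration, not WombatOAM defaults.

```python
ATTRIBUTES_PER_QUEUE = 16  # configurable attributes per queue, as stated above


def max_queue_metrics(queue_count, attributes=ATTRIBUTES_PER_QUEUE):
    """Upper bound on queue-specific metrics: attributes x N queues."""
    return queue_count * attributes


def samples_per_minute(metric_count, interval_seconds):
    """Number of metric samples gathered per minute at a given interval."""
    return metric_count * 60 / interval_seconds


metrics = max_queue_metrics(50)
print(metrics)                            # 50 queues x 16 attributes
print(samples_per_minute(metrics, 60))    # collecting once a minute
print(samples_per_minute(metrics, 300))   # collecting once every five minutes
```

Lengthening the interval reduces the collection load proportionally, at the cost of coarser visibility into short-lived spikes.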