There are various places in Tuleap where message queues are needed, but the primary use is running jobs in the background.

Historically, it was for all system-related stuff (creating users, git/svn repositories, etc.) that required special Unix permissions we were not eager to grant to a web app.

Back then, we built a simple queue in the database, with a cron job that consumed events every minute. I can see your eyes rolling, but 10 years ago it was not that bad. As it was for system-related stuff, we called it SystemEvents. It was fun, and we improved that part by giving it more queues (one dedicated to git, for instance).

RabbitMQ as a dedicated queue system

More recently we had a need to share events across servers, for distributed Tuleap setups for instance. Our good old system was no longer able to deal with that, because we needed real queue management that works across servers. We chose RabbitMQ because we were looking for a queue system, its PHP support was quite decent, and the tutorials were good.

However, we had to set up a rather complicated queue system. We needed to guarantee that each and every event is consumed only once per worker. To make a long story short, we built an Exchange to Exchange binding.

Exchange to Exchange binding topology (from http://skillachie.com)

We made the following PHP implementation (simplified here).

In PersistentQueue we manage the connection to RabbitMQ and deal with events (pushing them and listening for them).

In terms of lines of code that's not too much, but the concepts are rather complex (you need to know what an Exchange, a routing_key, a topic and a binding are). To be fair, each and every time we had to touch this code, we had to go back to the initial schema.

Enter Redis

In the meantime we learnt about the Redis RPOPLPUSH command, and it changed everything.

OK, it didn't change everything at first, because we didn't scroll down the page and we were like "hmm, how do you pronounce that?". The real thing is a little further down the page: Pattern: Reliable queue.

The application in Tuleap is:

one or several workers are waiting (BRPOPLPUSH) for events in event_queue

an agent sends an event to event_queue with some payload

one of the workers gets the event, which is atomically moved into a processing_queue (one queue per worker)

when the worker has completed its work, the event is removed from processing_queue

if the worker crashes while doing the work, the processing_queue keeps the event

when the worker restarts, all events are re-queued from processing_queue back to event_queue
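The steps above can be sketched as follows. This is a Python illustration, not the Tuleap PHP code: FakeRedis is a hypothetical in-memory stand-in for the few list commands involved, so the sketch runs without a server. The Worker class, the processing_queue:<name> key naming and the use of LREM to drop a finished event are assumptions made for the example. A real worker would block on BRPOPLPUSH instead of polling.

```python
from collections import deque

EVENT_QUEUE = "event_queue"


class FakeRedis:
    """Hypothetical in-memory stand-in for the Redis list commands used here."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, value):
        # Push onto the head of the list, like Redis LPUSH.
        self.lists.setdefault(key, deque()).appendleft(value)

    def rpoplpush(self, source, destination):
        # Move the tail of source to the head of destination; None if empty.
        src = self.lists.get(source)
        if not src:
            return None
        value = src.pop()
        self.lists.setdefault(destination, deque()).appendleft(value)
        return value

    def lrem(self, key, count, value):
        # Simplification: `count` is ignored, only the first match is removed.
        items = self.lists.get(key)
        if items and value in items:
            items.remove(value)
            return 1
        return 0

    def llen(self, key):
        return len(self.lists.get(key, ()))


class Worker:
    """Illustrative worker with its own processing queue (naming is assumed)."""

    def __init__(self, redis, name):
        self.redis = redis
        self.processing_queue = f"processing_queue:{name}"

    def requeue_unfinished(self):
        # On (re)start, move events left behind by a crash back to event_queue.
        count = 0
        while self.redis.rpoplpush(self.processing_queue, EVENT_QUEUE) is not None:
            count += 1
        return count

    def process_one(self, handler):
        # Take one event and park it in the processing queue in a single step.
        event = self.redis.rpoplpush(EVENT_QUEUE, self.processing_queue)
        if event is None:
            return False
        handler(event)  # if this raises, the event stays in processing_queue
        self.redis.lrem(self.processing_queue, 1, event)
        return True
```

The point of the design is that an event is always in exactly one list: it only leaves processing_queue once the handler has returned, so a crash in the middle of the work cannot lose it.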

As you can see, the implementation is rather straightforward. The main pitfall we hit is that, by default, the PHP socket timeout is 60s, and reaching it raises a RedisException. We chose to keep the default timeout and to catch the exception in order to reconnect.
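The catch-and-reconnect approach can be sketched like this, again in Python and with made-up names: SocketTimeout stands in for the RedisException raised on a read timeout, FakeConnection is a scripted connection used only to make the example runnable, and the max_reconnects cap is an assumption of the sketch rather than something the text above describes.

```python
class SocketTimeout(Exception):
    """Stand-in for the exception raised when the blocking read times out."""


class FakeConnection:
    """Hypothetical connection that replays a scripted list of results."""

    def __init__(self, script):
        self.script = script

    def blocking_pop(self):
        # A SocketTimeout entry in the script simulates an idle 60s wait.
        result = self.script.pop(0)
        if result is SocketTimeout:
            raise SocketTimeout("read timed out")
        return result


def wait_for_event(connect, max_reconnects=5):
    """Block until an event arrives, reconnecting after each timeout."""
    connection = connect()
    reconnects = 0
    while True:
        try:
            return connection.blocking_pop(), reconnects
        except SocketTimeout:
            if reconnects >= max_reconnects:
                raise
            reconnects += 1
            connection = connect()  # open a fresh connection and wait again
```

With a script of two timeouts followed by an event, wait_for_event reconnects twice and then returns the event. A timeout here is the normal idle case, not an error worth logging loudly.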

Conclusion

While RabbitMQ did the job and did it well, we decided to convert all our RabbitMQ queues to Redis. We traded completeness for simplicity and versatility (Redis is obviously also useful for caching and key/value storage).

The number of events that can be managed is not a limiting factor (we are far from the limits, whichever system is used).

In addition, it's far easier to install and administer a Redis server than a RabbitMQ one. As most Tuleap installations are not under our control, that's also a big plus.