Daily Archives: November 7, 2015

Update 21st November 2015: This article explains the logic behind resilient connections, but was not yet a complete solution at the time of writing. Please see updates at the end of the article for the suggested action to take.

That’s straightforward enough. But what happens if the connection breaks? Let’s find out.

To make sure everything’s running fine, we can repeat the test done in the original article: run the program, send a message from the RabbitMQ Management UI, and make sure it is received:

That’s all well and good. Now, let’s restart the RabbitMQ service – that will break the connection. To do this, From Start -> Run, run services.msc to open up the Services running on the system, and restart the one called RabbitMQ:

If you try publishing another message to the queue, you’ll find that the consumer won’t receive it. In fact, if you check the queue from the management UI, you’ll find that the message is still there. If you restart the client program (starting a new connection), the new message will be able to reach the client.

Although restarting services to re-establish connections may be common practice in some companies (*cough *cough*), it’s not something we want to encourage, so we need a mechanism that allows the connection to be re-established once the service is available again.

This doesn’t solve the scenario presented above, but handles an edge case that occurs if a message is taken by a consumer but the connection dies before it gets acknowledged:

“Turning the heartbeat on also makes the server check to see if the connection is still up, which can be very important. If a connection goes bad after a message has been picked up by the subscriber but before it’s been acknowledged, the server just assumes that the client is taking a long time, and the message gets “stuck” on the dead connection until it gets closed. With the heartbeat turned on, the server will recognize when the connection goes bad and close it, putting the message back in the queue so another subscriber can handle it. Without the heartbeat, I’ve had to go in manually and close the connection in the Rabbit management UI so that the stuck message can get passed to a subscriber.”

Since this scenario is a bit tricky to reproduce, I’ll take the author’s word for it. But now, on to our scenario. This part of the answer gives us something to think about:

“Finally, you will have to handle what your consumer does when trying to consume messages from a closed connection. Unfortunately, each different way of consuming messages from a queue in the Rabbit client seems to react differently. QueueingBasicConsumer throws EndOfStreamException if you call QueueingBasicConsumer.Queue.Dequeue on a closed connection. EventingBasicConsumer does nothing, since it’s just waiting for a message.”

In our case we’re using EventingBasicConsumer, and the test we performed earlier showed that in case of disconnection, messages are no longer received, but no exceptions are thrown either. In this case, we need a way to detect when a connection breaks. Fortunately, there’s an event that fires when that happens:

connection.ConnectionShutdown += Connection_ConnectionShutdown;

We’ll need to refactor our code so that we can recreate the connection, channel and consumer when reconnecting. Let’s move our variables outside of the Main() method:

If you test this, you’ll see that it actually works: it reconnects, and messages you send after reconnection are received:

This sorts out what we set out to achieve, but we’re not quite done yet.

For starters, Thread.Sleep() sucks. We can make the reconnect code more efficient by using something like a ManualResetEventSlim. A ManualResetEventSlim is like a semaphore, but only has on and off (Set and Reset) states. Although it is mostly useful in multithreading scenarios, we can use it instead of Thread.Sleep() to periodically reconnect. At face value, the following code should behave the same way as the code above:

That’s nice. But we also have to handle the case where service wasn’t running from the start, and the initial connection attempt failed (the current code will explode in this scenario).

No problem. All we have to do is call the same reconnection logic when connecting the first time. Since we don’t want to call the event handler directly, let’s move the logic into its own method. Note that I’ve also changed the output strings from “Reconnect” to “Connect” so that they apply to all connection attempts.

But you’ll also find that we broke the case where the user presses ENTER to exit the program gracefully, as the cleanup logic will cause the ConnectionShutdown to be fired, triggering the reconnect logic. We can sort this out by disconnecting the event handler before the final cleanup:

All this should work pretty nicely. Additionally, since we’ve neatly separated setup from cleanup, it’s really easy to put this code into a Windows service and allocate the appropriate parts into Start() and Stop() methods.

Update 12th November 2015: In the .NET/C# API Guide, you’ll find that there’s an AutomaticRecoveryEnabled property on the ConnectionFactory which takes care of reconnecting if the connection goes down. However, it does not deal with the case when the initial connection fails (in which case an exception is thrown).

Update 21st November 2015: Apparently handling reconnect as shown here causes problems with acknowledgements when the connection breaks. The best approach is a combination of iterative reconnect for the first connection, and AutomaticRecoveryEnabled for subsequent disconnects.