Hi everyone,
I'm evaluating RabbitMQ as a possible replacement for our existing solution. We have two datacenters, one in the US and one in Europe, and the link between them sometimes shows 1-10% packet loss (our RTT is about 200 ms). We have publishers and subscribers in both the US and Europe. Our existing solution does not deal with packet loss very well: our queues end up filling up, and we see huge delays and/or dropped packets. We believe one cause might be that our solution uses RMI over a single TCP connection (among other things).
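To give a rough sense of why I suspect the single TCP connection, here is a back-of-the-envelope sketch using the well-known Mathis et al. approximation for the steady-state throughput of one TCP connection, throughput ≈ (MSS / RTT) * (C / sqrt(p)). The MSS of 1460 bytes, the constant C = sqrt(3/2), and the function name are my assumptions for illustration; the model is only a coarse estimate for Reno-style congestion control, not a measurement of our link.

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3.0 / 2.0)):
    """Rough per-connection TCP throughput ceiling in bits/s (Mathis approximation).

    mss_bytes and c are assumed values, not measured ones.
    """
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

if __name__ == "__main__":
    # 200 ms RTT as on our link; loss rates spanning the 1-10% range we observe.
    for loss in (0.01, 0.05, 0.10):
        bps = mathis_throughput_bps(1460, 0.200, loss)
        print(f"loss={loss:.0%}  ~{bps / 1e6:.2f} Mbit/s per connection")
```

Under these assumptions a single connection is capped below 1 Mbit/s even at 1% loss, which would be consistent with our queues backing up.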
I was reading the RabbitMQ docs, but couldn't find many details on what exactly the underlying protocol is for sending messages between cluster nodes.
I've seen that RabbitMQ relies on distributed Erlang over TCP, but is there more specific information on how links are made reliable in RabbitMQ? In particular, I'd like to know how RabbitMQ configures its transports between cluster nodes.
Are there single or multiple TCP connections between nodes?
What happens when a TCP connection fails? Do nodes detect that event and try to reestablish the connection automatically? Distributed Erlang has APIs for detecting that a node is down. Are those used in Rabbit?
Do connections rely on TCP to deal with packet loss or is there some Rabbit magic as well?
How are messages serialized into AMQP packets?
I found some useful information about WAN setups in this thread:
http://old.nabble.com/RabbitMQ-and-a-two-site-deployment-connected-via-WAN.-td28445648ef25704.html#a28445648
and I've read the Distributed Erlang docs, but there isn't much information there either.
I was hoping to get a better understanding of the transport with these questions.
Many thanks,
nikos