Sunday, April 10, 2011

I’ve spent an interesting week evaluating various Message Queue products. The motivation behind this is a client that has somewhat high performance requirements. They have bursts of over a million simultaneous messages. Currently they’re using a SQL Server-based solution, but it’s not ideal, and I’m suggesting they look at Message Queuing products as an alternative.

In order to get a completely unscientific feel for the performance of some likely contenders, I put together a little test. Each queue would be asked to send one million 1K messages and receive them again. The test was done in somewhat of a hurry, so I haven’t tweaked any settings, just installed each MQ and done the simplest send and receive I could manage after a quick glance at the docs. So this is true out-of-the-box performance. I fully accept that this is going to penalise MQ products that have conservative default configurations.
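For a feel of the shape of such a test, here is a minimal sketch of a send-then-receive throughput measurement. It is not the test code used for this post. It is written in Python, the standard library's `queue.Queue` stands in for a real MQ client (an assumption made purely so the sketch runs anywhere), and the names `run_benchmark` and `BenchmarkResult` are hypothetical.

```python
import queue
import time
from dataclasses import dataclass

MESSAGE_SIZE = 1024  # 1K payload, matching the test described above


@dataclass
class BenchmarkResult:
    sent: int
    received: int
    send_rate: float     # messages per second during the send phase
    receive_rate: float  # messages per second during the receive phase


def run_benchmark(message_count: int) -> BenchmarkResult:
    """Send message_count 1K messages, then receive them all, timing each phase."""
    # Assumption: an in-process queue stands in for the MQ product under test.
    transport = queue.Queue()
    payload = b"x" * MESSAGE_SIZE

    # Send phase.
    start = time.perf_counter()
    for _ in range(message_count):
        transport.put(payload)
    # Guard against a zero elapsed time on very small runs.
    send_seconds = max(time.perf_counter() - start, 1e-9)

    # Receive phase.
    received = 0
    start = time.perf_counter()
    while received < message_count:
        message = transport.get()
        assert len(message) == MESSAGE_SIZE
        received += 1
    receive_seconds = max(time.perf_counter() - start, 1e-9)

    return BenchmarkResult(
        sent=message_count,
        received=received,
        send_rate=message_count / send_seconds,
        receive_rate=received / receive_seconds,
    )


if __name__ == "__main__":
    result = run_benchmark(100_000)
    print(f"send:    {result.send_rate:,.0f} msg/s")
    print(f"receive: {result.receive_rate:,.0f} msg/s")
```

Swapping the stand-in queue for a real client's send and receive calls gives the out-of-the-box figure for that product. The numbers from the stand-in itself say nothing about any of the candidates.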

The candidates are:

MSMQ. The default choice in shops where only the products of Redmond are considered worthy. For my clients, if MSMQ can rise to the challenge, then they should use it. The main points against it are its lack of sophistication (nothing but send and receive) and its arbitrary hard limits, such as the 4MB maximum message size. However, with something like MassTransit or NServiceBus layered on top, it’s entirely possible to do serious work with it.

ActiveMQ. The stalwart of the Java world. It has long service and ubiquity going for it. It’s also cross-platform and would provide a natural integration point for non-Microsoft platform products. However, it would have to perform better than MSMQ to get a look-in.

RabbitMQ. I’ve been hearing excellent things about this message broker written in Erlang. It supports the open AMQP (Advanced Message Queuing Protocol), with the potential to avoid vendor lock-in and to benefit from a wide range of clients in almost every language. AMQP provides some pretty sophisticated messaging patterns, so there’s less need for a MassTransit or NServiceBus. It also has ‘enterprise’ resilience and durability. That’s something my client is very interested in.

ZeroMQ. I only discovered this MQ product when I was researching AMQP. The company that created it were part of the AMQP group and had a product called OpenAMQ. However, they parted company with AMQP quite dramatically, complaining that it had lost its way and was becoming over-complicated. You can read their Dear John letter here. ZeroMQ has a unique broker-less model, which means that unlike all the other products under test, you don’t need to install and run a message queuing server, or broker. Simply reference the ZeroMQ library (it’s on NuGet) and you can happily send messages between your applications. Interestingly, they are also positioning it as a way of creating Erlang-style actors in any language by leveraging ZeroMQ’s blazingly fast in-process messaging.

Getting all four MQ products up and running was fun. There’s a definite overhead when you have to install a product based on a non-Windows platform. ActiveMQ required Java on the target machine and RabbitMQ needed Erlang. Both installed without a hitch, but I’m concerned that it’s another layer that can go wrong in production. I would be asking the infrastructure people to understand and maintain unfamiliar runtimes if we chose either of these. ActiveMQ, RabbitMQ and MSMQ all have server processes that need to be monitored and configured, another support concern.

ZeroMQ, with its brokerless architecture, doesn’t require any server process or runtime. In effect, your application endpoints play the server role. This makes deployment simpler, but the worry is that there’s no obvious place to go looking when things go wrong. ZeroMQ, as far as I can tell, only provides non-durable queues. You are expected to provide your own auditing and recovery where you need it. To be honest, I’m not even sure that ZeroMQ should be in this test, it’s such a different concept from the other MQ products.
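To illustrate what "provide your own auditing and recovery" might involve, here is a minimal sketch of a sender-side journal: each message is written to disk before it is handed to a non-durable transport, and is removed only once the receiver acknowledges it. This is an assumption about one possible approach, not anything ZeroMQ provides. It uses Python's standard `sqlite3` module, and the `SendJournal` class and its method names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


class SendJournal:
    """Keeps outgoing messages on disk until the receiver acknowledges them."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body BLOB NOT NULL)"
        )
        self._db.commit()

    def record(self, body: bytes) -> int:
        """Persist a message before it is handed to the transport."""
        cursor = self._db.execute("INSERT INTO pending (body) VALUES (?)", (body,))
        self._db.commit()
        return cursor.lastrowid

    def acknowledge(self, message_id: int) -> None:
        """Forget a message once the receiver has confirmed it."""
        self._db.execute("DELETE FROM pending WHERE id = ?", (message_id,))
        self._db.commit()

    def unacknowledged(self) -> List[Tuple[int, bytes]]:
        """Messages to resend after a crash or restart, oldest first."""
        rows = self._db.execute("SELECT id, body FROM pending ORDER BY id")
        return [(row[0], bytes(row[1])) for row in rows]

    def close(self) -> None:
        self._db.close()


if __name__ == "__main__":
    # ":memory:" keeps the demo self-contained; a real journal needs a file path.
    journal = SendJournal(":memory:")
    first = journal.record(b"booking 1")
    journal.record(b"booking 2")
    journal.acknowledge(first)
    print(journal.unacknowledged())
```

Committing on every message is the simplest way to make the journal safe, and it is also exactly the kind of disk cost that the brokers with durable queues pay and a non-durable transport avoids.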

So without further chit-chat, here are the results: messages per second, for send and receive, during the transmission of one million 1K messages. The tests were executed on a machine running Windows Vista.

As you can see, there’s ZeroMQ and the others. Its performance is staggering. To be fair, ZeroMQ is quite a different beast from the others, but even so, the results are clear: if you want one application to send messages to another as quickly as possible, you need ZeroMQ. This is especially true if you don’t particularly mind losing the occasional message.

To be honest, I was hoping for more from Rabbit. However much one tries to be fair in these things, you inevitably have a favourite, and everything I’d read and heard about Rabbit made me think it was probably the best choice. But with these results, I’m going to be hard pressed to sell it over MSMQ.

If you’d like to run the tests for yourself, my test code is on GitHub here. I’d be very (no very very) interested in how the tests can be tweaked, so if you can get substantially better figures, please let me know.

33 comments:

fschwiet
said...

I don't know how mature that project is, but you may want to check out Ayende's RavenMQ (https://github.com/ravendb/ravenmq / http://ayende.com/Blog/archive/2010/11/09/raven-mq-ndash-principles.aspx).

The nice thing about RavenMQ is that it should deploy nicely in a .Net environment.

If ZeroMQ doesn't persist to disk, I'd be concerned if it didn't perform much better than the alternatives. Not a fair comparison.

I'm also interested in the admin interfaces associated with these products and in functionality like being able to see the status of my queues, the ability to redeploy messages, etc. Did you have a chance to look into that?

When you do tests like this, what you are testing is the speed of the client. This is because you have one producer and one consumer.

For any broker that is reasonably fast and scalable in being designed to handle more than one client, tests of this type will therefore show nothing much about the broker.

Client performance will be highly language dependent and network i/o dependent. So, your results are not surprising.

ZeroMQ basically *is* a fast client for networking, that presents a messaging idiom. We like it too - see the posts about it on our blog. But by not having a broker, you may lose some of the decoupling benefits eg delivery to a disconnected actor.

I would expect any decent broker to handle several 10k messages per second, as well as providing additional capability missing from a 100% client implementation.

So, if your use case needs a really fast client and a broker, you might want to use more than one of the technologies you looked at, together.

Thanks for that. Yes, as I said in the post, the ZeroMQ comparison wasn't really fair... but I did it anyway ;)

My client provides customer contact services in the travel industry, so they are effectively taking bookings and sending out emails/sms/letters etc after some processing. It's important that messages won't be lost due to server failure. For that reason alone I don't think ZeroMQ is really suitable, although there might be opportunities to use it elsewhere. I certainly agree that it's a very interesting technology.

I still think we will choose Rabbit. The flexible architecture and message routing combined with durability are compelling. Purely for my own ability to sell it to my customer, slightly better performance numbers would have been nice, ah well :)

On behalf of Alexis Richardson. (damn you and your flaky comment system blogger!)

Mike

Thank-you for your positive comments about RabbitMQ. The use case does sound like a good fit for a broker. If you want to use 0mq with RabbitMQ you can do that using the rmq-0mq bridge on github (see also our blog which describes it). In general I would argue that 'good support for integration with other systems' is a plus point for RMQ over MSMQ.

Let us know if we can help in any way.

If you want to show higher performance with RabbitMQ, please point folks at this, which is based on a use case in the games industry: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-April/012321.html

Quote from that study: "6000 queues, 6000 channels each pub-subing their own queue .. 48k messages/sec ... It was quite remarkable to see that kind of throughput."

I meant to catch up with you at #DDD this year but I ran out of time after chatting and chasing so many others. I'm planning on attending #DDD_Scotland though, if you're going to be there.

I wanted to try and get hold of you and ask if you'd like to present for us on Lidnug, particularly if you were to do the talk on Monads that you did at #DDD. Shout either myself @shawty_ds or Brian @csharpzealot up on Twitter; we'd love to have your wisdom imparted to our users.

You've missed the two big players in that market: IBM WebSphere MQ (or whatever they're calling it this week) and Tibco EMS. Personally, I find EMS to be an excellent product and, for software that can pay its way, worth the price just for the reduced cost of support & maintenance and the good performance even for resilient (once-and-only-once) messaging.

@jasiek Beanstalkd is not really a message queue, it's more of a work queue. However you can implement a messaging system on top of Beanstalkd, I've done that and it works quite well. I wouldn't use it instead of 0MQ except if you need very large buffers (which you shouldn't) though.

What would be really useful is to do your benchmarking while the client is on another machine. I found the MSMQ is really slow with I/O and jumps the processing service up to 80% of cpu with only a couple hundred requests per second coming in.

It's silly to say "and no care for losing a message". 0mq does guarantee delivery of its messages... that's the whole point. You can deliver a message to a server that hasn't started, wait for the reply, and get it. Changing our code from TCP to 0mq vastly improved both performance and reliability. That being said, it is more constraining because of its low-level implementation. You'll need to check those return codes, etc. But for a little bit of extra work, 100's of times faster is worth it.

"Brokerless messaging" has been used in the financial area for many years (where it is called low-latency messaging). There are various mature products such as IBM WLLM and 29West (now Informatica), mostly outperforming 0mq with ease.

Newer Open Source implementations show that one can get into the millions of messages per second (with reliability). Mostly because cheap memory for buffers is available now.

e.g. https://code.google.com/p/fast-cast/

The use cases for brokerless messaging are clusters and high availability. Broker-based messaging is hardly usable here due to its notorious slowness and network overhead.

Broker-based messaging has its use when it comes to the "organizational" aspect of connecting systems in complex network topologies.

I'm looking for a persisting queue that caters to 1 pub & 1-3 sub and the performance should be a "pick two optimization" between latency, cpu load, amount of data/interval transferred.

So let's say I want to target max 5% cpu load on a single core system. If the loading would go above 5%, I could configure a preference to trade latency for the other two (eg. by limiting context switches).

If I wanted to keep cpu use <5% and prefer to keep low latency instead, the user would likely need to pro-actively categorize the messages into priorities or reduce the data by sending only eg. deltas of min/max/derivatives. The actual values would be streamed to disk and if the disk bandwidth was not enough, apply the reduction to data going to the disk as well.

Of course you'd want to also configure an alert if the amount of data processed/interval was enough to go within eg. 10% of the hard limits, so if CPU use was over 0,5% then you'd get an alert.

There should be a way to tag/switch messages or channels between reliability and maximum performance.

I haven't found any information on whether you used persistent queues while testing MSMQ and RabbitMQ. If they were persistent, that would explain why ZeroMQ was faster. With persistence turned off, both MSMQ and RabbitMQ are at least 10 times faster.

I'm not an expert in message queueing technology, but it seems ZeroMQ doesn't use a broker while the others do. This would mean a HUGE difference in performance. You should compare only non-brokers with non-brokers and brokers with brokers.