Archive for the ‘Introductory’ Category

So today I would like to talk about some aspects of RabbitMQ's
performance. There are a huge number of variables that feed into
the overall level of performance you can get from a RabbitMQ
server, and today we're going to try tweaking some of them and
seeing what we can see.

One of the problems we face at the RabbitMQ HQ is that whilst we may
know lots about how the broker works, we don't tend to have a large
pool of experience of designing applications that use RabbitMQ and
which need to work reliably, unattended, for long periods of time. We
spend a lot of time answering questions on the mailing list, and we do
consultancy work here and there, but in some cases it's as a result of
being contacted by users building applications that we're really made
to think about long-term behaviour of RabbitMQ. Recently, we've been
prompted to think long and hard about the basic performance of queues,
and this has lead to some realisations about provisioning Rabbits. (more…)

From time to time, on
our mailing
list and elsewhere, the idea comes up of using a
different backing store within RabbitMQ. The backing store is
the bit that's responsible for writing messages to disk (a message can
be written to disk for a number of reasons) and it's a fairly frequent
suggestion to see what RabbitMQ would look like if its own backing
store was replaced with another storage system.

Such a change would permit functionality that is not currently
possible, for example out-of-band queue browsing, or distributed
storage, but there is a fundamental difference in the nature of data
storage and access patterns between a message broker such as RabbitMQ
and a generic database. Indeed RabbitMQ deliberately does not store
messages in such a database.
(more…)

In the past year development of the AMQP gem was practicaly stagnating, as its original author Aman Gupta (@tmm1) was busy. A lot of bugs stayed unresolved, the code was getting old and out-dated and no new features or documentation were made.

At this point I started to talk with the RabbitMQ guys about possible collaboration on this. Actually originally I contacted VMware when I saw Ezra Zygmuntowicz looking for people to his cloud team, but when I found that VMware recently acquired the RabbitMQ project in London, I got interested. I signed the contract, switched from script/console to Wireshark and the RabbitMQ Tracer and since November I've been happily hacking on the AMQP and AMQ-Protocol gems.

To introduce myself, my name's Jakub Stastny (@botanicus) and I work as a Ruby contractor. I contributed to such projects as RubyGems, Merb and rSpec and I wrote my own framework called Rango, the only Ruby framework with template inheritance. I work with Node.js as well and I created Minitest.js, BDD framework for testing asynchronous code. My other hobbies are photography and travelling.

I asked Aman if I can take over the maintainership over the AMQP gem and he was happy to do so. At this point other two guys, Michael Klishin (michaelklishin) and Ar Vicco (arvicco) showed interest in the development, so we created ruby-amqp organisation at GitHub and forked the original code there, as well as a few other related repositories. The GitHub guys were happy to make our repository to be the main one, instead of just a fork, so since now, everything will be there (except the old issues which are still on tmm1's fork and which we want to solve and close soon).

Soo What's New?

Test Suite

At the beginning, there were barely any tests at all, so it was basically impossible to tell if the changes I made break something or not. So I started to write some. In the later stage, when michaelklishin and arvicco joined the development, we rewrote the few original Bacon specs to rSpec 2 and now arvicco is porting his specs which he happened to write some time ago to the main repository. Arvicco has also written amqp-spec, superset of em-spec for testing the AMQP gem.

AMQP 0.9.1

Currently the gem speaks only AMQP 0.8, which is more than 2 years old version, so probably the most important upcoming feature is support of AMQP 0.9.1. Because this is something what can be beneficial for other clients as well, I decided to create a new library called AMQ-protocol. It's using rabbitmq-codegen as many others client libraries.

One of the main goals of this gem is to be really fast and memory-efficient (not for the sake of memory-efficiency itself, but because the garbage collector of MRI is quite weak). I'm about to create some benchmarks soon to see if the performance is better and how much.

AMQ-Protocol is still work-in-progress. It works, but it still needs some polishing, refactoring and optimizations, as well as documentation and tests.

Other Changes

I fixed a lot of bugs and I merged all the pending pull requests to the main repository. I'm going to write more about the changes once I'll release AMQP 0.7. I released 0.7.pre recently, you can try it by running gem install amqp --pre, which would be greatly appreciated. As the work on the test suite is still in progress now, the release process is kind of russian roulette at the moment.

Backward compatibility

I fixed quite a few bugs and obviously the fixed code is never backward-compatible with the old buggy one. One of the major changes is that MQ#queues (as well as MQ#fanouts etc) is not a hash anymore, but an array-like collection with hash-like behaviour. It does NOT override anonymous instances when another anonymous instance is created (as it used to do before) and it does support server-generated names. So instead of MQ#queues[nil] = <first instance> and then MQ#queues[nil] = <second instance>) it now just adds both instances to the collection and when it receives Queue.Declare-Ok from the server, it updates the name to it.

Future plans

The AMQP gem is very opinionated. If you don't want to use EventMachine, you're out of luck. You might want to use something more low-level like IO.select or just another async library like cool.io. You might not even want to care about the asynchronous code at all.

It'd be great if we could have one really un-opinionated AMQP client library which only job would be to expose low-level API defined by the AMQP protocol without any abstraction like hidding channels etc. Such library would be intended for another library implementators rather than for the end users. AMQP is a complex protocol and because of some design decisions it's pretty hard to design a good and easy-to-use (opinionated) client library for it. So some basic library which doesn't make any assumptions would help others to play around and try to implement their own, opinionated libraries on top of this one without the need to manually implement the hard stuff like encoding/decoding or basic socket communication.

Questions? Ideas? Get in touch!

Are you interested in the AMQP gem development? Do you want to participate or do you have some questions? Feel free to contact me, either by comments under this blog post, or you can drop me an e-mail to [email protected] or drop by to Jabber MUC room at [email protected] where all the current maintainers usually are. And for all the news make sure you are following me on Twitter!

RabbitMQ needs more and better documentation. (And who doesn’t?) In particular, we need more and better introductory material that introduces the reader to various basic concepts, explains why they’re important, and motivates him or her to keep reading and learn more about RabbitMQ. Here’s a cut at Chapter 1 of that introduction. Your comments are welcome, and Chapters 2 and 3 will follow soon.

(You probably already know all of this, but a surprising number of people don’t. This introduction is for them.)

The Old Future

Long, long ago, the American science-fiction writer Isaac Asimov imagined a future world in which one single giant computer, “Multivac,” would control all of mankind’s affairs. Information would flow in from people and businesses and governments across the globe, and Multivac would store it and process it, and send exactly the right important new information right back out. All sorts of futuristic questions would pour in from our future selves, and the right futuristic answers would just pour back out. This future was a great place!

And our present-day world isn’t all that different from Asimov’s future, just without all that shininess. We’ve got the Internet, and it connects people and businesses and governments all over the globe, and information flows in, and information flows out, and questions pour in, and answers pour out. We’ve got our Googles and our Amazons and our eBays and our Facebooks, and our lives keep getting better every day. More and better information; more and better storage and processing; more and better answers.

But Asimov was only a lowly Ph.D. chemist turned science-fiction writer, not any sort of real Computer Scientist like we have now, and he never worked out all (or, really, any!) of the technical details of exactly how you’d build that one giant, all-knowing, all-powerful Multivac at the North Pole, and exactly who’d pay for it, and exactly what uses they’d allow, and so on. He left that part for future generations to figure out, if in fact they could. And as time has gone by, it’s also turned out that any one single computer that anyone can buy at the computer shop down the street is still several orders of magnitude too small and too weak to control all of mankind’s affairs. That’s the bad news.

The New Future

The good news, which Asimov didn’t anticipate (ha!), is that computers here in the future are cheap—almost dirt cheap, being made largely of silicon, which is after all just processed dirt. So if any one computer you can buy at the shop (or rent on the cloud from Amazon, or whatever) has a million times too little storage or processing power for what you want to do to or for mankind, just get a million of them and plug them together! (Some assembly required.) Google is close to doing just that—just as soon as it completes its takeover of the North Pole—and everyone else is trying to follow close behind. Google’s got its own computers to execute its plans for the world, and Facebook’s got its own computers and its own plans too, and the CIA too, and your company or organization too, and everyone cooperates and competes in controlling all of mankind’s affairs. Our old centralized computer systems couldn’t possibly grow big enough, so we’re replacing them with shiny new distributed systems that could presumably grow bigger forever. And our lives keep getting better every day.

But getting your million computers (or even just a thousand, or even fewer!) to work together on their assigned tasks isn’t as easy as it might sound to your upper management. One given server computer might crash once a year due to bad hardware or bad software or bad power or bad whatever—and that’s usually being pretty optimistic. If you have only a thousand server computers, one will crash on the average every 9 hours; if you had a million, one would crash about every 30 seconds; if you had a billion—which not even Google has yet—about 30 would crash every second, and good luck getting the remainder not to crash or otherwise go bonkers too! One centralized computer can be either up or down, and that’s it, but a distributed computer system is more likely to be 99% up and 1% down at any moment, and the 1% that’s down keeps shifting around and further confusing the other 99%. Problems in distributed systems are unavoidable, and they can multiply without bound. Welcome to the future!

You may have just a thousand computers so far, or maybe just a hundred, or maybe even just ten or so, but you’re still going to have problems and bad things are still going to happen. Crashes are one obvious cause, but lost messages or misconfigured systems or subtle race conditions all add to the error rate too. If you can’t think of half a dozen more potential problems with large distributed systems, you’ve probably never built or operated one. It would be impossibly hard to build that one giant Multivac at the North Pole, but it might be even harder to figure out exactly how those zillion smaller computers that you buy instead will ever work together. What to do?

Perfect Reliability* (*Not really)

There’s a great saying: If you ever see a computer system described as “reliable,” look for the asterisk and the footnote that says “Not really.” Perfect reliability is impossible to achieve. Put your computers in an expensive data center in California, and one sufficiently large earthquake can knock them all out. Spread them out across a bunch of expensive data centers on different continental plates, and you just need a few more earthquakes (or tsunamis, or whatever) to knock enough of your computers (or network links, or whatever) to render the others useless. Enough natural or man-made catastrophes can ruin anything, and they can happen a lot more often than you might think—especially the man-made ones! That’s the bad news.

The good news is, while you can’t build perfectly reliable systems, you can build systems that are reliable enough, whatever that happens to be. That is, you can build computer systems that are arbitrarily reliable. You can ensure that if enough of your computers are up and connected and working correctly, then the system as a whole will continue to do the right things, and that even if more fail, then the system as a whole still won’t do anything wrong. (It might not do anything at all, but that’s life.) If you want more reliability, you can buy more computers (maybe a lot more) and connect them properly. If you know how.

Cargo Cults and Banks

Unfortunately, much of the time, it seems that our distributed-system needs are growing faster than our expertise. Distributed systems are hard to build and they may never become all that easy. Right now, it’s often all we can do adopt best practices—to look at distributed systems that got it right, and try to figure out why they succeeded, and to try to duplicate their success. It’s a little like running your own cargo cult, but without all the coconuts.

Banks are in many ways an excellent industry to study and perhaps to imitate. Banks (and other financial institutions) can clearly care very much about reliability, and banks have been building pretty large, pretty reliable distributed systems for some while now. Banks today tend to build their reliable distributed systems atop reliable message-queuing systems, and they’ve even developed an open standard for such message-queuing systems, and that’s worked out pretty well for them, and that’s what we’ll look at next.