I am Anders Karlsson, and I have been working in the RDBMS industry for many, possibly too many, years. In this blog, I write about my thoughts on RDBMS technology, happenings and industry, and also on any wild ideas around that I might think up after a few beers.

Monday, August 27, 2012

Fast and furious!

A few days I wrote a bit on my first results of comparing MySQL with MongoDB as a Key-Value Store, something that has been going on for way to long, but I am not finished yet. Last time I used MySQL Embedded Library to bypass the MySQL Client Server protocol to see what the overhead was, and the result was that it is big (and again, note that the same networking was used with MongoDB and I was also using Unix Domain Sockets, as well as plain TCP/IP, so don't ask me to fix any network issues I might have). Using Embedded Server with InnoDB was actually faster than using MongoDB, some 3 times faster compared to using the client / server protocol.

That one out of the way, I now wanted to see what I could get if I used the storage engine that was fastest in Client / Server mode, MEMORY. That took a while to fix, as to have an Embedded Server application, like my test application here, use the MEMORY engine, I have to load the data into the MEMORY table somehow each time I run the application. No big deal but a slight update to my benchmarking application was needed, as well as some debugging as embedded server is pretty picky with you doing things the right way and in exactly the right order, and is much less forgiving than the MySQL Client library. Anyway, I now have it fixed, and the result. Fast. Real fast and furious: 172 k rows read per second! Compared to 110k rows read per second with MongoDB (but that is MongoDB in Client Server mode of course). Using the MySQL Client, the MEMORY engine achieved 43 k row reads per second, which means that libmysqld is 400% faster! How is that for a performance improvement.

Which is not to say that we all should start building libmysqld applications right now. But what I want to say is that if you want to improve the performance of MySQL, looking into the Client / Server protocol would be a good starting point, there is a lot of performance to get there. The results noted could be interpreted as at least 75% of the time that MySQL processes a query, excluding disk I/O (this is the MEMORY engine after all), is spent in the Client / Server protocol. And looking at it differently: A key value store such as MongoDB might not be as fast as we think, but MongoDB sure does have a more efficient C/S protocol!

7 comments:

I don't know how slow the libmysqld API is, I guess I could bypass that one also. But this exercise was less about proving haw fast libmysqld is and more about proving that the bottleneck was in the MySQL Client. So even though linmysqld might be slow, it's fast enough to prove my point.

Regarding the slowdown that you experienced with MySQL client-server protocol, what were the values for net-buffer-length and max-allowed-packet in your previous measurement(s) for this benchmark with MySQL ????

Last results show that the performance difference isn't that much (some 10% only). Now we are talking engines only. If we are talking sharded MongoDB, things are different and I haven't looked at that. What I DO plan to test is MongoDB vs. MySQL when stuff will not completely fit in memory (the plan isto test with bigger rows, not more, using documents in MongoDB and BLOBs in MySQL).