RethinkDB: Rethinking the Database using Modern Assumptions

Over the past few months I’ve written about a collection of trends that all point toward a new technology that didn’t quite exist yet. Together they form a perfect storm: cheap, commodity 64-bit processors, abundant (64-128GB per server) and relatively inexpensive RAM, and faster storage. The MySQL community’s reaction to all of this has been a series of patches and hacks that aim to squeeze more performance out of InnoDB and, to a lesser degree, MySQL itself.

All of this work is still ongoing and some of it is fairly invasive, getting at some of the core assumptions present in both MySQL and the InnoDB storage engine. When such radical changes are necessary, you start to wonder if maybe it isn’t time for a new design of some sort. That is exactly where the founders of RethinkDB are coming from too.

I recently had the chance to sit down with them over lunch in Mountain View, California and get their perspective on how these changes are influencing their design, understand their goals a bit more, and get a sense of how far along they are in development.

But before I really dive into things, there are a few points to keep in mind:

the team has only been working on this for a few months

the company is very small right now (intentionally)

their plans will change as the potential market reacts to their progress, so don’t take any of this as gospel

Normally I wouldn’t devote an entire article to writing about code that’s not yet available, but it fits in very well with a lot of the themes I’ve tried to hammer on and it’s addressing a need for which I believe there is some real pent-up demand.

It’s also worth pointing out that even in its early state of development, it’s working well enough to power the RethinkDB web site itself. That dogfood mentality no doubt will serve them well as the code moves closer to being feature complete.

The Big Idea

If you were interested in getting great performance for a typical high-volume web application using MySQL (think OLTP) with an 80% read and 20% write workload, and you could start with a clean slate, what would you do? If you were the RethinkDB team, you’d make some assumptions about that use case and how the engineers building the web application would like to solve it.

First you’d expect the majority of data, or at least the working set, to fit in system RAM. What’s left over would ideally live on storage-class memory in the form of an SSD, Fusion-io “drive”, or similar technology. This choice has a pretty dramatic impact on the complexity of the code and the data structures you’d choose for various parts of the implementation.

One example is how data is flushed to disk. InnoDB makes an attempt to find adjacent dirty pages in the buffer pool and write them to disk in order, thus preventing unnecessary disk seeks. But with most storage-class memory, that’s a non-issue. In fact, you are pretty quickly able to convince yourself that the log-oriented approach used by PBXT (see previous coverage) isn’t such a bad idea.

Once you’ve come to that point, you realize (as Paul did) that some of the data consistency requirements are simplified as well. That means less code and complexity (and less to go wrong).
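To make the contrast concrete, here is a minimal Python sketch of the two write strategies. It is only an illustration under stated assumptions: the page numbers are made up, "seek distance" is a crude stand-in for head travel on a spinning disk, and the AppendOnlyLog class is a hypothetical toy, not the design used by InnoDB, PBXT, or RethinkDB.

```python
def seek_distance(page_ids):
    """Total distance (in pages) travelled when writing pages in the given order."""
    return sum(abs(b - a) for a, b in zip(page_ids, page_ids[1:]))


def flush_in_page_order(dirty_pages):
    """Sort dirty pages so neighbours are written together (the seek-avoiding idea)."""
    return sorted(dirty_pages)


class AppendOnlyLog:
    """Toy log-oriented store: every write lands at the tail, whatever page it belongs to."""

    def __init__(self):
        self.records = []  # (page_id, payload) tuples in arrival order
        self.latest = {}   # page_id -> index of the newest record for that page

    def write(self, page_id, payload):
        self.latest[page_id] = len(self.records)
        self.records.append((page_id, payload))

    def read(self, page_id):
        return self.records[self.latest[page_id]][1]


# Hypothetical dirty pages, in the order they were modified.
dirty = [90, 3, 47, 4, 91, 46]
unsorted_cost = seek_distance(dirty)                     # write in arrival order
sorted_cost = seek_distance(flush_in_page_order(dirty))  # write in page order

log = AppendOnlyLog()
for page in dirty:
    log.write(page, f"contents of page {page}")
log.write(3, "newer contents of page 3")  # an update is just another append
```

On a device where seeks are expensive, the sorted flush is clearly worth the bookkeeping. When a seek costs next to nothing, that bookkeeping buys little, and simply appending in arrival order starts to look attractive.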

What about data sets that are far larger than the available RAM? Sharding. Expect that users who really care about performance are going to figure out how to split their data among servers so they can continue to scale without buying big iron.
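As a rough illustration of what splitting data among servers can look like at the application level, here is a small hash-based sharding sketch in Python. The server names and key format are hypothetical, and this is one common approach rather than anything RethinkDB prescribes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    digest is used to keep the mapping identical across restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical server names, for illustration only.
SERVERS = [
    "db0.example.com",
    "db1.example.com",
    "db2.example.com",
    "db3.example.com",
]


def server_for(user_key: str) -> str:
    """Pick the server that owns all rows for the given user key."""
    return SERVERS[shard_for(user_key, len(SERVERS))]
```

One caveat with plain modulo hashing is that changing the number of servers remaps most keys, so real deployments often add a lookup table or consistent hashing on top.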

Lies, Damned Lies, and Statistics

One of the more interesting things that came up in our conversation really highlighted how some of the problems with MySQL and “seekless” storage systems live well above the storage engine layer. The RethinkDB team found that they had to lie to the query optimizer in order to get it to do what they wanted.

You see, normally MySQL parses a query, figures out which indexes may be useful, and queries the underlying storage engine to get information about those indexes in order to produce an execution plan. But the optimizer doesn’t realize that you may have a RAID-0 array of very fast SSDs that makes previously poor query plans perfectly reasonable. And, worse yet, there’s no way for the DBA to provide it with hints in that direction.

So they just lied to it, providing somewhat bogus statistics that trick MySQL into choosing an execution plan that would have otherwise been considered far from optimal.
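The following toy cost model in Python shows why skewed statistics can change the outcome. Everything in it is an assumption made for illustration: the cost constants, the function names, and the scaling trick are hypothetical, and this is neither MySQL’s real cost model nor a description of what the RethinkDB team actually reports to the optimizer.

```python
# Hypothetical planner constants, baked in with spinning disks in mind.
SEQ_COST = 1.0      # cost of reading one row during a sequential scan
RANDOM_COST = 10.0  # cost the planner assumes for one random (index) read


def choose_plan(table_rows, reported_matching_rows):
    """Pick the cheaper plan according to the planner's fixed assumptions."""
    scan_cost = table_rows * SEQ_COST
    index_cost = reported_matching_rows * RANDOM_COST
    return "index" if index_cost < scan_cost else "scan"


def honest_estimate(matching_rows):
    """An engine that reports the true number of matching rows."""
    return matching_rows


def ssd_adjusted_estimate(matching_rows, real_random_cost=1.5):
    """An engine that understates matching rows so the planner's arithmetic
    lands on the cost the device really has (real_random_cost is assumed)."""
    return matching_rows * real_random_cost / RANDOM_COST


table_rows = 1_000_000
matching = 300_000

honest_plan = choose_plan(table_rows, honest_estimate(matching))
adjusted_plan = choose_plan(table_rows, ssd_adjusted_estimate(matching))
```

With honest numbers the planner prefers the full scan, because it believes 300,000 random reads are far more expensive than they are on fast SSDs. With the adjusted estimate it picks the index, which is the plan that is actually cheaper under the assumed device cost.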

Speaking of statistics, the RethinkDB team recently posted some benchmark charts that show how their code performs against both MyISAM and InnoDB. While it’s just the first of what is likely to be a never-ending set of benchmarks, the performance they’ve shown certainly justifies the approach they’re taking. It will be interesting to see the performance in future tests with high write loads and larger data sets that require much more disk use.

Business and Licensing

You can download RethinkDB binaries using the links on the RethinkDB Wiki, but the source code is not publicly available today. Being a young company, they’re still working through the issues of how best to build a business around a MySQL storage engine. Should the code be available to everyone under the GPL (or compatible license)? Should it be given to customers only? Should they charge customers licensing fees, or just for support and future development requests?

None of these are small issues, and numerous companies have been down this very road before and made differing choices. It is my hope that they can strike a balance between building a thriving business and building an ecosystem of developers and users around their code (and the intrinsic credibility that brings).

Time will tell, but at the moment it looks like their re-thinking of a MySQL storage engine could turn out to be a real player in the future.
