VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB (or just “Volt”) has discovered the virtues of embargoes that end at 12:01 am. Let’s go straight to the technical highlights:

VoltDB is based on the H-Store technology, which I wrote about in February, 2009. Most of what I said about H-Store then applies to VoltDB today.

VoltDB is a no-apologies ACID relational DBMS, which runs entirely in RAM.

VoltDB has rather limited SQL. (One example: VoltDB can’t do SUMs in SQL.) However, VoltDB guy Tim Callaghan (Mark Callaghan’s lesser-known but nonetheless smart brother) asserts that if you code up the missing functionality, it’s almost as fast as if it were present in the DBMS to begin with, because there’s no added I/O from the handoff between the DBMS and the procedural code. (The data’s in RAM one way or the other.)
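To make the "code it up yourself" argument concrete, here is a minimal sketch of an aggregate computed in procedural code over rows that are already in memory. It is written in Python purely for illustration (real VoltDB procedures are Java), and the `orders` rows, the column names, and the `total_spend` helper are all hypothetical. Whatever aggregates the SQL dialect turns out to support, the point is only that the loop adds no I/O.

```python
def total_spend(rows, customer_id):
    """Sum the `amount` column for one customer in procedural code."""
    total = 0
    for row in rows:  # rows already live in RAM, so the loop adds no I/O
        if row["customer_id"] == customer_id:
            total += row["amount"]
    return total


# Hypothetical in-memory table contents.
orders = [
    {"customer_id": 1, "amount": 20},
    {"customer_id": 2, "amount": 5},
    {"customer_id": 1, "amount": 15},
]
```

Calling `total_spend(orders, 1)` walks the three rows and adds up the two that belong to customer 1.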

VoltDB’s Big Conceptual Performance Story is that it does away with most locks, latches, logs, etc., and also most context switching.

In particular, you’re supposed to partition your data and architect your application so that most transactions execute on a single core. When you can do that, you get VoltDB’s performance benefits. To the extent you can’t, you’re in two-phase-commit performance land. (More precisely, you’re doing 2PC for multi-core writes, which is surely a major reason that multi-core reads are a lot faster in VoltDB than multi-core writes.)
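The single-core versus multi-core distinction can be sketched roughly as follows. This is an illustration of the idea, not VoltDB's actual planner: the modulo routing scheme, the `Transaction` class, and the `plan` function are assumptions made up for the example.

```python
from dataclasses import dataclass

NUM_PARTITIONS = 4  # assumption: one partition per core on a 4-core machine


def partition_for(key: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partitioning-key value to a partition (simple modulo scheme)."""
    return key % num_partitions


@dataclass
class Transaction:
    name: str
    keys: list    # partitioning-key values the transaction reads or writes
    writes: bool  # True if the transaction modifies data


def plan(txn: Transaction, num_partitions: int = NUM_PARTITIONS) -> str:
    """Classify a transaction by how many partitions it touches."""
    touched = {partition_for(k, num_partitions) for k in txn.keys}
    if len(touched) <= 1:
        return "single-partition"  # the fast path: runs on one core
    # Writes that span partitions need coordination (2PC, per the text);
    # reads that span partitions are cheaper.
    return "multi-partition write" if txn.writes else "multi-partition read"
```

For instance, `plan(Transaction("transfer", [1, 2], True))` touches two partitions and falls into the slow category, while a transaction whose keys all map to the same partition stays on the fast path.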

VoltDB has a little less than one DBMS thread per core. When the data partitioning works as it should, you execute a complete transaction in that single thread. Poof. No context switching.
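A rough structural sketch of "one thread per partition, each transaction run to completion" is below. It is in Python for brevity, so it shows the shape of the design rather than real multi-core parallelism, and every name in it (`Partition`, `deposit`, `run_deposit`) is hypothetical. Because each partition's data is only ever touched by its own thread, the procedure needs no locks.

```python
import queue
import threading


class Partition:
    """Private data plus a single thread that runs transactions serially."""

    def __init__(self, pid):
        self.pid = pid
        self.data = {}  # touched only by this partition's thread
        self.inbox = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            item = self.inbox.get()
            if item is None:  # shutdown sentinel
                break
            procedure, args, ticket = item
            # The whole transaction executes here, start to finish.
            ticket["result"] = procedure(self.data, *args)
            ticket["event"].set()

    def submit(self, procedure, *args):
        ticket = {"event": threading.Event()}
        self.inbox.put((procedure, args, ticket))
        return ticket

    def stop(self):
        self.inbox.put(None)
        self.thread.join()


def deposit(data, account, amount):
    """A 'stored procedure': read-modify-write with no locking."""
    data[account] = data.get(account, 0) + amount
    return data[account]


partitions = [Partition(i) for i in range(4)]


def run_deposit(account_id, amount):
    """Route the call to the owning partition and wait for the result."""
    target = partitions[account_id % len(partitions)]
    ticket = target.submit(deposit, account_id, amount)
    ticket["event"].wait()
    return ticket["result"]
```

Each call to `run_deposit` queues work for exactly one partition thread, which is the property the single-partition fast path depends on.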

A transaction in VoltDB is a Java stored procedure. (The early idea of Ruby on Rails in lieu of the Java/SQL combo didn’t hold up performance-wise.)

Solid-state memory is not a viable alternative to RAM for VoltDB. Too slow.

Instead, VoltDB lets you snapshot data to disk at tunable intervals. “Continuous” is one of the options, wherein a new snapshot starts being made as soon as the last one completes.
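The snapshot timing can be sketched as a small scheduling calculation. This is my own simplification, not VoltDB's scheduler: I am assuming snapshots never overlap and that the interval is measured from one snapshot's start to the next, with an interval of 0 standing in for "continuous." The function name and parameters are hypothetical.

```python
def snapshot_start_times(durations, interval):
    """Return the start time of each successive snapshot.

    durations: how long each snapshot takes to write, in seconds
    interval:  minimum time between snapshot starts; 0 means "continuous"
    """
    starts = []
    now = 0.0
    for duration in durations:
        starts.append(now)
        finished = now + duration
        # The next snapshot begins at the later of two moments:
        # the previous one finishing, or the interval elapsing.
        now = max(finished, now + interval)
    return starts
```

With `interval=0`, each snapshot starts the moment the previous one completes. With a longer interval, the database sits idle between snapshots, and anything written in that gap is exposed if the node is lost.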

In addition, VoltDB will also spool a kind of transaction log to the target of your choice. (Obvious choice: An analytic DBMS such as Vertica, but there’s no such connectivity partnership actually in place at this time.)

I should also note that when Tim Callaghan described architectural options to get around 2PC performance issues, they sounded a lot like eventual consistency. Maybe tunable RYW consistency isn’t in the cards, but at least there’s a NoSQL-like possibility with VoltDB.

VoltDB’s open source strategy is:

VoltDB will be open sourced.

Community VoltDB will be GPLed. Professional Edition VoltDB has a non-GPL license.

The VoltDB Professional Edition won’t start out with features beyond the Community Edition ones, but will gain such later on. I didn’t get the sense the plans for those features were completely baked yet, but ideas mentioned included:

Management/monitoring tools.

Integration with expensive closed-source enterprise software products, such as ones in the management/monitoring area.

Yet more “extreme”/edge-case performance.

Before VoltDB decided for sure that it wasn’t selling licenses, it sold a license to Getco, which also seems to be an investor in the company.

VoltDB had a beta test with about 150 participants. None is in production yet, although at least a few are clearly headed there. Most VoltDB beta testers are in some kind of online business, with a particular concentration in everybody’s new favorite market, online gaming. Most of the rest are in investment/trading — a major target market for at least three different Mike Stonebraker companies — and a few are in telecom. VoltDB assures me that some of the beta users are companies one actually has heard of before, but VoltDB is not in a position to name any of those.

VoltDB is not ideally suited for a classic order management system, since you’d want to partition on both CustomerID and SKU, the latter because you’d constantly be updating inventory stock levels. However, this argument doesn’t apply in the case of virtual goods. Virtual goods that are sold for real money, and hence need ACID levels of transaction integrity, are thus a clear target market for VoltDB. (The example that came up was in, you guessed it, online gaming.)

The other interesting use case that Tim highlighted was low-latency analytics/ELT. For reasons I didn’t totally grasp, Tim likes to call this “Stateful ELT.” (Given that the data goes into the VoltDB database before much else happens to it, I’m pretty sure I heard “ELT” correctly. But I guess I might have been mishearing “ETL”.)
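Returning to the order-management point, a back-of-the-envelope sketch shows why physical goods are awkward and virtual goods are not. The assumptions are mine: customer rows are partitioned on CustomerID, inventory rows on SKU, both by simple modulo over a hypothetical 4-partition cluster. An order for a physical good touches both rows; an order for a virtual good has no inventory row to update.

```python
NUM_PARTITIONS = 4  # hypothetical cluster size


def partition_for(key, n=NUM_PARTITIONS):
    """Simple modulo routing, assumed for illustration only."""
    return key % n


def partitions_touched(customer_id, sku=None, n=NUM_PARTITIONS):
    """Partitions an order must update; sku=None models a virtual good."""
    touched = {partition_for(customer_id, n)}
    if sku is not None:  # physical good: the inventory row must change too
        touched.add(partition_for(sku, n))
    return touched


def multi_partition_fraction(customer_ids, skus, n=NUM_PARTITIONS):
    """Share of all (customer, sku) orders that span two partitions."""
    pairs = [(c, s) for c in customer_ids for s in skus]
    multi = sum(1 for c, s in pairs if len(partitions_touched(c, s, n)) > 1)
    return multi / len(pairs)
```

Under these assumptions, with customer and SKU ids spread evenly over the 4 partitions, three quarters of physical-goods orders land on two partitions, and the share only grows as partitions are added. Every virtual-goods order stays on one.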

VoltDB company highlights include:

VoltDB has about a dozen employees, all but two of whom are technical. (However, I’m not sure they’re counting Andy Ellicott against the two. But then, last I heard he wasn’t full time at VoltDB.)

VoltDB does in fact support SUM in SQL queries. We don’t suggest you SUM millions of values in a single query, as VoltDB is optimized for shorter, OLTP-focused transactions. However we offer some materialized view support to maintain a sum on a large table or on large “group by” clauses.

+1

I immensely liked the original paper arguing for the transaction-confined-to-core architecture (“The End of an Architectural Era (It’s Time for a Complete Rewrite)”).

This architecture gives me pause: queries have to be extremely partition conscious, otherwise they are working on possibly incomplete data. I’m sure I can get used to it, considering the performance benefits.

However, the present incarnation of the db (v 1.0) is a complete non-starter purely for its failure behavior. A failed node cannot restart and automatically join its peers. The entire database and all clients have to be restarted. Are you kidding me?

I benchmarked one of our workloads at myYearbook against VoltDB and most definitely validated their performance claims as far as single-partition accesses go. Thankfully, a good amount of our data can easily be partitioned, so no schema redesign is really necessary. The main drawbacks I see are lack of driver support and some missing SQL features. While we’re not yet running VoltDB in production, it’s definitely a promising system that I can see being adopted by organizations that are developing new OLTP-oriented applications.

Thanks for trying out VoltDB. We’re hard at work on future releases. If you want to let us know what driver support or SQL would be most helpful, please visit our forums and let us hear about your experiences: http://community.voltdb.com/forum.