Mea culpa. It’s far easier in our industry to set up and knock
down strawmen, as I did, than to convey messages of objective and
constructive criticism. It’s also too easy, when you are passionate
about what you believe in, to ignore the feelings and efforts of
others, which I did. I have great respect for the engineers I have met
from 10gen, Mathias Stearn and Kyle Banker. They are friendly,
approachable, helpful and fun to socialize with at conferences. Thanks
for being stand-up guys.

Also, whether we like it or not, these kinds of public embarrassments
have rippling effects across the whole NoSQL ecosystem. While Basho
has tried to distance itself from other players in the NoSQL field, we
cannot deny our origins, and the ecosystem as a “thing” is only about
3 years old. Are developers, technical managers and CTOs more wary of
new database technologies as a result of these embarrassments?
Probably. Should we continue to work hard to develop and promote
alternative data-storage solutions? Absolutely.

Making it constructive

For better or worse, many people consider MongoDB and Riak to be
competitors. In reality, there are very few similarities between the
products. Then why are they in competition? I personally believe this
is because we have largely targeted our products at the same group of
developers, those who work on web applications. So let’s take a moment
and clarify the primary differences — both for understanding the
technologies themselves and for unmuddying the current hoopla.

If I were asked why someone would use MongoDB, there are two clear
reasons in my mind:

MongoDB is fast. Say what you will about its durability (the
context of my comment from JSConf) and the global write-lock (a
consequence of its design, unfortunately), both writes and
reads tend to be of low latency. Why? They are mostly in memory
(via mmap).

MongoDB has very friendly APIs for developers. This is its biggest
strength in my mind. Despite other things you would want to address
before going to production, developers love to think of their data
as lightly-structured documents. It just makes sense.
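
To make the first point above concrete, here is a minimal sketch of why memory-mapped I/O feels fast and why durability is a separate question. It uses Python's standard mmap module and is not MongoDB's storage code; the function name mmap_write_then_flush is hypothetical and exists only for this illustration.

```python
import mmap
import os
import tempfile


def mmap_write_then_flush(payload: bytes) -> bytes:
    """Write payload through a memory map, flush it, and read it back from disk.

    Illustrative only: payload must be non-empty, because an empty file
    cannot be memory-mapped.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Size the file first so there is something to map.
        os.ftruncate(fd, len(payload))
        with mmap.mmap(fd, len(payload)) as mm:
            # This assignment only modifies pages in memory, which is
            # why writes through a memory map have low latency.
            mm[:] = payload
            # Without an explicit flush, the operating system decides
            # when dirty pages are written back to the file.
            mm.flush()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.remove(path)
```

The design point is the gap between the assignment and the flush: data that has been written to the map but not yet synced can be lost if the machine fails in between.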

In contrast, Riak’s strengths appeal more to operations folk, and
to developers who are cognizant of or experienced in production operations:

Riak is distributed and replicated at its core. There are no
special nodes or services to run to scale out; every node you start
and join acts equally among the cluster.

Riak has a strong focus on availability and durability in the face
of failure. It will gladly sacrifice raw speed and consistency for the
sake of staying available to your write load and making sure your
writes get to disk.
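
The two points above can be sketched with a toy model. This is not Riak's implementation or API: the Node class, the put and get functions, and the default of three replicas are all assumptions made for illustration. It only shows the general idea of equal peers that keep accepting writes while some of them are down.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy peer; every node plays the same role (hypothetical type)."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


def put(ring, key, value, n=3):
    """Store key on up to n live peers, walking the ring from a hashed start.

    Returns the number of peers that accepted the write.
    """
    start = zlib.crc32(key.encode()) % len(ring)
    acks = 0
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)]
        if node.up:  # skip failed peers instead of rejecting the write
            node.data[key] = value
            acks += 1
            if acks == n:
                break
    return acks


def get(ring, key):
    """Return the value from any live peer that holds the key, else None."""
    for node in ring:
        if node.up and key in node.data:
            return node.data[key]
    return None
```

In this toy, a five-node ring with one node down still places three copies of a write, and a read succeeds as long as any one of those copies is on a live node. The cost is extra work per write and the possibility that replicas disagree, which is the trade of raw speed and consistency for availability described above.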

These differences are fundamental design decisions and have associated
trade-offs. Because MongoDB’s design focus is to be a fast
single-system database, other elements of its scale-out story are
necessarily more complex — sharding, replica sets, etc. Because
Riak’s focus is on distributed fault-tolerance and reliability, it
necessarily sacrifices raw single-system performance. That’s not to
say that MongoDB can’t scale out to large clusters well, or that Riak
performs poorly in production; it is simply a recognition of the
sacrifices necessary when designing a database system that addresses
specific needs.

Could Urban Airship have used Riak instead of MongoDB for their
bounded, in-memory dataset? Maybe. Would it have worked better for
them than MongoDB? That is really difficult to tell.

Bringing it back around

Now, if I’m so buddy-buddy with the 10gen guys, why did I say such an
inflammatory thing in the first place? At Basho, we spend a decent
amount of time evaluating and comparing other technologies so that we
can understand where we stand in the market, to learn from others’
perspectives, and to address the concerns and demands of potential
customers. Naturally, this means we have examined MongoDB
closely. MongoDB’s visibility, popularity, and developer-friendliness
are things to be respected, even if we criticize the engineering
decisions made by 10gen.

Shortly before JSConf, I had personally spent some time finding
ways to demonstrate that MongoDB will lose writes in the face of
failure, to be used in a competitive comparison. Let’s just say that I
was successful in doing so, despite recent improvements that 10gen has
made. Unfortunately, I am not at liberty to share the results, nor do
I think it would be constructive to this discussion. I’m sure 10gen
has its own collection of competitive comparisons that are designed to
shed a positive light on their product in contrast to Riak; that’s just
how business works.

We also both know our systems’ weaknesses and are working hard to fix
them. 10gen’s most recent releases have demonstrated this fact, as I
believe Basho’s recent releases have as well. (Have you tried out Riak
1.0? It’s awesome.)

So what now?

The honeymoon phase of NoSQL is over. Will 10gen make the hard
decisions it needs to make MongoDB easier to scale out and more
durable, while maintaining its reputation for snappy
performance? I believe they will. Will Basho improve Riak’s
developer-friendliness and raw performance, while maintaining its
reputation for simplicity and reliability in operations? I have no doubt.

So instead of gloating over each other’s failures, let’s toast to the
challenges and all become stronger, more proficient, and more
successful as a result.