I wanted to try something from the "NOSQL" camp, and MongoDB looked like one of the darlings of the whole idea / "movement". If this is really representative of the whole group, it's very, very unimpressive. (update: there are better ones)

I have created a FreeBSD 8 port for MongoDB, but I won't be using it for anything because of these things I found out about it:

MongoDB has no, worth repeating: NO provisions for on-disk consistency. It doesn't fsync, it doesn't use algorithms that would ensure some degree of safety (e.g. journalling), nothing. It does lazy writes! Which means that in case of server or process crashes (which will happen with 100% probability) the whole database could very trivially be corrupted beyond repair. For someone like me who comes from the "real" database world, this makes it unusable.

Its performance sucks. MongoDB uses mmapped files for databases (i.e. it creates large files of 32 MB, 64 MB, etc., which it mmaps and works on as memory regions). This kind of architecture is, at least in theory, excellent for many reasons, but apparently MongoDB gets only very mediocre performance out of it. Considering that it is, from the operating system's view, a memory database not unlike memcached, the performance I got from it - around 11,000 simple INSERTs per second (with the Python client) - is inadequate.

Apparently, the developers are willing to work on solving the data consistency issue (which will probably mean they'll have to abandon the pure-mmap approach or at least make very careful use of fsync()s) some time in the future, but I am puzzled by the low performance. Maybe the BSON overhead is too large?

Replication (which MongoDB has built-in) is not a replacement for on-disk consistency, for much the same reasons that RAID is not a replacement for backups and vice versa: they simply solve different problems. To be fair, MongoDB's site does contain a page which lists uses for which it isn't well suited - which will hopefully help users who are not aware of these problems.

Of course, MongoDB isn't the only NOSQL database - OTOH there is CouchDB, which apparently takes data integrity seriously but opts for the lazy approach: perpetual contiguous journalling, which means old free space is never automatically reclaimed. And it uses a text protocol for exchanging data! Sigh... maybe I'll try it if it ever reaches version 1.0.
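
To make the trade-off concrete, here is a minimal sketch of the tail-append idea (my own toy code, not CouchDB's actual on-disk format): each record carries a length and a CRC, a torn tail record is dropped on recovery, and compaction rewrites only the live versions into a new file.

```python
import json, os, struct, zlib

class AppendOnlyStore:
    """Toy tail-append key/value store. Each record on disk is:
    4-byte length + 4-byte CRC32 + JSON payload."""

    def __init__(self, path):
        self.path = path
        self.data = {}        # latest value per key, rebuilt on open
        self._recover()
        self.f = open(path, "ab")

    def _recover(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, "rb") as f:
            while True:
                hdr = f.read(8)
                if len(hdr) < 8:
                    break     # clean EOF, or a torn header
                length, crc = struct.unpack("<II", hdr)
                payload = f.read(length)
                if len(payload) < length or zlib.crc32(payload) != crc:
                    break     # torn record at the tail: drop it
                rec = json.loads(payload)
                self.data[rec["k"]] = rec["v"]

    def put(self, key, value):
        payload = json.dumps({"k": key, "v": value}).encode()
        self.f.write(struct.pack("<II", len(payload), zlib.crc32(payload)))
        self.f.write(payload)
        self.f.flush()
        os.fsync(self.f.fileno())   # durable once this returns
        self.data[key] = value

    def compact(self):
        """Rewrite only the live versions into a new file, fsync it,
        then atomically swap it in. Needs up to 2x disk space while running."""
        tmp = self.path + ".compact"
        with open(tmp, "wb") as out:
            for k, v in self.data.items():
                payload = json.dumps({"k": k, "v": v}).encode()
                out.write(struct.pack("<II", len(payload), zlib.crc32(payload)))
                out.write(payload)
            out.flush()
            os.fsync(out.fileno())
        self.f.close()
        os.replace(tmp, self.path)  # atomic rename reclaims old space
        self.f = open(self.path, "ab")
```

Note that compaction temporarily needs a second copy of the live data on disk, which is the usual objection to this design.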

The whole "document store" thing looks a little too rudimentary, considering that PostgreSQL has an arbitrary key-value "document" field type (hstore) and also an XML field type, both of which are "smart", searchable and indexable, with impeccable data consistency, performance and even some modes of replication.
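
For reference, a sketch of what that looks like in PostgreSQL (the hstore contrib module; on versions before 9.1 you load the contrib SQL script instead of CREATE EXTENSION):

```sql
-- the key-value "document" type mentioned above
CREATE EXTENSION hstore;

CREATE TABLE docs (
    id  serial PRIMARY KEY,
    doc hstore
);

-- a GIN index makes key-existence and containment queries indexable
CREATE INDEX docs_doc_idx ON docs USING gin (doc);

INSERT INTO docs (doc) VALUES ('author => ivan, tags => "db,nosql"');

SELECT * FROM docs WHERE doc ? 'author';            -- rows that have the key
SELECT * FROM docs WHERE doc -> 'author' = 'ivan';  -- match on the value
```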

#1 Re: A short time with MongoDB

Added on 2009-11-07T05:31 by Sam Baskinger

Hey Ivan,

I don't think your experience with Mongo is representative of where the
NOSQL groups are going. The idea is to have fast access by having
horizontal scaling and replication. That also "solves" the disk write-back
consistency issues. Hope that gives you more "hope" in the movement.
:) The magic is in the distribution of the system more than in the single
node instance.

#2 Re: A short time with MongoDB

Added on 2009-11-07T11:39 by Ivan Voras

It makes sense on one level, but I think it just opens another question
- who is it for? Today, it is easier (cheaper) to get more hard drives than
to get more servers - not everyone is Google :)

What I'm saying basically comes down to the question: "what is the
disaster recovery scenario?" The minimum "disaster" here is a
room-scale power outage which takes out all servers within the room,
thus, in the case of MongoDB, corrupting all replicated instances. Since the
on-disk state is never consistent during regular operation (no consistency
algorithms are used, and neither is the OS's in-memory image of the cached
files consistent), you cannot create reliable backups.

It looks like the whole thing relies on a large number of possibly
geographically distributed servers, so that all of them are never
down at the same time. Again, maybe feasible for Google, but I feel it's
absolutely not for 90% of current users of MongoDB.

#3 Re: A short time with MongoDB

Added on 2009-11-11T12:31 by Ivan Voras

Also, all users who think data consistency via memory-only (or
lazily-written) databases is so easy should ask themselves: why
doesn't Google (which for all purposes has nearly unlimited hardware
resources for replication) do it this way? :)

#4 Re: A short time with MongoDB

Added on 2009-11-14T18:19 by Emil Eifrem

Neo4j (http://neo4j.org) is taking a different route. We have a robust
transactional core, which has been running in 24/7 production since 2003.
We support XA-protocol transactions (yes, even including, -gasp!-, 2PC!),
deadlock detection, transaction recovery, JTA, online snapshot backups,
MVCC, etc.

Now we're rolling out replication and next in line is auto sharding.
It's the other way around from a lot of NOSQL projects, but we feel that
it's better to start with a solid transactional core and then go from
there.

--

Emil Eifrem

http://neo4j.org

http://twitter.com/emileifrem

#5 Robust Storage

Added on 2009-11-14T18:28 by Chris Anderson

Ivan,

Thanks for making the point that we call them databases because we want
to get back what we put in.

CouchDB uses pure tail-append storage with compaction because it is the
simplest approach to know is reliable. As long as your disk
respects fsyncs (or doesn't reorder writes), there is no possibility of
CouchDB crashing into an inconsistent state.

We could implement in-place compaction with a free-space map but I
honestly don't think it's worth it in terms of complexity. We're aiming to
be the Honda Accord of databases, so simplicity is a prime goal.

Also, don't underestimate the performance benefits of pure tail-append
storage (especially as we write binary attachments in parallel, so we can
handle a lot of concurrent uploads). Both SSDs and spinning platters have
the highest throughput (and I'd venture to say longer MTBF) with contiguous
writes.

#6 Re: Robust Storage

Added on 2009-11-14T19:05 by Ivan Voras

I'm aware of and fine with append storage as a reliable method of storing
data, but most serious implementations choose to limit the use of this
method to small journalling-style logs instead of their core data set :)

Orthogonally to this: linearly written journals (of which append-only
files are just a subtype) are not the *only* way of ensuring consistency,
but they have become extremely popular with spinning disk media because
they're the fastest way. The alternative is to do a lot of fsync-ed scattered
writes all over the place, which, while exactly as safe (or actually safer in
some regards), is slow. SSDs will
of course change this.

As FreeBSD's "softupdates" technology demonstrates, you don't
specifically need a journal-like structure to achieve either consistency or
performance in real-life file systems, as long as the assumption of no write
reordering by the drives (or the controller) holds. It's just noticeably
easier to do journalling than proper algorithms that order scattered writes
so that the structures in use (metadata, indexes, etc.) maintain internal
consistency at absolutely every point in time. Though this may sound
insanely hard, it has been done :)
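
A toy illustration of the ordering idea (nothing like softupdates' real dependency tracking, just the core "data first, then the pointer" rule): write the new version to its own file, fsync it, and only then atomically repoint the current name at it. At every instant the on-disk state names either the complete old version or the complete new one, assuming the drive honors the fsyncs.

```python
import os

def ordered_update(dirpath, key, value):
    """Crash-safe update of one record without any journal."""
    newfile = os.path.join(dirpath, key + ".new")
    with open(newfile, "wb") as f:
        f.write(value)
        f.flush()
        os.fsync(f.fileno())        # step 1: the data is fully on disk
    os.replace(newfile, os.path.join(dirpath, key))  # step 2: flip the pointer
    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)             # make the rename itself durable
    finally:
        os.close(dirfd)

def read_current(dirpath, key):
    with open(os.path.join(dirpath, key), "rb") as f:
        return f.read()
```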

#7 Re: Robust Storage

Added on 2009-11-14T19:06 by Jonathan Ellis

Cassandra fsyncs, either every-N-ms or before acking a write,
configurably.
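
For reference, in versions of Cassandra configured through cassandra.yaml, those two modes look roughly like this (key names may vary by version, so treat it as a sketch):

```yaml
# fsync the commit log every N ms; writes are acked before the sync
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000

# -- or: group writes and fsync before acking --
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 50
```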

#8 Re: Robust Storage

Added on 2009-11-14T19:26 by Jan Lehnardt

Hi Ivan,

Chris just said the append only log is easiest to get right. Wikipedia
(heh) says getting softupdates right is hard
(http://en.wikipedia.org/wiki/Soft_updates)* :)

And I'm not sure that just because nobody treats the whole DB as a log,
it is a bad idea :) There are disadvantages (2x disk space needed for
compaction) that can be solved with more complexity. But then, simplicity
rules. I know of a high-volume production CouchDB setup that happily
compacts / garbage collects more than 2^15 times a day.

Cheers

Jan

--

* I'm a huge fan of the *BSDs.

#9 Re: Robust Storage

Added on 2009-11-14T19:37 by Ivan Voras

It is true that, universally, the simplest working solution is almost
always good enough and worth pursuing, so I'm not going to nag the CouchDB
developers to suddenly switch to another data model :)

On the other hand, most of the uses I have that could benefit from a
document store like CouchDB also tend to involve frequently updated data,
so I'll just have to try it and see if it starts to waste unacceptable
amounts of disk space.

OTOH, there's still all this "we shall parse and construct JSON at
every data query and entry" thing... you wouldn't believe how much
performance can be gained from using a sane binary parsable format as
opposed to a text format. I know, I did it for a (different) project.

#10 Re: Robust Storage

Added on 2009-11-14T19:47 by Jan Lehnardt

Totally, JSON is not optimal, but it is, again, simple.
Worth noting too is that a JSON parser can be faster than the much-hailed
Protocol Buffers:
http://blog.juma.me.uk/2009/03/25/json-serializationdeserialization-faster-than-protocol-buffers/

Other comparative benchmarks with CouchDB suggest that neither the
(perceived as "slow") HTTP nor JSON is a real bottleneck in operation, as
we get disk-bound soon enough.

We have a (different) production setup where we "shard by time" (a
database a day). This keeps the actual GC-process manageable (there are
other reasons for the setup, this is just a nice side-effect). But that
adds complexity one level up from CouchDB and you might just not want to
pay it.

Anyway, I don't want to turn this into a CouchDB lecture. The take-away
lesson is that NoSQL comes in a lot of flavours and one solution is not
representative for any other :)

Thanks for the worthwhile discussion!

Cheers

Jan

--

#11 Re: Robust Storage

Added on 2009-11-14T19:55 by Ivan Voras

Thank you!

#12 Re: Robust Storage

Added on 2009-11-14T20:38 by Sammy

I've been using MongoDB in production and getting around 55k inserts per
second. Not sure why your performance isn't great, but it might be more
helpful to ask questions and try to help rather than just criticize.
From what I've seen, the database often outperforms some of the
drivers. So it depends on which driver, how many indexes, etc...

As for durability, server process crashes don't corrupt anything, since
the files are still in the OS's memory. On an OS crash or power loss,
there is a risk of loss/corruption of course. However, in my
experience this is the exception, and the much more common problem is
disk failure, which is why I'm happy using replication (LAN and WAN)
for durability.

#13 Re: Robust Storage

Added on 2009-11-14T20:44 by Sammy

Also - one of the reasons I switched to MongoDB was because I lost some
MySQL data with InnoDB because of a power failure. I trusted InnoDB
too much, and ended up losing a couple hours of data. Obviously my
fault for not having enough replication, etc., but the point is that
single-failure durability is kind of a dangerous thing to rely on.

#14 Re: Robust Storage

Added on 2009-11-14T20:46 by Ivan Voras

Re: performance: I've added that I've used the Python driver, for what
it's worth.

Re: what is a "common" problem and what isn't: once is enough :)

The comments on this post are public - if MongoDB developers can
benefit from them, that's great!

#15 Re: Robust Storage

Added on 2009-11-14T20:48 by Sammy

I'm using Python as well. Wondering if it's a FreeBSD thing.
Would love to see your benchmark code. Would you mind
publishing it?

#16 Re: Performance

Added on 2009-11-14T20:56 by Ivan Voras

Sure, it's a slight modification of one of the pymongo examples. See here:
http://ivoras.net/stuff/bigtest.py . Now that you mentioned how much better
performance you get, I see that most of the CPU time is spent in Python, not
in mongodb. I am using the C BSON extension (_cbson.so), so it really might
be a bad client driver. What result do you get from my test?
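
For anyone who wants to try this without the original script, a generic harness along these lines (names are my own, not from bigtest.py; pass in a pymongo collection's insert method for the real test):

```python
import time

def bench_inserts(insert, n=100_000):
    """Time n calls to insert(doc) and return inserts per second."""
    docs = [{"_id": i, "name": "user%d" % i, "score": i % 100} for i in range(n)]
    t0 = time.perf_counter()
    for doc in docs:
        insert(doc)
    return n / (time.perf_counter() - t0)

# In-memory stand-in: shows the ceiling imposed by the Python loop itself.
store = {}
def dict_insert(doc):
    store[doc["_id"]] = doc
```

Running the dict stand-in first gives the client-side ceiling; whatever a real driver comes in below that number is the cost of the driver plus the server.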

#17 Re: Performance

Added on 2009-11-14T21:09 by Sammy

On my desktop:

28101.45 inserts/sec

python: 100% cpu

mongod: 45% cpu

Have you run a similar test on another db that has gone faster?
If so, we should send the mongo python guys the source for that
driver to figure out how to make it faster :)

#18 Re: Performance

Added on 2009-11-14T21:45 by Ivan Voras

It's curious how you got more than 2x the performance I did, but it's still
kind of on the low end :)

I'm running python 2.6, a 64-bit OS, 2 GHz Core2 CPU, disk drives are
not important since there is no disk IO during the test, mongodb 1.1.3 and
pymongo 1.1.

I haven't tested it on other workloads - there may be cases where it is
faster, but currently I don't have the time to experiment more with it.

#19 Re: Robust storage

Added on 2009-11-14T22:01 by dwight_mongodb

MongoDB does not use a transaction log, but we have found in practice
that this works just fine -- lots of sites using it in production without
problems. I think an analogy to MySQL and innodb/myisam is
appropriate. The MySQL web site says myisam, which has similar
durability to mongodb, can be many times faster than innodb. That is
the idea with mongo. We think one size fits all for databases is
over: if I were building a bond trading system I would use a different
tool.

In the past I have also had several situations where I lost data
with mirrored drives and a logging database when the drives began to have
hardware errors. My experience is that redo logs with fsync are not
enough to achieve true durability.

#20 Re: performance

Added on 2009-11-14T22:06 by dwight_mongodb

In general we hear (and see ourselves) very good things on performance
with MongoDB. It is very likely that in your benchmark the Python client
was the bottleneck. Also, I have never tested it myself on FreeBSD;
there could be some issue there.

It will not be as fast as memcached - it is not a pure key/value store,
and the addition of ad hoc queries, secondary indexes, sorting, replication,
etc. does have some overhead. However, for many problems it can be
much faster than postgres, depending on the problem.

Also, I think there are a lot of other interesting properties beyond
performance, such as easy development from object-oriented programming
languages, and horizontal scalability.

#21 Re: performance

Added on 2009-11-14T22:09 by dwight_mongodb

@Ivan: curious, what are you comparing to such that 28k inserts/second is
slow ("low end")?

#22 Re: performance

Added on 2009-11-14T22:23 by Ivan Voras

@dwight_mongodb: "low end" - I've said before: I'm comparing it to other
memory-only databases, like memcached. The reason is that, since there
is no disk IO, everything happens completely in memory. When the OS is not
told to flush data to the drives, mmapped memory is, at the low level, no
different from anonymously allocated memory.

I suppose you will not agree with me, because with MongoDB there is
still the *option* of having the data on the disk eventually, and MongoDB
data can be more complex (BSON), but still - 28 kops/s for memory databases
is what was achieved in the era of Pentium IIIs.

#23 Re: performance

Added on 2009-11-14T22:33 by dwight_mongodb

@ivan - yes, we disagree -- I feel like your argument is "mongodb is
slower than memcached, therefore I will use postgres" :-)

I'm actually a fan of memcached - if that's all you need, it is a great
tool.

#24 Re: performance

Added on 2009-11-14T23:08 by Ivan Voras

I'm saying that, since it already abandoned on-disk consistency, MongoDB
could be a lot faster than it already is :)

I'm not against databases such as MongoDB, but I feel that it, in
particular, has missed some opportunities to be better. It could have gone
for data consistency but didn't, yet at the same time it appears not to have
taken advantage of this decision to bulk up on performance.

I just couldn't resist now that this discussion is ongoing, and did
another experiment: I've created an equivalent Python script that inserts
the same records into a PostgreSQL database using the flexible ("document")
key-value data type (hstore), and got the same performance as I did with
MongoDB (around 11,000 INSERTs/s) :) Though I cheated a bit: I used
autocommit but disabled synchronous_commit, meaning the logging is still
fully active and all consistency guarantees hold, but the last few
seconds of committed transactions could be lost.

Again - I'm not trying to say the whole concept of MongoDB is bad, but
that the implementation could be better in some specific places :)

#25 Re: performance

Added on 2009-11-15T02:59 by Sammy

Just to be clear, there is a big difference between 28k ops and 28k
inserts.

I think you're maxing out the wrong thing with this test. The
case is so simple that with just about any database you're probably maxing
out CPU <-> RAM, and database architecture probably isn't even a
factor. (This is why benchmarks are so hard to do well.)

Would be curious to try your postgres code on my box as well though.
Would you mind posting that as well?

#26 Re: performance

Added on 2009-11-15T03:16 by Ivan Voras

Have fun with it: http://ivoras.net/stuff/ppg.py . Instructions are in
the file.

Don't make too big a deal of it - I'm just saying that to get
nearly as complex as PgSQL (which has a lot of overhead in this case: e.g.
SQL parsing, transactions, etc.), MongoDB has lots of room to grow.

#27 Re: performance

Added on 2009-11-15T03:17 by Sammy

I tried the same test with Java, btw, and got 51,692
inserts/sec.

Then I also changed the test to use _id as the unique field.

Java: 55k Python: 32k

I got postgres to 16k on my box, but can't seem to get it higher.

#28 Re: performance

Added on 2009-11-15T04:28 by Sammy

A little more color for those curious. The mongo Java driver and the
postgres Python driver have the same CPU usage ratio for this test:
100% db, 25% driver.

Java: 53140.610054203426 inserts/s

Postgres: 100000 INSERTs in 5.7 seconds: 17520.0 INSERTs/s

That's postgres with synchronous_commit off (it was 3.5k with it on).

#29 Re: performance

Added on 2009-11-15T08:52 by Chris Anderson

Ivan,

If you're curious about the effect the JSON/HTTP layer has on CouchDB,
I'm guessing it's pretty significant. Speed-of-light from inside the Erlang
layer is significantly faster (almost 2x) than via HTTP. However, we tell
people not to interface directly via Erlang calls, because nothing is going
to scale and deploy as smoothly as HTTP.

I haven't run the benchmark scripts since Damien's latest optimizations
but I'm guessing that right now our speed-of-light (w/o HTTP/JSON overhead)
is in the same ballpark as PostgreSQL. Since we're optimized for
concurrency over serial speed I'm feeling like that's "fast enough" but
we'll see, and of course we'll continue to remove bottlenecks as they
become apparent.

#30 Re: performance

Added on 2009-11-15T17:37 by Josh

I'm not sure the performance numbers here make sense. Remember that
mongodb is document-oriented as opposed to row-oriented. That means that
you may be writing a lot more per insertion with mongodb than with postgres.

If you are not denormalizing your data, you are missing something with
these data stores.

Just curious, can anyone run this benchmark against couchdb?

#31 Re: performance

Added on 2009-11-15T17:39 by Emil Eifrem

I threw together an equivalent test using Neo4j. On a standard Dell
server with a dual 2.3 GHz Core2, a 64-bit OS and standard SATA disks,
I inserted 100k "documents" with two properties at a throughput of
~30k / second.

I tried to emulate what I think Sammy mentioned in #27 by removing one
of the properties and using the native ID for lookups, and then throughput
rose to ~45k / second. I sized every transaction at 10,000 "documents."
This is fully ACID transactional, with guaranteed consistency, durability
and recoverability.

But it's a huge microbenchmark. It may serve as an indication, but at
the end of the day the only thing that makes sense is to benchmark real
world use cases similar to whatever domain we want to model.

#32 Re: performance

Added on 2009-11-15T18:59 by Jan Lehnardt

At this point I feel it necessary to link to two of my blog posts that
discuss benchmarks:

http://jan.prima.de/plok/archives/175-Benchmarks-You-are-Doing-it-Wrong.html
and http://jan.prima.de/~jan/plok/archives/176-Caveats-of-Evaluating-Databases.html

Cheers

Jan

--

#33 Re: reliability

Added on 2009-12-13T14:12 by Ivan Voras

It looks like the developers made a modification for inclusion in
their 1.2.x branch: http://jira.mongodb.org/browse/SERVER-442 -
"durability: fsync files to disk every minute." I hope they know that this
doesn't help durability at all unless there are absolutely no dynamic
structures stored in the mmapped region (i.e. no trees, hash arrays,
anything).

#34 Re: reliability

Added on 2010-02-19T02:57 by Mathias Stearn

We just posted to our blog about this topic:
http://blog.mongodb.org/post/381927266/what-about-durability

@Ivan, since this page is frequently linked to would you mind
adding that link to our blog into your main post?

#35 Re: reliability

Added on 2010-02-20T12:56 by Ivan Voras

@Mathias: thanks, I've written an updated post and linked it from the
main post here.

#36 Re: Performance

Added on 2010-03-16T02:53 by Mark Smith

Just as an additional data point, (the discussion has moved on a bit, I
know), I'm getting 36-38K inserts/s on my laptop, using the C# drivers.

#37 Re: Performance

Added on 2010-12-05T15:04 by prakash patidar

Hi,

I use the C# driver and am getting only 12-15k inserts/second. This is on
a dual-core laptop, and I saw my C# client taking 45% CPU (one core
completely) and the mongod process 25%. Is it the serialization from .NET
to BSON objects that is hurting me?

call = new Document();

call["_id"] = i; // int i

call["data"] = byt; // byte array of 2k chunks; even if I use a byte array
of 1 byte, performance does not change much

calls.Insert(call);

Do we have ways to bulk upload into it using C#, or some other way?

In my use case I need to append to documents - can you share code for the
same?

#38 Re: Performance

Added on 2011-03-07T18:28 by ASBai

POSIX systems use 'msync' to flush a file mapping, so it may well be OK
if you simply couldn't find an 'fsync' call in its source code.
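
In Python terms, a mmap-backed engine's sync point looks like this: mmap.flush() is what turns into msync(2), so that is the call to look for rather than fsync().

```python
import mmap, os, tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 4096)   # size the file before mapping it
os.fsync(fd)

mm = mmap.mmap(fd, 4096)
mm[0:5] = b"hello"             # the "lazy write": only a dirty page so far
mm.flush()                     # msync(2): now guaranteed on stable storage
mm.close()
os.close(fd)
```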

#39 Re: Performance

Added on 2011-08-19T11:55 by good luck

The idea of MongoDB is to use 32x more hardware to achieve the same as
with traditional database systems. It just sucks.

#40 Re: Performance

Added on 2012-05-11T03:09 by iPhone guy

That was a QED response if I've ever seen one. Thanks good luck.

#41 Re: Performance

Added on 2014-03-03T17:59 by jbg

I wonder if MongoDB is a fashion or the real thing compared to legacy
DBMSs and even other NoSQL stores.

For the moment, and probably for the next 10 years, you need consistency -
even if you are a web programmer and don't want to hear of it. Saying
"Google is consistent, uses fsync, just add more disks and servers" - yes.
But what about setups for critical data (e.g. customers), descriptions,
links, procedures...?