MongoDB or How I learned to stop worrying and love SQL

We’ve been using MongoDB for a year and a half at ThoughtLeadr. During that time we’ve gone from elation to depression using this trendy NoSQL datastore. Based on the documentation, it’s not hard to see why you’d get pulled in: a schemaless, performant database that can use both sharding and replica sets to maintain high availability at nearly limitless scale. Well, I should have known that when something sounds too good to be true, it probably is. Here’s the lowdown of MongoDB’s web-scale breakdown.

Global lock

I’ll admit that I didn’t realize the global lock (now a database-level lock) was such a major issue when I first started using MongoDB. I’ve never written database internals myself. While that doesn’t excuse me from doing my homework, I bought into 10gen’s benchmarks page. Oh wait, they don’t have benchmarks? Strange, I remember reading all these great articles about MongoDB’s performance when I picked it up. Wayback Machine to the rescue. This is one of the most frustrating aspects of working with MongoDB: the global lock has far greater repercussions in production than what you see in the benchmarks. I’ll get into specifics below, but the global lock is a recurring theme throughout, a serious flaw in any database design, and one that should have been represented more honestly.

MapReduce is useless at web scale

When you run a MapReduce job against a database, the global write lock stops any other process from manipulating that data, meaning that if you run even a moderate number of MapReduces per hour, you massively degrade application performance against those collections. Worse, you can block the rest of the replica set from syncing in a timely fashion, which can trigger “primary” database switches and accidental loss of data.

One of the core principles of MapReduce is concurrent data processing, so that large datasets can be analyzed quickly. With MongoDB, MapReduces run inside a single-threaded JavaScript VM, eliminating concurrency and slowing web-scale data processing to a crawl.
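To see why the single-threaded VM hurts, here is a minimal Python sketch of what a MapReduce job conceptually does (the documents and field names are hypothetical, and this is not the pymongo API): every document funnels through one serial loop, just as every document in a Mongo MapReduce funnels through one JavaScript thread.

```python
from collections import defaultdict

# Hypothetical documents standing in for a MongoDB collection.
events = [
    {"user": "alice", "clicks": 3},
    {"user": "bob", "clicks": 1},
    {"user": "alice", "clicks": 2},
]

def map_reduce(docs, map_fn, reduce_fn):
    """Serial map/reduce, mirroring MongoDB's single-threaded JS VM:
    every document is processed one at a time on a single thread."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):  # the equivalent of emit(key, value)
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# The moral equivalent of a Mongo map/emit plus reduce, summing clicks per user.
result = map_reduce(
    events,
    map_fn=lambda doc: [(doc["user"], doc["clicks"])],
    reduce_fn=lambda key, values: sum(values),
)
print(result)  # {'alice': 5, 'bob': 1}
```

The whole point of the MapReduce model is that the map calls can fan out across cores or machines; collapsing them into one loop, as above, is exactly the degradation you get from a single-threaded VM.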

Sysadmin tooling is painful

Every major sysadmin tool is a blocking process. Most of this falls back on the global write lock or immature tooling, but in practical terms, if you need to modify your database structure in production, you’ll be forced to take downtime.

Here’s a great example: database compaction only works if you are under 50% disk utilization. Let me repeat that. Database compaction only works if you’re not using more than half your disk. Have you ever created a collection only to realize later that you don’t really need it? If you have over half your disk in use, you’ll need to take one of the replica set members offline, delete the entire database manually, then bring it back up to sync all the data from the primary. Only after it has finished syncing, a process that can take days, can you promote that system to primary and repeat the process with the other members of the replica set.

A woeful lack of production grade tooling

We never really felt the need to use an ORM tool with MongoDB, since its JSON data structures map nicely to their Python and Haskell (our core languages) analogs. However, after using MongoDB in production for over a year, migrations became a real pain. There are no mature tools to simplify this process, forcing any adopter to eventually roll their own migration system. Nor is there a clear best practice for migrating objects in storage. Do you use a framework to lazily migrate objects as you need them? Or run a migration script to update all the data at once (touching some objects even if you never use them again)? The answer is both, heavily dependent on the situation. When you spend most of your time in the NoSQL world, it’s easy to forget that migration support is built into SQL with the ALTER TABLE command.
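To make the lazy option concrete, here is a minimal sketch of the pattern in Python (all names here are hypothetical, and a batch migration script would simply run the same upgrade loop over every document in the collection): each document carries a schema version, and reads upgrade stale documents on the fly.

```python
# Current schema version for this (hypothetical) collection.
SCHEMA_VERSION = 2

def _v1_to_v2(doc):
    # Example change: v2 split a single "name" field into first/last.
    first, _, last = doc.pop("name").partition(" ")
    doc["first_name"], doc["last_name"] = first, last
    doc["_version"] = 2
    return doc

# Map each old version to the function that upgrades it one step.
MIGRATIONS = {1: _v1_to_v2}

def load_document(doc):
    """Upgrade a document step by step until it matches the current schema."""
    while doc.get("_version", 1) < SCHEMA_VERSION:
        doc = MIGRATIONS[doc.get("_version", 1)](doc)
    return doc

legacy = {"_version": 1, "name": "Ada Lovelace"}
print(load_document(legacy))
# {'_version': 2, 'first_name': 'Ada', 'last_name': 'Lovelace'}
```

The lazy version keeps writes cheap but means old-format documents linger indefinitely; the batch version pays the write lock up front. In practice we ended up needing both.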

What’s next?

Honestly, we started using MongoDB because of its great documentation and blazing developer speed (it’s amazingly fast to get up and running and building features). The problems only crop up when your product has real traction, real data, and real scale. Then it becomes apparent that MongoDB isn’t ready for prime time. We’ve already switched over to Percona for our production metadata database, but we’re not done with NoSQL. Our full database stack still includes Redis and Riak, since we have a need for fast I/O and big data, respectively.

Brady Sullivan

Albert Einstein

Andrew Pennebaker

Haha, nice. I just finished a distributed and concurrent systems programming course at GMU, where my professor demonstrated the simplicity and ineffectiveness of a single-lock distributed system. Alternative solutions include transactional memory, something to look into with Haskell/STM.

Have you tried Redis? This makes two thumbs-down articles I’ve read on MongoDB, and I’m wondering if Mongo in particular is suboptimal, or if it’s a problem in other NoSQLs as well.

I know Google uses MapReduce to great effect. What kind of non-global-locking database do they use?

Jeff

I believe Andrew Pennebaker wonders what non-global-locking database Google uses, not what ThoughtLeadr uses. Google uses BigTable as a datastore, but I don’t know what its locking is like. I’m sure that info is easily searchable.

Xorlev

We’ve had similar experiences with Mongo. Back when everything fit on a single server and we were much less seasoned in the ways of Mongo, we attempted a data migration with MapReduce. I wish I was joking.

Another somewhat painful realization is that sharding is really the only way to scale out reads. I’d hoped that adding more slaves would do the job, but 5-10s behind on replication (on beefy boxes, only taking in 500 writes/sec — durably) made it a little too eventually consistent for our uses.

Riak is a fantastic datastore, but we moved back to MySQL for anything we really cared about. The real eye-opener for me was the realization that NoSQL started as a way to get great lookup characteristics for key -> value pairs, but has started adding on indexes and all the complexity and slowness that comes with them. At that point, RDBMSes have a leg up with years of mature, tuned index implementations.

Xorlev

At the time, it -was- the correct choice. We had extremely fluid schemas and were trying to prove out a product without putting too much into it. MongoDB was great to us during that period.

MongoDB promises a lot but has a lot of caveats when you go to make it a functional datastore for a real business.

I’m not sure what part of the Mongo manual says, “hey, you can only scale if your writes aren’t durable.” I agree though, the manual required a close look once we had time to make sure our data story was in line. It certainly does scale, but not effortlessly like it tries to claim. They’ve improved things significantly in Mongo 2.2, but I still expected more out of Mongo given the hardware we threw at it.

Funny – virtually all posts involving MongoDB not quite working for someone (immediately, eventually, whenever) end up with the “you should have read the documentation more carefully” argument.
I suspect there is some Godwin’s Law equivalent for MongoDB —
“All discussions about MongoDB will eventually claim vindication via documentation” 🙂

Shakakai

The delay on replication was less of an issue for us but it is definitely something you need to account for when selecting Mongo. We love Riak. All our MapReduce functionality runs in Erlang across our Riak cluster, fast and efficient.

We are helping at least two customers a month migrate away from MongoDB to something that really scales, i.e. GridGain. With production customers running on 1000s of nodes in fully ACID transactional mode on document data in real time, GridGain can handle a lot. Take a look: http://www.gridgain.com

Shakakai

Anon

Mongo’s architecture is a joke. This sort of story is no surprise to folks who’ve been working with database internals. This has nothing to do with SQL vs NoSQL, and everything to do with the fact that their system is basically a pile of rookie mistakes… Memory-mapped files, seriously?

It is certainly interesting to read articles like this; we use MongoDB in our production app (with ~6K hits per week so far and growing) and so far everything is working great. Could we have done the same thing with SQL? Probably, but much of it would have been more of a pain. Totally agree with the frustration about insufficient tooling; of course, MySQL was the same way when it first came out. I think most of the issues you mention will be fixed with time.

The Map/Reduce issues you mention are definitely important for anyone to understand; map/reduce is not something that works well for doing anything complex in real time. If used differently, however, it can be insanely powerful. http://hamstudy.org (a personal site, not the production one I mentioned earlier) uses map/reduce to keep track of user statistics by simply updating them as it goes, with an incremental reduce-into-collection map/reduce; it’s blazing fast. MongoDB 2.2 has introduced the aggregation framework, though, which hopefully fixes some of the other cases where map/reduce doesn’t cut it due to the lock.

The only thing I really disagree with in this article is the statement that “MongoDB isn’t ready for prime time.” That’s not true; it is absolutely ready for prime time, but it is unfortunately not always easy to determine ahead of time what performance issues you will have. If you understand MongoDB well enough and use it for the things it is good at, then it is absolutely ready for prime time. If you misunderstand some of it (which unfortunately many of us do; the docs are still a little young, so it’s not always easy to understand all of it correctly without experience), then you may be trying to do things with it that it simply isn’t designed to do.

The #1 thing people should realize before signing up to use MongoDB is that it is *not* a drop-in replacement for SQL; it’s a totally different system, a different paradigm. Many things are far, far easier to deal with and incredibly slick. Unfortunately, some things simply aren’t possible, or can’t reasonably be done in a performant manner.

The good news is that the MongoDB team seems to be (from what I can tell) working hard to continue improving and to provide solutions to these problems. Until then, understand what they are and choose wisely =]

Noone

I’m sorry, I didn’t realize this was a contest. The point is that even that is enough load to start taking measurements and see what does and doesn’t work. The important thing is to understand the tools and the performance implications of the various pieces.

Shakakai

Absolutely agreed. That tends to be the case with any database, though, and there are other websites using MongoDB successfully with far more users than that. The argument that the database “isn’t ready for prime time” because the things you expected to work at that scale weren’t performant is logically flawed, though; it simply means that map/reduce, in the way you’re using it, isn’t feasible at that scale, or that the other particular things you were doing aren’t.

One of the biggest challenges with MongoDB (and other NoSQL databases, but MongoDB more than most) is that it is so similar to SQL in so many ways that the natural first instinct is to do things the same way, just substituting things straight across. That doesn’t work; it’s a totally different database architecture and requires an entirely different architecture for your classes. For extensive aggregation and reporting, map/reduce isn’t the equal of SQL joins and groupings. That may mean MongoDB isn’t a good option in your case, but it doesn’t make it a bad option for everything else, just for your use.

Pierce Wetter

Shakakai

That’s true, but they replaced it with a database-level lock. The best practice now is to have only one collection per database to minimize lock contention. Makes you wonder why they even have databases at all if you have to organize everything at the collection level.

mjasay

I find these sorts of articles frustrating, because for every “MongoDB doesn’t work” I know of scads more “MongoDB is manna from heaven” cases. The problem is that no one feels compelled to write up the latter, which probably contributes to more of the former. We need better knowledge-sharing between those who are successful with MongoDB (or any technology, really) and those just coming up to speed. I know companies (big brand names that you use every day) that are running MongoDB at massive scale. But are they going to blog about it? Almost certainly not.

None of which is to downplay the particular problems encountered here. Whether it’s user error (“you should have read the manual”) or real problems with the technology, the result is the same: an unhappy user.

Todd, any chance you’d be willing to debrief with some friends at 10gen? Not to get you back on MongoDB, but rather to just try to better understand the complete experience you had, so that it can be improved for others (and hopefully for you on your next application). Ping me at mjasay @ that Google mail thing.

Shakakai

mjasay

I should note that I’m not disinterested in this, having very recently joined 10gen. Still, the reason I joined is because while MongoDB isn’t perfect, I do believe it’s moving (fast) in the right direction and is already good for a great many use cases. It may not have been ideal for yours, Todd, which is why I’m glad we’re going to be talking about it, to see if we could have done something better and helped you be successful.