Thursday, September 27, 2012

MongoDB: Replication Lag, Network Partition

Replication: when everything is working, it seems the easiest thing, but the challenge comes when we have failures. This is something that MongoDB, being a database that is out for some years, shows us very well.

If you look at MongoDB's replication page, there is a section called "Rollback". This applies when there is a replication lag - your secondary is behind the primary - in conjunction with a network partition. It may look like, but it's not a scenario that uncommon.

What happens then? Since the primary is ahead of the secondary, if it wants to rejoin the cluster, it needs to somehow "resync" to the current state. This means that operations applied previously but not replicated (due to the lag) will need to be rolled back. First, this process in MongoDB is manual.

But what happens then if your secondary was behind a large amount of data - more precisely, 300Mb or more? MongoDB does not have the capability of rolling back and manual intervention is required to recover that node.

Read for yourself:

Warning: A mongod instance will not rollback more than 300 megabytes of data. If your system needs to rollback more than 300 MB, you will need to manually intervene to recover this data.

Although it supports asynchronous replication (the eventual consistency), it does not solve these resynchronization problems automatically. Maybe it was just deemed as not very common and low priority compared to other features, but can you imagine the pain if you need to go through this process?