Saturday, August 31, 2013

Handling mongodb connections

Mongodb is fast and for medium to large setup, it just works out of the box. But for setups which are bigger than large, you may run into a situation where the number of connections max out. So for extra large setups, let us look at how to increase the number of connections in a mongodb server.

Mongodb uses file descriptors to manage connections. In most unix-like operating systems, the default number of file descriptors available are set to 1024. This can be verified by using the command ulimit -n which should give the output as 1024. Ulimit is the per user limitation of various resources. Ulimit can be temporarily changed by issueing the command ulimit -n .

To change file descriptors system wide, change or add the following line to your /etc/sysctl.conf

fs.file-max = 64000

Now every user in your server would be able to use 64000 file descriptors instead of the earlier 1024. For a per user configuration, you will have to tweak the hard and soft limits in /etc/security/limits.conf.

By default the mongodb configuration file mostly mongodb.conf does not specify the number of max connections. It depends directly on the number of available file descriptors. But you can control it using the maxConns variable. Suppose you want to set the number of max connections to 8000, you will have to put the following configuration line in your mongodb.conf file.

maxConns = 8000

Remember : mongodb cannot use more than 20,000 connections on a server.

I recently came across a scenario where my mongodb which was using 7000 connections maxed out. I have a replica set configured, where there is a single master and multiple replicas. All reads and writes were happening on the master. With replica sets, the problem is that if the master mongo is restarted, any one of the replica mongo servers may become the master or primary. The TMC problem was caused by a missing index which caused multiple update queries to get queued up. To solve the problem, firstly the missing index was applied. Next all queries which were locking the collection had to be killed.

Here is a quick function that we were able to put together to first list all pending write operations and then kill them.