
I have a few hundred small databases, each between 0.2GB and 4.0GB in size. These are sitting on a sharded mongodb environment with 10 or so shards.

At any one time, only a very small subset of these databases are being (intensively) written to.

All of them are being intensively read from, all of the time (Target OTE 300,000 queries per second). I can exert enough control over the read order to spread the reads across the databases fairly evenly.

Right now, none of these small databases are partitioned.

When I look at the output of the db.printShardingStatus() command, I see that most of the databases are sitting on shard0001. Indeed, mongostat shows that most of the reads are hitting that one machine.

I have (so far) done nothing to try to influence which db goes on which machine.

My question is this: Left to its own devices, will MongoDB automatically move the primary for these small databases so that the load (eventually) ends up being more balanced, or do I have to intervene in the process myself?

(Or should I try to partition these databases over multiple machines so that the index size is smaller on each shard, then re-sequence my reads so that I hit each database one at a time?)

These collections seem too small to be sharded. Have you considered moving them to separate mongod servers, which can run on different machines? That gives you complete control over which databases are on which machines and keeps their indexes from fighting for the same RAM on the same machine...
– Asya Kamsky, Jun 27 '12 at 4:35

I could do that, but I was hoping I would not have to. We already have the 10-machine cluster available, and too little time to reconfigure it dramatically. (It is also being used by another set of processes, which complicates things.) The databases are automatically generated, one per hour, so I could switch to one per day, which would make them big enough to shard (at the expense of some write performance). I would like to be able to explicitly specify which shard they should go on, but I suspect that movePrimary will not work for this.
– William Payne, Jun 27 '12 at 15:41

1 Answer

Left to its own devices, no, MongoDB will not move those unsharded databases to a different primary shard - the automatic balancing only applies to chunks from sharded collections.

MongoDB will round-robin through your shards as databases are created, which spreads new databases across all the shards. If you had one shard originally and later expanded to many, the databases may have been concentrated on that original shard: the round-robin placement only applies when you create the database, not to the collections inside it.
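
To check how concentrated the databases currently are, one option is to count databases per primary shard. The sketch below is only an illustration: it uses hard-coded sample documents in the shape of the `config.databases` collection (`_id` is the database name, `primary` is the shard holding it), and the database and shard names are made up. In the mongo shell, connected to a mongos, the real documents could be read with `db.getSiblingDB("config").databases.find().toArray()`.

```javascript
// Hypothetical sample data shaped like documents in config.databases.
const sampleDatabases = [
  { _id: "hourly_a", primary: "shard0001" },
  { _id: "hourly_b", primary: "shard0001" },
  { _id: "hourly_c", primary: "shard0003" },
  { _id: "hourly_d", primary: "shard0001" },
];

// Count how many databases have each shard as their primary.
function countByPrimary(databases) {
  const counts = {};
  for (const database of databases) {
    counts[database.primary] = (counts[database.primary] || 0) + 1;
  }
  return counts;
}

const primaryCounts = countByPrimary(sampleDatabases);
console.log(JSON.stringify(primaryCounts));
```

A heavily skewed count, like the one the question describes for shard0001, suggests the databases need to be moved by hand.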

Once the databases are created, and assuming you can predict what will be used and when, you can then move them to whatever shard you wish using the movePrimary command and distribute load accordingly:
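
As a minimal sketch of what that could look like: `movePrimary` is an admin command issued through a mongos, taking the database name and the target shard. The helper below only builds the command documents for a simple even spread. The database names, shard names, and the `planMoves` helper itself are hypothetical and not part of MongoDB.

```javascript
// Hypothetical current placement, shaped like documents in config.databases.
const databases = [
  { _id: "hourly_a", primary: "shard0001" },
  { _id: "hourly_b", primary: "shard0001" },
  { _id: "hourly_c", primary: "shard0001" },
  { _id: "hourly_d", primary: "shard0001" },
];
// Hypothetical shard names; the real ones appear in db.printShardingStatus().
const shards = ["shard0000", "shard0001", "shard0002"];

// Assign databases to shards in turn and emit a movePrimary command
// document for every database that is not already on its target shard.
function planMoves(databases, shards) {
  const moves = [];
  databases.forEach((database, index) => {
    const target = shards[index % shards.length];
    if (database.primary !== target) {
      moves.push({ movePrimary: database._id, to: target });
    }
  });
  return moves;
}

const moves = planMoves(databases, shards);
moves.forEach((command) => console.log(JSON.stringify(command)));

// In the mongo shell, connected to a mongos, each command would be run as:
//   db.adminCommand({ movePrimary: "hourly_a", to: "shard0000" })
```

Since `movePrimary` relocates a database's unsharded data, it is safest to run it on a database that is not being written to at the time. That fits the workload described in the question, where only a small subset of databases is being written to at any one time.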

Thank you very much for your answer. I am now sharding the databases when they are created, so that the load balancer can spread them around as required. I am expecting the databases to get quite a bit bigger as the load on our system ramps up, so I think that this is the right approach.
– William Payne, Jun 29 '12 at 15:25