Reading enormous amounts of data from MongoDB using MongooseJS

In my experiments with MongooseJS/MongoDB I stumbled upon a huge performance problem: requests took awfully long, somewhere between 20 and 25 seconds to read 100,000 documents.
So why does this happen? I had my field indexed! Being a Node/Mongo/MongooseJS noob (yes, all three at once), I was pretty sure this had something to do with the way I was writing my queries, and guess what, it turns out I was right! Let's break it down.

Approach 1 (avg. 20 secs)

Below is a code snippet that reads from MongoDB using Mongoose's .find().
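A minimal sketch of what that looks like. The User model, connection string, and query are hypothetical placeholders for illustration, not the actual schema from my experiment:

```javascript
var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');

// Hypothetical model; the real schema had an index on the queried field.
var User = mongoose.model('User', new mongoose.Schema({
  age: { type: Number, index: true }
}));

console.time('find');
User.find({ age: { $gte: 21 } }, function (err, docs) {
  if (err) throw err;
  // docs is an array of full Mongoose documents, each one
  // hydrated with getters, setters and change tracking.
  console.timeEnd('find');
  console.log('read %d documents', docs.length);
  mongoose.disconnect();
});
```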

Though this works well for small amounts of data, huge reads are almost always slow. Under the hood, .find() does a lot of work: it checks that the connection to MongoDB is up, then hydrates every result into a full Mongoose document with getters, setters, and change tracking. That per-document hydration is what makes the .find() operation take so long to complete.

Approach 2 (avg. 16 secs)

How about using stream()? Well, this also takes approximately the same amount of time. The code below fires the 'data' handler for every document that is read, and finally fires 'close' when everything completes, an error occurs, or destroy() is called somewhere in the code.
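Sketched out, the streaming version looks something like this, with the same hypothetical User model as before. Note that Query#stream() is the Mongoose 3.x API; later versions deprecated it in favour of Query#cursor():

```javascript
var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');

var User = mongoose.model('User', new mongoose.Schema({
  age: { type: Number, index: true }
}));

var stream = User.find({ age: { $gte: 21 } }).stream();
var count = 0;

stream.on('data', function (doc) {
  // Called once per document; doc is still a fully
  // hydrated Mongoose document, which is why this is
  // not much faster than the plain .find() above.
  count++;
});

stream.on('error', function (err) {
  console.error(err);
});

stream.on('close', function () {
  // Fired when the cursor is exhausted, an error occurred,
  // or stream.destroy() was called.
  console.log('streamed %d documents', count);
  mongoose.disconnect();
});
```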

Approach 3 (lean)

The below snippet uses the lean option. This is quite fast, but there is a trade-off: the documents returned by this call are plain JavaScript objects, not Mongoose documents. This means you do not get any of the Mongoose magic that .find() from approach 1 provides. You can find a discussion about this here. It works like a charm in high-performance scenarios.
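A minimal sketch of the lean version, using the same hypothetical model. The only change from approach 1 is chaining .lean() onto the query, which skips the hydration step entirely:

```javascript
var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');

var User = mongoose.model('User', new mongoose.Schema({
  age: { type: Number, index: true }
}));

console.time('lean find');
User.find({ age: { $gte: 21 } }).lean().exec(function (err, docs) {
  if (err) throw err;
  // docs are plain JavaScript objects: no getters/setters,
  // no change tracking, so something like docs[0].save()
  // is not available here.
  console.timeEnd('lean find');
  console.log('read %d documents', docs.length);
  mongoose.disconnect();
});
```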

And that's a wrap. For now I am sticking with this approach until I find a better solution; caching comes to mind, but that will have to wait until I graduate from noobdom.
In conclusion, if you only need to do reads that are not going to result in updates, lean() is always the better option: it keeps your queries running fast, because hey, everyone loves speed.