This particular application has been collecting data for months now, but hasn’t really had any users by design. At 33GB of data, pulling up a list of messages received was taking f-o-r-e-v-e-r!

So I decided to document how to go about fixing a running production system… Hope it helps.

Log into the mongo console and turn on profiling (the ‘1’) to monitor slow queries. I set the threshold to 10 seconds, which really stinks (!). You should adjust it to suit your app’s definition of “slow”, maybe 500ms:

To prevent this sort of thing, consider adding indexes whenever you create new queries. The best approach, though, is to be empirical: test whether an index actually helps before adding it. I’ll leave that for another day!

I wanted to export a MongoMapper document and its related documents as JSON, with embedded arrays for the collections. Invoking to_json did not seem to work perfectly, so I set out to discover what was going on.

Conclusion

If you use Embedded Documents for every associated document, the to_json method will work perfectly.

If you have normal Documents, you must override the as_json method to export the object “tree.”

Details

Here is a walk through of exporting mongo documents as JSON.

I created a simple Author class and will use a simple test to show how to_json works:

It turns out we can add a custom as_json implementation to the class that we want to export as JSON. The as_json method is responsible for indicating which fields and collections should be included in the JSON.
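Here is a rough sketch of that idea in plain Ruby. It assumes no MongoMapper or Rails is loaded, so to_json is wired to as_json by hand, and the Author and Book classes, their fields, and the sample values are made up for illustration:

```ruby
require 'json'

# Hypothetical stand-ins for two "normal" documents; not real MongoMapper classes.
class Book
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Only the fields we want exported.
  def as_json(*)
    { 'title' => title }
  end
end

class Author
  attr_reader :name, :books

  def initialize(name, books = [])
    @name = name
    @books = books
  end

  # Name the fields and collections to include, embedding the books as an array.
  def as_json(*)
    { 'name' => name, 'books' => books.map(&:as_json) }
  end

  # Rails does this wiring for you; plain Ruby needs it spelled out.
  def to_json(*args)
    as_json.to_json(*args)
  end
end

author = Author.new('Jane Doe', [Book.new('First'), Book.new('Second')])
puts author.to_json
```

The point is that the parent decides how deep the export goes: each associated object contributes its own as_json hash, and the parent nests those hashes under a key of its choosing.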

I developed apps with the fundamental architecture of the following (with dependencies only crossing one layer):

——
UI Layer
——
Business Object Layer
——
Data Mgt. Layer
——

Born from the one pattern that is king of the hill in my book: “Separation of Concerns.” The above was my architecture for all projects since the early 90s… C++, Java… but so far not so much in Rails.

I have thought about trying it, but not sure if it will pay off or not.

Essentially, it is about cleaving the rails model classes into two parts:

Business Methods, attributes, business rules

Persistence Methods, attributes, all knowledge of the DBMS details

In general, the UI deals with BOs, but sometimes we create dumb “Data Transfer Objects” that are lightweight versions of the business objects to be thrown about the system.
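To make the split concrete, here is a minimal sketch in plain Ruby. Everything in it is hypothetical (the Account business object, the AccountRepository, the AccountDTO, and the in-memory Hash standing in for the DBMS); it is an illustration of the layering, not code from a real app:

```ruby
# Business object: attributes and business rules, no knowledge of the database.
class Account
  attr_reader :name, :balance

  def initialize(name, balance)
    @name = name
    @balance = balance
  end

  # A business rule lives here, not in the persistence code.
  def overdrawn?
    balance < 0
  end
end

# Lightweight "dumb" data transfer object to hand to the UI.
AccountDTO = Struct.new(:name, :overdrawn)

# Data management layer: the only place that knows how records are stored.
# An in-memory Hash stands in for the DBMS in this sketch.
class AccountRepository
  def initialize
    @rows = {}
  end

  def save(account)
    @rows[account.name] = { 'name' => account.name, 'balance' => account.balance }
  end

  # Returns a business object, or nil when nothing is stored under that name.
  def find(name)
    row = @rows[name]
    row && Account.new(row['name'], row['balance'])
  end
end

repo = AccountRepository.new
repo.save(Account.new('acme', -25))
account = repo.find('acme')
dto = AccountDTO.new(account.name, account.overdrawn?)
puts dto.overdrawn
```

Note that the dependencies only cross one layer: the UI would see the DTO or the business object, and only the repository touches the stored rows.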

As a side note, in general, moving code to a more “object-oriented” state often ends up with the same lines of code. And often a bit more due to the boilerplate of creating additional classes.

In a current project, we have pulled out the business objects into a separate gem — but mostly because it needs to be used by our web app and by an eventmachine app.

The thing that shocked me the most about Rails, when Corey Haines introduced me to it in 2009, was that it was a lot like “Model Driven Architecture” that I had worked with for a few years. Given an architecture, a vertical slice thru the app, weave the model thru the architecture generator and out comes an application with a consistent architecture for the bulk of the app that is mostly the same (save for model/property names). Commercially, this MDA technology was a failure, last time I checked. Even though I thought it was the smartest way to develop apps, few others did. Except for Rails developers — largely because most rails devs probably have a very different mindset than other devs.

Though Bob pokes fun at Rails’ high-level directory structure as not revealing the business domain, I am totally fine with that. It’s a good thing. Yeah, sure, it reveals that it is an MVC-style app designed to deliver web apps, so what? No matter which architecture is used, I look for the domain classes to tell me what the system is doing…

In my handful of rails apps to date, I have only used MongoDB and MongoMapper — and this is the closest I have gotten to the good old days of when I used the POET Object-oriented Database with C++ back in the late 90s. It is the closest I have been to nirvana. I basically *almost* don’t need to care that there even is a database…

One of these days, I’ll compare and contrast a Rails/MongoMapper app with and without Business Objects separated from Data Management classes.

Distinct

Though I originally hard-coded the message types (as they do not change very often, and are meaningless without other code changes anyway), I figured I would test dynamically gathering the distinct types. MongoDB supports the distinct function. From the MongoDB console:

Though I saw distinct in MongoMapper, I had trouble getting it to work (this is an older app on <v2.0, method missing error).

However, a very powerful technique within MongoMapper worked just perfectly! Essentially, every collection in MongoMapper will return itself as a collection that MongoDB understands (in their db.collection.blah format), which helps when you need to execute MongoDB-style commands:

Map-Reduce Too Slow

In this instance, it turned out that Map-Reduce was significantly slower, and I am not exactly sure why. My best guess is that iterating over each document is more expensive than calling count with a filter on the event_type key (which is covered by an index).
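For anyone unsure how the two approaches differ, here is an in-memory sketch in plain Ruby. The sample documents are made up and nothing here talks to MongoDB, so it shows only the shape of each computation, not the speed difference. In the database, the filtered count can lean on the event_type index, while the map-reduce pass has to visit every document.

```ruby
# Made-up sample documents; in the real app these live in a MongoDB collection.
messages = [
  { 'event_type' => 'login' },
  { 'event_type' => 'logout' },
  { 'event_type' => 'login' },
  { 'event_type' => 'error' }
]

# Approach 1: gather the distinct types, then count with a filter per type.
types = messages.map { |m| m['event_type'] }.uniq
counts_by_filter = types.to_h do |type|
  [type, messages.count { |m| m['event_type'] == type }]
end

# Approach 2: map-reduce style, visiting every document once and emitting (type, 1),
# then summing the emitted values per type.
emitted = messages.map { |m| [m['event_type'], 1] }
counts_by_map_reduce = emitted
  .group_by(&:first)
  .transform_values { |pairs| pairs.sum { |_, n| n } }

puts counts_by_filter == counts_by_map_reduce
```

Both approaches arrive at the same counts; what differs in the database is how much work it takes to get there.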

As part of this (unintended) mini-series on MongoDB and indexing, I had written a little test to see if I could document performance gains through indexing. I used real-world data, albeit only 50,000 records, to query out a handful of documents (24 being the most).

Results:

Figure: The effect of adding indexes on query performance

The results are shown in the accompanying graph. Except for the query that returned 24 documents, the general trend was that 3 indexes were better than one. And one was w-a-a-a-y better than none (of course, you already knew that). The odd outlier was count = 6, where a single index did not perform as well as it did in all the other tests.
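As a loose analogy for why even one index beats none, here is a plain Ruby sketch that compares a full scan with a lookup through a prebuilt Hash. It is not the test described above: the 50,000 records and the account key are invented, the Hash only imitates what a database index does, and the timings will vary from machine to machine.

```ruby
require 'benchmark'

# 50,000 made-up records, matching the size of the test above but not its data.
records = Array.new(50_000) { |i| { 'id' => i, 'account' => "acct-#{i % 5_000}" } }

# No index: every query scans the whole collection.
scan = ->(account) { records.select { |r| r['account'] == account } }

# "Index": a Hash built once, mapping each key to its matching records.
index = records.group_by { |r| r['account'] }
lookup = ->(account) { index.fetch(account, []) }

scanned = looked_up = nil
scan_time = Benchmark.realtime { scanned = scan.call('acct-42') }
lookup_time = Benchmark.realtime { looked_up = lookup.call('acct-42') }

puts "scan:   #{scanned.size} records in #{scan_time} s"
puts "lookup: #{looked_up.size} records in #{lookup_time} s"
```

Both paths return the same records; the scan pays for touching every record on each query, while the index pays once up front when it is built.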

Fiddling a Bit More – Using the Profiler

You can drop into the mongo console and see more specifics using the mongo profiler.

> db.setProfilingLevel(1,15)
{ "was" : 1, "slowms" : 100, "ok" : 1 }

For this example, I cleared the indexes on accounts, ran the following query, and examined its profile data. Note: the timing can vary over successive runs, but it is generally fairly consistent, and it is close to the “millis” value you see in explain output.

Define Indexes in a Class Method, Invoke in Initializer

A small tweak to putting indexes into an initializer was to place the knowledge of the indexes back into the model classes themselves. Then, all you needed to do was invoke the model class method to create its own indexes.
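The pattern might look roughly like the sketch below. It runs without MongoMapper: the ensure_index method here is a plain-Ruby stand-in that only records the request, and the IndexRecorder module, the create_indexes method name, and the indexed keys are all assumptions made for illustration.

```ruby
# Stand-in for the indexing call a real model would make; it just remembers
# which keys were requested so the sketch can run on its own.
module IndexRecorder
  def ensure_index(key)
    recorded_indexes << key
  end

  def recorded_indexes
    @recorded_indexes ||= []
  end
end

class Account
  extend IndexRecorder

  # The model owns the knowledge of which keys it needs indexed.
  def self.create_indexes
    ensure_index(:email)
    ensure_index(:created_at)
  end
end

class Message
  extend IndexRecorder

  def self.create_indexes
    ensure_index(:event_type)
  end
end

# What the initializer would do: ask each model to index itself.
[Account, Message].each(&:create_indexes)

puts Account.recorded_indexes.inspect
puts Message.recorded_indexes.inspect
```

The initializer then stays a short list of model names, and adding an index means touching only the model it belongs to.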

Next I created a rake task (in lib/tasks/indexes.rake) that invoked the ruby code to do the indexing mojo.

namespace :db do
  namespace :mongo do
    desc "Create mongo_mapper indexes"
    task :index => :environment do
      CreateIndexes.all
    end
  end
end

Any tips/comments/insights appreciated…

PS: self.show_indexes Mix-in

I created a mix-in for the “show_indexes()” class method for each model. I could not add it directly to the MongoMapper::Document class unfortunately — I ran into errors and finally gave up. Here’s the mix-in that I defined in lib/mongo_utils.rb:

Mongo is schema-less, which means we can create new fields when needed. I read the MongoMapper documentation, and it still requires writing model code like below, so we have to change that code whenever we need a new field. Is this a limitation of using the schema-less database MongoDB?

Strict answer: NO, you do not have to change the model code to add a new “column” to your “table.”

I like to think of MongoMapper as making my domain classes behave more like an object-oriented database than a data management layer.

During rapid development, having a database that just follows along with your model makes for speedy feature creation and prototyping. Though “migrations” being first-class citizens, of sorts, in Rails is a great step forward for managing development changes with traditional schema databases, MongoDB doesn’t even require that level of concern.

That said, it also implies you, as the developer/designer/architect, are treated as an individual willing to take on the responsibility of wielding this amount of “power” (just like Ruby does).

Therefore, even though you do not technically have to add keys to your MongoMapper document class, if you are talking about core aspects of your domain, I would add the keys so as to make it clear what you are modeling.

Now, there may be certain cases where a class is just storing a bunch of random key-value pairs that are not key elements of your domain, with the sole purpose of merely showing them later. Maybe you are parsing data or taking a feed of lab results, for example, and just need to format the information without searching or sorting or much of anything. In this situation, you do not need to care about adding the (unknown) keys to the model class.
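The distinction between declared keys and "whatever else shows up" can be sketched in plain Ruby. This is not MongoMapper code: the LabResult class, its declared keys, and the sample values are hypothetical, and the Hash simply stands in for a schema-less document.

```ruby
# Hypothetical plain-Ruby document: declared keys describe the domain,
# anything else is simply stored alongside them.
class LabResult
  DECLARED_KEYS = %i[patient_id taken_on].freeze

  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes.transform_keys(&:to_sym)
  end

  # Declared keys get readers, which makes the model's intent visible.
  DECLARED_KEYS.each do |key|
    define_method(key) { @attributes[key] }
  end

  # Undeclared keys are still reachable, just without a named method.
  def [](key)
    @attributes[key.to_sym]
  end

  # Everything that arrived without being declared, e.g. for display later.
  def extras
    @attributes.reject { |key, _| DECLARED_KEYS.include?(key) }
  end
end

result = LabResult.new('patient_id' => 7, 'glucose' => 5.4, 'notes' => 'fasting')
puts result.patient_id
puts result.extras.keys.inspect
```

The declared keys document what the class is modeling, while the extras ride along untouched, which is the trade-off described above.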

As the saying goes, “just because you *can* do something does not mean you *should* do something.”

I figured I would add some pages to serve as my own reference docs for stuff I wrestle to the ground. I have been putting these sorts of things on internal project wikis for eons (even before wikis!). But that doesn’t do anybody any good.