Using Percona Toolkit pt-mongodb-query-digest

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we’ll look at how to use the pt-mongodb-query-digest tool in Percona Toolkit 3.0.

Percona’s pt-query-digest is one of our most popular Percona Toolkit MySQL tools. It is used on a daily basis by DBAs and developers to help identify the queries consuming the most resources. It helps in finding bottlenecks and optimizing database usage. The pt-mongodb-query-digest is a similar tool for MongoDB.

About the Profiler

Before we start, remember that the MongoDB database profiler is disabled by default and must be enabled. It can be enabled server-wide, but the full mode that logs all queries is not recommended in production unless you are using Percona Server for MongoDB 3.2 or higher. There, we added a feature (similar to the one in MySQL) that lets you set a sample rate for non-slow queries, which limits the overhead that full profiling causes.

Additionally, by default, the profiler collection is only 1MB per database. You may want to drop it and recreate it with a larger size, so that it holds enough data for the results to be useful.
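As a rough sketch of that drop-and-recreate step, the function below prints the mongo shell statements that disable profiling, drop `system.profile`, recreate it as a larger capped collection and turn profiling back on. The function name `resize_profiler_js`, the 100MB size and the profiling level are illustrative assumptions, not values from this post. The statements themselves are the standard ones from the MongoDB profiler documentation.

```shell
#!/bin/sh
# Print the mongo shell statements that recreate system.profile with a larger cap.
# $1 = new size in bytes, $2 = profiling level to restore (0, 1 or 2).
# Both values used below are assumptions for illustration only.
resize_profiler_js() {
  size_bytes="$1"
  level="$2"
  cat <<EOF
db.setProfilingLevel(0);
db.system.profile.drop();
db.createCollection("system.profile", { capped: true, size: ${size_bytes} });
db.setProfilingLevel(${level});
EOF
}

# Print the statements for review (100MB cap, slow operations only).
# Once they look right, pipe them into a mongod instance, for example:
#   resize_profiler_js 104857600 1 | mongo localhost:17001/samples
resize_profiler_js 104857600 1
```

Profiling has to be off while the collection is dropped, which is why the first statement sets the level to 0.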

According to the documentation, to check if the profiler is enabled for the samples database, run:

`echo "db.getProfilingStatus();" | mongo localhost:17001/samples`

Remember, you need to connect to a MongoDB instance, not a mongos. The output will be something like this:

MongoDB shell version: 3.2.12

connecting to: localhost:17001/samples

{ "was" : 0, "slowms" : 100 }

bye

The value for the field “was” is 0, which means profiling is disabled. Let’s enable the profiler for the samples database.
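A minimal sketch of enabling it, in the same style as the status check above: the script prints one command per mongod instance, each calling `db.setProfilingLevel()`. The second instance (`localhost:17002`), the function name `profiler_commands` and the chosen level are assumptions for illustration. Level 1 records only operations slower than `slowms`, while level 2 records everything.

```shell
#!/bin/sh
# Hypothetical list of mongod instances (not mongos); replace with your own.
INSTANCES="localhost:17001 localhost:17002"
DATABASE="samples"
LEVEL=1   # 0 = off, 1 = slow operations only, 2 = all operations

# Print one "enable profiler" command per instance without running anything.
profiler_commands() {
  for instance in $INSTANCES; do
    printf 'echo "db.setProfilingLevel(%s);" | mongo %s/%s\n' \
      "$LEVEL" "$instance" "$DATABASE"
  done
}

# Review the generated commands first; append "| sh" to actually run them.
profiler_commands
```

Printing the commands instead of running them makes it easy to confirm that every instance on the list is a mongod before anything changes.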

You must enable the profiler on all MongoDB instances that could be related to a shard of our database. To check on which instances we should enable the profiler, I am going to use the pt-mongodb-summary tool. It shows us the information we need about our cluster.

From the pt-mongodb-query-digest output, we can see that this query was seen 97 times. The output also provides statistics for the number of documents scanned and retrieved by the server, the execution time and the size of the results. The tool also provides the operation type, the fingerprint and a query example to help identify the source.

By default, the results are sorted by query count. This can be changed by setting the --order-by parameter to: count, ratio, query-time, docs-scanned or docs-returned.

A “-” in front of the field name denotes the reverse order. Example:

--order-by=-ratio

When considering what ordering to use, you need to know whether you are looking for the most common queries (-count), the most cache-abusive queries (-docs-scanned), or the worst ratio of scanned to returned documents (-ratio). You may be tempted to use -query-time, but you will find that it almost always surfaces queries that are affected by issues rather than the queries causing them.
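To make that choice concrete, here is a small sketch that maps each goal to the --order-by value named above and prints the resulting command line. The helper `order_by_for` and its goal names are hypothetical. Passing the target as `localhost:17001/samples` is also an assumption borrowed from the mongo commands earlier in this post, so check `pt-mongodb-query-digest --help` for the exact form your version expects.

```shell
#!/bin/sh
# Map an investigation goal to the --order-by value described in the text.
# The goal names (common, cache, ratio) are hypothetical labels.
order_by_for() {
  case "$1" in
    common) printf '%s\n' "-count" ;;         # most common queries
    cache)  printf '%s\n' "-docs-scanned" ;;  # most cache-abusive queries
    ratio)  printf '%s\n' "-ratio" ;;         # worst scanned-to-returned ratio
    *) echo "unknown goal: $1" >&2; return 1 ;;
  esac
}

# Print the command instead of running it, so it can be reviewed first.
# The connection target is an assumption; adjust it to your instance.
echo "pt-mongodb-query-digest --order-by=$(order_by_for ratio) localhost:17001/samples"
```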

Conclusion

This is a new tool in the Percona Toolkit. We hope in the future we can make it grow like its big brother for MySQL (pt-query-digest). This tool helps DBAs and developers identify and solve bottlenecks, and keep servers running at top performance.

Carlos, a Web back-end Go developer, has been writing computer programs since 1984. Prior to joining Percona, he was a developer at Onapsis Inc., Edrans/Zappos.com, Tupperware. Carlos lives in Rosario with his wife and her two daughters.

One Comment

Interesting – especially for those who prefer to use a shell. Back in 2013 I wrote a web app in order to collect, store and to visually analyze slow operations. It’s open sourced so you can try it out:
https://github.com/idealo/mongodb-slow-operations-profiler

I’ve just released an important update (v1.0.3) which has a lot of new features compared to the first version which I presented to the Berlin mongodb user group back in 2013. I’m eager to know your feedback.