Apache Cassandra NoSQL Performance Management

Apache Cassandra is a leading NoSQL database platform for online applications. By offering continuous availability, high scalability and performance, strong security, and operational simplicity, all while lowering overall cost of ownership, Cassandra has become a proven choice for both technical and business stakeholders. Compared to other database platforms such as HBase, MongoDB, Redis, and MySQL, Cassandra delivers higher performance under heavy workloads.

Monitoring, troubleshooting, and tuning databases are top priorities for you as a DBA. This section details how you can carry out your performance management tasks on a NoSQL database like Cassandra.

Monitoring Basics

A number of command-line utilities let you check the status of your database clusters and view general metrics for the network, objects, and I/O operations at both a high level and a low level (e.g. per table). For example, the Cassandra nodetool utility lets you quickly determine the up/down status and current data distribution of a cluster:

Checking a cluster’s status with the nodetool utility.
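The same check can be scripted. The following is a minimal Python sketch, not an official tool: it assumes nodetool is on the PATH and that each node row in the `nodetool status` output begins with a two-letter code (U or D for up/down, then N, L, J, or M for the node state) followed by the node address. The function names are hypothetical and chosen only for illustration.

```python
import re
import subprocess

# Node rows in `nodetool status` output start with a two-letter code:
# U/D (up or down) followed by N/L/J/M (normal, leaving, joining, moving).
NODE_ROW = re.compile(r"^([UD])([NLJM])\s+(\S+)", re.MULTILINE)


def run_nodetool_status():
    """Run `nodetool status` and return its text output.

    Assumes nodetool is installed and on the PATH of the current user.
    """
    result = subprocess.run(
        ["nodetool", "status"], capture_output=True, text=True, check=True
    )
    return result.stdout


def summarize_status(text):
    """Group node addresses by up/down status (hypothetical helper)."""
    summary = {"up": [], "down": []}
    for status, _state, address in NODE_ROW.findall(text):
        summary["up" if status == "U" else "down"].append(address)
    return summary

# Typical use on a machine with access to the cluster:
#   summary = summarize_status(run_nodetool_status())
#   print(len(summary["up"]), "up,", len(summary["down"]), "down")
```

Keeping the parsing separate from the subprocess call makes it possible to test the summary logic against saved output without a live cluster.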

Advanced Command Line Performance Monitoring Tools

From a performance metrics standpoint, Cassandra delivers many different statistics that can be accessed in various ways. If you are coming from an RDBMS like Oracle or Microsoft SQL Server and are used to performance data dictionaries like Oracle’s V$ views or SQL Server’s dynamic management views, the most familiar interface for you is the one supplied by DataStax Enterprise’s Performance Service.

The Performance Service collects, organizes, and maintains an in-depth diagnostic data dictionary for each cluster. It consists of various tables that can be accessed via any CQL utility (e.g. the CQL shell utility or DataStax DevCenter) and gives you both high-level and detailed performance views of how well a cluster is running.

The Performance Service maintains the following levels of performance information:

Statement level – captures queries that exceed a certain response time threshold along with all their relevant metrics.

You can configure the service to collect no, all, or only selected performance metrics for each of these levels. Once the service has been configured and is running, the statistics are populated in their associated tables and stored in a special keyspace (dse_perf). You can then query the various performance tables to get statistics such as the I/O metrics for certain objects.


Visual Database Monitoring

In addition to monitoring your database clusters from the command line, you can also visually check the health of all the clusters you manage (just as you probably do with your chosen RDBMS performance monitors) by using DataStax OpsCenter. OpsCenter gives you global, at-a-glance dashboards that help you understand how all clusters under your control are doing, as well as drill-down capabilities into each cluster and its individual nodes.

A global dashboard helps you understand how well all clusters are running and if there are any alerts or issues for one or more clusters that need your attention:

Checking OpsCenter’s global cluster dashboard.

From the global dashboard, you can drill down into each individual cluster and create customized monitoring dashboards for the performance metrics you care about the most:

Examining performance metrics for a single database cluster.

You can also create proactive alerts that notify you far in advance of a problem actually occurring in one of your clusters:

Creating an alert in OpsCenter.

In addition, you can use built-in expert services like the Best Practice service, which scans your clusters and provides expert advice on how to configure and tune them for better uptime and performance.

Finding and Troubleshooting Problem Queries

As a DBA, you’re sometimes called upon to locate a database’s worst-running queries, the ones that slow the performance of the system as a whole. You’ll find this isn’t hard to do with Cassandra.

First, you can use the DataStax Enterprise Performance Service to automatically capture long-running queries (based on response time thresholds you specify) and then query a performance table that holds those statements.

In addition, there is a background query tracing utility that you can use on an ad-hoc basis. You can choose to trace all statements coming into a database cluster or only a percentage of them, and then look at the results. The trace information is stored in the system_traces keyspace, which holds two tables, sessions and events. These can be easily queried to answer questions such as which query has been the most time-consuming since a trace was started.
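As an illustration of that workflow, here is a hedged Python sketch. It assumes the sampling rate is set with `nodetool settraceprobability` (a value between 0 and 1) and that the sessions table exposes the session_id, request, started_at, and duration columns, with duration in microseconds. Because duration is not a clustering column, CQL cannot order by it, so the sketch ranks the rows on the client side. The helper names and the dict-shaped rows are assumptions made for the example; how you fetch the rows (cqlsh, a driver) is left open.

```python
def trace_probability_command(probability):
    """Build the nodetool argument list that sets the trace sampling rate.

    1.0 traces every request; 0.01 traces roughly one request in a hundred.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return ["nodetool", "settraceprobability", str(probability)]


# CQL that reads the traced sessions; sorting happens in Python afterwards.
SESSIONS_QUERY = (
    "SELECT session_id, request, started_at, duration "
    "FROM system_traces.sessions"
)


def slowest_sessions(rows, limit=5):
    """Return the `limit` sessions with the largest duration.

    `rows` is an iterable of dicts keyed by column name. Rows without a
    duration (assumed here to be sessions still in progress) are skipped.
    """
    finished = [row for row in rows if row.get("duration") is not None]
    return sorted(finished, key=lambda row: row["duration"], reverse=True)[:limit]
```

Tracing adds overhead to every sampled request, so a small probability is usually the safer starting point on a busy cluster, with the value set back to 0 once enough sessions have been collected.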

You can also use the tracing utility in much the same way you would use EXPLAIN PLAN on an RDBMS query. For example, to understand how a Cassandra cluster satisfies a single CQL INSERT statement, you would enable tracing from the CQL command shell (TRACING ON), issue your query, and review the diagnostic information provided.
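When a trace is long, it can help to see which steps took the most time. The sketch below is one possible way to do that, not a built-in feature: it assumes you have already read the rows for one session from the system_traces events table into dicts with the source, activity, and source_elapsed columns, where source_elapsed is the number of microseconds elapsed on that node when the event was recorded. The function name is hypothetical.

```python
from collections import defaultdict


def step_durations(events):
    """Return (delta_us, source, activity) tuples, largest delta first.

    For each node (`source`), events are ordered by `source_elapsed` and the
    gap to the previous event on the same node is reported as the time that
    passed before the activity was recorded.
    """
    by_source = defaultdict(list)
    for event in events:
        if event["source_elapsed"] is not None:
            by_source[event["source"]].append(event)

    steps = []
    for source, node_events in by_source.items():
        node_events.sort(key=lambda e: e["source_elapsed"])
        previous = 0
        for event in node_events:
            delta = event["source_elapsed"] - previous
            steps.append((delta, source, event["activity"]))
            previous = event["source_elapsed"]

    steps.sort(key=lambda step: step[0], reverse=True)
    return steps
```

The deltas are computed per node because each node reports its own elapsed time; comparing raw values across nodes would mix independent clocks.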

With Cassandra’s tracing capabilities, OpsCenter’s visual monitoring, DataStax Enterprise’s Performance Service, and the general command-line monitoring tools, you have most, if not all, of the performance tools with Cassandra that you rely on today with your favorite RDBMS.