
I have a large table which contains sensor data, along with the fields:

sensor_id
timestamp

I query the table using these fields almost exclusively. There are multiple sensors, and different sensors might have the same timestamp for a given set of data, so neither single-column index can be unique.

I created three indexes: one for each of these columns (not unique), and a compound one for both (unique).

The table is constantly being written to, and queries to read data seem increasingly slow.

My question is, is the compound index unnecessary? Would it be faster to have only the two separate indexes? (Or remove those and keep only the compound index?) No other columns are used for filtering query data.

2 Answers

It might be worth posting the table definition from your other question for clarity.

The composite index is doing a few things for you:

As you know, enforcing uniqueness on (sensor_id, timestamp); I'm unsure whether this is an important data integrity constraint.

Allowing queries that filter on both columns to look up matching rows by using a single index. MySQL can answer some queries (equality conditions on multiple columns are the ones I know about) by merging two indexes, but this tends to be significantly slower compared to using a single composite index.

The index can also serve searches on a leftmost prefix of its columns, but not on a trailing column alone. So in this case it could help a query that filters on sensor_id alone, or on both sensor_id and timestamp, but not one that filters on timestamp alone.
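
To make the leftmost-prefix point concrete, here is a minimal sketch. The table name sensor_data, the reading column, the column types, and the index names uq_sensor_ts, idx_sensor, and idx_ts are all assumptions made for illustration; the asker's actual definition was not posted.

```sql
-- Hypothetical table; names and types are assumptions, not the asker's schema.
CREATE TABLE sensor_data (
  sensor_id   INT UNSIGNED NOT NULL,
  `timestamp` DATETIME     NOT NULL,
  reading     DOUBLE       NOT NULL,
  UNIQUE KEY uq_sensor_ts (sensor_id, `timestamp`),
  KEY idx_sensor (sensor_id),
  KEY idx_ts (`timestamp`)
) ENGINE=InnoDB;

-- Can use uq_sensor_ts: sensor_id is the leftmost column of the composite key.
SELECT * FROM sensor_data
WHERE sensor_id = 42;

-- Can use uq_sensor_ts: both key columns are constrained.
SELECT * FROM sensor_data
WHERE sensor_id = 42
  AND `timestamp` >= '2013-01-01' AND `timestamp` < '2013-02-01';

-- timestamp alone is not a leftmost prefix of (sensor_id, timestamp),
-- so an ordinary range lookup here relies on idx_ts instead.
SELECT * FROM sensor_data
WHERE `timestamp` >= '2013-01-01' AND `timestamp` < '2013-02-01';
```

Which index the optimizer actually picks depends on the data distribution and server version, so the comments describe what each index is able to support rather than a guaranteed plan.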

There are a number of caveats to this, so it's a good idea to look at the EXPLAIN output for your queries and verify which indexes they're using. Also keep in mind that indexes can support the read part of UPDATE and DELETE queries, as well as JOINs, GROUP BY, ORDER BY, and other operations I'm neglecting.
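
As a rough illustration of that check, assuming a hypothetical table sensor_data with columns sensor_id and timestamp (the table name is an assumption, not taken from the question):

```sql
-- Ask the optimizer how it would run a typical read query.
EXPLAIN
SELECT * FROM sensor_data
WHERE sensor_id = 42
  AND `timestamp` >= '2013-01-01' AND `timestamp` < '2013-02-01';
-- In the output, possible_keys lists the candidate indexes,
-- key is the index actually chosen, and rows is the estimated
-- number of rows examined.

-- EXPLAIN also accepts DELETE and UPDATE (MySQL 5.6.3 and later),
-- which shows the index used to locate the rows being changed.
EXPLAIN
DELETE FROM sensor_data
WHERE `timestamp` < '2012-01-01';
```

A key value of NULL together with a large rows estimate usually means the statement is scanning the whole table.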

An example of a scenario where the composite index is unnecessary would be if you don't care about the uniqueness constraint and all your queries filter on timestamp or sensor_id, but not both.

The single-column index on sensor_id is actually redundant, since the composite index on (sensor_id, timestamp) can be used by the same queries. You might still find that some queries perform faster when scanning the single-column index than when using the composite index with its wider key. The difference might not be enough to matter, though, and some testing will probably be required to find out.
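
One way to run that kind of test is to pin the same query to each index in turn with an index hint. This is only a sketch: the table name sensor_data and the index names idx_sensor (on sensor_id) and uq_sensor_ts (on sensor_id, timestamp) are assumed for illustration.

```sql
-- Same query, forced onto each candidate index; run each several times
-- and compare the timings once caches are warm.
SELECT COUNT(*) FROM sensor_data FORCE INDEX (idx_sensor)
WHERE sensor_id = 42;

SELECT COUNT(*) FROM sensor_data FORCE INDEX (uq_sensor_ts)
WHERE sensor_id = 42;

-- If the single-column index shows no meaningful advantage,
-- dropping it also saves work on every insert:
-- ALTER TABLE sensor_data DROP INDEX idx_sensor;
```

The DROP is left commented out on purpose; on a table that is constantly written to, it is worth measuring before removing anything.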

In addition to looking at the EXPLAIN output for your queries, you can assess which indexes are actually being used with tools such as pt-index-usage from Percona Toolkit or, if you're running Percona Server or MariaDB, the INFORMATION_SCHEMA.INDEX_STATISTICS table.
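
A sketch of the INDEX_STATISTICS approach follows. It applies only to Percona Server and MariaDB with user statistics enabled; the schema name sensors_db and table name sensor_data are placeholders.

```sql
-- Statistics are only collected while userstat is ON.
SET GLOBAL userstat = ON;

-- After a representative workload has run, list every index on the
-- table with the number of rows read through it. Indexes that never
-- appear in INDEX_STATISTICS show up with rows_read = 0.
SELECT DISTINCT s.INDEX_NAME,
       COALESCE(i.ROWS_READ, 0) AS rows_read
FROM INFORMATION_SCHEMA.STATISTICS AS s
LEFT JOIN INFORMATION_SCHEMA.INDEX_STATISTICS AS i
       ON i.TABLE_SCHEMA = s.TABLE_SCHEMA
      AND i.TABLE_NAME   = s.TABLE_NAME
      AND i.INDEX_NAME   = s.INDEX_NAME
WHERE s.TABLE_SCHEMA = 'sensors_db'
  AND s.TABLE_NAME   = 'sensor_data'
ORDER BY rows_read;
```

The counters reset when the server restarts, so an index with zero reads is only a candidate for removal if the sampling window covered all of the normal query patterns.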

redundant_keys - as part of the common_schema tool; an SQL oriented solution (disclosure: I'm the author of this tool)
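
Assuming common_schema is installed on the server, its redundant_keys view can be queried like any other view. The schema name sensors_db below is a placeholder, and the exact set of columns returned may vary between common_schema versions.

```sql
-- List indexes that common_schema considers redundant, along with
-- the index that makes each one redundant.
SELECT *
FROM common_schema.redundant_keys
WHERE table_schema = 'sensors_db';
```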

It might also be worth noting that sometimes, even while "mathematically speaking" one index is redundant, it might still be beneficial in practice because it has a smaller footprint on disk and is thus quicker to scan. This applies to very large tables (exactly what "large" is depends on your resources).
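
To get a feel for the footprint difference, the approximate size of each index on an InnoDB table can be read from the persistent statistics tables (available in MySQL 5.6 and later). The schema and table names here are placeholders.

```sql
-- stat_name = 'size' holds the number of pages in each index;
-- multiplying by the page size gives an approximate size in MB.
SELECT index_name,
       ROUND(stat_value * @@innodb_page_size / 1024 / 1024, 1) AS size_mb
FROM mysql.innodb_index_stats
WHERE database_name = 'sensors_db'
  AND table_name    = 'sensor_data'
  AND stat_name     = 'size';
```

These figures are estimates that are refreshed when statistics are recalculated (for example by ANALYZE TABLE), so they are best used for comparing indexes against each other rather than as exact sizes.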