If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).

MySQL cannot use an index if the columns do not form a leftmost prefix of the index. Suppose that you have the SELECT statements shown here:

SELECT * FROM tbl_name WHERE col1=val1;
SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

SELECT * FROM tbl_name WHERE col2=val2;
SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;

If an index exists on (col1, col2, col3), only the first two queries use the index. The third and fourth queries do involve indexed columns, but (col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).
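
The manual's rule can be checked with EXPLAIN. The following is a minimal sketch, meant for an empty scratch schema. The INT column types, the extra payload column, and the index name idx_c1_c2_c3 are assumptions made for illustration; only tbl_name and col1 to col3 come from the text above. The exact EXPLAIN output depends on the MySQL version and on the data.

```sql
-- Hypothetical table definition; column types and payload are assumed.
-- The payload column keeps the index from covering SELECT *.
CREATE TABLE tbl_name (
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  payload VARCHAR(100)
);

-- Three-column index (the name is an assumption).
CREATE INDEX idx_c1_c2_c3 ON tbl_name (col1, col2, col3);

-- (col1, col2) is a leftmost prefix, so the index is a candidate:
-- it should be listed under possible_keys.
EXPLAIN SELECT * FROM tbl_name WHERE col1 = 1 AND col2 = 2;

-- (col2) alone is not a leftmost prefix, so the index cannot be
-- used to look up rows: possible_keys is typically NULL here.
EXPLAIN SELECT * FROM tbl_name WHERE col2 = 2;
```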

Now, my question is how I should define the indexes if I want to execute the following queries efficiently:

SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
SELECT * FROM tbl_name WHERE col1=val1 AND col3=val3;
SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2 AND col3=val3;

An index on (col1, col2, col3) doesn't fully work with the 2nd query, since only the (col1) prefix can be used. If I instead defined indexes on (col1, col2) and (col1, col3), would that help the 3rd query?

r937
—
2010-01-26T18:30:52Z —
#2

rblon said:

Now, my question is how I should define the indexes if I want to execute the following queries efficiently:

declare an index on (col1,col2,col3), which will be used by the first and third queries

then declare an index on (col1,col3) which will be used by the second

rblon said:

If instead I would define an index on (col1, col2) and (col1, col3) would that help the 3rd query?

nope, mysql will use only one index per table
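
The two indexes suggested above could be declared roughly as follows. This is a sketch for an empty scratch schema: the column types, the payload column, and the index names idx_q1_q3 and idx_q2 are assumptions, not part of the original question.

```sql
-- Hypothetical table definition; column types and payload are assumed.
CREATE TABLE tbl_name (
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  payload VARCHAR(100)
);

-- Intended for the first and third queries:
-- equality on col1 and col2, optionally also col3.
CREATE INDEX idx_q1_q3 ON tbl_name (col1, col2, col3);

-- Intended for the second query: equality on col1 and col3.
CREATE INDEX idx_q2 ON tbl_name (col1, col3);

-- The key column of the EXPLAIN output shows which index was chosen.
EXPLAIN SELECT * FROM tbl_name WHERE col1 = 1 AND col3 = 3;
```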

rblon
—
2010-01-26T18:57:57Z —
#3

ok, but what if the situation is somewhat more complicated? E.g., there are also col4 and col5, which are sometimes used in the SELECT statement.

The only thing the queries have in common is that they always start with

SELECT * FROM tbl_name WHERE col1=val1 AND ...

In the "single column case" (let's imagine col1 doesn't exist) you would just put indexes on col2, col3, col4, and col5, respectively. However, in the multiple column case it seems you cannot extend this reasoning by putting indexes on (col1, col2), (col1, col3), (col1, col4), (col1, col5). But creating an index for every combination doesn't seem like a good idea either.

r937
—
2010-01-26T19:14:10Z —
#4

i believe you understand the situation, yes

rblon
—
2010-01-26T19:38:32Z —
#5

not yet actually.

r937 said:

nope, mysql will use only one index per table

I didn't know that. But then it seems that what I called the "single column case" can be extended.

Because in the "single column case" [-> col1 doesn't exist, index on (col2), (col3), (col4), (col5)], the following statement only uses one index:

SELECT * FROM tbl_name WHERE col2=val2 AND col5=val5

which is the same in the "multiple column case" [-> index on (col1, col2), (col1, col3), (col1, col4), (col1, col5)]

SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2 AND col5=val5
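
One way to test the "multiple column case" is to create the four two-column indexes and look at the plan. This is only a sketch for an empty scratch schema; the column types, the payload column, and the index names are assumptions. In the EXPLAIN output, the key column lists the index or indexes actually used, and a type of index_merge would indicate that more than one index was combined.

```sql
-- Hypothetical table definition; column types and payload are assumed.
CREATE TABLE tbl_name (
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  col4 INT NOT NULL,
  col5 INT NOT NULL,
  payload VARCHAR(100)
);

-- One two-column index per optional column (names are assumptions).
ALTER TABLE tbl_name
  ADD INDEX idx_c1_c2 (col1, col2),
  ADD INDEX idx_c1_c3 (col1, col3),
  ADD INDEX idx_c1_c4 (col1, col4),
  ADD INDEX idx_c1_c5 (col1, col5);

-- Both idx_c1_c2 and idx_c1_c5 match a leftmost prefix of this query.
EXPLAIN SELECT * FROM tbl_name WHERE col1 = 1 AND col2 = 2 AND col5 = 5;
```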

SpacePhoenix
—
2010-01-27T06:47:16Z —
#6

rblon, do you need all the fields of the table in the result set? If not, specify the fields in the SELECT clause. It will save resources (memory) as you won't be grabbing data that is not needed.

For example, say each field is approx 1MB (per record) and you have 20 fields in the table. By doing a SELECT * you'll be grabbing approx 20MB (1MB x 20 fields) per record which matches. But say you only needed 4 fields: by specifying them in the SELECT clause instead of using SELECT * you'll only be grabbing approx 4MB (1MB x 4 fields) per record which matches.

It might also be more efficient to specify all the fields wanted, even if you want them all, which is something that I'll have to test at some point.

rblon
—
2010-01-27T08:37:29Z —
#7

SpacePhoenix, I am aware of that. I was just extending an example from the MySQL manual about indexing.

SpacePhoenix said:

It might also be more efficient to specify all the fields wanted, even if you want them all, which is something that I'll have to test at some point.

Personally, in that case I would also specify the field names. It makes the program more readable.

SELECT rowid FROM single WHERE col2=71 AND col3=5;
SELECT rowid FROM multiple WHERE col1=1 AND col2=71 AND col3=5;

I attach a screenshot of the EXPLAIN for both statements.

I see that the first query uses only one key. This is col2, which makes sense as that returns the smallest number of rows. However, the EXPLAIN output says 1252 rows, while "SELECT rowid FROM single WHERE col2=71" only returns 1009 rows. Also, I thought some kind of index merging would happen for the second condition (col3=5), but that doesn't seem to be the case. So I guess MySQL is going through all 1252 (or 1009) rows.

The second query does use index merging, but I am not sure how to further interpret the EXPLAIN output. More specifically, how should I read "possible_keys: col1,col1_2"?

Any help would be much appreciated.

r937
—
2010-01-27T17:01:32Z —
#11

you're definitely on the right track to becoming a mysql internals expert, a path i myself have chosen not to go down, so i don't think i can help you further

all i know is that "possible keys" means the indexes that were available to choose from
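
To see what the names under possible_keys refer to, the indexes on the table can be listed. The sketch below assumes the table multiple from the queries above already exists. The names col1 and col1_2 would be consistent with indexes created without an explicit name, since MySQL then names an index after its first column and appends a suffix such as _2 to keep the name unique; that the indexes were created this way is an assumption.

```sql
-- List every index on the table. Rows sharing a Key_name belong to
-- the same index, and Seq_in_index gives the column order within it.
SHOW INDEX FROM multiple;

-- Compare the index names with possible_keys and key in the plan.
EXPLAIN SELECT rowid FROM multiple WHERE col1=1 AND col2=71 AND col3=5;
```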