Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

Let's say I have a table with fields A and B. I make regular queries on A+B, so I created a composite index on (A,B). Would queries on only A also be fully optimized by the composite index?

Additionally, I created an index on A, but Postgres still uses the composite index for queries on only A. If the previous answer is positive, I guess it doesn't really matter, but why does it select the composite index by default, if the single A index is available?

I tried to set up a small test for this. In my case, however, the two-column index was used only when I dropped the single-column one, unrelated to which one was created first. It is interesting that if I created the two-column index first, the initial plan used a bitmap heap scan. If I created the one-column index, then run the query (used index scan) and dropped the newly created index, the plan involving the two-column index switched to index scan. See the steps on SQLFiddle
–
dezsoOct 23 '12 at 19:38

@dezso Interesting. Where are the costs for each query?
–
LucianoOct 23 '12 at 20:05

2 Answers
2

It also depends on the data types of the indexed columns. An index on, say, two integer columns is exactly as big as an index on one. Space is typically allocated in chunks of 8 bytes. For one integer (4 bytes) that's another 4 bytes of padding.

The keywords here are data alignment and MAXALIGN: usually 4 Bytes on a 32-bit OS or 8 Bytes on a 64-bit OS. If you are not sure, check out pg_controldata.

In such a case there is really no downside for the planner to use an index on (a,b) - compared to an index on just (a). And it is generally preferable for multiple queries to use the same index. The chance for it to reside in the cache already (or at least parts of it) grows when shared.

If you already maintain an index on (a,b), then it doesn't make sense to create another index on just (a) - unless it is substantially smaller. The same is not true for (b,a) vs. (a). Follow the link in the first line for more on that.

On the other hand, there is a potential downside to including an additional column in the index, even if that only utilizes space otherwise wasted for padding. If the additional column is updated, the index needs an update, too. Also prevents potential HOT (Heap Only Tuple) updates on the table when this column is involved. More on HOT updates in this related answer on SO.

This caching effect was my target in the small test described above - what I got I think is 'negative'.
–
dezsoOct 24 '12 at 6:22

@dezso: Did you maybe forget to ANALYZE the newly created table? I can't make sense of the results you describe in the above comment otherwise.
–
Erwin BrandstetterOct 24 '12 at 19:37

Say I already had an index on column a. If I now realize that I need an index on (a, b), should I drop the single-column index on a?
–
ZoltánJun 19 '13 at 14:58

@Zoltán: It depends. For very small columns (like two integer) the index on (a,b) would make an additional index on (a) pointless. For bigger columns it may be different. Start a question if you need more details.
–
Erwin BrandstetterJun 19 '13 at 16:58

@dezso It will use index only in the case if there are one or two predicates [A] Or [A],[B]. It will not use it in the order [B],[A] or [A],[C]. To be able to utilize index with additional column [C], you need to enforce index by ordering predicates as [A],[B] and [C].
–
FarfarakOct 24 '12 at 9:52

Did I miss something or your answer really does not address the question ("why does it select the composite index by default, if the single A index is available?") ?
–
dezsoOct 24 '12 at 9:53

Of course. But this still does not explain why the single column index was not used even when only that column was searched.
–
dezsoOct 24 '12 at 9:55

2

What exactly do you achieve with B=B? I think you achieve nothing so I'm voting down absent any evidence this isn't just ignored by the optimizer
–
Jack Douglas♦Oct 24 '12 at 9:57

B=B is effectively the same as B IS NOT NULL, which seems uncalled for. Certainly not needed to use an index on (a,b).
–
Erwin BrandstetterMay 27 '14 at 15:05