November 22, 2010

Index Join

One of the less well known access paths available to the optimizer is the “index join” also known as the “index hash join” path. It’s an access path that can be used when the optimizer decides that it doesn’t need to visit a table to supply the select list because there are indexes on the table that, between them, hold all the required columns. A simple example might look something like the following:

Note that the columns in the where clause appear in (some) indexes, and the column(s) in the select list exist in (at least) some indexes. Under these circumstances the optimizer can produce the following plan (the test script was one I wrote for 8i – but this plan comes from an 11.1 instance):

We do a fast full scan of the two indexes extracting the rowid and id from index ij_v1 and just the rowid from index ij_v2. We can then get the result we want by doing a hash join between these two result sets on the rowid values because any time the two rowsources have a rowid in common, it’s a rowid for a row where val1 is between 100 and 200, and val2 is between 50 and 150 and the first rowsource is carrying the id – which is the thing we need to report.

There are a couple of little observations that we can make about this example.

First, although I’ve only used two indexes in this example Oracle is not limited to just two indexes. The number of indexes that could be used is effectively unlimited.

Second, the index_join path is strictly limited to cases where the optimizer can see that every column in the query can be found in indexes on the table.

Third, although my example uses index fast full scans that’s not a necessary feature of the plan. Just like any other hash join, Oracle could use an index range (or full) scan to get some of the data.

Finally, there are clearly a couple of bugs in the code.

Bugs:

If you check the rows/bytes columns in the plan you’ll see that the predicted number of rows selected is the same for both indexes (lines 3 and 4) – but we extract the rowid and the id from the first index (projection detail for line 3), so the total data volume expected from line 3 is slightly larger than the total data volume from line 4 where we extract only the rowid; theoretically, therefore, the optimizer has used the tables (indexes) in the wrong order – the one supplying the smaller volume of data should have been used as the first (build) rowsource.

More significantly, though, a quick check of the code that generates the data tells you that each index will supply 101 rows to the hash join – and you can even show that for other query execution plans the optimizer will calculate this cardinality (nearly) correctly. In the case of the index join the optimizer seems to have lost the correct individual cardinalities and has decided to use the size of the final result set as the cardinality of the two driving index scans.

There’s more, of course – one of the strangest things about the index join is that if your select list includes the table’s rowid, the optimizer doesn’t consider that to be a column in the index. So even though the predicate section of the plan shows the rowids being projected in the hash join, Oracle won’t use an index join for a query returning the rowid !

Footnote: The reason I’ve written this brief introduction to the index join is because an interesting question came up at the first E2SN virtual conference.

“If you hint an index hash join, is there any way of telling Oracle the order in which it should use the indexes?”

The answer is no – but there are ways of creating code that will do what you want, and that will be the topic of my next blog.

Updated Feb 2012

I’m wrong about controlling the order of the index hash join – it suddenly occurred to me that the full specification for the hint includes a list of the indexes you want to join. Having run a few experiments using the full specification for a two-index join, I think I’ve convinced myself that the order you list the indexes is taken as the join order, with no swap. Unfortunately there are various bugs that hide this fact. Furthermore, I haven’t yet investigated more complex examples (viz: examples which hash more indexes) to see whether Oracle also allows for swapping join inputs when there are more than two indexes.

There are no histograms on the columns. It’s a question worth asking because it might (in theory) make some sort of difference. In fact all three columns are uniformly distributed (rownum) with no gaps in the range, so Oracle’s analysis of the data in its default “gather stats” calls shouldn’t produce histograms.

Another point about stats, though, is that I’ve disabled CPU costing (system statistics) to make the results as consistent as possible across different versions of Oracle – and you might wonder whether CPU costing would be enough to switch the order of the join.

It might appear this evening while I’m waiting in Heathrow airport for a flight to Geneva, or it might appear on Friday evening while I’m waiting in Geneva airport for a flight home – but it will be here soon. (And then there are two more articles I could write on the topic.)

Nice article just a quick question why the rows column of the execution plan show 3 row in the final step (i.e. id=0)
it should be 50 right (or at least close to 50 if there is some arithmetics error)?

This is “just” the usual arithmetic problem that the optimizer has with dependent columns. You can see that the two columns are perfectly correlated and that the query is asking for a given overlap. The optimizer sees only that we have two columns that are each asking for 1/30th of the data so, ignoring some of the fiddly bits it’s arithmetic is “total_rows * 1/30 * 1/30″ (which isn’t 3, but it’s a lot closer than the 50 you’re expecting.

I’m thinking about the sentence “…total data volume expected from line 3 is slightly larger than the total data volume from line 4 where we extract only the rowid…”

If we look at the Projection Information you are right, but I think that in this scenario Oracle is saying something like “fast full scan of the index costs me 36 bytes, then of this 36 I take only the ROWID” that is: in a fast full scan Oracle do something like a full table scan so it reads all the index’s column(s) even if they are not needed and then selects only what really need (the Projection Information).
So in this case Bytes (column from the plan) and the Projection Information can be a bit “unrelated”.