August 9, 2010

Joins – NLJ

This is part one of my thesis that “all joins are nested loop joins – it’s just the startup overheads that vary”; there will be notes on “Joins – HJ” and “Joins – MJ” to follow. (For a quick reference list of URLs to all three articles in turn, see: Joins.)

In some ways, the claim is trivially obvious – a join simply takes two row sources and compares rows from one row source with rows from the other, reporting cases where some test of column values is true. Until the age of quantum computing, when everything will happen everywhere at once, we are stuck with Turing machines, and hidden somewhere in the process we are bound to see a sequence of steps similar to:

for each interesting row in rowsource X loop
    for each related row in rowsource Y loop
        report required columns from X and required columns from Y
    end loop
end loop
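
To make the loops concrete, here is a minimal SQL sketch of the kind of two-table join the pseudocode describes; all of the object names (tabx, taby, status, x_id, tabx_i1, taby_i1) are purely illustrative rather than taken from any real test:

create table tabx (id number not null, status varchar2(10), padding varchar2(60));
create table taby (x_id number not null, padding varchar2(60));

create index tabx_i1 on tabx(status);
create index taby_i1 on taby(x_id);

select
        tabx.padding    x_padding,
        taby.padding    y_padding
from
        tabx,
        taby
where
        tabx.status = 'INTERESTING'
and     taby.x_id   = tabx.id
;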

This, of course, is the basic theme of the nested loop join – we have two loop constructs, one inside the other (hence nested), and we can understand intuitively what we mean by the outer loop and the inner loop and therefore extend the concept to the “outer table” and “inner table” of a traditional nested loop join.

Looking at this from an Oracle perspective, we typically think of a nested loop join as a mechanism for examining a small volume of data using high-precision access methods, so the loop logic above might turn into an execution plan such as:
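
(The plan below is an illustrative sketch using the hypothetical tabx/taby objects and indexes above, showing the shape such a plan would take rather than the output of the original test.)

--------------------------------------------------
| Id  | Operation                     | Name     |
--------------------------------------------------
|   0 | SELECT STATEMENT              |          |
|   1 |  NESTED LOOPS                 |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | TABX     |
|*  3 |    INDEX RANGE SCAN           | TABX_I1  |
|   4 |   TABLE ACCESS BY INDEX ROWID | TABY     |
|*  5 |    INDEX RANGE SCAN           | TABY_I1  |
--------------------------------------------------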

In this case we use an accurate index to pick up just a few rows from table TABX, and for each row use an accurate index to pick up the matching rows from table TABY. When thinking about the suitability of this (or any) join method we need to look at the startup costs and the potential for wasted efforts.

By startup costs I mean the work we have to do before we can produce the first item in the result rowsource – and in this case the startup costs are effectively non-existent: there is no preparatory work we do before we start generating results. We fetch a row from TABX and we are immediately ready to fetch a row from TABY, combine, and deliver.

What about wasted efforts? This example has been engineered to be very efficient, but in a more general case we might have multi-column indexes and predicates involving several (but not all) columns in those indexes; we might have predicates involving columns in the tables that are not in the indexes and, of course, we might have other users accessing the database at the same time. So we should consider the possibility that we visit some blocks that don’t hold data that we’re interested in, visit some blocks many times rather than just once, and have to compete with other processes to latch, pin, and unpin (some of) the blocks we examine. Given sufficiently poor precision in our indexing we may also have to think about the number of blocks we will have to read from disk, and how many times we might have to re-read them if we don’t have a large enough cache to keep them in memory between visits. It is considerations like these that can make us look for alternative strategies for acquiring the data we need: can we find a way to invest resources to “prime” the nested loop join before we actually run the loops?

I’ll answer that question in the next two notes – but before then I’d like to leave you with a concrete example of a nested loop join. This was run on 10.2.0.3 with an 8KB blocksize in a tablespace using freelist management and 1MB uniform extents.

I’ve created one of my two tables in a single-table hash cluster, given it a primary key which is also the hash key, and ensured that I get no hash collisions between rows in the table (i.e. no two rows in the table hash to the same hash value). With my particular setup the optimizer has decided to access the table by hash key rather than by primary key index. Here’s the execution path.
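
(The path below is an illustrative sketch rather than the original output; the index name tabx_i1 is hypothetical.)

--------------------------------------------------
| Id  | Operation                     | Name     |
--------------------------------------------------
|   0 | SELECT STATEMENT              |          |
|   1 |  NESTED LOOPS                 |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | TABX     |
|*  3 |    INDEX RANGE SCAN           | TABX_I1  |
|*  4 |   TABLE ACCESS HASH           | TABY     |
--------------------------------------------------

A setup along the following lines is the sort of thing that would produce this shape; the names, hashkeys count, and size value are hypothetical, and I’ve assumed it is the second (inner) table that goes into the hash cluster, with the join now arriving at taby through its primary key column id:

create cluster taby_cluster (id number(6))
        single table
        hashkeys 10000
        hash is id
        size 100
;

create table taby (
        id              number(6)       not null,
        padding         varchar2(60),
        constraint taby_pk primary key (id)
)
cluster taby_cluster (id)
;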

This isn’t the right place, and you haven’t given a version number.
I’ve reproduced this on 11.1.0.6 – the first query erroneously returns two rows, the second correctly returns none – it’s a bug, raise an SR.

Looking at the 10053 trace file, Oracle seems to have “lost” the sal > null predicate in part of the logic for calculating single table cardinalities in the case where that predicate is the last one for the table. The error doesn’t occur if the null is supplied as a bind variable.
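
For context, the predicate shape being described is presumably something like the following (the emp table and the deptno predicate are just placeholders; only sal comes from the comment above):

-- literal null, appearing as the last predicate for the table
select * from emp where deptno = 10 and sal > null;

-- the same predicate with the null supplied as a bind variable
variable n1 number
select * from emp where deptno = 10 and sal > :n1;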

As you have written specifically about Nested Loop Joins, would it be possible for you to mention how Oracle processes NL joins, especially in the following releases?
a) 8i (plan as mentioned above)
b) 9i and 10g (introduced table prefetch)
c) 11g (I don’t know what it is called, but the plan changes a bit)
I have not managed to find these details in a single document/post anywhere else.

True, I just tried it myself. Even a full Google search shows little more than a couple of passing references I’ve made to the feature and lists of hints from 11g. It’s possible that I’m remembering a discussion from Christian’s book (Troubleshooting Oracle Performance).