All about Performance of Oracle and other relational databases on Linux and UNIX based systems. How to measure it, analyse it, tune it, and manage it over time.

Monday, 2 July 2012

All Outer Joins Are Not Equal

When executing SQL involving ANSI syntax OUTER
JOIN's Oracle first converts this to its own native syntax before executing the SQL, and sometimes
this conversion is not totally transparent thereby affecting and
restricting the execution plan produced.

This started when I had a
long running query, and it happened to involve
outer joins between most of the tables (but not all of them). From the
execution plan it was clear that the main problem was a full table scan
on a very large table - out of a total cost of 35,000, 28,000 was from
this full table scan alone. So replacing the full table scan by a more
efficient index access, if possible, could lead to a significant
improvement.

After some analysis I created an index on all of the
referenced columns in the table. This meant that all necessary data
for the query could be obtained from the index, with no need to visit
the data rows in the table at all. As the data row was quite wide (180
bytes), the index was much narrower and much smaller in size.
Also, because the index stores data in sorted order by the leading
columns, it meant that the rows the query wanted would all be
stored together within the index structure, further reducing the number
of disk I/Os needed to retrieve it all. But, the Oracle Optimizer would
not use this index at all. It just seemed to ignore it, and always do a
full table scan.

At this point I tried another of my techniques
for understanding how a query gets executed - remove tables from the
query one at a time, until you get a more "sensible" execution plan, and
then build it back up again to the final query by adding a table at a
time, monitoring the execution plan after each change.

What I
noticed was that when I got down to 3 tables from the original 7, the
Optimizer would now choose to use the new index (doing an Index Fast
Full Scan, at a lower cost of 4,000). But when I added back in a fourth
table, it went back to doing a Full Table Scan at a cost of 28,000.

I
did a 10053 trace on the execution of each query (3 table and 4 table queries)
and noted that in the SINGLE TABLE ACCESS PATH section of the 3 table
query it was costing an Index Fast Full Scan (IndexFFS), whereas in the 4 table query it was not costing this access path, and only had the Full Table Scan costed.

At
this point while I could see what was going on (in one case the
Optimizer knows about the Index Fast Full Scan as an option, but in the
other case it doesn't) I could not explain why it was happening, nor
what to do about it. So I turned to the Oracle Forums and raised a
question about Why Optimizer ignoring Index Fast Full Scan?
While waiting for replies I also continued to tinker with the queries
in the hope of gaining some extra understanding of what was going on.

The cause of the problem was identified by Dom Brooks as being the fact that Oracle internally rewrites ANSI syntax OUTER JOIN's to its own native syntax (using "(+)")
before then optimizing and executing the query, and that this can
result in some complex SQL that cannot be optimised as you might
otherwise expect. Also see his earlier reply which gives more background information and links to other explanations about this.

When
it rewrites the ANSI syntax JOIN's to Oracle native syntax, it wraps
each join up in its own in-line view, to ensure that it is equivalent to
the ANSI meaning. It then tries to merge these inline views back into
the main query. For a single outer join, and other simple cases, this
view merging happens smoothly. But in other cases it does not, and the
Optimizer is left with some complex SQL with the joins being performed
within the inline views. And this restricts the options available to it
when building the final execution plan.

"In
Oracle, ANSI left and right outerjoins are internally expressed in
terms of left outerjoined lateral views. In many cases, a left
outerjoined lateral view can be merged and the ANSI left (or right)
outerjoin can be expressed entirely in terms of Oracle native left
outerjoin operator. (A lateral view is an inline view that contains
correlation referring to other tables that precede it in the FROM
clause.)"

However, if the joins are more complex, or there are other
issues e.g. NULL values allowed within indexed columns, then it may not
be possible to collapse the lateral views back into a single level
query. In which case the options then open to the Optimizer are less
than they would be otherwise, as it must treat the unmerged views on
their own.

So if you have a query involving OUTER JOIN's that is
not getting as good an execution plan as you think it could get, it may
be due to you having used ANSI join syntax. And when Oracle rewrites
this to its native syntax it is not able to merge back together the
inline views for one reason or another. In my case it seems to have
been due to the fact that almost all columns allowed NULL values to be
stored in them (except the primary key columns), and NULL values are not
stored within indexes on disk. As a result, Oracle was not able to
utilise this extra index when more tables were added to the query that
were also outer joined.

No comments:

Search This Blog

About Me

I am an Oracle Database professional with over 20 years experience of using Oracle on large, complex, high performance database systems. Excellent knowledge of Oracle internals and scalability, especially on UNIX and Linux systems.