January 31, 2011

ANSI Outer

Here’s an example of ANSI SQL that does something in a fashion that arguably looks a little tidier than the strategy you have to adopt in Oracle. As so often when I compare Oracle syntax and ANSI syntax it’s an example that relates to an outer join. We start with two tables – as usual I have locally managed tablespaces, 8KB blocks, 1MB uniform extents and freelist management. I’ve also disabled system statistics (CPU costing):

If you’re familiar with ANSI SQL you won’t need more than a couple of moments to interpret the following query – but I have to admit that I had to think about it carefully before I figured out what it was trying to achieve:

Look carefully at the query. The ON clause includes a reference to the t1 table that is NOT a join condition. This condition means that Oracle will try to find a matching row in t2 only if the n1 value in t1 is in 7,11, or 13.

Since the join is left outer, any t1 row where id = 15 will be returned, but the n1 column from t2 will be reported only if the t1.n1 value is in 7,11, or 13 and there is a t2 row with a matching id value.

How would you express the same requirment in standard Oracle syntax ? Here’s one possibility:

You will have noticed that I used the /*+ gather_plan_statistics */ hint in both queries, and the plans I’ve shown are the ones that I pulled from memory with their last execution statistics included.

The plans are clearly just a little different from each other – but are they equally efficient ?

Both plans start the same way – for each relevant row in t1 they call line 4 – and in both cases there is only one relevant row in t1, hence one call to line 4.

In the first plan we call the subroutine to create a rowsource (the VIEW operator), and this calls the FILTER operation. The filter operation is an example of a “conditional” filter – i.e. if the test in line 5 is true then line 6 is called – and in this case the test is false so line 6 is never called (Starts = 0) and Oracle doesn’t try to access table t2. So we fetch one row from t1, then call two subroutines that between them do a “case” test but access no data blocks.

In the second plan line 4 calls the table access operation, which calls the index range scan operation in line 5 – but line 5 starts with a call to the case statement that returns NULL – so even though we call the index range scan operation, we don’t access any data blocks, which means we don’t pass any rowids to the table access in line 4, which means that that operation doesn’t access any data blocks. The net workload is that lines 4 and 5 in the second plan also represent nothing more than two subroutine calls and a “case” test.

The two plans are virtually identical in resource usage – so your choice of which to use probably comes down to aesthetic appeal and comfort factor. Thanks to my previous habits I think the Oracle variant is “obviously” much easier to understand – but the client probably felt the opposite because they spent most of their time writing ANSI.

Is there another way to write the query – yes, there is, but for some reason it’s not supported. If you look at the 10053 trace file for the ANSI example you’ll find that Oracle has rewritten it with a LATERAL subquery before optimising it. So, presumably, we can appply the same transformation manually:

Oracle uses the table() operator with collections (or pipelined functions) as the equivalent of the ANSI LATERAL() operator, but doesn’t let you use it with queries (you get Oracle error “ORA-22905: cannot access rows from a non-nested table item” if you try).

The concept is simple – the table() or lateral() operator allows you to introduce a subquery in the FROM clause that references columns from a table that appears earlier in the same FROM clause.

I don’t know why Oracle doesn’t support the lateral() operator in end-user code – but actually, for experimental purposes only of course, if you want to play with it you can always set event 22829:

The plan is identical to the plan for the ANSI after transformation. I’ll leave it to you to decide whether the code is easier to understand than the ANSI or ordinary Oracle versions – but I’d like to see it made legal, even if I didn’t find many cases where I needed it.

Related

Oracle uses the table() operator with collections (or pipelined functions) as the equivalent of the ANSI LATERAL() operator, but doesn’t let you use it with queries (you get Oracle error “ORA-22905: cannot access rows from a non-nested table item” if you try).

Very cunning – but you only get half marks for that one because it is still using a collection operator, and (in general) you would have to create both a scalar and a table type to support each subquery you wanted to use, and it’s got the 8,168 (or blocksize – 24) cardinality issue to address.

I personaly prefer the ANSI syntax but always try to avoid “ON clauses that are NOT join conditions”. I’m sure you are not the only one that “had to think about it carefully before figuring out what it was trying to achieve”. It is the same with me, each and every time I see such thing.

I found a way to write such queries that is very clear and understandable to me. We just need to understand that the “ON clauses that are NOT join conditions” are actually “WHERE clauses” for the table being outer joined. This is visible in the predicate of line 7 in your very first query:

Correct if I’m wrong. But I have heard that the CBO cannot optimize a query that uses an ANSI Outer Join syntax. The CBO must convert it to Oracle’s outer join syntax before the query can be optimized. Is that correct?

Essentially the optimizer will transform almost any SQL to suit its purposes; ANSI is only slightly special in this respect; but you are right that in effect Oracle transforms from ANSI style to Oracle style before optimising in most cases. As far as I know there are only two classes of ANSI SQL that have a “native optimization” code path – the full outer join and the partitioned outer join.

[…] table’s metadata to the table by partition identifier – except you would probably need to use a laterval view, which Oracle doesn’t support, and make the partition extended syntax part of the lateral […]

I use to think in rowsets, that is to say, reduce any complex SQL to its sets of rowset over which I apply operators in order to produce the desire output. This leads to the usage of ‘with v as …, v1 as ()… ‘ and piling of ‘select … from ( ) ‘. Thought less elegant, it lis a faster delivery method of complex SQL where would-be-lateral are just another rowset to produce and integrate. The downside is rather the optimizer behaviour when the layers piles, however this is balanced by the fact that rowset-building-approach allows a gradual build-and-tune of the query while you are integrating rowset after rowset.

[…] solution and followed by window functions, scalar subqueries, and lateral derived tables. Note that event 22829 needs to be set before using lateral derived tables; thanks to Jonathan Lewis for that […]

[…] 4 and 15. If you tried to run this SQL from the command line (after fiddling the event to enable lateral() views), or even just the simple select running from lines 13 to 26, you would get error ORA-00923: […]