May 23, 2012

Logical tuning

Here’s a model of a problem I solved quite recently at a client site. The client’s query was much more complex and the volume of data much larger, but this tiny, two table, example is sufficient to demonstrate the key principle. (Originally I thought I’d have to use three tables to model the problem, which is why you may find my choice of table names a little odd). I ran this example on 11.2.0.2 – which was the client version:

You’ll note that I’ve included the /*+ unnest */ hint in the subquery because I want Oracle to run this as a hash anti-join; and it would appear to be legal (and simple) to do this given the various not null contraints and primary key information. In fact the plan uses a filter subquery:

I suspect that the optimizer code bypasses the anti-join because of the (carefully bracketed) disjunct (OR) predicate. On the client site this resulted in the subquery being executed 9 million times, reducing an intermediate data set from 9M rows to 2M rows at a cost of 27 million buffer visits and about 60 CPU seconds. Fortunately I was able to dredge up a little bit of propositional calculus and quote the following equivalence:

not( A or B ) <=> (not A and not B)

Equally fortunately I didn’t have to worry about three-valued logic (all relevant columns were declared not null), so I was able to rewrite the query in the form:

The change to hash anti-joins was a huge benefit (the nested loop anti-join would have improved things for the client to a degree, but there’s not really an enormouse difference in some cases between a filter subquery and an equivalent nested loop anti/semi-join). In this case the query run time dropped from 95 seconds to 27 seconds – all of it CPU time.

Update: a check of the 10053 trace file for 10.2.0.3 shows the following:

Related

I think that it would be be possible to get a good plan with initial query by the using of concatenation, if there weren’t bugs (see Bug 7286607: INCORRECT RESULT FOR THE QUERY IN 10.2.0.4 for example). I’m not sure that concatenation could occur for outer anti-joins in 10g where bugs wasn’t fixed and I don’t have 10g for testing right now.

I had a look at both bugs (this comment, and the next), and I don’t think either of them really relates to the shape of query I have. Both are about semi-joins, and the 10g bug relates to tableA with subquery being joined to tableB with subquery, and the positioning of the two separate subqueries.

Given my additional comment about the query working with a range predicate for the third column – it really does look as if a concatenation (or probably inlist iterator) method could and should apply in this case inside the unnest.

again the “less than” issue.
I’ve checked this on 11.2.0.3 and it shows same behavior with the same message in 10053 trace.
And a question: did your client file an SR for this trouble? I think that would be a very nice feature request to the optimizer team, especially since they already can do opposite type of transformation called coalescing with multiple subqueries (in some cases).

We didn’t raise an SR – it’s the usual problem, once you’ve worked around a problem you move on to the next one. It would have taken a couple of hours to raise the SR. It’s possible that someone from Oracle might read the blog though, and follow it up.

[…] example of Oracle’s code not behaving consistently. You may recognise the basic query from yesterday’s example of logical tuning – so I won’t reprint the code to generate the data sets. This examples in this note […]

Thanks for sharing the problem and excellent analysis of it as well as the clever solution.
Just one doubt. You said in the “Update” On the other hand, it is possible to get unnesting and a hash anti-join with the predicate: id3 between id2 and id2 + 1000.
But won’t that make the query different to the original query? id3 between id2 and id2 + 1000 is not equivalent to (id3 = id2 or id3 = id2 + 1000). Or is it because of the nature of data that these two predicates will produce same results? Or did I miss something obvious?

Thanks for making that comment. I should have stated explicitly that the query was not logically the same and could be expected to return a different answer. The reason for the comparison was to point out that a surprisingly similar type of query (equality on two columns out of three). In fact I think I’ve even seen cases where the optimizer has turned “in (list of values)” into a range scan that discarded unwanted values – I’ll have to see if I can find time to engineer an example.

Hi Jonathan,
I have learned a lot lot by reading your replies on OTN forums and also few times you have helped with nice replies to my queries. Thanks a ton for those.

Can you please help me with my one concern.
One thing i am still uncomfortable is sql tuning. Few questions:
1. How can i become really comfortable?? How can I be sure that when i tune a sql or a job there is no scope of betterment left.
I really want to build my credibility in my organization with respect to sql tuning. As of now i am not satistified with my improvement and performance :(

I am curious to know your views on this not-so-exactly-technical topic :)

First, I think it is enormously helpful if you understand the meaning of a query. I’ve often made the point that it usually takes me longer to optimize complex queries if the table and column names are words not English. If you don’t understand that a query is trying to find a find all orders placed in May for a particular product that was out of stock and due for replenishment within 7 days, then it’s hard to envisage the paths through the data that might lead to the most efficient join order.

Second, you really have to know the data volumes, patterns, and possible anomalies (special cases) so that you can say things like: “each time I pick a row from tableX I have to acquire N rows from tableY, but with the current indexing that means picking up 5*N rows and discarding 80% of them”.

Third, it’s important to be able to think flexibly about transformations and access paths. Do you want to execute a filter subquery rather than an anti or semi join, at what point in the query do you want to do that and how much work will it take; could you change a “not exists” into a “minus”; would aggregating early be better than aggregating late.

Finally, re-read the manuals, and contents of v$fix_system_control, and v$sql_hints, and Oracle’s bug-fix report (etc.) from time to time. There may be things that you didn’t understand or appreciate the last time you read them that suddenly make sense, or perhaps look relevant because you’ve recently done something similar.