SQL tuning request

Without knowing anything about the problem in advance, I thought it would be good to do a walkthrough post of a SQL tuning request.

But now that I’m done, I’m unconvinced: it’s probably too long, the query too meaningless, and the Real-Time SQL Monitoring text output too unreadable for a blog post.

I’m always reluctant to post real, specific application issues because I’m never sure how well they translate and illustrate the desired points unless you convert them to a standalone test case.

Let’s see how it goes.

The idea is to touch on the broad strategies that I’m going through.

The scope of the solution should match the scope of the problem – so for a single problem query, table design and current indexes, stats and histograms should be considered as set in stone.

Ideally we want to avoid hinting as much as possible or at least stick to acceptable hints.

And if we’re going to manually intervene, we’re not particularly interested in what the current production plan is nor the usage of any plan stability features to preserve it (unless it’s better than we can do ourselves).

This is from an 11gR2 testing environment, 11.2.0.3 to be specific.

Having done the latest merge of a production code release (9.2.0.8) into our 11gR2 environment (upgrade ETA March), a report has been flagged as slow, taking about 1.5 minutes in production but some 20 minutes in the 11gR2 environment.

So, having traced the report and found that the driving query is the prime suspect, let’s get some feedback on the performance of the SQL statement.

Here’s the original sql statement to give some context – I’ve commented out most of the columns because they don’t add much other than just length to the post.
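(The statement itself hasn’t reproduced well here, so the following is a hypothetical skeleton only: the table names match the schema notes below, but the column lists and driving predicates are invented for illustration.)

```sql
-- Hypothetical skeleton only: the real statement is far longer.
SELECT i.inst_num  -- , many more columns
FROM   inst i, isco c, isdm d, borg o, bsta s
WHERE  i.inst_num   = c.inst_num
AND    i.inst_num   = d.inst_num
AND    i.issuer_num = o.issuer_num
AND    i.inst_num   = s.inst_num
AND    s.current_mat_date >= :dt     -- illustrative first set of driving predicates
UNION ALL
SELECT i.inst_num  -- , many more columns
FROM   inst i, isco c, isdm d, borg o, bsta s
WHERE  i.inst_num   = c.inst_num
AND    i.inst_num   = d.inst_num
AND    i.issuer_num = o.issuer_num
AND    i.inst_num   = s.inst_num
AND    s.current_mat_date >= :dt     -- illustrative second set of driving predicates
AND    NOT EXISTS (SELECT NULL
                   FROM   rule r, book b, isbk k
                   WHERE  r.book_num = b.book_num
                   AND    b.book_num = k.book_num
                   AND    k.inst_num = i.inst_num);
```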

Looking at the query and the repetition of tables and joins between the two UNION ALL parts, it looks like a classic case of two sets of disparate driving data that then need to be joined to the same additional tables.

Ironically, expanding ORs out into UNIONs is a common initial tuning step for performance problems with OR predicates.

Here’s what little extra schema knowledge might be useful:

INST has a pk of INST_NUM.

ISCO has an n:1 relationship with INST.

ISDM has an n:1 relationship with INST.

BORG has a 1:1 relationship with INST.ISSUER_NUM.

BSTA has a 1:1 relationship with INST.

ISBK has an n:1 relationship with INST and an n:1 relationship with BOOK.

BOOK has a pk of BOOK_NUM.

RULE is a table used for generic filtering rules, in this case related to BOOK.BOOK_NUM.

Let’s use Real-Time SQL Monitoring (usual license caveats apply) to see what the current performance story is.

SELECT dbms_sqltune.report_sql_monitor('<sql_id>') FROM DUAL;

If there’s one downside to RTSM, it’s a bit too wide for these blog posts :(
Maybe the RTSM pictures would have been better?

Anyway, I’ve sacrificed some of the columns in the report and tried to shrink the font (there’s a scrollbar at the bottom of the plan window).

The first thing you might have spotted is the Join Factorisation going on at step 4, as indicated by the name VW_JF*.

Maybe you’re always wary of the MERGE JOIN CARTESIAN … BUFFER SORT? It’s not always a problem, of course, but where there’s trouble you’ll often find her. (In real production scenarios, I find that this mechanism is a problem not so much when there are missing join conditions, because these are rarely found in production code, but rather when it is a valid join mechanism whose rowsource estimates are significantly inaccurate.)
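As a generic illustration of that last point (table and column names invented): t1 and t2 below have no direct join to each other, only via t3. If the optimizer underestimates the filtered t1 and t2 rowsources as a handful of rows each, a cartesian join between them looks cheap; if they actually return thousands of rows each, we buffer-sort and probe our way through millions of combinations.

```sql
-- Hypothetical: no direct t1-t2 join condition exists, only t1-t3 and t2-t3.
-- With low (mis)estimates for t1 and t2, the CBO may join them first
-- via MERGE JOIN CARTESIAN / BUFFER SORT, then join the result to t3.
SELECT *
FROM   t1, t2, t3
WHERE  t1.id     = t3.t1_id
AND    t2.id     = t3.t2_id
AND    t1.status = 'RARE'     -- estimated 1 row, say, actually thousands
AND    t2.region = 'EU';      -- likewise
```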

If you look at the SQL, you’ll see a function call as well.

There are also a whole bunch of estimates that are significantly off.

Where to start?

There are four main areas of questioning:

What’s taking all the time? Are there some particular steps in the plan which are more problematic than others?

If estimates are inaccurate, where do they go most wrong or go wrong first?

Which predicates eliminate the most data? It’s rarely a good thing to join thousands upon thousands of rows only to apply a late filter in the plan that reduces them down to a few handfuls. In short: eliminate early.

What is the simplest / quickest / least invasive change that can be made to significantly improve performance? And will it actually be sufficient?

From “Activity %”, I hope it’s clear from the report that all the time is taken up by the bottom half of the UNION ALL.

The rowsource cardinality estimates are not that accurate.

Plus we find in that second half our old friend the MJC+BS.

And we’re just burning CPU down there.

So, I’d like to isolate that bottom half of the UNION ALL and run it standalone.

However, in a clear indication of the issues with it, it won’t run standalone: it has just blown 30+ GB of temp space.

But it returns only 486 rows so let’s try to get a runnable standalone version.

Before we start looking at possible solutions, let’s start to ask questions about the query logic itself, keeping an eye out for redundant tables and joins and asking ourselves whether this is the best way to word the question we think is being asked.

The best way to start is to work through the predicates.

So, we’ve got a few filter predicates, a couple of join predicates and a NOT EXISTS correlated subquery.

The logic of the subquery is the first thing that jumps out at me.
1. BOOK seems redundant
We join RULE to BOOK and BOOK to ISBK, all by BOOK_NUM, and ISBK is correlated to the outer INST by INST_NUM.
We do no filtering by any BOOK attribute, so it serves no purpose. Let’s remove BOOK and join RULE straight to ISBK.

2. The subquery filtering by BSTA.CURRENT_MAT_DATE is irrelevant.
This is the same filter as in the outer select.
It doesn’t make sense.
Ignoring the actual evaluation order of all these predicates, we should effectively only be checking the NOT EXISTS against INST_NUMS that have passed the outer BSTA filter. So, by definition, this particular predicate in the subquery will always be true. Why repeat it? This is a mistake.
The only thing this subquery should be doing is checking that the rows are not in the RULE/ISBK combination. So, let’s remove that redundant filter.
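Putting those two points together, the subquery goes from something like the first form below to the second (column names hypothetical, as before):

```sql
-- Before: BOOK contributes nothing, and the BSTA date filter
-- duplicates the one in the outer query.
AND NOT EXISTS
    (SELECT NULL
     FROM   rule r, book b, isbk k, bsta s2
     WHERE  r.book_num  = b.book_num
     AND    b.book_num  = k.book_num
     AND    k.inst_num  = i.inst_num
     AND    s2.inst_num = i.inst_num
     AND    s2.current_mat_date >= :dt)

-- After: RULE joined straight to ISBK, redundant date filter gone.
AND NOT EXISTS
    (SELECT NULL
     FROM   rule r, isbk k
     WHERE  r.book_num = k.book_num
     AND    k.inst_num = i.inst_num)
```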

If we just comment out the redundant tables and joins as suggested above, then that bottom query runs in a couple of seconds:

However, I wouldn’t blame you if you wanted to go further, even if we’re bordering on Compulsive Tuning Disorder:

I don’t really like leaving behind a plan that’s got significantly inaccurate estimates – it leaves behind too much of a future threat.

And maybe we can also look at the original query and the direction the Join Factorisation was indicating and take it further? How about doing the UNION ALL a bit earlier and then do one lot of joining to the shared tables from the original SQL?
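Hypothetically, something of this shape: only the genuinely different driving logic stays inside the UNION ALL branches, and the shared tables are joined just once, outside.

```sql
-- Hypothetical sketch: union the disparate driving data first,
-- then join the shared tables a single time.
SELECT v.inst_num  -- , shared columns from c, d, o, s
FROM  (SELECT i.inst_num, i.issuer_num
       FROM   inst i
       WHERE  1 = 1  -- branch 1 driving predicates
       UNION ALL
       SELECT i.inst_num, i.issuer_num
       FROM   inst i
       WHERE  1 = 1  -- branch 2 driving predicates
       AND    NOT EXISTS (SELECT NULL
                          FROM   rule r, isbk k
                          WHERE  r.book_num = k.book_num
                          AND    k.inst_num = i.inst_num)) v,
      isco c, isdm d, borg o, bsta s
WHERE  v.inst_num   = c.inst_num
AND    v.inst_num   = d.inst_num
AND    v.issuer_num = o.issuer_num
AND    v.inst_num   = s.inst_num;
```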

Let’s use some dynamic sampling to improve some single-table cardinality estimates, and then perhaps let’s go too far and add some join estimate adjustments. (I say too far because I’m going to use opt_estimate; it’s undocumented and I’m not recommending it, but if you do use it, then use it in conjunction with qb_name.)
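For illustration only, this is the sort of thing meant (query block name, aliases and scaling factor all invented; dynamic_sampling is documented, opt_estimate is not, so treat it accordingly):

```sql
-- Hypothetical hint usage only.
SELECT /*+ qb_name(drv)
           dynamic_sampling(i 4)                          -- single-table estimates
           opt_estimate(@drv join(i k) scale_rows=0.01)   -- undocumented: going too far
       */
       i.inst_num
FROM   inst i, isbk k
WHERE  k.inst_num = i.inst_num;
```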

And let’s get rid of the function call which, take it from me, in this case is effectively a single table outer join lookup anyway.
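In outline, hypothetically: a per-row function call like the first form below can, when it really is just a single-row lookup, be rewritten as the outer join in the second form, letting the optimizer cost and order it like any other join (function, package and lookup table names are invented).

```sql
-- Hypothetical: function-based lookup, executed once per row.
SELECT some_pkg.lookup_desc(i.inst_num), i.inst_num
FROM   inst i;

-- Equivalent outer join to the (invented) lookup table behind the function.
SELECT l.descr, i.inst_num
FROM   inst i, lookup_tab l
WHERE  l.inst_num (+) = i.inst_num;
```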

An acceptable compromise might be to refactor the UNION ALL as per above but omit the undocumented opt_estimate hints.

All we need to do is some more testing to validate the results and to also triple check the performance when the data is not cached and we’re done.

If you made it this far then Wow! I probably wouldn’t have done.

So, to summarise, what have we done?

Well, we haven’t needed to know much about the original intention of the query nor the schema.

We’ve improved the accuracy of some of the estimates.

We’ve found some redundant tables and joins.

And we’ve reordered the query slightly to better phrase the question that we think was being asked, in the process moving the UNION ALL earlier in the processing so that some of the joined tables only needed to be referenced once (an advantage that might not be preserved if the CBO decides to merge the UNION ALL, but that would be the opposite of the recently introduced join factorisation mechanism).

And these simple, quick steps effectively reduced the execution time of a query from 20 minutes to a few seconds.

In hindsight, a good example because of the gains realised but a bad example because of the length of query and the associated detail, particularly the redundant tables.



4 Responses to SQL tuning request

“I’m always reluctant to post real, specific application issues because I’m never sure how well they translate and illustrate the desired points unless you convert them to a standalone test case.”

The reluctance is good, because your audience skill level is so unpredictable. Overcoming the reluctance is better, because there is value added in demonstrating your thought process away from a canned simple test. All you need to resolve is to say the prerequisite is to already be somewhat familiar with plans and the access methods involved. Of course there will be questions from the under-prepared and perhaps some complaints about the lack of a replicable test case; the former you can politely answer, and as for the latter, well, they will have such a complex problem soon enough if they don’t already, and maybe some will be tempted to try to put one together.

Test cases are necessary for demonstrating principles and bugs, your post reflects the reality of support. Being explicit about strategies, knowledge and conclusions is extremely helpful. You might consider uploading things that don’t size well in a blog to other places where they do, but keep the cut-down things in the blog so people don’t have to go there.

Thanks for the feedback Joel.
I know I’ve done at least one post previously with real time sql monitoring output in a pop out xls. I’m not sure that worked particularly well.
A wider output would be better but it costs.

Nice point to look at the query logic from a redundant-tables-and-joins point of view.
Buffer gets are reduced from 257K to 230K. Can you please explain what you mean by “We’re doing a lot more buffer gets, but the query is executing a heck of a lot faster.”?