In the above, the table PFMQ_MESSAGEDATASTORAGE contains a LONG RAW column – the MESSAGEPARTDATA column. You enable a 10046 trace, but only at level 1 – so there are no wait events written to the trace file. The resulting trace file is then processed using TKPROF, and you obtain the following output:

How would you troubleshoot this performance problem? What are the problems? What looks good about the above output? What about the above makes you wonder if some detail is missing? You will find the above SQL statement in this comp.databases.oracle.server Usenet thread.

What can be said about the above output?

The client application is retrieving roughly 15 rows in each fetch request: 64,806/4,321 = 14.998 rows per fetch. Maybe setting the fetch array size to a larger value would help?
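As a point of reference, 15 rows per fetch happens to be the SQL*Plus default. If the client were SQL*Plus (an assumption – the actual client application is not identified in the thread), increasing the fetch array size would look like this:

```sql
-- SQL*Plus fetches 15 rows per round trip by default (ARRAYSIZE 15),
-- which matches the ~15 rows per fetch calculated above.
-- Increasing the array size reduces the number of fetch calls:
SET ARRAYSIZE 100

-- Re-executing the query should then require roughly
-- 64,806 / 100 = 649 fetch calls rather than 4,321.
```

Other client interfaces expose a similar control – for example, setFetchSize in JDBC.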

The fetch calls required 14.56 seconds of the server’s CPU time, while the elapsed time for the fetches was 580.31 seconds. The remaining 565.75 seconds were probably spent doing something other than actively burning CPU cycles – such as waiting for the completion of disk reads.

Most of the indexes accessed were high precision indexes, meaning that few of the rowids returned by the indexes were eliminated at the table level, with the exception of the AK1_PFMQ_SEQUENCENUMBER index, where 46% of the rows identified by the index were discarded. Those rows were discarded very early in the plan, so the performance impact was likely minimal.

There was heavy use of nested loop joins – that might be OK, but might not be as efficient as a hash join if a large percentage of the tables were being accessed or if the clustering factor were high.

There were 231,750 blocks read from disk, and considering the heavy use of indexes and nested loop joins, those blocks were likely read one at a time from disk. If that was the case, the average time to read a block from disk was 565.75 / 231,750 = 0.0024412 seconds (2.4412 ms), which would be a very fast access time for physical reads from disk.

Considering the WHERE clause, the execution plan starts with the two tables with the greatest number of predicates – so that is probably a smart starting point.

The elapsed time reported on the last line of the execution plan is greater than the elapsed time reported by TKPROF for the entire execution – an odd inconsistency: the time= entry on the last line of the plan shows 613.5 seconds, yet the elapsed time reported by TKPROF was only 580.31 seconds. The second line of the execution plan shows 1,729.7 seconds (28.8 minutes) in its time= entry, which again is inconsistent with TKPROF’s elapsed time statistic.

580.31 seconds is roughly 9.7 minutes – what happened to the other 20.3 minutes in the 30 minute total execution time that was reported by the original poster?

What might be the next couple of steps for troubleshooting this performance problem?

Generate the 10046 trace at level 8, rather than level 1. That will cause the wait events to be written to the trace file. If the trace file contains a large number of long duration waits on SQL*Net type wait events, check the network with a packet sniffer (Wireshark, for example) and check the client-side activity to make certain that it is not the client application that is the source of the slow performance. If you see large unexplained gaps in the tim= values in the trace file without a corresponding long wait event in between, investigate whether the server’s CPUs are over-burdened.
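Enabling the level 8 trace for the session executing the query might look like this (a minimal sketch using the standard ALTER SESSION syntax):

```sql
-- Level 8 writes the same information as level 1, plus the wait events:
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';

-- ... execute the problem query ...

-- Turn the trace off when finished:
ALTER SESSION SET EVENTS '10046 trace name context off';
```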

Check the statistics on the tables and indexes to make certain that those statistics are reasonably accurate and up to date.
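For example, the LAST_ANALYZED dates and row count estimates could be checked with queries like the following (sketches only – the owning schema is not identified in the thread, so the LIKE pattern simply matches the PFMQ table names seen in the plan):

```sql
-- When were the statistics last gathered, and are the row counts sane?
SELECT TABLE_NAME, NUM_ROWS, LAST_ANALYZED
FROM   DBA_TABLES
WHERE  TABLE_NAME LIKE 'PFMQ%';

-- The clustering factor also matters for the nested loop joins above:
SELECT INDEX_NAME, NUM_ROWS, CLUSTERING_FACTOR, LAST_ANALYZED
FROM   DBA_INDEXES
WHERE  TABLE_NAME LIKE 'PFMQ%';
```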

Review the current optimizer related initialization parameters to make certain that silly parameters are not specified.
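A quick way to review those parameters (a sketch – the exact list of parameters worth inspecting depends on the release):

```sql
-- Non-default optimizer-related parameters deserve a close look:
SELECT NAME, VALUE, ISDEFAULT
FROM   V$PARAMETER
WHERE  NAME LIKE 'optimizer%'
   OR  NAME IN ('db_file_multiblock_read_count', 'cursor_sharing',
                'hash_area_size', 'sort_area_size');
```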

Most of all, determine where the missing 20.3 minutes have gone.


8 responses

March 3, 2010

Narendra (08:38:14):

Charles,

Nice analysis. You covered almost everything that I could think of and then asked what else. That is not fair. 🙂 Here is what I can think of:
1) To confirm the 3rd point (about row elimination) and the 6th point in your analysis (about the execution plan starting point), I would check the EXPLAIN PLAN outcome. Assuming this is a database version that provides predicate information in the plan, the EXPLAIN PLAN will a) show whether the expected plan is the same as the actual plan and b) indicate clearly at which stages the filter predicates are applied. Also, the ROWS column in the EXPLAIN PLAN output shows the expected row counts (per iteration) for a NESTED LOOPS join, which can be vital information, since the ROW SOURCE OPERATION section reports the row counts for all iterations combined (although in this case, it may not provide any new details).
2) I was actually a bit surprised to see the following steps in the ROW SOURCE OPERATION section:

With the way the query is written, I was expecting PFMQ_MESSAGESTATUS to be accessed more than once (once for each iteration). Is this some kind of optimizer improvement? Or is it because the XPKPF_MESSAGESTATUS index includes both the “statusrevisionnumber” and “ID” columns? Would there be any benefit in rewriting this query (using an analytic function) to remove the correlated subquery?
3) Finally, my most important question is: what is done (by the “application”) with the 64,806-row ordered set that this query produces? In other words, is it absolutely essential for this query to be executed on its own?
I am not saying this is necessarily wrong but I would certainly validate the approach/design.

p.s. BTW, what does the “r=” component in the ROW SOURCE OPERATION section indicate? Physical reads? I always seem to get “pr=” in 10.2.0.x versions and no “r=” in TKPROF reports.

It is good that you are reading these blog articles somewhat as personal challenges – that is how many of my articles are written. I do not always tell everything that I see to be wrong or right – partially due to time constraints, and partially to encourage other people to offer comments that provide much more complete (and accurate) answers than I could have offered. Fortunately, there are a number of very smart people who are willing to freely share their knowledge.

Good idea to check the estimated number of rows per operation in an execution plan – it could be that statistics have not been gathered in the last 8 years which will cause Oracle to greatly under-estimate the cardinality for the operations, or it could be a problem with non-mutually exclusive predicates (I think that Randolf described this as a potential problem). Randolf’s comment about the timing in the portion of the execution plan agrees with a comment that I thought about making, but his explanation is better than the one I considered offering.
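Narendra’s suggested EXPLAIN PLAN check might be sketched like this (DBMS_XPLAN.DISPLAY is available starting with release 9.2; the problem query itself is elided here):

```sql
-- Populate PLAN_TABLE with the estimated plan for the problem query:
EXPLAIN PLAN FOR
SELECT /* the problem query from the Usenet thread */ ...;

-- Display the plan, including the Predicate Information section:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The Predicate Information section beneath the plan shows the operations at which the access and filter predicates are applied, and the Rows column shows the per-iteration cardinality estimates.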

Good point about questioning what would happen with the 64,806 rows – will 64,800 of those rows be thrown away, or will all of the rows end up on a report of some sort?

The Oracle 11.2.0.x documentation describes the “r=” entry in the Row Source Operation section of the TKPROF output. Quoting from the documentation:

“In this sample TKPROF output, note the following under the Row Source Operation column:

• cr specifies consistent reads performed by the row source
• r specifies physical reads performed by the row source
• w specifies physical writes performed by the row source
• time specifies time in microseconds”

The original poster in the Usenet thread is running Oracle Database 9.2.0.6, so the “r=” entry on the STAT= lines in the trace file (the “Row Source Operation” lines in the TKPROF output) indicates the number of blocks that were physically read from disk. With your help, I think that we just found a documentation bug – something that was not updated in the documentation to reflect changes in Oracle’s tracing behavior. I am guessing that the STAT line trace file content changed starting with Oracle 10.1.0.x, but the documentation was never updated to reflect that change.

This is what the Row Source Operation section looks like in Oracle 10.2.0.x:

Centinul answered your second question about how I calculated the average time for a block read. Interestingly, Randolf came up with entirely different average read times by using the “Row Source Operation” section from the TKPROF output. He calculated that the average read time for an index block was 13 ms, while the average read time for a table block was between 3 and 5 ms. Which calculation is more accurate, I do not know.

I’ve replied to the original thread on Usenet group – there are a couple of noteworthy oddities with this TKPROF output that might explain the issue with the missing 20 minutes.

@Narendra: The PFMQ_MESSAGESTATUS table is actually queried for every iteration, since it is a child operation of the inner table of the NESTED LOOPS operation. But as you can see from the TKPROF output, it does not seem to be relevant in this particular case – it only takes 2 seconds to execute this subquery 64,000 times, so there are other areas to tackle first and no need to replace this subquery given the present execution plan.

Thanks for stopping by my blog and sharing your great troubleshooting skills. I did not feel comfortable/confident with using the time= values in the Row Source Operation section of the TKPROF output for analysis purposes because of the apparent inconsistencies between the STAT lines and the aggregation of the PARSE, EXEC, and FETCH lines, but that does not mean that the time= values should not be used.

I seem to recall Jonathan Lewis telling me in the first half of 2008 on the OTN forums that the time= values in the STAT lines could be inaccurate. I do not recall if he stated that the TIME= values could not be trusted because not all of the code paths are instrumented, there are rounding errors when a code path is repeatedly executed in a loop (a NESTED LOOP join), or possibly another reason such as time occasionally being allocated to the wrong operation in the plan. Of course, I have been unable to find that thread on OTN, so I might just be remembering his statement incorrectly.

It is an interesting problem, so I hope that the OP returns to the Usenet thread.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:
<pre>

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
  SYSDATE TODAY
FROM
  DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

********************
Greater than and less than signs in code sections are often interpreted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: &lt; for the less than sign and &gt; for the greater than sign.