Parent-Child Relationships and the Questions Left Unanswered by TKPROF, Re-Learning Something Old

30012012

January 30, 2012

As we have seen in the past, TKPROF output sometimes lies, and in a recent OTN thread I was reminded of another case or two where TKPROF output may be misleading. In the OTN thread, the original poster (OP) started the thread by asking a simple question about an execution plan that appeared in a TKPROF output. Below is an execution plan that is similar to the one posted by the OP:

For the above execution plan, the trace file was created on Oracle Database 11.2.0.2 and processed with a 11.2.0.1 client with the patch that fixes the ODBC problem. The original poster (OP) in the OTN thread posed a question about the cr= (consistent gets blocks) and pr= (physical read blocks) statistics. In the above, the INDEX RANGE SCAN operation shows cr=2 pr=8, and the TABLE ACCESS BY INDEX ROWID operation shows cr=3 pr=16. The OP questioned whether the TABLE ACCESS BY INDEX ROWID operation statistics included the INDEX RANGE SCAN operation statistics for a total of 3 consistent gets and 16 physical blocks read from disk, or if the TABLE ACCESS BY INDEX ROWID operation statistics are in addition to the INDEX RANGE SCAN operation statistics for a total of 5 consistent gets and 24 physical blocks read from disk. As sometimes happens with Internet forums, people responding to questions sometimes provide conflicting answers; in this case, at least one person stated that each interpretation is correct.

In an attempt to provide the OP with answers to all of his unanswered questions to that point in the thread, I provided the following response:

In the above, the “TABLE ACCESS BY INDEX ROWID” operation is the parent operation and the “INDEX UNIQUE SCAN” is the child operation. The child’s statistics are rolled into the parent’s statistics. In Oracle Database 11.1 and above, the default behavior is to output the STAT lines (that is the Row Source Operation execution plan) after the first execution of the SQL statement. For the first execution of the SQL statement, there were 11 physical block reads, 6 from the index and 5 from the table. There were 5 consistent gets, 3 from the index and 2 from the table. The estimated cost for the index access is 2, and the estimated cost for just the table access is 2.

I see that you also used the EXPLAIN option of TKPROF – that can lead to misleading execution plans being written to the TKPROF output, especially when bind variables are involved. The 1 parse call should be expected – it would be bad if there were 72,181 parses, because that would mean that there would be 1 parse per execution – a sign that the cursor was not held open either by the application or by the session cursor cache. Simply enabling a 10046 trace in a session changes the environment for the session, which will typically require a hard parse of previously parsed SQL statements when those SQL statements are re-executed. Note that “recursive depth: 1” appears in the output, which suggests that this SQL statement is found in a PL/SQL block.

db file scattered read waits indicate that more than 1 block is read from disk at a time. You can refer to the raw 10046 trace file to determine how many blocks are read on average in the db file scattered read waits.

OK, so if you believe the above quote, the child statistics are rolled into the parent’s statistics, the EXPLAIN option of TKPROF may write incorrect execution plans to the output file especially when bind variables are used, and the original trace file included SQL statements marked with dep=0 or dep=1. In the thread there was disagreement on the first point, so let’s throw together a very short test case to determine if that point is correct.

First, we will create a simple table with a primary key index, insert 200 rows into the table, and collect statistics on that table:

Next, I want to make certain that the two SQL statements that will be used in the test script will already exist in the library cache (so that we can test whether or not enabling a 10046 trace will cause a hard parse when the SQL statements are re-executed). I also will set the fetch array size to 1000 to minimize the number of round trips between the server and client (the default fetch array size is 15 for SQL*Plus, and that value would make the generated trace file a bit more difficult to read due to the extra FETCH and SQL*Net wait events). The second SQL statement, intended to make certain that the STAT lines for the first SQL statement will be written to the 10046 trace file, is only necessary prior to Oracle Database 11.1.0.6:

The above shows a hard parse, 8 blocks read by multi-block read for the index, and 16 blocks read by multi-block read for the table. If we process the trace file with TKPROF, a portion of the TKPROF output should look like this:

By examining the raw trace file, we are able to confirm that the pr=24 statistic in the first row of the Row Source Execution plan includes the 16 blocks for the table and the 8 blocks from the index, as identified in the raw 10046 trace file.

—

The OP later asked a handful of questions, including whether or not he should add the statistics found in the OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS section of the TKPROF output to the statistics found in the OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS section. My first thought was to modify my original test case script shown above like this, so that there is a logging trigger on the T3 table that records changes made to the T3 table in the T4 table:

In the script I set the NO_INVALIDATE parameter of the DBMS_STATS.GATHER_TABLE_STATS procedure to FALSE so that the session cursor cache would be flushed of all references to the T4 table, which then leads to the SQL statements appearing in the trace file on repeated executions if you need to execute the script a couple of times from that line of the test script to the end. I also moved the DBMS_MONITOR.SESSION_TRACE_ENABLE call outside of the anonymous PL/SQL block so that PL/SQL block would be included in the trace file.

After executing the test script, I found the following lines near the start of the generated 10046 trace file:

We are able to determine the elapsed time between these two calls with simple mathematics: (2302533226450 – 2302523409025) / 1000000 = 9.817425 seconds elapsed time in the trace file between the start of the 10046 trace and the call to end the 10046 trace.

Now, let’s see if we are able to find another case where TKPROF lies. We are able to find a hint of the problems in the book “Optimizing Oracle Performance” (see page xv and page 90). Here is the TKPROF summary for my execution of the test script (generated with this command: tkprof or1122p_ora_19180_10046_FIND_ME.trc or1122p_ora_19180_10046_FIND_ME.txt):

Does the above output indicate that my anonymous PL/SQL block completed in 5.35 seconds, and that the recursive call accounted for 2.1 seconds of that time? Or, should we add the non-recursive and recursive elapsed times together to determine the total time? Should we add the total blocks read from disk displayed in the non-recursive portion (0) to the total blocks read from disk displayed in the recursive portion (1240) to determine the total number of blocks read from disk during the execution? The above statistics indicate that there were 1047 waits for the db file sequential read wait event and 182 waits for the db file scattered read wait event in the OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS section of the TKPROF output. We know that each db file sequential read wait event represents reading one block from disk, and that each db file scattered read wait event represents reading at least 2 blocks from disk. Let’s try some simple mathematics: 1047 + (2 * 182) = 1411 – the wait event statistics for the OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS section of the TKPROF output indicate that at least 1,411 blocks should have been read from disk, while the TKPROF output only shows 1,240 blocks. Is TKPROF lying?

Let’s take another look at the raw 10046 trace file. Very close to the end of the trace file I found the following line:

The above indicates that cursor number 481810328 consumed 9.266459 seconds of CPU time, 9.631079 seconds of elapsed time, read 3,231 blocks from disk, and performed 267,427 consistent gets when that cursor was executed – yet that is not in agreement with the above TKPROF output (the numbers are greater than even the non-recursive plus recursive statistics would suggest). If we search upward in the trace file, we are able to find the code that is associated with cursor number 481810328:

That is our anonymous PL/SQL block, which would fall into the non-recursive totals that shows a total of 5.36 seconds of CPU time, 5.35 seconds of elapsed time, 0 blocks read from disk, and 0 consistent gets.

So, did TKPROF answer more questions than it generated?

I was curious, so I processed the trace file using my Hyper-Extended Oracle Performance Monitor program (AKA: Toy Project for Performance Monitoring). This is the what was written to one of the output files:

My program double counted the CPU time and elapsed time (and ? … see below) from the recursive calls. TKPROF only reported the consistent gets (218,319) and physical blocks read from disk (1,240) from the recursiveFETCH lines in the trace file, but discarded those statistics from the non-recursiveEXEC lines in the trace file. That possibly explans why the number of db file single block read waits plus two times the number of db file scattered read waits exceeds the number reported in the disk column of the TKPROF output:

1047 + (2 * 182) = 1411 > 1200

The partially interesting item is that the db file scattered read waits, on average, read 12.00 blocks at a time, and the many of the db file scattered read waits are attributed to a mysterious cursor #482246872. Another output of my program is an Excel spreadsheet that lists all wait events found in the trace file. I added the following formula in cell F2 and copied that formula down to the last wait event line:

With those two formulas in place, I was able to determine the actual number of physical block reads, and whether or not the blocks were read one at a time, or more than one block at a time. The resulting Excel spreadsheet looks like this:

…

…

So, the above screen captures show that there were 9.8159 seconds in the trace file between the first and last tim= values. The last of the above screen captures shows there were in total 1,047 single block reads (db file sequential read waits) and 182 multi-block reads (db file scattered read waits), with a total of 2,184 blocks read by multi-block read for an average of 12 blocks per multi-block read.

Well, it seems that the recursive SQL has caused a bit of a problem for my program. The 1,240 physical block reads found on the FETCHs line are actually included in the 3,231 (1,047 + 2,184) physical block reads found on the EXECs line.

—

As a final self-answering question, which is worse: that TKPROF reports only 1,240 physical block reads, or that my program double-counts these 1,240 physical block reads and suggest that there were 4,471 physical block reads? Maybe we should just ignore the recursive SQL statistics by checking an option box in my program:

That option in my program fixed the CPU seconds, elapsed time, total physical blocks read, and total consistent gets, but now the number of EXECs, FETCHs, and ROWs do not match what is found in the raw 10046 trace file.

As the above Excel output and my program’s fixed output demonstrates, sometimes there is no replacement for examining the raw 10046 trace files.

Now the way I interpret this is that each row resulted in 1948310 buffer reads that included 1,941,430 physical block reads. In my case I read the first 100 rows from the table and hence would expect 194,831,000 buffers reads and 194,143,000 physical block reads. Let us check against tkprof report. It shows 194142806 physical block reads and 194831114 buffer reads as below.

Thanks for showing how to use Unix/Linux OS tools to parse the raw 10046 trace files.

Oracle Database 11.1.0.6 and later by default will output the STAT lines (the Row Source Operation execution plan) in a trace file only after the first execution of the SQL statement (can you see where those lines appeared in your trace, it should be shortly after the first EXEC). Earlier versions would only output the STAT lines when the cursor was closed. That likely explains why the TKPROF Row Source Operation execution plan shows cr=1948310 pr=1941430 and not cr=194831114 pr=194142806.

This is really cool. I came here from OTN and saw the struff that Charles is showing and then Mich’s stuff. They are great, A very good explanation and examples of treating tkprof output and aggregates.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:
<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: