10046 Extended SQL Trace Interpretation 3

In previous blog articles in this series we looked at various methods for starting a 10046 trace file and also the basics of walking through a trace file. Today’s blog article continues the investigation of 10046 trace files. To begin, we need a test script:

Table and index statistics are collected (the index statistics would already exist on Oracle Database 10.1 and above, as the statistics are calculated automatically during index creation).

The buffer cache is flushed twice to force physical reads when the test SQL statements are performed.

Two bind variables are defined.

The trace file to be created is named.

10046 tracing at level 12 is enabled for the session, specifying that the execution plan should be written after each execution of the SQL statement, rather than when the cursor is closed (the PLAN_STAT parameter is only valid in Oracle Database 11.1 and above).

A SQL statement using the bind variables is executed with the array fetch size set at the SQL*Plus default of 15.

The value of the second bind variable is increased from 2 (a very small number) to 10,000 (a very large number for the tables involved).

The array fetch size is increased from the default value of 15 to a value of 500.

The SQL statement is executed again with the new bind variable values.

A simple SQL statement is executed to force the closing of the previous SQL statement and the printing of the execution plan to the trace file (this is needed prior to Oracle Database 11.1, or when the PLAN_STAT parameter is not set to the default value of ALL_EXECUTIONS – note that the SQL statement being added to the session cursor cache after three executions may prevent the execution plan from being written to the trace file unless the PLAN_STAT parameter is set to ALL_EXECUTIONS).

The test is repeated a second time with a different name specified for the trace file.
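Assembling the steps above, a minimal sketch of the test script might look like the following. The actual test SELECT statement does not appear above, so it is represented by a placeholder comment; the name and value of the first bind variable are assumptions, while the table names, the PLAN_STAT value, the array fetch sizes, and the N2 values are taken from the description:

```sql
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => USER, TABNAME => 'T3_1', CASCADE => TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => USER, TABNAME => 'T4_1', CASCADE => TRUE)

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

VARIABLE N1 NUMBER
VARIABLE N2 NUMBER
EXEC :N1 := 1
EXEC :N2 := 2

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'TEST_SCRIPT_1';
EXEC DBMS_SESSION.SESSION_TRACE_ENABLE(WAITS => TRUE, BINDS => TRUE, PLAN_STAT => 'ALL_EXECUTIONS')

SET ARRAYSIZE 15
-- the test SQL statement, referencing :N1 and :N2, is executed here

EXEC :N2 := 10000
SET ARRAYSIZE 500
-- the same test SQL statement is executed here a second time

SELECT SYSDATE FROM DUAL;  -- forces the previous cursor to close

EXEC DBMS_SESSION.SESSION_TRACE_DISABLE
```

The script then repeats, after changing TRACEFILE_IDENTIFIER, to produce the second trace file.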

The above script was executed on Oracle Database 11.2.0.1, on a platform that only supports direct, asynchronous I/O (equivalent to setting the FILESYSTEMIO_OPTIONS initialization parameter to SETALL). The direct, asynchronous I/O configuration prevents an operating system level file cache from speeding up access to the physical data files (with an OS level file cache, the OS might be able to avoid accessing the disks entirely). Let’s start by looking at the statistics that were output by AUTOTRACE:
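As a sketch, the I/O configuration may be verified and (with SYSDBA privileges, followed by an instance restart) adjusted like this:

```sql
SHOW PARAMETER FILESYSTEMIO_OPTIONS

ALTER SYSTEM SET FILESYSTEMIO_OPTIONS = SETALL SCOPE = SPFILE;  -- takes effect after the instance is restarted
```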

Interestingly, the number of physical blocks read from disk (164) exceeds the number of consistent gets (163) plus the current mode gets (db block gets = 0). No redo was generated (redo generated on a SELECT might be caused by delayed block cleanout). 840 bytes were sent back to the client, and the server received 520 bytes from the client. Two round trips were performed in order to retrieve two rows from the database (even though the array fetch size was set to 15). There was also a single recursive call, probably during the hard parse of the SQL statement.
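For reference, the AUTOTRACE statistics shown above may be produced in SQL*Plus with something like the following (the PLUSTRACE role, or SELECT access to the underlying V$ views, is required):

```sql
SET AUTOTRACE TRACEONLY STATISTICS

-- execute the test SQL statement here

SET AUTOTRACE OFF
```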

The number of recursive calls dropped from one to zero (probably because a hard parse was not required). The number of consistent gets (blocks) is now roughly double the number of physical reads (blocks). 94,102 bytes were sent back to the client, and the server received 729 bytes from the client. Twenty-one round trips were performed to retrieve the 10,000 rows, and there are still no sorts in memory or on disk.

On the second execution with the N2 bind variable reset to a value of 2 we see another recursive call (maybe another hard parse?). Again, the number of physical blocks read from disk (171 – seven more than earlier) exceeds the number of consistent gets (163 – the same as earlier) plus the number of current mode gets (db block gets = 0).

On the second execution with the N2 bind variable reset to a value of 10,000 we see another recursive call (maybe another hard parse?). The number of consistent gets (336 – thirty-eight less than earlier) is a bit more than twice the number of physical block reads (150 – again thirty-eight less than earlier, and now less than what was required when N2 was set to a value of 2). What happened? Is Oracle Database 11.2.0.1 toying with our performance tuning ability? Maybe we should look at the TKPROF output (note that this is output from the 11.2.0.1 TKPROF):

Assuming that TKPROF from the Oracle 11.1.0.6 client was not used, the above indicates that the same execution plan was used when N2 was set to 2 and also when it was set to 10,000. Because the execution plan was written after each execution, the execution plan provided by TKPROF is a little confusing when trying to cross-reference the statistics to see exactly what happened. The statistics show that 10,002 rows were retrieved in 0.20 seconds with 537 consistent gets and 352 blocks read from disk, while the execution plan shows that 2 rows were retrieved in 0.000000 seconds with 163 consistent gets and 164 blocks read from disk. The summarized wait events in this case cannot be correlated to the lines in the execution plan because TKPROF only printed the first execution plan found in the trace file for the query.

Because we used the 11.2.0.1 client, TKPROF clearly shows that the execution plan changed during the second execution when the value of N2 changed from 2 to 10,000. With the two execution plans and a single execution with each value of N2 we are able to cross-reference the various statistics.
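For reference, the TKPROF output can be generated from the raw trace file with a command like the following sketch (run here from SQL*Plus via HOST; the trace file name is hypothetical – the actual name depends on the instance name, the server process ID, and the TRACEFILE_IDENTIFIER that was specified):

```sql
HOST tkprof orcl_ora_12345_TEST_SCRIPT_1.trc test_script_1.txt
```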

With N2 at a value of 2, we see that:

163 blocks were obtained from memory using consistent gets, 3 of which belonged to the IND_T4_1 index, 2 of which belonged to the T4_1 table, and 158 of which belonged to the T3_1 table.

171 blocks were read from disk, 8 of which belonged to the IND_T4_1 index, 8 of which belonged to the T4_1 table, and 155 of which belonged to the T3_1 table.

There was only 1 single block read, with 19 of the 20 physical read events performed using multi-block reads with an average of 8.95 blocks read per multi-block read [(171 – 1)/19].

The total physical read time per the wait events is 0.00 seconds. The elapsed time and CPU time per the statistics summary is 0.00 seconds.

With N2 at a value of 10,000, we see that:

336 blocks were obtained from memory using consistent gets, 0 of which belonged to the IND_T4_1 index, 178 of which belonged to the T4_1 table, and 158 of which belonged to the T3_1 table.

150 blocks were read from disk, 0 of which belonged to the IND_T4_1 index, 150 of which belonged to the T4_1 table, and 0 of which belonged to the T3_1 table.

There was only 1 single block read, with 16 of the 17 physical read events performed using multi-block reads with an average of 9.31 blocks read per multi-block read [(150 – 1) / 16].

The total physical read time per the wait events is 0.00 seconds. The elapsed time and CPU time per the statistics summary is 0.01 seconds and 0.03 seconds, respectively.
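The multi-block read averages quoted above are simple arithmetic, which can be verified directly from SQL*Plus:

```sql
SELECT
  ROUND((171 - 1) / 19, 2) AVG_MBR_N2_2,      -- 8.95 blocks per multi-block read
  ROUND((150 - 1) / 16, 2) AVG_MBR_N2_10000   -- 9.31 blocks per multi-block read
FROM
  DUAL;
```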

The TKPROF summaries are nice, but there is still a missing element – detail. Before we dive into the raw 10046 trace, let’s find the OBJECT_ID and DATA_OBJECT_ID for the various objects in our test:
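The OBJECT_ID and DATA_OBJECT_ID values may be retrieved with a query similar to this sketch (the owner is assumed to be the current user):

```sql
SELECT
  OBJECT_NAME,
  OBJECT_TYPE,
  OBJECT_ID,
  DATA_OBJECT_ID
FROM
  USER_OBJECTS
WHERE
  OBJECT_NAME IN ('T3_1', 'T4_1', 'IND_T4_1')
ORDER BY
  OBJECT_ID;
```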

As we saw in the execution plan earlier, 83176 represents the IND_T4_1 index, 83174 represents table T3_1, and 83175 represents table T4_1. The objects were created in an ASSM autoallocate tablespace. Now, on to the raw 10046 trace files.

From the above, the parse call required 0.000122 seconds to perform a hard parse using an optimizer goal of 1 (ALL_ROWS), and two numeric bind variables were provided. The SQL_ID and HASH_VALUE are also displayed (these may be used to retrieve a DBMS_XPLAN formatted version of the execution plan). Continuing with the 10046 trace file:
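As a sketch, the SQL_ID from the PARSING IN CURSOR line may be fed to DBMS_XPLAN.DISPLAY_CURSOR to retrieve the plan from the library cache while the cursor is still cached (the &SQL_ID substitution variable is a placeholder for the value found in the trace file):

```sql
SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&SQL_ID', NULL, 'TYPICAL'));
```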

From the above, we see that the first fetch returned only a single row, and that fetch required 0.108849 seconds, reading 164 blocks from disk, with 161 consistent gets. The one single block read at the start was for object 83174 (table T3_1), while the one single block read just before the fetch was for object 83175 (table T4_1). Object 83176 (index IND_T4_1) was read using a multi-block read; not all index accesses are necessarily single block reads. A fast full index scan will usually use multi-block reads, but the execution plan here shows an index range scan. What was the purpose of the two single block reads of the tables?

Comparing those numbers with the single block reads, we see that the segment header block of table T3_1 was read, and the block immediately after the segment header block of table T4_1 was read. Notice that there is quite a bit of variation in the block read access times, ranging from 0.000093 seconds to 0.000303 seconds for the single block reads, and from 0.000685 seconds (0.7ms) to 0.025845 seconds (25.8ms) for the multi-block reads. While these physical read times were achieved using typical hard drives without the benefit of an operating system cache, a RAID controller cache, or even a SAN cache, for the most part the times are atypically low for “spinning rust” (for example, a 15,000 RPM hard drive requires 4ms for a single rotation, so the average rotational latency alone should be roughly half of the single rotation time, before any seek time is added). Cache memory built into the physical drives is likely the cause of the atypically low average read times. We see that for the most part the multi-block reads are either 7 or 8 blocks, with a single 36 block multi-block read (the autoallocate extent size for the final extent in table T3_1 increased from the previous extent’s 64KB to 1MB, allowing a multi-block read larger than 64KB). The PLAN_HASH_VALUE 39991016 (visible in V$SQL) was also written to the trace file. Continuing with the 10046 trace file:
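The file and block numbers recorded in the WAIT lines (the file#= and block#= entries of the db file sequential/scattered read events) can be mapped back to their owning segments with a query like this sketch against DBA_EXTENTS (the substitution variables are placeholders for values taken from the trace file):

```sql
SELECT
  OWNER,
  SEGMENT_NAME,
  SEGMENT_TYPE,
  EXTENT_ID,
  BLOCK_ID,
  BLOCKS
FROM
  DBA_EXTENTS
WHERE
  FILE_ID = &FILE_NO
  AND &BLOCK_NO BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
```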

From the above, we see that SQL*Plus required roughly 0.036739 seconds to process the first row that it received from the database and to request the next 15 rows. The second fetch call, which also returned a single row, required just 0.000048 seconds and performed 2 consistent gets. If we subtract the tim= value found on the first FETCH call from the tim= value found on the second FETCH call, we see that 0.036857 seconds actually elapsed, so (0.036857 - 0.036739) - 0.000048 = 0.000070 seconds of execution time essentially disappeared; this is part of the unaccounted-for time in the trace file. Continuing with the 10046 trace file:

Remember here that we did not flush the buffer cache between the execution with N2 set to 2 and the execution with N2 set to 10,000. Notice that the first fetch of the first row this time completed in 0.003430 seconds (a decrease from 0.036739 seconds) and did not require any physical reads. The second fetch call, which retrieved 500 rows, performed a multi-block read of 4 blocks, and then a multi-block read of 8 blocks. Why 4 blocks and not 8 (or 128 to permit a 1MB multi-block read) for the first read request? Block 214219 was already in the buffer cache (see the trace file section from the previous output), and since Oracle will not perform a physical read of a block that is already in the buffer cache, the multi-block read of the 64KB extent could only read 3 or 4 blocks (214216 through 214218, or 214220 through 214223) before hitting an extent boundary. The first fetch call performed 161 consistent gets with no physical reads, while the second fetch call performed 10 consistent gets with 12 physical block reads (of table T4_1) – I still wonder why the number of consistent gets is less than the number of physical reads in this one fetch call (likely an interaction between the array fetch size and the number of blocks retrieved by the multi-block reads). Continuing with the 10046 trace file:

From the above, we see that the execution plan, indicated by the STAT lines, was the same as when N2 was set to a value of 2. However, unlike with TKPROF, we are able to see how long the second execution required just from looking at the STAT lines in the raw 10046 trace file.

How does the 10046 trace file for the second half of the script compare with the 10046 trace file for the first half of the script? What has changed? What is the same?

And just to keep you wondering (what has changed and why), here is output from another box that is also running Oracle Database 11.2.0.1 using direct, asynchronous I/O and ASSM autoallocate tablespaces:

The second execution of the statistics gathering was NOT performed to change the execution plan. The NO_INVALIDATE parameter (of DBMS_STATS.GATHER_TABLE_STATS) was not set to FALSE, so on Oracle Database 10.1 and above those statistics would not have immediately affected previously generated execution plans.
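Had immediate invalidation of the existing cursors been the goal, NO_INVALIDATE would have needed to be set explicitly, as in this sketch (on 10.1 and above the default is DBMS_STATS.AUTO_INVALIDATE, which defers invalidation):

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME       => USER,
    TABNAME       => 'T4_1',
    CASCADE       => TRUE,
    NO_INVALIDATE => FALSE);  -- forces dependent cursors to be invalidated immediately
END;
/
```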

I selected the statistics gathering lines as the starting point where the script repeated, so that the timings in both 10046 trace files would be affected roughly the same by any data that might remain in the hard drive caches when Oracle’s buffer cache was flushed. It might be interesting to see whether or not removing those lines will affect the outcome of the test results.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

********************
Greater than and less than signs in code sections are often interpreted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: &lt; for the less than sign and &gt; for the greater than sign.