Physical Reads are Very Fast, Why is this SQL Statement Slow

March 15, 2010

How would you troubleshoot the following situation – what do you know, or believe to be true with at least 95% confidence, about this situation? You encounter a SQL statement that is taking much longer to execute than expected. You enable an extended SQL trace at level 8 and pass the resulting trace file through TKPROF. In the TKPROF output you find the SQL statement that is of interest:

While reviewing the above you notice that there were 652,774 consistent gets and 283,093 physical block reads. The wait events associated with the SQL statement indicate that there were 283,093 waits on the db file sequential read wait event, which indicates that all of the physical block reads were performed one block at a time (the Times Waited statistic for this event matches the disk statistic in the Fetch summary line). If a total of 8.51 seconds (from the Total Waited statistic for the db file sequential read wait event) were needed to complete those 283,093 single block reads, then on average each single block read required approximately 0.000030 seconds – roughly 133 times faster than one revolution of a fast 15,000 RPM hard drive.
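As a quick sanity check, the arithmetic above can be verified directly (the figures are taken from the TKPROF output discussed in this article):

```python
# Figures from the TKPROF output discussed above.
total_waited = 8.51          # seconds: Total Waited for db file sequential read
times_waited = 283_093       # Times Waited: one wait per single-block read

avg_read_time = total_waited / times_waited
print(f"average single block read: {avg_read_time:.6f} seconds")  # ~0.000030

# One revolution of a 15,000 RPM drive takes 60 s / 15,000 rev = 0.004 s.
revolution_time = 60 / 15_000
print(f"times faster than one revolution: {revolution_time / avg_read_time:.0f}")  # ~133
```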

If we experiment with the average single block read time, there were 4,985 single block reads of object ID 52001 (the IJL_LIN_FK_IX index), which suggests that the total time for those physical reads to complete was 4,985 * 0.000030 seconds = 0.15 seconds. There were 5,176 logical reads for that operation, which is just slightly more than the number of physical blocks that were read from disk for that operation. The time= statistic for that line in the plan indicates that the operation required 9.90 seconds. Keeping in mind the other wait events, what happened during the 9.75 seconds of that operation that did not involve the physical block reads? Notice that the TKPROF output shows that the query statistics summary indicated that the query consumed 105.05 seconds of CPU time with an elapsed query time of 102.73 seconds.
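The experiment with the index range scan line works out the same way (values taken from the plan statistics quoted above):

```python
avg_read_time = 0.000030     # seconds per single-block read, derived above
index_reads = 4_985          # physical reads of the IJL_LIN_FK_IX index (object ID 52001)
line_time = 9.90             # time= statistic for that plan line, in seconds

io_time = index_reads * avg_read_time
print(f"estimated physical read time: {io_time:.2f} seconds")        # ~0.15
print(f"time not attributable to reads: {line_time - io_time:.2f}")  # ~9.75
```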


* What was the server doing in the 9.75 seconds of the INDEX RANGE SCAN operation on the last line of the execution plan that probably could not be attributed to physical I/O?

The CPU time was mainly spent performing the logical I/O, and logical I/O is a time-consuming operation.

* How is it possible that 105.05 seconds of CPU time were required for a SQL statement that had a total execution time of 102.69 seconds? What was the CPU doing for 105.05 seconds?

Can anyone answer this? Maybe some parallel operation?

* Are the number of elements in the IN list contributing to the performance problem?

I think the number of elements in the IN list contributes little to the performance problem.

* Are there any hints that you might try to force a different execution plan, and if so, is there a reason to change the execution plan?

We need more detail about the data distribution to do this.

* Why was there a miss in the library cache during both the parse and execute calls?

Was the statement only executed once? The miss on the execute call may be due to the cached statement being invalidated by an optimizer environment change.

* What caused the incorrect cardinality estimates, and how would you correct the cardinality estimates?

I have no answer to this.

* Does the ANSI style join syntax have an impact on the execution time for the query?
* Would you recommend any changes to initialization parameters, and why?
* What needs to be done to allow the query to complete in 5 seconds?
* How would the server’s behavior change if significant competition for the server’s RAM caused excessive use of the operating system’s page file?

CPU consumption would rise a lot, and physical I/O read time would also increase, because there would no longer be so many blocks in the file system cache.

* Are the server’s CPUs a constraining resource for this SQL statement – if so, why, if not, why not?

I think the problem is due more to the high number of consistent gets than to a CPU limitation.

I was waiting to see a blog post from you on this. :)
I believe jametong has answered most of your questions above, and you yourself have answered many in your response on the forum. Some remaining (??) are:

How is it possible that 105.05 seconds of CPU time were required for a SQL statement that had a total execution time of 102.69 seconds? What was the CPU doing for 105.05 seconds?
The difference is simply due to the different ways Oracle derives CPU time and elapsed time. The elapsed time “misses” out on counting, especially when there is a large number of small waits (283,093 waits with a max wait of 0.02). I learned this from Tom.

Why was there a miss in the library cache during both the parse and execute calls?
Because the SQL was executed from SQL*Plus.

What caused the incorrect cardinality estimates, and how would you correct the cardinality estimates?
I would verify (and regather) statistics on the I_JOURNAL table and the IJL_LIN_FK_IX index, which should take care of the difference in cardinality estimates.

What needs to be done to allow the query to complete in 5 seconds?
Query less data, or delete data from the tables?? :) On a serious note, the 5 seconds of expected execution time may not be realistic given the database version, table structures, available CPU/memory, etc.

Are the server’s CPUs a constraining resource for this SQL statement – if so, why, if not, why not?
Apparently so, but I won’t be surprised if it turns out to be an optimizer limitation (or “bug”) if the database version is 10.1.0.x.
For me, the main suspect is the I_JOURNAL table (and probably its index). If the I_JOURNAL table is in a tablespace with a non-default block size (16K or 32K), would that possibly explain the higher CPU time? I remember Richard Foote describing the increased CPU usage when an index was rebuilt in a 16K block tablespace as compared to an 8K block tablespace. I also wondered whether having LOB datatype columns in the I_JOURNAL table might contribute to the increased CPU time, but as the query is simply doing a COUNT(*), I am not really sure that it matters.
However, it certainly looks like the I_JOURNAL table (and its index) access is the most influential factor (the outer hash join also does not appear to be spilling to disk, so there are probably no issues with the PGA memory settings). So, in order to isolate the issue, I would also trace the following queries:
1) The original query without the I_JOURNAL table.
2) A simple COUNT(*) on the I_JOURNAL table alone.

p.s. BTW, if you have noticed in the forum, there is a significant difference between the trace details (especially for the I_JOURNAL table) of the original query and the query with the FULL hint. So it appears that the OP has also changed something else, in addition to adding the FULL hint. The key differences follow:

             Original   With FULL Hint         Diff
Rows          888,117        5,010,465    4,122,348
CR              5,389          262,646      257,257
PR              4,986          198,250      193,264
Elapsed    23,801,052       35,075,329   11,274,277

Great discussion so far – I do not wish to bring an end to the constructive comments. Deleting data is an interesting approach to making the query execute faster, but that might throw off the query results.

For points of reference, this screen shot (10.2.0.x) shows 7,407,092 consistent gets performed in 87.51 seconds, which implies a rate of about 84,643 consistent gets per second on a 4 year old Intel Xeon 3.66GHz server:

Page 308 of the Expert Oracle Practices book shows a TKPROF output with the STATISTICS_LEVEL parameter set to ALL on Oracle 10.2.0.4 where the server performed 128,958,761 consistent gets in 2857.12 seconds, which implies a rate of 45,136 consistent gets per second on a 2.93 GHz Intel Nehalem architecture CPU. The number of consistent gets per second with the Nehalem CPU jumps significantly on Oracle 10.2.0.4 when the STATISTICS_LEVEL is set to TYPICAL.
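The per-second rates in the two reference points above are straightforward divisions, which can be checked as follows:

```python
# Reference point 1: the 10.2.0.x screen shot on the 3.66 GHz Xeon server.
print(round(7_407_092 / 87.51))      # ~84,643 consistent gets per second

# Reference point 2: Expert Oracle Practices, page 308 (STATISTICS_LEVEL = ALL,
# Oracle 10.2.0.4 on a 2.93 GHz Nehalem CPU).
print(round(128_958_761 / 2857.12))  # ~45,136 consistent gets per second
```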

Edit: I want to repeat that the above are just points of reference for how many logical I/Os might be performed per second.

Are those figures (logical I/Os per second) based on reading one block at a time (typically indexed access) or multiple blocks at a time (typically full scan) ?
Another possible reason for the increased CPU usage came to mind: is table and/or index compression used on the I_JOURNAL table (or its index)? Although I am not sure how much additional CPU the compression would cause.

The 10.2.0.x reference refers to consistent gets that were mostly caused by full table scans.

The reference on page 308 of the Expert Oracle Practices book is from a TKPROF summary where the execution plan consisted mainly of nested loop joins with index lookups and one Cartesian join that was caused by statistics that were a couple of hours out of date. The execution time summary that we included on page 313 of the book for the same SQL statement shows 66 minutes and 44.31 seconds for the query execution when the STATISTICS_LEVEL parameter was set to ALL on Oracle 10.2.0.4 and 6 minutes and 56.73 seconds for the query execution with STATISTICS_LEVEL set to TYPICAL on Oracle 10.2.0.4. If we recalculate the number of consistent gets using those numbers, at the setting of ALL the server processed an average of 32,205 consistent gets per second. With the setting at TYPICAL the server performed an average of 309,454 consistent gets per second.

But it also makes me wonder if more than 1 session, maybe 500, were all trying to execute this or a similar SQL statement at the same time the OP was attempting to trace the execution – that could effectively reduce the upper limit for the number of consistent gets for the traced session by a factor of 500 or more (while RAM is fast, there is an upper limit to RAM speed).

I have been tied up a bit lately with non-Oracle things – I am giving a presentation about methods of custom report development to an ERP user’s group tomorrow. I will try to provide a couple more comments about the thread in a couple of days.

It might be interesting to find out if the OP is running one of those special Sun servers with the T2000 CPUs (I think that is the number), where each thread receives an effective 300MHz of CPU capacity – that might explain the low number of consistent gets per second.

There are still some questions which were raised in this blog article that the OP would need to answer to avoid this article from slipping into just a guessing game – a lot of logical reasoning has gone into all of the comments, which I am very happy to see.

It could be a problem with:
* Statistics level set to ALL
* Exceeding the RAM memory in the server, causing a lot of swapping to disk. I believe that someone correctly mentioned in the comments of this blog article that the OP would likely lose the benefit of the OS file cache if the server were really starved for RAM. That is probably true – unless a huge SAN cache is actually allowing the 0.000030 average single block read time, or the data resides on SSD hardware (likely also with a huge cache), or maybe the OP is doing something scary, like having the data files located in a RAM drive (and in the process consuming a lot of physical RAM).
* Maybe 500 users are all simultaneously trying to execute this SQL statement, or another inefficient statement.
* The number of consistent gets is high enough that I think an execution time of 5 seconds is very unlikely even under ideal conditions. 7 seconds might be possible with a fast CPU and fast access to RAM. Of course we do not know if the database is using 2KB blocks, 32KB blocks, or somewhere in between.
* I think that some of the people who responded in the thread are on the right track – the SQL statement needs to be changed because it currently will not scale to the performance needed by the OP. I see that Iggy Fernandez suggested a materialized view – that might be a step in the right direction, or maybe a more complete change in the data model is needed.
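To put a rough number on why the 5-second target in the points above looks unlikely (using the consistent get count from the trace and the logical I/O rates quoted earlier as points of reference):

```python
consistent_gets = 652_774    # from the TKPROF output for the traced query
target_seconds = 5

required_rate = consistent_gets / target_seconds
print(f"required: {required_rate:,.0f} consistent gets per second")       # ~130,555

# At the ~84,643 gets/second observed on the 10.2.0.x reference server:
print(f"best case at that rate: {consistent_gets / 84_643:.1f} seconds")  # ~7.7
```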


Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;
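A hypothetical helper (not part of any Oracle tooling; the function name is my own) that strips such prefixes from a pasted SQL*Plus session might look like this:

```python
import re

def strip_sqlplus_prefixes(text: str) -> str:
    """Remove 'SQL> ' prompts and leading SQL*Plus line numbers such as '  2  '.

    Note: a data line that happens to begin with a number would also be
    stripped, so review the output before posting.
    """
    cleaned = []
    for line in text.splitlines():
        # Match either an "SQL> " prompt or a leading line number plus spacing.
        cleaned.append(re.sub(r'^(SQL> |\s{0,3}\d+\s+)', '', line))
    return "\n".join(cleaned)

sample = "SQL> SELECT\n  2  SYSDATE TODAY\n  3  FROM\n  4  DUAL;"
print(strip_sqlplus_prefixes(sample))
```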

********************
Greater than and less than signs in code sections are often interpreted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: &amp;lt; for the less than sign and &amp;gt; for the greater than sign.