The CBO CPU Costing Model and Indexes – Another Introduction September 16, 2009

I’ve previously discussed some basic concepts and formulas regarding how the CBO derives index related costings via the I/O costing model. Time to look at system statistics and the CPU costing model with specific regard to indexes.

The first point I would make is that the CPU costing model has some significant improvements over the older I/O costing method and I would strongly recommend adopting the CPU costing model where possible. I’ll explain some of these improvements and advantages over the coming posts.

The I/O costing model basically looks at the cost of a specific execution plan in terms of the estimated number of physical I/Os. The less I/Os, the less costly and more efficient the execution plan and the faster the expected response times. There are however a number of short falls with this basic I/O costing strategy in that is doesn’t automatically differentiate between the costs associated with different types of I/Os (eg. between single block and multiblock reads), it doesn’t automatically determine a typical or average size of a multiblock I/O and it doesn’t cost and take into consideration the time and overheads associated with likely CPU resources.

The CPU costing model attempts to take into consideration these previous limitations. It automatically takes into consideration discrepancies between the time to complete an average single block I/O versus a multiblock I/O, automatically determines the average size of a multiblock I/Os so it can more accurately determine the likely number of multiblock I/Os in a FTS and automatically determines the expected CPU time for a specific task.

To use the CBO CPU costing model, one needs to collect system statistics so that CBO has this additional information, based on the actual system hardware characteristics (Note: since 10g, the hidden parameter _optimizer_cost_model defaults to ‘cpu’ and so is used by default). You do this with the dbms_stats.gather_system_stats procedure. You can collect “Noworkload” statistics in which Oracle basically randomly reads the database data files to determine base statistics such as the average I/O seek time, the average I/O transfer speed and the CPU speed. However, I would rather recommend the collection of “Workload” stats which are based on the actual workload characteristics of your hardware, based on the real load on your system during the time in which system statistics are gathered (in which case Noworkload statistics are simply ignored).

You can gather Workload system statistics by either running:

dbms_stats.gather_system_stats(‘START’) to start the system stats collection process followed by dbms_stats.gather_system_stats(‘STOP’) to stop the collection process over a typical, workload period, or

SREADTIM – time in milliseconds for a single block I/OMREADTIM- time in milliseconds for a multiblock I/OCPUSPEED - million of CPU cycles per secondMBRC – average number of blocks actually read during multiblock read operations

In the above figures, just note therefore that a multiblock read on average takes approximately double the time of that of a single block read and that on average, 10 blocks are read during a multiblock read operation. This provides the CBO with vital information regarding how to now cost and compare potential execution plans.

The CBO CPU costing model basically looks at the total time required to complete an execution plan by summing:

total time to complete all single block I/O activity +
total time to complete all multiblock I/O activity +
total time to complete all the CPU activity

This can basically be calculated by:

sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second

In theory, this should provide the total response time to service an execution plan. However, to keep the actual “cost” figures calculated by the CBO consistent with the I/O costing model, the CBO divides this total time by the average time for a single block I/O, such that the full formula becomes:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

The final “cost” figure, even with the CPU costing model, is therefore still expressed in units of single block I/Os. This is an important point …

So how does the CBO determine the value of the various figures within this formula ? Well as we’ll see, the CBO get’s the required information both from the system statistics and from the costing formulas previously discussed with the I/O costing model.

However, for index related access paths, there’s some good news regarding being able to simplify matters somewhat.

The first bit of good news is that from the perspective of an index access path, there are no multiblock I/Os (except for a Fast Full Index Scan) and so the CPU costing formula can be simplified for indexes to remove the multiblock read component and be just:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

Secondly, if the CPU component is relatively trivial, it may not be sufficient enough to count towards the final cost. As smaller index scans are likely to consume little CPU, it means the CPU component can also generally be ignored. This reduces the formula for such index scans to just:

(sum of all the single block I/Os x average wait time for a single block I/O)
/
average wait time for a single block I/O

However, the average wait time for a single block I/O now becomes redundant in this simplified equation, reducing the cost to now be just:

sum of all the single block I/Os

Well the next bit of good news for those that have followed my previous blog entries with regard to the CBO and Indexes is that the previous formulas regarding the I/O costing model are still applicable when determining the sum of all expected I/Os. The sum of all the single block I/Os associated with an index scan is still basically:

In other words, for smaller index scan execution plans, the cost calculated by CBO using the CPU costing model is the same as with the I/O costing model. So no, I wasn’t wasting everyone’s time discussing the various formulas using the older I/O costing model :)

If we run the same demo as I ran previously in my initial post regarding the CBO and Indexes where the total cost of the index access plan was 18, but this time using the system statistics listed above:

We notice that the cost remains exactly the same at 9 for the index range scan component and exactly the same at 18 for the total execution plan when comparing the cost of using the IO costing model vs. the CPU costing model. Introducing system statistics hasn’t changed things for this particular index related execution plan.

And this is a very common observation. As indexes use single block I/Os, as the CBO cost remains as a unit of single block I/Os and as CPU consumption for an index scan is often trivial, the resultant costs for index access paths often remain unchanged with the CPU costing model.

Previously, we looked at how changing parameters such as the optimizer_index_cost_adj impacts the costings of index related execution plans to create a level playing field between index and FTS execution plans.

The key point to make with regard to system statistics and the CPU costing model is that in general, the system statistics and the associated formula will automatically ensure a level playing field. However, unlike the optimizer parameters, it will do so by typically adjusting the associated costs of the FTS (rather than the index accesses) as the true costs and wait times associated with multiblock FTS are calculated, but are divided by and expressed in units of single block reads.

So rather than decreasing the associated costs of an index access path, system statistics and the CPU costing model will typically create a level playing by automatically increasing the associated costs of a FTS as appropriate.

Share this:

Like this:

Related

What exactly is the parameter CPUSPEED in AUX_STATS$? Modern cpu’s dynamically scale their frequency to adjust to the workload. Thus, if there is a lot of “quiet time” during the stats-gathering period, the reported CPUSPEED value would be low (I’m implicitly assuming that CPUSPEED is atime averaged value of the actual CPUSPEED). Won’t such a low value lead to the CBO overestimating costs (in cases where cpu time is important)?

Any idea how multiple cores and/sockets impact the measured CPUSPEED?

Just wanna say a very big thanks for such educational blogs – your blog is an invaluable resource to us folks new to Oracle.

Good question for which I don’t have a precise answer regarding how Oracle measures the actual CPU speed, except to say that it takes into consideration the number of CPUs, sockets etc. in it’s anlaysis.

The last time I tried to check (which was 9i) I got the impression that Oracle simple ran a “benchmark” subroutine to see how many times it could run per second. So the CPUSPEED is “millions of Oracle Benchmark Operations per second from one CPU” – changing the number of CPUs in the box shouldn’t make any difference, but having CPUs that operate at different speeds could produce an interesting result.