Monday, March 25, 2013

This page has been permanently moved. Please
CLICK HERE to be redirected.

Thanks, Craig.

Q: Are SQL Statement CPU Consumption Times Reliable?

During an Oracle Database performance analysis, it's usually important to collect the CPU consumption for specific SQL statements for opportunity and comparative analysis.

Interestingly an Oracle Database provides multiple SQL CPU consumption data sources. I'm the type of person who asks, "Are they to be trusted? Is one method more accurate than the other?"

This posting will compare four different sources of SQL CPU consumption: v$sess_time_model, v$sqlstats, v$sql (Statspack report), and the OS Linux command procstat.

You download the analysis pack containing all related scripts and data HERE.

My Real Motivation (and fear)

My initial motivation for this investigation occurred when a student of mine noticed something very strange in this below Statspack report (source: http://filebank.orapub.com/perf_stats/SP_PDXPROD.txt ) :

The snapshot interval is about 150 minutes. (I don't know why 150 minutes was chosen.) Looking at the "Top 5 Timed Events", the top wait event is clearly "latch: cache buffer chains" with all sessions in the snapshot interval waiting nearly 32K seconds. The other wait times are not even close. Now focus on the "SQL ordered by CPU" section, where I show the clear top SQL CPU consumer. The next SQL statement consumes only 6.9% of the CPU compared to the above 86.3%.

Important to know: An Oracle session is either consuming CPU or waiting for something. The wait may be idle, but it is still waiting. When running a SQL statement the elapsed time is all the CPU plus all the wait time. For example, if a statement takes 30 seconds to run, 10 seconds may be associated with CPU consumption and the other 20 seconds associated with wait time. It follows that if the CPU consumption is 10 seconds and the elapsed time is 30 seconds, the wait time must be 20 seconds. ...not rocket science.

If you look closely at the above top CPU SQL statement it's elapsed time is 497518.90 seconds and its CPU consumption is 14456.36. It follows the wait time must be 483062.54 that is around 480K seconds. Now look at the top wait event's time. It's around 32K.

The problem:How is it possible that the clear top wait event time is 32K seconds while a single SQL statement has 480K of wait time? Obviously something is wrong, very very wrong.

There are a number of possible reasons for this discrepancy and I may never the truth. The issue could be incorrect CPU time collection, faulty Statspack data collection or reporting, a bug in Oracle's wait interface, or perhaps something else. Because my entire Oracle performance career is based on quantitative Oracle performance analysis, I can't ignore this. Plus it's a good reason to take the time to compare various data collection sources reporting the same thing, like CPU consumption.

My first step in this quest is drilling into CPU consumption. (Which is something that should be done regardless of this situation.) Then I'll look into the wait time and finally the elapsed time. And of course I may need to dig into Statspack data collection; especially when a system is seriously bottlenecked (on this system there is a massive CPU bottleneck); Data collection can take a while and numbers tend is be tweaked... but not (I suspect) like what we are seeing above (I hope anyways).

Let's press on to check SQL statement CPU consumption!

What To Test?

It's not possible to test every scenario. So I pick the ones that, based on my experience, will likely be questioned. And of course I need to be able to develop a good test, that is an experiment.

My initial thought was the issue was with sub-second SQL that gets executed multiple times a seconds. However, if the above Statspack total execution and total elapsed time values are correct, the SQL shown above has an execution rate of 0.8 execs/second with the average elapsed time around 71 seconds. So the issue here is not sub-second elapsed time SQL and not simultaneous executions.

This makes my experimental setup much simpler. It's very difficult to collect accurate timing information from multiple sources with sub second SQL. For my experiment I had the SQL run for around 85 seconds. The longer the SQL elapsed time, the less impact performance statistics data collection has on the analysis. For example, if my data collection takes 1 second and the SQL elapsed time is 1 second, I've got a big problem. However, if my data collection takes 1 second and the SQL elapsed time is 60 seconds, I can deal with that. Especially if all the performance data sources show the same performance picture. In my experiment to simplify the situation, there was only one session executing the SQL.

The above SQL is a select statement, so decided not to test DML.

Below is the SQL decided to use in this experiment (snippet from quick.sql).

The exec_todo variable gives me control over the elapsed time. In the experiment I set exec_todo to 200. The quick.sql script is called by main experiment driver script, startExpr.sh.

As detailed in the Data Gathering Specifics section below, I varied the load five times.

CPU SQL Data Sources

There are a number of sources to get CPU consumption for a specific SQL statement. Most DBAs have a slightly irrational love and hate relationship for the various sources. Here are the sources I used in this experiment.

v$sess_time_model. The "DB CPU" statistics is server process CPU consumption in microseconds. Since I queried from the "sess" and not the "sys" _time_model view, I should only get the server process CPU consumption for a specific session. If I surround the SQL statement with a "DB CPU" query and subtract the ending value from the initial value, the result should be just the SQL statement CPU consumption.

v$sqlstats. The "cpu_time" statistic is CPU time in microseconds. Gathering from v$sqlstats is tricky because as the SQL statement is running additional rows related to your statement are inserted. So it's not a simple, "ending time minus beginning time" thing. You have to very careful to look at the correct v$sqlstats rows. You can see how I did by examining getsidcpu.sh and startExpr.sh, both of which are contained in the analysis pack (link at top of posting).

Statspack (v$sql). My source is the "CPU Time (s)" column in the SQL listing area. Regardless of how Statspack gathers the data, I need to have confidence in what Statspack stores and reports. In my experiments I pull the CPU consumption from the stats$sql_summary table's "cpu_time" column.

As you and I are both wondering, "Where does Statspack get its SQL CPU time?", in SQL*Plus I ran a simple SQL trace just before I executed the Statspack collection procedure ( exec perfstat.statspack.snap(i_snap_level=>10); ). Looking closely at the raw trace file, it appears Statspack is gathering SQL CPU consumption from v$sql not v$sqlstats. In the trace file, look for the statement, "INSERT INTO STATS$SQL_SUMMARY" and you will see column cpu_time is populated from v$sql. I am running Oracle 11.2 on Linux.

Operating System. In my mind, the OS is the "truth" in CPU consumption. So I needed to find a way to collect CPU consumption for a specific OS process. It took me a while to find a tool that provided good time granularity. I stumbled upon a tool called procstat. It compiled fine on my Linux box. Here is alink to the code.

Now the trick is to gather all this data without significantly impacting what I'm observing and skewing the collection data. Read on...

Data Collection Specifics

The general collection idea is simple: gather initial session CPU time, run the SQL, gather final session CPU time, calculate the CPU time delta (i.e., difference), and store the results. But alas, it was not that simple. For two main reasons:

Querying from v$ views is not read consistent. And, if you remember back to my posts about when v$sysstat and v$sqlstats get updated, the various statistics are updated for different reasons and at different times. My solution was to sleep three seconds after the SQL completes using dbms_lock.sleep(3), thus giving Oracle time to update the statistics while at the same time, consuming virtually no CPU. This appears to have worked wonderfully.

When/where to collect the data. My goal is always to collect performance data just before and just after the thing under observation. But the various collection methods (e.g., OS tool, v$) can make this troublesome. With this objective in mind, I had to move some of the collection away from the "thing under observation" or insert a delay until I was confident the data was being collected correctly. ...and yes, this does take some time... you're welcome!

The main experiment script is "startExpr.sh" which calls the "quick.sql" and "getsidcpu.sh" scripts. Looking at these you'll see exactly how I gathered the data. It's kind of complicated because of the issues mentioned above and because I'm gathering data from four difference sources.

The load on the system is an important factor when gathering performance data. So I altered the load five times; starting with a reasonably active but not out-of-control system and then increased the CPU load until the OS CPU run queue would hit more than twice the number of CPU cores... a massive CPU bottleneck. I did not run this experiment on an IO bottleneck system. Perhaps later if I need to.

At each load level and for each data collection source, 31 samples where collected. (People tend to complain when there are less than 30 samples collected (for reasons they can't explain), so I typically gather at least 31.

Data Analysis

To analyze the data I used the free "R" statistics package. It is free, runs on Windows/Linux/Mac OS X, easy to use and especially so with basic statistical analysis, and you can script your analysis in a text file. Here's the link to the R script I used in this analysis. Usually I use Mathematica (which I love) but I want to start incorporating R into my One Day Performance Seminar, so this is a good opportunity to develop some courseware as well. I have a very short and simple step-by-step R tutorial for Oracle DBAs here. That link may become stale as I add more specific R tutorials.

Here is a link to the experimental results data in a SQL*Plus like format.

I used the pulldata.sh script to format the data into an easily usable format for R. All the data files (end in .dat") are available in the main experiment pack (see top of posting for link).

The entire experiment can be summed up in the below "smoothed" histograms. Below are the results for each of the five different system loads. The load increases left to right, top to bottom. So the lightest load is the upper left image and the most intensive load is the bottom center image.

While difficult to see, all the above histograms contain all 31 samples for each of the four data collection methods. For example, the green line represents the SQL CPU consumption based on v$sess_time_model's "DB CPU" statistic. The lines are plotted in this order: red, blue, black, and green. If you don't see the color you are looking for, it's because another data source essentially had the same results and graphically covers it up. For example, if you look at the bottom center image (the most intensive load) all you can see are the green and blue lines. Where are the read and black lines? If you think about it, the red line is being covered up with either the blue or green line.

Does this "covering up" make a difference to the analysis? No. In fact, it is just I wanted to see! Because all four data sources and their associated lines are essentially the same, resulting in a single line, this means all four data sources show the same CPU consumption for the SQL statement!

To get more statistical: While the lines are overlapping because their results look to be the same, looks can be deceiving. Therefore, I did some significant testing. I suspect the most controversial data is be between v$sess_time_model's "DB CPU" statistic (highly trusted) and Statspack SQL CPU time (not as much trusted). And also between the operating system (highly trusted) and Statspack (not as much trusted).

While all the statistical analysis data is in the "R" output text file (cpu_r.txt), I will only comment on the upper right image (experimentally, this is load 3) where three lines can be visually seen. Since the sample sets are not normally distributed (their sample normality test p-values < 0.05), I used the Wilcox location test. If the Wilcox p-value is greater than 0.05 then any difference in our sample sets is likely caused by unfortunately picking just the wrong samples for our test (i.e., random selection) or perhaps something else (e.g., sun spots).

The p-value for "DB CPU" and Statspack is an amazingly high 0.9885. The p-value for the operating system and Statspack test is also an amazingly high 0.9103. This is the statistical way of saying, there is not a real difference in the sample sets values.

What Does This Mean?

I need to be careful how I write this. My experiment (Linux Oracle Enterprise Edition 11.2) showed that on my CPU bound system with a reasonably long duration SQL statement (~85 seconds), all four data collection methods returned the "same" values. So either, all are incorrect (which we all hope is not the case) or they are all correct (which I'm thankful for).

But to address my fundamental and motivating question of, "How is it possible that the clear top wait event time is 32K seconds while a single SQL statement has 480K of wait time? Obviously something is wrong, very very wrong." It appears the issue is not CPU consumption collection and Statspack reporting.

More practically and to the point, when I look at CPU time in a Statspack report I will trust it... unless I have a really good reason not to.

But that still leaves wait time, elapsed time, and Statspack collection and reporting as possible problems. I guess this means you'll be seeing some related postings in the near future!

Thanks for reading!

Craig.

1 comment:

you should repeat your test scenario on a CPU overloaded machine - like was the case in the referenced http://filebank.orapub.com/perf_stats/SP_PDXPROD.txt... reporting an load average of 93.06 on a 2-CPU-core machine

With such an excessive CPU overload, not-instrumented parts of the code (as well as OS processes context-switching/... overhead), which usually represent <20% of the DB elapsed time, can suddenly blow up to a more significant percentage.

Besides of this, there are of course plenty of bugs in the counters/dynamic performance views -- some of which, have never been fixed (I've seen one recently, which was "fixed" by Oracle Corp by simply not showing the obviously wrong value in an 11.2 AWR report at all) -- thus if one sees something obviously wrong, it is worth crosschecking with other sources of information, outside of the Oracle DB SW (OS/network/... stats).

Finally (while that would not seem to apply in this case, as there are lots of executions of the top SQL) I have seen recently as well a significant {off by a factor >=10} discrepancy between different parts of an 11.2 AWR report (using information from different dynamic performance views)- I've created an SR to Oracle Corp, and the answer was, that AWR report does not guarantee to contain correct information for queries, which started before the begin AWR snapshot, and ended within the AWR report period – since they report, on some of the AWR sub-reports, the whole resource consumption, within the AWR report period - while they may have been executing 10x+ longer than the AWR report period, already before the start AWR snapshot -- and thus their impact, within the impacted AWR sub-reports, is blown-up out of proportions.