Time Matters - DB Time

Doug's Oracle Blog

[In retrospect, the title of that first blog post might have suited the subject, but doesn't translate too well for subsequent related blog posts. That was a lack of planning or foresight on my part. These blog posts are tumbling out of my head in a fairly incoherent way. Maybe when they're finished, I could revisit them and write them up in a single, sensible article! In the meantime, I've retitled the previous post.]

The previous post might have given the impression that you shouldn't pay too much attention to the values in the "Time (S)" column of the Top 5 Timed Events section of a Statspack or AWR report*. That's not true, particularly when comparing system workload between two different periods. (I was just building up to this slowly.) In fact, although the values in that column are not "DB Time", they are nearly all components of it and, just as an increase in the values in that column indicates increased workload or decreased performance between the two periods, so does an increase in an Instance's DB Time. (Edited later - see JB's comment below. Background Wait Events are *not* components of DB Time, just Foreground Wait Events. I knew that, but wasn't careful enough with this sentence.)

As Chen pointed out in her comment on the last post, there's a direct relationship between the number of sessions that were running during the reporting period and the total time values. No matter how busy the system, the total time available cannot exceed the maximum number of concurrent sessions multiplied by the wall clock duration for which they were running. That's a hard limit, but there are other relationships too. If you think about it, as the number of active concurrent sessions increased from one to four, so did the database instance's workload, as more sessions were either working or waiting for something - and if the database is working on something, that implies the application is waiting for it to complete! In all likelihood, on a small laptop with limited resources, the actual time spent waiting on individual events would increase as well, as contention for resources increases. Both an increase in the duration of events and an increase in the number of different sessions running or waiting on events will increase DB Time.

Here's a definition of DB Time that you'll see in various presentations from the Oracle guys who worked on this. (e.g. John Beresniewicz's excellent presentation "Average active sessions: the magic metric?" which is available on Kyle Hailey's website. I've never seen JB give the presentation, but I love the slides.)

Database time is total time spent by user processes either actively working or actively waiting in a database call

The same presentation describes the components of DB Time as

Time spent in the database by foreground sessions
Includes CPU time, IO time and wait time
Excludes idle wait time

In the context of a Statspack or AWR report, the top 5 timed events section is showing you the top 5 contributors to DB Time. (In fact, what I find most interesting about the latest versions of the Statspack report is the increased focus on time. For example, the first section of top SQL statements by resource consumption is ordered by DB CPU, followed by the SQL ordered by Elapsed Time, before we get anywhere near Logical or Physical Reads. Regardless of whether you're licensed for the Diagnostics Pack and have access to ASH, AWR and ADDM, you can still take advantage of some of the instrumentation improvements - Event Histograms, for example - and it's clear that Oracle is increasing its focus on time.)

Note that, although I'm choosing to write about aggregated system-wide values as a performance indicator because it's one convenient use of DB Time (particularly when comparing two different periods that might be separated by months, or when you have hundreds of systems to support or, most importantly, if you're writing a tool like ADDM), I don't think there is anything inherently system-wide about DB Time. If you insist that the only valid performance analysis is that carried out at the session level then, make no mistake, DB Time is recorded for each individual session. It can be used as the input to response time tuning and is really just the Oracle Server's recording of the R in R=S+W, i.e. it's equally useful for Response Time tuning at the session level. In fact, I wish it had been called DB Response Time, but I understand that even mentioning 'Response Time' might have led to the argument that the Database Server is only one component of the user's end-to-end response experience, and that Response Time doesn't make sense as a term for data that's going to be aggregated across multiple sessions in some cases. (i.e. Can an entire instance have an aggregated Response Time?)

DB Time is the most important of the various Time Model Statistics, which break down the Service component of R = S + W into more detail. Here are the Time Model statistics from the Statspack report for the single-user test. (Don't forget that, as always, this is just reporting underlying statistics. In this case, these statistics are also exposed via v$sys_time_model and v$sess_time_model)
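For reference, you can list the same Time Model statistics yourself with a couple of simple queries against the views just mentioned. This is only a sketch - the values are cumulative since instance (or session) start and are reported in microseconds:

```sql
-- Instance-wide Time Model statistics, largest first
SELECT stat_name, value
FROM   v$sys_time_model
ORDER  BY value DESC;

-- The same statistics for the current session
SELECT stat_name, value
FROM   v$sess_time_model
WHERE  sid = TO_NUMBER(SYS_CONTEXT('USERENV', 'SID'))
ORDER  BY value DESC;
```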

(For someone who asked me this question recently, DB CPU is just the value you would see in the CPU row of the existing top 5 timed events section although I understand that CPU measurement has been improved significantly in recent versions.)

In this case, there's a relationship between the number of people trying to do something and the total amount of DB Time consumed for a given wall clock duration. If we divide DB Time by the Wall Clock time, we're left with the average number of active sessions during the period. If I apply this first to the single-user test, using the values from the Statspack report :-
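If you wanted to script that calculation rather than read the values off a report, a sketch might look like this - sample 'DB time' at the start and end of the period of interest and divide the delta by the wall clock time. The bind variables (:db_time_begin, :db_time_end, :elapsed_secs) are placeholders for illustration:

```sql
-- The sample to take at the start and end of the period
-- (cumulative since instance start, in microseconds)
SELECT value FROM v$sys_time_model WHERE stat_name = 'DB time';

-- Average Active Sessions = delta('DB time') / wall clock elapsed time
SELECT (:db_time_end - :db_time_begin) / 1e6 / :elapsed_secs
         AS avg_active_sessions
FROM   dual;
```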

Why weren't the values exactly 1 and 4? Well, for starters, the Statspack reporting periods were slightly longer than the test run period as I executed snapshots manually in a sqlplus session, so there was a period of time in both reports when the server was dead quiet. There is also the application overhead of Swingbench running the benchmarks and submitting the calls to Oracle. Oracle can only sensibly record the time when it's actually doing work on behalf of the application, not when the application is doing its own thing, processing the results of the last database request. None of the database sessions in this test, each performing many small transactions, was active for 100% of the time.

But that's only one part of the relationship. In this case, the instance's workload has increased because more users are working at the same time, but there's more to performance problems than that. How about a different example? Imagine a single user running a single query that takes 2 minutes to complete. The session has been connected for 20 minutes, most of which has been spent idle, waiting for the user to submit the query. If we look at the session-level information (using v$sess_time_model), we'd see something like this.

DB Time = 120 seconds

i.e. DB Time shows us 'the Oracle bit' that we might be able to tune. The goal of the DB Time Performance Method that Graham Wood presented at last year's UKOUG conference (amongst others) is to reduce the amount of DB Time taken to deliver the same results. So, how can we reduce DB Time here? By making the query run more quickly, whether it's through tuning it to do less work, or increasing the efficiency of that work by reducing bottlenecks. Regardless of *how* I improve the performance of the query, let's say I happen to make the query run in 50 seconds.

DB Time = 50 seconds

The end user's experience has improved.

There's a lot more I could say about DB Time. Like all of the best performance concepts or methods (e.g. YAPP, Method-R), it can seem so obvious as to not be worth saying, but contains an enormous amount of common sense and technical rigour. I suppose one important aspect of DB Time that I would highlight is that it's a common currency for ADDM. If you were to write an automatic performance diagnostic tool, what would be your goal? To reduce the time that any particular action takes, whether that's by eliminating a bottleneck at the server level, or in the instance configuration, or in a particular part of the application code. By combining ASH and AWR data using DB Time as the key measurement, ADDM can focus on the actions that will deliver the most significant response time reductions, whether that means adding CPU or disk resource, tuning individual SQL statements or eliminating locking issues. Because DB Time can be aggregated at many levels (e.g. Session, Service, Instance), it can be used in a number of different ways, depending on what data is available and what the goal of the optimisation exercise is.

DB Time - it's the future

In the meantime, here are the slides from a couple of other presentations that cover DB Time in much more depth than I have here. Until Graham Wood starts that blog of his (hint, hint) these are the next best thing.


Nice post Doug. One small point about the Statspack top-5 and DB Time: DB Time is specifically defined to be foreground time spent in the database, while the top-5 report includes background wait time (e.g. log file parallel write.) This is no small distinction, as this definition contributes to what I like to call the 'fungibility of DB Time'...

fun·gi·ble [fuhn-juh-buhl] –adjective Law.
(esp. of goods) being of such nature or kind as to be freely exchangeable or replaceable, in whole or in part, for another of like nature or kind.
[Origin: 1755–65; < ML fungibilis, equiv. to L fung(ī) to perform the office of + -ibilis -ible]

I am still a little fuzzy as to what DB CPU is measuring, because when I sum up the values from the single-session run it comes to 362 seconds, which is much greater than the 321 seconds reported for DB Time.

It took me a moment to see what you were calculating there, but now I see what you mean, the Time Model Stats could be a bit confusing when listed this way, so thanks for asking.

The problem is that DB CPU is listed along with various "elapsed" times. If you ignore the CPU and just add up the elapsed times, you'll get 304 seconds. Lower than DB Time and I'm not 100% sure why (rounding errors, uninstrumented areas, could be all sorts of reasons). The important thing to recognise is that DB CPU is the recording of CPU which is *included* in those various elapsed times.

Hi there,
I want to know how to increase the connection timeout in Toad 9. When I'm connecting through Toad, it gets timed out after around 30 seconds if it's idle. Can you please suggest any configuration changes to be made in order to get rid of this?

Looking into what the various Time Model statistics mean and how they relate to each other, I soon found this article from Chris Foot. It looks like it's worth a read and far better than anything I've said here, so I'll add it to the blog post as well, for those who don't read comments.

The Time Model stats are notoriously difficult to understand because there are containment relationships that are not captured by the flat view presented. The reference doc shows these, and I believe it is basically a transcription of a time model tree first teased out by Kyle Hailey as 10g was going out.

For purposes of this thread it is sufficient to know that 'DB time' fully contains 'DB CPU' and that both are measured values (which means, in particular, that 'DB CPU' is not inflated by runq time).

'sql execute elapsed time' is fully contained in 'DB time' as well, but there is an unknown intersection between this and 'DB CPU'.
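[Those containment relationships can be seen side by side with a quick query against the instance-wide view - a sketch, with the cumulative microsecond values rounded to seconds:]

```sql
-- 'DB CPU' and 'sql execute elapsed time' are both contained within
-- 'DB time', but overlap with each other by an unknown amount.
SELECT stat_name, ROUND(value / 1e6) AS seconds
FROM   v$sys_time_model
WHERE  stat_name IN ('DB time', 'DB CPU', 'sql execute elapsed time');
```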

I know this is an old post, but I was just looking through some of your stuff and ran across this one. I agree with you that dbtime is a very useful metric. I use a little script that pulls the top snapshots from the awr tables based on dbtime. It has been pretty useful for identifying when the database was busiest. (I stole the script from statspack and just modified it to use AWR tables instead of statspack tables)
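[A sketch of that kind of query, for anyone curious - it assumes Diagnostics Pack licensing and a single instance, and it ignores instance restarts, across which the cumulative values reset. It ranks AWR snapshot intervals by the DB Time accumulated in each:]

```sql
-- Rank AWR snapshot intervals by the 'DB time' accumulated in each
-- (delta between consecutive snapshots, converted to seconds)
SELECT snap_id,
       ROUND((value - lag_value) / 1e6) AS db_time_secs
FROM   (SELECT snap_id, value,
               LAG(value) OVER (ORDER BY snap_id) AS lag_value
        FROM   dba_hist_sys_time_model
        WHERE  stat_name = 'DB time')
WHERE  lag_value IS NOT NULL
ORDER  BY db_time_secs DESC;
```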

Disclaimer

For the avoidance of any doubt, all views expressed here are my own and not those of past or current employers, clients, friends, Oracle Corporation, my Mum or, indeed, Flatcat. If you want to sue someone, I suggest you pick on Tigger, but I hope you have a good lawyer. Frankly, I doubt any of the former agree with my views or would want to be associated with them in any way.