Tuesday, January 11, 2011


Collecting performance data can sometimes be very simple, but at other times it can be a nightmare. Not only is there the classic possibility of the collection impacting the system you're monitoring, but multiple sessions can be involved, and the timing differences can come down to microseconds. And the closer you get to the database kernel, the trickier it gets. This past week I came across what I initially thought would be a very simple collection question: "When is an entry made into v$sqlstats?" It turns out it's more complicated than I anticipated, and I thought you might be interested in the answers plus the scripts I used to perform the experiments.

What's So Complicated?

Ask yourself these questions regarding when an entry is made into v$sqlstats.

When a statement has started running?
When a statement completes?
As a statement is running?
Are all statistics updated at the same time?
Is it the same for procedures, functions, anonymous PL/SQL blocks, standard SQL entered in SQL*Plus?

What initially seemed to be a very simple question turned out to be much more complicated. The only way to answer these questions was to set up an experiment... so that's what I did.

The Experimental Design

The general data collection strategy I used requires two users: a monitoring user and an application user. There is actually a third, the data repository user, for which I used system.

I created a single text file that contains a short introduction, all the code, explanations of how to run the experiment yourself, and the results I will detail below. This single file can be downloaded by clicking here.

The monitoring user continuously checks the v$sqlstats view for a change in the specific SQL we're looking for. By default the change check is in a very tight loop with no delay, but through the variable troll_delay_in you can insert a delay. Actually, I query from v$sqlstats's underlying fixed view x$kkssqlstat because it has lower overhead, and since I was already using the x$ view for some other experiments, it ended up this way. In both my experiments and in this blog, when I reference v$sqlstats I am also referring to its underlying x$ fixed view. The experimental results are inserted into the op_results_raw table, which I created within the system schema. (I feel uncomfortable creating objects in the sys schema.) An important aspect of the experimental design is that all result entries are timestamped. It is the timestamped entries that enable us to see what occurred and in what order. As you'll see, this strategy works very nicely.
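The actual collection loop is in the downloadable file; as a minimal sketch of the idea only, the loop looks something like this. The op_results_raw column names, the use of v$sqlstats.last_active_time as the change indicator (the real code polls x$kkssqlstat and compares a change column), and the bind variables are all assumptions.

```sql
-- Sketch only, not the actual experimental code.
-- Assumes op_results_raw(result_time, result_text) and SQL*Plus bind
-- variables :the_sql_id and :the_phv set beforehand.
declare
  troll_delay_in   number := 0;   -- optional delay between change checks
  prev_last_active date;
  curr_last_active date;
begin
  loop  -- runs until the session is stopped; the real code runs for a set duration
    begin
      select last_active_time
        into curr_last_active
        from v$sqlstats
       where sql_id = :the_sql_id
         and plan_hash_value = :the_phv;
    exception
      when no_data_found then curr_last_active := null;  -- not yet in v$sqlstats
    end;

    if curr_last_active is not null
       and (prev_last_active is null or curr_last_active <> prev_last_active)
    then
      -- timestamp and record the detected change
      insert into system.op_results_raw (result_time, result_text)
      values (systimestamp, 'v$sqlstats change detected');
      commit;
      prev_last_active := curr_last_active;
    end if;

    if troll_delay_in > 0 then
      dbms_lock.sleep(troll_delay_in);  -- requires execute on dbms_lock
    end if;
  end loop;
end;
/
```

The key design point is that every detected change is timestamped at insert time, which is what lets the results table be replayed in event order later.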

The application user runs the SQL that we closely monitor. I have four tests: a basic select, a procedure with the same select, a procedure that runs the select twice, and a procedure that calls another procedure. As with the monitoring user, the application user inserts its results into the op_results_raw table owned by the user system.

The SYSTEM user owns the results table, op_results_raw. It probably would have been cleaner to have the application user be the owner, but for now it is what it is.

One of the keys to making this work is clearly identifying the application SQL. To do this we need its sql_id and plan_hash_value values. While I'm not displaying all the actual experimental code (I provided the download link above), I think this is important and versatile enough to post. So here it is:
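A query along these lines will do it; this is a sketch of the idea, not necessarily the original script. It leans on v$session.prev_sql_id, which holds the last statement the current session executed.

```sql
-- Run the application SQL first, then immediately run this in the same
-- session. Sketch only: v$session.prev_sql_id is the previously executed
-- statement, so it will be the application SQL we just ran.
select s.sql_id, s.plan_hash_value
  from v$sqlstats s
 where s.sql_id = (select prev_sql_id
                     from v$session
                    where sid = sys_context('userenv', 'sid'));
```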

Notice that immediately after I run the above application SQL, I query v$sqlstats and v$session looking for the "just run" sql_id and plan_hash_value. This works wonderfully! Here is the output on my test environment.

Make sure you have serveroutput set to off when you run this or it will not do what we want.

My test server was running a single CPU with four cores on Linux with Oracle Release 11.2.0.1. It is a small database but large enough for this experiment.

As mentioned above, the experimental results from both the monitoring and the application user are inserted into the op_results_raw table. Here is an application user code snippet. For readability, I took out a bunch of lines from the middle. Obviously, before the code below is run, the variable the_test_sql must be set.
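A hypothetical reconstruction of the snippet's shape follows. The marker text and the seven-second sleep come from the post; the column names, loop structure, and single-number fetch are assumptions.

```sql
-- Hypothetical shape of the application-side driver. the_test_sql must be
-- set before this runs; it is elided here.
declare
  the_test_sql varchar2(4000) := '...';  -- set this first (elided)
  c            sys_refcursor;
  l_n          number;
begin
  insert into system.op_results_raw (result_time, result_text)
  values (systimestamp, 'Started the SQL execution test');
  commit;

  for i in 1 .. 2 loop
    insert into system.op_results_raw (result_time, result_text)
    values (systimestamp, 'SQL start...');
    commit;

    open c for the_test_sql;   -- run the monitored select
    fetch c into l_n;          -- assumes it returns a single number
    close c;

    insert into system.op_results_raw (result_time, result_text)
    values (systimestamp, 'SQL end...');
    commit;

    dbms_lock.sleep(7);  -- the seven second gap between executions
  end loop;
end;
/
```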

To help me interpret the results, I insert various timing markers, such as "Started the SQL execution test." I also want to know, for both the monitoring and application users, when their duties started and ended; I wanted to ensure my monitoring started before and ended after the application testing. I also inserted a relatively long timing gap between the repeated SQL executions. In the example above, I used seven seconds. This makes understanding and interpreting the results much easier because the common events are clustered by time.

While I will detail the actual experimental results below, here is a snippet so you can see how I reported them. Note: for readability I replaced the actual sql_id and plan_hash_value with 7,7. I did this kind of thing throughout this blog posting.

20.9452: The statistics collection was enabled before anything else started.

21.0485: The collection has never seen the statement before, so it thinks the v$sqlstats entry has just been inserted or refreshed. This is incorrect, so disregard this entry.

26.4489: There was a clear ~5 second gap. This is how long it took me to copy and paste the application user code into its sqlplus session... without making a mistake. In the Experimental Design section above, you can see the actual SQL that caused this entry.

26.4513: This application user entry was made just before the simple SQL was run.

26.4519: The SQL statement started, so an entry is made into v$sqlstats (actually the row is refreshed, that is, updated), and our monitoring user detects this, pulls from v$sqlstats, and records some basic information that we see above. Notice that compared to the 21.0485 entry, there is no change shown to trigger this entry. However, the monitoring user did in fact detect a change by comparing the previous and current last_update_change column values. That column must have changed, but the execution column had not yet changed, or at least we don't see the change in the v$ fixed view. There could be many reasons for this, such as there being no read consistency with performance data. But notice the next entry.

26.4552: In less than two thousandths of a second we detect another change, and we can see the number of executions has increased. I suspect we caught the v$ fixed view in the middle of an update... remember, there is no read consistency in performance views! So it appears the execution column is updated when the statement begins.

28.4311: It has been about two seconds since the last v$ entry, so another row is inserted into the v$ table and we see the results here. In this case both the disk reads and logical buffer reads have increased, but the number of executions has not changed... because the number of executions changed when the statement started.

30.4383: It's been about two seconds... so we get another entry into the v$ table. This will go on and on until the statement completes.

I am not showing all the entries and have jumped to the last few before the statement finishes. Directly above are the final entries for the above execution (SQL end...) and then you will notice the next execution will begin (SQL start...). Here is an explanation:

56.5887: The statement is still running and the normal two second insert into v$sqlstats occurred, detected by our monitoring user, and recorded in our results table.

58.5933: Same as with 56.5887.

00.5970: The statement completes and immediately the v$sqlstats row is refreshed. Notice the v$sqlstats row did not refresh in the usual two second cycle. As soon as the statement completed, the row was refreshed, detected by our monitoring user, and recorded in our results table.

00.8328: The application user inserts a row telling us our simple SQL statement has indeed finished. Notice the monitoring user has no idea this has occurred, hence there is no v$sqlstats entry.

07.8464: About seven seconds later(!!) the simple SQL statement is about to be run once again.

07.8472: The statement begins and the monitoring user detects a refreshed row in v$sqlstats. Notice the execution count has been incremented to 22. Once again, for this SQL statement, the execution is incremented at the start of the SQL statement. Also, the v$sqlstats entry does not match the two second pattern. Again, it appears when a statement starts a row is immediately refreshed in v$sqlstats. We can expect the next v$sqlstats refresh to occur around the 09.8472 time.

09.6418: As expected, about two seconds later the next v$sqlstats refresh is made because the SQL statement is active. We can see that some of the activity is related to the increase in disk reads.

11.6577: ...same as the above entry...

...

From this experiment I learned:

Upon a select statement start, the v$sqlstats row is either inserted or refreshed (if it already exists).

The execution count is incremented when a select begins.

Statistics are refreshed in v$sqlstats about every two seconds, except when the statement begins and ends, in which case a refresh occurs immediately.

A word of caution: this test was specifically for a select statement. While I did additional experiments related to procedures, I did not experiment with inserts, updates, deletes, or DDL. While I suspect DML will follow this pattern, I have no experimental evidence to back this up... there are other things I need to attend to...

Results: Procedure With Simple SQL Statement

For the next experiment, I created a simple procedure containing the simple SQL statement. Here it is:
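A minimal sketch of such a procedure follows; the procedure name op_test4 and the table op_big_table are assumptions, and the body simply contains the same simple select used in the first experiment.

```sql
-- Hypothetical sketch of the "procedure with the simple SQL" test unit.
create or replace procedure op_test4 is
  l_cnt number;
begin
  select count(*)
    into l_cnt
    from op_big_table;  -- stand-in for the experiment's simple select
end;
/
```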

13.7554: The monitoring user has begun looking for change in v$sqlstats.

13.7887: Since we just started monitoring and have never seen an entry in v$sqlstats for this sql_id and plan_hash_value, our monitoring user detects this "change" and records the event. Again, disregard this entry.

17.6207: About four seconds after monitoring begins the procedure test is about to begin. Again, this time lag is due to me copying and pasting the application code (shown in part above) into the application user's session.

17.8577: The procedure is about to begin.

17.8584: Immediately an entry is made into v$sqlstats and the monitoring user detects this and records a row in our results table. Notice there has been no change in the execution count. It was three before we started the procedure and it is still three!

19.4002: Two seconds later, as expected since the procedure is running, the v$sqlstats row is refreshed and the monitoring user detects the change. In this case we can see there is an increase in both physical disk reads and logical IO. Notice the execution count has still not changed!

47.5802: The procedure is still running and still consuming computing resources.

49.5903: Two seconds later, as expected since the procedure is running, the v$sqlstats row is refreshed, the monitoring user detects the change, and inserts a row into the results table.

50.0857: Before the next two second cycle occurs, the procedure finishes, v$sqlstats immediately refreshes the row, and our monitoring user detects the change and stores the results. Notice the execution count is finally incremented! So in this test, the execution count was incremented when the procedure completed, not when the procedure started.

50.1243: The application user posts that the procedure has ended.

57.1721: After our specified seven second delay another identical procedure execution is about to begin.

57.1727: It does begin, the v$sqlstats row is refreshed, the monitoring user detects and records the change. Notice the execution count did not increment.

Results: Procedure Running the Simple SQL Twice

The application user script is exactly the same as the previous procedure test. The only difference is the variable prc_name is set to op_test5. For readability, I replaced the sql_id with xyz but did not change the plan_hash_value of zero. Here are the results.
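From the list of four tests earlier, op_test5 appears to be the procedure that runs the select twice; a sketch, with the table name assumed, could be:

```sql
-- Hypothetical sketch of the "runs the select twice" test unit.
create or replace procedure op_test5 is
  l_cnt number;
begin
  select count(*) into l_cnt from op_big_table;  -- first execution
  select count(*) into l_cnt from op_big_table;  -- second execution
end;
/
```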

If you have followed the previous two experiments, you should have no problem understanding the above. The same pattern is followed as with the above procedure. While I learned nothing new, it reinforced what I previously learned.

Results: Procedure With Simple SQL and Procedure Call

For the next experiment, I created a simple procedure containing the simple SQL statement plus a call into another PL/SQL unit, the dbms_random package. Here it is:
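A sketch of its shape follows; op_test6 is the procedure name used later in the post, while the table name and the specific dbms_random call are assumptions.

```sql
-- Hypothetical sketch: the simple select plus a call into dbms_random.
create or replace procedure op_test6 is
  l_cnt number;
  l_rnd number;
begin
  select count(*) into l_cnt from op_big_table;  -- the simple select (assumed)
  l_rnd := dbms_random.value;                    -- call into another PL/SQL unit
end;
/
```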

The application user script is exactly the same as the previous procedure test. The only difference is the variable prc_name is set to op_test6. For readability, I replaced the sql_id with akb. The results are exactly as with the previous two procedure experiments!

The application user script is exactly the same as the previous three procedure tests. The only difference is the variable prc_name is set to op_test7. The experiment was run in two different versions: the first with serveroutput off and the second with it set to on. The serveroutput setting had no effect on the execution count, and the results were exactly the same as the previous procedure experiments.

While I learned nothing new, it strengthened what I previously learned... again.

Conclusions

Sometimes it is very difficult to design an experiment peering into the unknown. In this particular experiment I needed to create a way to understand when and what changed in a v$ view. My solution was to create a tight loop looking for a very specific change in the underlying x$ fixed view; when a change was detected, it was timestamped and recorded in the results table. I also had specific application code performing specific actions while recording into the same results table when the application started and stopped its activity. A simple results table query enabled me to study the timing of events, which easily brought to light, in this specific set of experiments, when v$sqlstats is changed and what is changed. As you will see in subsequent blog posts, it was very important for me to understand this.

What I learned specifically about v$sqlstats timing:

Upon the starting and ending of an anonymous SQL statement or a procedure, the v$sqlstats row is immediately refreshed if it exists or inserted if it does not exist.

Besides the beginning and ending refresh/insertion, about every two seconds the v$sqlstats row is refreshed with the latest statistics.

For a select statement, the execution count is incremented when the select begins.

For a procedure, even when calling other procedures, the execution count is incremented when the main procedure ends.

Thanks for reading!

Craig.

5 comments:

Interesting, I also had some test cases, particularly on DBA_HIST_SQLSTAT; on the tiddlers were answers to the questions: 1) how to get SQLs that are run within any particular period, and 2) cases of missing SQL in DBA_HIST_SQLSTAT.

This may also help you on your research, or open more questions about this area. See it here: http://karlarao.tiddlyspot.com/#DBA_HIST_SQLSTAT

Karl, and for those interested in the collection aspect, which was very tricky to solve: take a close look at the collection PL/SQL code, which I do NOT show in the blog but which you can examine in the main experimental file that is downloadable. The collection loop is only around 50 lines and I commented it. Examining that should help explain why I did what I did and how I did it. There is a link in the post, but here it is again: http://filezone.orapub.com/Research/vsqlstat_change/vsqlstat_collection.txt

I just noticed I was lazy with the word "insert." I have made the appropriate corrections and reposted. There is only a single row in v$sqlstat for a SQL statement or procedure (ID'd by sql_id and plan_hash_value). When the stats are refreshed the existing row is "updated."

About the anomaly at 26.4519: the change was probably triggered by the parse call preceding the execute.

About the difference of when executions are incremented for SELECTs vs. PL/SQL: executions is updated AFTER the execute call completes. For SELECTs, execute calls are usually short; for non-select DML and PL/SQL blocks, execute calls are where all the work is done.