Which PLAN_HASH_VALUE Appears in V$SQLAREA?

28032012

March 28, 2012

A recent question on the OTN forums asked which PLAN_HASH_VALUE appears in V$SQLAREA when there are multiple child cursors for a single SQL_ID value, when some child cursors have a different execution plan. Certainly, this bit of information must be in the Oracle Database documentation. Let’s check the V$SQLAREA documentation for Oracle Database 11.2:

“Numeric representation of the SQL plan for this cursor. Comparing one PLAN_HASH_VALUE to another easily identifies whether or not two plans are the same (rather than comparing the two plans line by line).”

Well, that was not helpful, but it does remind me of something that I saw when I went out for a drive in the countryside this past weekend (all within about a 50 meter radius – click a picture to see a larger view of the picture):

OK, now that the initial frustration of not obtaining an answer from the documentation has subsided, let’s put together a quick test case to see if we are able to help the OP find an answer to his question. We will borrow a slightly modified version of a test script that generates skewed data which was used in another article:

For the initial test (in Oracle Database 11.2.0.2), I will use the BIND_AWARE hint to save Oracle from having to determine that the execution plan could (should) depend on the bind variable value, rather than having to rely on adaptive cursor sharing to eventually obtain the same effect:

As shown above, the execution plan changed, thus the Plan hash value in the execution plan changed to 3617692013, and that change corresponded with the PLAN_HASH_VALUE for the SQL_ID in V$SQLAREA changing to the value 3617692013 – the same value shown in the execution plan for the most recently executed child number.

Let’s trying again without changing the bind variable value to see what happens:

As shown above, the PLAN_HASH_VALUE for the SQL_ID found in V$SQLAREA switched back to the original value – it is again showing the PLAN_HASH_VALUE of the most recently executed child cursor.

—

But wait, there’s more information. Pete Finnigan recently reminded me of a problem that I had several years ago when I tried to untangle the string of synonyms and views to see the definition of various Oracle Database performance views. He not only reminded me of the problem that I had years ago, but showed me the process of untangling the (evoke suitable picture from above) that I learned and forgot several times over the subsequent years.

Let’s get started by determining what the V$SQLAREA synonym points at:

So, the view V_$SQLAREA selects from V$SQLAREA which selects from GV$SQLAREA which selects from X$KGLCURSOR_CHILD_SQLID where KGLOBT02 != 0 (COMMAND_TYPE column in V$SQLAREA). (Note, must be logged in as the SYS user for the following SQL statement.)

So, in summary the synonym V$SQLAREA points to the view V_$SQLAREA which selects from V$SQLAREA which selects from GV$SQLAREA which selects from X$KGLCURSOR_CHILD_SQLID where KGLOBT02 != 0, which has as a base table of X$KGLOB just like 8 other fixed tables. If you repeat the above steps for V$SQL, you will find that it is based on X$KGLCURSOR_CHILD, which also has X$KGLOB as its base table.

Here is a piece of SQL that joins the underlying fixed table for V$SQLAREA with V$SQL to hopefully determine if the PLAN_HASH_VALUE for the most recently executed child cursor for each SQL_ID is always what appears in V$SQLAREA (I do not suggest running this SQL statement in a production environment – the test database instance in my case was bounced a couple of hours ago). The word DIFFERENT is output if the LAST_ACTIVE_TIME from V$SQLAREA does NOT match the LAST_ACTIVE_TIME for a specific child in V$SQL (Note, must be logged in as the SYS user for the following SQL statement.):

I have an output from tkprof that Elapsed time from wait events is higher than “elapsed” in tkprof output. However, I do not know where in your blog I can post it. So aplogies for this. A comment has been it is due to using different clock granularities. Please advise if you are interested in me posting and where.

I have been struggling to find time lately to write anything on this blog (I am suffering a bit from lack of ideas), or even to look through message threads on OTN or read blogs.

That is an interesting problem that you mention. If you look in the raw trace file, do you see any waits associated with CURSOR #0? Another thought is that SQL statements at recursive dep=1 or dep=2 might be affecting the tkprof output.

I will hopefully have some time later today – it might make for an interesting blog article to explore an example of the wait time being greater than the elapsed time. How large is the trace file?

I am running the following simple OLTP type tests. Selecting 1.7 million rows into an associateive array and updating another table where object_id match (via index key)

Table tdash has 1.7 Million rows based on all_objects + padding1 varchar(4000) + padding2 varchar(4000). There is a unique index on object_id
Table testwrites was created as CREATE TABLE testwrites AS SELECT * FROM tdash WHERE ROWNUM <=1000000. A unque index was created on object_id of testwrites table.

Now why my elapsed times for updates from wait events 2304.22+ 99.78 = 2404 sec is HIGHER than my elapsed time of 1245.21 sec? I don’t think it can be explained purely due to different clock granularities. Additionally,

Also I have “db file sequential read” = 2,160,263 physical IOs and from the rsaw trace file one physical IO in this case is 32 blocks. I also have “db file scattered read” = 27,769 physical IOs. In contrast the number of my disk reads is 1,064,667 blocks which is far less that sum of physical IOs. Something is not right somewhere.

It appears that what you are seeing is a potential bug in the version of tkprof that you are using, but the potential bug is unconfirmed at the moment. The version of tkprof is probably prior to 11.1.0.7. The potential bug appears to be related to attempting to subtract recursive child statistics from the associated from the parent UPDATE SQL statement statistics (the update statement has a dep=1, while the recursive child statistics will have a dep=2, great-grandchild will have a cursive dep=3, etc.).

I set up a test (60,000 rows in T1 from ALL_OBJECTS, 50,000 in T2) with Oracle Database 11.2.0.2, and the 32 bit Oracle Client 11.2.0.1 on Windows with patch 10155837 (fixes ODBC update problems related to bind variables), and found that that version of tkprof apparently correctly associates the parent and child statistics (I wonder about the grandchildren statitsics).

For comparison, my program (Hyper-Extended Oracle Performance Monitor), does not subtract child recursive statistics from the parent statistics, so double-counting of statistics is very likely when dep=1 (or greater) statements are found in the trace file. My program output:

One of the outputs of my program is also an Excel spreadsheet with the wait events for each cursor. That spreadsheet confirms the tkprof output, that there were 31 db file scattered read waits (for a total of 416 blocks, all close to the start of the trace – Oracle was trying to fill the empty buffer cache) and 36 db file sequential read waits (36 total blocks). Notice also that the CPU time and elapsed times differ, in addition to the number of physical blocks read and consistent gets.

Which version of Oracle11g R2 Database are you using (11.2.0.1, 11.2.0.2, 11.2.0.3)?

Which version of the Oracle client are you using? You can determine the Oracle client by using the tnsping utility immedaitely after using tkprof:

The test case that I used to produce the above output had a unique index on the OBJECT_ID column of table T1, but not the OBJECT_ID column of table T2. Below is the output from tkprof with the index on table T2:

Notice now that the elapsed time is just less than half of what my program outputs, while the numbers were very close when a full table scan was the only option. The same is true for the CPU time. I might have to experiment a bit more to see what is happening.

Actually this work of mine was related to comparing a typical OLTP load on Solid State Disks (SSD) vs the same load on normal hard disks (HDD) for “writes”. So using the script that I explained earier I simulated UPDATES/DELETES/INSERTS via index search. In our case we are processing 100 rows at a time.Now typically the UPDATES and DELETES affect 1000,000 existing rows in table testwrites. But remember that tdash has 1.7 million rows and above 1000,000 UPDATE AND DELETES return zero rows. However, INSERT will happen. So it total a million rows (in chuncks of 100 rows at a time) will be UPDATED, DELETED and RE-INSERTED. Above 1000,000 only INSERTS will happen and obviously index rows will be added. So there will be plenty of recursive rows.

So if I run the whole thing (UPDATES/DELETES/INSERTS) and look at the number of recursive statement for HDD I get

That 4,321 elapsed seconds includes 4,312.28 sec from recursive statements plus 9.15 sec for non-recursive statements (not shown). These recursive statements occurred because of dynamic storage extension. Dynamic storage extension occurred in our case because table testwrites and its index had to extend beyond their allocated space (that is, new extents were allocated). The wait times included 3,660.65 sec for db file sequential reads and 244.64 sec waits for db file scattered reads.

Now this summary seems to be correct. However, it typically does not answer (at least directly) the throughput of writes. Not suprising noone waits for writes. So I wanted to devise a mechanisom for measuring writes (exclusing ORION or other scripts). I thought abou it and decided to use AWR for this purpose. Went back to my original script and used checkpoint prior and following PL/SQL block and with snapshot

Now I can work out the volume of redos and writes generated (another topic for itself). Also for clarity with SSD tests redo files were on SSD. With HDD tests redo files were put on HDD. So after a bit of head scratching with tkprof output, I decided to look at the overall statistics for PL/SQL block. just compare the SQL execute times pretty much as they are taken at the highest level (like outside of the loop).

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: