June 2009

I have been working on a system where many Application Engine programs are running throughout the day, and are frequently collecting Optimizer statistics with the %UpdateStats macro on many working storage tables. Concurrent calls to dbms_stats are typical.

There are two new behaviours in the Oracle 10g RDBMS that can, in extreme cases and in combination with a system that calls dbms_stats very frequently, create a significant performance overhead.

From Oracle 10g, histograms may, by default, be collected automatically. That means that rows are concurrently deleted from and inserted into histgrm$ and hist_head$, leading to contention and additional consistent read activity.

Also from Oracle 10g, every time you collect statistics on a table the old statistics are retained in the SYS.WRI$_OPTSTAT%HISTORY tables. If histograms have previously been collected, these are also copied. DBMS_STATS has the additional overhead of writing this history. I found in excess of 10,000 versions of previous statistics for some tables, because the batch processes have updated statistics on working storage tables that many times.
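To get a feel for how much history has accumulated, a query along the following lines may help. It is only a sketch: it assumes you have SELECT privilege on the DBA_TAB_STATS_HISTORY view, which lists the saved versions of table statistics, and the limit of 20 rows is arbitrary.

```sql
-- Count the retained versions of table statistics per table.
-- Assumes SELECT privilege on DBA_TAB_STATS_HISTORY.
select *
from  (select owner, table_name, count(*) num_versions
       from   dba_tab_stats_history
       group  by owner, table_name
       order  by num_versions desc)
where  rownum <= 20;
```

Working storage tables that are analysed by every batch run should appear at the top of the list.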

dbms_stats also appears to be responsible for purging history older than the retention limit. The default retention period is 31 days. I have seen concurrent calls to dbms_stats blocked on row level locks on the statistics history tables. For me, this occurred 31 days after the system went live on a significantly increased volume.

Statistics history was designed to work in conjunction with schema wide statistics jobs that only refreshed stale statistics. There is an option on gather_schema_stats to collect only statistics on tables where the current statistics are stale. However, there is no such option on gather_table_stats. If you have decided to call this procedure for a particular table, then it is assumed you know you need to refresh the statistics. However, by calling dbms_stats from a batch program you can end up calling it much more frequently than is really necessary.
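For comparison, this is roughly what a schema-wide job that refreshes only stale statistics looks like. The schema name SYSADM is an assumption (it is the usual PeopleSoft owner, but yours may differ), and staleness detection relies on table monitoring being active.

```sql
-- Refresh only those tables whose statistics Oracle considers stale.
-- SYSADM is an assumed schema name; substitute your own.
begin
  dbms_stats.gather_schema_stats(
    ownname => 'SYSADM',
    options => 'GATHER STALE',
    cascade => true);
end;
/
```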

Recommendations

Disable statistics history by using dbms_stats.alter_stats_history_retention to set the retention period to zero. Unfortunately this can only be set at database level. The statistics history is there in case you want to revert to a previous version of the statistics should a new set of statistics produce a problem, but it is only used rarely, and I think this is a necessary sacrifice.
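A minimal sketch of the calls involved, run from SQL*Plus as a suitably privileged user. Purging the existing history is optional, and on a system with a lot of accumulated history it may take some time, so treat that step as something to test first.

```sql
-- Current retention in days (31 by default)
select dbms_stats.get_stats_history_retention from dual;

-- Stop saving old statistics (this is a database-wide setting)
exec dbms_stats.alter_stats_history_retention(0)

-- Optionally remove the history that has already accumulated
exec dbms_stats.purge_stats(systimestamp)
```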

Use Oracle Optimizer Dynamic Sampling. However, I suggest increasing the level from the default of 2 to 4 to widen the range of situations in which it is used.
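The level can be raised for a session, for the whole instance, or for a single statement. The following is a sketch only; PS_WRK_TBL is a hypothetical table name, and scope = both assumes the instance was started with an spfile.

```sql
-- Session level, e.g. for testing
alter session set optimizer_dynamic_sampling = 4;

-- Instance level; scope = both assumes an spfile is in use
alter system set optimizer_dynamic_sampling = 4 scope = both;

-- Or per statement; PS_WRK_TBL is a hypothetical table name
select /*+ dynamic_sampling(t 4) */ count(*)
from   ps_wrk_tbl t;
```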

Introduce the new version of the PL/SQL wrapper package for dbms_stats so that you can specify the records for which statistics will be explicitly collected, and whether histograms are to be collected. Thus you can reduce the number of calls to dbms_stats.

Sometimes when I'm asked to check a particular SQL statement in a production-like environment, at first I only get rather limited access, e.g. a read-only user that has the required privileges to run the SQL statement, but doesn't own the objects.

If I'm then supposed to analyse a rather complex SQL statement with dozens of table and/or view references, I like to use the ALTER SESSION SET CURRENT_SCHEMA command to resolve any unqualified object references against the schema set via CURRENT_SCHEMA, rather than editing the SQL to qualify every object reference.
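As a small illustration of the technique, with APP_OWNER standing in as a hypothetical schema name:

```sql
-- APP_OWNER is a hypothetical schema name
alter session set current_schema = app_owner;

-- Confirm which schema unqualified names now resolve against,
-- and which user the session is actually connected as
select sys_context('userenv', 'current_schema') as cur_schema,
       sys_context('userenv', 'session_user')   as sess_user
from   dual;
```

Privileges are unaffected by this setting; only name resolution changes.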

This basically works just fine, but I recently came across an interesting anomaly in 10.2.0.4: if the optimizer would normally use dynamic sampling as part of the parse, it no longer does so once CURRENT_SCHEMA has been altered to a non-default schema.

The following simple test case demonstrates the issue in 10.2.0.4 Win32:

SQL>
SQL> -- Check the 10053 trace file
SQL> -- It will show that the dynamic samping query fails
SQL> -- and therefore no dynamic sampling will be performed
SQL> -- presumably because the current_schema is not obeyed
SQL> -- by the recursively executed query
SQL> alter session set tracefile_identifier = 'dynamic_sampling_current_schema_10053';

So as you can see, the simple query uses dynamic sampling as instructed (and in 10.2 would use dynamic sampling by default anyway, since the table has no statistics), but when using the CURRENT_SCHEMA trick and then an unqualified reference, the dynamic sampling is not performed and the estimate is based on hardcoded defaults.

Checking the 10053 optimizer trace file confirms that the dynamic sampling is attempted, but the recursive query fails, and the 10046 SQL trace shows that an "ORA-00942 table or view does not exist" error is encountered while parsing the recursive dynamic sampling query.
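If you want to repeat this kind of check yourself, both traces can be switched on for the session roughly as follows. This is a sketch: the trace file identifier is arbitrary, and the statement under investigation must be hard parsed while the events are set, otherwise the 10053 trace will contain nothing for it.

```sql
alter session set tracefile_identifier = 'dyn_sampling_test';
alter session set events '10053 trace name context forever, level 1';
alter session set events '10046 trace name context forever, level 1';

-- Run the statement under investigation here. Adding a unique
-- comment to its text is one way to force a hard parse.

alter session set events '10053 trace name context off';
alter session set events '10046 trace name context off';
```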

So be careful when using this (rather unusual, I have to admit) setup: the execution plans might be different if you're using a non-default CURRENT_SCHEMA and dynamic sampling is involved.

The issue seems to be fixed in 11.1.0.7 (dynamic sampling is performed as expected when using a non-default CURRENT_SCHEMA), but I couldn't identify a corresponding bug on Metalink. If anyone comes across a bug number or document ID, this would be helpful.

Richard Feynman defined science as "the belief in the ignorance of experts." Science begins by questioning established ideas. ...Even those ideas promoted by so-called experts.

The value of science that's obvious to everybody is the chance you might discover some valuable truth that nobody else has discovered before. That's the glamorous idea that might motivate you to begin the hard work that science sometimes requires. Science is also valuable to you when you learn that an established idea, no matter how much you may not like it, really is true after all. That second value of science is not as glamorous, but it's just as important. My little prayer with respect to that possibility is, "If an idea I believe is wrong, please let me find out before anybody else does."

Everyone can do science. Not just "scientists"; all of us. But you need to do science "right," or it's not science. Do it right, and you accumulate a little bit of truth. Do it wrong, and you've wasted your time, or worse, you've doomed yourself to waste more of your time in the future, too.

The difference between "right" and "wrong" in science is not some snooty, bureaucratic concept. You don't need a license or a blessing to do science right. You just need to ensure that the cause-effect relationships you choose to believe are actually correct. One of the rules for doing science right is that you measure instead of just asserting your opinion.

Different people have different thresholds of skepticism. Some people believe new ideas, whether they're true or false, with very little persuasion. The people who are persuaded easily to believe false things cannot contribute much useful new knowledge to their communities (irrespective of how much they might publish).

Just a minor thing to consider: by default in 10g and later, index statistics are generated along with index creation (the COMPUTE STATISTICS option of previous releases is now enabled by default), so a newly created index usually has computed statistics.

10g also introduced the option to lock table statistics.

Now if you lock statistics in 10g or later using DBMS_STATS.LOCK_TABLE_STATS or LOCK_SCHEMA_STATS and create an index on a locked table, the statistics for the index will not be generated along with the CREATE INDEX command. Unfortunately, as far as I'm aware, there is no corresponding "FORCE" option available in CREATE INDEX to override that behaviour, so it looks like you're left with only two choices:

1. Use a separate DBMS_STATS.GATHER_INDEX_STATS call with the FORCE=>true option to override the lock on the statistics

2. Temporarily unlock the table statistics before creating the index

The first option can be costly if the index is large, the second option requires additional steps to be taken, and it obviously needs to be ensured that the table statistics are not modified while they are unlocked (e.g. by the default statistics job in 10g and later).
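The two choices could look roughly like this. The table T, its column ID and the index T_IDX are hypothetical names, and the two options are alternatives, not steps to be run one after the other.

```sql
-- T, ID and T_IDX are hypothetical names.

-- Option 1: create the index, then gather its statistics explicitly,
-- overriding the lock with FORCE
create index t_idx on t (id);
exec dbms_stats.gather_index_stats(ownname => user, indname => 'T_IDX', force => true)

-- Option 2 (alternative to option 1): unlock, create the index, lock again
exec dbms_stats.unlock_table_stats(ownname => user, tabname => 'T')
create index t_idx on t (id);
exec dbms_stats.lock_table_stats(ownname => user, tabname => 'T')

-- Either way, check whether the index now has statistics
select index_name, num_rows, last_analyzed
from   user_indexes
where  table_name = 'T';
```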

A small testcase run on 10.2.0.4 Win32 follows to demonstrate the issue. I got the same result on 11.1.0.7 Win32.

I presented various performance myths in my ‘battle of the nodes’ presentation. One of the myths was that spawning parallel query slaves across multiple RAC instances can cause a major bottleneck in the interconnect. In fact, that myth was the direct result of a lessons-learnt presentation from a client engagement. The client was suffering from performance issues, with enormous global cache waits running to an average response time of 30+ ms for global cache CR traffic and crippling application performance. Essentially, their data warehouse queries were performing hundreds of parallel queries concurrently, with slaves spawning across three RAC instances.

Of course, I had to hide the client details, so I simplified the problem into a test case to explain the myth. It looks like either a) my test case is bad, b) I encountered some sort of bug in version 9.2.0.5, or c) I made a mistake somewhere in my analysis. Most likely it is the last one. Greg Rahn questioned that example, and this topic deserves more research to understand it a little further. At this point I don’t have 9.2.0.5; the database is on 10.2.0.4, so we will test this in 10.2.0.4.

udpsnoop

UDP is one of the protocols used for cache fusion traffic in RAC, and it is the Oracle-recommended protocol. For this article, the size of the UDP traffic must be measured. Measuring global cache traffic using AWR reports was not precise enough, so I decided to use a tool from the DTrace Toolkit, udpsnoop.d, to measure the traffic between RAC nodes. There are two RAC nodes in this setup. udpsnoop.d can be downloaded as part of the DTrace Toolkit. Output of this script is of the form:

If you’ve read EnterpriseDB’s latest press release, you’ll notice a new feature called Infinite Cache. While it may sound revolutionary, it is not a new technology. In fact, it was done for MySQL quite a while ago under the Waffle Grid project. While I consider this type of caching relatively simple, I don’t want Oracle to be left out. As […]

Temporary tablespace groups allow you to group multiple temporary tablespaces into a single group and to assign a user this group of tablespaces instead of a single temporary tablespace.

This raises some interesting questions, and for some of these I couldn't find answers in the official documentation. Some of these questions are:

- Can a single workarea execution allocate space from more than one temporary tablespace, e.g. to support large serial sort operations?

A workarea belongs to a single operation of an execution plan. There are several different types of operations that require a workarea, among them are sorts, hash joins, group bys and sort/merge joins.

This workarea can fit into available PGA memory, but can also spill to disk in case there is insufficient memory available to support the operation.

Furthermore, this implies that the execution of a single SQL statement can require multiple workareas, e.g. a quite simple statement might need two workareas for two hash joins and a third one for a subsequent sort (ORDER BY) operation.

Note that there are other types of operations that don't require a workarea, e.g. a nested loop join doesn't require a workarea (and therefore will never acquire temporary space).

Details about workareas can be obtained from various dynamic performance views, e.g. V$SQL_WORKAREA, V$SQL_WORKAREA_ACTIVE and V$SQL_WORKAREA_HISTOGRAM.

- Can multiple workareas of a single session allocate space from different tablespaces?

- According to the documentation different sessions of the same user can use different temporary tablespaces from the group. Is this correct?

- What about parallel execution? The documentation says that the parallel slaves can use different temporary tablespaces. Is this correct?
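While a statement is running, the workarea views mentioned above can show which workareas are active and whether they have spilled to disk. A sketch, assuming SELECT privilege on the V$ views:

```sql
-- One row per active workarea; TEMPSEG_SIZE and TABLESPACE are
-- only populated once the workarea has spilled to disk
select sid, operation_type, operation_id, policy,
       actual_mem_used, number_passes, tempseg_size, tablespace
from   v$sql_workarea_active
order  by sid, operation_id;
```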

Here's a simple testcase that generates two small temporary tablespaces and assigns them to the same group. A test user is assigned first a single temporary tablespace and afterwards the group to see if this makes any difference.

It uses a carefully crafted table for which a sort operation doesn't fit into a single small temporary tablespace but is supposed to fit if both temporary tablespaces of the group can be used.

The session is deliberately using a very small sort_area_size in manual workarea policy mode to force the sort operation to spill to disk.
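The setup described above might be sketched as follows. This is not the original script: the tablespace name TEMP1_GRP is taken from the error message in the output, while TEMP2_GRP, the group name TEMP_GRP, the user TEST_USER, the file names and all sizes are assumptions chosen for illustration.

```sql
-- Names other than TEMP1_GRP, file names and sizes are assumptions.
create temporary tablespace temp1_grp
  tempfile 'temp1_grp.dbf' size 2m
  tablespace group temp_grp;

create temporary tablespace temp2_grp
  tempfile 'temp2_grp.dbf' size 2m
  tablespace group temp_grp;

-- Verify the group membership
select group_name, tablespace_name from dba_tablespace_groups;

-- First a single temporary tablespace, later the whole group
alter user test_user temporary tablespace temp1_grp;
alter user test_user temporary tablespace temp_grp;

-- In the test session: force the sort to spill to disk
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 65536;
```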

The following are the results from 11.1.0.7 Win32. Similar results can be seen from 10.2.0.4 Win32. The database used an 8KB default block size and a locally managed tablespace with manual segment space management (no ASSM) for the data, but that shouldn't matter much here.

SQL>
SQL> exec open :r for select * from test_temp_grp order by object_name

PL/SQL procedure successfully completed.

SQL>
SQL> -- Sort still doesn't fit into available temp space (but it should as you'll see later)
SQL> -- so obviously a single workarea can't use space from different tablespaces
SQL> declare
  2    rec test_temp_grp%rowtype;
  3  begin
  4    fetch :r into rec;
  5  end;
  6  /
declare
*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 13 in tablespace TEMP1_GRP
ORA-06512: at line 4

SQL>
SQL> -- What if we have multiple workareas per session
SQL> exec open :r for select * from test_temp_grp where rownum <= 5000 order by object_name

SQL>
SQL> -- So it looks like a session is limited to a particular tablespace taken from the group
SQL> -- What happens to a second session of the same user
SQL> -- One tablespace is still in use by cursor R
SQL> -- Check V$SORT_USAGE if interested
SQL>
SQL> -- Run the same query in second session
SQL> -- You might need multiple attempts to get this working
SQL> -- Depending on the tablespace taken from the group
SQL> -- and then close second session to free temp space
SQL> -- Press ENTER to continue
SQL>
SQL> pause

SQL>
SQL> -- This works if the second session is assigned a different tablespace taken from the group
SQL> -- So different sessions of the same user will potentially be assigned to different tablespaces
SQL>
SQL> -- What about parallel execution
SQL> exec close :r

SQL>
SQL> -- This works
SQL> -- So the PX sessions can be assigned to different tablespaces taken from the group
SQL> -- Check V$SORT_USAGE if interested
SQL> declare
  2    rec test_temp_grp%rowtype;
  3  begin
  4    fetch :r into rec;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL>
SQL> spool off

So in summary the following can be deduced:

1. A single workarea execution can't use space from more than a single tablespace, even when the user is assigned a temporary tablespace group. Or in other words: a single workarea execution can only allocate a single temporary segment, and segments in general can't span multiple tablespaces in Oracle, i.e. they must fit into a single tablespace.

2. Even multiple workareas of a single session can't use space from the multiple temporary tablespaces available. So this looks like a session attribute, i.e. a tablespace taken from the group is assigned to the session, and all workareas of that session will use that single temporary tablespace.

3. Multiple sessions of the same user can be assigned to different temporary tablespaces taken from the group.

4. Parallel execution slave sessions can do the same, they can be assigned to different temporary tablespaces taken from the group.

So in order to leverage the power of temporary tablespace groups you need either multiple sessions running serial SQL or the parallel execution option.
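To see which temporary tablespace the sessions are currently using, V$SORT_USAGE (referred to in the test script comments) can be joined to V$SESSION. A sketch, assuming SELECT privilege on both views:

```sql
-- One row per temporary segment currently allocated by a session
select s.sid, s.username, u.tablespace, u.segtype, u.blocks
from   v$sort_usage u, v$session s
where  s.saddr = u.session_addr
order  by s.sid;
```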

The second part of my two-hour Helsinki presentation brings the message that in order to avoid PL/SQL spaghetti when taking the "fat database" approach, one must employ a layered PL/SQL code architecture inside the DBMS. This means that UI-code, BL-code and DL-code should not be merged together inside single pieces of PL/SQL code blocks, but instead remain completely separated. Now, for UI-code