Monday, October 17, 2011

Volatile Data, Dynamic Sampling And Shared Cursors

For the next couple of weeks I'll be picking up various random notes I've made during the sessions I attended at OOW. This particular topic also came up recently as a problem at one of my clients, so it's certainly worth publishing here.

In one of the optimizer-related sessions it was mentioned that for highly volatile data - for example, as often found in Global Temporary Tables (GTTs) - it's recommended to use Dynamic Sampling rather than attempting to gather statistics. Gathering statistics is particularly problematic for GTTs because the statistics are used globally and shared across all sessions, yet each session's GTT can hold a completely different data volume and distribution, so sharing the statistics doesn't make sense in such scenarios.

So using Dynamic Sampling sounds like reasonable advice, and in many such cases it probably is.

However, there is still a potential problem even when resorting to Dynamic Sampling: if a cursor based on Dynamic Sampling gets shared between sessions, it won't be re-optimized, even if the GTT in one session looks completely different from the GTT of the session that originally created the shared cursor.

So you can still end up with shared cursors and execution plans that are inappropriate to share across the different sessions. Using Dynamic Sampling doesn't address this issue; it only helps if the cursors do not get shared in the first place - for example, if they use literals and these literals differ, so that different cursors are generated based on the text matching.
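The original test scripts are not reproduced in this excerpt. A minimal sketch along the same lines, using the GTT_DYN table that also appears in the RLS policy function below (column names and data values here are my assumptions, not the original scripts), could look like this:

```sql
-- Session 1: create the GTT (no statistics gathered) and load a
-- distribution where "id = 10" matches only a single row
create global temporary table gtt_dyn (id number, filler varchar2(100))
on commit preserve rows;

insert into gtt_dyn
select case when level = 1 then 10 else level + 100 end, rpad('x', 100)
from dual connect by level <= 10000;

select /*+ gather_plan_statistics */ count(*) from gtt_dyn where id = 10;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Session 2: same statement text, but now "id = 10" matches every row.
-- The cursor - and its dynamic-sampling-based plan - from session 1
-- gets reused, so no new sampling takes place for this session's data.
insert into gtt_dyn
select 10, rpad('x', 100) from dual connect by level <= 10000;

select /*+ gather_plan_statistics */ count(*) from gtt_dyn where id = 10;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Comparing the E-Rows and A-Rows columns of the second session's runtime profile should show the mismatch.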

Now this is probably an extreme case of data distribution differences, but if you run it the point becomes obvious: in the second session the data distribution of the GTT is completely different, and although there are no statistics gathered on the GTT - so Dynamic Sampling gets used to arrive at an execution plan - the plan gets shared in the second session (there is only a child number 0). But the plan is completely inappropriate for the data distribution of the GTT in that session; you just need to compare the E-Rows and A-Rows columns of the runtime profile:

Imagine a more complex plan with joins and a larger data volume and this is a recipe for disaster.

If this problem cannot be addressed from the application side by helping the database generate different cursors for the different data distributions (for example, by simply adding different predicates that don't change the result, like 1 = 1, 2 = 2 etc.), then you might be able to handle the issue by using Virtual Private Database (VPD, aka Row Level Security / RLS, Fine Grained Access Control / FGAC). I've already demonstrated the general approach in the past here, but in this case a slightly more sophisticated approach could make sense.

By adding the following code and RLS policy I can get Oracle to perform a re-optimization only in those cases where it is appropriate. This limits the damage that the general approach does to the Shared Pool by unconditionally generating a potentially large number of child cursors.
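For completeness: the matching package specification is not shown in this excerpt, but it would simply declare the policy function with the signature DBMS_RLS expects (the same signature used in the body below):

```sql
create or replace package pkg_rls_force_hard_parse is
  function force_hard_parse (in_schema varchar2, in_object varchar2)
    return varchar2;
end pkg_rls_force_hard_parse;
/
```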

create or replace package body pkg_rls_force_hard_parse is
  -- Cache the count in session state
  g_cnt number;

  function force_hard_parse (in_schema varchar2, in_object varchar2)
    return varchar2
  is
    s_predicate varchar2(100);
  begin
    -- Only execute the query once per session
    -- Change if re-evaluation is desired
    if g_cnt is null then
      select count(*)
      into   g_cnt
      from   gtt_dyn
      where  id = 10
      and    rownum <= 10;
    end if;

    -- We end up with exactly two child cursors
    -- with the desired different plans
    -- These child cursors will be shared accordingly
    if g_cnt > 1 then
      s_predicate := '1 = 1';
    else
      s_predicate := '2 = 2';
    end if;

    return s_predicate;
  end force_hard_parse;
end pkg_rls_force_hard_parse;
/
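The registration of the policy is also not reproduced in this excerpt. Assuming the policy sits on a view on top of the GTT (the view name and policy name here are hypothetical - see the note on recursion below for why a view is used), the call would look something like this:

```sql
begin
  dbms_rls.add_policy (
    object_schema   => user,
    object_name     => 'V_GTT_DYN',          -- hypothetical view on top of GTT_DYN
    policy_name     => 'HARD_PARSE_POLICY',  -- hypothetical policy name
    function_schema => user,
    policy_function => 'pkg_rls_force_hard_parse.force_hard_parse',
    statement_types => 'select'
  );
end;
/
```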

Now if you repeat the above exercise - ideally with SQL trace enabled to see the additional activity caused by the RLS policy - you'll notice that the different sessions end up with different child cursors and execution plans, based on the check made.

Now the reason why the view is in place might become obvious: an RLS policy on the base table would have led to infinite recursive execution of the RLS policy function, due to the query performed within the function itself. There are other options for dealing with that - for example, storing the RLS policy function in a separate schema with the EXEMPT ACCESS POLICY privilege should also work.

Notice how a second child cursor has been generated and that the cardinality estimate is now much closer to reality.

Adaptive Cursor Sharing / Cardinality Feedback

I was curious to see if recent features like Adaptive Cursor Sharing or Cardinality Feedback would be able to solve the issue when using the 11g releases.

Cardinality Feedback (introduced in 11.2) unfortunately doesn't get used in the scenario described here, because Dynamic Sampling disables Cardinality Feedback in the current implementation.

Note that the usage of bind variables also disables Cardinality Feedback for those parts of the plan affected by the bind variables - as described in the Optimizer blog post that can be found here.

So maybe Adaptive Cursor Sharing (ACS, introduced in 11.1) can come to the rescue in case bind variables get used.

Of course the usage of bind variables increases the probability of cursor sharing in the above scenario. As already outlined in a previous note, ACS is a "reactive" and "non-persistent" feature, so it will only be able to correct things that have already gone wrong at least once. Furthermore, if the ACS information gets aged out of the Shared Pool, the mistakes will have to be repeated before ACS recognizes them again.

However, it is interesting to note that I wasn't able to get ACS working in a slightly modified scenario like this (without the RLS policy in place, of course):
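The modified scenario itself is not reproduced in this excerpt; the idea is a range predicate on the GTT combined with a bind variable, along these lines (the bind name and value are my assumptions for the sketch):

```sql
variable b1 number

-- Session 1: bind value matching only few rows
exec :b1 := 10

select /*+ gather_plan_statistics */ count(*) from gtt_dyn where id <= :b1;

-- Check whether ACS even considers monitoring the cursor
select child_number, is_bind_sensitive, is_bind_aware, executions
from   v$sql
where  lower(sql_text) like '%count(*) from gtt_dyn where id <= :b1%'
and    sql_text not like '%v$sql%';   -- exclude this monitoring query itself
```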

There are some interesting points to notice when running this example:

1. A cursor that uses non-equality operators - like the "less than or equal" predicate above - together with bind variables usually gets marked as "bind-sensitive" and will be monitored by ACS. But in this case the cursor was not marked as bind-sensitive, and hence ACS didn't even bother to monitor it

2. Consequently the two sessions share the single child cursor, and the problem is not addressed by ACS even in subsequent executions

3. It looks like, once again, the usage of Dynamic Sampling disables ACS

Looking at the way ACS manages the cursor sharing criteria (check V$SQL_CS_SELECTIVITY, for example), I see the problem that ACS probably cannot handle the same bind value resulting in completely different selectivity ranges.
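For reference, the selectivity ranges ACS records per child cursor can be inspected like this (a sketch; substitute the SQL_ID of the cursor in question, and note that rows only appear once a cursor has become bind-aware):

```sql
select child_number, predicate, range_id, low, high
from   v$sql_cs_selectivity
where  sql_id = '&sql_id';
```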

Maybe this is an explanation why ACS is not activated for cursors that use Dynamic Sampling - ACS may only be able to cope with different bind value ranges that lead to different selectivities.

So even when using bind variables on 11g with ACS, it looks like only the RLS policy approach allows this issue to be addressed purely from the database side. Ideally the application should be "data-aware" in such cases and help the database arrive at reasonable execution plans by actively unsharing the cursors.

2 comments:

Another solution would be to gather statistics on the GTT. Then we could force the use of Dynamic Sampling via a DYNAMIC_SAMPLING hint at level 6, and invalidate the cursor before the query is executed by using the dbms_stats.set_table_stats procedure with NO_INVALIDATE => FALSE.
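A sketch of that suggestion (the statistics value is hypothetical; the hint makes the optimizer sample despite the stored statistics, while SET_TABLE_STATS with NO_INVALIDATE => FALSE invalidates existing cursors, forcing a re-optimization on the next parse):

```sql
-- Invalidate dependent cursors before the next execution
begin
  dbms_stats.set_table_stats (
    ownname       => user,
    tabname       => 'GTT_DYN',
    numrows       => 10000,        -- hypothetical value
    no_invalidate => false
  );
end;
/

-- Force dynamic sampling despite the existing statistics
select /*+ dynamic_sampling(t 6) */ count(*)
from   gtt_dyn t
where  id = 10;
```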