Two minor partition gotchas: one on partition extent sizing and one on the maximum number of partitions in an interval partitioned table.

First – old news – note that initial extent size for partitioned tables changed in 11.2.0.2.
Somehow this passed me by and I was late to the large extent party – duh!
Controlled by the parameter _partition_large_extents, the default initial extent for autoallocated ASSM partition segments was changed from 64K to 8MB.
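A quick way to see the effect for yourself is something like this sketch (table name is my own illustration; with deferred segment creation enabled, the segment only appears on first insert):

create table t_extents ( n number )
partition by range (n)
( partition p1 values less than (10) );

insert into t_extents values (1);
commit;

select segment_name, partition_name, bytes
from   user_segments
where  segment_name = 'T_EXTENTS';
-- on 11.2.0.2+ expect an 8MB initial extent for the partition segment,
-- unless "_partition_large_extents" has been set to false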

A friend highlighted this to me on his DB where a lot of space had been wasted on small or empty partitions.
If they were empty, why was space wasted? Shouldn’t deferred segment creation have helped?
Deferred segment creation was off on this DB because of historic occurrences of bug 12535346, which involves deferred constraints combined with other features with “deferred” behaviour (deferred segment creation or interval partitioning).

Second gotcha – if you are considering INTERVAL PARTITIONING, particularly with an interval of 1, think about the impact of the maximum number of interval partitions.

Although we only have two materialised partitions, the possible future partition boundaries in between are effectively reserved for numbers between the lower and upper boundaries.

There’s not a lot we can do about this UNLESS you know that you’re not going to need those reserved partitions, OR you’re happy for other numbers to go into existing partitions, effectively giving those partitions ranges much larger than the original 1.

At the moment, if we go higher than our reserved boundaries it fails, but if, for example, the key “1048573” comes along, which is within our reserved boundaries, it will create a new partition:
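To make that concrete, a sketch of the shape of the problem (names are my own; the hard limit is 1024K-1 = 1048575 partitions per table):

create table t_interval
( oid number )
partition by range (oid) interval (1)
( partition p_first values less than (2) );

insert into t_interval values (1048573);
-- within the reserved boundaries: a new partition is quietly created

insert into t_interval values (9999999);
-- beyond them: ORA-14300: partitioning key maps to a partition outside
-- maximum permitted number of partitions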

The ETL process commonly verifies that certain constraints are true. For example, it can validate all of the foreign keys in the data coming into the fact table. This means that you can trust it to provide clean data, instead of implementing constraints in the data warehouse. You create a RELY constraint as follows:
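For example, something like this sketch (borrowing the SALES and TIMES tables from the SH sample schema):

alter table sales add constraint sales_time_fk
foreign key (time_id) references times (time_id)
rely disable novalidate;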

Tangent alert!
Now, I wasn’t going to go off on this tangent because the article will be long enough anyway, but when writing up the article a number of questions about
“why was this not being rewritten?” came up, so we might as well cover more bases.

We can’t use the materialized view because it’s stale.
To use it, we would have to consider running with query_rewrite_integrity set to stale_tolerated.
In this case, not.
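(For reference, that would be the following, shown session-level purely for illustration:

alter session set query_rewrite_integrity = stale_tolerated;

but, as I say, not in this case.)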

If I continue to leave the update uncommitted and check another session, it continues to use the MV.

But, another tangent!!
In that other session, if I set statistics_level to all, then it won’t use the MV and DBMS_MVIEW.EXPLAIN_REWRITE can’t tell me why.
I’m going to conveniently leave that investigation for another time but just show the evidence:

If we trace the session, we can find out what is contributing to this unpleasant two seconds.
I’m going to cheat: flush the shared pool, do the update and commit, and check V$SQL.
Here is a summary of the significant contributing SQL snippets.
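Something along these lines (a sketch; the LIKE filters are just a crude way of picking out the refresh internals):

alter system flush shared_pool;

update t1 set flag = 'Y' where oid = 1000001;
commit;

select sql_id
,      executions
,      round(elapsed_time/1e6,2) elapsed_s
,      substr(sql_text,1,80) snippet
from   v$sql
where  lower(sql_text) like '%mlog$%'
or     lower(sql_text) like '%snap$%'
order  by elapsed_time desc;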

If you have read Alberto’s blogs then you will notice the problematic HASH JOIN RIGHT SEMI in the INSERT, and you will also be familiar with a couple of solutions which now present themselves.

It’s great that there are known workarounds to this problem but, as I hope to demonstrate, the fact that there are at least three separate code paths through the fast refresh mechanism worries me, particularly when we see that at least one of these alternative code paths causes bugs with other variations on the MV refresh mechanism. What a mess!!

The first solution to the problem is the parameter _mv_refresh_use_stats.
Repeat update & commit:
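I.e. something like this, session-level for illustration (see the caveats about underscore parameters below):

alter session set "_mv_refresh_use_stats" = true;

update t1 set flag = 'Y' where oid = 1000001;
commit;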

So, to briefly summarise the above, we have at least three clear code paths through the fast refresh mechanism which result in completely different internal SQL being generated.
Multiple code paths add complexity when it comes to combining other features, upgrades, testing, etc.

It’s then no great surprise when such multiple code paths lead to other bugs.

So which approach are we meant to take?

Should we just take the default longer commit on the chin?

I’m not sure.

But I can come up with an illustration of why we should think twice about going down the “_mv_refresh_use_stats” approach.
Firstly, it’s an underscore parameter so we should really get Oracle Support approval before setting it (which means a fair bit of back and forth until you get someone who knows what you’re going on about).

Secondly, we can quickly find some bugs around the usage of this parameter with the newer COMMIT SCN-based MV logs (also covered in Alberto’s blogs above):

alter session set "_mv_refresh_use_stats" = true;
session SET altered.
update t1 set flag = 'Y' where oid = 1000001;
1 rows updated
commit;
SQL Error: ORA-12008: error in materialized view refresh path
ORA-00904: "SNAPTIME$$": invalid identifier
12008. 00000 - "error in materialized view refresh path"
*Cause:  Table SNAP$_<mview_name> reads rows from the view
         MVIEW$_<mview_name>, which is a view on the master table
         (the master may be at a remote site). Any
         error in this path will cause this error at refresh time.
         For fast refreshes, the table <master_owner>.MLOG$_<master>
         is also referenced.
*Action: Examine the other messages on the stack to find the problem.
         See if the objects SNAP$_<mview_name>, MVIEW$_<mview_name>,
         <mowner>.<master>@<dblink>, <mowner>.MLOG$_<master>@<dblink>
         still exist.

We don’t actually need COMMIT SCN in our ON-COMMIT MV because we’re using the XID.
COMMIT SCN is a newer mechanism aimed at FAST REFRESH ON DEMAND.
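For reference, the difference is in how the MV log is created; a sketch (the column list is illustrative):

-- XID-based (the traditional mechanism we are relying on here):
create materialized view log on t1
with rowid (oid, flag) including new values;

-- COMMIT SCN-based (11.2+):
create materialized view log on t1
with rowid (oid, flag), commit scn including new values;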

I know that I don’t need COMMIT SCN in my case above, but the fact that it’s so easy to find an issue means that I’m reluctant to go down the parameter route.

Note that there are quite a few bug notes around this in Oracle Support.
A lot of the advice is conflicting, not least because the behaviour of “_mv_refresh_use_stats” changed somewhere in the middle of the 10.1 releases.

At the moment, I think I’m going to go with locking the stats on the MLOG$ tables, but it’s worrying.

Only Advanced Compression, which requires the Advanced Compression license, supports dropping a column on a compressed table, and even in this situation there is no real drop: internally the database sets the column UNUSED to avoid long-running decompression and recompression operations.

Why might this matter?

I can think of at least one time when this might matter – partition exchange.
Ok, the circumstances are probably niche and more relevant for a DEV environment than PROD.
But imagine you add a column to one side of the partition exchange operation and then think better of it and immediately drop it.
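For example, a sketch (t1 being the standalone side of the exchange below, compressed as described):

alter table t1 add (extra_col number);
alter table t1 drop column extra_col;
-- on the compressed table this "drop" internally just sets the column UNUSED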

alter table t2 exchange partition for (1) with table t1;
SQL Error: ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
14097. 00000 - "column type or size mismatch in ALTER TABLE EXCHANGE PARTITION"
*Cause:  The corresponding columns in the tables specified in the
         ALTER TABLE EXCHANGE PARTITION are of different type or size
*Action: Ensure that the two tables have the same number of columns
         with the same type and size.

Until I recreate the table or use DBMS_REDEFINITION.
Or until I do the same to the other side of the exchange operation:
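Continuing the sketch:

alter table t2 add (extra_col number);
alter table t2 drop column extra_col;

alter table t2 exchange partition for (1) with table t1;
-- both sides now carry the matching unused column, so the exchange succeeds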

3. Don’t use SCOPE when setting inmemory_size in a PDB, otherwise you get:
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter session set container = orcl;
Session altered.
SQL> alter system set inmemory_size=400M scope=spfile;
alter system set inmemory_size=400M scope=spfile
*
ERROR at line 1:
ORA-02096: specified initialization parameter is not modifiable with this option
SQL> alter system set inmemory_size=400M;
System altered.

4. If you try to set the PDB inmemory_size larger than the CDB’s then you get:
ORA-02097: parameter cannot be modified because specified value is invalid
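For example (illustrative figures, with a smaller inmemory_size set at CDB level):

SQL> alter system set inmemory_size=600M;
alter system set inmemory_size=600M
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid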

7. While we showed in point (4) that a single PDB’s inmemory_size cannot exceed the CDB’s, we can overallocate such “maximum quotas” across all PDBs – i.e. the sum of inmemory_size across all PDBs can exceed the CDB size (which prompts future investigation):

This particular example is based on something I needed to do in the real world: copying data from one database into another over a db link. Datapump is not available to me. The tables in question happen to be partitioned by a date-like number (boo!), hence some of the specific actions in the detail.

I think it’s a good example of how to use DBMS_PARALLEL_EXECUTE, but it might also be interesting to see how we can combine that functionality with parallel sessions each operating on a single numeric partition.

For setup, let’s create a suitable source table on a remote db.
In this example, I’m recreating the entries in dba_objects for every day for a couple of years.
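Something like this sketch (my own illustrative names; the real tables were bigger and uglier):

create table t_source
partition by range (dt_key) interval (1)
( partition p_low values less than (20140101) )
as
select to_number(to_char(trunc(sysdate) - (d.rn - 1),'YYYYMMDD')) dt_key
,      o.owner, o.object_name, o.object_type, o.object_id, o.created
from   dba_objects o
,      (select rownum rn from dual connect by rownum <= 730) d;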

Now, let’s see how long it takes to do an INSERT SELECT over a db link.
Time is often not a good measure but in this case I’m primarily interested in how long it takes to copy a whole bunch of tables from A to B over a db link.
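The baseline is then just this (db link name illustrative):

insert /*+ append */ into t_target
select * from t_source@remote_db;

commit;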

As mentioned, you could parallelise bits of it on either side, but the benefit is limited; it might even make things worse thanks to the BUFFER SORT operation.

Next, let’s compare that to a method using DBMS_PARALLEL_EXECUTE.

We want some parallel threads to work on independent partitions, doing direct path inserts, concurrently.

First I’m just going to create a view on the SOURCE DB to make my chunking on daily partition interval simpler.

I could create this on the TARGET DB with references to the dictionary tables over db link but it could be significantly slower depending on the number of partitioned tables and whether predicates are being pushed.
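Pulling it together, the shape of the approach is roughly this sketch (v_daily_chunks is my illustrative chunking view returning start_id/end_id pairs per daily partition; I show a conventional insert per chunk for simplicity, whereas the real thing did direct path inserts into individual partitions):

declare
  l_task varchar2(30) := 'COPY_T_SOURCE';
begin
  dbms_parallel_execute.create_task(l_task);

  -- one chunk per daily partition key, driven by the chunking view
  dbms_parallel_execute.create_chunks_by_sql(
    task_name => l_task,
    sql_stmt  => 'select start_id, end_id from v_daily_chunks@remote_db',
    by_rowid  => false);

  -- each job session works one chunk, i.e. one daily partition at a time
  dbms_parallel_execute.run_task(
    task_name      => l_task,
    sql_stmt       => 'insert into t_target
                       select * from t_source@remote_db
                       where  dt_key between :start_id and :end_id',
    language_flag  => dbms_sql.native,
    parallel_level => 8);

  dbms_parallel_execute.drop_task(l_task);
end;
/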

Either you accept that the Oracle Optimizer has a wealth of complicated strategies and that, in this complex effort to get the best executions it can, it will inevitably get it wrong sometimes (or take some extra executions to realise it’s wrong).

Or you stick your head in the sand and raise a bug for every unexpected poor execution or plan flip.

But let’s say that above all else you wanted Plan Stability.

This is a widespread desire.

What would be the best strategy?
And to what lengths would you take it?

SQL Plan Management features are designed to give you some degree of stability.

You could baseline statements which have given you a problem in the past.

What if that is not deemed adequate?

So, on the one hand, you could baseline everything you could find.

Taking a long history of AWR and regular snaps of V$SQL, you could put it all into a SQL Plan Baseline.
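For the cursor cache side, that might look something like this sketch (blunt and indiscriminate, which is rather the point):

declare
  l_plans pls_integer;
begin
  -- load every plan currently in the cursor cache into the SQL plan baseline
  l_plans := dbms_spm.load_plans_from_cursor_cache(
               attribute_name  => 'SQL_TEXT',
               attribute_value => '%');
  dbms_output.put_line(l_plans || ' plans loaded');
end;
/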

But, depending on your application, you might regularly get new SQL appearing unprotected by a baseline.
In one Hibernate application I know, a change in the application domain model often results in the table aliases changing, meaning that there is lots of brand new, but very familiar, SQL.
So, you then become dependent on having a reliable testing infrastructure which will generate nearly all the SQL you’re likely to get, in advance of production.

In addition, you might want multiple plans in your baseline – for ACS, for example – and then, once that extra bindset awareness is flushed out of memory, you need a couple of executions to rediscover it (unless you use a SQL Patch to inject BIND_AWARE into the specific SQL statements).
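The SQL Patch route looks something like this sketch (using the DBMS_SQLDIAG.CREATE_SQL_PATCH interface documented from 12.2; on earlier versions the equivalent was the undocumented DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH, and the sql_id is illustrative):

declare
  l_patch varchar2(128);
begin
  l_patch := dbms_sqldiag.create_sql_patch(
               sql_id    => '7b2twsn8vgfsc',   -- illustrative sql_id
               hint_text => 'BIND_AWARE',
               name      => 'force_bind_aware_patch');
end;
/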

It’s certainly no guarantee of stability.

What is the alternative?

I like to argue that most problematic executions stem from two sources:

The other day I made a comment to someone that “if you’re heavily reliant on ACS, perhaps you should rethink whether you should be using literals rather than bind variables”.

So, you might then take the position that plan instability stems from increasing complexity in the optimizer.

In which case, maybe a viable strategy might be to turn off much of the complexity (see the sketch after this list):

Bind Variable Peeking

Cardinality feedback

Adaptive Cursor Sharing

Dynamic Sampling

Adaptive direct path reads

12c adaptive execution plans

Anything with the word “adaptive” in it?

Default Stats job and/or default method_opt histogram behaviour
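For illustration only, a sketch of the sort of switches involved (several are underscore parameters, so the usual Oracle Support approval applies, and the names and availability vary by version):

alter system set "_optim_peek_user_binds" = false;             -- bind variable peeking
alter system set "_optimizer_use_feedback" = false;            -- cardinality feedback
alter system set "_optimizer_adaptive_cursor_sharing" = false; -- adaptive cursor sharing
alter system set optimizer_dynamic_sampling = 0;               -- dynamic sampling
alter system set "_serial_direct_read" = never;                -- adaptive direct path reads
alter system set optimizer_adaptive_features = false;          -- 12c adaptive features (12.1)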

This seems quite radical to many. Perhaps justifiably.

Perhaps it would be nice if there were new optimizer strategies – OPTIMIZER_MODE? – perhaps “AGGRESSIVE_PERFORMANCE” (maybe “ADAPTIVE_PERFORMANCE_WITH_THE_ODD_HICCUP” ?) vs “PERFORMANCE_STABILITY” which were an umbrella of many of the features above.

Thoughts?

To what extent are you concerned with performance stability, above all else?

How many of the optimizer features in each release are actually aligned with that widespread concern?