Monday, April 27, 2015

This is just a heads-up for those thinking about using Edition Based Redefinition (EBR) and enabling it on an existing schema with objects. Although EBR isn't exactly a new feature its current adoption level is probably not that high (which probably changes in future as Oracle E-Business Suite uses EBR now as part of their default upgrade procedure as far as I understood).
I was recently contacted by someone who enabled EBR on an existing schema using ALTER USER ... ENABLE EDITIONS and had to use the "FORCE" option since there were (according to the official ALTER USER documentation) "objects that are not editionable and that depend on editionable type objects in the schema. ... In this case, all the objects that are not editionable and that depend on the editionable type objects in the schema being editions-enabled become invalid".
Although one could say it is clearly mentioned in the documentation, the consequences are probably not that obvious to everyone if those non-editionable objects are tables having columns based on user-defined types. So I state it here to make it hopefully clear enough:
If you use the FORCE option of ALTER USER ... ENABLE EDITIONS to enable editions on an existing schema already containing objects and among those objects are tables having columns based on user-defined types, then effectively those tables will become invalid and stay invalid. There is no officially documented way to reverse this step or compile tables to become valid again (there's no ALTER TABLE COMPILE or similar). The table cannot be accessed any longer and all data contained is hence officially gone, too.
This means the affected tables need to be restored from a (hopefully existing and usable) backup (I don't know if there's a dictionary hack available that is officially sanctioned by Oracle support to make the table valid again).
In my opinion the FORCE option should check if the schema contains such dependent tables and in such a case error out with an error message that the table needs to be dropped first (or moved to a different schema) before ENABLE EDITIONS can succeed. This would make the situation much clearer, rather than leaving tables in INVALID state behind that cannot be fixed/recompiled afterwards.
Below is a simple test case that demonstrates the issue:

Monday, April 20, 2015

In general it is known that Function-Based Indexes (FBIs) can no longer be used by the optimizer if the expression contains literals and CURSOR_SHARING = FORCE / SIMILAR (deprecated) turns those literals into bind variables. Jonathan Lewis described the issue quite a while ago here in detail.
In a recent OTN thread this issue was raised again, but to my surprise when I played around with a test case that mimicked the OP's problem query I found that (certain) Oracle versions have some built-in logic that enable FBI usage for certain cases where you would expect them to be not usable.
If you test the following code on versions from 10.2.0.4 (possibly earlier) up to and including version 11.2.0.3 then you'll notice some interesting details:

Looking at the statement text that results from "CURSOR_SHARING = force" we can spot the expected bind variables instead of the literals, and this should result in a corresponding predicate that doesn't match the FBI expression. However, when looking at the filter expression in the predicate section (when forcing a full table scan) we can spot something interesting: It still shows the literal, which doesn't correspond to the predicate of the rewritten query text.
The next execution shows that the FBI really can be used despite the bind variable replacement taking place, and the final execution shows that the cursor sharing works correctly in that sense that a new child cursor got created for the same SQL text with a different plan and different predicate section when using a different literal in the original SQL text. V$SQL_SHARED_CURSOR shows "HASH_MATCH_FAILED" which is described as "No existing child cursors have the unsafe literal bind hash values required by the current cursor", which makes sense and probably means that the corresponding bind variable is marked as "unsafe" internally.
This optimisation shows only up if there is a suitable FBI - if there's no corresponding expression the SQL text and predicate section match. Furthermore it only supports certain expressions - Jonathan's example shows that in general it's true that these rewrites prevent FBI usage. And obviously it ceases to work in 11.2.0.4 and 12c. Whether this is a bug or a feature I don't know, but since it only seems to apply to certain expressions it's probably not that relevant anyway.
As Jonathan points out in his note you can always work around the general problem by hiding the expression in a view, and since 11g of course a proper virtual column definition is the better approach, which doesn't expose this problem either.
Even better would be the proper usage of bind variables and not using forced cursor sharing, but there are still many installations out there that rely on that feature.

How long would you expect it to run to return all rows (no tricks like expensive regular expressions or user-defined PL/SQL functions)?
Probably should take just a blink, given the tiny tables with just 10000 rows each.
However, these are the runtime statistics for a corresponding execution:

How is it possible to burn more than 20 seconds of CPU time with that execution plan?
The actual rows produced correspond pretty much to the estimated cardinalities (except for the final hash join), so that doesn't look suspect at first glance.
What becomes obvious from the SQL Monitoring output is that all the time is spent on the hash join operation ID = 1.
Of course at that point (at the latest) you should tell me off for not having you shown the predicate section of the plan and the corresponding query in first place.
So here is the predicate section and the corresponding query:

There are two important aspects to this query and the plan: First, the join expression (without corresponding expression statistics) between T1 and T2 is sufficiently deceptive to hide from the optimizer that in fact this produces a cartesian product (mimicking real life multi table join expressions that lead to bad estimates) and second, the table T3 is joined to both T1 and an expression based on T1 and T2, which means that this expression can only be evaluated after the join to T1 and T2.
With the execution plan shape enforced via my hints (but could be a real life execution plan shape preferred by the optimizer) T3 and T1 are joined first, producing an innocent 10K rows row source, which is then joined to T2. And here the accident happens inside the hash join operation:
If you look closely at the predicate section you'll notice that the hash join operation has both, an ACCESS operation and a FILTER operation. The ACCESS operation performs based on the join between T1 and T2 a lookup into the hash table, which happens to be a cartesian product, so produces 10K times 10K rows, and only afterwards the FILTER (representing the T3 to T1/T2 join expression) is applied to these 100M rows, but matching only a single row in my example here, which is what the A-Rows shows for this operation.
So the point is that this excessive work and FILTER throwaway isn't very well represented in the row source statistics. Ideally you would need one of the following two modifications to get a better picture of what is going on:
Either the FILTER operator should be a separate step in the plan, which in theory would then look like this:

Which would make the excess rows produced by the ACCESS part of the hash join very obvious, but is probably for performance reasons not a good solution, because then the data would have to flow from one operation to another one rather than being processed within the HASH JOIN operator, which means increased overhead.
Or an additional rowsource statistics should be made available:

Which I called here "Actually evaluated rows" and in addition to this case here of combined ACCESS and FILTER operations could also be helpful for other FILTER cases, for example even for simple full table scan to see how many rows were evaluated, and not only how many rows matched a possible filter (what A-Rows currently shows).
In a recent OTN thread this topic came up again, and since I also came across this phenomenon a couple of times recently I thought to put this note together. Note that Martin Preiss has submitted a corresponding database idea on the OTN forum.
Expanding on this idea a bit further, it could be useful to have an additional "Estimated evaluated rows (EE-Rows)" calculated by the optimizer and shown in the plan. This could also be used to improve the optimizer's cost model for such cases, because at present it looks like the optimizer doesn't consider additional FILTER predicates on top of ACCESS predicates when calculating the CPU cost of operations like HASH JOINs.
Note that this problem isn't specific to HASH JOIN operations, you can get similar effects with other join methods, like NESTED LOOP joins, or even simple INDEX lookup operations, where again the ACCESS part isn't very selective but only the FILTER applied afterwards filters matching rows.
Here are some examples with the given setup:

The former one exhibits exactly the same problem as the HASH JOIN example, only that the FILTER is evaluated in the inner row source of a NESTED LOOP join after the index access operation.
The latter one shows as variation the classic partial "index access" due to a range comparison in between - although the entire expression can be evaluated on index level, the access part matches every index entry, so the range scan actually needs to walk the entire index at each loop iteration and the FILTER is then applied to all the index values evaluated.