
For anyone who enjoys using their SQL skills creatively, and wants a break from the boring SQL-coding daily routine ... here is a puzzle that is both entertaining and challenging, with a real prize for the winner!

Official abstract:
"In this challenge (see page 25), the Wicked Witch of the West needs help in creating a magic spell to ensure that the Third Annual Witching & Wizarding Ball is a grand success. The winner will receive the August Order of the Wooden Pretzel in keeping with the Steven Feuerstein’s observation that "some people can perform seeming miracles with straight SQL, but the statements end up looking like pretzels created by somebody who is experimenting with hallucinogens." There are currently four knights of the August Order of the Wooden Pretzel: Alberto Dell’Era (Italy) who won the first challenge in 2009 and Andre Araujo (Australia), Rob van Wijk (Netherlands), and Ilya Chuhnakov (Russia) who won the second challenge in 2011."

I remember having a lot of fun when I joined the first edition of this Challenge - I hope it will be the same for you :)

Tracing a session is extremely useful when you need to investigate how a client interacts with the database - the client could be an application of yours, a third-party application, or an Oracle module such as dbms_stats or dbms_mview. To get the perfect picture of the client-server dialogue, you "simply" need to consider all EXEC lines in the trace file, and associate with each line the executed statement and the bind variable values; a very tedious and error-prone task when done manually, which Xtrace can do for you (and for free).

Let's see the tool in action. Consider tracing a call to this stored procedure, which executes a recursive SQL statement:

create or replace procedure test_proc( p_x int )
is
begin
  for i in 1..p_x loop
    for k in (select count(*) from t where x > i) loop
      null;
    end loop;
  end loop;
end;
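
For reference, one way to produce such a trace file from SQL*Plus is a sketch like this (using dbms_monitor, available in 10g+, and assuming the table t exists):

-- trace our own session with binds enabled, call the SP, then stop tracing
exec dbms_monitor.session_trace_enable( waits => true, binds => true )
exec test_proc( 2 )
exec dbms_monitor.session_trace_disable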

Here is the output of Xtrace:
Reading it bottom-up, you can see that the client called the SP, which in turn executed the SQL statement twice, recursively (note the indentation).

You can also ask Xtrace to display the bind variable values used for each execution:
So - the client passed the value "2" for :p_x to the SP, which in turn executed the SQL statement passing first "1" for :B1, and then "2".

Interested? Try it live (requires Java Web Start):

When Xtrace opens up, press the "options" button and then the "EXEC FLOW analysis" button. Enable/disable the bind variable values using the "display BINDS under EXEC" checkbox; color the statements as you like.

We introduced Xtrace in this post; the Xtrace home page contains the tool (which can be used online or downloaded) - and a manual for advanced uses.

Xtrace is a graphical tool that can navigate Oracle trace files, manipulate them, and optionally write them back as a text file. It actually does (much) more, but in this first post we are going to focus on its basic browsing capabilities.

Let’s see the tool in action on the trace file produced by this simple PL/SQL block:
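
For illustration, an anonymous block of this flavour (a sketch, assuming a table t similar to the one above) produces this kind of trace:

declare
  l_cnt int;
begin
  for i in 1..3 loop
    select count(*) into l_cnt from t where x > i;
  end loop;
end;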

Even for this artificially simple trace file, it takes a lot of effort to read and understand it; for example, it takes a while to associate the recursive SQL lines with the execution of the PL/SQL block (the “EXEC #26” line).

With Xtrace, the trace reading experience is remarkably better:

Note the indentation by recursive level (which is provided out-of-the-box) and the coloring of the lines by statement (which takes perhaps a minute to set up).
If you are interested, you can try this example live by pressing the “Launch” button above; in particular, try the “Options” button of the middle pane, and the “set color” popup menus of the top pane.

Suggestion: check also the hyperlinks that link the lines together; for example, the xct pointer that links the recursive SQL calls to the parent “EXEC #26” (check the interactive manual for more information).

A devastating performance degradation of materialized view fast refreshes can happen in versions after 9i - and can be healed rather easily by simply setting the hidden parameter _mv_refresh_use_stats or, a bit surprisingly, by locking statistics on the logs. The problem can manifest at least in the currently-latest patchsets of 10g, 11gR1 and 11gR2 (10.2.0.4, 11.1.0.7 and 11.2.0.1), seems to hit a lot of people, and its root cause is the injection of inappropriate hints by the Oracle refresh engine.

We will investigate the join-only MV case only, since this is the case I looked into after a question by Christo Kutrovsky, factoring in some observations by Taral Desai and some Support notes; I have some clues that something similar may happen for other types of MVs. The scenario is a very common one:

1 - two big base tables joined together by the MV;
2 - only a small fraction of rows modified (actually one deleted, two updated, one inserted);
3 - all tables and indexes with fresh statistics collected;
4 - MV logs with no statistics collected AND with statistics not locked;
5 - indexes present on the joined columns;
6 - indexes present on the rowid columns of the MV.

Points 1 and 2 make for the ideal scenario for incremental ("fast") refreshes to be effective; 3 is very common as well, since you normally have many other statements issued on the tables; the relevance of 4 will be clear later, but it happens very often in real life, since people might perhaps consider collecting stats on the logs, but locking those statistics is rarely done, at least in my experience.

To understand the importance of points 5 and 6, please check this post of mine; note how those indexes are a necessary prerequisite for the sanity of the DEL and INS steps of the MV refresh process. Without them, the refresh cannot be incremental, since it has no physical way to read and propagate only the modified rows and those related to them; it must instead scan (uselessly) most of the base tables and the MV. But in order for the refresh to be incremental ("fast"), those indexes have to be actually used...
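
For instance (a sketch; the index and join-column names are illustrative, while test_t1_rowid matches the refresh statements discussed below):

create index test_t2_j_i on test_t2 (j);                 -- on a joined column of a base table
create index test_mv_t1rid_i on test_mv (test_t1_rowid); -- on a rowid column of the MV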

the issue

Let's illustrate the issue focusing on the DEL step (the easier one to discuss). In the above-mentioned post, we have seen that the DEL step uses a single SQL statement whose text, leaving out minor technical details and hints, has essentially this shape (a sketch; test_mv is an illustrative name for the MV):
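
delete from test_mv
where test_t1_rowid in
      ( select chartorowid(log.m_row$$)
          from mlog$_test_t1 log
         where snaptime$$ > :1 );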

That is: get the rowids of all modified rows from the log, and use the rowid-based index to delete their "old image" from the MV (inserting their "new image" is the job of the INS step). This is truly incremental, since the resource usage and elapsed time are proportional to the number of rows logged in the MV log, not to the size of the tables.

Oops, the indexes are not used ... hence the DEL step overhead is proportional to the size of the MV, and that can be definitely unacceptable.

That is due to the engine injecting a HASH_SJ hint in the outermost nested subquery:

... WHERE "TEST_T1_ROWID" IN (SELECT /*+ NO_MERGE HASH_SJ */ ...

This is recognized as a bug in many scenarios (start from Oracle Support note 578720.1 and follow the references to explore some of them), even if I have not found a clear and exhaustive note documenting the behaviour.

remedy one: set "_mv_refresh_use_stats"

To get back to the healthy plan, simply set "_mv_refresh_use_stats" to "true" (asking Oracle Support for permission first, of course); this makes for a set of hints much more adequate for a fast refresh:

... WHERE "TEST_T1_ROWID" IN (SELECT /*+ NO_MERGE NO_SEMIJOIN */ ...

Note: the root cause of this bug is probably a change hinted at in note 875532.1 - in 10.2.0.3 the meaning of _mv_refresh_use_stats was reversed, but not its default, hence (by mistake?) activating a different piece of the engine code.

The very same problem happens for the INS step; I won't go into much detail here (please check the test case spools provided above if interested), but in 9.2.0.8 the modified base table rows are fetched directly using the rowids contained in the log, whereas in 10g+ the statement visits the whole base table instead - whose resource consumption is, of course, proportional to the size of the base table.

Even in this case, this is due to the nasty HASH_SJ hint:

... FROM "TEST_T1" "MAS$" WHERE ROWID IN (SELECT /*+ HASH_SJ */ ...

If you set _mv_refresh_use_stats, you get back the 9.2.0.8 plan - and thus you are back to incremental for both the DEL and INS steps. As a side note, a cardinality hint is used, where the cardinality is set to the correct value (6 in my test case):
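
Schematically, following the fragments above (my reconstruction of the shape):

... WHERE "TEST_T1_ROWID" IN (SELECT /*+ NO_MERGE NO_SEMIJOIN CARDINALITY(MAS$ 6) */ ...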

remedy two: lock statistics on the MV logs

Very interestingly, instead of setting the hidden parameter, there is another way to get back to the healthy plan: gather statistics on the MV logs when they are empty AND lock them (as suggested in note 578720.1, albeit not for this exact scenario, and even though setting the parameter is then not necessary; thanks to Taral Desai for pointing me to the note). In this case, no hint at all is injected besides a NO_MERGE for the DEL step - in the style of the fragments above:
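
... WHERE "TEST_T1_ROWID" IN (SELECT /*+ NO_MERGE */ ...

Gathering and locking the statistics is straightforward (a sketch, to be run when the log is empty):

exec dbms_stats.gather_table_stats( user, 'MLOG$_TEST_T1' )
exec dbms_stats.lock_table_stats( user, 'MLOG$_TEST_T1' )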

So, the engine is confident that the CBO will come up with a good plan, and it does not inject any "intelligent" hint. Possibly, and intriguingly, this is because by locking the statistics, I am assuring the engine that those statistics are representative of the data at any time. So, locking the statistics is not meant only as a way to prevent dbms_stats from changing them ... it is deeper than that. At least in this case, you are taking responsibility for them, and Oracle will take that into consideration.

This post investigates the improvements that have been made in 11gR2 to the fast refresh engine of materialized views (MVs) that are set to be automatically refreshed at commit time. We speak about join-only materialized views only in this post, and as always, we proceed with the help of a test case.

As noted in my post "11gR2: materialized view logs changes", in 11gR2 a new column, xid$$, is now part of materialized view logs; this column records the id of the transaction that logged the changes to the base table on which the log is defined. It is important to stress that this column is added regardless of the type of the MV log, that is, to both the brand-new "commit SCN-based" logs and the old-fashioned "timestamp-based" ones. That means that both types of MV logs can take advantage of the new improvements - albeit I haven't tested whether MV logs migrated from a previous version are automatically upgraded by the migration scripts and get the new xid$$ column added.

algorithm before 11gR2

In versions before 11gR2, the refresh algorithm for on-commit MVs was the same as the one for on-demand ones, with only minor variants. That is, the algorithm was almost completely the same, just triggered by the commit event instead of by the user.

For an in-depth analysis of the algorithm, I will refer the reader to the discussion about the on-demand algorithm in the post "fast refresh of join-only materialized views - algorithm summary"; in passing, the test case for this post is in fact the very same three-table join MV, just redefined as "on commit" instead of "on demand". To recap, the "old" algorithm (until 11.1.0.7) was:

1) new log rows are inserted with snaptime$$=4000 A.D;
2) at refresh time (commit time), a snapshot of the new rows is taken, that is, all new rows are marked with snaptime$$ = "commit time", using a statement essentially of this shape (the far-future literal catches the rows still marked with the 4000 A.D. sentinel):
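
update mlog$_test_t1
set snaptime$$ = :1
where snaptime$$ > to_date('2100-01-01', 'yyyy-mm-dd')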

3) all modifications whose snaptime$$ is between the date of the last refresh (excluded) and the commit date (included) are propagated to the MV. The propagation consists of two steps.
First a DEL step, essentially (same sketch conventions as before):
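
delete from test_mv
where test_t1_rowid in
      ( select chartorowid(log.m_row$$)
          from mlog$_test_t1 log
         where snaptime$$ > :1
           and snaptime$$ <= :2 );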

Note that the only small difference from the on-demand case is the absence of the "as of snapshot" clause, but the statements are otherwise identical. Note also that the rows in the MV log are identified in both statements by snaptime$$, using a subquery of this shape:
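
select chartorowid(log.m_row$$)
from mlog$_test_t1
where snaptime$$ > :1
  and snaptime$$ <= :2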

4) all obsolete log rows are deleted, that is, all rows whose snaptime$$ is less than or equal to the lowest of all refresh times are removed from the log, using the statement

delete from mlog$_test_t1
where snaptime$$ <= :1

algorithm starting from 11gR2

In 11gR2, the on-commit algorithm is still almost the same as the on-demand one; the "only" change is how modified rows to be propagated are identified, and in general, how logs are managed. Not surprisingly, log rows are now directly identified by the transaction id, which is logged in xid$$. In detail:

1) new log rows are inserted with xid$$ = transaction id;
2) at refresh time (commit time), no snapshot is taken, that is, the MV log is not updated at all;
3) all modifications made by the committing transaction are propagated to the MV, still using the same two steps.
The DEL step is now, schematically (same sketch conventions as before):
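
delete from test_mv
where test_t1_rowid in
      ( select chartorowid(log.m_row$$)
          from mlog$_test_t1 log
         where xid$$ = :1 );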

Hence, the big difference from previous versions is that the rows in the MV log are now identified very simply by the transaction that logged them (the committing transaction, of course), using the subquery

select chartorowid(log.m_row$$)
from mlog$_test_t1
where xid$$ = :1

4) all obsolete log rows are deleted, that is, the rows logged by the committing transaction are removed from the log, using the statement

delete from mlog$_test_t1
where xid$$ = :1

The new algorithm is certainly much simpler and more elegant. Performance is improved, since the snapshot step has been removed, while the other steps are more or less as expensive as before.

practical implications: an example

I strongly believe that studying the internals is the best way to learn how to make the best use of any feature. Let's see an example of how the few bits of "internal knowledge" I shared here can be used in practice - that is, how a little investment in investigation makes for huge savings in effort afterwards, and huge gains in effectiveness of your work as well.

It is well-known that it can sometimes be beneficial, in pre-11gR2, to place an index on the log (indexing the log is even suggested by support note 258252 "MATERIALIZED VIEW REFRESH: Locking, Performance, Monitoring"). The scenario that benefits the most from such an index is when the log is composed of mostly-empty blocks, and hence an index access is preferable over a full table (log) scan; you get mostly-empty blocks, for example, when there are peaks in activity on the master tables that keep the log High Water Mark very high.

From the above discussion, it is obvious that in pre-11gR2, the best index for join-only MVs was on (snaptime$$, m_row$$) - not on snaptime$$ alone, as is sometimes suggested - to make the refresh operation an index-only one.

Starting from 11gR2, the best index is now on (xid$$, m_row$$). Not only that, but having no snapshot step, and hence no update on the index, makes the indexing option even more attractive.
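
In DDL terms (a sketch; the index names are illustrative):

-- pre-11gR2:
create index mlog$_test_t1_i1 on mlog$_test_t1 (snaptime$$, m_row$$);
-- 11gR2 onwards:
create index mlog$_test_t1_i2 on mlog$_test_t1 (xid$$, m_row$$);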

Could you see these implications so easily, without knowing the internals? I couldn't.

In this post we are going to discuss some 11gR2 changes to materialized view logs that are aimed at increasing the performance of the fast-refresh engine of materialized views (MVs), especially the on-commit variant.

In 11gR2, MV logs come in two flavours: the traditional (and still the default) timestamp-based one, and the brand-new commit SCN-based one; you choose the latter type by specifying the "WITH COMMIT SCN" clause at MV log creation time. Interestingly, the "old" timestamp-based implementation has been changed as well. Let's examine both with the help, as usual, of a test case.

Timestamp-based MV logs (the "old" type)

The test case configures an MV log as "log everything", that is, it activates all the logging options, with DDL essentially of this shape (a sketch):
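
create materialized view log on test_t1
with rowid, primary key, sequence
including new values;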

the only difference is the new column xid$$ (transaction id), which uniquely identifies the transaction that made the changes to the row. For the curious, the number is a combination of the elements of the triplet (undo segment number, undo slot, undo sequence); it is simply the binary concatenation of the three numbers shifted left by 48, 32 and 0 bits respectively (as checked in the script).
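
For example, the xid of the current transaction can be computed with a sketch like this (v$transaction exposes the triplet as xidusn, xidslot and xidsqn):

select xidusn * power(2, 48) + xidslot * power(2, 32) + xidsqn as xid
from v$transaction;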

The xid$$ column is used by the 11gR2 on-commit fast refresh engine, which can now easily retrieve the changes made by the just-committed transaction by its xid; the on-demand fast refresh engine, on the contrary, keeps using snaptime$$ as it did in previous versions. I will speak about this in more detail in an upcoming post.

Commit SCN-based MV logs (the "new" type in 11gR2)

Let's recreate the same MV log, this time adding the commit SCN clause (new in 11gR2) - essentially:
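
create materialized view log on test_t1
with rowid, primary key, sequence, commit scn
including new values;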

so, the only difference from the 11gR2 timestamp-based case is that snaptime$$ is no longer a column of the MV log; the only difference from the pre-11gR2 one is that snaptime$$ has been replaced by xid$$.

For this log flavour only, the mapping between the xid that modified the table and its commit-time SCN is now tracked in a new view, all_summap (probably named after "SUMmary MAP", "summary" being yet another synonym for "MV"), which is (as of 11.2.0.1) a straight "select *" from the dictionary table sys.snap_xcmt$. To illustrate, the script makes one insert, one update and one delete on the base table, which translate into 4 rows inside the MV log with the same xid:

hence, it is now possible to know the infinite-precision time (the SCN) when every modification became visible to an external observer (the commit SCN) by simply joining the MV log and all_summap (or sys.snap_xcmt$). Note that the commit SCN is not propagated to the MV log at all.
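
For example, a join of this shape would do (a sketch; I am assuming the xid and commit_scn column names of all_summap):

select log.m_row$$, log.dmltype$$, m.commit_scn
from mlog$_test_t1 log, all_summap m
where log.xid$$ = m.xid;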

commit SCN-based MV logs for on-demand fast refresh

This new xid$$ column and commit-SCN mapping table are leveraged by the fast refresh of on-demand MVs as follows (on-commit ones do not need the SCN as they know exactly the xid of the committed transaction; again we will see that in an upcoming post).

With "old style" timestamp-based MV logs, the refresh is performed by using a "mark-and-propagate" algorithm, which is essentially (check this post for some additional details):
1) new log rows are inserted with snaptime$$=4000 A.D;
2) at refresh time, a snapshot of the new rows is taken, that is, all new rows are marked with snaptime$$=sysdate;
3) all modifications whose snaptime$$ is between the date of the last refresh (excluded) and sysdate (included) are propagated to the MV;
4) all obsolete log rows are deleted, that is, all rows whose snaptime$$ is less than or equal to the lowest of all refresh times are removed from the log.

With "new style" SCN-based MV logs, the algorithm is, instead:
1) new log rows are inserted with xid$$ = transaction id of the modifying transaction;
2) at refresh time, the current SCN is retrieved (no snapshot is performed);
3) all modifications whose xid maps to a row in all_summap whose commit_scn is between the SCN of the last refresh (excluded) and the retrieved current SCN (included) are propagated to the MV;
4) obsolete rows are removed from the log as before, this time using the SCN instead of snaptime$$.

The main advantage is that the snapshot is not performed, thus removing the redo and undo generated by the update, and obviously the log visit (usually a full table scan) as well - at the cost of an additional join with all_summap (or sys.snap_xcmt$) later; if the join is calculated efficiently, that is very likely advantageous "in general" (but as always, it depends on your scenario).

It might be (rarely) beneficial to index xid$$, as it is (rarely) beneficial to index snaptime$$. In that case, having no snapshot performed reduces both the undo and redo generated for the index maintenance.

As a side, "philosophical" note: the new logging mechanism records more information - now we know which transactions modified the table, and the infinite-precision time (the SCN) of the modifications; this is much more informative about the history of the logged table than the mostly meaningless refresh time contained in snaptime$$. This is definitely a better utilization of storage.

I plan to blog in the near future about how the new MV logs impact fast refreshes in 11gR2, focusing on join-only MVs; so stay tuned if you're interested.