Tuesday, May 26, 2009

User objects created in the SYS schema and the (cost based) optimizer

The answer - as already disclosed by Nicolas Gasparotto - to the question that I asked here in the "Weekend Quiz" is to run the script as SYS user, and then run the query shown against these objects in the SYS schema (tested against 10g XE, 10.2.0.4 and 11.1.0.7 on Win32).

Note: It's not recommended to create any non-SYS objects in the SYS schema and you should only perform this (if at all) in a test database.

All this came up in this recent OTN forum thread where it became obvious that the issue can only be reproduced if the objects are owned by SYS.

There are two interesting points to derive from this (apart from the obvious that one should not create any user objects in the SYS schema):

1. The optimizer seems to treat objects owned by SYS differently, in particular regarding the transformations applied. Note that the crucial point is not that the query is executed as SYS user, but that the objects are owned by the SYS user. Granting appropriate privileges to a non-SYS user on the objects owned by SYS allows to reproduce the issue even with a non-SYS user.

2. It's something to remind if there is the need to understand a performance issue with a recursive dictionary query performed on SYS-owned objects. Although you obviously can't influence the SQL generated by Oracle itself it might help to understand the issue and take appropriate steps to rectify the issue.

Oh, by the way, have I already mentioned that it's really a bad idea to create user objects in the SYS schema?

3 comments:

I spent nearly all day yesterday trying to figure out answer to your quiz but I could not (I resisted the temptation to read comments...:)). Now, I am curious to know how this plan change (Nested loop Semi to Hash join Semi and index access to materialized table access of TEST2) can be accomplished by say, rewriting the query (without changing its meaning) or changing any other settings? It is assumed that tables are not created in SYS schema. I would like to understand on what basis CBO takes the decision to use Hash Join Semi over Nested Loop Semi, especially in 10g and onwards. Would you mind throwing some light?

I haven't had the time to investigate the details of the plan when the objects reside in the SYS schema, but one obvious issue is that the view merging doesn't take place as it does when the objects are normal user objects.

I can only speculate about the reasons for this.

Usually the decision between the NESTED LOOP and HASH JOIN operation are simply cost driven, and in the case of this particular plan the NESTED LOOP operation is using the cost of the inner view which is simply the cost of a full table scan on TEST2 and therefore running twice the full table scan is more costly than performing the hash join.

So this issue is apparently driven by the fact that the transformations applied are not equal to the "normal" transformations therefore the costs and options evaluated are quite different.

IF you want to mimic the SYS plan, you can get pretty close to it when using the transformed statement that is shown in the 10053 optimizer trace file along with the outline hints: