Sunday, March 20, 2011

This is just a short heads-up for those that come across an execution plan showing the PX COORDINATOR FORCED SERIAL operation. I don't have official confirmation but according to my tests a plan with such an operation effectively means: Cost for parallel execution but execute serially (You might remember that I've recently mentioned in another post that there is the possibility to have a plan executed in parallel but costed serially, weird isn't it). Why such an operation exists is not clear to me - obviously it would make much more sense to cost straight away for serial execution in such a case. Probably there is a good reason, otherwise such an operation didn't exist but I think at least the costing is questionable in current versions.

So the message effectively is this: Review the execution plans in such cases - due to the cost reduction introduced by parallel execution it might be favoured by the optimizer over more efficient (serial) execution plans that would be chosen if the costing was based on serial execution.

So far I've only be able to get this operation in conjunction with user-defined functions that are not enabled for parallel execution, therefore parallel execution with such functions is questionable anyway from a performance perspective, however I don't know if there are other scenarios where this operation gets used. Furthermore it is not entirely clear to me under which circumstances the optimizer resorts to that operation - slight modifications to the SQL sometimes showed that the operation was replaced by the normal PX COORDINATOR operation without making entirely obvious why it was possible to use normal parallel execution in those cases.

Here is a simple test case that shows that due to the cost reduction of the (alleged) parallel execution the optimizer favours a full table scan although the serial cost of the same plan is higher than an index-based plan. Since the actual execution is serial it should have been using the index-based plan.

set doc offdoc------------------------------------------------------------------------------------ File name: px_parallel_force_serial_testcase.sql---- Purpose: Simple testcase for a "PX COORDINATOR FORCED SERIAL" operation-- It looks like in the versions mentioned below-- this operation means "cost for parallel" but "execute serial"-- The test case shows that this effectively allows the optimizer-- to reject plans that would be chosen if the plan was costed-- serially. Here the plan based on INDEX RANGE SCAN costs more-- than a degree 4 parallel full table scan but costs less than-- a serial full table scan.---- Author: Randolf Geist http://oracle-randolf.blogspot.com---- Last tested: March 2011---- Versions: 10.2.0.4-- 10.2.0.5-- 11.1.0.7-- 11.2.0.1-- 11.2.0.2--------------------------------------------------------------------------------#

set echo on timing on linesize 200 pagesize 999 long 1000000 feedback 1 tab off autotrace off trimspool on

-- This query will be executed serially, but obviously costed like being executed in parallel-- Why is it not costed serially if the optimizer knows that it is going to be executed serially?--explain plan forselect /*+ monitor */ distinct pk_par_test.f_par_test_serial(filler) as inst_sidfrom twhere id between 1 and 200;

Elapsed: 00:00:00.04SQL>SQL> -- This query will be executed serially, but obviously costed like being executed in parallelSQL> -- Why is it not costed serially if the optimizer knows that it is going to be executed serially?SQL> --explain plan forSQL> select /*+ monitor */ distinct pk_par_test.f_par_test_serial(filler) as inst_sid 2 from t 3 where id between 1 and 200;

Elapsed: 00:00:00.00SQL>SQL> -- Uses the index, because the indexed plan is cheaper than a serial full table scanSQL> select /*+ monitor */ distinct pk_par_test.f_par_test_serial(filler) as inst_sid 2 from t 3 where id between 1 and 200;

Friday, March 11, 2011

Continuing from the previous installment of this series I'll cover in this post some of the inevitable classics regarding Parallel Execution Control. So forgive me if you're bored by the repetition of known facts - however I still see these things too often used incorrectly, therefore I decided: This is worth to mention and remember!

- Up to and including version 10.2 PARALLEL (without any parameters) and NOLOGGING are valid keywords only in DDL commands

- Applies to all versions: NOLOGGING can not be used as a hint. It can only be specified as part of DDL, for example ALTER INDEX ... REBUILD PARALLEL NOLOGGING.The only DML operation that can be performed as NOLOGGING operation is a direct-path insert (that can be enabled for parallel DML). For that to work you need to set the NOLOGGING attribute on table / partition level independently from the direct-path insert, in can not be specified as part of the direct-path insert DML command.

- Applies to all versions: For NOLOGGING DDL operations the NOLOGGING keyword needs to be be used at the correct position in the DDL statement:

CREATE TABLE T PARALLEL NOLOGGING AS SELECT ...

A mistake often seen, because it doesn't raise an error in many cases:

CREATE TABLE T PARALLEL AS SELECT ... FROM TAB NOLOGGING

Here NOLOGGING is merely an alias for TAB therefore syntactically correct but not a NOLOGGING operation

Note that most production systems deemed to hold critical data nowadays run in FORCE LOGGING mode anyway (for example due to standby database(s) requiring the full generation of redo), so you probably don't want to bother too much with NOLOGGING anyway.

- Up to and including version 10.2: If you want to use PARALLEL as a hint as part of query / DML statements you need to specify an alias/object_name as parameter, otherwise the hint is invalid. So this is invalid and therefore does nothing:

SELECT /*+ PARALLEL */ ... FROM TABLE T

This is a valid PARALLEL hint for alias T:

SELECT /*+ PARALLEL(T) */ ... FROM TABLE T

You can optionally specify the parallel degree after the alias: PARALLEL(T 16) or PARALLEL(T DEFAULT). If you don't specify a degree DEFAULT will be used which is PARALLEL_THREADS_PER_CPU * CPU_COUNT * INSTANCE_COUNT (not considering additional rules like adaptive multi-user limitations, resource manager plans etc.) and not the (parallel) degree specified on object level (that gets used without any explicit PARALLEL hint).

- Starting with version 11.1 (undocumented) and 11.2 (documented): Release 11g supports now so-called "statement-level" PARALLEL (and NO_PARALLEL) hints, so above invalid syntax IS actually valid from version 11g on which might lead to some interesting side-effects for those that used the hint incorrectly so far and migrate to 11g.

The following rules for statement level PARALLEL hints seem to apply:

- /*+ PARALLEL */ seems to effectively mean PARALLEL(AUTO) (see below) but use at least a degree of 2 for costing and do not consider serial costs for operations that can be run in parallel. The description in the manual seems to be wrong that says that the statement is always executed in parallel. The usual rule applies that a serial plan with a lower cost will be favored if available

- /*+ PARALLEL(DEFAULT) is mentioned in the manual but seems not to be valid in my tests

- /*+ PARALLEL(<degree>) */ Use the defined DOP, where <degree> is integer

- From 11.2 on: /*+ PARALLEL(MANUAL) */ Use the degree defined on object level and therefore seems to be a bit redundant - why use a hint that results in the same as not using the hint (apart from the DISABLE PARALLEL QUERY / PARALLEL hint quirk mentioned here)

- From 11.2 on: /*+ PARALLEL(AUTO) */ Use the auto DOP policy introduced in 11.2 including serial costing of operations that can be run in parallel (the only difference to the "statement level" PARALLEL hint without a parameter I can see)

By the way, the NO_PARALLEL hint can now also be used on "statement level" and seems to effectively prevent PARALLEL and PARALLEL_INDEX costing. It seems to correspond to the (undocumented) hint OPT_PARAM('parallel_execution_enabled', 'false'). By undocumented I mean that this parameter to the recently documented OPT_PARAM hint is not documented officially.

Sunday, March 6, 2011

7th April: I'll be doing a one-day seminar in Switzerland on performance troubleshooting. This is loosely based on the Chapters 8 and 9 of the "Expert Oracle Practices" book that I have co-authored. If you want to get an idea of some of the details covered you can have a look at the material of my corresponding presentation. Of course the seminar will go much deeper than the presentation, since we'll spend a whole day on troubleshooting. This is being organized by Oracle University as part of their "Celebrity Seminar" program. You can find the details and booking options here.

13th - 15th April: I'll be at MOW 2011 in Legoland, Denmark doing one or two presentations and meeting a lot of fellow OakTable members. One of the most interesting aspects of this conference (besides the rather special venue) is that it not only offers premier Oracle stuff, but also Microsoft SQL Server, Java and Open Source.