Parallel Reasoning

This article is a light-weight exploration of parallel query which is available in the Enterprise Edition of Oracle Database. There are a lot of great articles on the Internet that discuss parallel query benefits and pitfalls, and Metalink contains a couple of decent articles also. For example, Metalink Doc ID 263153.1 includes the following bit of advice:

“The parallel hint on the object make sure that we access this object in parallel.”

Well, that’s odd. Doc ID 263153.1 first doesn’t apply to Oracle 11.2.0.1, and then does apply to Oracle 11.2.0.1. Notice that the calculated total query cost of the first plan is 343 and the calculated total query cost of the second plan is 329. Time for another test:

There, we now have a cost of 2 – the query should now finish instantaneously. :-) But why didn’t the query excute in parallel when PARALLEL(2) was specified in the hint? A search of the Internet provides the answer:

“The hint parallel(t1, N) does not tell Oracle to run the query in parallel at degree N – it tells the optimizer that when it is calculating the cost of a tablescan on table t1 it should cost for parallel degree N, and if that plan then happens to be the cheapest of all the plans to consider that that’s the plan to use.”

“But the hint we supplied is only relevant to the tablescan – and we can see it being used in the arithmetic above: the cost_io is the serial I/O cost of the tablescan, the resp_io is the ‘parallel response’ time for the query which is (cost_io / (0.9 * degree)).”

Calculated costs of the access paths matter… so that’s why the first query did not execute in parallel.

For those who decide to rebuild indexes for whatever reason, and decide to do so in parallel, you might be setting yourself up for headaches:

I haven’t enabled parallelism for my table! How can Oracle go parallel without my consent? Parallel index (re)build will persistently set the index parallel degree in data dictionary to the value used during build!

—

Oracle 11.2.0.1 introduced a couple of changes to parallel query, and a couple of those changes are listed below:

“Beginning with Oracle Database 11g Release 2, the PARALLEL and NO_PARALLEL hints are statement-level hints and supersede the earlier object-level hints: PARALLEL_INDEX, NO_PARALLEL_INDEX, and previously specified PARALLEL and NO_PARALLEL hints. For PARALLEL, if you specify integer, then that degree of parallelism will be used for the statement. If you omit integer, then the database computes the degree of parallelism. All the access paths that can use parallelism will use the specified or computed degree of parallelism.”

“When the parameter PARALLEL_DEGREE_POLICY is set to AUTO, Oracle Database automatically decides if a statement should execute in parallel or not and what DOP it should use. Oracle Database also determines if the statement can be executed immediately or if it is queued until more system resources are available. Finally, it decides if the statement can take advantage of the aggregated cluster memory or not.

The following example illustrates computing the DOP the statement should use:

As I mentioned at the beginning of this article, I only intend to provide a light-weight exploration of parallel query, so I will stop here… or better yet after providing a couple of potentially helpful Metalink article references that you can review using your shiny new Apple iPad:

Doc ID 263153.1 “FAQ’s about Parallel/Noparallel Hints” (briefly mentioned at the beginning of this article)

Doc ID 203238.1 “Using Parallel Execution”

Doc ID 826893.1 “Invalid Hint in 10g Can Cause Other Hints To Be Ignored, Including Parallel Hints”

Be on the lookout for hard hitting parallel query articles by other authors (such as Doug Burns), and try not to abuse this feature. I will leave you with this paraphrase from the book “Practical Oracle 8i”

Parallel query’s objective is to maximize resource utilization in order to reduce query execution time. Oracle may select a high CPU consumption plan using a full tablescan, rather than a lower CPU consuming index range scan – this situation may cause performance problems in an OLTP system. Parallel query is a data crunching, non-scalable, batch processor.

7 responses

There was a bit of humor embedded in this article, in case anyone missed it. Just to clarify:
* Don’t run SQL statements in parallel(1000) just to drive the cost of the query toward 0. There were a couple of recent comments on this blog pointing back to an asktom article that partially explains why.
* Don’t try viewing Oracle’s support documents on a shiny new Apple iPad – the iPad does not support Adobe Flash, and well, the new Oracle’s support site is Flashy.

Hopefully, this article provides access to a couple of helpful resources, and might even prompt someone else to write something that is well beyond a light-weight exploration of recent parallel query enhancements.

Thanks for posting the link to the HTML version of the Oracle support site – now if Oracle would just redirect metalink.oracle.com to http://supporthtml.oracle.com I could probably remember how to access the non-Flash version of the website.

Currently bogged with a issue in parallel query tuning.
Prelude – Server – CMT T5120. 32 Cpu’s. The Batch jobs when transferred from old server runs much slow. CMT servers encourage for parallel processing in batch jobs. So when we parallelized 2 queries, it was running much improved time (nearly 1/2 the time taken). But when we started parallelizing the rest, it degraded the performance of the rest.

I might be stating the obvious here, but parallel query is not a magic fix for all types of performance problems. Consider these two situations:
1. You have 32 CPUs (or cores), your query is not using parallel execution yet it is using 100% of one CPU (or core), the overall CPU utilization average is 50%, and the IO subsystem is not heavily utilized.
2. You have 32 CPUs (or cores), your query is not using parallel execution yet it is using 100% of one CPU (or core), the overall CPU utilization average is 50%, and the IO subsystem is nearly 100% utilized (either maximum number of I/Os per second or maximum throughput).

How might the outcome be different in both of the situations if parallel query is enabled?

It is quite possible that the first two queries that you helped with parallel query were in fact CPU bound (the single CPU utilized was the bottleneck), while the remaining queries were I/O bound (the I/O subsystem was the bottleneck). In either case, it is best not to guess. Instead, capture a 10046 trace at level 8 or 12 for the batch job and determine *why* the other queries execute more slowly when parallelized. You might also capture delta values of V$OSSTAT periodically during the execution of the batch job so that you can see, for instance, if you now have 200 parallel slaves fighting for the 32 CPUs, or if you now have 200 parallel slaves all waiting for I/O.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: