Parallel Execution

In the final part of this instalment I want to focus on the possible optimisation of remote access that I outlined in the initial part, which is based on the idea of running multiple concurrent remote branches of a UNION ALL to overcome the query coordinator bottleneck of straightforward remote queries that need to transfer larger amounts of data.For that purpose I now simply change my sample query to access the serial table T2 via the DB link defined in the setup of the initial part, like that:

In the first part of this series I've focused on the parallel degree chosen by the optimizer when dealing with the new concurrent UNION ALL feature.I've shown that for the variant with serial branches only in the UNION ALL in principle the number of branches dictates the parallel degree determined, even in cases of more complex plans that mix such a serial branch only UNION ALL operator with some other parallel stuff for example via a join.In this part I'll focus on the runtime behaviour of the feature, but before doing so let me show you what happens if you start mixing serial and parallel branches in the UNION ALL, like that (using the identical table setup as in the previous part):

12c introduces the concurrent UNION ALL operator that allows multiple branches below the UNION ALL to become active concurrently - which is a significant change. Before the introduction of this feature Oracle never executed multiple branches of an execution plan concurrently (in terms of Parallel Execution) - the parallelism so far was about executing the same operations of a particular branch of the execution plan by multiple processes / sessions.

In this blog post I want to cover some aspects of the the new HYBRID HASH adaptive distribution method that I haven't covered yet in my other posts.As far as I know it serves two purposes for parallel HASH and MERGE JOINs, adaptive broadcast distribution and hybrid distribution for skewed join expressions. In the first part of this post I want to focus on former one (goto part 2).

1. Adaptive Broadcast Distribution For Small Left Row Sources

It allows the PX SEND / RECEIVE operation for the left (smaller estimated row source) of the hash join to decide dynamically at runtime, actually at each execution, if it should use either a BROADCAST or HASH distribution, and correspondingly for the other row source to use then either a ROUND-ROBIN or a HASH distribution, too.

It’s easy to make mistakes, or overlook defects, when constructing parallel queries – especially if you’re a developer who hasn’t been given the right tools to make it easy to test your code. Here’s a little trap I came across recently that’s probably documented somewhere, which could be spotted easily if you had access to the OEM SQL Monitoring screen, but would be very easy to miss if you didn’t check the execution plan very carefully. I’ll start with a little script to generate some data:

Oracle 12c is the first release since a couple of years that adds significant new functionality in the area of Parallel Execution operators, plan shapes and runtime features. Although 11gR2 added the new Auto DOP feature along with In-Memory Parallel Execution and Statement Queueing, the 12c features are more significant because they introduce new operators that can change both execution plan shape and runtime behaviour.

Here is a list of new features that are worth to note (and not necessarily mentioned in the official documentation and white papers by Oracle):

- The new HYBRID HASH adaptive distribution method, that serves two purposes for parallel HASH and MERGE JOINs:

A new version 4.21 of the XPLAN_ASH utility is available for download. I publish this version because it will be used in the recent video tutorials explaining the Active Session History functionality of the script.

This is mainly a maintenance release that fixes some incompatibilities of the 4.2 version with less recent versions (10.2 and 11.2.0.1).

As an extra however, this version now differentiates between general CPU usage and in-memory CPU usage (similar to 12.1.0.2 Real-Time SQL Monitoring). This is not done in all possible sections of the output yet, but the most important ones are already covered.

Now that I've shown in the previous post in general that sometimes Parallel Execution plans might end up with unnecessary BUFFER SORT operations, let's have a look what particular side effects are possible due to this.

What would you say if someone tells you that (s)he just did a simple, straightforward "SELECT * FROM TABLE" that took several minutes to execute without returning, only to then error out with "ORA-01652 unable to extend temp segment", and the TABLE in question is actually nothing but a simple, partitioned heap table, so no special tricks, no views, synonyms, VPD etc. involved, it's really just a plain simple table?

When using Parallel Execution, depending on the plan shape and the operations used, Oracle sometimes needs to turn non-blocking operations into blocking operations, which means in this case that the row source no longer passes its output data directly to the parent operation but buffers some data temporarily in PGA memory / TEMP. This is either accomplished via the special HASH JOIN BUFFERED operation, or simply by adding BUFFER SORT operations to the plan.The reason for such a behaviour in parallel plans is the limitation of Oracle Parallel Execution that allows only a single data redistribution to be active concurrently.