15.3. Parallel
Plans

Because each worker executes the parallel portion of the plan
to completion, it is not possible to simply take an ordinary
query plan and run it using multiple workers. Each worker would
produce a full copy of the output result set, so the query would
not run any faster than normal but would produce incorrect
results. Instead, the parallel portion of the plan must be what
is known internally to the query optimizer as a partial plan; that is, it must be constructed so
that each process which executes the plan will generate only a
subset of the output rows in such a way that each required output
row is guaranteed to be generated by exactly one of the
cooperating processes. Generally, this means that the scan on the
driving table of the query must be a parallel-aware scan.

15.3.1. Parallel Scans

The following types of parallel-aware table scans are
currently supported.

In a parallel sequential
scan, the table's blocks will be divided
among the cooperating processes. Blocks are handed out
one at a time, so that access to the table remains
sequential.

In a parallel bitmap heap
scan, one process is chosen as the leader.
That process performs a scan of one or more indexes and
builds a bitmap indicating which table blocks need to be
visited. These blocks are then divided among the
cooperating processes as in a parallel sequential scan.
In other words, the heap scan is performed in parallel,
but the underlying index scan is not.

In a parallel index
scan or parallel
index-only scan, the cooperating processes
take turns reading data from the index. Currently,
parallel index scans are supported only for btree
indexes. Each process will claim a single index block and
will scan and return all tuples referenced by that block;
other process can at the same time be returning tuples
from a different index block. The results of a parallel
btree scan are returned in sorted order within each
worker process.

Other scan types, such as scans of non-btree indexes, may
support parallel scans in the future.

15.3.2. Parallel Joins

Just as in a non-parallel plan, the driving table may be
joined to one or more other tables using a nested loop, hash
join, or merge join. The inner side of the join may be any kind
of non-parallel plan that is otherwise supported by the planner
provided that it is safe to run within a parallel worker. For
example, if a nested loop join is chosen, the inner plan may be
an index scan which looks up a value taken from the outer side
of the join.

Each worker will execute the inner side of the join in full.
This is typically not a problem for nested loops, but may be
inefficient for cases involving hash or merge joins. For
example, for a hash join, this restriction means that an
identical hash table is built in each worker process, which
works fine for joins against small tables but may not be
efficient when the inner table is large. For a merge join, it
might mean that each worker performs a separate sort of the
inner relation, which could be slow. Of course, in cases where
a parallel plan of this type would be inefficient, the query
planner will normally choose some other plan (possibly one
which does not use parallelism) instead.

15.3.3. Parallel Aggregation

PostgreSQL supports
parallel aggregation by aggregating in two stages. First, each
process participating in the parallel portion of the query
performs an aggregation step, producing a partial result for
each group of which that process is aware. This is reflected in
the plan as a Partial Aggregate
node. Second, the partial results are transferred to the leader
via Gather or Gather Merge. Finally, the leader
re-aggregates the results across all workers in order to
produce the final result. This is reflected in the plan as a
Finalize Aggregate node.

Because the Finalize Aggregate
node runs on the leader process, queries which produce a
relatively large number of groups in comparison to the number
of input rows will appear less favorable to the query planner.
For example, in the worst-case scenario the number of groups
seen by the Finalize Aggregate
node could be as many as the number of input rows which were
seen by all worker processes in the Partial Aggregate stage. For such cases, there
is clearly going to be no performance benefit to using parallel
aggregation. The query planner takes this into account during
the planning process and is unlikely to choose parallel
aggregate in this scenario.

Parallel aggregation is not supported in all situations.
Each aggregate must be safe for parallelism and must
have a combine function. If the aggregate has a transition
state of type internal, it must
have serialization and deserialization functions. See CREATE
AGGREGATE for more details. Parallel aggregation is
not supported if any aggregate function call contains
DISTINCT or ORDER BY clause and is also not supported for
ordered set aggregates or when the query involves GROUPING SETS. It can only be used when all
joins involved in the query are also part of the parallel
portion of the plan.

15.3.4. Parallel Plan Tips

If a query that is expected to do so does not produce a
parallel plan, you can try reducing parallel_setup_cost
or parallel_tuple_cost.
Of course, this plan may turn out to be slower than the serial
plan which the planner preferred, but this will not always be
the case. If you don't get a parallel plan even with very small
values of these settings (e.g. after setting them both to
zero), there may be some reason why the query planner is unable
to generate a parallel plan for your query. See Section 15.2
and Section 15.4 for
information on why this may be the case.

When executing a parallel plan, you can use EXPLAIN (ANALYZE, VERBOSE) to display
per-worker statistics for each plan node. This may be useful in
determining whether the work is being evenly distributed
between all plan nodes and more generally in understanding the
performance characteristics of the plan.

Submit correction

If you see anything in the documentation that is not correct, does not match
your experience with the particular feature or requires further clarification,
please use
this form
to report a documentation issue.