The FDW callback functions GetForeignRelSize, GetForeignPaths, GetForeignPlan, and PlanForeignModify must fit into the workings of
the PostgreSQL planner. Here are
some notes about what they must do.

The information in root and
baserel can be used to reduce the amount
of information that has to be fetched from the foreign table (and
therefore reduce the cost). baserel->baserestrictinfo is particularly
interesting, as it contains restriction quals (WHERE clauses) that should be used to filter the
rows to be fetched. (The FDW itself is not required to enforce
these quals, as the core executor can check them instead.)
baserel->reltargetlist can be used to
determine which columns need to be fetched; but note that it only
lists columns that have to be emitted by the ForeignScan plan node, not columns that are
used in qual evaluation but not output by the query.

Various private fields are available for the FDW planning
functions to keep information in. Generally, whatever you store
in FDW private fields should be palloc'd, so that it will be
reclaimed at the end of planning.

baserel->fdw_private is a
void pointer that is available for FDW
planning functions to store information relevant to the
particular foreign table. The core planner does not touch it
except to initialize it to NULL when the baserel node is created. It is useful for passing
information forward from GetForeignRelSize to GetForeignPaths and/or GetForeignPaths to GetForeignPlan, thereby avoiding
recalculation.

GetForeignPaths can identify the
meaning of different access paths by storing private information
in the fdw_private field of
ForeignPath nodes. fdw_private is declared as a List pointer, but could actually contain anything
since the core planner does not touch it. However, best practice
is to use a representation that's dumpable by nodeToString, for use with debugging support
available in the backend.

GetForeignPlan can examine the
fdw_private field of the selected
ForeignPath node, and can generate
fdw_exprs and fdw_private lists to be placed in the
ForeignScan plan node, where they
will be available at execution time. Both of these lists must be
represented in a form that copyObject knows how to copy. The fdw_private list has no other restrictions and
is not interpreted by the core backend in any way. The fdw_exprs list, if not NIL, is expected to
contain expression trees that are intended to be executed at run
time. These trees will undergo post-processing by the planner to
make them fully executable.

In GetForeignPlan, generally the
passed-in target list can be copied into the plan node as-is. The
passed scan_clauses list contains the
same clauses as baserel->baserestrictinfo, but may be
re-ordered for better execution efficiency. In simple cases the
FDW can just strip RestrictInfo nodes
from the scan_clauses list (using
extract_actual_clauses) and put all
the clauses into the plan node's qual list, which means that all
the clauses will be checked by the executor at run time. More
complex FDWs may be able to check some of the clauses internally,
in which case those clauses can be removed from the plan node's
qual list so that the executor doesn't waste time rechecking
them.

As an example, the FDW might identify some restriction clauses
of the form foreign_variable=sub_expression, which it determines can be
executed on the remote server given the locally-evaluated value
of the sub_expression. The actual
identification of such a clause should happen during GetForeignPaths, since it would affect the cost
estimate for the path. The path's fdw_private field would probably include a
pointer to the identified clause's RestrictInfo node. Then GetForeignPlan would remove that clause from
scan_clauses, but add the sub_expression to fdw_exprs to ensure that it gets massaged into
executable form. It would probably also put control information
into the plan node's fdw_private
field to tell the execution functions what to do at run time. The
query transmitted to the remote server would involve something
like WHERE foreign_variable = $1, with the
parameter value obtained at run time from evaluation of the
fdw_exprs expression tree.

The FDW should always construct at least one path that depends
only on the table's restriction clauses. In join queries, it
might also choose to construct path(s) that depend on join
clauses, for example foreign_variable=local_variable. Such clauses will
not be found in baserel->baserestrictinfo but must be sought in
the relation's join lists. A path using such a clause is called a
"parameterized path". It must identify
the other relations used in the selected join clause(s) with a
suitable value of param_info; use
get_baserel_parampathinfo to
compute that value. In GetForeignPlan, the local_variable portion of the join clause
would be added to fdw_exprs, and
then at run time the case works the same as for an ordinary
restriction clause.

When planning an UPDATE or DELETE, PlanForeignModify can look up the RelOptInfo struct for the foreign table and
make use of the baserel->fdw_private
data previously created by the scan-planning functions. However,
in INSERT the target table is not
scanned so there is no RelOptInfo for
it. The List returned by PlanForeignModify has the same restrictions as
the fdw_private list of a ForeignScan plan node, that is it must contain
only structures that copyObject
knows how to copy.

For an UPDATE or DELETE against an external data source that
supports concurrent updates, it is recommended that the
ForeignScan operation lock the rows that
it fetches, perhaps via the equivalent of SELECT FOR UPDATE. The FDW may also choose to lock
rows at fetch time when the foreign table is referenced in a
SELECT FOR UPDATE/SHARE; if it does not,
the FOR UPDATE or FOR SHARE option is essentially a no-op so far as
the foreign table is concerned. This behavior may yield semantics
slightly different from operations on local tables, where row
locking is customarily delayed as long as possible: remote rows
may get locked even though they subsequently fail locally-applied
restriction or join conditions. However, matching the local
semantics exactly would require an additional remote access for
every row, and might be impossible anyway depending on what
locking semantics the external data source provides.

Add Comment

Please use this form to add your own comments regarding your experience with
particular features of PostgreSQL, clarifications of the documentation, or
hints for other users. Please note, this is not a support
forum, and your IP address will be logged. If you have a question or need help,
please see the faq, try a
mailing list, or join
us on IRC.
Note that submissions containing URLs or other keywords commonly found in
'spam' comments may be silently discarded. Please contact the
webmaster if you think this
is happening to you in error.