We usually don't use "namespace" in user-facing error messages. Also,
in master this was replaced by another error message referring to
"temporary objects", so we might as well use that here to avoid
introducing too many variants.
Discussion: https://www.postgresql.org/message-id/bbd3f8d9-e3d5-e5aa-4305-7f0121c3fa94@2ndquadrant.com

indxpath.c needs a good deal more attention for covering indexes than
it's gotten. But so far as I can tell, the only really awful breakage
is in expand_indexqual_rowcompare (nee adjust_rowcompare_for_index),
which was only half fixed in c266ed31a. The other problems aren't
bad enough to take the risk of a just-before-wrap fix.
The problem here is that if the leading column of a row comparison
matches an index (allowing this code to be reached), and some later
column doesn't match the index, it'll nonetheless believe that that
column matches the first included index column. Typically that'll
lead to an error like "operator M is not a member of opfamily N" as
a result of fetching a garbage opfamily OID. But with enough bad
luck, maybe a broken plan would be generated.
Discussion: https://postgr.es/m/25526.1549847928@sss.pgh.pa.us

After commit 123cc697a8eb, we remove redundant FK action triggers during
partition ATTACH by merely deleting the catalog tuple, but that's wrong:
it should use performDeletion() instead. Repair, and make the comments
more explicit.
Per code review from Tom Lane.
Discussion: https://postgr.es/m/18885.1549642539@sss.pgh.pa.us

Renaming varchar_transform to varchar_support had a side effect
I hadn't foreseen: the core regression tests leave around a
transform object that relies on that function, so the name
change breaks cross-version upgrade tests, because the name
used in the older branches doesn't match.
Since the dependency on varchar_transform was chosen with the
aid of a dartboard anyway (it would surely not work as a
language transform support function), fix by just choosing
a different random builtin function with the right signature.
Also add some comments explaining why this isn't horribly unsafe.
I chose to make the same substitution in a couple of other
copied-and-pasted test cases, for consistency, though those
aren't directly contributing to the testing problem.
Per buildfarm. Back-patch, else it doesn't fix the problem.

warn_or_exit_horribly() was blithely passing a potentially-NULL
string pointer to a %s format specifier. That works (at least
to the extent of not crashing) on some platforms, but not all,
and since we switched to our own snprintf.c it doesn't work
for us anywhere.
Of the three string fields being handled this way here, I think
that only "owner" is supposed to be nullable ... but considering
that this is error-reporting code, it has very little business
assuming anything, so put in defenses for all three.
Per a crash observed on buildfarm member crake and then
reproduced here. Because of the portability aspect,
back-patch to all supported versions.

The previous ordering of these steps satisfied the nominal requirement
that set_rel_pathlist_hook could editorialize on the whole set of Paths
constructed for a base relation. In practice, though, trying to change
the set of partial paths was impossible. Adding one didn't work because
(a) it was too late to be included in Gather paths made by the core code,
and (b) calling add_partial_path after generate_gather_paths is unsafe,
because it might try to delete a path it thinks is dominated, but that
is already embedded in some Gather path(s). Nor could the hook safely
remove partial paths, for the same reason that they might already be
embedded in Gathers.
Better to call extensions first, let them add partial paths as desired,
and then gather. In v11 and up, we already doubled down on that ordering
by postponing gathering even further for single-relation queries; so even
if the hook wished to editorialize on Gather path construction, it could
not.
Report and patch by KaiGai Kohei. Back-patch to 9.6 where Gather paths
were added.
Discussion: https://postgr.es/m/CAOP8fzahwpKJRTVVTqo2AE=mDTz_efVzV6Get_0=U3SO+-ha1A@mail.gmail.com

This is not necessary anymore after 297d627e, but extensions that have
not been recompiled after the fix will not use the new definition of
heap_getattr(). While recompiling those extensions is obviously the
suggested course, it's cheap enough to retain the expansion in
GetTupleForTrigger().
Per suggestion from Andrew Gierth.
Discussion: 87va1x43ot.fsf@news-spur.riddles.org.uk

This uses the facility added in the preceding commit to fix
performance issues caused by rebuilding the hashtable (with its
comparator expression being the most expensive bit), after every
reset. That's especially important when the comparator is JIT
compiled.
Bug: #15592 #15486
Reported-By: Jakub Janeček, Dmitry Marakasov
Author: Andres Freund
Discussion:
https://postgr.es/m/15486-05850f065da42931@postgresql.org
https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Backpatch: 11, where I broke this in bf6c614a2f2c5

This has the advantage that the comparator expression, the table's
slot, etc do not have to be rebuilt. Additionally the simplehash.h
hashtable within the tuple hashtable now keeps its previous size and
doesn't need to be reallocated. That both reduces allocator overhead,
and improves performance in cases where the input estimation was off
by a significant factor.
To avoid an API/ABI break, the new parameter is exposed via the new
BuildTupleHashTableExt(), and BuildTupleHashTable() now is a wrapper
around the former, that continues to allocate the table itself in the
tablecxt.
Using this fixes performance issues discovered in the two bugs
referenced. This commit however has not converted the callers, that's
done in a separate commit.
Bug: #15592 #15486
Reported-By: Jakub Janeček, Dmitry Marakasov
Author: Andres Freund
Discussion:
https://postgr.es/m/15486-05850f065da42931@postgresql.org
https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Backpatch: 11, this is a prerequisite for other fixes

A hashtable reset just reset the hashtable entries, but does not free
memory.
Author: Andres Freund
Discussion: https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Bug: #15592 #15486
Backpatch: 11, this is a prerequisite for other fixes

M src/include/lib/simplehash.h

Plug leak in BuildTupleHashTable by creating ExprContext in correct context.

In bf6c614a2f2c5 I added a expr context to evaluate the grouping
expression. Unfortunately the code I added initialized them while in
the calling context, rather the table context. Additionally, I used
CreateExprContext() rather than CreateStandaloneExprContext(), which
creates the econtext in the estate's query context.
Fix that by using CreateStandaloneExprContext when in the table's
tablecxt. As we rely on the memory being freed by a memory context
reset that means that the econtext's shutdown callbacks aren't being
called, but that seems ok as the expressions are tightly controlled
due to ExecBuildGroupingEqual().
Bug: #15592
Reported-By: Dmitry Marakasov
Author: Andres Freund
Discussion: https://postgr.es/m/20190114222838.h6r3fuyxjxkykf6t@alap3.anarazel.de
Backpatch: 11, where I broke this in bf6c614a2f2c5

While this isn't really supposed to happen, it can occur in OOM
situations and perhaps others. Instead of crashing, substitute
"(no message provided)".
I didn't worry about localizing this text, since we aren't
localizing anything else here; besides, if we're on the edge of
OOM, it's unlikely gettext() would work.
Report and fix by Sergio Conde Gómez in bug #15624.
Discussion: https://postgr.es/m/15624-4dea54091a2864e6@postgresql.org

Also clean up some discussion that had been left in a very confused
state thanks to half-hearted adjustments for the change to
standard_conforming_strings being the default.
Discussion: https://postgr.es/m/154954987367.1297.4358910045409218@wrigleys.postgresql.org

As reported in bug #15613 from Srinivasan S A, file_fdw and postgres_fdw
neglected to mark plain baserel foreign paths as parameterized when the
relation has lateral_relids. Other FDWs have surely copied this mistake,
so rather than just patching those two modules, install a band-aid fix
in create_foreignscan_path to rectify the mistake centrally.
Although the band-aid is enough to fix the visible symptom, correct
the calls in file_fdw and postgres_fdw anyway, so that they are valid
examples for external FDWs.
Also, since the band-aid isn't enough to make this work for parameterized
foreign joins, throw an elog(ERROR) if such a case is passed to
create_foreignscan_path. This shouldn't pose much of a problem for
existing external FDWs, since it's likely they aren't trying to make such
paths anyway (though some of them may need a defense against joins with
lateral_relids, similar to the one this patch installs into postgres_fdw).
Add some assertions in relnode.c to catch future occurrences of the same
error --- in particular, as backstop against core-code mistakes like the
one fixed by commit bdd9a99aa.
Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org

The modules RewindTest.pm and ServerSetup.pm are really only useful for
TAP tests, so they really belong in the TAP test directories. In
addition, ServerSetup.pm is renamed to SSLServer.pm.
The test scripts have their own directories added to the search path so
that the relocated modules will be found, regardless of where the tests
are run from, even on modern perl where "." is no longer in the
searchpath.
Discussion: https://postgr.es/m/e4b0f366-269c-73c3-9c90-d9cb0f4db1f9@2ndQuadrant.com
Backpatch as appropriate to 9.5

Otherwise functions that require collation information will not have
it if they are called in arguments to a CALL statement.
Reported-by: Jean-Marc Voillequin <Jean-Marc.Voillequin@moodys.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/1EC8157EB499BF459A516ADCF135ADCE39FFAC54%40LON-WGMSX712.ad.moodys.net

In commit f16241bef7c, we have changed the behavior for concurrent updates
that move row to a different partition, but forgot to update the docs.
Previously when an UPDATE command causes a row to move from one partition
to another, there is a chance that another concurrent UPDATE or DELETE
misses this row. However, now we raise a serialization failure error in
such a case.
Reported-by: David Rowley
Author: David Rowley and Amit Kapila
Backpatch-through: 11 where it was introduced
Discussion: https://postgr.es/m/CAKJS1f-iVhGD4-givQWpSROaYvO3c730W8yoRMTF9Gc3craY3w@mail.gmail.com

The previous tacit assumption that index_form_tuple() hides differences
in the TOAST state of its input datums was wrong. Normalize input
varlena datums by decompressing compressed values, and forming a new
index tuple for fingerprinting using uncompressed inputs. The final
normalized representation may actually be compressed once again within
index_form_tuple(), though that shouldn't matter. When the original
tuple is found to have no datums that are compressed inline, fingerprint
the original tuple directly.
Normalization avoids false positive reports of corruption in certain
cases. For example, the executor can apply toasting with some inline
compression to an entire heap tuple because its input has a single
external TOAST pointer. Varlena datums for other attributes that are
not particularly good candidates for inline compression can be
compressed in the heap tuple in passing, without the representation of
the same values in index tuples ever receiving concomitant inline
compression.
Add a test case to recreate the issue in a simpler though less realistic
way: by exploiting differences in pg_attribute.attstorage between heap
and index relations.
This bug was discovered by me during testing of an upcoming set of nbtree
enhancements. It was also independently reported by Andreas Kunert, as
bug #15597. His test case was rather more realistic than the one I
ended up using.
Bug: #15597
Discussion: https://postgr.es/m/CAH2-WznrVd9ie+TTJ45nDT+v2nUt6YJwQrT9SebCdQKtAvfPZw@mail.gmail.com
Discussion: https://postgr.es/m/15597-294e5d3e7f01c407@postgresql.org
Backpatch: 11-, where heapallindexed verification was introduced.

create_lateral_join_info() computes a bunch of information about lateral
references between base relations, and then attempts to propagate those
markings to appendrel children of the original base relations. But the
original coding neglected the possibility of indirect descendants
(grandchildren etc). During v11 development we noticed that this was
wrong for partitioned-table cases, but failed to realize that it was just
as wrong for any appendrel. While the case can't arise for appendrels
derived from traditional table inheritance (because we make a flat
appendrel for that), nested appendrels can arise from nested UNION ALL
subqueries. Failure to mark the lower-level relations as having lateral
references leads to confusion in add_paths_to_append_rel about whether
unparameterized paths can be built. It's not very clear whether that
leads to any user-visible misbehavior; the lack of field reports suggests
that it may cause nothing worse than minor cost misestimation. Still,
it's a bug, and it leads to failures of Asserts that I intend to add
later.
To fix, we need to propagate information from all appendrel parents,
not just those that are RELOPT_BASERELs. We can still do it in one
pass, if we rely on the append_rel_list to be ordered with ancestor
relationships before descendant ones; add assertions checking that.
While fixing this, we can make a small performance improvement by
traversing the append_rel_list just once instead of separately for
each appendrel parent relation.
Noted while investigating bug #15613, though this patch does not fix
that (which is why I'm not committing the related Asserts yet).
Discussion: https://postgr.es/m/3951.1549403812@sss.pgh.pa.us

Previously heap_getattr() returned NULL for attributes with a fast
default value (c.f. 16828d5c0273), as it had no handling whatsoever
for that case.
A previous fix, 7636e5c60f, attempted to fix issues caused by this
oversight, but just expanding OLD tuples for triggers doesn't actually
solve the underlying issue.
One known consequence of this bug is that the check for HOT updates
can return the wrong result, when a previously fast-default'ed column
is set to NULL. Which in turn means that an index over a column with
fast default'ed columns might be corrupt if the underlying column(s)
allow NULLs.
Fix by handling fast default columns in heap_getattr(), remove now
superfluous expansion in GetTupleForTrigger().
Author: Andres Freund
Discussion: https://postgr.es/m/20190201162404.onngi77f26baem4g@alap3.anarazel.de
Backpatch: 11, where fast defaults were introduced

Contrary to the comment on 772d4b76, only paths starting with "./" or
"../" are considered relative to the current working directory by perl's
"do" function. So this patch converts all the relevant cases to use "./"
paths. This only affects MSVC.
Backpatch to all live branches.

DST law changes in Kazakhstan, Metlakatla, and São Tomé and Príncipe.
Kazakhstan's Qyzylorda zone is split in two, creating a new zone
Asia/Qostanay, as some areas did not change UTC offset.
Historical corrections for Hong Kong and numerous Pacific islands.

Historically we've had each release branch include all prior branches'
notes, including minor-release changes, back to the beginning of the
project. That's basically an O(N^2) proposition, and it was starting to
catch up with us: as of HEAD the back-branch release notes alone accounted
for nearly 30% of the documentation. While there's certainly some value
in easy access to back-branch notes, this is getting out of hand.
Hence, switch over to the rule that each branch contains only its own
release notes. So as to not make older notes too hard to find, each
branch will provide URLs for the immediately preceding branches'
release notes on the project website.
There might be value in providing aggregated notes across all branches
somewhere on the website, but that's a task for another day.
Discussion: https://postgr.es/m/cbd4aeb5-2d9c-8b84-e968-9e09393d4c83@postgresql.org

Commit 62215de29 turns out to have been not quite on-the-mark.
When we are forced to postpone dumping of a materialized view into
the dump's post-data section (because it depends on a unique index
that isn't created till that section), we may also have to postpone
dumping other matviews that depend on said matview. The previous fix
didn't reliably work for such cases: it'd break the dependency loops
properly, producing a workable object ordering, but it didn't
necessarily mark all the matviews as "postponed_def". This led to
harmless bleating about "archive items not in correct section order",
as reported by Tom Cassidy in bug #15602. Less harmlessly,
selective-restore options such as --section might misbehave due to
the matview dump objects not being properly labeled.
The right way to fix it is to consider that each pre-data dependency
we break amounts to moving the no-longer-dependent object into
post-data, and hence we should mark that object if it's a matview.
Back-patch to all supported versions, since the issue's been there
since matviews were introduced.
Discussion: https://postgr.es/m/15602-e895445f73dc450b@postgresql.org

Rather than define ld_library_path_ver with a big nested $(if), just
put the overriding values in the makefiles for the relevant ports.
Also add a variable for port makefiles to append their own stuff to
with_temp_install, and use it to set LD_LIBRARY_PATH_RPATH=1 on
FreeBSD which is needed to make LD_LIBRARY_PATH override DT_RPATH
if DT_RUNPATH is not set (which seems to depend in unpredictable ways
on the choice of compiler, at least on my system).
Backpatch for the benefit of anyone doing regression tests on FreeBSD.
(For other platforms there should be no functional change.)

The option is ignored on Windows, and GUC data_directory_mode already
mentioned that within its description in the documentation.
Author: Michael Paquier
Reported-by: Haribabu Kommi, David Steele
Discussion: https://postgr.es/m/CAJrrPGefxTG43yk6BrOC7ZcMnCTccG9+inCSncvyys_t8Ev9cQ@mail.gmail.com
Backpatch-through: 11

Add PG_CFLAGS, PG_CXXFLAGS, and PG_LDFLAGS variables to pgxs.mk which
will be appended or prepended to the corresponding make variables.
Notably, there was previously no way to pass custom CXXFLAGS to third
party extension module builds, COPT and PROFILE supporting only CFLAGS
and LDFLAGS.
Backpatch all the way down to ease integration with existing
extensions.
Author: Christoph Berg
Reviewed-by: Andres Freund, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/20181113104005.GA32154@msg.credativ.de
Backpatch-through: 9.4

To avoid deadlock, backend acquires a lock on heap pages in block
number order. In certain cases, lock on heap pages is dropped and
reacquired. In this case, the locks are dropped for reading in
corresponding VM page/s. The issue is we re-acquire locks in bufferId
order whereas the intention was to acquire in blockid order.
This commit ensures that we will always acquire locks on heap pages in
blockid order.
Reported-by: Nishant Fnu
Author: Nishant Fnu
Reviewed-by: Amit Kapila and Robert Haas
Backpatch-through: 9.4
Discussion: https://postgr.es/m/5883C831-2ED1-47C8-BFAC-2D5BAE5A8CAE@amazon.com

M src/backend/access/heap/hio.c

Fix use of dangling pointer in heap_delete() when logging replica identity

When logging the replica identity of a deleted tuple, XLOG_HEAP_DELETE
records include references of the old tuple. Its data is stored in an
intermediate variable used to register this information for the WAL
record, but this variable gets away from the stack when the record gets
actually inserted.
Spotted by clang's AddressSanitizer.
Author: Stas Kelvish
Discussion: https://postgr.es/m/085C8825-AD86-4E93-AF80-E26CDF03D1EA@postgrespro.ru
Backpatch-through: 9.4

The bug was that determining which columns are part of the replica
identity index using RelationGetIndexAttrBitmap() would run
eval_const_expressions() on index expressions and predicates across
all indexes of the table, which in turn might require a snapshot, but
there wasn't one set, so it crashes. There were actually two separate
bugs, one on the publisher and one on the subscriber.
To trigger the bug, a table that is part of a publication or
subscription needs to have an index with a predicate or expression
that lends itself to constant expressions simplification.
The fix is to avoid the constant expressions simplification in
RelationGetIndexAttrBitmap(), so that it becomes safe to call in these
contexts. The constant expressions simplification comes from the
calls to RelationGetIndexExpressions()/RelationGetIndexPredicate() via
BuildIndexInfo(). But RelationGetIndexAttrBitmap() calling
BuildIndexInfo() is overkill. The latter just takes pg_index catalog
information, packs it into the IndexInfo structure, which former then
just unpacks again and throws away. We can just do this directly with
less overhead and skip the troublesome calls to
eval_const_expressions(). This also removes the awkward
cross-dependency between relcache.c and index.c.
Bug: #15114
Reported-by: Петър Славов <pet.slavov@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/152110589574.1223.17983600132321618383@wrigleys.postgresql.org/

Previously llvmjit.h #error'ed when USE_LLVM was not defined, to
prevent it from being included from code not having #ifdef USE_LLVM
guards - but that's not actually that useful after, during the
development of JIT support, LLVM related code was moved into a
separately compiled .so. Having that #error means cpluspluscheck
doesn't work when llvm support isn't enabled, which isn't great.
Similarly add USE_LLVM guards to llvmjit_emit.h, and additionally make
sure it compiles standalone.
Per complaint from Tom Lane.
Author: Andres Freund
Discussion: https://postgr.es/m/19808.1548692361@sss.pgh.pa.us
Backpatch: 11, where JIT support was added

Given the right timing, psql could emit "connection to server was lost"
rather than one of the other messages that this test script checked for.
It looks like commit 4247db625 may have made this more likely, but
I don't really believe it was impossible before then. Rather than
stress about it, just add that spelling as one of the crash-successfully-
detected cases.
Discussion: https://postgr.es/m/19344.1548554028@sss.pgh.pa.us

Previously, \g would successfully execute the COPY command, but
the target specification if any was ignored, so that the data was
always dumped to the regular query output target. This seems like
a clear bug, so let's not just fix it but back-patch it.
While at it, adjust the documentation for \copy to recommend
"COPY ... TO STDOUT \g foo" as a plausible alternative.
Back-patch to 9.5. The problem exists much further back, but the
code associated with \g was refactored enough in 9.5 that we'd
need a significantly different patch for 9.4, and it doesn't
seem worth the trouble.
Daniel Vérité, reviewed by Fabien Coelho
Discussion: https://postgr.es/m/15dadc39-e050-4d46-956b-dcc4ed098753@manitou-mail.org

Since LISTEN is (still) disallowed, UNLISTEN must be a no-op in a
hot-standby session, and so there's no harm in allowing it. This
change allows client code to not worry about whether it's connected
to a primary or standby server when performing session-state-reset
type activities. (Note that DISCARD ALL, which includes UNLISTEN,
was already allowed, making it inconsistent to reject UNLISTEN.)
Per discussion, back-patch to all supported versions.
Shay Rojansky, reviewed by Mi Tar
Discussion: https://postgr.es/m/CADT4RqCf2gA_TJtPAjnGzkC3ZiexfBZiLmA-mV66e4UyuVv8bA@mail.gmail.com

A report from Andrew Dunstan showed that an ecpglib breakage that
causes repeated query failures could lead to infinite loops in some
ecpg test scripts, because they contain "while(1)" loops with no
exit condition other than successful test completion. That might
be all right for manual testing, but it seems entirely unacceptable
for automated test environments such as our buildfarm. We don't
want buildfarm owners to have to intervene manually when a test
goes wrong.
To fix, just change all those while(1) loops to exit after at most
100 iterations (which is more than any of them expect to iterate).
This seems sufficient since we'd see discrepancies in the test output
if any loop executed the wrong number of times.
I tested this by dint of intentionally breaking ecpg_do_prologue
to always fail, and verifying that the tests still got to completion.
Back-patch to all supported branches, since the whole point of this
exercise is to protect the buildfarm against future mistakes.
Discussion: https://postgr.es/m/18693.1548302004@sss.pgh.pa.us

We were failing to set conislocal correctly for constraints in
partitions after partition detach, leading to those constraints becoming
undroppable. Fix by setting the flag correctly. Existing databases
might contain constraints with the conislocal wrongly set to false, for
partitions that were detached; this situation should be fixable by
applying an UPDATE on pg_constraint to set conislocal true. This
problem should otherwise be innocuous and should disappear across a
dump/restore or pg_upgrade.
Secondarily, when constraint drop was attempted in a partitioned table,
ATExecDropConstraint would try to recurse to partitions after doing
performDeletion() of the constraint in the partitioned table itself; but
since the constraint in the partitions are dropped by the initial call
of performDeletion() (because of following dependencies), the recursion
step would fail since it would not find the constraint, causing the
whole operation to fail. Fix by preventing recursion.
Reported-by: Amit Langote
Diagnosed-by: Amit Langote
Author: Amit Langote, Álvaro Herrera
Discussion: https://postgr.es/m/f2b8ead5-4131-d5a8-8016-2ea0a31250af@lab.ntt.co.jp

The pgbench regression test supposed that srandom() with a specific value
would result in deterministic output from random(), as required by POSIX.
It emerges however that OpenBSD is too smart to be constrained by mere
standards, so their random() emits nondeterministic output anyway.
While a workaround does exist, what seems like a better fix is to stop
relying on the platform's srandom()/random() altogether, so that what
you get from --random-seed=N is not merely deterministic but platform
independent. Hence, use a separate pg_jrand48() random sequence in
place of random().
Also adjust the regression test case that's supposed to detect
nondeterminism so that it's more likely to detect it; the original
choice of random_zipfian parameter tended to produce the same output
all the time even if the underlying behavior wasn't deterministic.
In passing, improve pgbench's docs about random_zipfian().
Back-patch to v11 where this code was introduced.
Fabien Coelho and Tom Lane
Discussion: https://postgr.es/m/4615.1547792324@sss.pgh.pa.us

Apparently, some builds of MinGW contain a version of
_configthreadlocale() that always returns -1, indicating failure.
Rather than treating that as a curl-up-and-die condition, soldier on
as though the function didn't exist. This leaves us without thread
safety on such MinGW versions, but we didn't have it anyway.
Discussion: https://postgr.es/m/d06a16bc-52d6-9f0d-2379-21242d7dbe81@2ndQuadrant.com

I (Álvaro) forgot to do this in eb7ed3f30634, leading to undroppable
constraints after partitions are detached. Repair.
Reported-by: Amit Langote
Author: Amit Langote
Discussion: https://postgr.es/m/c1c9b688-b886-84f7-4048-1e4ebe9b1d06@lab.ntt.co.jp

In other places that has been changed from http://www.oasis-open.org/
https://www.oasis-open.org/ but there's a place where the change was
missed.
Discussion: https://postgr.es/m/20190121.222844.399814306477973879.t-ishii%40sraoss.co.jp

ecpglib attempts to force the LC_NUMERIC locale to "C" while reading
server output, to avoid problems with strtod() and related functions.
Historically it's just issued setlocale() calls to do that, but that
has major problems if we're in a threaded application. setlocale()
itself is not required by POSIX to be thread-safe (and indeed is not,
on recent OpenBSD). Moreover, its effects are process-wide, so that
we could cause unexpected results in other threads, or another thread
could change our setting.
On platforms having uselocale(), which is required by POSIX:2008,
we can avoid these problems by using uselocale() instead. Windows
goes its own way as usual, but we can make it safe by using
_configthreadlocale(). Platforms having neither continue to use the
old code, but that should be pretty much nobody among current systems.
(Subsequent buildfarm results show that recent NetBSD versions still
lack uselocale(), but it's not a big problem because they also do not
support non-"C" settings for LC_NUMERIC.)
Back-patch of commits 8eb4a9312 and ee27584c4.
Michael Meskes and Tom Lane; thanks also to Takayuki Tsunakawa.
Discussion: https://postgr.es/m/31420.1547783697@sss.pgh.pa.us

Seems to be from a bad case of copy-and-paste-itis in commit 665d1fad9.
It wouldn't be quite so annoying if it didn't contradict the comment
half a dozen lines above.
David Rowley
Discussion: https://postgr.es/m/CAKJS1f95Dyf8Qkdz4W+PbCmT-HTb54tkqUCC8isa2RVgSJ_pXQ@mail.gmail.com

Detaching a partition from a partitioned table that's constrained by
foreign keys requires additional action triggers on the referenced side;
otherwise, DELETE/UPDATE actions there fail to notice rows in the table
that was partition, and so are incorrectly allowed through. With this
commit, those triggers are now created. Conversely, when a table that
has a foreign key is attached as a partition to a table that also has
the same foreign key, those action triggers are no longer needed, so we
remove them.
Add a minimal test case verifying (part of) this.
Authors: Amit Langote, Álvaro Herrera
Discussion: https://postgr.es/m/f2b8ead5-4131-d5a8-8016-2ea0a31250af@lab.ntt.co.jp

Back in commit 100340e2dcd0, we made relcache entries keep lists of the
foreign keys applying to the relation -- but we forgot to update
CacheInvalidateHeapTuple to flush those entries when new FKs got created
or existing ones updated/deleted. No bugs appear to have been reported
that would be explained by this ommission, but I noticed the problem
while working on an unrelated bugfix which clearly showed it. Fix by
adding relcache flush on relevant foreign key changes.
Backpatch to 9.6, like the aforementioned commit.
Discussion: https://postgr.es/m/201901211927.7mmhschxlejh@alvherre.pgsql
Reviewed-by: Tom Lane

This reverts commit 458a1244f1fcf407874482a93b7631ecf5303d6e.
It has portability problems on Windows, which will require
a little bit of research to fix.
Discussion: https://postgr.es/m/20202.1548035461@sss.pgh.pa.us

M src/bin/pgbench/t/001_pgbench_with_server.pl

Postpone generating tlists and EC members for inheritance dummy children.

Previously, in set_append_rel_size(), we generated tlists and EC members
for dummy children for possible use by partition-wise join, even if
partition-wise join was disabled or the top parent was not a partitioned
table, but adding such EC members causes noticeable planning speed
degradation for queries with certain kinds of join quals like
"(foo.x + bar.y) = constant" where foo and bar are partitioned tables in
cases where there are lots of dummy children, as the EC members lists
grow huge, especially for the ECs derived from such join quals, which
makes the search for the parent EC members in add_child_rel_equivalences()
very time-consuming. Postpone the work until such children are actually
involved in a partition-wise join.
Reported-by: Sanyo Capobiango
Analyzed-by: Justin Pryzby and Alvaro Herrera
Author: Amit Langote, with a few additional changes by me
Reviewed-by: Ashutosh Bapat
Backpatch-through: v11 where partition-wise join was added
Discussion: https://postgr.es/m/CAO698qZnrxoZu7MEtfiJmpmUtz3AVYFVnwzR%2BpqjF%3DrmKBTgpw%40mail.gmail.com

This reverts commit bf070ce09e05943d6484de0ec17c7b02f2690a6d.
Per discussion, it's not desirable to add valgrind suppressions for
outside our own code base (e.g. glibc in this case), especially when
the suppressions may be platform-specific. There are better ways to
deal with that, e.g. by providing local suppressions.
Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com

Recent OpenBSD (at least 5.9 and up) has a version of getopt(3)
that will not cope with the "-:" spec we use to accept double-dash
options in postgres.c and postmaster.c. Admittedly, that's a hack
because POSIX only requires getopt() to allow alphanumeric option
characters. I have no desire to find another way, however, so
let's just do what we were already doing on Solaris: force use
of our own src/port/getopt.c implementation.
In passing, improve some of the comments around said implementation.
Per buildfarm and local testing. Back-patch to all supported branches.
Discussion: https://postgr.es/m/30197.1547835700@sss.pgh.pa.us

When creating a foreign key in a partitioned table, if some partitions
already have equivalent constraints, we wastefully create duplicates of
the constraints instead of attaching to the existing ones. That's
inconsistent with the de-duplication that is applied when a table is
attached as a partition. To fix, reuse the FK-cloning code instead of
having a separate code path.
Backpatch to Postgres 11. This is a subtle behavior change, but surely
a welcome one since there's no use in having duplicate foreign keys.
Discovered by Álvaro Herrera while thinking about a different problem
reported by Jesper Pedersen (bug #15587).
Author: Álvaro Herrera
Discussion: https://postgr.es/m/201901151935.zfadrzvyof4k@alvherre.pgsql

My commit 3de241dba86f introduced some code to create a clone of a
foreign key to a partition, but I put it in pg_constraint.c because it
was too close to the contents of the pg_constraint row. With the
previous commit that split out the constraint tuple deconstruction into
its own routine, it makes more sense to have the FK-cloning function in
tablecmds.c, mostly because its static subroutine can then be used by a
future bugfix.
My initial posting of this patch had this routine as static in
tablecmds.c, but sadly this function is already part of the Postgres 11
ABI as exported from pg_constraint.c, so keep it as exported also just
to avoid breaking any possible users of it.

My commit 3de241dba86f introduced some code (in tablecmds.c) to obtain
data from a pg_constraint row for a foreign key, that already existed in
ri_triggers.c. Split it out into its own routine in pg_constraint.c,
where it naturally belongs.
No functional code changes, only code movement.
Backpatch to pg11, because a future bugfix is simpler after this.

current_schema() gets called in the recently-added regression test from
c5660e0, and can be used in a parallel context, causing its call to fail
when creating a temporary schema.
Per buildfarm members crake and lapwing.
Discussion: https://postgr.es/m/20190118005949.GD1883@paquier.xyz

M src/test/regress/expected/temp.out
M src/test/regress/sql/temp.sql

Avoid assuming that we know the spelling of getopt_long’s error messages.

I've had enough of "fixing" this test case. Whatever value it has
is limited to verifying that pgbench fails for an unrecognized switch,
and we don't need to assume anything about what getopt_long prints in
order to do that.
Discussion: https://postgr.es/m/9427.1547701450@sss.pgh.pa.us

Attempting to use a temporary table within a two-phase transaction is
forbidden for ages. However, there have been uncovered grounds for
a couple of other object types and commands which work on temporary
objects with two-phase commit. In short, trying to create, lock or drop
an object on a temporary schema should not be authorized within a
two-phase transaction, as it would cause its state to create
dependencies with other sessions, causing all sorts of side effects with
the existing session or other sessions spawned later on trying to use
the same temporary schema name.
Regression tests are added to cover all the grounds found, the original
report mentioned function creation, but monitoring closer there are many
other patterns with LOCK, DROP or CREATE EXTENSION which are involved.
One of the symptoms resulting in combining both is that the session
which used the temporary schema is not able to shut down completely,
waiting for being able to drop the temporary schema, something that it
cannot complete because of the two-phase transaction involved with
temporary objects. In this case the client is able to disconnect but
the session remains alive on the backend-side, potentially blocking
connection backend slots from being used. Other problems reported could
also involve server crashes.
This is back-patched down to v10, which is where 9b013dc has introduced
MyXactFlags, something that this patch relies on.
Reported-by: Alexey Bashtanov
Author: Michael Paquier
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/5d910e2e-0db8-ec06-dd5f-baec420513c3@imap.cc
Backpatch-through: 10

Previously, parseCheckAggregates was run before
assign_query_collations, but this causes problems if any expression
has already had a collation assigned by some transform function (e.g.
transformCaseExpr) before parseCheckAggregates runs. The differing
collations would cause expressions not to be recognized as equal to
the ones in the GROUP BY clause, leading to spurious errors about
unaggregated column references.
The result was that CASE expr WHEN val ... would fail when "expr"
contained a GROUPING() expression or matched one of the group by
expressions, and where collatable types were involved; whereas the
supposedly identical CASE WHEN expr = val ... would succeed.
Backpatch all the way; this appears to have been wrong ever since
collations were introduced.
Per report from Guillaume Lelarge, analysis and patch by me.
Discussion: https://postgr.es/m/CAECtzeVSO_US8C2Khgfv54ZMUOBR4sWq+6_bLrETnWExHT=rFg@mail.gmail.com
Discussion: https://postgr.es/m/87muo0k0c7.fsf@news-spur.riddles.org.uk

These have been found while cross-checking for the use of unique words
in the documentation, and a wait event was not getting generated in a way
consistent to what the documentation provided.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/9b5a3a85-899a-ae62-dbab-1e7943aa5ab1@gmail.com

We were considering the INCLUDE columns as part of the key, allowing
unicity-violating rows to be inserted in different partitions.
Concurrent development conflict in eb7ed3f30634 and 8224de4f42cc.
Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20190109065109.GA4285@telsasoft.com

This will enable things like the buildfarm client to discover more
reliably if certain libraries have been installed.
Discussion: https://postgr.es/m/859e7c91-7ef4-d4b4-2ca2-8046e0cbee09@2ndQuadrant.com
Backpatch to all live branches.

This especially helps braces that surround code blocks. Back-patch to
v11, where commit 56fb890ace8ac0ca955ae0803c580c2074f876f6 first
appeared; before that, settings were even more distant from perltidy.
Reviewed by Andrew Dunstan.
Discussion: https://postgr.es/m/20190103055355.GB267595@gust.leadboat.com

Some systems don't ship with "python" by default anymore, only
"python3" or "python2" or some combination, so include those in the
configure search.
Discussion: https://www.postgresql.org/message-id/flat/1457.1543184081%40sss.pgh.pa.us#c9cc1199338fd6a257589c6dcea6cf8d

Some makefiles were trying to do this:
temp-install: EXTRA_INSTALL=contrib/test_decoding
but that no longer works as of commit aa019da52: the macro is now
consulted by the checkprep target, one level down, and apparently
gmake doesn't propagate such macro settings recursively.
The problem is masked since 42e61c774 because pgxs.mk also sets up
EXTRA_INSTALL, and correctly applies it to the checkprep target.
Unfortunately I'd not risked back-patching that to before v11.
Since aa019da52 was pushed back to v10, it broke test_decoding
there (the only module for which this actually makes a difference
at present).
Hence, back-patch 42e61c774 to v10. Also, remove some demonstrably
useless settings of EXTRA_INSTALL in v10 and v11 (they'd already
been cleaned up in HEAD).
Per buildfarm.
Discussion: https://postgr.es/m/CAEepm=1pEJdwv6DSGmOfpX0EaX7L7sT28c1nXpqvQvmLfEWb1g@mail.gmail.com

Up to now, createplan.c attempted to share PARAM_EXEC slots for
NestLoopParams across different plan levels, if the same underlying Var
was being fed down to different righthand-side subplan trees by different
NestLoops. This was, I think, more of an artifact of using subselect.c's
PlannerParamItem infrastructure than an explicit design goal, but anyway
that was the end result.
This works well enough as long as the plan tree is executing synchronously,
but the feature whereby Gather can execute the parallelized subplan locally
breaks it. An upper NestLoop node might execute for a row retrieved from
a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
while the leader's copy of the parallelized subplan is suspended with a
different active value of the row the Var comes from. When control
eventually returns to the leader's subplan, it gets the wrong answers if
the same PARAM_EXEC slot is being used within the subplan, as reported
in bug #15577 from Bartosz Polnik.
This is pretty reminiscent of the problem fixed in commit 46c508fbc, and
the proper fix seems to be the same: don't try to share PARAM_EXEC slots
across different levels of controlling NestLoop nodes.
This requires decoupling NestLoopParam handling from PlannerParamItem
handling, although the logic remains somewhat similar. To avoid bizarre
division of labor between subselect.c and createplan.c, I decided to move
all the param-slot-assignment logic for both cases out of those files
and put it into a new file paramassign.c. Hopefully it's a bit better
documented now, too.
A regression test case for this might be nice, but we don't know a
test case that triggers the problem with a suitably small amount
of data.
Back-patch to 9.6 where we added Gather nodes. It's conceivable that
related problems exist in older branches; but without some evidence
for that, I'll leave the older branches alone.
Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org

Since approximately PostgreSQL 10, it is no longer required that
environment variables at installation time such as PERL, PYTHON, TCLSH
be "full path names", so change that phrasing in the installation
instructions. (The exact time of change appears to differ for PERL
and the others, but it works consistently in PostgreSQL 10.)
Also while we're here document the defaults for PERL and PYTHON, but
since the search list for TCLSH is so long, let's leave that out so we
don't need to maintain a copy of that list in the installation
instructions.

This was an oversight in commit 16828d5c. If the table is going to be
rewritten, we simply clear all the missing values from all the table's
attributes, since there will no longer be any rows with the attributes
missing. Otherwise, we repackage the missing value in an array
constructed with the new type specifications.
Backpatch to release 11.
This fixes bug #15446, reported by Dmitry Molotkov
Reviewed by Dean Rasheed

For a long time, plpgsql has allowed trigger functions to parse
references to OLD and NEW even if the current trigger event type didn't
assign a value to one or the other variable; but actually executing such
a reference would fail. The v11 changes to use "expanded records" for
DTYPE_REC variables changed the behavior so that the unassigned variable
now reads as a null composite value. While this behavioral change was
more or less unintentional, it seems that leaving it like this is better
than adding code and complexity to be bug-compatible with the old way.
The change doesn't break any code that worked before, and it eliminates
a gotcha that often required extra code to work around.
Hence, update the docs to say that these variables are "null" not
"unassigned" when not relevant to the event type. And add a regression
test covering the behavior, so that we'll notice if we ever break it
again.
Per report from Kristjan Tammekivi.
Discussion: https://postgr.es/m/CAABK7uL-uC9ZxKBXzo_68pKt7cECfNRv+c35CXZpjq6jCAzYYA@mail.gmail.com

runtime.sgml said that you couldn't change SysV IPC parameters on OpenBSD
except by rebuilding the kernel. That's definitely wrong in OpenBSD 6.x,
and excavation in their man pages says it changed in OpenBSD 3.3.
Update NetBSD and OpenBSD sections to recommend adjustment of the SEMMNI
and SEMMNS settings, which are painfully small by default on those
platforms. (The discussion thread contemplated recommending that
people select POSIX semaphores instead, but the performance consequences
of that aren't really clear, so I'll refrain.)
Remove pointless discussion of SEMMNU and SEMMAP from the FreeBSD
section. Minor other wordsmithing.
Discussion: https://postgr.es/m/27582.1546928073@sss.pgh.pa.us

This patch changes the rule for whether or not a tuple seen by ANALYZE
should be included in its sample.
When we last touched this logic, in commit 51e1445f1, we weren't
thinking very hard about tuples being UPDATEd by a long-running
concurrent transaction. In such a case, we might see the pre-image as
either LIVE or DELETE_IN_PROGRESS depending on timing; and we might see
the post-image not at all, or as INSERT_IN_PROGRESS. Since the existing
code will not sample either DELETE_IN_PROGRESS or INSERT_IN_PROGRESS
tuples, this leads to concurrently-updated rows being omitted from the
sample entirely. That's not very helpful, and it's especially the wrong
thing if the concurrent transaction ends up rolling back.
The right thing seems to be to sample DELETE_IN_PROGRESS rows just as if
they were live. This makes the "sample it" and "count it" decisions the
same, which seems good for consistency. It's clearly the right thing
if the concurrent transaction ends up rolling back; in effect, we are
sampling as though IN_PROGRESS transactions haven't happened yet.
Also, this combination of choices ensures maximum robustness against
the different combinations of whether and in which state we might see the
pre- and post-images of an update.
It's slightly annoying that we end up recording immediately-out-of-date
stats in the case where the transaction does commit, but on the other
hand the stats are fine for columns that didn't change in the update.
And the alternative of sampling INSERT_IN_PROGRESS rows instead seems
like a bad idea, because then the sampling would be inconsistent with
the way rows are counted for the stats report.
Per report from Mark Chambers; thanks to Jeff Janes for diagnosing
what was happening. Back-patch to all supported versions.
Discussion: https://postgr.es/m/CAFh58O_Myr6G3tcH3gcGrF-=OExB08PJdWZcSBcEcovaiPsrHA@mail.gmail.com

Debian testing and newer now require that RSA and DHE keys are at
least 2048 bit long and no longer allow SHA-1 for signatures in
certificates. This is currently causing the ssl tests to fail there
because the test certificates and keys have been created in violation
of those conditions.
Update the parameters to create the test files and create a new set of
test files.
Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20180917131340.GE31460%40paquier.xyz

MinMaxExpr invokes the btree comparison function for its input datatype,
so it's only leakproof if that function is. Many such functions are
indeed leakproof, but others are not, and we should not just assume that
they are. Hence, adjust contain_leaked_vars to verify the leakproofness
of the referenced function explicitly.
I didn't add a regression test because it would need to depend on
some particular comparison function being leaky, and that's a moving
target, per discussion.
This has been wrong all along, so back-patch to supported branches.
Discussion: https://postgr.es/m/31042.1546194242@sss.pgh.pa.us

fe0a0b5, which has added a stronger random source in Postgres, has
introduced a thinko when creating a padding message which gets encrypted
for Elgamal. The padding message cannot have zeros, which are replaced
by random bytes. However if pg_strong_random() failed, the message
would finish by being considered in correct shape for encryption with
zeros.
Author: Tom Lane
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20186.1546188423@sss.pgh.pa.us
Backpatch-through: 10

This closes a race condition in "make -j check-world"; the symptom was
EEXIST errors. Back-patch to v10, before which parallel check-world had
worse problems.
Discussion: https://postgr.es/m/20181224221601.GA3227827@rfd.leadboat.com

We already redirected other temp-install stderr and all temp-install
stdout in this way. Back-patch to v10, like the next commit.
Discussion: https://postgr.es/m/20181224221601.GA3227827@rfd.leadboat.com

Detect it the way pg_ctl's wait_for_postmaster() does. When pg_regress
spawned a postmaster that failed startup, we were detecting that only
with "pg_regress: postmaster did not respond within 60 seconds".
Back-patch to 9.4 (all supported versions).
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/20181231172922.GA199150@gust.leadboat.com

POSIX specifies that jrand48() returns a signed 32-bit value (in the
range [-2^31, 2^31)), but our code was returning an unsigned 32-bit
value (in the range [0, 2^32)). This doesn't actually matter to any
existing call site, because they all cast the "long" result to int32
or uint32; but it will doubtless bite somebody in the future.
To fix, cast the arithmetic result to int32 explicitly before the
compiler widens it to long (if widening is needed).
While at it, upgrade this file's far-short-of-project-style comments.
Had there been some peer pressure to document pg_jrand48() properly,
maybe this thinko wouldn't have gotten committed to begin with.
Backpatch to v10 where pg_jrand48() was added, just in case somebody
back-patches a fix that uses it and depends on the standard behavior.
Discussion: https://postgr.es/m/17235.1545951602@sss.pgh.pa.us

Isolation test suite of GIN predicate locking was criticized for being too slow,
especially under Valgrind. This commit is intended to accelerate it. Tests are
simplified in the following ways.
1) Amount of data is reduced. We're now close to the minimal amount of data,
which produces at least one posting tree and at least two pages of entry
tree.
2) Three isolation tests are merged into one.
3) Only one tuple is queried from posting tree. So, locking of index is the
same, but tuple locks are not propagated to relation lock. Also, it is
faster.
4) Test cases itself are simplified. Now each test case run just one INSERT
and one SELECT involving GIN, which either conflict or not.
Discussion: https://postgr.es/m/20181204000740.ok2q53nvkftwu43a%40alap3.anarazel.de
Reported-by: Andres Freund
Tested-by: Andrew Dunstan
Author: Alexander Korotkov
Backpatch-through: 11

According to README we acquire predicate locks on entry tree leafs and posting
tree roots. However, when ginFindLeafPage() is going to lock leaf in exclusive
mode, then it checks root for conflicts regardless whether it's a entry or
posting tree. Assuming that we never place predicate lock on entry tree root
(excluding corner case when root is leaf), this check is redundant. This
commit removes this check. Now, root conflict checking is controlled by
separate argument of ginFindLeafPage().
Discussion: https://postgr.es/m/CAPpHfdv7rrDyy%3DMgsaK-L9kk0AH7az0B-mdC3w3p0FSb9uoyEg%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 11

Inheritance trees can include temporary tables if the parent is
permanent, which makes possible the presence of multiple temporary
children from different sessions. Trying to issue a TRUNCATE on the
parent in this scenario causes a failure, so similarly to any other
queries just ignore such cases, which makes TRUNCATE work
transparently.
This makes truncation behave similarly to any other DML query working on
the parent table with queries which need to be issues on children. A
set of isolation tests is added to cover basic cases.
Reported-by: Zhou Digoal
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org
Backpatch-through: 9.4

I made a frontend fprintf() format use %m, forgetting that that's only
safe in HEAD not the back branches; prior to 96bf88d52 and d6c55de1f,
it would work on glibc platforms but not elsewhere. Revert to using
%s ... strerror(errno) as the code did before.
We could have left HEAD as-is, but for code consistency across branches,
I chose to apply this patch there too.
Per Coverity and a few buildfarm members.

At the end of recovery for the post-promotion process, a new history
file is created followed by the last partial segment of the previous
timeline. Based on the timing, the archiver would first try to archive
the last partial segment and then the history file. This can delay the
detection of a new timeline taken, particularly depending on the time it
takes to transfer the last partial segment as it delays the moment the
history file of the new timeline gets archived. This can cause promoted
standbys to use the same timeline as one already taken depending on the
circumstances if multiple instances look at archives at the same
location.
This commit changes the order of archiving so as history files are
archived in priority over other file types, which reduces the likelihood
of the same timeline being taken (still not reducing the window to
zero), and it makes the archiver behave more consistently with the
startup process doing its post-promotion business.
Author: David Steele
Reviewed-by: Michael Paquier, Kyotaro Horiguchi
Discussion: https://postgr.es/m/929068cf-69e1-bba2-9dc0-e05986aed471@pgmasters.net
Backpatch-through: 9.5

M src/backend/postmaster/pgarch.c

Disable WAL-skipping optimization for COPY on views and foreign tables

COPY can skip writing WAL when loading data on a table which has been
created in the same transaction as the one loading the data, however
this cannot work on views or foreign table as this would result in
trying to flush relation files which do not exist. So disable the
optimization so as commands are able to work the same way with any
configuration of wal_level.
Tests are added to cover the different cases, which need to have
wal_level set to minimal to allow the problem to show up, and that is
not the default configuration.
Reported-by: Luis M. Carril, Etsuro Fujita
Author: Amit Langote, Michael Paquier
Reviewed-by: Etsuro Fujita
Discussion: https://postgr.es/m/15552-c64aa14c5c22f63c@postgresql.org
Backpatch-through: 10, where support for COPY on views has been added,
while v11 has added support for COPY on foreign tables.

013ebc0a7b implements so-called GiST microvacuum. That is gistgettuple() marks
index tuples as dead when kill_prior_tuple is set. Later, when new tuple
insertion claims page space, those dead index tuples are physically deleted
from page. When this deletion is replayed on standby, it might conflict with
read-only queries. But 013ebc0a7b doesn't handle this. That may lead to
disappearance of some tuples from read-only snapshots on standby.
This commit implements resolving of conflicts between replay of GiST microvacuum
and standby queries. On the master we implement new WAL record type
XLOG_GIST_DELETE, which comprises necessary information. On stable releases
we've to be tricky to keep WAL compatibility. Information required for conflict
processing is just appended to data of XLOG_GIST_PAGE_UPDATE record. So,
PostgreSQL version, which doesn't know about conflict processing, will just
ignore that.
Reported-by: Andres Freund
Diagnosed-by: Andres Freund
Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de
Author: Alexander Korotkov
Backpatch-through: 9.6

For probably bogus reasons, we acquire only AccessShareLock on the
partition when we try to detach it from its parent partitioned table.
This can cause ugly things to happen if another transaction is doing
any sort of DDL to the partition concurrently.
Upgrade that lock to ShareUpdateExclusiveLock, which per discussion
seems to be the minimum needed.
Reported by Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com

"$user" in a search_path string is replaced by CURRENT_USER not
SESSION_USER. (It actually was SESSION_USER in the initial implementation,
but we changed it shortly later, and evidently forgot to fix the docs to
match.)
Noted by antonov@stdpr.ru
Discussion: https://postgr.es/m/159151fb45d490c8d31ea9707e9ba99d@stdpr.ru

When a partition is detached from its parent, we acquire locks on all
attached indexes to also detach them ... but we release those locks
immediately. This is a violation of the policy of keeping locks on user
objects to the end of the transaction. Bug introduced in 8b08f7d4820f.
It's unclear that there are any ill effects possible, but it's clearly
wrong nonetheless. It's likely that bad behavior *is* possible, but
mostly because the relation that the index is for is only locked with
AccessShareLock, which is an older bug that shall be fixed separately.
While touching that line of code, close the index opened with
index_open() using index_close() instead of relation_close().
No difference in practice, but let's be consistent.
Unearthed by Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com

Commit 40dae7ec537, which made the handling of interrupted nbtree page
splits more robust, removed an nbtree-specific end-of-recovery cleanup
step. This meant that it was no longer possible to complete an
interrupted page split during recovery. However, a reference to
recovery as a reason for using a NULL stack while inserting into a
parent page was missed. Remove the reference.
Remove a similar obsolete reference to recovery that was introduced much
more recently, as part of the btree fastpath optimization enhancement
that made it into Postgres 11 (commit 2b272734, and follow-up commits).
Backpatch: 11-, where the fastpath optimization was introduced.

"rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples,
which is sensible if it's to be multiplied by total-inner-tuples or a cost
value corresponding to scanning all the inner tuples. But in reality it
was (mostly) multiplied by inner_rows or a related cost, numbers that take
into account the possibility of stopping short of scanning the whole inner
relation thanks to a limited key range in the outer relation. This'd
still make sense if we could expect that stopping short would result in a
proportional decrease in the number of tuples that have to be rescanned.
It does not, however. The argument that establishes the validity of our
estimate for that number is independent of whether we scan all of the inner
relation or stop short, and experimentation also shows that stopping short
doesn't reduce the number of rescanned tuples. So the correct calculation
is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply
that by inner_rows or a corresponding cost value.
Most of the time this doesn't make much difference, but if we have
both a high rescan rate (due to lots of duplicate values) and an outer
key range much smaller than the inner key range, then the error can
be significant, leading to a large underestimate of the cost associated
with rescanning.
Per report from Vijaykumar Jain. This thinko appears to go all the way
back to the introduction of the rescan estimation logic in commit
70fba7043, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com

The new grammar pattern of ALTER INDEX SET STATISTICS able to use column
numbers on top of the existing column names introduced by commit 5b6d13e
forgot to add support for the feature in pg_dump, so defining statistics
on index columns was missing from the dumps, potentially causing silent
planning problems with a subsequent restore.
pg_dump ought to not use column names in what it generates as these are
automatically generated by the server and could conflict with real
relation attributes with matching patterns. "expr" and "exprN", N
incremented automatically after the creation of the first one, are used
as default attribute names for index expressions, and that could easily
match what is defined in other relations, causing the dumps to fail if
some of those attributes are renamed at some point. So to avoid any
problems, the new grammar with column numbers gets used.
Reported-by: Ronan Dunklau
Author: Michael Paquier
Reviewed-by: Tom Lane, Adrien Nayrat, Amul Sul
Discussion: https://postgr.es/m/CAARsnT3UQ4V=yDNW468w8RqHfYiY9mpn2r_c5UkBJ97NAApUEw@mail.gmail.com
Backpatch-through: 11, where the new syntax has been introduced.

This is an oversight from recent commit b13fd344. While on it, tweak
the previous test with a better name for the renamed primary key.
Detected by buildfarm member prion which forces relation cache release
with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous
commit.

When a constraint gets renamed, it may have associated with it a target
relation (for example domain constraints don't have one). Not
invalidating the target relation cache when issuing the renaming can
result in issues with subsequent commands that refer to the old
constraint name using the relation cache, causing various failures. One
pattern spotted was using CREATE TABLE LIKE after a constraint
renaming.
Reported-by: Stuart <sfbarbee@gmail.com>
Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org

reap_child() basically ignored the possibility of either an error in
waitpid() itself or a child process failure on signal. We don't really
need to do more than report and crash hard, but proceeding as though
nothing is wrong is definitely Not Acceptable. The error report for
nonzero child exit status was pretty off-point, as well.
Noted while fooling around with child-process failure detection
logic elsewhere. It's been like this a long time, so back-patch to
all supported branches.

Commit ffa4cbd62 added logic to detect SIGPIPE failure of a COPY child
process, but it only worked correctly if the SIGPIPE occurred in the
immediate child process. Depending on the shell in use and the
complexity of the shell command string, we might instead get back
an exit code of 128 + SIGPIPE, representing a shell error exit
reporting SIGPIPE in the child process.
We could just hack up ClosePipeToProgram() to add the extra case,
but it seems like this is a fairly general issue deserving a more
general and better-documented solution. I chose to add a couple
of functions in src/common/wait_error.c, which is a natural place
to know about wait-result encodings, that will test for either a
specific child-process signal type or any child-process signal failure.
Then, adjust other places that were doing ad-hoc tests of this type
to use the common functions.
In RestoreArchivedFile, this fixes a race condition affecting whether
the process will report an error or just silently proc_exit(1): before,
that depended on whether the intermediate shell got SIGTERM'd itself
or reported a child process failing on SIGTERM.
Like the previous patch, back-patch to v10; we could go further
but there seems no real need to.
Per report from Erik Rijkers.
Discussion: https://postgr.es/m/f3683f87ab1701bea5d86a7742b22432@xs4all.nl

The idea here is to not call recordDependencyOn for the default collation,
since we know that's pinned. But what the code actually did was to record
the partition key's dependency on the opclass twice, instead.
Evidently introduced by sloppy coding in commit 2186b608b. Back-patch
to v10 where that came in.

When GIN vacuum deletes a posting tree page, it assumes that no concurrent
searchers can access it, thanks to ginStepRight() locking two pages at once.
However, since 9.4 searches can skip parts of posting trees descending from the
root. That leads to the risk that page is deleted and reclaimed before
concurrent search can access it.
This commit prevents the risk of above by waiting for every transaction, which
might wait to reference this page, to finish. Due to binary compatibility
we can't change GinPageOpaqueData to store corresponding transaction id.
Instead we reuse page header pd_prune_xid field, which is unused in index pages.
Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
Author: Andrey Borodin, Alexander Korotkov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4

On standby ginRedoDeletePage() can work concurrently with read-only queries.
Those queries can traverse posting tree in two ways.
1) Using rightlinks by ginStepRight(), which locks the next page before
unlocking its left sibling.
2) Using downlinks by ginFindLeafPage(), which locks at most one page at time.
Original lock order was: page, parent, left sibling. That lock order can
deadlock with ginStepRight(). In order to prevent deadlock this commit changes
lock order to: left sibling, page, parent. Note, that position of parent in
locking order seems insignificant, because we only lock one page at time while
traversing downlinks.
Reported-by: Chen Huajun
Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin
Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
Author: Alexander Korotkov
Backpatch-through: 9.4

Before 218f51584d5 if posting tree page is about to be deleted, then the whole
posting tree is locked by LockBufferForCleanup() on root preventing all the
concurrent inserts. 218f51584d5 reduced locking to the subtree containing
page to be deleted. However, due to concurrent parent split, inserter doesn't
always holds pins on all the pages constituting path from root to the target
leaf page. That could cause a deadlock between GIN vacuum process and GIN
inserter. And we didn't find non-invasive way to fix this.
This commit reverts VACUUM behavior to lock the whole posting tree before
delete any page. However, we keep another useful change by 218f51584d5: the
tree is locked only if there are pages to be deleted.
Reported-by: Chen Huajun
Diagnosed-by: Chen Huajun, Andrey Borodin, Peter Geoghegan
Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
Author: Alexander Korotkov, based on ideas from Andrey Borodin and Peter Geoghegan
Reviewed-by: Andrey Borodin
Backpatch-through: 10

postgres_fdw's postgresGetForeignPlan() assumes without checking that the
outer_plan it's given for a join relation must have a NestLoop, MergeJoin,
or HashJoin node at the top. That's been wrong at least since commit
4bbf6edfb (which could cause insertion of a Sort node on top) and it seems
like a pretty unsafe thing to Just Assume even without that.
Through blind good fortune, this doesn't seem to have any worse
consequences today than strange EXPLAIN output, but it's clearly trouble
waiting to happen.
To fix, test the node type explicitly before touching Join-specific
fields, and avoid jamming the new tlist into a node type that can't
do projection. Export a new support function from createplan.c
to avoid building low-level knowledge about the latter into FDWs.
Back-patch to 9.6 where the faulty coding was added. Note that the
associated regression test cases don't show any changes before v11,
apparently because the tests back-patched with 4bbf6edfb don't actually
exercise the problem case before then (there's no top-level Sort
in those plans).
Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us

Our support for multiple-set-clauses in UPDATE assumes that the Params
referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan
in the targetlist of the plan node that calculates the updated row.
(Yeah, it's a hack...) In some PG branches it's possible that a Result
node gets inserted between the primary calculation of the update tlist
and the ModifyTable node. setrefs.c did the wrong thing in this case
and left the upper-level Params as Params, causing a crash at runtime.
What it should do is replace them with "outer" Vars referencing the child
plan node's output. That's a result of careless ordering of operations
in fix_upper_expr_mutator, so we can fix it just by reordering the code.
Fix fix_join_expr_mutator similarly for consistency, even though join
nodes could never appear in such a context. (In general, it seems
likely to be a bit cheaper to use Vars than Params in such situations
anyway, so this patch might offer a tiny performance improvement.)
The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff
was introduced, so back-patch that far. However, this may be a live
bug only in 9.6.x and 10.x, as the other branches don't seem to want
to calculate the final tlist below the Result node. (That plan shape
change between branches might be a mini-bug in itself, but I'm not
really interested in digging into the reasons for that right now.
Still, add a regression test memorializing what we expect there,
so we'll notice if it changes again.)
Per bug report from Eduards Bezverhijs.
Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com

This module overlooked this necessary fixup step on the results of
transformWhereClause(). It accidentally worked anyway, because the
constructed expression involved type "name" which is not collatable,
but it fell over while I was experimenting with changing "name" to
be collatable.
Back-patch, not because there's any live bug here in back branches,
but because somebody might use this code as a model for some real
application and then not understand why it doesn't work.

Unlike other ALTER ref pages, this one neglected to mention that
ALTER OWNER requires being a member of the new owning role.
Per bug #15546 from Stefan Kadow.
Discussion: https://postgr.es/m/15546-0558c75fd2025e7c@postgresql.org

Slow runs of buildfarm members chipmunk, hornet and mandrill saw the
shorter timeouts expire. The 180s timeout in poll_query_until has been
trouble-free since 2a0f89cd717ce6d49cdc47850577823682167e87 introduced
it two years ago, so use 180s more widely. Back-patch to 9.6, where the
first of these timeouts was introduced.
Reviewed by Michael Paquier.
Discussion: https://postgr.es/m/20181209001601.GC2973271@rfd.leadboat.com

Although copyfuncs.c has a check_stack_depth call in its recursion,
equalfuncs.c, outfuncs.c, and readfuncs.c lacked one. This seems
unwise.
Likewise fix planstate_tree_walker(), in branches where that exists.
Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us

Previously, it would just pass back a partially-uninitialized tupdesc,
which doesn't seem like a safe or useful behavior.
Backpatch to v10 where this code came in.
Discussion: https://postgr.es/m/30830.1544384975@sss.pgh.pa.us

The stanza of ExecuteTruncate[Guts] that truncates a target table's toast
relation re-used the loop local variable "rel" to reference the toast rel.
This was safe enough when written, but commit d42358efb added code below
that that supposed "rel" still pointed to the parent table. Therefore,
the stats counter update was applied to the wrong relcache entry (the
toast rel not the user rel); and if we were unlucky and that relcache
entry had been flushed during reindex_relation, very bad things could
ensue.
(I'm surprised that CLOBBER_CACHE_ALWAYS testing hasn't found this.
I'm even more surprised that the problem wasn't detected during the
development of d42358efb; it must not have been tested in any case
with a toast table, as the incorrect stats counts are very obvious.)
To fix, replace use of "rel" in that code branch with a more local
variable. Adjust test cases added by d42358efb so that some of them
use tables with toast tables.
Per bug #15540 from Pan Bian. Back-patch to 9.5 where d42358efb came in.
Discussion: https://postgr.es/m/15540-01078812338195c0@postgresql.org

Remove dead code (which would be incorrect if it weren't dead),
per report from Pan Bian. Add a CHECK_FOR_INTERRUPTS in the
inner loop over child relations, because there's little point
in having one in the outer loop if there's not one here too.
Minor stylistic adjustments and comment improvements.
Seems to be aboriginal to this code (cf commit 665d1fad9).
Back-patch to v10 where that came in, not because any of this
is significant, but just to keep the branches looking similar.
Discussion: https://postgr.es/m/15539-06d00ef6b1e2e1bb@postgresql.org

Places that are testing for *printf failure ought to include the format
string in their error reports, since bad-format-string is one of the
more likely causes of such failure. This both makes it easier to find
and repair the mistake, and provides at least some useful info to the
user who stumbles across such a problem.
Also, tighten snprintf.c to report EINVAL for an invalid flag or
final character in a format %-spec (including the case where the
%-spec is missing a final character altogether). This seems like
better project policy, and it also allows removing an instruction
or two from the hot code path.
Back-patch the error reporting change in pvsnprintf, since it should be
harmless and may be helpful; but not the snprintf.c change.
Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an
invalid translated format string. These changes don't fix that error,
but they should improve matters next time we make such a mistake.
Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org

It was pointed out that in the planner stats documentation under
Extended Statistics, one of the sentences was a bit awkward. Improve
that by rewording it slightly.
Discussion: https://postgr.es/m/154409976780.14137.2785644488950047100@wrigleys.postgresql.org

When an indexes is created on a partitioned table using ONLY (don't
recurse to partitions), it gets marked invalid until index partitions
are attached for each table partition. But there's no reason to do this
if there are no partitions ... and moreover, there's no way to get the
index to become valid afterwards, because all partitions that get
created/attached get their own index partition already attached to the
parent index, so there's no chance to do ALTER INDEX ... ATTACH PARTITION
that would make the parent index valid.
Fix by not marking the index as invalid to begin with.
This is very similar to 9139aa19423b, but the pg_dump aspect does not
appear to be relevant until we add FKs that can point to PKs on
partitioned tables. (I tried to cause the pg_upgrade test to break by
leaving some of these bogus tables around, but wasn't able to.)
Making this change means that an index that was supposed to be invalid
in the insert_conflict regression test is no longer invalid; reorder the
DDL so that the test continues to verify the behavior we want it to.
Author: Álvaro Herrera
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/20181203225019.2vvdef2ybnkxt364@alvherre.pgsql

Three issues are fixed in this patch:
- Base backups forgot to ignore files specific to EXEC_BACKEND, leading
to spurious warnings when checksums are enabled, per analysis from me.
- pg_verify_checksums forgot about files specific to EXEC_BACKEND,
leading to failures of the tool on any such build, particularly Windows.
This error was originally found by newly-introduced TAP tests in various
buildfarm members using EXEC_BACKEND.
- pg_verify_checksums forgot to count for temporary files and temporary
paths, which could be valid relation files, without checksums, per
report from Andres Freund. More tests are added to cover this case.
A new test case which emulates corruption for a file in a different
tablespace is added, coming from from Michael Banck, while I have coded
the main code and refactored the test code.
Author: Michael Banck, Michael Paquier
Reviewed-by: Stephen Frost, David Steele
Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz

This basically reverts commit d55241af705667d4503638e3f77d3689fd6be31,
leaving around a portion of the regression tests still adapted with
empty relation files, and corrupted cases. This is also proving to be
failing to check properly relation files located in a non-default
tablespace path.
Per discussion with various folks, including Stephen Frost, David
Steele, Andres Freund, Michael Banck and myself.
Reported-by: Michael Banck
Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz
Backpatch-through: 11

The source code comments documented this, but the user-facing docs, not
so much. Add a section to Appendix B that discusses it.
In passing, improve a couple other things in Appendix B --- notably,
a long-obsolete claim that time zone abbreviations are looked up in
a fixed table.
Per bug #15527 from Michael Davidson.
Discussion: https://postgr.es/m/15527-f1be0b4dc99ebbe7@postgresql.org

M doc/src/sgml/datetime.sgml

Ensure static libraries have correct mod time even if ranlib messes it up.

In at least Apple's version of ranlib, the output file is updated to have
a mod time equal to the max of the timestamps of its components, and that
data only has seconds precision. On a filesystem with sub-second file
timestamp precision --- say, APFS --- this can result in the finished
static library appearing older than its input files, which causes useless
rebuilds and possible outright failures in parallel makes.
We've only seen this reported in the field from people using Apple's
ranlib with a non-Apple make, because Apple's make doesn't know about
sub-second timestamps either so it doesn't decide rebuilds are needed.
But Apple's ranlib presumably shares code with at least some BSDen,
so it's not that unlikely that the same problem could arise elsewhere.
To fix, just "touch" the output file after ranlib finishes.
We seem to need this in only one place. There are other calls of
ranlib in our makefiles, but they are working on intermediate files
whose timestamps are not actually important, or else on an installed
static library for which sub-second timestamp precision is unlikely
to matter either. (Also, so far as I can tell, Apple's ranlib doesn't
mess up the file timestamp in the latter usage anyhow.)
In passing, change "ranlib" to "$(RANLIB)" in one place that was
bypassing the make macro for no good reason.
Per bug #15525 from Jack Kelly (via Alyssa Ross).
Back-patch to all supported branches.
Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org

This fixes an oversight from c6c3334 which forgot that if a subset of
WAL senders are stopping and in a sync state, other WAL senders could
still be waiting for a WAL position to be synced while committing a
transaction. However the subset of stopping senders would not release
waiters, potentially breaking synchronous replication guarantees. This
commit makes sure that even WAL senders stopping are able to release
waiters and are tracked properly.
On 9.4, this can also trigger an assertion failure when setting for
example max_wal_senders to 1 where a WAL sender is not able to find
itself as in synchronous state when the instance stops.
Reported-by: Paul Guo
Author: Paul Guo, Michael Paquier
Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com
Backpatch-through: 9.4

Move the responsibility for checking for and reporting a failure from
the only current BufFileSize() caller, logtape.c, to BufFileSize()
itself. Code within buffile.c is generally responsible for interfacing
with fd.c to report irrecoverable failures. This seems like a
convention that's worth sticking to.
Reorganizing things this way makes it easy to make the error message
raised in the event of BufFileSize() failure descriptive of the
underlying problem. We're now clear on the distinction between
temporary file name and BufFile name, and can show errno, confident that
its value actually relates to the error being reported. In passing, an
existing, similar buffile.c ereport() + errcode_for_file_access() site
is changed to follow the same conventions.
The API of the function BufFileSize() is changed by this commit, despite
already being in a stable release (Postgres 11). This seems acceptable,
since the BufFileSize() ABI was changed by commit aa551830421, which
hasn't made it into a point release yet. Besides, it's difficult to
imagine a third party BufFileSize() caller not just raising an error
anyway, since BufFile state should be considered corrupt when
BufFileSize() fails.
Per complaint from Tom Lane.
Discussion: https://postgr.es/m/26974.1540826748@sss.pgh.pa.us
Backpatch: 11-, where shared BufFiles were introduced.

Since commit 2f1d2b7a we have set PAM_RHOST to "[local]" for Unix
sockets. This caused Linux PAM's libaudit integration to make DNS
requests for that name. It's not exactly clear what value PAM_RHOST
should have in that case, but it seems clear that we shouldn't set it
to an unresolvable name, so don't do that.
Back-patch to 9.6. Bug #15520.
Author: Thomas Munro
Reviewed-by: Peter Eisentraut
Reported-by: Albert Schabhuetl
Discussion: https://postgr.es/m/15520-4c266f986998e1c5%40postgresql.org

During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged
using XLOG / FPI records, and thus (correctly) ignored in decoding.
But the associated TOAST table is WAL-logged as plain INSERT records,
and so was logically decoded and passed to reorder buffer.
That has severe consequences with TOAST tables of non-trivial size.
Firstly, reorder buffer has to keep all those changes, possibly spilling
them to a file, incurring I/O costs and disk space.
Secondly, ReoderBufferCommit() was stashing all those TOAST chunks into
a hash table, which got discarded only after processing the row from the
main heap. But as the main heap is not decoded for rewrites, this never
happened, so all the TOAST data accumulated in memory, resulting either
in excessive memory consumption or OOM.
The fix is simple, as commit e9edc1ba already introduced infrastructure
(namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST
tables, but it only applied it to system tables. So simply use it for
all TOAST data in raw_heap_insert().
That would however solve only the memory consumption issue - the TOAST
changes would still be decoded and added to the reorder buffer, and
spilled to disk (although without TOAST tuple data, so much smaller).
But we can solve that by tweaking DecodeInsert() to just ignore such
INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag,
instead of skipping them later in ReorderBufferCommit().
Review: Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com
Backpatch: 9.4-, where logical decoding was introduced

The function generated to perform JIT compiled tuple deforming failed
when HeapTupleHeader's t_hoff was bigger than a signed int8. I'd
failed to realize that LLVM's getelementptr would treat an int8 index
argument as signed, rather than unsigned. That means that a hoff
larger than 127 would result in a negative offset being applied. Fix
that by widening the index to 32bit.
Add a testcase with a wide table. Don't drop it, as it seems useful to
verify other tools deal properly with wide tables.
Thanks to Justin Pryzby for both reporting a bug and then reducing it
to a reproducible testcase!
Reported-By: Justin Pryzby
Author: Andres Freund
Discussion: https://postgr.es/m/20181115223959.GB10913@telsasoft.com
Backpatch: 11, just as jit compilation was

Unfortunately ac218aa4f6 missed the fact that a reference to
'pg_catalog.regnamespace'::regclass wouldn't work before that type is
known. Fix that, by replacing the regtype usage with a join to
pg_type.
Reported-By: Tom Lane
Author: Andres Freund
Discussion: https://postgr.es/m/8863.1543297423@sss.pgh.pa.us
Backpatch: 9.5-, like ac218aa4f6

When the regrole (0c90f6769) and regnamespace (cb9fa802b) types were
added in 9.5, pg_upgrade's check for reg* types wasn't updated. While
regrole currently is safe, regnamespace is not.
It seems unlikely that anybody uses regnamespace inside catalog tables
across a pg_upgrade, but the tests should be correct nevertheless.
While at it, reorder the types checked in the query to be
alphabetical. Otherwise it's annoying to compare existing and tested
for types.
Author: Andres Freund
Discussion: https://postgr.es/m/037e152a-cb25-3bcb-4f35-bdc9988f8204@2ndQuadrant.com
Backpatch: 9.5-, as regrole/regnamespace

latex_escaped_print() mistranslated \ and failed to provide any translation
for # ^ and ~, all of which would typically lead to LaTeX document syntax
errors. In addition it didn't translate < > and |, which would typically
render as unexpected characters.
To some extent this represents shortcomings in ancient versions of LaTeX,
which if memory serves had no easy way to render these control characters
as ASCII text. But that's been fixed for, um, decades. In any case there
is no value in emitting guaranteed-to-fail output for these characters.
Noted while fooling with test cases added by commit 9a98984f4. Back-patch
the code change to all supported versions.

Hstore data generated on pg 8.4 and pg_upgraded to current versions
remains in its original on-disk format unless modified. The same goes
for values generated by the addon hstore-new module on pre-9.0
versions. (The hstoreUpgrade function converts old values on the fly
when read in, but the on-disk value is not modified by this.)
Since old-format empty hstores (and hstore-new hstores) have
representations compatible with the new format, hstoreUpgrade thought
it could get away without modifying such values; but this breaks
hstore_hash (and the new hstore_hash_extended) which assumes
bit-perfect matching between semantically identical hstore values.
Only one bit actually differs (the "new version" flag in the count
field) but that of course is enough to break the hash.
Fix by making hstoreUpgrade unconditionally convert all old values to
new format.
Backpatch all the way, even though this changes a hash value in some
cases, because in those cases the hash value is already failing - for
example, a hash join between old- and new-format empty hstores will be
failing to match, or a hash index on an hstore column containing an
old-format empty value will be failing to find the value since it will
be searching for a hash derived from a new-format datum. (There are no
known field reports of this happening, probably because hashing of
hstores has only been useful in limited circumstances and there
probably isn't much upgraded data being used this way.)
Per concerns arising from discussion of commit eb6f29141be. Original
bug is my fault.
Discussion: https://postgr.es/m/60b1fd3b-7332-40f0-7e7f-f2f04f777747%402ndquadrant.com

ftoi4 and its sibling coercion functions did their overflow checks in
a way that looked superficially plausible, but actually depended on an
assumption that the MIN and MAX comparison constants can be represented
exactly in the float4 or float8 domain. That fails in ftoi4, ftoi8,
and dtoi8, resulting in a possibility that values near the MAX limit will
be wrongly converted (to negative values) when they need to be rejected.
Also, because we compared before rounding off the fractional part,
the other three functions threw errors for values that really ought
to get rounded to the min or max integer value.
Fix by doing rint() first (requiring an assumption that it handles
NaN and Inf correctly; but dtoi8 and ftoi8 were assuming that already),
and by comparing to values that should coerce to float exactly, namely
INTxx_MIN and -INTxx_MIN. Also remove some random cosmetic discrepancies
between these six functions.
This back-patches commits cbdb8b4c0 and 452b637d4. In the 9.4 branch,
also back-patch the portion of 62e2a8dc2 that added PG_INTnn_MIN and
related constants to c.h, so that these functions can rely on them.
Per bug #15519 from Victor Petrovykh.
Patch by me; thanks to Andrew Gierth for analysis and discussion.
Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org

1. Integer overflow in internal_size could result in memory corruption
in decompression since a zero-length array would be allocated and then
written to. This leads to crashes or corruption when traversing an
index which has been populated with sufficiently sparse values. Fix by
using int64 for computations and checking for overflow.
2. Integer overflow in g_int_compress could cause pessimal merge
choices, resulting in unnecessarily large ranges (which would in turn
trigger issue 1 above). Fix by using int64 again.
3. Even without overflow, array sizes could become large enough to
cause unexplained memory allocation errors. Fix by capping the sizes
to a safe limit and report actual errors pointing at gist__intbig_ops
as needed.
4. Large inputs to the compression function always consist of large
runs of consecutive integers, and the compression loop was processing
these one at a time in an O(N^2) manner with a lot of overhead. The
expected runtime of this function could easily exceed 6 months for a
single call as a result. Fix by performing a linear-time first pass,
which reduces the worst case to something on the order of seconds.
Backpatch all the way, since this has been wrong forever.
Per bug #15518 from report from irc user "dymk", analysis and patch by
me.
Discussion: https://postgr.es/m/15518-799e426c3b4f8358@postgresql.org

The documentation of CREATE/ALTER ROLE has been missing two things
related to PASSWORD:
- The password value provided needs to be quoted, some places of the
documentation marked the field with quotes, but not others, which led to
confusion.
- PASSWORD NULL was not provided consistently, with ENCRYPTED being not
compatible with it.
Reported-by: Steven Winfield
Author: Michael Paquier
Reviewed-by: David G. Johnston
Discussion: https://postgr.es/m/154282901979.1316.7418475422120496802@wrigleys.postgresql.org

populate_recordset_worker() failed to consider the possibility that the
supplied JSON data contains no rows, so that update_cached_tupdesc never
got called. This led to a null-pointer dereference since commit 9a5e8ed28;
before that it led to a bogus "set-valued function called in context that
cannot accept a set" error. Fix by forcing the update to happen.
Per bug #15514. Back-patch to v11 as 9a5e8ed28 was. (If we were excited
about the bogus error, we could perhaps go back further, but it'd take more
work to figure out how to fix it in older branches. Given the lack of
field complaints about that aspect, I'm not excited.)
Discussion: https://postgr.es/m/15514-59d5b4c4065b178b@postgresql.org

Documenting INCLUDE in the section about unique indexes is confusing,
as complained of by Emilio Platzer. Furthermore, it entirely failed
to explain why you might want to use the feature. The section about
index-only scans is really the right place; it already talked about
making such things the hard way. Rewrite that text to describe INCLUDE
as the normal way to make a covering index.
Also, move that section up a couple of places, as it now seems more
important than some of the stuff we had before it. It still has to
be after expression and partial indexes, since otherwise some of it
would involve forward references.
Discussion: https://postgr.es/m/154031939560.30897.14677735588262722042@wrigleys.postgresql.org

Per POSIX, WIFSIGNALED and related macros are provided by <sys/wait.h>.
Apparently on Linux they're also pulled in by some other inclusions,
but BSD-ish systems are pickier. Fixes portability issue in ffa4cbd62.
Per buildfarm.

Previously, any program launched by COPY TO/FROM PROGRAM inherited the
server's setting of SIGPIPE handling, i.e. SIG_IGN. Hence, if we were
doing COPY FROM PROGRAM and closed the pipe early, the child process
would see EPIPE on its output file and typically would treat that as
a fatal error, in turn causing the COPY to report error. Similarly,
one could get a failure report from a query that didn't read all of
the output from a contrib/file_fdw foreign table that uses file_fdw's
PROGRAM option.
To fix, ensure that child programs inherit SIG_DFL not SIG_IGN processing
of SIGPIPE. This seems like an all-around better situation since if
the called program wants some non-default treatment of SIGPIPE, it would
expect to have to set that up for itself. Then in COPY, if it's COPY
FROM PROGRAM and we stop reading short of detecting EOF, treat a SIGPIPE
exit from the called program as a non-error condition. This still allows
us to report an error for any case where the called program gets SIGPIPE
on some other file descriptor.
As coded, we won't report a SIGPIPE if we stop reading as a result of
seeing an in-band EOF marker (e.g. COPY BINARY EOF marker). It's
somewhat debatable whether we should complain if the called program
continues to transmit data after an EOF marker. However, it seems like
we should avoid throwing error in any questionable cases, especially in a
back-patched fix, and anyway it would take additional code to make such
an error get reported consistently.
Back-patch to v10. We could go further back, since COPY FROM PROGRAM
has been around awhile, but AFAICS the only way to reach this situation
using core or contrib is via file_fdw, which has only supported PROGRAM
sources since v10. The COPY statement per se has no feature whereby
it'd stop reading without having hit EOF or an error already. Therefore,
I don't see any upside to back-patching further that'd outweigh the
risk of complaints about behavioral change.
Per bug #15449 from Eric Cyr.
Patch by me, review by Etsuro Fujita and Kyotaro Horiguchi
Discussion: https://postgr.es/m/15449-1cf737dd5929450e@postgresql.org

Calling AC_CHECK_DECLS before we've finished setting up the compiler's
CFLAGS seems like a pretty risky proposition, especially now that the
first use of that macro will result in a test to see whether the compiler
gives warning or error for undeclared built-in functions. That answer
could very easily get changed later than where PGAC_LLVM_SUPPORT is
called; furthermore, it's hardly unlikely that flags such as -D_GNU_SOURCE
could change visibility of declarations. Hence, be a little less cavalier
about where to do LLVM-related tests. This results in v11 and HEAD doing
the warning-or-error check at the same place in the script as older
branches are doing it, which seems like a good thing.
Per further thought about commits 0b59b0e8b and 16fbac39f.

The test case that Autoconf uses to discover whether a function has
been declared doesn't work reliably with clang, because clang reports
a warning not an error if the name is a known built-in function.
On some platforms, this results in a lot of compile-time warnings about
strlcpy and related functions not having been declared.
There is a fix for this (by Noah Misch) in the upstream Autoconf sources,
but since they've not made a release in years and show no indication of
doing so anytime soon, let's just absorb their fix directly. We can
revert this when and if we update to a newer Autoconf release.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/26819.1542515567@sss.pgh.pa.us

This didn't actually work: COPY would fail to flush the right files, and
instead would try to flush a non-existing file, causing the whole
transaction to fail.
Cope by raising an error as soon as the command is sent instead, to
avoid a nasty later surprise. Of course, it would be much better to
make it work, but we don't have a patch for that yet, and we don't know
if we'll want to backpatch one when we do.
Reported-by: Tomas Vondra
Author: David Rowley
Reviewed-by: Amit Langote, Steve Singer, Tomas Vondra

On some operating systems, it doesn't make sense to retry fsync(),
because dirty data cached by the kernel may have been dropped on
write-back failure. In that case the only remaining copy of the
data is in the WAL. A subsequent fsync() could appear to succeed,
but not have flushed the data. That means that a future checkpoint
could apparently complete successfully but have lost data.
Therefore, violently prevent any future checkpoint attempts by
panicking on the first fsync() failure. Note that we already
did the same for WAL data; this change extends that behavior to
non-temporary data files.
Provide a GUC data_sync_retry to control this new behavior, for
users of operating systems that don't eject dirty data, and possibly
forensic/testing uses. If it is set to on and the write-back error
was transient, a later checkpoint might genuinely succeed (on a
system that does not throw away buffers on failure); if the error is
permanent, later checkpoints will continue to fail. The GUC defaults
to off, meaning that we panic.
Back-patch to all supported releases.
There is still a narrow window for error-loss on some operating
systems: if the file is closed and later reopened and a write-back
error occurs in the intervening time, but the inode has the bad
luck to be evicted due to memory pressure before we reopen, we could
miss the error. A later patch will address that with a scheme
for keeping files with dirty data open at all times, but we judge
that to be too complicated to back-patch.
Author: Craig Ringer, with some adjustments by Thomas Munro
Reported-by: Craig Ringer
Reviewed-by: Robert Haas, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de

Any Autoconf macro that uses AC_REQUIRES -- directly or indirectly --
must not be inside a plain shell "if" test; if it is, whatever code
gets pulled in by the AC_REQUIRES will also be inside that "if".
Instead of "if" we can use AS_IF, which knows how to get this right
(cf commit 01051a987).
The only immediate problem from getting this wrong was that AC_PROG_AWK
had to be run twice, once inside the "if llvm" block and once in the
main line. However, it broke a different patch I'm about to submit
more thoroughly.

wcsrtombs (called through wchar2char from common functions like lower,
upper, etc.) uses various optimizations that may look like access to
uninitialized data, triggering valgrind reports.
For example AVX2 instructions load data in 256-bit chunks, and gconv
does something similar with 32-bit chunks. This is faster than accessing
the bytes one by one, and the uninitialized part of the buffer is not
actually used. So suppress the bogus reports.
The exact stack depends on possible optimizations - it might be AVX, SSE
(as in the report by Aleksander Alekseev) or something else. Hence the
last frame is wildcarded, to deal with this.
Backpatch all the way back to 9.4.
Author: Tomas Vondra
Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com
Discussion: https://www.postgresql.org/message-id/20180220150838.GD18315@e733.localdomain

The comments above the PartitionedRelPruneInfo struct incorrectly
document how subplan_map and subpart_map are indexed. This seems to
have snuck in on 4e232364033.
Author: David Rowley <david.rowley@2ndquadrant.com>

The reformat_dat_file.pl script, added by 372728b0d49552641, supported
all the necessary options to make it work in a VPATH build, but the
makefile invocations didn't take VPATH into account. Fix that.
Discussion: https://postgr.es/m/20181115185303.d2z7wonx23mdfvd3@alap3.anarazel.de
Backpatch: 11-, where 372728b0d49552641 was merged

Commit 5373bc2a08 has added type for background workers but forgot to
update at one place in the documentation.
Reported-by: John Naylor
Author: John Naylor
Reviewed-by: Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/CAJVSVGVmvgJ8Lq4WBxC3zV5wf0txdCqRSgkWVP+jaBF=HgWscA@mail.gmail.com

BufFileSize() can't use off_t, because it's only 32 bits wide on
some systems. BufFile objects can have many 1GB segments so the
total size can exceed 2^31. The only known client of the function
is parallel CREATE INDEX, which was reported to fail when building
large indexes on Windows.
Though this is technically an ABI break on platforms with a 32 bit
off_t and we might normally avoid back-patching it, the function is
brand new and thus unlikely to have been discovered by extension
authors yet, and it's fairly thoroughly broken on those platforms
anyway, so just fix it.
Defect in 9da0cc35. Bug #15460. Back-patch to 11, where this
function landed.
Author: Thomas Munro
Reported-by: Paul van der Linden, Pavel Oskin
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/15460-b6db80de822fa0ad%40postgresql.org
Discussion: https://postgr.es/m/CAHDGBJP_GsESbTt4P3FZA8kMUKuYxjg57XHF7NRBoKnR%3DCAR-g%40mail.gmail.com

This hasn't been correct since 9.3 added "latex-longtable".
I left the phraseology "Unique abbreviations are allowed" alone.
It's correct as far as it goes, and we are studiously refraining
from specifying exactly what happens if you give a non-unique
abbreviation. (The answer in the back branches is "you get a
backwards-compatible choice", and the answer in HEAD will shortly
be "you get an error", but there seems no need to mention such
details here.)
Daniel Vérité
Discussion: https://postgr.es/m/cb7e1caf-3ea6-450d-af28-f524903a030c@manitou-mail.org

In commit ecfd55795, I removed sqlda.c's checks for ndigits != 0 on the
grounds that we should duplicate the state of the numeric value's digit
buffer even when all the digits are zeroes. However, that still isn't
quite right, because another possible state of the digit buffer is
buf == digits == NULL (this occurs for a NaN). As the code now stands,
it'll invoke memcpy with a NULL source address and zero bytecount,
which we know a few platforms crash on. Hence, reinstate the no-copy
short-circuit, but make it test specifically for buf != NULL rather than
some other condition. In hindsight, the ndigits test (added by commit
f2ae9f9c3) was almost certainly meant to fix the NaN case not the
all-zeroes case as the associated thread alleged.
As before, back-patch to all supported versions.
Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24

If a failure happens when a transaction is starting between the moment
the transaction status is changed from TRANS_DEFAULT to TRANS_START and
the moment the current user ID and security context flags are fetched
via GetUserIdAndSecContext(), or before initializing its basic fields,
then those may get reset to incorrect values when the transaction
aborts, leaving the session in an inconsistent state.
One problem reported is that failing a starting transaction at the first
query of a session could cause several kinds of system crashes on the
follow-up queries.
In order to solve that, move the initialization of the transaction state
fields and the call of GetUserIdAndSecContext() in charge of fetching
the current user ID close to the point where the transaction status is
switched to TRANS_START, where there cannot be any error triggered
in-between, per an idea of Tom Lane. This properly ensures that the
current user ID, the security context flags and that the basic fields of
TransactionState remain consistent even if the transaction fails while
starting.
Reported-by: Richard Guo
Diagnosed-By: Richard Guo
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAN_9JTxECSb=pEPcb0a8d+6J+bDcOZ4=DgRo_B7Y5gRHJUM=Rw@mail.gmail.com
Backpatch-through: 9.4

Numeric values with leading zeroes were incorrectly copied into a
SQLDA (SQL Descriptor Area), leading to wrong results in ECPG programs.
Report and patch by Daisuke Higuchi. Back-patch to all supported
versions.
Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24

A table with OIDs that was the first in the dump output would not get
dumped with OIDs enabled. Fix that.
The reason was that the currWithOids flag was declared to be bool but
actually also takes a -1 value for "don't know yet". But under
stdbool.h semantics, that is coerced to true, so the required SET
default_with_oids command is not output again. Change the variable
type to char to fix that.
Reported-by: Derek Nelson <derek@pipelinedb.com>

Commits 0e141c0fbb and baaf272ac9 introduced initialization of atomic
variables in InitProcess which means that it's not safe to look at those
for backends that aren't currently in use. Fix that by initializing them
during postmaster startup.
Reported-by: Andres Freund
Author: Amit Kapila
Backpatch-through: 9.6
Discussion: https://postgr.es/m/20181027104138.qmbbelopvy7cw2qv@alap3.anarazel.de

Coverty reports a possible buffer overrun in the code that populates the
pg_hba_file_rules view. It may not be a live bug due to restrictions
on options that can be used together, but let's increase MAX_HBA_OPTIONS
and correct a nearby misleading comment.
Back-patch to 10 where this code arrived.
Reported-by: Julian Hsiao
Discussion: https://postgr.es/m/CADnGQpzbkWdKS2YHNifwAvX5VEsJ5gW49U4o-7UL5pzyTv4vTg%40mail.gmail.com

classify_index_clause_usage() is O(N^2) in the number of distinct index
qual clauses it considers, because of its use of a simple search list to
store them. For nearly all queries, that's fine because only a few clauses
will be considered. But Alexander Kuzmenkov reported a machine-generated
query with 80000 (!) index qual clauses, which caused this code to take
forever. Somewhat remarkably, this is the only O(N^2) behavior we now
have for such a query, so let's fix it.
We can get rid of the O(N^2) runtime for cases like this without much
damage to the functionality of choose_bitmap_and() by separating out
paths with "too many" qual or pred clauses, and deeming them to always
be nonredundant with other paths. Then their clauses needn't go into
the search list, so it doesn't get too long, but we don't lose the
ability to consider bitmap AND plans altogether. I set the threshold
for "too many" to be 100 clauses per path, which should be plenty to
ensure no change in planning behavior for normal queries.
There are other things we could do to make this go faster, but it's not
clear that it's worth any additional effort. 80000 qual clauses require
a whole lot of work in many other places, too.
The code's been like this for a long time, so back-patch to all supported
branches. The troublesome query only works back to 9.5 (in 9.4 it fails
with stack overflow in the parser); so I'm not sure that fixing this in
9.4 has any real-world benefit, but perhaps it does.
Discussion: https://postgr.es/m/90c5bdfa-d633-dabe-9889-3cf3e1acd443@postgrespro.ru

Before this change the docs weren't adapted to the fact that
wal_segment_size is now measured in bytes, rather than multiples of
wal_block_size.
Author: David Steele
Discussion: https://postgr.es/m/68ea97d6-2ed9-f339-e57d-ab3a33caf3b1@pgmasters.net
Backpatch: 11-, like fc49e24fa69 itself.

Commit 15c729347 was a couple bricks shy of a load: we need to
ensure that expr->plan gets reset to NULL on any error exit,
if it's not supposed to be saved. Also ensure that the
stmt->target calculation gets redone if needed.
The easy way to exhibit a problem is to set up code that
violates the writable-argument restriction and then execute
it twice. But error exits out of, eg, setup_param_list()
could also break it. Make the existing PG_TRY block cover
all of that code to be sure.
Per report from Pavel Stehule.
Discussion: https://postgr.es/m/CAFj8pRAeXNTO43W2Y0Cn0YOVFPv1WpYyOqQrrzUiN6s=dn7gCg@mail.gmail.com

This patch fixes several related cases in which pg_shdepend entries were
never made, or were lost, for references to roles appearing in the ACLs of
schemas and/or types. While that did no immediate harm, if a referenced
role were later dropped, the drop would be allowed and would leave a
dangling reference in the object's ACL. That still wasn't a big problem
for normal database usage, but it would cause obscure failures in
subsequent dump/reload or pg_upgrade attempts, taking the form of
attempts to grant privileges to all-numeric role names. (I think I've
seen field reports matching that symptom, but can't find any right now.)
Several cases are fixed here:
1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any
existing ACL entries for the domain. This case is ancient, dating
back as far as we've had pg_shdepend tracking at all.
2. If a default type privilege applies, CREATE TYPE recorded the
ACL properly but forgot to install dependency entries for it.
This dates to the addition of default privileges for types in 9.2.
3. If a default schema privilege applies, CREATE SCHEMA recorded the
ACL properly but forgot to install dependency entries for it.
This dates to the addition of default privileges for schemas in v10
(commit ab89e465c).
Another somewhat-related problem is that when creating a relation
rowtype or implicit array type, TypeCreate would apply any available
default type privileges to that type, which we don't really want
since such an object isn't supposed to have privileges of its own.
(You can't, for example, drop such privileges once they've been added
to an array type.)
ab89e465c is also to blame for a race condition in the regression tests:
privileges.sql transiently installed globally-applicable default
privileges on schemas, which sometimes got absorbed into the ACLs of
schemas created by concurrent test scripts. This should have resulted
in failures when privileges.sql tried to drop the role holding such
privileges; but thanks to the bug fixed here, it instead led to dangling
ACLs in the final state of the regression database. We'd managed not to
notice that, but it became obvious in the wake of commit da906766c, which
allowed the race condition to occur in pg_upgrade tests.
To fix, add a function recordDependencyOnNewAcl to encapsulate what
callers of get_user_default_acl need to do; while the original call
sites got that right via ad-hoc code, none of the later-added ones
have. Also change GenerateTypeDependencies to generate these
dependencies, which requires adding the typacl to its parameter list.
(That might be annoying if there are any extensions calling that
function directly; but if there are, they're most likely buggy in the
same way as the core callers were, so they need work anyway.) While
I was at it, I changed GenerateTypeDependencies to accept most of its
parameters in the form of a Form_pg_type pointer, making its parameter
list a bit less unwieldy and mistake-prone.
The test race condition is fixed just by wrapping the addition and
removal of default privileges into a single transaction, so that that
state is never visible externally. We might eventually prefer to
separate out tests of default privileges into a script that runs by
itself, but that would be a bigger change and would make the tests
run slower overall.
Back-patch relevant parts to all supported branches.
Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us

This commit fixes a set of issues with ON COMMIT actions when used on
partitioned tables and tables with inheritance children:
- Applying ON COMMIT DROP on a partitioned table with partitions or on a
table with inheritance children caused a failure at commit time, with
complains about the children being already dropped as all relations are
dropped one at the same time.
- Applying ON COMMIT DELETE on a partition relying on a partitioned
table which uses ON COMMIT DROP would cause the partition truncation to
fail as the parent is removed first.
The solution to the first problem is to handle the removal of all the
dependencies in one go instead of dropping relations one-by-one, based
on a suggestion from Álvaro Herrera. So instead all the relation OIDs
to remove are gathered and then processed in one round of multiple
deletions.
The solution to the second problem is to reorder the actions, with
truncation happening first and relation drop done after. Even if it
means that a partition could be first truncated, then immediately
dropped if its partitioned table is dropped, this has the merit to keep
the code simple as there is no need to do existence checks on the
relations to drop.
Contrary to a manual TRUNCATE on a partitioned table, ON COMMIT DELETE
does not cascade to its partitions. The ON COMMIT action defined on
each partition gets the priority.
Author: Michael Paquier
Reviewed-by: Amit Langote, Álvaro Herrera, Robert Haas
Discussion: https://postgr.es/m/68f17907-ec98-1192-f99f-8011400517f5@lab.ntt.co.jp
Backpatch-through: 10

Previously it was possible to set client_min_messages to FATAL or PANIC,
which had the effect of suppressing transmission of regular ERROR messages
to the client. Perhaps that seemed like a useful option in the past, but
the trouble with it is that it breaks guarantees that are explicitly made
in our FE/BE protocol spec about how a query cycle can end. While libpq
and psql manage to cope with the omission, that's mostly because they
are not very bright; client libraries that have more semantic knowledge
are likely to get confused. Notably, pgODBC doesn't behave very sanely.
Let's fix this by getting rid of the ability to set client_min_messages
above ERROR.
In HEAD, just remove the FATAL and PANIC options from the set of allowed
enum values for client_min_messages. (This change also affects
trace_recovery_messages, but that's OK since these aren't useful values
for that variable either.)
In the back branches, there was concern that rejecting these values might
break applications that are explicitly setting things that way. I'm
pretty skeptical of that argument, but accommodate it by accepting these
values and then internally setting the variable to ERROR anyway.
In all branches, this allows a couple of tiny simplifications in the
logic in elog.c, so do that.
Also respond to the point that was made that client_min_messages has
exactly nothing to do with the server's logging behavior, and therefore
does not belong in the "When To Log" subsection of the documentation.
The "Statement Behavior" subsection is a better match, so move it there.
Jonah Harris and Tom Lane
Discussion: https://postgr.es/m/7809.1541521180@sss.pgh.pa.us
Discussion: https://postgr.es/m/15479-ef0f4cc2fd995ca2@postgresql.org

The original code to propagate NOT NULL and default expressions
specified when creating a partition was mostly copy-pasted from
typed-tables creation, but not being a great match it contained some
duplicity, inefficiency and bugs.
This commit fixes the bug that NOT NULL constraints declared in the
parent table would not be honored in the partition. One reported issue
that is not fixed is that a DEFAULT declared in the child is not used
when inserting through the parent. That would amount to a behavioral
change that's better not back-patched.
This rewrite makes the code simpler:
1. instead of checking for duplicate column names in its own block,
reuse the original one that already did that;
2. instead of concatenating the list of columns from parent and the one
declared in the partition and scanning the result to (incorrectly)
propagate defaults and not-null constraints, just scan the latter
searching the former for a match, and merging sensibly. This works
because we know the list in the parent is already correct and there can
only be one parent.
This rewrite makes ColumnDef->is_from_parent unused, so it's removed
on branch master; on released branches, it's kept as an unused field in
order not to cause ABI incompatibilities.
This commit also adds a test case for creating partitions with
collations mismatching that on the parent table, something that is
closely related to the code being patched. No code change is introduced
though, since that'd be a behavior change that could break some (broken)
working applications.
Amit Langote wrote a less invasive fix for the original
NOT NULL/defaults bug, but while I kept the tests he added, I ended up
not using his original code. Ashutosh Bapat reviewed Amit's fix. Amit
reviewed mine.
Author: Álvaro Herrera, Amit Langote
Reviewed-by: Ashutosh Bapat, Amit Langote
Reported-by: Jürgen Strobel (bug #15212)
Discussion: https://postgr.es/m/152746742177.1291.9847032632907407358@wrigleys.postgresql.org