lo_put() surely should require UPDATE permission, the same as lowrite(),
but it failed to check for that, as reported by Chapman Flack. Oversight
in commit c50b7c09d; backpatch to 9.4 where that was introduced.
Tom Lane and Michael Paquier
Security: CVE-2017-7548

Commit 3eefc51053f250837c3115c12f8119d16881a2d7 claimed to make
pg_user_mappings enforce the qualifications user_mapping_options had
been enforcing, but its removal of a longstanding restriction left them
distinct when the current user is the subject of a mapping yet has no
server privileges. user_mapping_options emits no rows for such a
mapping, but pg_user_mappings includes full umoptions. Change
pg_user_mappings to show null for umoptions. Back-patch to 9.2, like
the above commit.
Reviewed by Tom Lane. Reported by Jeff Janes.
Security: CVE-2017-7547

Some authentication methods allowed it, others did not. In the client-side,
libpq does not even try to authenticate with an empty password, which makes
using empty passwords hazardous: an administrator might think that an
account with an empty password cannot be used to log in, because psql
doesn't allow it, and not realize that a different client would in fact
allow it. To clear that confusion and to be be consistent, disallow empty
passwords in all authentication methods.
All the authentication methods that used plaintext authentication over the
wire, except for BSD authentication, already checked that the password
received from the user was not empty. To avoid forgetting it in the future
again, move the check to the recv_password_packet function. That only
forbids using an empty password with plaintext authentication, however.
MD5 and SCRAM need a different fix:
* In stable branches, check that the MD5 hash stored for the user does not
not correspond to an empty string. This adds some overhead to MD5
authentication, because the server needs to compute an extra MD5 hash, but
it is not noticeable in practice.
* In HEAD, modify CREATE and ALTER ROLE to clear the password if an empty
string, or a password hash that corresponds to an empty string, is
specified. The user-visible behavior is the same as in the stable branches,
the user cannot log in, but it seems better to stop the empty password from
entering the system in the first place. Secondly, it is fairly expensive to
check that a SCRAM hash doesn't correspond to an empty string, because
computing a SCRAM hash is much more expensive than an MD5 hash by design,
so better avoid doing that on every authentication.
We could clear the password on CREATE/ALTER ROLE also in stable branches,
but we would still need to check at authentication time, because even if we
prevent empty passwords from being stored in pg_authid, there might be
existing ones there already.
Reported by Jeroen van der Ham, Ben de Graaff and Jelte Fennema.
Security: CVE-2017-7546

The callers for GetOldestSafeDecodingTransactionId() all inverted the
argument for the argument introduced in 2bef06d516460. Luckily this
appears to be inconsequential for the moment, as we wait for
concurrent in-progress transaction when assembling a
snapshot. Additionally this could only make a difference when adding a
second logical slot, because only a pre-existing slot could cause an
issue by lowering the returned xid dangerously much.
Reported-By: Antonin Houska
Discussion: https://postgr.es/m/32704.1496993134@localhost
Backport: 9.4-, where 2bef06d516460 was backpatched to.

We don't actually support session tickets, since we do not create an SSL
session identifier. But it seems that OpenSSL will issue a session ticket
on-demand anyway, which will then fail when used. This results in
reconnection failures when using ticket-aware client-side SSL libraries
(such as the Npgsql .NET driver), as reported by Shay Rojansky.
To fix, just tell OpenSSL not to issue tickets. At some point in the
far future, we might consider enabling tickets instead. But the security
implications of that aren't entirely clear; and besides it would have
little benefit except for very short-lived database connections, which is
Something We're Bad At anyhow. It would take a lot of other work to get
to a point where that would really be an exciting thing to do.
While at it, also tell OpenSSL not to use a session cache. This doesn't
really do anything, since a backend would never populate the cache anyway,
but it might gain some micro-efficiencies and/or reduce security
exposures.
Patch by me, per discussion with Heikki Linnakangas and Shay Rojansky.
Back-patch to all supported versions.
Discussion: https://postgr.es/m/CADT4RqBU8N-csyZuzaook-c795dt22Zcwg1aHWB6tfVdAkodZA@mail.gmail.com

Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end,
materialized view refreshes were occurring while the permissions on
referenced objects were still at defaults. This led to failures if,
say, an MV owned by user A reads from a table owned by user B, even
if B had granted the necessary privileges to A. We've had multiple
complaints about that type of restore failure, most recently from
Jordan Gigov.
The ideal fix for this would be to start treating ACLs as dependency-
sortable objects, rather than hard-wiring anything about their dump order
(the existing approach is a messy kluge dating to commit dc0e76ca3).
But that's going to be a rather major change, and it certainly wouldn't
lead to a back-patchable fix. As a short-term solution, convert the
existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack,
ie, normal objects then ACLs then matview refreshes. Because this happens
in RestoreArchive(), it will also fix the problem when restoring from an
existing archive-format dump.
(Note this means that if a matview refresh would have failed under the
permissions prevailing at dump time, it'll fail during restore as well.
We'll define that as user error rather than something we should try
to work around.)
To avoid performance loss in parallel restore, we need the matview
refreshes to still be parallelizable. Hence, clean things up enough
so that both ACLs and matviews are handled by the parallel restore
infrastructure, instead of reverting back to serial restore for ACLs.
There is still a final serial step, but it shouldn't normally have to
do anything; it's only there to try to recover if we get stuck due to
some problem like unresolved circular dependencies.
Patch by me, but it owes something to an earlier attempt by Kevin Grittner.
Back-patch to 9.3 where materialized views were introduced.
Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us

Commit 4d57e8381677 added support for getting I/O errors out of zlib,
but it introduced a portability problem for systems without zlib.
Repair by wrapping the zlib call inside #ifdef and restore the original
code in the other branch.
This serves to illustrate the inadequacy of the zlib abstraction in
pg_backup_archiver: there is no way to call gzerror() in that
abstraction. This means that the several places that call GZREAD and
GZWRITE are currently doing error reporting wrongly, but ENOTIME to get
it fixed before next week's release set.
Backpatch to 9.4, like the commit that introduced the problem.

For a table, the constraint can be considered validated immediately,
because the table must be empty. But for a foreign table this is
not necessarily the case.
Fixes a bug in commit f27a6b15e6566fba7748d0d9a3fc5bcfd52c4a1b.
Amit Langote, with some changes by me.
Discussion: http://postgr.es/m/d2b7419f-4a71-cf86-cc99-bfd0f359a1ea@lab.ntt.co.jp

Some error reports were reporting strerror(errno), which for some error
conditions coming from zlib are wrong, resulting in confusing reports
such as
pg_restore: [compress_io] could not read from input file: Success
which makes no sense. To correctly extract the error message we need to
use gzerror(), so let's do that.
This isn't as comprehensive or as neat as I would like, but at least it
should improve things in many common cases. The zlib abstraction in
compress_io does not seem to be applied consistently enough; we could
perhaps improve that, but it seems master-only material, not a bug fix
for back-patching.
This problem goes back all the way, but I decided to apply back to 9.4
only, because older branches don't contain commit 14ea89366 which this
change depends on.
Authors: Vladimir Kunschikov, Álvaro Herrera
Discussion: https://postgr.es/m/1498120508308.9826@infotecs.ru

FlightAware is still maintaining this, and indeed is seemingly being
more active with it than the pgtclng fork is. List both, for the
time being anyway.
In the back branches, also back-port commit e20f679f6 and other
recent updates to the client-interfaces list. I think these are
probably of current interest to users of back branches. I did
not touch the list of externally maintained PLs in the back
branches, though. Those are much more likely to be server version
sensitive, and I don't know which of these PLs work all the way back.
Discussion: https://postgr.es/m/20170730162612.1449.58796@wrigleys.postgresql.org

init_htab(), with #define HASH_DEBUG, prints a bunch of hashtable
parameters. It used to also print nentries, but commit 44ca4022f changed
that to "hash_get_num_entries(hctl)", which is wrong (the parameter should
be "hashp").
Rather than correct the coding, though, let's just remove that field from
the printout. The table must be empty, since we just finished building
it, so expensively calculating the number of entries is rather pointless.
Moreover hash_get_num_entries makes assumptions (about not needing locks)
which we could do without in debugging code.
Noted by Choi Doo-Won in bug #14764. Back-patch to 9.6 where the
faulty code was introduced.
Discussion: https://postgr.es/m/20170802032353.8424.12274@wrigleys.postgresql.org

XLByteToSeg and XLByteToPrevSeg calculate only a segment number. The
definition of these macros were modified by commit
dfda6ebaec6763090fb78b458a979b558c50b39b but the comment remain
unchanged.
Patch by Yugo Nagata. Back patched to 9.3 and beyond.

Previously the docs just said "5.8 or later". Experimentation shows
that while you can build on Unix from a git checkout with 5.8.0,
compiling recent PL/Perl requires at least 5.8.1, and you won't be
able to run the TAP tests with less than 5.8.3 because that's when
they added "prove". (I do not have any information on just what the
MSVC build scripts require.)
Since all these versions are quite ancient, let's not split hairs
in the docs, but just say that 5.8.3 is the minimum requirement.
Discussion: https://postgr.es/m/16894.1501392088@sss.pgh.pa.us

Back-patch of commit 3c163a7fc76debbbdad1bdd3c43721cffe72f4db,
which see for more info.
Also throw in commit b4cc35fbb709bd6fcae8998f041fd731c9acbf42,
so Coverity doesn't whine about the back branches.
Ashutosh Sharma, some adjustments by me
Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com

Back-patch of commit bebe174bb4462ef079a1d7eeafb82ff969f160a4,
which see for more info.
Patch by me, with some help from Ashutosh Sharma
Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com

After typing CREATE USER M..., it would not fill in MAPPING FOR,
even though that was clearly intended behavior.
Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1wo2iQ6jWnN=egqOb5NxEPn0PpANEtKHr3uPooQ+nYPtw@mail.gmail.com

Fix assorted places that had not bothered with the convention of
prefixing catalog and function names with "pg_catalog.". That
could possibly result in query failure when running with a nondefault
search_path. Also fix two places that weren't quoting OID literals.
I think the latter hasn't mattered much since about 7.3, but it's still
a bad idea to be doing it in 99 places and not in 2 others.
Also remove a useless EXISTS sub-select that someone had stuck into
describeOneTableDetails' queries for child tables. We just got the OID
out of pg_class, so I hardly see how checking that it exists in pg_class
was doing anything helpful.
In passing, try to improve the emitted formatting of a couple of
these queries, though I didn't work really hard on that. And merge
unnecessarily duplicative coding in some other places.
Much of this was new in HEAD, but some was quite old; back-patch
as appropriate.

If several sessions are concurrently locking a tuple update chain with
nonconflicting lock modes using an old snapshot, and they all succeed,
it may happen that some of them fail because of restarting the loop (due
to a concurrent Xmax change) and getting an error in the subsequent pass
while trying to obtain a tuple lock that they already have in some tuple
version.
This can only happen with very high concurrency (where a row is being
both updated and FK-checked by multiple transactions concurrently), but
it's been observed in the field and can have unpleasant consequences
such as an FK check failing to see a tuple that definitely exists:
ERROR: insert or update on table "child_table" violates foreign key constraint "fk_constraint_name"
DETAIL: Key (keyid)=(123456) is not present in table "parent_table".
(where the key is observably present in the table).
Discussion: https://postgr.es/m/20170714210011.r25mrff4nxjhmf3g@alvherre.pgsql

This module becomes much more useful if we allow it to be used as base
class for external projects. To achieve this, change the exported
get_new_node function into a class method instead, and use the standard
Perl idiom of accepting the class as first argument. This method works
as expected for subclasses. The standalone function is kept for
backwards compatibility, though it could be removed in pg11.
Author: Chap Flackman, based on an earlier patch from Craig Ringer
Discussion: https://postgr.es/m/CAMsr+YF8kO+4+K-_U4PtN==2FndJ+5Bn6A19XHhMiBykEwv0wA@mail.gmail.com

Trading a little too heavily on letting the code path be the same whether
we were creating shared data structures or only attaching to them,
InitPredicateLocks() inserted the "scratch" PredicateLockTargetHash entry
unconditionally. This is just wrong if we're in a postmaster child,
which would only reach this code in EXEC_BACKEND builds. Most of the
time, the hash_search(HASH_ENTER) call would simply report that the
entry already existed, causing no visible effect since the code did not
bother to check for that possibility. However, if this happened while
some other backend had transiently removed the "scratch" entry, then
that other backend's eventual RestoreScratchTarget would suffer an
assert failure; this appears to be the explanation for a recent failure
on buildfarm member culicidae. In non-assert builds, there would be
no visible consequences there either. But nonetheless this is a pretty
bad bug for EXEC_BACKEND builds, for two reasons:
1. Each new backend would perform the hash_search(HASH_ENTER) call
without holding any lock that would prevent concurrent access to the
PredicateLockTargetHash hash table. This creates a low but certainly
nonzero risk of corruption of that hash table.
2. In the event that the race condition occurred, by reinserting the
scratch entry too soon, we were defeating the entire purpose of the
scratch entry, namely to guarantee that transaction commit could move
hash table entries around with no risk of out-of-memory failure.
The odds of an actual OOM failure are quite low, but not zero, and if
it did happen it would again result in corruption of the hash table.
The user-visible symptoms of such corruption are a little hard to predict,
but would presumably amount to misbehavior of SERIALIZABLE transactions
that'd require a crash or postmaster restart to fix.
To fix, just skip the hash insertion if IsUnderPostmaster. I also
inserted a bunch of assertions that the expected things happen
depending on whether IsUnderPostmaster is true. That might be overkill,
since most comparable code in other functions isn't quite that paranoid,
but once burnt twice shy.
In passing, also move a couple of lines to places where they seemed
to make more sense.
Diagnosis of problem by Thomas Munro, patch by me. Back-patch to
all supported branches.
Discussion: https://postgr.es/m/10593.1500670709@sss.pgh.pa.us

If the user modifies a view that has CHECK OPTIONs and this gets
translated into a modification to an underlying relation which happens
to be a foreign table, the check options should be enforced. In the
normal code path, that was happening properly, but it was not working
properly for "direct" modification because the whole operation gets
pushed to the remote side in that case and we never have an option to
enforce the constraint against individual tuples. Fix by disabling
direct modification when there is a need to enforce CHECK OPTIONs.
Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me.
Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp

Various cases involving renaming of view columns are handled by having
make_viewdef pass down the view's current relation tupledesc to
get_query_def, which then takes care to use the column names from the
tupledesc for the output column names of the SELECT. For some reason
though, we'd missed teaching make_ruledef to do similarly when it is
printing an ON SELECT rule, even though this is exactly the same case.
The results from pg_get_ruledef would then be different and arguably wrong.
In particular, this breaks pre-v10 versions of pg_dump, which in some
situations would define views by means of emitting a CREATE RULE ... ON
SELECT command. Third-party tools might not be happy either.
In passing, clean up some crufty code in make_viewdef; we'd apparently
modernized the equivalent code in make_ruledef somewhere along the way,
and missed this copy.
Per report from Gilles Darold. Back-patch to all supported versions.
Discussion: https://postgr.es/m/ec05659a-40ff-4510-fc45-ca9d965d0838@dalibo.com

pg_dump with the --clean option failed to emit DROP EVENT TRIGGER
commands for event triggers. In a closely related oversight,
it also did not emit ALTER OWNER commands for event triggers.
Since only superusers can create event triggers, the latter oversight
is of little practical consequence ... but if we're going to record
an owner for event triggers, then surely pg_dump should preserve it.
Per complaint from Greg Atkins. Back-patch to 9.3 where event triggers
were introduced.
Discussion: https://postgr.es/m/20170722191142.yi4e7tzcg3iacclg@gmail.com

When incrementally updating a file larger than 2GB, the old code could
either fail outright (if the client asked the server for bytes beyond
the 2GB boundary) or fail to copy all the blocks that had actually
been modified (if the server reported a file size to the client in
excess of 2GB), resulting in data corruption. Generally, such files
won't occur anyway, but they might if using a non-default segment size
or if there the directory contains stray files unrelated to
PostgreSQL. Fix by a more prudent choice of data types.
Even with these improvements, this code still uses a mix of different
types (off_t, size_t, uint64, int64) to represent file sizes and
offsets, not all of which necessarily have the same width or
signedness, so further cleanup might be in order here. However, at
least now they all have the potential to be 64 bits wide on 64-bit
platforms.
Kuntal Ghosh and Michael Paquier, with a tweak by me.
Discussion: http://postgr.es/m/CAGz5QC+8gbkz=Brp0TgoKNqHWTzonbPtPex80U0O6Uh_bevbaA@mail.gmail.com

The new test cases added in commit 8bf58c0d9 turn out to have output
that can vary depending on the lc_messages setting prevailing on the
test server. Hide the remote end's error messages to ensure stable
output. This isn't a terribly desirable solution; we'd rather know
that the connection failed for the expected reason and not some other
one. But there seems little choice for the moment.
Per buildfarm.
Discussion: https://postgr.es/m/18419.1500658570@sss.pgh.pa.us

As written, the code would only fail the sanity check if none of the
columns returned by the server were of the expected type, but we want
it to fail if even one column is not of the expected type.
Discussion: http://postgr.es/m/CA+TgmoYuY5zW7JEs+1hSS1D=V5K8h1SQuESrq=bMNeo0B71Sfw@mail.gmail.com

Previously, postgres_fdw would keep on using an existing connection even
if the user did ALTER SERVER or ALTER USER MAPPING commands that should
affect connection parameters. Teach it to watch for catcache invals
on these catalogs and re-establish connections when the relevant catalog
entries change. Per bug #14738 from Michal Lis.
In passing, clean up some rather crufty decisions in commit ae9bfc5d6
about where fields of ConnCacheEntry should be reset. We now reset
all the fields whenever we open a new connection.
Kyotaro Horiguchi, reviewed by Ashutosh Bapat and myself.
Back-patch to 9.3 where postgres_fdw appeared.
Discussion: https://postgr.es/m/20170710113917.7727.10247@wrigleys.postgresql.org

SLRU buffer lwlocks are allocated twice by oversight in commit
fe702a7b3f9f2bc5bf6d173166d7d55226af82c8 where that locks were moved to
separate tranche. The bug doesn't have user-visible effects except small
overspending of shared memory.
Backpatch to 9.6 where it was introduced.
Alexander Korotkov with small editorization by me.

Claiming that NATURAL JOIN is equivalent to CROSS JOIN when there are
no common column names is only strictly correct if it's an inner join;
you can't say e.g. CROSS LEFT JOIN. Better to explain it as meaning
JOIN ON TRUE, instead. Per a suggestion from David Johnston.
Discussion: https://postgr.es/m/CAKFQuwb+mYszQhDS9f_dqRrk1=Pe-S6D=XMkAXcDf4ykKPmgKQ@mail.gmail.com

Normally, a JoinExpr would have empty "quals" only if it came from CROSS
JOIN syntax. However, it's possible to get to this state by specifying
NATURAL JOIN between two tables with no common column names, and there
might be other ways too. The code previously printed no ON clause if
"quals" was empty; that's right for CROSS JOIN but syntactically invalid
if it's some type of outer join. Fix by printing ON TRUE in that case.
This got broken by commit 2ffa740be, which stopped using NATURAL JOIN
syntax in ruleutils output due to its brittleness in the face of
column renamings. Back-patch to 9.3 where that commit appeared.
Per report from Tushar Ahuja.
Discussion: https://postgr.es/m/98b283cd-6dda-5d3f-f8ac-87db8c76a3da@enterprisedb.com

We're throwing people into the guts of the syntax with not much context;
let's back up one step and point out that this goes inside a literal in
a CREATE FUNCTION command. Per suggestion from Kurt Kartaltepe.
Discussion: https://postgr.es/m/CACawnnyWAmH+au8nfZhLiFfWKjXy4d0kY+eZWfcxPRnjVfaa_Q@mail.gmail.com

It seems pretty confusing to have tests named both largeobject and
large_object. The latter is of very recent vintage (commit ff992c074),
so get rid of it in favor of merging into the former.
Also, enable the LO comment test that was added by commit 70ad7ed4e,
since the later commit added the then-missing pg_upgrade functionality.
The large_object.sql test case is almost completely redundant with that,
but not quite: it seems like creating a user-defined LO with an OID in
the system range might be an interesting case for pg_upgrade, so let's
keep it.
Like the earlier patch, back-patch to all supported branches.
Discussion: https://postgr.es/m/18665.1500306372@sss.pgh.pa.us

In the frontend Makefiles that pull in libpgfeutils, we'd generally
done it like this:
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
That method is badly broken, as seen in bug #14742 from Chris Ruprecht.
The -L flag for src/fe_utils ends up being placed after whatever random
-L flags are in LDFLAGS already. That puts us at risk of pulling in
libpgfeutils.a from some previous installation rather than the freshly
built one in src/fe_utils. Also, the lack of an "override" is hazardous
if someone tries to specify some LDFLAGS on the make command line.
The correct way to do it is like this:
override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS)
so that libpgfeutils, along with libpq, libpgport, and libpgcommon, are
guaranteed to be pulled in from the build tree and not from any referenced
system directory, because their -L flags will appear first.
In some places we'd been even lazier and done it like this:
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
which is subtly wrong in an additional way: on platforms where we can't
restrict the symbols exported by libpq.so, it allows libpgfeutils to
latch onto libpgport and libpgcommon symbols from libpq.so, rather than
directly from those static libraries as intended. This carries hazards
like those explained in the comments for the libpq_pgport macro.
In addition to fixing the broken libpgfeutils usages, I tried to
standardize on using $(libpq_pgport) like so:
override LDFLAGS := $(libpq_pgport) $(LDFLAGS)
even where libpgfeutils is not in the picture. This makes no difference
right now but will hopefully discourage future mistakes of the same ilk.
And it's more like the way we handle CPPFLAGS in libpq-using Makefiles.
In passing, just for consistency, make pgbench include PTHREAD_LIBS the
same way everyplace else does, ie just after LIBS rather than in some
random place in the command line. This might have practical effect if
there are -L switches in that macro on some platform.
It looks to me like the MSVC build scripts are not affected by this
error, but someone more familiar with them than I might want to double
check.
Back-patch to 9.6 where libpgfeutils was introduced. In 9.6, the hazard
this error creates is that a reinstallation might link to the prior
installation's copy of libpgfeutils.a and thereby fail to absorb a
minor-version bug fix.
Discussion: https://postgr.es/m/20170714125106.9231.13772@wrigleys.postgresql.org

When writing a backup to stdout with pg_basebackup on Windows, put stdout
to binary mode. Any CR bytes in the output will otherwise be output
incorrectly as CR+LF.
In the passing, standardize on using "_setmode" instead of "setmode", for
the sake of consistency. They both do the same thing, but according to
MSDN documentation, setmode is deprecated.
Fixes bug #14634, reported by Henry Boehlert. Patch by Haribabu Kommi.
Backpatch to all supported versions.
Discussion: https://www.postgresql.org/message-id/20170428082818.24366.13134@wrigleys.postgresql.org

The grammar will only accept something syntactically similar to a function
call in a function-in-FROM expression. However, there are various ways
to input something that ruleutils.c won't deparse that way, potentially
leading to a view or rule that fails dump/reload. Fix by inserting a
dummy CAST around anything that isn't going to deparse as a function
(which is one of the ways to get something like that in there in the
first place).
In HEAD, also make use of the infrastructure added by this to avoid
emitting unnecessary parentheses in CREATE INDEX deparsing. I did
not change that in back branches, thinking that people might find it
to be unexpected/unnecessary behavioral change.
In HEAD, also fix incorrect logic for when to add extra parens to
partition key expressions. Somebody apparently thought they could
get away with simpler logic than pg_get_indexdef_worker has, but
they were wrong --- a counterexample is PARTITION BY LIST ((a[1])).
Ignoring the prettyprint flag for partition expressions isn't exactly
a nice solution anyway.
This has been broken all along, so back-patch to all supported branches.
Discussion: https://postgr.es/m/10477.1499970459@sss.pgh.pa.us

The race condition goes like this:
1. GetNewTransactionId advances nextXid e.g. from 100 to 101
2. GetOldestActiveTransactionId reads the new nextXid, 101
3. GetOldestActiveTransactionId loops through the proc array. There are no
active XIDs there, so it returns 101 as the oldest active XID.
4. GetNewTransactionid stores XID 100 to MyPgXact->xid
So, GetOldestActiveTransactionId returned XID 101, even though 100 only
just started and is surely still running.
This would be hard to hit in practice, and even harder to spot any ill
effect if it happens. GetOldestActiveTransactionId is only used when
creating a checkpoint in a master server, and the race condition can only
happen on an online checkpoint, as there are no backends running during a
shutdown checkpoint. The oldestActiveXid value of an online checkpoint is
only used when starting up a hot standby server, to determine the starting
point where pg_subtrans is initialized from. For the race condition to
happen, there must be no other XIDs in the proc array that would hold back
the oldest-active XID value, which means that the missed XID must be a top
transaction's XID. However, pg_subtrans is not used for top XIDs, so I
believe an off-by-one error is in fact inconsequential. Nevertheless, let's
fix it, as it's clearly wrong and the fix is simple.
This has been wrong ever since hot standby was introduced, so backport to
all supported versions.
Discussion: https://www.postgresql.org/message-id/e7258662-82b6-7a45-56d4-99b337a32bf7@iki.fi

Further investigation shows that ruleutils isn't quite up to speed either
for cases where we have a domain-over-array: it needs to be prepared to
look past a CoerceToDomain at the top level of field and element
assignments, else it decompiles them incorrectly. Potentially this would
result in failure to dump/reload a rule, if it looked like the one in the
new test case. (I also added a test for EXPLAIN; that output isn't broken,
but clearly we need more test coverage here.)
Like commit b1cb32fb6, this bug is reachable in cases we already support,
so back-patch all the way.

compute_tsvector_stats() detoasted and kept in memory every tsvector value
in the sample, but that can be a lot of memory. The original bug report
described a case using over 10 gigabytes, with statistics target of 10000
(the maximum).
To fix, allocate a separate copy of just the lexemes that we keep around,
and free the detoasted tsvector values as we go. This adds some palloc/pfree
overhead, when you have a lot of distinct lexemes in the sample, but it's
better than running out of memory.
Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to
all supported versions.
Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org

If the number of tuples in the heap exceeds approximately INT_MAX/2,
this loop's calculation "2*i+1" could overflow, resulting in a crash.
Fix it by using unsigned int rather than int for the relevant local
variables; that shouldn't cost anything extra on any popular hardware.
Per bug #14722 from Sergey Koposov.
Original patch by Sergey Koposov, modified by me per a suggestion
from Heikki Linnakangas to use unsigned int not int64.
Back-patch to 9.4, where tuplesort.c grew the ability to sort as many
as INT_MAX tuples in-memory (commit 263865a48).
Discussion: https://postgr.es/m/20170629161637.1478.93109@wrigleys.postgresql.org

Commit 14e8803f1 removed the locking in SyncRepWaitForLSN, but that
introduced a race condition, where SyncRepWaitForLSN might see
syncRepState already set to SYNC_REP_WAIT_COMPLETE, but the process was
not yet removed from the queue. That tripped the assertion, that the
process should no longer be in the uqeue. Reorder the operations in
SyncRepWakeQueue to remove the process from the queue first, and update
syncRepState only after that, and add a memory barrier in between to make
sure the operations are made visible to other processes in that order.
Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro.
Backpatch down to 9.5, where the locking was removed.
Discussion: https://www.postgresql.org/message-id/20170629023623.1480.26508%40wrigleys.postgresql.org

Mostly in the new subscription-related commands. Backport the few that
were also present in older versions.
Thomas Munro
Discussion: https://www.postgresql.org/message-id/CAEepm=3CyW1QmXcXJXmqiJXtXzFDc8SvSfnxkEGD3Bkv2SrkeQ@mail.gmail.com

We allow INSERT and UPDATE commands to assign to the same column more than
once, as long as the assignments are to subfields or elements rather than
the whole column. However, this failed when the target column was a domain
over array rather than plain array. Fix by teaching process_matched_tle()
to look through CoerceToDomain nodes, and add relevant test cases.
Also add a group of test cases exercising domains over array of composite.
It's doubtless accidental that CREATE DOMAIN allows this case while not
allowing straight domain over composite; but it does, so we'd better make
sure we don't break it. (I could not find any documentation mentioning
either side of that, so no doc changes.)
It's been like this for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us

We've heard occasional reports of backend launch failing because
pgwin32_ReserveSharedMemoryRegion() fails, indicating that something
has already used that address space in the child process. It's not
very clear what, given that we disable ASLR in Windows builds, but
suspicion falls on antivirus products. It'd be better if we didn't
have to disable ASLR, anyway. So let's try to ameliorate the problem
by retrying the process launch after such a failure, up to 100 times.
Patch by me, based on previous work by Amit Kapila and others.
This is a longstanding issue, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com

Original wording had confusingly vague antecedent for "they", so replace
that with a more repetitive but clearer formulation. In passing, make the
link to the installation requirements section more specific. Per gripe
from Martin Mai, though this is not the fix he initially proposed.
Discussion: https://postgr.es/m/CAN_NWRu-cWuNaiXUjV3m4H-riWURuPW=j21bSaLADs6rjjzXgQ@mail.gmail.com

If the client closes an SSL connection, treat it the same as EOF on a
non-SSL connection. In particular, don't write a message in the log about
that.
Michael Paquier.
Discussion: https://www.postgresql.org/message-id/CAB7nPqSfyVV42Q2acFo%3DvrvF2gxoZAMJLAPq3S3KkjhZAYi7aw@mail.gmail.com

It's possible for WalSndWaitForWal to be asked to wait for WAL that doesn't
exist yet. That's fine, in fact it's the normal situation if we're caught
up; but when the client requests shutdown we should not keep waiting.
The previous coding could wait indefinitely if the source server was idle.
In passing, improve the rather weak comments in this area, and slightly
rearrange some related code for better readability.
Back-patch to 9.4 where this code was introduced.
Discussion: https://postgr.es/m/14154.1498781234@sss.pgh.pa.us

Buildfarm evidence shows that TCP_KEEPALIVE_THRESHOLD doesn't exist
after all on Solaris < 11. This means we need to take positive action to
prevent the TCP_KEEPALIVE code path from being taken on that platform.
I've chosen to limit it with "&& defined(__darwin__)", since it's unclear
that anyone else would follow Apple's precedent of spelling the symbol
that way.
Also, follow a suggestion from Michael Paquier of eliminating code
duplication by defining a couple of intermediate symbols for the
socket option.
In passing, make some effort to reduce the number of translatable messages
by replacing "setsockopt(foo) failed" with "setsockopt(%s) failed", etc,
throughout the affected files. And update relevant documentation so
that it doesn't claim to provide an exhaustive list of the possible
socket option names.
Like the previous commit (f0256c774), back-patch to all supported branches.
Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org

Commit 330b84d8c4 didn't contemplate the case where the public schema
has been dropped and introduced a query which fails when there is no
public schema into pg_dump (when used with -c).
Adjust the query used by pg_dump to handle the case where the public
schema doesn't exist and add tests to check that such a case no longer
fails.
Back-patch the specific fix to 9.6, as the prior commit was.
Adding tests for this case involved adding support to the pg_dump
TAP tests to work with multiple databases, which, while not a large
change, is a bit much to back-patch, so that's only done in master.
Addresses bug #14650
Discussion: https://www.postgresql.org/message-id/20170512181801.1795.47483%40wrigleys.postgresql.org

Turns out that the socket option for this is named TCP_KEEPALIVE_THRESHOLD,
at least according to the tcp(7P) man page for Solaris 11. (But since that
text refers to "SunOS", it's likely pretty ancient.) It appears that the
symbol TCP_KEEPALIVE does get defined on that platform, but it doesn't
seem to represent a valid protocol-level socket option. This leads to
bleats in the postmaster log, and no tcp_keepalives_idle functionality.
Per bug #14720 from Andrey Lizenko, as well as an earlier report from
Dhiraj Chawla that nobody had followed up on. The issue's been there
since we added the TCP_KEEPALIVE code path in commit 5acd417c8, so
back-patch to all supported branches.
Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org

M src/backend/libpq/pqcomm.c
M src/interfaces/libpq/fe-connect.c

Re-allow SRFs and window functions within sub-selects within aggregates.

check_agg_arguments_walker threw an error upon seeing a SRF or window
function, but that is too aggressive: if the function is within a
sub-select then it's perfectly fine. I broke the SRF case in commit
0436f6bde by copying the logic for window functions ... but that was
broken too, and had been since commit eaccfded9.
Repair both cases in HEAD, and the window function case back to 9.3.
9.2 gets this right.

By default, wal_retrieve_retry_interval is five seconds, which is far
more than is needed in any of our TAP tests, leaving the test cases
just twiddling their thumbs for significant stretches. Moreover,
because it's so large, we get basically no testing of the retry-before-
master-is-ready code path. Hence, make PostgresNode::init set up
wal_retrieve_retry_interval = '500ms' as part of its customization of
test clusters' postgresql.conf. This shaves quite a few seconds off
the runtime of the recovery TAP tests.
Back-patch into 9.6. We have wal_retrieve_retry_interval in 9.5,
but the test infrastructure isn't there.
Discussion: https://postgr.es/m/31624.1498500416@sss.pgh.pa.us

When a walreceiver dies, the startup process will notice that and send
a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new
walreceiver to be launched. There's a race condition, which at least
in HEAD is very easy to hit, whereby the postmaster might see that
signal before it processes the SIGCHLD from the walreceiver process.
In that situation, sigusr1_handler() just dropped the start request
on the floor, reasoning that it must be redundant. Eventually, after
10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a
fresh request --- but that's a long time if the connection could have
been re-established almost immediately.
Fix it by setting a state flag inside the postmaster that we won't
clear until we do launch a walreceiver. In cases where that results
in an extra walreceiver launch, it's up to the walreceiver to realize
it's unwanted and go away --- but we have, and need, that logic anyway
for the opposite race case.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there in test cases where
a master server is stopped and restarted while leaving streaming
slaves active.
This logic has been broken all along, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us

The stats collector disregards inquiry messages that bear a cutoff_time
before when it last wrote the relevant stats file. That's fine, but at
startup when it reads the "permanent" stats files, it absorbed their
timestamps as if they were the times at which the corresponding temporary
stats files had been written. In reality, of course, there's no data
out there at all. This led to disregarding inquiry messages soon after
startup if the postmaster had been shut down and restarted within less
than PGSTAT_STAT_INTERVAL; which is a pretty common scenario, both for
testing and in the field. Requesting backends would hang for 10 seconds
and then report failure to read statistics, unless they got bailed out
by some other backend coming along and making a newer request within
that interval.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there because the autovacuum
launcher hangs for 10 seconds when it can't get statistics at startup,
thus preventing a second shutdown from occurring promptly. We might
want to do some things in the autovac code to make it less prone to
getting stuck that way, but this change is a good bug fix regardless.
In passing, also fix pgstat_read_statsfiles() to ensure that it
re-zeroes its global stats variables if they are corrupted by a
short read from the stats file. (Other reads in that function
go into temp variables, so that the issue doesn't arise.)
This has been broken since we created the separation between permanent
and temporary stats files in 8.4, so back-patch to all supported branches.
Discussion: https://postgr.es/m/16860.1498442626@sss.pgh.pa.us

Fix its header comment, which described the old behavior of the <N>
phrase distance operator; we missed updating that in commit 028350f61.
Also, reset errno before strtol() call, to defend against the possibility
that it was already ERANGE at entry. (The lack of complaints says that
it generally isn't, but this is at least a latent bug.) Very minor
stylistic improvements as well.
Victor Drobny noted the obsolete comment, I noted the errno issue.
Back-patch to 9.6 where this code was added, just in case the errno
issue is a live bug in some cases.
Discussion: https://postgr.es/m/2b5382fdff9b1f79d5eb2c99c4d2cbe2@postgrespro.ru

When promoting a standby just after a XLOG_SWITCH record was replayed,
and next segment(s) are already are locally available (via walsender,
restore_command + trigger/recovery target), that segment could
accidentally be recycled onto the past of the new timeline. Later
checkpointer would create a .ready file for it, assuming there was an
error during creation, and it would get archived. That causes trouble
if another standby is later brought up from a basebackup from before
the timeline creation, because it would try to read the
segment, because XLogFileReadAnyTLI just tries all possible timelines,
which doesn't have valid contents. Thus replay would fail.
The problem, if already occurred, can be fixed by removing the segment
and/or having restore_command filter it out.
The reason for the creation of such "phantom" segments was, that after
an XLOG_SWITCH record the EndOfLog variable points to the beginning of
the next segment, and RemoveXlogFile() used XLByteToPrevSeg().
Normally RemoveXlogFile() doing so is harmless, because the last
segment will still exist preventing InstallXLogFileSegment() from
causing harm, but just after promotion there's no previous segment on
the new timeline.
Fix that by using XLByteToSeg() instead of XLByteToPrevSeg().
Author: Andres Freund
Reported-By: Greg Burek
Discussion: https://postgr.es/m/20170619073026.zcwpe6mydsaz5ygd@alap3.anarazel.de
Backpatch: 9.2-, bug older than all supported versions

When commit 0f33a719fdbb5d8c43839ea0d2c90cd03e2af2d2 removed the
instructions to start/stop the new cluster before running rsync, it was
now possible for pg_resetwal/pg_resetxlog to leave the final WAL record
at wal_level=minimum, preventing upgraded standby servers from
reconnecting.
This patch fixes that by having pg_upgrade unconditionally start/stop
the new cluster after pg_resetwal/pg_resetxlog has run.
Backpatch through 9.2 since, though the instructions were added in PG
9.5, they worked all the way back to 9.2.
Discussion: https://postgr.es/m/20170620171844.GC24975@momjian.us
Backpatch-through: 9.2

When materialized views were added, psql's \d commands were made to
treat them as a separate object category ... but not everyplace in the
documentation or comments got the memo.
Noted by David Johnston. Back-patch to 9.3 where matviews came in.
Discussion: https://postgr.es/m/CAKFQuwb27M3VXRhHErjCpkWwN9eKThbqWb1=trtoXi9_ejqPXQ@mail.gmail.com

David Rowley found that the "use the smallest per-column selectivity"
heuristic applied in some cases by get_foreign_key_join_selectivity()
was badly off if the FK columns are independent, producing estimates
much worse than we got before that code was added in 9.6.
One case where that heuristic was used was for LEFT and FULL outer joins
with the referenced rel on the outside of the join. But we should not
really need to special-case those here. eqjoinsel() never has had such a
special case; the correction is applied by calc_joinrel_size_estimate()
instead. Let's just estimate such cases like inner joins and rely on that
later adjustment. (I think there was something of a thinko here, in that
the comments seem to be thinking about the selectivity as defined for
semi/anti joins; but that shouldn't apply to left/full joins.) Add a
regression test exercising such a case to show that this is sane in
at least some cases.
The other case where we used that heuristic was for SEMI/ANTI outer joins,
either if the referenced rel was on the outside, or if it was on the inside
but was part of a join within the RHS. In either case, the FK doesn't give
us a lot of traction towards estimating the selectivity. To ensure that
we don't have regressions from what happened before 9.6, let's punt by
ignoring the FK in such cases and applying the traditional selectivity
calculation. (We might be able to improve on that later, but for now
I just want to be sure it's not worse than 9.5.)
Report and patch by David Rowley, simplified a bit by me. Back-patch
to 9.6 where this code was added.
Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com

The combination of -Z -Fp and output to stdout resulted in corrupted
output data, because we left stdout in text mode, resulting in newline
conversion being done on the compressed stream. Switch stdout to binary
mode for this case, at the same place where we do it for non-text output
formats.
Report and patch by Kuntal Ghosh, tested by Ashutosh Sharma and Neha
Sharma. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAGz5QCJPvbBjXAmJuGx1B_41yVCetAJhp7rtaDf7XQGWuB1GSw@mail.gmail.com

When, during logical decoding, a transaction gets too big, it's
contents get spilled to disk. Not just the top-transaction gets
spilled, but *also* all of its subtransactions, even if they're not
that large themselves. Unfortunately we didn't clean up
such small spilled subtransactions from disk.
Fix that, by keeping better track of whether a transaction has been
spilled to disk.
Author: Andres Freund
Reported-By: Dmitriy Sarafannikov, Fabrízio de Royes Mello
Discussion:
https://postgr.es/m/1457621358.355011041@f382.i.mail.ru
https://postgr.es/m/CAFcNs+qNMhNYii4nxpO6gqsndiyxNDYV0S=JNq0v_sEE+9PHXg@mail.gmail.com
Backpatch: 9.4-, where logical decoding was introduced

When a new base type is created using the old-style procedure of first
creating the input/output functions with "opaque" in place of the base
type, the "opaque" argument/return type is changed to the final base type,
on CREATE TYPE. However, we did not create a pg_depend record when doing
that, so the functions were left not depending on the type.
Fixes bug #14706, reported by Karen Huddleston.
Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org

We had three occurrences of essentially the same coding pattern
wherein we tried to retrieve a query result from a libpq connection
without blocking. In the case where PQconsumeInput failed (typically
indicating a lost connection), all three loops simply gave up and
returned, forgetting to clear any previously-collected PGresult
object. Since those are malloc'd not palloc'd, the oversight results
in a process-lifespan memory leak.
One instance, in libpqwalreceiver, is of little significance because
the walreceiver process would just quit anyway if its connection fails.
But we might as well fix it.
The other two instances, in postgres_fdw, are somewhat more worrisome
because at least in principle the scenario could be repeated, allowing
the amount of memory leaked to build up to something worth worrying
about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS
calls, as well as other calls that could potentially elog(ERROR),
providing another way to exit without having cleared the PGresult.
Here we need to add PG_TRY logic similar to what exists in quite a
few other places in postgres_fdw.
Coverity noted the libpqwalreceiver bug; I found the other two cases
by checking all calls of PQconsumeInput.
Back-patch to all supported versions as appropriate (9.2 lacks
postgres_fdw, so this is really quite unexciting for that branch).
Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us

pg_upgrade never used Windows junction points but instead always used
Windows hard links.
Reported-by: Adrian Klaver
Discussion: https://postgr.es/m/6a638c60-90bb-4921-8ee4-5fdad68f8b09@aklaver.com
Backpatch-through: 9.3, where the mention first appeared

It was unsafe to instruct users to start/stop the server after
pg_upgrade was run but before the standby servers were rsync'ed. The
new instructions avoid this.
RELEASE NOTES: This fix should be mentioned in the minor release notes.
Reported-by: Dmitriy Sarafannikov and Sergey Burladyan
Discussion: https://postgr.es/m/87wp8o506b.fsf@seb.koffice.internal
Backpatch-through: 9.5, where standby server upgrade instructions first appeared

It was explained that read only transactions (not in standby) allow to
update sequences. This had been wrong since the commit:
05d8a561ff85db1545f5768fe8d8dc9d99ad2ef7
Discussion: https://www.postgresql.org/message-id/20170614.110826.425627939780392324.t-ishii%40sraoss.co.jp

M doc/src/sgml/high-availability.sgml

Assert that we don’t invent relfilenodes or type OIDs in binary upgrade.

During pg_upgrade's restore run, all relfilenode choices should be
overridden by commands in the dump script. If we ever find ourselves
choosing a relfilenode in the ordinary way, someone blew it. Likewise for
pg_type OIDs. Since pg_upgrade might well succeed anyway, if there happens
not to be a conflict during the regression test run, we need assertions
here to keep us on the straight and narrow.
We might someday be able to remove the assertion in GetNewRelFileNode,
if pg_upgrade is rewritten to remove its assumption that old and new
relfilenodes always match. But it's hard to see how to get rid of the
pg_type OID constraint, since those OIDs are embedded in user tables
in some cases.
Back-patch as far as 9.5, because of the risk of back-patches breaking
something here even if it works in HEAD. I'd prefer to go back further,
but 9.4 fails both assertions due to get_rel_infos()'s use of a temporary
table. We can't use the later-branch solution of a CTE for compatibility
reasons (cf commit 5d16332e9), and it doesn't seem worth inventing some
other way to do the query. (I did check, by dint of changing the Asserts
to elog(WARNING), that there are no other cases of unwanted OID assignments
during 9.4's regression test run.)
Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us

This reverts commit 56b6ef893fee9e9bf47d927a02f4d1ea911f4d9c and instead
makes vcregress.pl parse out PROVE_FLAGS from a command line argument
when doing a TAP test, thus making it consistent with the makefile
treatment.
Discussion: https://postgr.es/m/c26a7416-2fb9-34ab-7991-618c922f896e%402ndquadrant.com
Backpatch to 9.4 like previous patch.

Commit f039eaac7131ef2a4cf63a10cf98486f8bcd09d2, later back-patched
with commit 1b812afb0eafe125b820cc3b95e7ca03821aa675, allowed many of
the queries issued by postgres_fdw to fetch remote data to respond to
cancel interrupts in a timely fashion. However, it didn't do anything
about the transaction control commands, which remained
noninterruptible.
Improve the situation by changing do_sql_command() to retrieve query
results using pgfdw_get_result(), which uses the asynchronous
interface to libpq so that it can check for interrupts every time
libpq returns control. Since this might result in a situation
where we can no longer be sure that the remote transaction state
matches the local transaction state, add a facility to force all
levels of the local transaction to abort if we've lost track of
the remote state; without this, an apparently-successful commit of
the local transaction might fail to commit changes made on the
remote side. Also, add a 60-second timeout for queries issue during
transaction abort; if that expires, give up and mark the state of
the connection as unknown. Drop all such connections when we exit
the local transaction. Together, these changes mean that if we're
aborting the local toplevel transaction anyway, we can just drop the
remote connection in lieu of waiting (possibly for a very long time)
for it to complete an abort.
This still leaves quite a bit of room for improvement. PQcancel()
has no asynchronous interface, so if we get stuck sending the cancel
request we'll still hang. Also, PQsetnonblocking() is not used, which
means we could block uninterruptibly when sending a query. There
might be some other optimizations possible as well. Nonetheless,
this allows us to escape a wait for an unresponsive remote server
quickly in many more cases than previously.
Report by Suraj Kharage. Patch by me and Rafia Sabih. Review
and testing by Amit Kapila and Tushar Ahuja.
Discussion: http://postgr.es/m/CAF1DzPU8Kx+fMXEbFoP289xtm3bz3t+ZfxhmKavr98Bh-C0TqQ@mail.gmail.com

If authentication over an SSL connection fails, with sslmode=prefer,
libpq will reconnect without SSL and retry. However, we did not clear
the variables related to GSS, SSPI, and SASL authentication state, when
reconnecting. Because of that, the second authentication attempt would
always fail with a "duplicate GSS/SASL authentication request" error.
pg_SSPI_startup did not check for duplicate authentication requests like
the corresponding GSS and SASL functions, so with SSPI, you would leak
some memory instead.
Another way this could manifest itself, on version 10, is if you list
multiple hostnames in the "host" parameter. If the first server requests
Kerberos or SCRAM authentication, but it fails, the attempts to connect to
the other servers will also fail with "duplicate authentication request"
errors.
To fix, move the clearing of authentication state from closePGconn to
pgDropConnection, so that it is cleared also when re-connecting.
Patch by Michael Paquier, with some kibitzing by me.
Backpatch down to 9.3. 9.2 has the same bug, but the code around closing
the connection is somewhat different, so that this patch doesn't apply.
To fix this in 9.2, I think we would need to back-port commit 210eb9b743
first, and then apply this patch. However, given that we only bumped into
this in our own testing, we haven't heard any reports from users about
this, and that 9.2 will be end-of-lifed in a couple of months anyway, it
doesn't seem worth the risk and trouble.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRuOUm0MyJaUy9L3eXYJU3AKCZ-0-03=-aDTZJGV4GyWw@mail.gmail.com

Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).
This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.
A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.
Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0

When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and
throws a PANIC if so. At that point, only walsenders are still
active, so one might think this could not happen, but walsenders can
also generate WAL, for instance in BASE_BACKUP and logical decoding
related commands (e.g. via hint bits). So they can trigger this panic
if such a command is run while the shutdown checkpoint is being
written.
To fix this, divide the walsender shutdown into two phases. First,
checkpointer, itself triggered by postmaster, sends a
PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If the backend
is idle or runs an SQL query this causes the backend to shutdown, if
logical replication is in progress all existing WAL records are
processed followed by a shutdown. Otherwise this causes the walsender
to switch to the "stopping" state. In this state, the walsender will
reject any further replication commands. The checkpointer begins the
shutdown checkpoint once all walsenders are confirmed as
stopping. When the shutdown checkpoint finishes, the postmaster sends
us SIGUSR2. This instructs walsender to send any outstanding WAL,
including the shutdown checkpoint record, wait for it to be replicated
to the standby, and then exit.
Author: Andres Freund, based on an earlier patch by Michael Paquier
Reported-By: Fujii Masao, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de
Backpatch: 9.4, where logical decoding was introduced

The non-participation in procsignal was a problem for both changes in
master, e.g. parallelism not working for normal statements run in
walsender backends, and older branches, e.g. recovery conflicts and
catchup interrupts not working for logical decoding walsenders.
This commit thus replaces the previous WalSndXLogSendHandler with
procsignal_sigusr1_handler. In branches since db0f6cad48 that can
lead to additional SetLatch calls, but that only rarely seems to make
a difference.
Author: Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170421014030.fdzvvvbrz4nckrow@alap3.anarazel.de
Backpatch: 9.4, earlier commits don't seem to benefit sufficiently

The current method of computing the record length (excluding the
lenght of full-page images) has been wrong since the WAL format has
been revamped in 2c03216d831160bedd72d45f712601b6f7d03f1c. Only the
main record's length was counted, but that can be significantly too
little if there's data associated with further blocks.
Fix by computing the record length as total_lenght - fpi_length.
Reported-By: Chen Huajun
Bug: #14687
Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/20170603165939.1436.58887@wrigleys.postgresql.org
Backpatch: 9.5-

Previously this was not allowed, as copy.c didn't set the
CURSOR_OPT_PARALLEL_OK flag when planning the query. Set it.
While the lack of parallel query for COPY isn't strictly speaking a
bug, it does prevent parallelism from being used in a facility
commonly used to run long running queries. Thus backpatch to 9.6.
Author: Andres Freund
Discussion: https://postgr.es/m/20170531231958.ihanapplorptykzm@alap3.anarazel.de
Backpatch: 9.6, where parallelism was introduced.

On some platforms, -fpic fails for sufficiently large shared libraries.
We've mostly not hit that boundary yet, but there are some extensions
such as Citus and pglogical where it's becoming a problem. A bit of
research suggests that the penalty for -fPIC is small, in the
single-digit-percentage range --- and there's none at all on popular
platforms such as x86_64. So let's just default to -fPIC everywhere
and provide one less thing for extension developers to worry about.
Per complaint from Christoph Berg. Back-patch to all supported branches.
(I did not bother to touch the recently-removed Makefiles for sco and
unixware in the back branches, though. We'd have no way to test that
it doesn't break anything on those platforms.)
Discussion: https://postgr.es/m/20170529155850.qojdfrwkkqnjb3ap@msg.df7cb.de

Remove some gratuituous message differences by making the AM name
previously embedded in each message be a %s instead. While at it, get
rid of terminology that's unclear and unnecessary in one message.
Discussion: https://postgr.es/m/20170523001557.bq2hbq7hxyvyw62q@alvherre.pgsql

pg_resetwal (formerly pg_resetxlog) doesn't insist on finding a matching
version number in pg_control, and that seems like an important thing to
preserve since recovering from corrupt pg_control is a prime reason to
need to run it. However, that means you can try to run it against a
data directory of a different major version, which is at best useless
and at worst disastrous. So as to provide some protection against that
type of pilot error, inspect PG_VERSION at startup and refuse to do
anything if it doesn't match. PG_VERSION is read-only after initdb,
so it's unlikely to get corrupted, and even if it were corrupted it would
be easy to fix by hand.
This hazard has been there all along, so back-patch to all supported
branches.
Michael Paquier, with some kibitzing by me
Discussion: https://postgr.es/m/f4b8eb91-b934-8a0d-b3cc-68f06e2279d1@enterprisedb.com

The NumericOnly grammar production accepted ICONST, + ICONST, - ICONST,
FCONST, and - FCONST, but for some reason not + FCONST. This led to
strange inconsistencies like
regression=# set random_page_cost = +4;
SET
regression=# set random_page_cost = 4000000000;
SET
regression=# set random_page_cost = +4000000000;
ERROR: syntax error at or near "4000000000"
(because 4000000000 is too large to be an ICONST). While there's
no actual functional reason to need to write a "+", if we allow
it for integers it seems like we should allow it for numerics too.
It's been like that forever, so back-patch to all supported branches.
Discussion: https://postgr.es/m/30908.1496006184@sss.pgh.pa.us

Commit 9aa3c782c added code to allow CREATE TABLE/CREATE TYPE to not fail
when the desired type name conflicts with an autogenerated array type, by
dint of renaming the array type out of the way. But I (tgl) overlooked
that the same case arises in ALTER TABLE/TYPE RENAME. Fix that too.
Back-patch to all supported branches.
Report and patch by Vik Fearing, modified a bit by me
Discussion: https://postgr.es/m/0f4ade49-4f0b-a9a3-c120-7589f01d1eb8@2ndquadrant.com

If an operator class has no operators or functions, and doesn't need
a STORAGE clause, we emitted "CREATE OPERATOR CLASS ... AS ;" which
is syntactically invalid. Fix by forcing a STORAGE clause to be
emitted anyway in this case.
(At some point we might consider changing the grammar to allow CREATE
OPERATOR CLASS without an opclass_item_list. But probably we'd want to
omit the AS in that case, so that wouldn't fix this pg_dump issue anyway.)
It's been like this all along, so back-patch to all supported branches.
Daniel Gustafsson, tweaked by me to avoid a dangling-pointer bug
Discussion: https://postgr.es/m/D9E5FC64-7A37-4F3D-B946-7E4FB468F88A@yesql.se

This patch replaces isspace() calls with scanner_isspace() in functions
that are likely to be presented with non-ASCII input. isspace() has
the small advantage that it will correctly recognize no-break space
in single-byte encodings (such as LATIN1); but it cannot work successfully
for any multibyte character, and depending on platform it might return
false positive results for some fragments of multibyte characters. That's
disastrous for functions that are trying to discard whitespace between
valid strings, as noted in bug #14662 from Justin Muise. Even treating
no-break space as whitespace is pretty questionable for the usages touched
here, because the core scanner would think it is an identifier character.
Affected functions are parse_ident(), parseNameAndArgTypes (underlying
regprocedurein() and siblings), SplitIdentifierString (used for parsing
GUCs and options that are qualified names or lists of names), and
SplitDirectoriesString (used for parsing GUCs that are lists of
directories).
All the functions adjusted here are parsing SQL identifiers and similar
constructs, so it's reasonable to insist that their definition of
whitespace match the core scanner. So we can hope that this won't cause
many backwards-compatibility problems. I've left alone isspace() calls
in places that aren't really expecting any non-ASCII input characters,
such as float8in().
Back-patch to all supported branches.
Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us

The cash_div_intX functions applied rint() to the result of the division.
That's not merely useless (because the result is already an integer) but
it causes precision loss for values larger than 2^52 or so, because of
the forced conversion to float8.
On the other hand, the cash_mul_fltX functions neglected to apply rint() to
their multiplication results, thus possibly causing off-by-one outputs.
Per C standard, arithmetic between any integral value and a float value is
performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced
answers good to only about six digits, even when the float value is exact.
We can improve matters noticeably by widening the float inputs to double.
(It's tempting to consider using "long double" arithmetic if available,
but that's probably too much of a stretch for a back-patched fix.)
Also, document that cash_div_intX operators truncate rather than round.
Per bug #14663 from Richard Pistole. Back-patch to all supported branches.
Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us

This is more secure, and saves a redirect since we no longer accept
plain HTTP connections on the website.
References in code comments should probably be updated too, but
that doesn't seem to need back-patching, whereas this does.
Also, in the 9.2 branch, remove suggestion that you can get the
source code via FTP, since that service will be shut down soon.
Daniel Gustafsson, with a few additional changes by me
Discussion: https://postgr.es/m/9A2C89A7-0BB8-41A8-B288-8B7BD09D7D44@yesql.se

When stdin is a terminal, it's possible to end a COPY FROM STDIN with
a keyboard EOF signal (typically control-D), and then keep on issuing
SQL commands. One would expect another COPY FROM STDIN to work as well,
but on some platforms it did not. This turns out to be because we were
not resetting the stream's feof() flag, and BSD-ish versions of fread()
and fgets() won't attempt to read more data if that's set.
The misbehavior is observed on BSDen (including macOS), but not Linux,
Windows, or SysV-ish Unixen, which makes this a portability bug not
just a missing feature.
Add a clearerr() call to fix the behavior, and improve the prompt that's
issued when copying from a TTY to mention that EOF signals work.
It's been like this forever, so back-patch to all supported branches.
Thomas Munro
Discussion: https://postgr.es/m/CAEepm=0MCGfYf=JAMiYhO6JPtv9-3ZfBo8fcGeCZ8oMzaw+Z+Q@mail.gmail.com

Modifying the permissions of a persistent file isn't really much nicer
than modifying its contents, even if git doesn't currently notice it.
Adjust the test script to make a copy and set the permissions of that
instead.
Michael Paquier, per a gripe from me. Back-patch to 9.5 where these
tests were introduced.
Discussion: https://postgr.es/m/14836.1494885946@sss.pgh.pa.us

The CommentStmt made by RebuildConstraintComment() has to pstrdup the
relation name, else it will contain a dangling pointer after that
relcache entry is flushed. (I'm less sure that pstrdup'ing conname
is necessary, but let's be safe.) Failure to do this leads to weird
errors or crashes, as reported by Marko Elezovic.
Bug introduced by commit e42375fc8, so back-patch to 9.5 as that was.
Fix by David Rowley, regression test by Michael Paquier
Discussion: https://postgr.es/m/DB6PR03MB30775D58E732D4EB0C13725B9AE00@DB6PR03MB3077.eurprd03.prod.outlook.com

Commit 60690a6fe attempted to fix the wait_for_stats() function in this
test so that it would wait properly if the tenk2 scans were done in
parallel workers instead of the main session (typically as a consequence of
force_parallel_mode being turned on). However, we made it test for whether
the main session's actions had been reported by looking for inserts on
'trunc_stats_test'. This is the Wrong Thing, because those aren't the last
updates we expect the main session to do. As shown by recent failures on
buildfarm member frogmouth, it's entirely likely that the trunc_stats_test
updates will be reported in a separate message from later updates, which
means there can be a window in which wait_for_stats() will exit but not all
the updates we are expecting to see will have arrived. We should test for
the last updates we're expecting, namely those on 'trunc_stats_test4'.
Unfortunately, I doubt that this explains frogmouth's failures, because
there's no reason to believe that it's running the tenk2 queries in
parallel. Still, the test is wrong on its own terms, so fix and back-patch
to 9.6 where parallel query came in.

Ultra-modern versions of the perl Data::Dumper module have apparently
changed how they indent output. Instead of trying to keep up we choose
to tell it to supporess all indentation in the hstore_plperl regression
tests.
Backpatch to 9.5 where this feature was introduced.

Before 955a684e0401 logical decoding snapshot maintenance needed to
cope with transactions it might not have seen in their entirety. For
such transactions we'd to assume they modified the catalog (could have
happened before we were watching), and thus a new snapshot had to be
built, and distributed to concurrently running transactions.
That's problematic because building a new snapshot isn't that cheap ,
especially as the the array of committed transactions needs to be
sorted. When creating a slot on a server with a lot of transactions,
this could make logical slot creation infeasibly expensive.
After 955a684e0401 there's no need to deal with transaction that
aren't guaranteed to be fully observable. That allows to avoid
building snapshots for transactions that haven't modified catalog,
even before reaching consistency.
While this isn't necessarily a bugfix, slot creation being impossible
in some production workloads, is severe enough to warrant
backpatching.
Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced

The snapshot assembly during the creation of logical slots relied
waiting for transactions in xl_running_xacts to end, by checking for
their commit/abort records. Unfortunately, despite locking, it is
possible to see an xl_running_xact record listing transactions as
ready, that have already WAL-logged an commit/abort record, as the
locking just prevents the ProcArray to be adjusted, and the commit
record has to be logged first.
That lead to either delayed or hanging snapshot creation, because
snapbuild.c would wait "forever" to see commit/abort records for some
transactions. That hang resolved only if a xl_running_xacts record
without any running transactions happened to be logged, far from
certain on a busy server.
It's impractical to prevent that via more heavyweight locking, the
likelihood of deadlocks and significantly increased contention would
be too big.
Instead change the initial snapshot creation to be solely based on
tracking the oldest running transaction via
xl_running_xacts->oldestRunningXid - that actually ends up
significantly simplifying the code. That has two disadvantages:
1) Because we cannot fully "trust" the contents of xl_running_xacts,
we cannot use it to build the initial snapshot. Instead we have to
wait twice for all running transactions to finish.
2) Previously a slot, unless the race occurred, could be created when
the all transaction perceived as running based on commit/abort
records, now we have to wait for the next xl_running_xacts record.
To address that, trigger logging new xl_running_xacts record from
within snapbuild.c exactly when necessary.
Unfortunately snabuild.c's SnapBuild is stored on disk, one of the
stupider ideas of a certain Mr Freund, so we can't change it in a
minor release. As this is going to be backpatched, we have to hack
around a bit to keep on-disk compatibility. A later commit will
rejigger that on master.
Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced

We have now grown enough registerable syscache-invalidation callback
functions that the original assumption that there would be few of them
is causing performance problems. In particular, let's fix things so that
CallSyscacheCallbacks doesn't have to search the whole array to find
which callback(s) to invoke for a given cache ID. Preserve the original
behavior that callbacks are called in order of registration, just in
case there's someplace that depends on that (which I doubt).
In support of this, export the number of syscaches from syscache.h.
People could have found that out anyway from the enum, but adding a
#define makes that much safer.
This provides a useful additional speedup in Mathieu Fenniak's
logical-decoding test case, although we're reaching the point of
diminishing returns there. I think any further improvement will have
to come from reducing the number of cache invalidations that are
triggered in the first place. Still, we can hope that this change
gives some incremental benefit for all invalidation scenarios.
Back-patch to 9.4 where logical decoding was introduced.
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com

This has to be backpatched to all supported releases so release markup
added to HEAD and copied to back branches matches the existing markup.
Reported-by: Peter Eisentraut
Discussion: 2b8a2552-fffa-f7c8-97c5-14db47a87731@2ndquadrant.com
Author: initial patch and sample markup by Peter Eisentraut
Backpatch-through: 9.2

A test case provided by Mathieu Fenniak shows that hash_seq_search'ing
this hashtable can consume a very significant amount of overhead during
logical decoding, which triggers frequent cache invalidation. Testing
suggests that the actual population of the hashtable is often no more
than a few dozen entries, so we can cut the overhead just by dropping
the initial number of buckets down from 1024 --- I chose to cut it to 64.
(In situations where we do have a significant number of entries, we
shouldn't get any real penalty from doing this, as the dynahash.c code
will resize the hashtable automatically.)
This gives a further factor-of-two savings in Mathieu's test case.
That may be overly optimistic for real-world benefit, as real cases
may have larger average table populations, but it's hard to see it
turning into a net negative for any workload.
Back-patch to 9.4 where relfilenodemap.c was introduced.
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com

A test case provided by Mathieu Fenniak shows that the initial search for
the target catcache in CatalogCacheIdInvalidate consumes a very significant
amount of overhead in cases where cache invalidation is triggered but has
little useful work to do. There is no good reason for that search to exist
at all, as the index array maintained by syscache.c allows direct lookup of
the catcache from its ID. We just need a frontend function in syscache.c,
matching the division of labor for most other cache-accessing operations.
While there's more that can be done in this area, this patch alone reduces
the runtime of Mathieu's example by 2X. We can hope that it offers some
useful benefit in other cases too, although usually cache invalidation
overhead is not such a striking fraction of the total runtime.
Back-patch to 9.4 where logical decoding was introduced. It might be
worth going further back, but presently the only case we know of where
cache invalidation is really a significant burden is in logical decoding.
Also, older branches have fewer catcaches, reducing the possible benefit.
(Note: although this nominally changes catcache's API, we have always
documented CatalogCacheIdInvalidate as a private function, so I would
have little sympathy for an external module calling it directly. So
backpatching should be fine.)
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com

On MSVC builds and on back branches that means removing the hardcoded
--verbose setting. On master for Unix that means removing the empty
setting in the global Makefile so that the value can be acquired from
the environment as well as from the make arguments.
Backpatch to 9.4 where we introduced TAP tests

On Unix this path is detected via the use of xml2-config, but that's not
available on Windows. This means that users building with libxml2 will
no longer need to move things around from the standard libxml2
installation for MSVC builds.
Backpatch to all live branches.

Increase from the historical value of 32 to 64. We are up to 31 callers
of CacheRegisterSyscacheCallback() in HEAD, so if they were all to be
exercised in one process that would leave only one slot for add-on modules.
It's probably not possible for that to happen, but still we clearly need
more daylight here. (At some point it might be worth making the array
dynamically resizable; but since we've never heard a complaint of "out of
syscache_callback_list slots" happening in the field, I doubt it's worth
it yet.)
Back-patch as far as 9.4, which is where we increased the companion limit
MAX_RELCACHE_CALLBACKS (cf commit f01d1ae3a). It's not as urgent in
released branches, which have only a couple dozen call sites in core, but
it still seems that somebody might hit the limit before these branches die.
Discussion: https://postgr.es/m/12184.1494450131@sss.pgh.pa.us