lo_open registers the currently active snapshot, and checks if the
large object exists after that. Normally, snapshots registered by lo_open
are unregistered at end of transaction when the lo descriptor is closed, but
if we error out before the lo descriptor is added to the list of open
descriptors, it is leaked. Fix by moving the snapshot registration to after
checking if the large object exists.
Reported by Pavel Stehule. Backpatch to 8.4. The snapshot registration
system was introduced in 8.4, so prior versions are not affected (and not
supported, anyway).

Fix spurious warning after vacuuming a page on a table with no indexes.

There is a rare race condition, when a transaction that inserted a tuple
aborts while vacuum is processing the page containing the inserted tuple.
Vacuum prunes the page first, which normally removes any dead tuples, but
if the inserting transaction aborts right after that, the loop after
pruning will see a dead tuple and remove it instead. That's OK, but if the
page is on a table with no indexes, and the page becomes completely empty
after removing the dead tuple (or tuples) on it, it will be immediately
marked as all-visible. That's OK, but the sanity check in vacuum would
throw a warning because it thinks that the page contains dead tuples and
was nevertheless marked as all-visible, even though it just vacuumed away
the dead tuples and so it doesn't actually contain any.
Spotted this while reading the code. It's difficult to hit the race
condition otherwise, but can be done by putting a breakpoint after the
heap_page_prune() call.
Backpatch all the way to 8.4, where this code first appeared.

In libpq, we set up and pass to OpenSSL callback routines to handle
locking. When we run out of SSL connections, we try to clean things
up by de-registering the hooks. Unfortunately, we had a few calls
into the OpenSSL library after these hooks were de-registered during
SSL cleanup which lead to deadlocking. This moves the thread callback
cleanup to be after all SSL-cleanup related OpenSSL library calls.
I've been unable to reproduce the deadlock with this fix.
In passing, also move the close_SSL call to be after unlocking our
ssl_config mutex when in a failure state. While it looks pretty
unlikely to be an issue, it could have resulted in deadlocks if we
ended up in this code path due to something other than SSL_new
failing. Thanks to Heikki for pointing this out.
Back-patch to all supported versions; note that the close_SSL issue
only goes back to 9.0, so that hunk isn't included in the 8.4 patch.
Initially found and reported by Vesa-Matti J Kari; many thanks to
both Heikki and Andres for their help running down the specific
issue and reviewing the patch.

Once the administrator has called for an immediate shutdown or a backend
crash has triggered a reinitialization, no mere SIGINT or SIGTERM should
change that course. Such derailment remains possible when the signal
arrives before quickdie() blocks signals. That being a narrow race
affecting most PostgreSQL signal handlers in some way, leave it for
another patch. Back-patch this to all supported versions.

The previous coding attempted to activate all the GUC settings specified
in SET clauses, so that the function validator could operate in the GUC
environment expected by the function body. However, this is problematic
when restoring a dump, since the SET clauses might refer to database
objects that don't exist yet. We already have the parameter
check_function_bodies that's meant to prevent forward references in
function definitions from breaking dumps, so let's change CREATE FUNCTION
to not install the SET values if check_function_bodies is off.
Authors of function validators were already advised not to make any
"context sensitive" checks when check_function_bodies is off, if indeed
they're checking anything at all in that mode. But extend the
documentation to point out the GUC issue in particular.
(Note that we still check the SET clauses to some extent; the behavior
with !check_function_bodies is now approximately equivalent to what ALTER
DATABASE/ROLE have been doing for awhile with context-dependent GUCs.)
This problem can be demonstrated in all active branches, so back-patch
all the way.

It's possible that inlining of SQL functions (or perhaps other changes?)
has exposed typmod information not known at parse time. In such cases,
Vars generated by query_planner might have valid typmod values while the
original grouping columns only have typmod -1. This isn't a semantic
problem since the behavior of grouping only depends on type not typmod,
but it breaks locate_grouping_columns' use of tlist_member to locate the
matching entry in query_planner's result tlist.
We can fix this without an excessive amount of new code or complexity by
relying on the fact that locate_grouping_columns only gets called when
make_subplanTargetList has set need_tlist_eval == false, and that can only
happen if all the grouping columns are simple Vars. Therefore we only need
to search the sub_tlist for a matching Var, and we can reasonably define a
"match" as being a match of the Var identity fields
varno/varattno/varlevelsup. The code still Asserts that vartype matches,
but ignores vartypmod.
Per bug #8393 from Evan Martin. The added regression test case is
basically the same as his example. This has been broken for a very long
time, so back-patch to all supported branches.

With this optimization flag enabled, recent versions of gcc can generate
incorrect code that assumes variable-length arrays (such as oidvector)
are actually fixed-length because they're embedded in some larger struct.
The known instance of this problem was fixed in 9.2 and up by commit
8137f2c32322c624e0431fac1621e8e9315202f9 and followon work, which hides
actually-variable-length catalog fields from the compiler altogether.
And we plan to gradually convert variable-length fields to official
"flexible array member" notation over time, which should prevent this type
of bug from reappearing as gcc gets smarter. We're not going to try to
back-port those changes into older branches, though, so apply this
band-aid instead.
Andres Freund

The C99 and POSIX standards require strtod() to accept all these spellings
(case-insensitively): "inf", "+inf", "-inf", "infinity", "+infinity",
"-infinity". However, pre-C99 systems might accept only some or none of
these, and apparently Windows still doesn't accept "inf". To avoid
surprising cross-platform behavioral differences, manually check for each
of these spellings if strtod() fails. We were previously handling just
"infinity" and "-infinity" that way, but since C99 is most of the world
now, it seems likely that applications are expecting all these spellings
to work.
Per bug #8355 from Basil Peace. It turns out this fix won't actually
resolve his problem, because Python isn't being this careful; but that
doesn't mean we shouldn't be.

If a tuple is locked but not updated by a concurrent transaction,
HeapTupleSatisfiesDirty would return that transaction's Xid in xmax,
causing callers to wait on it, when it is not necessary (in fact, if the
other transaction had used a multixact instead of a plain Xid to mark
the tuple, HeapTupleSatisfiesDirty would have behave differently and
*not* returned the Xmax).
This bug was introduced in commit 3f7fbf85dc5b42, dated December 1998,
so it's almost 15 years old now. However, it's hard to see this
misbehave, because before we had NOWAIT the only consequence of this is
that transactions would wait for slightly more time than necessary; so
it's not surprising that this hasn't been reported yet.
Craig Ringer and Andres Freund

We should really be reporting a useful error along with returning
a valid return code if pthread_mutex_lock() throws an error for
some reason. Add that and back-patch to 9.0 as the prior patch.
Pointed out by Alvaro Herrera

I've been working with Nick Phillips on an issue he ran into when
trying to use threads with SSL client certificates. As it turns out,
the call in initialize_SSL() to SSL_CTX_use_certificate_chain_file()
will modify our SSL_context without any protection from other threads
also calling that function or being at some other point and trying to
read from SSL_context.
To protect against this, I've written up the attached (based on an
initial patch from Nick and much subsequent discussion) which puts
locks around SSL_CTX_use_certificate_chain_file() and all of the other
users of SSL_context which weren't already protected.
Nick Phillips, much reworked by Stephen Frost
Back-patch to 9.0 where we started loading the cert directly instead of
using a callback.

We'd find the same match twice if it was of zero length and not immediately
adjacent to the previous match. replace_text_regexp() got similar cases
right, so adjust this search logic to match that. Note that even though
the regexp_split_to_xxx() functions share this code, they did not display
equivalent misbehavior, because the second match would be considered
degenerate and ignored.
Jeevan Chalke, with some cosmetic changes by me.

Refactoring as part of commit 8ceb24568054232696dddc1166a8563bc78c900a
had the unintended effect of making REINDEX TABLE and REINDEX DATABASE
no longer validate constraints enforced by the indexes in question;
REINDEX INDEX still did so. Indexes marked invalid remained so, and
constraint violations arising from data corruption went undetected.
Back-patch to 9.0, like the causative commit.

These modules used the YYPARSE_PARAM macro, which has been deprecated
by the bison folk since 1.875, and which they finally removed in 3.0.
Adjust the code to use the replacement facility, %parse-param, which
is a much better solution anyway since it allows specification of the
type of the extra parser parameter. We can thus get rid of a lot of
unsightly casting.
Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.

Fix booltestsel() for case where we have NULL stats but not MCV stats.

In a boolean column that contains mostly nulls, ANALYZE might not find
enough non-null values to populate the most-common-values stats,
but it would still create a pg_statistic entry with stanullfrac set.
The logic in booltestsel() for this situation did the wrong thing for
"col IS NOT TRUE" and "col IS NOT FALSE" tests, forgetting that null
values would satisfy these tests (so that the true selectivity would
be close to one, not close to zero). Per bug #8274.
Fix by Andrew Gierth, some comment-smithing by me.

It's possible to drop a column from an input table of a JOIN clause in a
view, if that column is nowhere actually referenced in the view. But it
will still be there in the JOIN clause's joinaliasvars list. We used to
replace such entries with NULL Const nodes, which is handy for generation
of RowExpr expansion of a whole-row reference to the view. The trouble
with that is that it can't be distinguished from the situation after
subquery pull-up of a constant subquery output expression below the JOIN.
Instead, replace such joinaliasvars with null pointers (empty expression
trees), which can't be confused with pulled-up expressions. expandRTE()
still emits the old convention, though, for convenience of RowExpr
generation and to reduce the risk of breaking extension code.
In HEAD and 9.3, this patch also fixes a problem with some new code in
ruleutils.c that was failing to cope with implicitly-casted joinaliasvars
entries, as per recent report from Feike Steenbergen. That oversight was
because of an inadequate description of the data structure in parsenodes.h,
which I've now corrected. There were some pre-existing oversights of the
same ilk elsewhere, which I believe are now all fixed.

An ancient logic error in cfindloop() could cause the regex engine to fail
to find matches that begin later than the start of the string. This
function is only used when the regex pattern contains a back reference,
and so far as we can tell the error is only reachable if the pattern is
non-greedy (i.e. its first quantifier uses the ? modifier). Furthermore,
the actual match must begin after some potential match that satisfies the
DFA but then fails the back-reference's match test.
Reported and fixed by Jeevan Chalke, with cosmetic adjustments by me.

In tuplesort.c:inittapes(), we calculate tapeSpace by first figuring
out how many 'tapes' we can use (maxTapes) and then multiplying the
result by the tape buffer overhead for each. Unfortunately, when
we are on a system with an 8-byte long, we allow work_mem to be
larger than 2GB and that allows maxTapes to be large enough that the
32bit arithmetic can overflow when multiplied against the buffer
overhead.
When this overflow happens, we end up adding the overflow to the
amount of space available, causing the amount of memory allocated to
be larger than work_mem.
Note that to reach this point, you have to set work mem to at least
24GB and be sorting a set which is at least that size. Given that a
user who can set work_mem to 24GB could also set it even higher, if
they were looking to run the system out of memory, this isn't
considered a security issue.
This overflow risk was found by the Coverity scanner.
Back-patch to all supported branches, as this issue has existed
since before 8.4.

Make it easier for readers of the FP docs to find out about possibly
truncated values.
Per complaint from Tom Duffey in message
F0E0F874-C86F-48D1-AA2A-0C5365BF5118@trillitech.com
Author: Albe Laurenz
Reviewed by: Abhijit Menon-Sen

When there's a comment on an index that was created with UNIQUE or PRIMARY
KEY constraint syntax, we need to label the comment as depending on the
constraint not the index, since only the constraint object actually appears
in the dump. This incorrect dependency can lead to parallel pg_restore
trying to restore the comment before the index has been created, per bug
#8257 from Lloyd Albin.
This patch fixes pg_dump to produce the right dependency in dumps made
in the future. Usually we also try to hack pg_restore to work around
bogus dependencies, so that existing (wrong) dumps can still be restored in
parallel mode; but that doesn't seem practical here since there's no easy
way to relate the constraint dump entry to the comment after the fact.
Andres Freund

Expect EWOULDBLOCK from a non-blocking connect() call only on Windows.

On Unix-ish platforms, EWOULDBLOCK may be the same as EAGAIN, which is
*not* a success return, at least not on Linux. We need to treat it as a
failure to avoid giving a misleading error message. Per the Single Unix
Spec, only EINPROGRESS and EINTR returns indicate that the connection
attempt is in progress.
On Windows, on the other hand, EWOULDBLOCK (WSAEWOULDBLOCK) is the expected
case. We must accept EINPROGRESS as well because Cygwin will return that,
and it doesn't seem worth distinguishing Cygwin from native Windows here.
It's not very clear whether EINTR can occur on Windows, but let's leave
that part of the logic alone in the absence of concrete trouble reports.
Also, remove the test for errno == 0, effectively reverting commit
da9501bddb42222dc33c031b1db6ce2133bcee7b, which AFAICS was just a thinko;
or at best it might have been a workaround for a platform-specific bug,
which we can hope is gone now thirteen years later. In any case, since
libpq makes no effort to reset errno to zero before calling connect(),
it seems unlikely that that test has ever reliably done anything useful.
Andres Freund and Tom Lane

Adjust the wording in the first para of "Sequence Manipulation Functions"
so that neither of the link phrases in it break across line boundaries,
in either A4- or US-page-size PDF output. This fixes a reported build
failure for the 9.3beta2 A4 PDF docs, and future-proofs this particular
para against causing similar problems in future. (Perhaps somebody will
fix this issue in the SGML/TeX documentation tool chain someday, but I'm
not holding my breath.)
Back-patch to all supported branches, since the same problem could rise up
to bite us in future updates if anyone changes anything earlier than this
in func.sgml.

In some cases with higher numbers of subtransactions
it was possible for us to incorrectly initialize
subtrans leading to complaints of missing pages.
Bug report by Sergey Konoplev
Analysis and fix by Andres Freund

In most scenarios a portal without a ResourceOwner is dead and not subject
to any further execution, but a portal for a cursor WITH HOLD remains in
existence with no ResourceOwner after the creating transaction is over.
In this situation, if we attempt to "execute" the portal directly to fetch
data from it, we were setting CurrentResourceOwner to NULL, leading to a
segfault if the datatype output code did anything that required a resource
owner (such as trying to fetch system catalog entries that weren't already
cached). The case appears to be impossible to provoke with stock libpq,
but psqlODBC at least is able to cause it when working with held cursors.
Simplest fix is to just skip the assignment to CurrentResourceOwner, so
that any resources used by the data output operations will be managed by
the transaction-level resource owner instead. For consistency I changed
all the places that install a portal's resowner as current, even though
some of them are probably not reachable with a held cursor's portal.
Per report from Joshua Berry (with thanks to Hiroshi Inoue for developing
a self-contained test case). Back-patch to all supported versions.

When the existing code here was written, it made sense to special-case
RowExprs because that was the only way that we could handle row comparisons
at all. Now that we have record_eq() and arrays of composites, the generic
logic for "scalar" types will in fact work on RowExprs too, so there's no
reason to throw error for combinations of RowExprs and other ways of
forming composite values, nor to ignore the possibility of using a
ScalarArrayOpExpr. But keep using the old logic when comparing two
RowExprs, for consistency with the main transformAExprOp() logic. (This
allows some cases with not-quite-identical rowtypes to succeed, so we might
get push-back if we removed it.) Per bug #8198 from Rafal Rzepecki.
Back-patch to all supported branches, since this works fine as far back as
8.4.
Rafal Rzepecki and Tom Lane

Per discussion, this restriction isn't needed for any real security reason,
and it seems to confuse people more often than it helps them. It could
also result in some database states being unrestorable. So just drop it.
Back-patch to 9.0, where ALTER DEFAULT PRIVILEGES was introduced.

Long-standing code has called tolower() on identifier character bytes
with the high bit set. This is clearly an error and produces junk output
when the encoding is multi-byte. This patch therefore restricts this
activity to cases where there is a character with the high bit set AND
the encoding is single-byte.
There have been numerous gripes about this, most recently from Martin
Schäfer.
Backpatch to all live releases.

The planner is aware that it mustn't push down upper-level quals into
subqueries if the quals reference subquery output columns that contain
set-returning functions or volatile functions, or are non-DISTINCT outputs
of a DISTINCT ON subquery. However, it missed making this check when
there were one or more levels of UNION or INTERSECT above the dangerous
expression. This could lead to "set-valued function called in context that
cannot accept a set" errors, as seen in bug #8213 from Eric Soroos, or to
silently wrong answers in the other cases.
To fix, refactor the checks so that we make the column-is-unsafe checks
during subquery_is_pushdown_safe(), which already has to recursively
inspect all arms of a set-operation tree. This makes
qual_is_pushdown_safe() considerably simpler, at the cost that we will
spend some cycles checking output columns that possibly aren't referenced
in any upper qual. But the cases where this code gets executed at all
are already nontrivial queries, so it's unlikely anybody will notice any
slowdown of planning.
This has been broken since commit 05f916e6add9726bf4ee046e4060c1b03c9961f2,
which makes the bug over ten years old. A bit surprising nobody noticed it
before now.

In commit 2c92edad48796119c83d7dbe6c33425d1924626d, I broke "EXPLAIN
(ANALYZE)" syntax, because I mistakenly thought that ANALYZE/ANALYSE were
only partially reserved and thus would be included in NonReservedWord;
but actually they're fully reserved so they still need to be called out
here.
A nicer solution would be to demote these words to type_func_name_keyword
status (they can't be less than that because of "VACUUM [ANALYZE] ColId").
While that works fine so far as the core grammar is concerned, it breaks
ECPG's grammar for reasons I don't have time to isolate at the moment.
So do this for the time being.
Per report from Kevin Grittner. Back-patch to 9.0, like the previous
commit.

The array allocated by GetRunningTransactionLocks() needs to be pfree'd
when we're done with it. Otherwise we leak some memory during each
checkpoint, if wal_level = hot_standby. This manifests as memory bloat
in the checkpointer process, or in bgwriter in versions before we made
the checkpointer separate.
Reported and fixed by Naoya Anzai. Back-patch to 9.0 where the issue
was introduced.
In passing, improve comments for GetRunningTransactionLocks(), and add
an Assert that we didn't overrun the palloc'd array.

"eval q{foo}" used to complain that the error was on line 2 of the eval'd
string, because eval internally tacked on "\n;" so that the end of the
erroneous command was indeed on line 2. But as of Perl 5.18 it more
sanely says that the error is on line 1. To avoid Perl-version-dependent
regression test results, use "eval q{foo;}" instead in the two places
where this matters. Per buildfarm.
Since people might try to use newer Perl versions with older PG releases,
back-patch as far as 9.0 where these test cases were added.

Allow type_func_name_keywords in some places where they weren’t before.

This change makes type_func_name_keywords less reserved than they were
before, by allowing them for role names, language names, EXPLAIN and COPY
options, and SET values for GUCs; which are all places where few if any
actual keywords could appear instead, so no new ambiguities are introduced.
The main driver for this change is to allow "COPY ... (FORMAT BINARY)"
to work without quoting the word "binary". That is an inconsistency that
has been complained of repeatedly over the years (at least by Pavel Golub,
Kurt Lidl, and Simon Riggs); but we hadn't thought of any non-ugly solution
until now.
Back-patch to 9.0 where the COPY (FORMAT BINARY) syntax was introduced.

PathNameOpenFile failed to ensure that the correct value of errno was
returned to its caller after a failure (because it incorrectly supposed
that free() can never change errno). In some cases this would result
in a user-visible failure because an expected ENOENT errno was replaced
with something else. Bogus EINVAL failures have been observed on OS X,
for example.
There were also a couple of places that could mangle an important value
of errno if FDDEBUG was defined. While the usefulness of that debug
support is highly debatable, we might as well make it safe to use,
so add errno save/restore logic to the DO_DB macro.
Per bug #8167 from Nelson Minar, diagnosed by RhodiumToad.
Back-patch to all supported branches.

If OID wraparound should occur while in standalone mode (unlikely but
possible), we want to advance the counter to FirstNormalObjectId not
FirstBootstrapObjectId. Otherwise, user objects might be created with OIDs
in the system-reserved range. That isn't immediately harmful but it poses
a risk of conflicts during future pg_upgrade operations.
Noted by Andres Freund. Back-patch to all supported branches, since all of
them are supported sources for pg_upgrade operations.

This case doesn't normally happen, because the planner usually clamps
all row estimates to at least one row; but I found that it can arise
when dealing with relations excluded by constraints. Without a defense,
estimate_num_groups() can return zero, which leads to divisions by zero
inside the planner as well as assertion failures in the executor.
An alternative fix would be to change set_dummy_rel_pathlist() to make
the size estimate for a dummy relation 1 row instead of 0, but that seemed
pretty ugly; and probably someday we'll want to drop the convention that
the minimum rowcount estimate is 1 row.
Back-patch to 8.4, as the problem can be demonstrated that far back.

Patch b19e4250b45e91c9cbdd18d35ea6391ab5961c8d attempted to
preserve existing behavior regarding statistics generation in the
case that a truncation attempt was canceled due to lock conflicts.
It failed to do this accurately in two regards: (1) autovacuum had
previously generated statistics if the truncate attempt failed to
initially get the lock rather than having started the attempt, and
(2) the VACUUM ANALYZE command had always generated statistics.
Both of these changes were unintended, and are reverted by this
patch. On review, there seems to be consensus that the previous
failure to generate statistics when the truncate was terminated
was more an unfortunate consequence of how that effort was
previously terminated than a feature we want to keep; so this
patch generates statistics even when an autovacuum truncation
attempt terminates early. Another unintended change which is kept
on the basis that it is an improvement is that when a VACUUM
command is truncating, it will the new heuristic for avoiding
blocking other processes, rather than keeping an
AccessExclusiveLock on the table for however long the truncation
takes.
Per multiple reports, with some renaming per patch by Jeff Janes.
Backpatch to 9.0, where problem was created.

There was a high probability of two or more concurrent C.I.C. commands
deadlocking just before completion, because each would wait for the others
to release their reference snapshots. Fix by releasing the snapshot
before waiting for other snapshots to go away.
Per report from Paul Hinze. Back-patch to all active branches.

When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.

The old formula didn't take into account that each WAL sender process needs
a spinlock. We had also already exceeded the fixed number of spinlocks
reserved for misc purposes (10). Bump that to 30.
Backpatch to 9.0, where WAL senders were introduced. If I counted correctly,
9.0 had exactly 10 predefined spinlocks, and 9.1 exceeded that, but bump the
limit in 9.0 too because 10 is uncomfortably close to the edge.