The ability to create like-named objects in different schemas opens up
the potential for users to change the behavior of other users' queries,
maliciously or accidentally. When you connect to a PostgreSQL server,
you should remove from your search_path any schema for which a user
other than yourself or superusers holds the CREATE privilege. If you do
not, other users holding CREATE privilege can redefine the behavior of
your commands, causing them to perform arbitrary SQL statements under
your identity. "SET search_path = ..." and "SELECT
pg_catalog.set_config(...)" are not vulnerable to such hijacking, so one
can use either as the first command of a session. As special
exceptions, the following client applications behave as documented
regardless of search_path settings and schema privileges: clusterdb
createdb createlang createuser dropdb droplang dropuser ecpg (not
programs it generates) initdb oid2name pg_archivecleanup pg_basebackup
pg_config pg_controldata pg_ctl pg_dump pg_dumpall pg_isready
pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_standby
pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb
vacuumlo. Not included are core client programs that run user-specified
SQL commands, namely psql and pgbench. PostgreSQL encourages non-core
client applications to do likewise.
Document this in the context of libpq connections, psql connections,
dblink connections, ECPG connections, extension packaging, and schema
usage patterns. The principal defense for applications is "SELECT
pg_catalog.set_config('search_path', '', false)", and the principal
defense for databases is "REVOKE CREATE ON SCHEMA public FROM PUBLIC".
Either one is sufficient to prevent attack. After a REVOKE, consider
auditing the public schema for objects named like pg_catalog objects.
Authors of SECURITY DEFINER functions use some of the same defenses, and
the CREATE FUNCTION reference page already covered them thoroughly.
This is a good opportunity to audit SECURITY DEFINER functions for
robust security practice.
Back-patch to 9.3 (all supported versions).
Reviewed by Michael Paquier and Jonathan S. Katz. Reported by Arseniy
Sharoglazov.
Security: CVE-2018-1058

This makes the client programs behave as documented regardless of the
connect-time search_path and regardless of user-created objects. Today,
a malicious user with CREATE permission on a search_path schema can take
control of certain of these clients' queries and invoke arbitrary SQL
functions under the client identity, often a superuser. This is
exploitable in the default configuration, where all users have CREATE
privilege on schema "public".
This changes behavior of user-defined code stored in the database, like
pg_index.indexprs and pg_extension_config_dump(). If they reach code
bearing unqualified names, "does not exist" or "no schema has been
selected to create in" errors might appear. Users may fix such errors
by schema-qualifying affected names. After upgrading, consider watching
server logs for these errors.
The --table arguments of src/bin/scripts clients have been lax; for
example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That
now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still
performs a checkpoint.
Back-patch to 9.3 (all supported versions).
Reviewed by Tom Lane, though this fix strategy was not his first choice.
Reported by Arseniy Sharoglazov.
Security: CVE-2018-1058

Historically, pg_dump has "set search_path = foo, pg_catalog" when
dumping an object in schema "foo", and has also caused that setting
to be used while restoring the object. This is problematic because
functions and operators in schema "foo" could capture references meant
to refer to pg_catalog entries, both in the queries issued by pg_dump
and those issued during the subsequent restore run. That could
result in dump/restore misbehavior, or in privilege escalation if a
nefarious user installs trojan-horse functions or operators.
This patch changes pg_dump so that it does not change the search_path
dynamically. The emitted restore script sets the search_path to what
was used at dump time, and then leaves it alone thereafter. Created
objects are placed in the correct schema, regardless of the active
search_path, by dint of schema-qualifying their names in the CREATE
commands, as well as in subsequent ALTER and ALTER-like commands.
Since this change requires a change in the behavior of pg_restore
when processing an archive file made according to this new convention,
bump the archive file version number; old versions of pg_restore will
therefore refuse to process files made with new versions of pg_dump.
Security: CVE-2018-1058

In the pgoutput plugin, skip changes for relations that are not
publishable, per is_publishable_class(). This concerns in particular
materialized views and information_schema tables. While those relations
cannot be part of a publication, per existing checks, they will be
considered by a FOR ALL TABLES publication. A subscription would not
actually apply changes for those relations, again per existing checks,
but trying to match incoming changes to local tables on the subscriber
would lead to errors if no matching local table exists. Skipping those
changes on the publisher avoids sending useless changes and eliminates
the error.
Bug: #15044
Reported-by: Chad Trabant <chad@iris.washington.edu>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>

The previous limit of INT_MAX / 1000 seems to have been cargo-culted in
from somewhere else. Or possibly the value was converted to microseconds
at some point; but in all supported releases, it's just compared to other
values, so there's no need for the restriction. This change raises the
effective limit from ~35 minutes to ~24 days, which conceivably is useful
to somebody, and anyway it's more consistent with the range of the core
log_min_duration_statement GUC.
Per complaint from Kevin Bloch. Back-patch to all supported releases.
Discussion: https://postgr.es/m/8ea82d7e-cb78-8e05-0629-73aa14d2a0ca@codingthat.com

Given overlapping or partially redundant join clauses, for example
t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x
the planner's EquivalenceClass machinery will ordinarily refactor the
clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't
see multiple references to the same EquivalenceClass in a list of join
equality clauses. However, if the join is outer, it's incorrect to derive
a restriction clause on the outer side from the join conditions, so the
clause refactoring does not happen and we end up with overlapping join
conditions. The code that attempted to deal with such cases had several
subtle bugs, which could result in "left and right pathkeys do not match in
mergejoin" or "outer pathkeys do not match mergeclauses" planner errors,
if the selected join plan type was a mergejoin. (It does not appear that
any actually incorrect plan could have been emitted.)
The core of the problem really was failure to recognize that the outer and
inner relations' pathkeys have different relationships to the mergeclause
list. A join's mergeclause list is constructed by reference to the outer
pathkeys, so it will always be ordered the same as the outer pathkeys, but
this cannot be presumed true for the inner pathkeys. If the inner sides of
the mergeclauses contain multiple references to the same EquivalenceClass
({t2.x} in the above example) then a simplistic rendering of the required
inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery
recognizes that the second sort column is redundant and throws it away.
The mergejoin planning code failed to account for that behavior properly.
One error was to try to generate cut-down versions of the mergeclause list
from cut-down versions of the inner pathkeys in the same way as the initial
construction of the mergeclause list from the outer pathkeys was done; this
could lead to choosing a mergeclause list that fails to match the outer
pathkeys. The other problem was that the pathkey cross-checking code in
create_mergejoin_plan treated the inner and outer pathkey lists
identically, whereas actually the expectations for them must be different.
That led to false "pathkeys do not match" failures in some cases, and in
principle could have led to failure to detect bogus plans in other cases,
though there is no indication that such bogus plans could be generated.
Reported by Alexander Kuzmenkov, who also reviewed this patch. This has
been broken for years (back to around 8.3 according to my testing), so
back-patch to all supported branches.
Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru

This oversight led to data corruption in matviews, manifesting as
"could not access status of transaction" before our most recent releases,
and "found xmin from before relfrozenxid" errors since then.
The proximate cause of the problem seems to have been confusion between
the task of preserving dropped-column status and the task of preserving
frozenxid status. Those are required for distinct sets of relkinds,
and the reasoning was entirely undocumented in the source code. In hopes
of forestalling future errors of the same kind, try to improve the
commentary in this area.
In passing, also improve the remarkably unhelpful comments around
pg_upgrade's set_frozenxids(). That's not actually buggy AFAICS,
but good luck figuring out what it does from the old comments.
Per report from Claudio Freire. It appears that bug #14852 from Alexey
Ermakov is an earlier report of the same issue, and there may be other
cases that we failed to identify at the time.
Patch by me based on analysis by Andres Freund. The bug dates back
to the introduction of matviews, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAGTBQpbrY9CdRGGhyBZ9yqY4jWaGC85rUF4X+R7d-aim=mBNsw@mail.gmail.com
Discussion: https://postgr.es/m/20171013115320.28049.86457@wrigleys.postgresql.org

The previous coding here applied atoi() to strings that could represent
values too large to fit in an int. If the overflowed value happened to
match one of the cases it was looking for, it would drop that limit
value from the output, leading to incorrect restoration of the sequence.
Avoid the unsafe behavior, and also make the logic cleaner by explicitly
calculating the default min/max values for the appropriate kind of
sequence.
Reported and patched by Alexey Bashtanov, though I whacked his patch
around a bit. Back-patch to v10 where the faulty logic was added.
Discussion: https://postgr.es/m/cb85a9a5-946b-c7c4-9cf2-6cd6e25d7a33@imap.cc

An updating query that reads a CTE within an InitPlan or SubPlan could get
incorrect results if it updates rows that are concurrently being modified.
This is caused by CteScanNext supposing that nothing inside its recursive
ExecProcNode call could change which read pointer is selected in the CTE's
shared tuplestore. While that's normally true because of scoping
considerations, it can break down if an EPQ plan tree gets built during the
call, because EvalPlanQualStart builds execution trees for all subplans
whether they're going to be used during the recheck or not. And it seems
like a pretty shaky assumption anyway, so let's just reselect our own read
pointer here.
Per bug #14870 from Andrei Gorita. This has been broken since CTEs were
implemented, so back-patch to all supported branches.
Discussion: https://postgr.es/m/20171024155358.1471.82377@wrigleys.postgresql.org

One example in create_table.sgml claimed to be showing table constraint
syntax, but it was really column constraint syntax due to the omission
of a comma. This is both wrong and confusing, so fix it in all
supported branches.
Per report from neil@postgrescompare.com.
Discussion: https://postgr.es/m/151871659877.1393.2431103178451978795@wrigleys.postgresql.org

plpython_error_callback() reported the name of the function associated
with the topmost PL/Python execution context. This was not merely
wrong if there were nested PL/Python contexts, but it risked a core
dump if the topmost one is an inline code block rather than a named
function. That will have proname = NULL, and so we were passing a NULL
pointer to snprintf("%s"). It seems that none of the PL/Python-testing
machines in the buildfarm will dump core for that, but some platforms do,
as reported by Marina Polyakova.
Investigation finds that there actually is an existing regression test
that used to prove that the behavior was wrong, though apparently no one
had noticed that it was printing the wrong function name. It stopped
showing the problem in 9.6 when we adjusted psql to not print CONTEXT
by default for NOTICE messages. The problem is masked (if your platform
avoids the core dump) in error cases, because PL/Python will throw away
the originally generated error info in favor of a new traceback produced
at the outer level.
Repair by using ErrorContextCallback.arg to pass the correct context to
the error callback. Add a regression test illustrating correct behavior.
Back-patch to all supported branches, since they're all broken this way.
Discussion: https://postgr.es/m/156b989dbc6fe7c4d3223cf51da61195@postgrespro.ru

pg_dump supposed that a stats object necessarily shares the same schema
as its underlying table, and that it doesn't have a separate owner.
These things may have been true during early development of the feature,
but they are not true as of v10 release.
Failure to track the object's schema separately turns out to have only
limited consequences, because pg_get_statisticsobjdef() always schema-
qualifies the target object name in the generated CREATE STATISTICS command
(a decision out of step with the rest of ruleutils.c, but I digress).
Therefore the restored object would be in the right schema, so that the
only problem is that the TOC entry would be mislabeled as to schema. That
could lead to wrong decisions for schema-selective restores, for example.
The ownership issue is a bit more serious: not only was the TOC entry
potentially mislabeled as to owner, but pg_dump didn't bother to issue an
ALTER OWNER command at all, so that after restore the stats object would
continue to be owned by the restoring superuser.
A final point is that decisions as to whether to dump a stats object or
not were driven by whether the underlying table was dumped or not. While
that's not wrong on its face, it won't scale nicely to the planned future
extension to cross-table statistics. Moreover, that design decision comes
out of the view of stats objects as being auxiliary to a particular table,
like a rule or trigger, which is exactly where the above problems came
from. Since we're now treating stats objects more like independent objects
in their own right, they ought to behave like standalone objects for this
purpose too. So change to using the generic selectDumpableObject() logic
for them (which presently amounts to "dump if containing schema is to be
dumped").
Along the way to fixing this, restructure so that getExtendedStatistics
collects the identity info (only) for all extended stats objects in one
query, and then for each object actually being dumped, we retrieve the
definition in dumpStatisticsExt. This is necessary to ensure that
schema-qualification in the generated CREATE STATISTICS command happens
with respect to the search path that pg_dump will now be using at restore
time (ie, the schema the stats object is in, not that of the underlying
table). It's probably also significantly faster in the typical scenario
where only a minority of tables have extended stats.
Back-patch to v10 where extended stats were introduced.
Discussion: https://postgr.es/m/18272.1518328606@sss.pgh.pa.us

Since we now support the server side handler for git over https (so
we're no longer using the "dumb protocol"), make https the primary
choice for cloning the repository, and the git protocol the secondary
choice.
In passing, also change the links to git-scm.com from http to https.
Reviewed by Stefan Kaltenbrunner and David G. Johnston