parallel joins

parallel aggregates

Parallel aggregates allow aggregate functions to be processed concurrently by multiple worker processes. This can significantly reduce response times for queries that aggregate large numbers of tuples down to just a few aggregate groups, provided that the server has enough otherwise idle CPUs and the query would otherwise be bound by the single CPU on which the backend process runs. Each worker process operates on a subset of the tuples and produces a partially aggregated result, which it returns to the main backend process; there, the partial results from all workers are combined and the final aggregate result is produced.

Most built-in aggregate functions support parallel mode. User-defined aggregates will need to be changed to add a combine function, and possibly serialization and deserialization functions in the case of aggregate functions that use an INTERNAL state type.
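As an illustration, a user-defined aggregate can be declared parallel-capable by supplying a combine function. The aggregate name below is hypothetical; int4_sum and int8pl are existing built-in support functions:

```sql
-- Sketch: a SUM(int4)-like aggregate declared parallel safe.
-- Because the state type is a plain int8 rather than INTERNAL,
-- no serialization/deserialization functions are required.
CREATE AGGREGATE my_sum (int4)
(
    sfunc = int4_sum,       -- per-worker transition function
    stype = int8,           -- transition state type
    combinefunc = int8pl,   -- merges two partial states in the leader
    parallel = safe
);
```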

pg_visibility extension for examining visibility maps

User-defined expiration of snapshots to control table bloat

Until now a long-running report or cursor displaying query results could block cleanup of dead rows, bloating all volatile tables in the database. The only options were to allow the bloat, suspend normal updates to other tables, or kill the long-running activity -- which might be completely unrelated to the bloating tables. Accumulation of bloat could cause performance problems and excessive use of storage space.

A new configuration option, old_snapshot_threshold, allows the cluster to be configured so that a dead row can be cleaned up once the transaction which updated or deleted it (causing it to be dead) has completed and all snapshots which can still see it have reached a certain age, without immediately terminating activities which are using such snapshots. If, for example, a large monthly report is running off of data in table etl_monthly (which is not currently changing), bloat in other tables will be limited, and the report can run to completion without error. If a snapshot has passed the threshold and a query uses that snapshot to attempt to read data from a page modified recently enough that it might not produce accurate results, then a "snapshot too old" error will be thrown. Normally the goal in configuring old_snapshot_threshold is to set it high enough that such an error is rarely seen, while still preventing the problems caused by bloat.
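For instance, the threshold might be set in postgresql.conf as follows (the 12-hour value is purely illustrative; the value is taken as a number of minutes unless a unit is given, and the default of -1 disables the feature):

```
# postgresql.conf: allow cleanup of dead rows once all snapshots
# that can still see them are older than 12 hours
old_snapshot_threshold = '12h'
```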

Prompt variable to show PID of backend
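In psql, the new %p prompt escape expands to the process ID of the backend the session is connected to; for example (the rest of the prompt string is just one typical arrangement):

```
\set PROMPT1 '%p %n@%/%R%# '
```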

Multiple -c and -f options
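psql can now accept several -c and -f options in a single invocation, executed in the order given. A sketch (populate.sql and the database name are hypothetical; add -1 to wrap everything in a single transaction):

```shell
psql -c 'CREATE TABLE t (x int)' -f populate.sql -c 'SELECT count(*) FROM t' mydb
```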

Crosstabs in psql

\crosstabview is a completely different way to display results from a
query: instead of a vertical display of rows, the data values are placed
in a grid where the column and row headers come from the data itself,
similar to a spreadsheet.
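As a sketch (the sales table and its columns are hypothetical): with a three-column result, the first column supplies the row headers, the second the column headers, and the third the cell values:

```sql
SELECT extract(year FROM sold_on) AS year,
       to_char(sold_on, '"Q"Q')   AS quarter,
       sum(amount)                AS total
FROM sales
GROUP BY 1, 2
ORDER BY 1, 2
\crosstabview
```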

pg_notification_queue_usage to look at notify queue
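The new function takes no arguments and reports how full the NOTIFY queue is:

```sql
-- returns the fraction (0 to 1) of the asynchronous
-- notification queue currently occupied
SELECT pg_notification_queue_usage();
```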

Backups

pg_basebackup extended with replication slots
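A sketch of the new --slot option (the slot name and target directory are illustrative). Combined with -X stream, the backup's WAL is streamed through the slot, so the server retains every WAL segment the backup needs:

```shell
# create the slot once, from psql:
#   SELECT pg_create_physical_replication_slot('backup_slot');
pg_basebackup -D /var/lib/backups/base -X stream --slot=backup_slot
```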

New API for hot physical backups

Prior to PostgreSQL 9.6, the only way to perform concurrent physical backups was through pg_basebackup, via the streaming replication protocol. Low-level file system copy was only available in an exclusive mode, by calling pg_start_backup(), initiating the copy of data files, then finally calling pg_stop_backup().

The limitation of taking one backup at a time has been overcome by a new native API which redefines the pg_start_backup() function and adds a variant of pg_stop_backup(). For backward compatibility, the older signatures have been kept.

The pg_start_backup() function now accepts a third, optional parameter, called 'exclusive', which defaults to 'true' for backward compatibility.
Initiating a concurrent (or, if you prefer, non-exclusive) backup is rather simple:

SELECT pg_start_backup('my_label', false, false);

If you are not familiar with pg_start_backup(), the first argument is a label for the backup, while the second is a request for a fast checkpoint operation. The third parameter, when set to false, requests a concurrent backup.

The PostgreSQL connection requesting the concurrent backup needs to remain active for the whole duration of the physical copy of data files (performed with your favourite tools such as rsync, cp, tar, SAN or LVM snapshots, ...). Note: this operation follows the same procedure as with prior versions of PostgreSQL (for detailed information, please refer to the documentation about continuous archiving and Point-In-Time-Recovery).

When finished, we need to use the new version of pg_stop_backup() to specify that we are closing the current non-exclusive backup:

SELECT * FROM pg_stop_backup(false);

The new signature has a mandatory parameter, called 'exclusive', which needs to be set to 'false' for a concurrent backup. The reason is that the function returns a different result: a row made up of three fields:

the LSN required for backup consistency, returned at pg_stop_backup() time: identical to the previous version

the content of the label file: this needs to be saved as 'backup_label' in the main backup directory (new in 9.6)

the list of tablespaces: if not empty, this needs to be saved as 'tablespace_map' in the main backup directory (new in 9.6)
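As a sketch, in psql the three fields can be captured with \gset and written out to the backup directory (the paths are illustrative; the tablespace_map file is only needed when that field is not empty):

```
SELECT * FROM pg_stop_backup(false) \gset
\o /var/lib/backups/base/backup_label
\qecho :labelfile
\o /var/lib/backups/base/tablespace_map
\qecho :spcmapfile
\o
```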

As you can see, these two steps require users to change their customised backup scripts. However, users should expect common external disaster recovery and backup solutions such as Barman, OmniPITR, PgBackRest, pghoard, and WAL-E to transparently manage this new behaviour introduced in 9.6.

This new in-core API replaces the pgespresso package that was previously used with Barman 1.3.1 and above. Barman 2.0 now supports both pgespresso for PG9.2-PG9.5 and the new 9.6 API.

IMPORTANT: this new API will become the only supported way in the future to coordinate low level backup operations with PostgreSQL.