This list contains some known PostgreSQL bugs, some feature requests, and some things we are not even sure we want. Many of these items are hard, and some are perhaps impossible. If you would like to work on an item, please read the Developer FAQ first. There is also a development information page.

- marks ordinary, incomplete items

[E] - marks items that are easier to implement

[D] - marks changes that are done, and will appear in the PostgreSQL 12 release.

For help on editing this list, please see Talk:Todo. Please do not add items here without discussion on the mailing list.

Development Process

WARNING for Developers: Unfortunately this list does not contain all the information necessary for someone to start coding a feature. Some of these items might have become unnecessary since they were added --- others might be desirable but the implementation might be unclear. When selecting items listed below, be prepared to first discuss the value of the feature. Do not assume that you can select one, code it and then expect it to be committed. Always discuss design on Hackers list before starting to code. The flow should be:

This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also Logging Brainstorm.

Tablespaces

Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2

Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.

Allow reporting of which objects are in which tablespaces

This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.

Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original

XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.

Add pretty-printed XML output option

Parse a document and serialize it back in some indented form. libxml2 might support this.

Add XMLQUERY (from the SQL/XML standard)

Allow XML shredding

In some cases shredding could be better option (if there is no need to keep XML docs entirely, e.g. if we have already developed tools that understand only relational data. This would be a separate module that implements annotated schema decomposition technique, similar to DB2 and SQL Server functionality.

Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)

CLUSTER

Automatically maintain clustering on a table

This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.

This items has been fixed in commit 750c5e.
Fix psql \s to work with recent libedit, and add pager support.

psql's \s (print command history) doesn't work at all with recent libedit
versions when printing to the terminal, because libedit tries to do an
fchmod() on the target file which will fail if the target is /dev/tty.
(We'd already noted this in the context of the target being /dev/null.)
Even before that, it didn't work pleasantly, because libedit likes to
encode the command history file (to ensure successful reloading), which
renders it nigh unreadable, not to mention significantly different-looking
depending on exactly which libedit version you have. So let's forget using
write_history() for this purpose, and instead print the data ourselves,
using logic similar to that used to iterate over the history for newline
encoding/decoding purposes.

While we're at it, insert the ability to use the pager when \s is printing
to the terminal. This has been an acknowledged shortcoming of \s for many
years, so while you could argue it's not exactly a back-patchable bug fix
it still seems like a good improvement. Anyone who's seriously annoyed
at this can use "\s /dev/tty" or local equivalent to get the old behavior.

Experimentation with this showed that the history iteration logic was
actually rather broken when used with libedit. It turns out that with
libedit you have to use previous_history() not next_history() to advance
to more recent history entries. The easiest and most robust fix for this
seems to be to make a run-time test to verify which function to call.
We had not noticed this because libedit doesn't really need the newline
encoding logic: its own encoding ensures that command entries containing
newlines are reloaded correctly (unlike libreadline). So the effective
behavior with recent libedits was that only the oldest history entry got
newline-encoded or newline-decoded. However, because of yet other bugs in
history_set_pos(), some old versions of libedit allowed the existing loop
logic to reach entries besides the oldest, which means there may be libedit
~/.psql_history files out there containing encoded newlines in more than
just the oldest entry. To ensure we can reload such files, it seems
appropriate to back-patch this fix, even though that will result in some
incompatibility with older psql versions (ie, multiline history entries
written by a psql with this fix will look corrupted to a psql without it,
if its libedit is reasonably up to date).

Stepan Rutz and Tom Lane

Add a \set variable to control whether \s displays line numbers

Another option is to add \# which lists line numbers, and allows command execution.

pg_dump / pg_restore

[E] Add full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.

Avoid using platform-dependent names for locales in pg_dumpall output

Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get
CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.

In a selective dump, allow dumping of an object and all its dependencies

Stop dumping CASCADE on DROP TYPE commands in clean mode

Allow pg_restore to load different parts of the COPY data for a single table simultaneously

[D]Refactor handling of database attributes between pg_dump and pg_dumpall

Currently only pg_dumpall emits database attributes, such as ALTER DATABASE SET commands and database-level GRANTs.
Many people wish that pg_dump would do that. One proposal is to let pg_dump issue such commands if the -C switch was used,
but it's unclear whether that will satisfy the demand.

Triggers

Improve storage of deferred trigger queue

Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves dumping large queues into files, or doing some kind of join to process all the triggers, some bulk operation, or a bitmap.

This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock so it is not ideal for this usage.

Postgres 11 allows unique indexes across partitions if the partition key is part of the index.

Research whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY)

ALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly/not documented, and the ONLY marker would then be silently ignored. Clarify the documentation, and reject ONLY if it is not supported.

Indexes

Prevent index uniqueness checks when UPDATE does not modify the column

Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE.
However, HOT already short-circuits this in common cases, so more work might not be helpful.

Consider having a larger statistics target for indexed columns and expression indexes.

Add REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY

This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime.

Write-Ahead Log

Eliminate need to write full pages to WAL before page modification

Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files.

When full page writes are off, write CRC to WAL and check file system blocks on recovery

If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches. The difficulty is that hint bits are not WAL logged, meaning a valid page might not match the earlier CRC.

Write full pages during file system write and not when the page is modified in the buffer cache

This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.

Miscellaneous Performance

Use mmap() rather than shared memory for shared buffers?

This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead. We could also consider mmap() for writing WAL.

Rather than consider mmap()-ing in 8k pages, consider mmap()'ing entire files into a backend?

Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.

Specify and implement wire protocol compression. If SSL transparent compression is used, hopefully avoid the overhead of key negotiation and encryption when SSL is configured only for compression. Note that compression is being removed from TLS 1.3 so we really need to do it ourselves.

Exotic Features

This could allow SQL written for other databases to run without modification.

Allow plug-in modules to emulate features from other databases

Add features of Oracle-style packages

A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate "packages" syntax at all.

Features We Do Not Want

The following features have been discussed ad nauseum on the PostgreSQL mailing lists and the consensus has been that the project is not interested in them. As such, if you are going to bring them up as potential features, you will want to be familiar with all of the arguments against these features which have been previously made over the years. If you decide to work on such features anyway, you should be aware that you face a higher-than-normal barrier to get the Project to accept them.

All backends running as threads in a single process (not wanted)

This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model, and MySQL and DB2 have demonstrated that threads introduce as many issues as they solve. Threading specific operations such as I/O, seq scans, and connection management has been discussed and will probably be implemented to enable specific performance features. Moving to a threaded engine would also require halting all other work on PostgreSQL for one to two years.

Optimizer hints, as implemented in Oracle and other RDBMSes, are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have such problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed. See Optimizer Hints Discussion for further information.

Embedded server (not wanted)

While PostgreSQL clients runs fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and failure cases. Besides, there are several very mature embedded SQL databases already available.

Obfuscated function source code (not wanted)

Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. At the same time, it would greatly complicate backups and other administrative tasks. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.

At least one other database product allows specification of a subset of the result columns which GROUP BY would need to be able to provide predictable results; the server is free to return any value from the group. This is not viewed as a desirable feature. PostgreSQL 9.1 allows result columns that are not referenced by GROUP BY if a primary key for the same table is referenced in GROUP BY.