This list contains known PostgreSQL bugs and feature requests and we hope it is complete. If you would like to work on an item, please read the Developer FAQ first. There is also a development information page.

- marks ordinary, incomplete items

[E] - marks items that are easier to implement

[D] - marks changes that are done, and will appear in the PostgreSQL 9.3 release.

For help on editing this list, please see Talk:Todo. Please do not add items here without discussion on the mailing list.

For Developers: Unfortunately this list does not contain all the information necessary for someone to start coding a feature. Some of these items might have become unnecessary since they were added --- others might be desirable but the implementation might be unclear. When selecting items listed below, be prepared to first discuss the value of the feature. Do not assume that you can select one, code it and then expect it to be committed. Always discuss design on Hackers list before starting to code. The flow should be:

Desirability -> Design -> Implement -> Test -> Review -> Commit

Administration

Allow administrators to cancel multi-statement idle transactions

This allows locks to be released, but it is complex to report the cancellation back to the client.

Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.

Allow log_min_messages to be specified on a per-module basis

This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also Logging Brainstorm.

Simplify creation of partitioned tables

This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection. See also Table partitioning

Tablespaces

Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2

Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.

Allow reporting of which objects are in which tablespaces

This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.

Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original

Currently subtracting one date from another that crosses a daylight savings time adjustment can return '1 day 1 hour', but adding that back to the first date returns a time one hour in the future. This is caused by the adjustment of '25 hours' to '1 day 1 hour', and '1 day' is the same time the next day, even if daylight savings adjustments are involved.

XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.

Add pretty-printed XML output option

Parse a document and serialize it back in some indented form. libxml2 might support this.

Add XMLQUERY (from the SQL/XML standard)

Allow XML sthredding

In some cases shredding could be better option (if there is no need to keep XML docs entirely, e.g. if we have already developed tools that understand only relational data. This would be a separate module that implements annotated schema decomposition technique, similar to DB2 and SQL Server functionality.

Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.

Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manage the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible. See Materialized Views for implementation details

Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. laterally.
For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause.
Currently only the columns of tables defined above subqueries are recognized.

Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)

CLUSTER

Automatically maintain clustering on a table

This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.

pg_dump / pg_restore

[E]Add full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.

[E]Modify pg_dump to create skeleton views for reload (which are then updated via CREATE OR REPLACE VIEW) when views have circular dependencies. This should eliminate the need for the CREATE RULE "_RETURN" hack currently used to address this issue. Thread and additional information here:

Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get
CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.

In a selective dump, allow dumping of an object and all its dependencies

Add options like pg_restore -l and -L to pg_dump

[D]Add support for multiple pg_restore -t options, like pg_dump

Stop dumping CASCADE on DROP TYPE commands in clean mode

Allow pg_dump --clean to drop roles that own objects or have privileges

tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.

Allow pg_restore to load different parts of the COPY data for a single table simultaneously

Remove support for dumping from pre-7.3 servers

In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code
to try to deal with the lack of dependency info in older servers, but the usefulness of maintaining that code grows small.

Refactor handling of database attributes between pg_dump and pg_dumpall

Currently only pg_dumpall emits database attributes, such as ALTER DATABASE SET commands and database-level GRANTs.
Many people wish that pg_dump would do that. One proposal is to let pg_dump issue such commands if the -C switch was used,
but it's unclear whether that will satisfy the demand.

HTTP

Triggers

Improve storage of deferred trigger queue

Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves dumping large queues into files, or doing some kind of join to process all the triggers, some bulk operation, or a bitmap.

This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock so it is not ideal for this usage.

The main difficulty with this item is the problem of creating an index that can span multiple tables.

Determine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.

ALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly/not documented, and the ONLY marker would then be silently ignored. Clarify the documentation, and reject ONLY if it is not supported.

Indexes

Prevent index uniqueness checks when UPDATE does not modify the column

Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE.
However, HOT already short-circuits this in common cases, so more work might not be helpful.

Allow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes

Such indexes could be more compact if there are only a few distinct values. Such indexes can also be compressed. Keeping such indexes updated can be costly.

This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime.

Startup Time Improvements

Experiment with multi-threaded backend for backend creation

This would prevent the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.

Write-Ahead Log

Eliminate need to write full pages to WAL before page modification

Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files.

When full page writes are off, write CRC to WAL and check file system blocks on recovery

If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches. The difficulty is that hint bits are not WAL logged, meaning a valid page might not match the earlier CRC.

Write full pages during file system write and not when the page is modified in the buffer cache

This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.

This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.

Miscellaneous Performance

Use mmap() rather than SYSV for shared buffers?

This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead. We could also consider mmap() for writing WAL.

Rather than consider mmap()-ing in 8k pages, consider mmap()'ing entire files into a backend?

Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.

Exotic Features

This could allow SQL written for other databases to run without modification.

Allow plug-in modules to emulate features from other databases

Add features of Oracle-style packages

A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate "packages" syntax at all.

Features We Do Not Want

The following features have been discussed ad nauseum on the PostgreSQL mailing lists and the consensus has been that the project is not interested in them. As such, if you are going to bring them up as potential features, you will want to be familiar with all of the arguments against these features which have been previously made over the years. If you decide to work on such features anyway, you should be aware that you face a higher-than-normal barrier to get the Project to accept them.

All backends running as threads in a single process (not wanted)

This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model, and MySQL and DB2 have demonstrated that threads introduce as many issues as they solve. Threading specific operations such as I/O, seq scans, and connection management has been discussed and will probably be implemented to enable specific performance features. Moving to a threaded engine would also require halting all other work on PostgreSQL for one to two years.

"Oracle-style" optimizer hints (not wanted)

Optimizer hints, as implemented in Oracle and other RDBMSes, are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have such problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed. See Optimizer Hints Discussion for further information.

Embedded server (not wanted)

While PostgreSQL clients runs fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and failure cases. Besides, there are several very mature embedded SQL databases already available.

Obfuscated function source code (not wanted)

Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. At the same time, it would greatly complicate backups and other administrative tasks. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.

At least one other database product allows specification of a subset of the result columns which GROUP BY would need to be able to provide predictable results; the server is free to return any value from the group. This is not viewed as a desirable feature. PostgreSQL 9.1 allows result columns that are not referenced by GROUP BY if a primary key for the same table is referenced in GROUP BY.