Wednesday, May 05, 2010

Big Ideas

Each release of PostgreSQL features dozens, if not hundreds, of small improvements over the previous release, plus a few really major new features. Of course, not everyone will agree on which new features are best. In the forthcoming 9.0 release, pretty much everyone seems to agree that Streaming Replication (standby servers can receive WAL incrementally rather than in 16MB chunks) and Hot Standby (standby servers can execute read-only queries) are the big new features. In 8.4, I think the biggest improvements were the addition of window functions and common table expressions and the self-tuning free space map, which eliminated a major potential source of hassle for database administrators. In 8.3, we got several dramatic performance improvements - HOT, checkpoint spreading, and improvements to the background writer.
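To illustrate one of those 8.4 features, here is a minimal sketch of a recursive common table expression. SQLite is used purely because it ships with Python; the SQL itself is the same shape you would write in PostgreSQL, and the table and data are made up for the example.

```python
import sqlite3

# Toy data: a three-level management chain.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);
INSERT INTO employees VALUES (1, 'alice', NULL), (2, 'bob', 1), (3, 'carol', 2);
""")

# Recursive CTE: start from the employee with no manager and walk downward.
rows = conn.execute("""
WITH RECURSIVE chain(id, name, depth) AS (
    SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, c.depth + 1
    FROM employees e JOIN chain c ON e.manager_id = c.id
)
SELECT name, depth FROM chain ORDER BY depth
""").fetchall()
print(rows)  # → [('alice', 0), ('bob', 1), ('carol', 2)]
```

Before 8.4, this kind of hierarchical query required a stored procedure or repeated round trips from the application.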

So, what's next? What should our community be focusing on for PostgreSQL 9.1? If you've been following developments on pgsql-hackers, you might be tempted to pick out some of the big patches that weren't completed in time for 9.0, like KNNGIST (use indices to accelerate queries that ORDER BY a distance expression, as when you want to find, say, the nearest points to a given circle), improved partitioning support (built-in syntax to reduce manual setup), and index-only scans (reduce I/O for index scans by opportunistically skipping or postponing tuple visibility checks). Another possible source of ideas is to look at the features that will be part of 9.0 and think about ways in which they could be further improved: streaming replication that is synchronous rather than asynchronous, allowing Hot Standby to pause and resume WAL replay, or narrowing the set of circumstances under which Hot Standby needs to cancel a query in order to proceed with WAL replay. You could also look at our TODO list, or previous surveys and blog postings on this topic.

But I think it might be good to step back and look at the problem a bit more broadly. Ignoring for a minute what people actually are working on, what should they be working on? Where is PostgreSQL strong as a product and where does it need improvement? Here are a few things to think about - please leave a comment with your thoughts.

1. Performance and Scalability. When I first started using PostgreSQL, the product had a reputation for being slow, principally because of issues relating to VACUUM. That reputation wasn't entirely justified even back then, and I think we've made enormous progress here in 8.3 and 8.4, but there might be more improvements we can make. Where are the remaining bottlenecks? What features can we provide to make better use of machines with lots and lots of RAM, CPU, and/or I/O bandwidth? Or what if you have more than one machine available? Are there classes of queries that we don't optimize well, and could do better with?

2. SQL Features. We're pretty close to having all of the SQL features required by the standard - and many of the ones that are left are not terribly interesting, which may be why no one has gone to the trouble of implementing them. Still, there are a few more things that would be nice to have, including updateable views, materialized views, LATERAL, and global temporary tables. Do you need these features? Are there other things we're missing?

3. Indexes and Data Types. We currently support btree, hash, gin, and gist indices; hash indices are somewhat limited at present because they are not WAL-logged, and therefore can become corrupted after a system crash. We also support a broad array of datatypes, including all of the usual base types plus user-defined record and enum types, an XML datatype, and the ability to create new datatypes using PostgreSQL's powerful extensibility model. What else do we need, and which of the features we already have need improvement?

4. Procedural Languages. Our core distribution includes support for SQL functions and four procedural languages: PL/pgSQL, PL/Perl, PL/Python, and PL/Tcl. Additional languages such as PL/R and the ever-popular PL/LOLCODE are available as extensions. It seems unlikely to me that we need more procedural languages, but the existing ones might need improvement.

5. Security. Security can be further subdivided into connection security (preventing the bad guys from connecting to your database) and database privileges (allowing access to some of the data in the database, but not all of it). Two major features that we don't have in this area are row-level security and SE-Linux integration, but there may be other things as well. What are they and which ones are important?

6. Administration. I think that the simplicity of administering PostgreSQL is one of its greatest strengths: installing PostgreSQL doesn't mean that you need to hire a dedicated DBA. Still, there's always something that's hard to do. What is it?

7. Replication and High Availability. I alluded to some possible projects in this area near the top of this post, and of course some of our needs in this area will continue to be met by third-party projects, such as Slony, Bucardo, and Londiste, but there may be other enhancements needed in the core product as well.

77 comments:

I would love to see some native ability to view past versions of a record or resurrect a deleted row without the need to build a soft delete system. I've read some blog postings about how such a feature could be built. See Scott Bailey's slides from PG West 2009 on future enhancements.

How about making multiple instances able to operate from separate machines on shared files? In other words, would it be possible to have several machines run the PG daemon while all safely using the same files on an NFS/SAN/Ceph-based share? Maybe even directly on RADOS files?

I'd like to see:
- automatically maintained clustered indexes (pg clustered indexes are awesome but can be painful to keep up-to-date in terms of load)
- materialized views, which could be really cool for caching in the DB, though they're easily worked around
- better performance monitoring tools

To my way of thinking, documentation is key at this point in time. People know MySQL is the default and use that instead of looking at PG to see if this DB might be a better solution. We used MySQL at DirtyPhoneBook without even taking a look at PG because of that, I'm afraid. Now we're scaling well and all, but missing several features that would make life a little easier.

The feature I would love most is the ability to modify a VIEW, or COMPOSITE TYPE that functions are dependent on. In this way it's not necessary to DROP .. CASCADE, re-create the object, and then re-create the functions.

I would also appreciate being able to better integrate PostgreSQL databases into information systems, by being able to EASILY and EFFICIENTLY join, for instance, two tables, one belonging to a pg database and the other managed by another RDBMS, per the SQL/MED standard.

1. The website itself
2. Documentation on the website itself (as tested by finding the correct documentation for the current version via Google)
3. Ease of administration 'out of the box' (i.e. a backup scenario that works with minimal understanding of underlying issues for simple deployments)
4. Cloud tools (i.e. backups going to S3)
5. Deprecating some of the arcane syntax like \dt in favor of 'show tables' or something simple and descriptive

Proper collation support since I am a bit of a language nerd and hate the way collation works right now. A global locale is never right.

MERGE to be implemented since it is so common to want to either insert or update. Almost every project includes that.

Adam, I disagree with almost all of your ideas. I prefer \dt over SHOW, and the documentation and website of PostgreSQL are some of the best on the Internet. Sure, they can always be improved, but you make it sound like there's a problem now. And backing up databases is trivial in simple cases for Postgres; it only gets hard when you try to do replication. Cloud tools also seem out of scope for the Postgres project itself and better suited to a separate project.

How about a Hot Standby where you can pause and resume WAL replay, but also rewind?

I could imagine situations where I want to see what my live server looked like at some point in history. So I pop over to the Hot Standby, hit 'rewind' and go back a few days, check the data and then hit 'play' again.

It's probably more an issue for associated projects, but more flexible asynchronous replication is very important for us. In particular:
- Multi-master, so we can fail over to DR without having to get replication up-to-date first
- Some ability to transform replication events, so we can replicate to a different (incompatible) schema version, and thus perform upgrades without outages

3) Dependency invalidation or something to help the view/function dependency hell.

4) XMLTABLE and binary XML data type

5) Better parameter support in plpython. For instance returning record.

@Alex - Our proposal for a period data type was shot down on hackers. Instead they wanted a more generic mechanism to create ranges from many scalar data types not just timestamps. Jeff Davis is working on this for 9.1.

I'm not sure what to say to the folks asking for better documentation. Postgres' docs are top notch. Go look at the website/docs for Firebird and then we'll talk. Perhaps we can do something about getting Google to prefer "current" instead of a particular version.

This request might be more of a 10.0 feature than a 9.1 feature... I would like to have support for historical queries, similar to Oracle's Query Flashback: http://wiki.oracle.com/page/Query+Flashback

If PostgreSQL supported SELECT ... AS OF TIMESTAMP queries, that would make it much easier to develop applications that require an audit trail. Without that feature today, you could only develop such an application via an awkward convention of INSERTing log entries with timestamps — never UPDATEing or DELETEing — and querying views that "replay" the log. It would be much easier if the database supported it natively.
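The append-only convention described above can be hand-rolled today. Here is a minimal sketch, using SQLite as a stand-in for PostgreSQL; the table and column names are invented for the example:

```python
import sqlite3

# Every change is an INSERT with a timestamp; nothing is ever UPDATEd
# or DELETEd, so history is preserved.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE account_log
                (account_id INTEGER, balance INTEGER, changed_at INTEGER)""")
conn.executemany("INSERT INTO account_log VALUES (?, ?, ?)",
                 [(1, 100, 10), (1, 150, 20), (1, 90, 30)])

def balance_as_of(acct, ts):
    # Latest logged version at or before ts -- the poor man's
    # SELECT ... AS OF TIMESTAMP.
    row = conn.execute("""SELECT balance FROM account_log
                          WHERE account_id = ? AND changed_at <= ?
                          ORDER BY changed_at DESC LIMIT 1""",
                       (acct, ts)).fetchone()
    return row[0] if row else None

print(balance_as_of(1, 25))  # → 150
```

Native support would eliminate both the replay query and the discipline needed to keep the log append-only.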

A stronger focus on performance on SSDs. RethinkDB is building a new storage engine for MySQL that's optimized around the performance characteristics of SSDs. It would be great to see a similar initiative in the PG community.

Compared to MySQL, you definitely need simpler installation, user creation, and database creation. It is very simple in MySQL, and can also be done easily from phpMyAdmin.

Beyond that I think psql is in very good shape.

For me the most important thing is data safety. Maybe you should check at startup, for example, whether write barriers and sync/fsync are actually working. Some disks, filesystems, volume managers, and crypto layers just cheat: they ignore these requests and sometimes don't even return proper error codes. :(

1. Implement partial sorts. This is useful when you have a fast source of tuples that are sorted by field a, but you need tuples "ORDER BY a, b". Currently, Postgres loads the whole lot into memory and sorts it, whereas it only needs to sort each group of similar values of a.
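The partial-sort idea can be sketched in a few lines of Python: since the input is already ordered by a, only each run of equal a-values needs sorting by b, rather than the whole input.

```python
from itertools import groupby
from operator import itemgetter

def partial_sort(rows_sorted_by_a):
    # rows are (a, b) tuples, already sorted by a.
    out = []
    for _, group in groupby(rows_sorted_by_a, key=itemgetter(0)):
        # Sort each small group of equal a-values by b.
        out.extend(sorted(group, key=itemgetter(1)))
    return out

rows = [(1, 9), (1, 2), (2, 5), (2, 1), (3, 7)]  # already sorted by a
print(partial_sort(rows))  # → [(1, 2), (1, 9), (2, 1), (2, 5), (3, 7)]
```

Besides the smaller sorts, this can stream: each group can be emitted as soon as the next a-value arrives, instead of materializing the whole input first.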

2. A generalised multi-column R-tree index. What I mean is, in a similar way that you can create a multi-column B-tree index, which can answer queries like "a = 1 AND b > 1", it would be great to have a multi-column R-tree index, which can answer queries like "a < 1 AND b > 1". This may be what Jeff Davis is doing, but I haven't looked.

3. Planner risk analysis. There are queries where picking the lowest cost plan is not the sensible choice, if that plan is more risky than other plans. An example is a query with EXISTS. If matching rows are common in the table, then Postgres will use a sequential scan to answer EXISTS, assuming that it will find a matching row very quickly. However, if the table is ordered in some way, and all the matching rows are at the end of the table, a full sequential scan is performed, which can cause bafflingly slow queries, where an index would be very quick. We see this crop up on the performance mailing list quite frequently.

1) The ability to use multiple cores to execute a single CPU-heavy transaction.

2) Concurrent partition reading+processing for a single query

3) Clean fkey/pkey handling in partitions

I don't ask much, do I ;-) . I'm well aware of how absurdly improbable the first two are, and I'm not actually asking anyone to do anything. It's nothing more than a "gee, it'd be nice if..." .

I do think Pg's single-thread-per-backend, single-backend-per-connection design will begin to seriously hurt it as machines stop increasing in single-core performance as much and start sprouting more cores than you can count. At some point the horrid task of getting multi-threaded backends going is going to have to be tackled.

For that matter, separating query executor from connection state may become necessary, too. You can get around the very expensive connections (because each connection has a private executor that's idle most of the time) in Pg using an external pool, but it'd be really nice if people didn't have to worry about poolers to get the most out of Pg in situations with high client counts.

@Peter Eisentraut: it's a basic fact of economics that humans always want (demand) more but resources (people, time, etc.) are scarce.

For those that have lots of resources (CPU, I/O bandwidth, etc.), some form of parallelization of queries would be interesting, e.g., breaking down a query over a partitioned table so that each partition is processed by a separate process (or thread).
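A rough sketch of that shape of parallelism, with plain Python lists standing in for the partitions of a table: each worker scans one partition and produces a partial aggregate, and a final step combines them.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions of one table.
partitions = [
    [3, 1, 4],
    [1, 5, 9],
    [2, 6, 5],
]

def scan_partition(rows):
    # Per-partition work: here, a partial aggregate (a sum).
    return sum(rows)

# Fan out: one worker per partition.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(scan_partition, partitions))

total = sum(partials)  # final combine step
print(total)  # → 36
```

The hard part in a real database is not the fan-out but planning it: deciding when parallelism pays for its startup and coordination cost, and combining partial results correctly for aggregates like AVG that don't simply add.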

I put number 3 in there as being a MySQL fan (though not as much anymore), I love the fact you can change the underlying storage engine and still retain the same interface. It has lead to some very cool projects like RethinkDB, Infobright, PBXT, etc. What I'm looking at specifically is the ability to plug in a column-oriented storage engine as they provide significant benefits for OLAP systems.

I'd love to see PostgreSQL provide some of the functionality of a graph database.

This. Having CTEs makes it easier to do "graph stuff" in PostgreSQL, but graph operations done this way don't scale particularly well from what I've seen. Putting some more thought into ways to make PostgreSQL excel at storing and querying graphs would be a wonderful thing.

Clustered indexes. Not clustering a table using an index (seems to be some confusion here), but actual clustered indexes aka index organized tables. That is to store the entire table inside an index. The result is that when we're looking up something by primary key we only need to do the index scan to find the data we're interested in. Today we need to scan the index and then actually find the row on the heap.

On a similar (very) note, index only queries. With this we can select data directly from an index without needing to lookup the data from the underlying table.

- Materialized/indexed views (e.g. useful for calculated fields in views that don't exist in underlying tables, etc.)
- Built-in partitioning syntax
- Index-only scans (essential for people who like to use natural keys for the primary key)

...But there is another feature that is related to the replication features that I don't hear much about:

- Built-in "database resynchronization"

Not everyone replicates for performance. I need replication for fault-tolerance/fail-over. It's useful to have hot-standby, but what do I do when the "down" server comes back up again? How do I re-sync the databases in a way that will minimize downtime, so that the data in both is the most current?

Materialized views and partitioning in the base syntax without the over-head required to do it today would be positively amazing. Index Organized tables and further scalability/reliability improvements as discussed in the post would also be very very very welcome.

Concentrate entirely on user friendliness. Postgres is so much less user friendly than MySQL. People like me are just running simple web sites, and the awesome database stuff beneath the covers in Postgres doesn't make any difference to us. Thus, we end up using MySQL just because it's so much easier. We even use *gasp* the dreaded MyISAM, because it doesn't make any difference for our applications.

Make Postgres way way way more user friendly in pretty much every aspect. Now that replication is there, this should be priority number one.

After hearing the effusive praise for Postgres I've been giving it a shot. If I could make a few requests: strengthen the UPDATE capabilities. For instance, allow GROUP BY syntax in an UPDATE with joins. Also, I hear there's no "replace into" support. And this: "" doesn't equate to an empty string? Really? It's 2010, not 1980.

Currently we maintain our own "materialized views" as separate tables that we update via Java code. It works, but it's a pain, and it's easy to mess up. I would love it if the DB handled all that work for us.

I'm no database guy, but my business associate won't use postgres because it doesn't have true record level locking for serial transactions. I know for most uses mvcc and concurrent transactions are faster, but some people have to use the serial transaction features and would really like true record level locking.

At the moment a Postgres user is forced to make tradeoffs between good design (normalization, referential integrity) and performance - basically it's too slow to do all the joins, and you're better off duplicating the data.

A live materialized view would give the best of both worlds - create a well-structured DB, but then use materialized views to cache the overhead of flattening the tables.

Web sites frequently run at a 10,000-to-1 or higher read/write ratio, so it really doesn't matter a bit if there is significant cost to maintaining the view (as long as it's not a complete rebuild on every change).

1. autoconfiguration: I would love postgres to suggest or set its best guess at the optimal settings for shared_buffers, work_mem etc given my hardware.

2. query optimization with stored procedures: while the query optimizer is great for plain SQL queries it is not able to dig down and take into account the queries within stored procedures. That is, I have a SQL query that has a subquery with a call to PL/PGSQL stored procedure. Expand the ultimate query from all the calls and optimize that.

3. documentation: yes, the technical docs are good, but what about books for newbs: "postgres for dummies", "postgres in 24 hours", etc.? These are important in lowering the barrier to entry for non-DBAs. That your average Barnes & Noble or Borders has 10+ books on MySQL and none on Postgres surely drives traffic to MySQL.

4. schema support: schemas are not yet first-class concepts with full support. E.g. in 8.3 there is no schema option for log_line_prefix, and I cannot specify a schema in psql -c.

While temporal databases, graph-database-like interaction, data-cubes, and the multitude of RDBMS standards are all very nice to have and support...

RDBMSes are inherently based on relational algebra. Relational algebra ought to be distributable. I would like to be able to access my database as though it were a single server. But, when I need additional performance, add a few systems to a cloud and call it good. I don't want to worry that if one of those systems goes down, the data is lost. I don't want to be limited in my data capacity to that amount of storage available on the smallest machine. This might require that I have a separate disk performing asynchronous saves like git change-sets to a central data store. Each system would have to be ACID compliant with its operations including a commit to the queue on the persistent data store manager.

Parallel Query Optimizer (from Greenplum). Converts SQL or MapReduce into a physical execution plan, using a cost-based optimization algorithm that evaluates a vast number of potential plans and selects the one it believes will lead to the most efficient query execution. It takes a global view of execution across the cluster and factors in the cost of moving data between nodes in any candidate plan.

Polymorphic Data Storage (from Greenplum). Customers can tune the storage types and compression settings of different partitions within the same table. A single partitioned table could (for example) have older data stored as 'column-oriented with deep/archival compression', more recent data as 'column-oriented with fast/light compression', and the most recent data as 'read/write optimized' to support fast updates and deletes.

The killer database feature nobody seems to have been able to implement yet is good temporal support. Read "Temporal Data and the Relational Model" by Date and Darwen (no flame wars ;)) - the stuff in there would save *tons* of application development time in lots of areas (accounting, contracts, scheduling) and improve application correctness and performance as a bonus.

Server-side support for graph algorithms would also be great. DFS, BFS, all-pairs shortest paths, single source shortest paths, connected components, minimum spanning trees, etc all have lots of applications but sometimes require access to huge amounts of data to produce quite small result sets.

Postgres has been kind to me, but I still have two large problems with Postgres:

1) Partitioning

2) Clustered Indexes

The data sets I deal with are time ordered sets of continuous data 24/7. This naturally leads me to want to both cluster on timestamp and partition by time.

After realizing how much server-side code had to be written to support partitioning (building up the next partition at the correct time, dropping old partitions, replacing triggers, etc.) I was actually a bit shocked. I suspect that many users besides myself are looking to partition by time ranges, so I think that support for automated partitioning based on some simple rules (per day, per 1,000,000 records) would be very useful.
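The scaffolding being described looks roughly like this sketch, which generates the DDL for the next daily partition in the inheritance style 8.x/9.0 requires. The parent table and column names are hypothetical, and a real setup would still need a trigger or rule to route inserts to the right child.

```python
from datetime import date, timedelta

def daily_partition_ddl(parent, column, day):
    # Child table covering [day, day + 1), constrained so the planner
    # can exclude it for queries outside that range.
    nxt = day + timedelta(days=1)
    child = f"{parent}_{day:%Y%m%d}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ({column} >= '{day}' AND {column} < '{nxt}')\n"
        f") INHERITS ({parent});"
    )

ddl = daily_partition_ddl("readings", "recorded_at", date(2010, 5, 5))
print(ddl)
```

A cron job (or pgAgent) running this a day ahead, plus a companion that drops children older than the retention window, is essentially what every time-partitioned installation re-implements by hand today.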

Additionally, the lack of automatic clustering significantly hurts reporting on my latest partition that is still receiving incoming data. It is relatively easy for me to get our "historical" partitions clustered, but the current partition is not (and can't reasonably be) routinely clustered. Right now there is no real solution besides making the clustered range a bit smaller to limit the size of the non-clustered data set, but making too many partitions seems to have a significant impact on insert performance. I suspect that this has improved in 9.0, but I haven't yet had time to test.

1. Auditing capabilities similar to Oracle's. PG is getting pushed aside at my shop because it lacks SOX-compliance auditing.

2. Triggers on "create" and "alter" SQL commands. I think this could potentially make replication much much easier to administer (you could trigger the replication system to replicate the change). Triggers on login/logout, startup/shutdown, and on errors (especially privilege errors for auditing) would be extremely useful.

3. More monitoring metrics to measure and tune PG parameters, such as autovac and bgwriter.

4. Read/Write capability while re-indexing. I have a 24x7x365 system and reindexing never happens.

a) Horizontal partitioning. Right now there is some very useful horizontal partitioning, but it would be nice to have support at the GUI level, and to improve the current implementation. Version 8.2.x needed some triggers and some manual stuff. I think it would be nice to create a table, define the horizontal partitioning with DDL, and let PostgreSQL solve the implementation details.

More monitoring features. We want to know which component really slows a query down: parser? executor? I/O time (read/write)? For us log_statement_stats, log_parser_stats, log_planner_stats, and log_executor_stats are a little bit too much, but log_min_duration_statement(0) with the duration time is not enough to drill down into what's really going on inside of PG.

I'll add another one for "more information from the database". This could be:

* More built-in queries/views for determining index/table sizes, which indexes are getting used and which aren't, when things are getting bloated, etc. I've been devouring some of the PGCon talks about these things, but the information sometimes seems so difficult to get to unless someone posts some helpful system catalog queries to build that info.
* More options for logging useful information (into separate files that can be rotated/handled differently) that can later be parsed and analyzed in useful ways. Especially if some contrib modules were added to do that analysis.
* Anything to help me make more sense out of EXPLAIN ANALYZE. My company's main product is maturing to the point where I can go back to some of our problem queries to optimize, but I often don't know where to start. Identifying problems and hinting at potential solutions would help me immensely. Documentation, tools, GUIs, I don't care, I just need some help.
* Some way to sanity check various configuration/tuning options. So many of the optimization recommendations and best practices are outdated and confusing. I'd like a simple way to determine either what my best settings are, or be able to create test suites to benchmark various settings. Really, even updated best-practices documentation with each release would be awesome.

Simple and easy things that would make my life easier:

* Being able to modify ENUMs without having to rebuild my table would be super dandy.
* Some kind of an "if 0 results on UPDATE, do an INSERT instead" functionality. Currently I implement this in application logic (which makes sense), but being able to fire a single statement at the DB instead would be faster and more efficient code-wise.
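The application-side dance being described looks like this sketch, with SQLite standing in for PostgreSQL and an invented counters table. The usual caveat applies: without a retry loop or a lock, two concurrent clients can both see zero rows updated and both INSERT, which is exactly why people want the server to do it atomically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, hits INTEGER)")

def bump(name):
    # Try the UPDATE first...
    cur = conn.execute("UPDATE counters SET hits = hits + 1 WHERE name = ?",
                       (name,))
    if cur.rowcount == 0:
        # ...and fall back to INSERT if no row existed.
        conn.execute("INSERT INTO counters VALUES (?, 1)", (name,))

bump("home")
bump("home")
print(conn.execute("SELECT hits FROM counters WHERE name = 'home'").fetchone())
# → (2,)
```

A server-side MERGE (or an atomic insert-or-update statement) would collapse this into one round trip and remove the race.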

Beyond that, next-gen pie-in-the-sky type features I'd like to see include:
* Materialized views would be helpful in some circumstances. I'd actually been investigating rolling my own here without knowing about this generalized functionality.
* Better partitioning support. Being able to split my data up without all the caveats and manual setup would be dandy.
* Multi-master clusters, when combined with that partitioning support, would really enhance scalability.

I've been using PostgreSQL since 7.0/7.1 and I have to say the progress that's been made has been incredible. That so many things have dropped off my wishlist since those days is very encouraging and I really look forward to seeing what the future holds.