We will have git super-master which synchs to git.postgresql.org. Can do receive hooks. Have we considered using github? Github should not be canonical source, in case they go away. Can't do postcommit hooks on Github. People can just do both. Forking Postgres repo puts you near their limit. Put off Github questions.

We will have git super-master which synchs to git.postgresql.org. Can do receive hooks. Have we considered using github? Github should not be canonical source, in case they go away. Can't do postcommit hooks on Github. People can just do both. Forking Postgres repo puts you near their limit. Put off Github questions.

−

Issue: what about the name? People will need to reclone, will be part of suckitude. Rename old repo and create new repo. Where will secret master repo be? Maybe Canova.

+

Issue: what about the name? People will need to reclone, will be part of suckitude. Rename old repo and create new repo. Where will secret master repo be? Maybe Conova.

Mapping usernames onto e-mail addresses could be a pain. Maybe we should standardized onto committer@postgresql.org. Committers should pick names before conversion.

Mapping usernames onto e-mail addresses could be a pain. Maybe we should standardized onto committer@postgresql.org. Committers should pick names before conversion.

Revision as of 14:27, 20 May 2010

A meeting of the most active PostgreSQL developers and senior figures from PostgreSQL-developer-sponsoring companies is being planned for Wednesday 19th May, 2010 near the University of Ottawa, prior to pgCon 2010. In order to keep the numbers manageable, this meeting is by invitation only. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).

This is a PostgreSQL Community event. Room and lunch sponsored by EnterpriseDB. Other companies sponsored attendance for the their developers.

Review of The 9.0 Development Process

How did the commitfest work? Do we feel that the process worked in general, do we like Robert's CF application? What other parts of the process should we improve?

David Fetter commented that the writable CTE patch went through more than one CF without adequate feedback, and the patch got rejected. Should we not allow things to be bumped, or not bumped twice? Still listed as open on November Commitfest. RH thinks feedback was provided but it might not have been very clear. It was reviewed more than once. RT: maybe we shouldn't be so quick to bump people in the last CF. Writeable CTE wasn't bumped until Feb. 10. Part of the issue is that it's a very complicated patch.

JB feels that integration/testing needs to be more structured. Still amorphous. If we had more structure, maybe it would go faster. RH we had a lot of open items, we closed them an released Beta1. Agrees that we need more concrete critera. BMomj: we end up with a pile of really hard problems which we don't know how to fix. Even now can't fix max_standby_delay. Needs to be a fire under someone. Open items list is a big win, but doesn't show the scope of the problem. Didn't have anything on open items list until this AM for Beta. Need to reconstitute list.

If stuff is on the open items list it stops release. DF: maybe we should have ratings of complex/not? Perhaps we need release manager to keep up on open items etc.? Or Beta manager. Not everyone knows what every open item is. Someone should track the list and see status. Need list to know what to work on. JB would like to put stuff on the open items list. If there's a thread on hackers put it on the list.

How do we get all the big patches in the first commitfest or second instead of last? Assumes people are working while releasing. Why didn't HS get into first commitfest. Post-CF, prerelease is long and delays development, people take a vacation for 6 months. Also the CF reviews were not very good for big patches. CFs worked well for small/medium patches. But for big patches not so great. KNNGiST and WriteableCTE not so great. HS at least didn't get into last commitfest.

How much of a problem is this issue going to be for 9.1? Do we have anything that large? Synchronous Replication. SQL/MED?

Josh & Selena will track open items and make sure they get listed or tracked and resolved.

Priorities for 9.1

Timeline for 9.1

Treat: Release in July, have an immediate commitfest of pending stuff. Will we release in July? If we're late, do we want to drop a commitfest and have shorter cycle? Maybe the development cycle should go, even if the release is delayed? Lane doesn't think we have enough manpower for that.

Issue is that people are waiting 6 months to resume development. Are there enough reviewers, though? Maybe we could have a reviewfest. Haas thinks that we have manpower. Berkus likes the idea of a reviewfest, new people. Haas says that we could commit stuff or at least put it "pending commit". Smith says that pruning patches would be valuable. What's the main bottleneck of people? Maybe Kevin could run it.

We would need to branch first. Which would involve backpatching. Branch on July 1, first CF on July 15? Or 15 and August 1? Tom Lane: if we're not close to releaseable by July 1, then it's not feasible. Frost would like to have reviewfest in June. We could ask for reviews right now. RRRs need more direction. Selena will help.

In the future, do we want to start earlier? We should get more people to help with getting to beta. Get people on open items list. Put it on the commitfest app? Magnus: but that makes it closer to a bug tracker. Haas: cycle of work is different for open items. No, will use wiki instead. Next year we'll have an open items app.

Doing early branching will also help with bitrot. And will help with people's work schedules.

Plan is to start 9.1 development on July 15, and only delay if things blow up.

Alpha releases unanimously good. We might want to branch them differently. Downloads weren't huge, 10s or maybe 100 per alpha. But practice found issues with packaging, build scripts, etc. Maybe we shouldn't create branches for them, though. We should just tag it. We just wanted it to say the right name. This is probably fixed. So, for 9.1 we'll have a patch for the tarball and not a branch. Discussion of checkout/tag/branch detailed ensued.

CFs need to have enough reviewers. Need to recruit more? Need to make it clear what's in it for reviewers. Reviewers should be nominated for minor contributors.

Actions:

Selena to get reviewers to start now.

Branch on July 1, CF July 15.

Announce as plan for schedule.

No more branching for alphas.

Performance QA/Performance Farm

Last year we took this as an issue at the meeting. Holdup was pgBench needed overhaul; results were useless on Linux. New pgBench should resolve those problems. Got something in pgBench tools which tries to figure out number of threads. Other thing which has been moving along well is benchfarm, and how should systems be set up to give reasonable performance. Greenplum has nice utility called gperftest, people need to test hardware before running pgbench. Nobody will let us benchmark high-end machines and talk about the results. Smith has some new high-end machines to test performance results.

Smith: we are ready to write a spec for a performancefarm client. Need to build client for this. Frost has an intern to work on Postgres stuff will be working on performance farm client. Will be working for 8 weeks.

Performancefarm also needs to run a battery of individual operations for performance regressions. Also needs to run a quick hardware/OS test for comparability. Need a general framework; maybe we'll eventually add DW test or TPCH.

Why do we keep the same dependancy restrictions as buildfarm? It's easier to get clients that way. If we can tell people that they can just add the PerformanceFarm to the buildfarm, it's easier. Will go to assembled tool very soon. Data collection will start with 9.0 because of old pgbench. Biggest thing is to notice if someone's patch torpedoes performance.

Propose that machines for the PerformanceFarm be named after plants. ;-)

What about replication performance? Too big to take over.

Actions: Stephen's intern to develop PerformanceFarm application. Will need help from Dunstan/Drake etc.

Advanced Security Features

KaiGai's Presentation <link?>

We try to load something externally to make access control decisions. Row-level access controls have a number of issues.

PostgreSQL currently has logic & access controls in the same place. (1) rework external check using same flow. (2) add label support. (3) add SELinux support. New method will have clear separation between Postgres and SELinux, possibly using a loadable module.

Rework of access controls needs to do all of the access control checks at once instead of one at a time with query in between. Need to do one object at a time because otherwise it's too big. That way patches are only 200-500 lines each.

Finally, add security labels to objects.

The concern with the rework was that moving all of the security checks into a separate area was that that area needed to have knowledge about everything. Haas: need to provide a clean interface to security providers, but not by changing huge amounts of code. Heikki: it's not that big, it's fairly mechanical.

Currently we check some basic things (like does the table exist) and later we check fine-grained permissions. Completely isolating it not possible. Locks for one thing. Also it's difficult to have a clean API because the API needs to know about everything. Kaigai says that generalized interface isn't necessary, Linux has had to add to the API with each new security provider.

Why can't we put calls in the current aclcheck? Too low-level, don't pass enough information to them. We could pass more. But if we have the OID, we can look up all of the class information. Right now we have duplicate permissions checks all over the code. And the checks we need to do are not necessarily the same checks which SE wants to do.

Smith: what users want isn't necessarily what we have in the patch. Maybe we should just build a subset of functionality, a lot of people don't care about DDL etc. We could implement only SELECT security, it would make the patch more digestible. All permissions checks for SELECT are all in one place. Or DML only.

Does the information provided supply enough? It has to be because it's the first stage. It's basically the information the user entered.

Is DML-only enough? Will it leak? Of course. Anyway, it's useful simplication. Smith says 95% of use cases are solved by that. Table discussion of all of the ACL checks. Kaigai says that checks are the same for DML and DDL, but others do not agree.

Actions: Kaigai, Stephen, Smith, etc. to get together at pgCon and hash out some more issues.

CVS to GIT

Its probably clear that we should change to a new VCS, and it should be GIT. No disagreement.

What are the gating factors to moving now? Let's make a decision to do it, and when and we'll fix the issues. Problem with buildfarm has been solved. Buildfarm now runs git. Building Git on any older platform is impossible; bad make files. Getting all buildfarm members running Git wouldn't be possible, but we can run CVS mirror for older ones.

We have a checklist on the wiki already for switching.

Most buildfarm members will run either; it's a config item. We'll track which ones are using the emulator.

Building older versions may have issues to build identically. Magnus claims that it's been fixed. How much do we care about old tags? There are still a couple of bad files but they're minor. Do we still have old issues with CVS? Marc says they're fixed, shouldn't show up in Git history.

Are commit e-mails an issue? No. But e-mails will look different. Tom wants them to just work the same.

We don't need to solve technical issues here. Just pick a date. We'll know when we're doing it and that everything will suck for a month afterwards. Will need to be a low-stress time for the project, between commitfests. Tom isn't sure how to apply commits across multiple branches. Discussion of Git details hashed out.

Two issues: sheer space usage. Second, management of commits. But these are not serious problems. Andrew has checklist. Will need to test stuff and decide how to do specific stuff. Suggestion on date: after 2nd commitfest. No, halfway after first commitfest. No, immediately after first commitfest ... August 20th or similar.

We will have git super-master which synchs to git.postgresql.org. Can do receive hooks. Have we considered using github? Github should not be canonical source, in case they go away. Can't do postcommit hooks on Github. People can just do both. Forking Postgres repo puts you near their limit. Put off Github questions.

Issue: what about the name? People will need to reclone, will be part of suckitude. Rename old repo and create new repo. Where will secret master repo be? Maybe Conova.

Mapping usernames onto e-mail addresses could be a pain. Maybe we should standardized onto committer@postgresql.org. Committers should pick names before conversion.

Discussion about commit messages, merges, commits, etc. ensued.

Action:

Magnus to set up emulator.

Andrew to publish checklist of how to set up your Git

Move to Git August 17-20: Magnus, Haas, Dunstan. Frost will be out.

Lunch

Clustering

Snapshot Cloning

Koichi: had meeting in Tokyo, and make a list of core APIs which clustering projects could use. Snapshot cloning is one such, plus it's useful for parallel query and parallel pg_dump. First use snapshot cloning to enforce consistent view of the database. Has already implemented this for PostgresXC. The same thing could be applicable for single PostgreSQL. It is a very simple implementation, and should not produce resource conflicts.

For parallel pg_restore, maybe snapshot cloning will not be sufficient. Cloning the snapshot for read-only transactions is simple, not for write transactions.

Smith: Using this for parallel query also works for read-only cloning.

Very useful for dumping partitioned tables, with one backend for each partition.

Added API to libpq. But shouldn't this be a server-side command? For cluster usage, it was useful for it to be in libpq. RH: one idea is a function we could call, and the shared snapshot would use a "cookie". Joachim W. wrote a patch with publish/subscribe. Needs to be all server-side.

Tom has suggestion for simpler implementation, without locks. That is, you just need to have same snapshot start, not shared snapshot. Snapshot would die once the original transaction was gone. Koichi: this is not a problem.

Tom: maybe we could just use a prepared transaction, which would keep the snapshot valid. Proposing to begin with read-only implementation.

Action: Koichi to extract patch from PostgresXC and submit.

XID Feed

PostgresXC needs to have a transaction run on multiple servers in the same cluster. The XID is needed so that you can have the same transactions. Will also be useful for parallel write operation, but that's really complicated. Parallel backend needs to be assigned same XID, but locks, resource conflicts.

Heikki: let's start with parallel read queries.

JB: parallel write on one server is a different feature than XID feed for clustering.

Multiple backends share the same XID so they can share the same snapshot. If you're doing a multimaster update across multiple servers so you can maintain serialization. Stark explains multi-server deadlock situation.

The XID is not the issue, it's the commit order. But communicating the xids means that you don't need to communicate more data to the servers. Just maintaining transaction IDs isn't enough, we need to maintain commit/abort info. If you want a snapshot which is valid on both nodes, you'd have to lock the procarray on both. You'd have to have a single global transaction manager controlling commits.

What is the core feature here? You might want to make a specific instance of Postgres the global transaction manager. Or you might make one postgres a consumer of snapshots. Heikki: you could interrogate each node about what transactions were running at the time of the snapshot. Some discussion without agreement.

Koichi explains how snapshots are distributed in PostgresXC, they receive them with XID. There's no negotiation between nodes. What stability would this affect with core Postgres? Vacuum and analyze need their own GXID.

What is the feature: getting XID and Snapshot from PostgreSQL. Is this useful for core Postgres? Does it work for other cluster systems other than PostgresXC. Would be useful for all synchronous multi-master replication. Like Postgres-R. Or any distributed databases. Should be done as a "hook". Not really different from two-phase commit, but not testable without an external manager, which is the main problem. How could you test it?

What other things do you need? What other hooks would we need in core to support GTM and other clustering functionality? If we had SQL/MED working, you could export XIDs to remote tables. But we don't have that yet.

A hook will be fine.

Action: Koichi to come up with proposed patch design

General Modification Queue

Marko: one use case is transactional que. Have some sample imentations in pgQ and Slony. Two different stragies: Slony/Londiste, and Josh wants to replicate data to external non-PostgreSQL tables. Josh is mainly concerned about write overhead, but no way around WAL.

What is not solved by current LISTEN/NOTIFY? What this has is potential for really improving. Both Slony and pgQ rely on being able to filter out blocks of events and serialized sequence of individual events. Problem is eventID sequence number cannot be cached, that causes painful overhead. Both systems come up with insert/update/delete statements which go by index scan.

If we can support general functions where a trigger can hand in old and new tuples and the receiver can get something which allows it to pull new data. Seems like commit order is the issue. Why do you need a sequence which can't be cached? If you knew what order they committed ... you wouldn't need a global counter. Jan isn't sure this makes sense for core because of lack of version independance.

If you had a stream of commit information, you'd just have to buffer it. But that could work. The real missing piece is a commit ordering stamp, which the database should supply for you. This was a requirement of Postgres-R as well, they need to know what order to apply the writeset in.

We could use the LSN of the commit record as that number. In the CLOG, for a range of XIDs, we have some LSNs. But it's not enough information. It sounds like all that's really needed is to have a way to grab LSN numbers. Maybe write it to a separate file.

Commit-order table would need to be truncated. The clients have to send message about being done with it. Do we want to call gettimeofday while holding walinsertlock? Tom: we already are, but it's not exact enough. LSN plus approximate timestamp would give you order.

Clients would need to look and say grab all transactions between one LSN and another.

Action:

Develop specification for commit sequence / LSN data

DDL Triggers

Jan: it's a feature we've been missing for at least a decade. Jan starting to work on it, but DDL code is very messy. It's in tcop_utility.c function process_utility. The mess is that while the function gets a query string some calls don't put a real query string in there.

Purposes include enforcement of complex CREATE requirements. Also replicating DDL to replicas. Wouldn't you like it better if the data were passed to the trigger using some kind of structured data rather than a query string. Take node structure of utility statement and create query string which can be passed, as well as passing node structure using nodetostring.

We haven't exposed nodetostring because it's not a stable API. But generally changes there indicate changes in features. But if we could give the trigger an object name. Maybe we could pass before-and-after snapshots.

How do Oracle and other systems get this data?

Also there's an issue that some utility statements call other utility statements.

Nodetostring exposure was also vetoed for other reasons. Slony and other systems can take a tree instead. Maybe we should already have a utility function. We already theoretically have a hook, but it's not being used. And also still a problem with recursive calls.

pgAdmin wants a notification for changes. Would need some notification with data about object changed. But just object changed would be enough. We could also build up a set of events for DDL changes over time based on the tree ... we can start with just objectype and objectID. And type of modification: create, drop, alter.

ProcessUtilityHook is there, but the problem is how it's exposed to the user. Hooks aren't used consistently and you don't know who has set it in what modules. Several people are already using it. Ordering becomes a problem. People don't want to use the hook.

We also want a userspace implementation. Like maybe a trigger.

Action:

Wiki page to be updated with spec by Jan et. all.

Status Report on Modules

Slides from Dimitri

Many issues and topics. Talking today about dump/reload support. If you dump and restore, you don't want to restore objects. We want to support any source language. We want to support custom GUCs and versions in extensions. We also want upgrading factilities.

We are not going to talk about schema. We are not going to talk about souce level packaging, ACLs, PGAN or OS packaging and distribution. Example of extensions/foo/control. Should be in user/share. Control file will have name, version, custom_variable_classes.

Then you can just do install extension foo, drop extension foo. pg_restore would call install extension foo and not its objects.

Need dependancies on extensionID in pg_depends so we know what belongs to the extension.

Used name = value because we already know how to parse them in control file. pg_dump will be easy, we will know how to exclude based on dependancies. uninstall.sql files will be replaced by this.

What do we do about user-modifiable tables which are associated with a module? This is similar to how debian deals with config files. Or we could allow install files have items which aren't tracked as part of the module, and pg_restore would need to know about that.

Should extensions have different versions or different names per version? The install script is just a sql file, you can add a DO script. Debian handles this by checking if the configuration is the default and replacing it, or failing over to the user afterwards.

Need license information in the control file.

We probably just need to punt configuration tables to version 2.

Will this help pg_upgrade? Maybe. Right now you have to migrate shared libs yourself. Will fix some cases.

Action:

Dimitri to do patch

SQL/MED

Have to decide what plans you can push to the remote database. Not all remote sources can handle all structures, including functions, joins etc. Even between PostgreSQL ruleutils.c is different for each version.

Proposed FDW planner API. Pass parse tree, needs to say what it can take. How do we not duplicate the entire planner in the API. EnterpriseDB has been working on this but company has not committed to contributing it.

Jan mentioned that at least a wrapper has to take a scan.

Itagaki: didn't know EDB was also working on SQL/MED. Code is probably completely different. There's four issues: any features it should consider about. Currently considering dbLink and COPY FROM. Maybe there should be other features, like GDQ.

Josh: what about PL/proxy? SQL/MED should support the functionality of PL/proxy.

What is the best design to support access methods for tables? Postgres AMs for indexes, we need AMs for FDWs. Update is a problem, may start with select and insert.

How to push WHERE conditions to FDW? Itagaki's WIP code uses internal tree, requires C code in FDW to parse. Might be unstable for that reason. External server might not be SQL server, so passing SQL isn't that useful.

FDW vs. SRF ... can we merge these somehow? Current implementation of functionscan is to materialize. We didn't do value-per-call because it was difficult with PL/pgSQL. So we just materialized it.

Heikki suggested that we don't use the FDW API from the SQL committee because nobody uses it. Supplies function for reaching into planner but can't imagine how it would work. EnterpriseDB implements pipeline.

Also, an issue is cost estimation. How do we know how much it would cost. Statistics on remote tables? We could store them. It would also be nice to have joins take place on the remote server, but that's version 2 or 3.

Also what about indexes on the remote server. One implementation in Japan has CREATE FOREIGN INDEX. Creating definitions locally of remote schema are very useful. DB2 implemented this and had commands to define foreign objects.

Maybe we don't need to be really smart about this in the first version. But people are asking for this. Pass whole query to remote server. Wouldn't work for joins, but would work for union query. Defining an index on a FDW doesn't make much sense since we don't know what the costs are like over there. We should just have a function in the API.

How would we recognize that we want to do the join on the remote server. We just need costs from the FDW. But we could also keep them in pg_stats. FDW might be used to access complex database like Infobright or Bigtable.

Would EDB contribute a patch? Or just the rights for Heikki to steal bits? Patch from EDBAS would not work. Does the EDB patch help Itatgaki? EDB might get Itagaki access to the code.

Action:

EDB to decide on opening code or not

Review Itagaki's git repo code: Heikki, Peter

Itagaki to keep working on API -- what about Peter?

pg_migrator

Bruce: Just fixed a but in pg_upgrade for XID wraparound. Added docs. Migrator tries to rebuild a plane in the air. Bruce feels like he has a swiss army knife with pg_upgrade. Issue with FrozenXIDs in template0 fixed. Still a work in progress, which is why it's in contrib, still a work in progress but bugs are fixable.

Will still have issues with page format, binary format changes. Haven't had those for a while, may not need them, and a dump & restore every 4-5 years is an improvement. Bunches of people have used it to migrate. The tool makes sure that it doesn't corrupt your data, it goes back if it hits an error.

Stark: In the past we've had a chicken-and-egg issue if people wanted to make changes to the data format, we'd need a conversion function. And there's no hooks in pg_migrator. Could slow down the pace of adding features. We should add the functionality to do the conversion now.

pg_upgrade from EDB has hook for COPY mode for conversion. And then what's the point of pg_migrator? If you don't have to rebuild indexes. But a lot of times you'd have to. But to have it in place so that it's not a hurdle for a data change. If we're going to do that we need to do it for 9.1 early in the cycle.

Dump and reload is impossible for Smith's customers. page checksum thing is a good example. Read-on-conversion is a big performance hit. Background process to convert all old data.

Also issue around internal representation of data type change. We either have to convert it, or have versioned data types.

Is binary data conversion really faster? Read back one version, and convert while running. Something which can read one version level back and can convert is the only viable way. Would have a daemon to convert data. What about split pages, bigger page headers.

CRC would be a good test for this. It's a small patch but has a lot of common upgrade issues. And it would be only one patch so we could put it off for another version if we had to. Would like to get this started for 9.1. CRC requires dealing with the split pages issue.

Other Business

Treat: Mammoth Replicator into Core? Code has been BSD-licensed. Has features which don't exist in Hot Standby. Code is pretty big. Alvaro would say that it's not in great shape to contribute at this point.

Has log-based replication including per-table. Has its own logs, and has binary vs. SQL replay. Not trigger-based.

Real question is would we consider putting any replication solution into Core? Not until we've finished digesting HS/SR. Replication is one word for a dozen different solutions for 3 dozen problems. We've accepted one which solves some problems. Page things we should consider each case on its merits.

Question is what parts of Mammoth make sense to be in core. Bruce thinks that mammoth is so tied into the backend that we couldn't accept it. It's too complicated. It doesn't sound like Mammoth offers enough functionality to make it worth it.

Does mammoth need to be in core? It has grammar changes. We now have better ideas of what replication requirements are, we may have more commonalities in core in the future. Part of the reason have so many because we don't have one in Core.

We'd consider more replication in core, but maybe not Mammoth.

Action:

None

Other Business

Peter says we're almost compliant with SQL 2008. Is it of PR value to comply with the remaining random stuff? People don't think so.

What about Case Folding? We probably don't want to fix that. Wasn't on Peter's list, which is just features. Case Folding would break a lot of stuff. Thought Peter was already doing that with a couple of features per version.

Compliance is only of moderate value. Is there a point of implementing the features on Peter's list?

Summer of Code: how is it going?

Haas has concern. His student is working on Materialized Views. Has proposed a very basic implementation. Might not be able to do even that. We can fail people.

Selena just updated the open items on the mailing list. A lot of items were not closed on the mailing list.

What about max_standby_delay? Tom wants to go back to just a boolean. What about Tom's original proposal? Can't really do it. Tom wants to remove time dependancy. What are the issues with max_standby_delay? Idle time on the master uses up time on the standby. Plus NTP and keepalives and some other stuff.