Event Triggers

The main goal of the Event Trigger patch series is to allow users to act on DDL commands without having to hook into ProcessUtility(). The design has been made general enough to support several important use cases.

Main advantages over ProcessUtility hooks

The first version of Event Triggers offers few technical capabilities that we don't already have available in the form of ProcessUtility hooks, so what's the deal here?

You can write event trigger functions in PL/pgSQL; there is no need to write them in C, build an extension, ship it to the server's file system and finally install it;

You can actually list the currently installed Event Triggers by doing \dy in psql;

In single user mode the Event Triggers are disabled, staying out of the way while you're busy fixing your production services;

It's possible to install more than one Event Trigger and then decide, as the DBA, in which order to run them, without having to decipher how the code is implemented and in which order the shared object libraries are going to be _PG_init()ialized.
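As a minimal sketch of that PL/pgSQL workflow (the function and trigger names here are made up for the example), an event trigger that refuses DROP TABLE can be installed entirely from SQL:

```sql
-- A PL/pgSQL function usable as an event trigger: it returns the
-- pseudo-type event_trigger and has access to TG_EVENT and TG_TAG.
CREATE OR REPLACE FUNCTION abort_drop_table()
  RETURNS event_trigger
  LANGUAGE plpgsql
AS $$
BEGIN
  RAISE EXCEPTION 'command % is disabled on this server', tg_tag;
END;
$$;

-- Fire it at ddl_command_start, only for DROP TABLE commands.
CREATE EVENT TRIGGER no_drop_table
  ON ddl_command_start
  WHEN TAG IN ('DROP TABLE')
  EXECUTE PROCEDURE abort_drop_table();
```

Once created, it shows up in the output of \dy and can be removed again with DROP EVENT TRIGGER no_drop_table;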

Use Cases

We aim to be able to solve the following use cases:

Logical Replication

Auditing

DDL support for Extensions

Impact analysis/performance

Logical Replication

We want to record DDL changes in a way that allows it to be replayed elsewhere in a flexible way. So we need access to both the original text and the fully qualified text once all search paths are resolved, making both available to allow replication to decide which to use. We don't need to fire actions for every conceivable sub-statement, only enough that we can accurately reproduce what occurred. So for example, a DROP CASCADE might just need to be replayed as a DROP CASCADE. Query text is available from utility hooks, but fully cooked SQL event info is required.

Auditing

We need to be able to record Who did What to Whom/WhichObject, When they did it, and whether it succeeded. So we need access to: userid, username, action category, action text, subject category, subject id, action time, and return code.

Performance

How long did we hold locks for? How long did the action take?

Expected Features

The first release of Event Triggers is landing in PostgreSQL 9.3, and we will have a restricted set of features in that release.

The committed code supports two firing points, ddl_command_start and ddl_command_end, so that you can call your event trigger either before the command runs or when the object has already been CREATEd.
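For instance (the names below are illustrative), a trigger attached at ddl_command_end runs after the command completes, so for CREATE commands the target object already exists:

```sql
-- Log every completed DDL command; TG_TAG holds the command tag
-- (e.g. 'CREATE TABLE') and TG_EVENT the firing point name.
CREATE OR REPLACE FUNCTION log_ddl_end()
  RETURNS event_trigger
  LANGUAGE plpgsql
AS $$
BEGIN
  RAISE NOTICE 'finished % (event: %)', tg_tag, tg_event;
END;
$$;

CREATE EVENT TRIGGER log_ddl
  ON ddl_command_end
  EXECUTE PROCEDURE log_ddl_end();
```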

Features still in the work for 9.3

The main features we need and which are not yet committed in 9.3 are:

No specific information about the objects that the command targets

No access to the parse tree or the command string, raw or normalized

No support for multiple targets per command

only DROP can target more than one object at a time

and DROP CASCADE is another special case of that

No support for generated commands

The next section details where the discussion stands on those features that we didn't commit yet.

Discussion

Some of the features are still under discussion in order to find the best design in which to implement them.

As a general note it's useful to recall that most of those features have already been coded at least once (some of them received one or more redesigns and refactorings following review in the 9.2 and 9.3 cycles). So it's not about adding new features that we didn't see coming before; it's about finishing a patch series to match a subset of the features proposed a couple of years ago.

The goal is to refine which subset we are ready to accept and to evaluate if we can ship with that subset as something that makes sense on its own.

For example, with no information available at all about the specifics of the objects that a command is targeting, it's hard to see which use case you can actually solve with Event Triggers.

generated and internal commands

For context, the following command will actually run several operations:

CREATE TABLE foo(id serial primary key, f1 text);

We will have a CREATE SEQUENCE, then a new CREATE TABLE statement using the name of the sequence we just created, then an ALTER SEQUENCE, and finally a CREATE INDEX command. The current implementation in the PostgreSQL backend forms a parse node from scratch and sends it to ProcessUtility() again, recursively.
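Behind the scenes, that single statement behaves roughly like the following sequence (the generated names foo_id_seq and foo_pkey are the backend's defaults):

```sql
-- What CREATE TABLE foo(id serial primary key, f1 text) expands to:
CREATE SEQUENCE foo_id_seq;

CREATE TABLE foo (
    id integer NOT NULL DEFAULT nextval('foo_id_seq'),
    f1 text
);

-- Tie the sequence's lifetime to the column it feeds.
ALTER SEQUENCE foo_id_seq OWNED BY foo.id;

-- The primary key constraint is backed by a unique index.
CREATE UNIQUE INDEX foo_pkey ON foo (id);
```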

The current patch proposal is to expose those with a CONTEXT of GENERATED that is not matched by default by any Event Trigger, and to allow users to opt in when they are interested in that level of implementation detail.

It might not be very wise to make that implementation detail visible to users, because doing so would make us unable to clean up the mess later, should any developer have the round tuits or motivation to go about that some day.

One idea is to only expose those commands to users who are willing to code their event trigger in C, and another idea is to only expose them via a hook. The problem with the hook approach is that you still need to build and expose exactly the same information.

How to expose Information to Event Triggers Functions

The current proposal is to expose some magic PL variables. The already committed code exposes TG_EVENT and TG_TAG.

We're proposing to add to that the most common pieces of information as variables, including those that are really cheap to build (each one is a constant string):

Variables

TG_OBJECTID

TG_OBJECTNAME

TG_SCHEMANAME

TG_OPERATION

TG_OBTYPENAME

TG_CONTEXT

Accessors

pg_get_event_command_string()

About TG_CONTEXT, see previous point.
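Assuming the proposed variables and accessor above were available (none of them are committed yet, so this is only a sketch of the intended usage, and the audit table layout is made up here), an auditing trigger could record them like this:

```sql
-- Hypothetical: TG_OPERATION, TG_SCHEMANAME, TG_OBJECTNAME and
-- pg_get_event_command_string() are only proposed at this point.
CREATE TABLE ddl_audit (
    logged_at  timestamptz DEFAULT now(),
    username   text        DEFAULT current_user,
    operation  text,
    schemaname text,
    objectname text,
    command    text
);

CREATE OR REPLACE FUNCTION audit_ddl()
  RETURNS event_trigger
  LANGUAGE plpgsql
AS $$
BEGIN
  INSERT INTO ddl_audit(operation, schemaname, objectname, command)
  VALUES (tg_operation, tg_schemaname, tg_objectname,
          pg_get_event_command_string());
END;
$$;

CREATE EVENT TRIGGER audit_all_ddl
  ON ddl_command_end
  EXECUTE PROCEDURE audit_ddl();
```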

Command String Normalisation

The main use case for the command string is Logical Replication. In that use case, it's very important to know in which schemas the objects are created (resp. dropped, altered) and, to some extent, which names are given to automatically named objects (indexes, constraints, sequences).

Unfortunately it's already clear that the INSTEAD OF feature is too hard to implement correctly in the first release of Event Triggers, because the call points in the backend code must be very carefully placed.

table_rewrite

The idea is to provide an event called table_rewrite that will fire any time a command that rewrites the whole table is about to be executed, so that the DBA can install a local policy about it (accept the rewrite only at night when it's not a full moon, say).
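Under that proposal, such a policy might look like the sketch below; the event name and firing semantics are still being discussed, so the syntax shown is only indicative, and the function name is made up:

```sql
-- Sketch only: refuse full-table rewrites during business hours.
CREATE OR REPLACE FUNCTION no_daytime_rewrite()
  RETURNS event_trigger
  LANGUAGE plpgsql
AS $$
BEGIN
  IF extract(hour FROM now()) BETWEEN 8 AND 18 THEN
    RAISE EXCEPTION 'table rewrites are only allowed at night';
  END IF;
END;
$$;

CREATE EVENT TRIGGER night_rewrites_only
  ON table_rewrite
  EXECUTE PROCEDURE no_daytime_rewrite();
```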

create table on insert

Some developers are getting used to schema-less databases nowadays, and want to have the behaviour that when they do

INSERT INTO look_me_i_dont_exist(key, value) VALUES(1, 'foo');

the table look_me_i_dont_exist is automatically created by PostgreSQL, with two columns: key of type integer (presumably) and value of type text in this example.