In our big-brother-esque age, we like to have everything under control. While in real life it may not always be a good thing, we certainly want to know what happens to our data.
I've seen all sort of solutions to log database changes, many of them in PHP or some other language, but IMHO that's not how it should be done.
Since the DBMS itself often offers the tools we need, let's see how to use them effectively.
Database table auditing seems to be an hot topic: there are ad-hoc tools ,
there's an example in the postgresql manual
and of course there are blog posts about it. Nothing new under the sun.
But I'd still like to talk about it :-), since we're going to see a different approach, and learn something about PostgreSQL triggers and functions in the process, aren't we?

The goal

So how is this approach different? The idea is not to store a 1:1 copy of the monitored table with a record for each change,
but to store each (and only) changed field in a record, with many details that we'll soon explore, using just one audit table (and not one for each monitored table).
We're taking one step further than what David Fetter suggested in the GeneralBits (2nd April 2005) .
This is the audit table structure (you may change the size of the fields according to your needs):

NB: this example will work for tables with a Primary Key on one field only.

Prerequisites

The following function requires the PL/TCL language to be loaded in PostgreSQL.
You can do it with this simple command:

$ createlang -U username pltcl dbname

We need to use PL/TCL since we can't reference NEW and OLD as arrays in PL/PgSQL.
Of course, if you don't have TCL/TK 8.4 installed, it's the first thing to do.
There are also some windows binaries provided by ActiveState.

N.B. for Windows users: To configure TCL, remember to set these two system environment variables:TCL_LIBRARY = /path/to/Tcl/lib/tcl8.4TCLLIBPATH = /path/to/Tcl/lib
otherwise you'll get this nasty and clueless error:ERROR: could not create "normal" interpreter.

The function is a bit long but should be easy to understand. First, we get the name of the current user and table, then we get the PK name with a query to the system catalogs and finally we log the event to the audit_table.
Depending on the event type (INSERT, UPDATE or DELETE), the function loops through the table fields, checks which one(s) changed, and stores a record into the audit_table for each changed field.

Create a TRIGGER for each table to monitor

Now that we have defined a generic function that should work with any table (with the limitation of it having a PK on one field only), we can "attach" it to a table with a simple trigger:

--DROP TRIGGER tg_audit_test1 ON test1;
CREATE TRIGGER "tg_audit_test1"
AFTER INSERT OR DELETE OR UPDATE ON "test1"
FOR EACH ROW
EXECUTE PROCEDURE "log_to_audit_table" ();
--DROP TRIGGER tg_audit_test2 ON test2;
CREATE TRIGGER "tg_audit_test2"
AFTER INSERT OR DELETE OR UPDATE ON "test2"
FOR EACH ROW
EXECUTE PROCEDURE "log_to_audit_table" ();

Just change the trigger name and the table name, no other change is needed.

Test the auditing procedure

Now that we've prepared everything, let's see how it works by inserting a record into a monitored table:

The audit_table registered the insertion, with one record per field.
Please note that it recognized the event happened on another table, and automatically retrieved the field names and values.
What if we update only one field?

Conclusion

In this tutorial we've seen how to write a PostgreSQL function in PL/TCL and a trigger to attach the function to some events on other tables.
As you can guess, the function as-is isn't perfect nor bullet-proof: you can't monitor tables with multiple-field PKs or LOBs,
and the audit_table can become huge in no time. Still, you may find it useful, if nothing else for learning purposes.

Till next time, take care.

Update

It looks like someone found this tutorial useful, and extended it to solve a few problems of the original solution. I recommend you to have a look at this blog entry too.

Update 2010-07-11

4 responses to "PostgreSQL table audit"

Hi Lorenzo,
Thank you very much for your contribution. I want to add a remark concerning line 4. I had some issues with tables which were spatially enabled with PostGIS. The relfilenode was unknown for $TG_relid. Actually, the values in columns "OID" are always the same as in the column "relfilenode" a part for spatially enabled tables. Hence, I changed the code to compare with OID and the function works properly. Did you have a particular reason to compare the variable $TG_relid with the column relfilenode?
I am curious. Hope my remark is of interest to you.
Best regards,
Quirin
_________________________________
Albert-Ludwigs-Universität Freiburg
Department of Microsystems Engineering
Georges-Köhler-Allee 103
79110 Freiburg - Germany
Tel: +49761/203-7158
Fax: +49761/203-7222
e-mail: quirin.hamp at imtek.uni-freiburg.de
Internet: http://www.imtek.uni-freiburg.de/emp

many thanks for such a great head start on pltcl triggers
in my version of pgsql8.4 it proved necessary to use
set tgname $TG_table_name
rather than
spi_exec \"SELECT relname AS tgname FROM pg_class WHERE relfilenode = $TG_relid\"
it works well with that one change although does not seem to handle tables with boolean fields. The update from jimmyorg\'s blog also does not handle it (at least for me).

Hey, thank you so much for this tutorial.
I would like ask you something.
For each row the trigger insert a new line on audit table. How I can do to make the trigger insert only one new line for the same transaction?
Or if it isn\'t possible...how I could get the primary key value inside a NEW or OLD record? It\'s possible?
Thank you so much.