News

Welcome to End Point's blog

Bucardo and truncate triggers

Version 8.4 of Postgres was recently released. One of the features that hasn't gotten a lot of press, but which I'm excited about, is truncate triggers. This fixes a critical hole in trigger-based PostgreSQL replication systems, and support for these new triggers is now working in the Bucardo replication program.

Truncate triggers were added to Postgres by Simon Riggs (thanks Simon!), and unlike other types of triggers (UPDATE, DELETE, and INSERT), they are statement-level only, as truncate is not a row-level action.

Here's a quick demo showing off the new triggers. This is using the development version of Bucardo - a major new version is expected to be released in the next week or two that will include truncate trigger support and many other things. If you want to try this out for yourself, just run:

$ git clone git-clone http://bucardo.org/bucardo.git/

Bucardo does three types of replication; for this example, we'll be using the 'pushdelta' method, which is your basic "master to slaves" relationship. In addition to the master database (which we'll name A) and the slave database (which we'll name B), we'll create a third database for Bucardo itself.

$ initdb -D bcdata
$ initdb -D testA
$ initdb -D testB

(Technically, we are creating three new database clusters, and since we are doing this as the postgres user, the default database for all three will be 'postgres')

Before we go any further, let's install Bucardo itself. Bucardo is a Perl daemon that uses a central database to store its configuration information. The first step is to create the Bucardo schema. This, like almost everything else with Bucardo, is done with the 'bucardo_ctl' script. The install process is interactive:

$ bucardo_ctl install --dbport=5400
This will install the bucardo database into an existing Postgres cluster.
Postgres must have been compiled with Perl support,
and you must connect as a superuser
We will create a new superuser named 'bucardo',
and make it the owner of a new database named 'bucardo'
Current connection settings:
1. Host:
2. Port: 5400
3. User: postgres
4. PID directory: /var/run/bucardo
Enter a number to change it, P to proceed, or Q to quit: P
Version is: 8.4
Attempting to create and populate the bucardo database and schema
Database creation is complete
Connecting to database 'bucardo' as user 'bucardo'
Updated configuration setting "piddir"
Installation is now complete.
If you see any unexpected errors above, please report them to bucardo-general@bucardo.org
You should probably check over the configuration variables next, by running:
bucardo_ctl show all
Change any setting by using: bucardo_ctl set foo=bar

Because we don't want to tell the bucardo_ctl program our custom port each time we call it, we'll store that info into the ~/.bucardorc file:

$ echo dbport=5400 > ~/.bucardorc

Let's double check that everything went okay by checking the list of databases that Bucardo knows about:

$ bucardo_ctl list db
There are no entries in the 'db' table.

Time to teach Bucardo about our two new databases. The format for the add commands is: bucardo_ctl add [type of thing] [name of thing within the database] [arguments of foo=bar format]

Note that by default we connect as the 'bucardo' user. This is a highly recommended practice, for safety and auditing. Since that user obviously does not exist on the newly created databases, we need to add them in:

Looks fine, so let's try out the truncate. On versions of Postgres less than 8.4, there was no way for Bucardo (or Slony) to know that a truncate had been run, so the rows were removed from the master but not from the slave. We'll do a truncate and add a new row in a single operation:

As mentioned, there are three types of syncs in Bucardo. The other type that can make use of truncate triggers is the 'swap' sync, aka "master to master". I've not yet decided on the behavior for such syncs, but one possibility is simply:

Database A gets truncated at time X

Bucardo truncates database B, then discards all delta rows older than X for both A and B, and all delta rows for B

6 comments:

Anonymous
said...

The example is a bit confusing. Where you say you're creating the databases, you use the names testA, testB and bcdata. However, the commands given are not createdb's but initdb's which create database clusters (including a default database named "postgres"). The CREATE TABLE statements presumably operate on the latter.

In the commands that add the databases to Bucardo, e.g.,

$ bucardo_ctl add database postgres name=master port=5401

it's not entirely clear that "postgres" is the real database name (and not a database type) and the "name"s are more like aliases.

You mention that your idea on how to handle truncates for swap syncs is "one possibility". Are there others that would be valid? None come to mind for me.

Semantically a truncate is equivalent to removing all rows any other atomic way (is there any other way with Postgres besides DELETE?), so the way you describe seems like the only valid way to handle a truncate. I'd be interested to hear of any others I'm not thinking of.

Jon: actually, the way I laid out won't really work, for as soon as you truncate B, *all* of its delta rows are meaningless. So things are good if we truncate A and Bucardo truncates B shortly afterwards, but a time gap will erase changes made on B that occurred after the truncate on A. The canonical way to do it I suppose would be to record all the current rows and enter them in the delta tables (as if they were deleted). Maybe we need an option to allow it to be done that "safe" way versus the "quick" way of just blowing away B. I'll update the post to reflect the alternate strategy.

Re multi-masterOption of expanding the truncate seems the cleanest to me.

Option#2 (idea)It occurs to me on a quick thought that if second level masters have deltas timestamped then if you would add a timestamped log of truncates on master it would be possible to replicate truncate on second level masters as following (pseudo) SQL:DELETE FROM table WHERE rec NOT IN (SELECT rec FROM bucardo_delta WHERE txntime > truncate.time)Which should preserve space in master delta and hence amount of data to transfer.

Goran: I've considered something like that, but it would be a radical shift from "row-based" comparison to "timestamp-based" comparisons. In other words, right now it does not matter how in sync two databases are with regards to their timeofday() calls. The problem is compounded by the fact that we don't track what type of action was performed (update vs insert), so someone could truncate the table on A, then have it reappear because someone updated a row on B before Bucardo came over and replicated the truncation. On the other hand, truncation is such a severe and (presumably) rare thing to do for MM replication, I'm willing to cut any solution some slack. :) I'm thinking overall a before truncate trigger that populates bucardo_delta with all current rows might be expensive with regards to the amount of stored rows, but safest overall: in the above scenario, the 'delete' on A would trump the update on B, and the row would be removed from B, rather than added back to A.