Bucardo replication from Postgres to sqlite and mariadb using pgbench

While Bucardo is known for doing “multi-master” Postgres replication, it can do a lot more than simple “master to master” replication (better known as “source to source” replication). As people have been asking for simple Bucardo Bucardo 5 recipes based on pgbench, I decided to present a few here. Since Bucardo allows any number of sources and targets, I will demonstrate a source-source-source-target replication. Targets do not have to be Postgres, so let’s also show that we can do source—​MariaDB—​SQLite replication. Because my own boxes are so customized, I find it easier and more honest when writing demos to start with a fresh system, which also allows you to follow along at home. For this example, I decided to fire up Amazon Web Services (AWS) again.

After logging in at https://aws.amazon.com, I visited the AWS Management Console, selected “EC2”, clicked on “Launch Instance”, and picked the Amazon Linux AMI (in this case, “Amazon Linux AMI 2015.03 (HVM), SSD Volume Type—​ami-1ecae776”). Demos like this require very little resources, so choosing the smallest AMI (t2.micro) is more than sufficient. After waiting a couple of minutes for it to start up, I was able to SSH in and begin. The first order of business is always updating the box and installing some standard tools. After that I make sure we can install the most recent version of Postgres. I’ll skip the initial steps and jump to the Major Problem I encountered:

$ sudo yum install postgresql94-plperl
Error: Package: postgresql94-plperl-9.4.4-1PGDG.rhel6.x86_64 (pgdg94)
Requires: perl(:MODULE_COMPAT_5.10.1)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

Well, that’s not good (and the “You could try” are useless in this case). Although all the other Postgres packages installed without a problem (postgresql94, postgresql94-server, and postgresql94-libs), there is a major incompatibility preventing Pl/Perl from working. Basically, the rpm was compiled against Perl version 5.10, but Amazon Linux is using 5.16! There are many solutions to this problem, from using perlbrew, to downgrading the system Perl, to compiling Postgres manually. However, this is the Age of the Cloud, so a simpler solution is to ditch this AMI and pick a different one. I decided to try a RHEL (Red Hat Enterprise Linux) AMI. Again, I used a t2.micro instance and launched RHEL-7.1 (AMI ID RHEL-7.1_HVM_GA-20150225-x86_64-1-Hourly2-GP2). As always when starting up an instance, the first order of business when logging in is to update the box. Then I installed some important tools, and set about getting the latest and greatest version of Postgres up and running:

Luckily, there is excellent support for Postgres packaging on most distros. The first step is to find a rpm to use to get the “pgdg” yum repository in place. Visit http://yum.postgresql.org/ and choose the latest version (as of this writing, 9.4). Then find your distro, and copy the link to the rpm. Going back to the AWS box, add it in like this:

This installs a new entry in the /etc/yum.repos.d/ directory named pgdg-94-redhat.repo. However, we want to make sure that we never touch the old, stale versions of Postgres provided by the native yum repos. Keeping it from appearing is as simple as finding out which repo it is in, and adding an exclusion to that repository section by writing exclude=postgres*. Finally, we verify that all yum searches for Postgres return only the 9.4 items:

## We saw above that repo was "rhui-REGION-rhel-server-release"
## Thus, we know which file to edit
$ sudo emacs /etc/yum.repos.d/redhat-rhui.repo
## At the end of the [rhui-REGION-rhel-server-releases] section, add this line:
exclude=postgres*
## Now we can retry the exact same command as above
$ sudo yum list postgresql*-server
Loaded plugins: amazon-id, rhui-lb
Installed Packages
postgresql94-server.x86_64 9.4.4-1PGDG.rhel7 pgdg94

Now it is time to install Postgres 9.4. Bucardo currently needs to use Pl/Perl, so we will install that package (which will also install the core Postgres packages for us). As we are going to need the pgbench utility, we also need to install the postgresql-contrib package.

$ sudo yum install postgresql-plperl postgresql-contrib

This time it went fine—​and Perl is at 5.16.3. The next step is to start Postgres up. Red Hat has gotten on the
systemd bandwagon, for better or for worse, so gone is the familiar
/etc/init.d/postgresql script. Instead, we need to use systemctl. We will find the exact service name, enable it, then try to start it up:

As in the pre-systemd days, we need to run initdb before we can start Postgres. However, the simplicity of the init.d script is gone (e.g. “service postgresql initdb”). Poking in the systemd logs reveals the solution:

That’s ugly output, but what can you do? Let’s run initdb, start things up, and create a test database. As I really like to use
Postgres with checksums,
we can set the environment variables to pass that flag to initdb. After that completes, we can startup Postgres.

Now that Postgres is up and running, it is time to create some test databases and populate them via the pgbench utility. First, a few things to make life easier. Because pgbench installs into /usr/pgsql-9.4/bin, which is certainly not in anyone’s PATH, we will put it in a better location. We also want to loosen the Postgres login restrictions, and reload Postgres so it takes effect:

Before Bucardo can be fully installed, some dependencies must be installed. What you need will depend on what your particular OS already has.
For RHEL 7.1, this means a few things via yum, as well as some things via the cpan program:

Time to test that it works. The initial database, “test1”, should have many rows in the pgbench_accounts table, while the other databases should have none. Once we update some of the rows in the test1 database, it should replicate to all the others. Changes in test2 and test3 should go everywhere as well, because they are source databases. Changes made to the database test4 should stay in test4, as it is only a target.

Let’s create one more sync—​this time, we want to replicate our Postgres data to a MariaDB and a SQLite database.
(Bucardo can also do systems like Oracle, but getting it up and running is NOT an easy task for a quick
demo like this!). The first step is to get both systems up and running, and provide them with a copy of the pgbench schema:

Now we can create the necessary tables for both. Note that SQLite does not allow you to change a table's structure once it has been created, so we cannot use the MySQL/Postgres way of using ALTER TABLE after the fact to add the primary keys. Knowing this, we can put everything into the CREATE TABLE statement. This schema will work on all of our systems:

Teach Bucardo about these new databases, then add them to a new sync. As we do not want changes to get immediately replicated, we set this sync to “autokick off”. This will ensure that the sync will only run when it is manually started via the “bucardo kick” command. Since database C is also part of another Bucardo sync and may get rows written to it that way, we need to set it as a “makedelta” database, which ensures that the replicated rows from the other sync are replicated onwards in our new sync.

Excellent. Everything is working as expected. Note that the changes from the test4 database were not replicated onwards, as test4 is not a source database. Feel free to ask any questions in the comments below, or better still, on the Bucardo mailing list.