Friday, November 28, 2008

postgresql replication using slony-I

As most postgresql users must be knowing, postgresql does not provide any inbuilt replication solution. There are lots of 3rd party replication products available for postgresql. Slony is one of them. Slony is a trigger based replication solution, that is it used triggers to push data to the slave. Slony is supposed to be one of the most stable replication solutions for postgresql.

You can download slony from www.slony.info. There are two major versions of slony - slony-I & slony-II. Slony-I is a simple master-slave replication solution. Whereas slony-II is a advanced multi-master replication solution. We will go ahead with simple master-slave replication solution. So we will download Slony-I. The latest version available is Slony-I 1.2.15. Slony-I 2.0 is in RC and should be soon released. But we will go with a stable release - 1.2.15.

Postgresql version being used is 8.3.3. To install slony, simply untar the downloaded file and run ./configure --with-pgconfigdir=<path to pg_config>makesudo make install

I have used two machines for setting up replication. Installed postgresql and slony-I on both of them.

master server ip : 172.16.3.211slave server ip : 172.16.3.173

We will be working with the superuser postgres which is used to start and stop the postgresql server.

Quick steps

Define environment variables on master & slave. The main purpose is to make our task easier. Lets create an env.sh file containing all the definitions.

As you can see here, my postgresql is installed in /usr/local/pgsql. I have defined the IP addresses & ports of master and slave servers. I have used the superuser postgres for replication. And i have defined the master and slave databases to be used for replication. You can replicate between databases with different names on master and slave - just change the names in all the scripts.

Since slony-I depends on triggers for replication, you will need to install the plsql procedural language on master to generate and run triggers & stored procedures for pushing data to slave./usr/local/pgsql/bin/createlang -h $MASTERHOST -p $MASTERPORT plpgsql $MASTERDBNAME

Put some tables in the $MASTERDBNAME on master, which you want to replicate. And port the tables to slave. It has to be done manually.

And now configure the databases for replication. When you install Slony-I, it puts two binaries slonik and slon in the pgsql/bin directory. Slonik is the tool which is used for creating configuration tables, stored procedures and triggers. All we need to do is create a configuration file to pass it to the slonik tool. Here i am assuming that there are two tables which need to be replicated - parent & child.

vim replconfig.cnf # define the namespace the replication system uses in our example it is # replcluster cluster name = replcluster; # admin conninfo's are used by slonik to connect to the nodes one for each # node on each side of the cluster, the syntax is that of PQconnectdb in # the C-API node 1 admin conninfo = 'dbname=repltestdb host=172.16.3.211 port=5432 user=postgres'; node 2 admin conninfo = 'dbname=repltestdb host=172.16.3.173 port=5432 user=postgres'; # init the first node. Its id MUST be 1. This creates the schema # _$CLUSTERNAME containing all replication system specific database # objects. init cluster ( id=1, comment = 'Master Node'); # Add unique keys to table that do not have one. # This command adds a bigint column named "_Slony-I_$CLUSTERNAME_rowID" to the table which will have a default value of nextval('_$CLUSTERNAME.s1_rowid_seq') and have UNIQUE & NOT NULL constraints applied on it. # table add key (node id = 1, fully qualified name = 'table_name'); # Slony-I organizes tables into sets. The smallest unit a node can # subscribe is a set. # you need to have a set add table() for each table you wish to replicate create set (id=1, origin=1, comment='parent child table') set add table (set id=1, origin=1, id=1, fully qualified name = 'public.parent', comment='parent table'); set add table (set id=1, origin=1, id=2, fully qualified name = 'public.child', comment='child table'); # Create the second node (the slave) tell the 2 nodes how to connect to # each other and how they should listen for events. store node (id=2, comment = 'Slave node'); store path (server = 1, client = 2, conninfo='dbname=repltestdb host=172.16.3.211 port=5432 user=postgres'); store path (server = 2, client = 1, conninfo='dbname=repltestdb host=172.16.3.173 port=5432 user=postgres'); store listen (origin=1, provider = 1, receiver =2); store listen (origin=2, provider = 2, receiver =1);

Now everything is setup and from the slon.log files on master and slave you can see that both the servers are trying to sync with each other. But still replication is not on way. To start replication we need to make the slave subscribe to the master. Here is the required config file for doing this

Passing this file to slonik will do the trick and replication would start happening.

/usr/local/pgsql/bin/slonik startrepl.cnf

Now simply make some inserts, updates and deletes on the master and check out whether they are happening on the slave as well. Officially, since replication is on full swing all changes in master tables should be replicated on the slave.

Please note that new tables & changes to table structures wont be replicated automatically. So whenever a new table is created or an existing table is altered the changes has to be manually propagated to slave and the scripts need to be run to make appropriate changes in the triggers and config tables.

Another important thing to note is that postgresql on master and slave should be able to communicate with both the ip addresses. For this add the ip addresses in the pgsql/data/pg_hba.conf.

For the able replication i had added the lines host all all 172.16.3.211/32 trusthost all all 172.16.3.173/32 trustto the pg_hba.conf file in both master & slave.