I have written some posts in the past on how you can make your PostgreSQL deployment highly available by using PostgreSQL’s streaming replication feature (1, 2). The main issue you’ll have to resolve with such a setup is how the application can be made aware of the new master after a failover has happened. You could use EDB Failover Manager (1, 2, 3, 4) for that, because it provides the functionality to move a VIP from one host to another, so the application can always connect to the very same IP address no matter where the current master is running (EDB EFM requires a subscription). You could also use Pacemaker and Corosync for that. But, and this is the scope of this post, you can also use pgpool, which is widely known in the PostgreSQL community. When you configure it the right way you can even spread your read operations over all hot standby servers in your configuration, while only write operations go to the master. This allows you to scale your PostgreSQL deployment horizontally by adding more standby nodes when you need more resources. Let’s go …

To start with, a picture is always a good idea. This is what we want to set up:

We will have two nodes dedicated to pgpool (centos7_pgpool_m1/m2). pgpool will be running in a watchdog configuration so that one node can take over in case the other goes down. pgpool will provide a virtual IP address for the clients to connect to (which fails over to the surviving node in case a node goes down for any reason). In the background there are two nodes which host the PostgreSQL 9.6.2 primary and hot standby instances (centos7_pgpool_1/2). At the very beginning the master is running on centos7_pgpool_1 although that does not really matter once the whole setup is completed.

I’ll not describe the setup of the PostgreSQL master->standby configuration here. If you need assistance with that, take a look here or here, or search the web; there are many great howtos.

You can download pgpool here. As pgpool requires libpq, we’ll install the PostgreSQL binaries on the hosts dedicated to pgpool as well before proceeding with the installation of pgpool. Of course these steps need to be done on both hosts (centos7_pgpool_m1/m2):
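As a rough sketch this could look like the following (the pgpool version, the PGDG package names and the installation prefix are assumptions here; adjust them to what you actually downloaded):

sudo yum install -y postgresql96 postgresql96-devel   # provides libpq and pg_config
tar -axf pgpool-II-3.6.2.tar.gz
cd pgpool-II-3.6.2
./configure --prefix=/u01/app/postgres/product/pgpool-II --with-pgsql=/usr/pgsql-9.6
make
sudo make install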

For switching the VIP from one host to another, pgpool must be able to bring up and shut down the virtual interface. You could use sudo for that, or set the setuid bit on the ifconfig and arping binaries:
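For the setuid approach this could look like this (run as root on both pgpool hosts; the binary locations below are the CentOS 7 defaults and may differ on your distribution):

chmod u+s /usr/sbin/ifconfig
chmod u+s /usr/sbin/arping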

The other important configuration file for pgpool is the pcp.conf file. This file holds the authentication for pgpool itself and requires a user name and an md5-hashed password. To generate the password you can use the pg_md5 utility which comes with the installation of pgpool:
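A minimal sketch, assuming the installation prefix from above, the user name “postgres” and a made-up password (pcp.conf lives under the etc directory of the pgpool installation; the format is username:md5hash):

/u01/app/postgres/product/pgpool-II/bin/pg_md5 mysecretpassword
# prints the md5 hash of the given password
echo "postgres:$(/u01/app/postgres/product/pgpool-II/bin/pg_md5 mysecretpassword)" >> /u01/app/postgres/product/pgpool-II/etc/pcp.conf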

The various *_path variables are obvious: they tell pgpool where to find the binaries for ping, arping and ifconfig (you can also use the ip command instead). The other_*0 variables specify which other host runs a pgpool instance, and on which pgpool and watchdog ports. This is essential for the communication between the two pgpool hosts. And then we have the commands to bring the virtual interface up and down (if_up_cmd, if_down_cmd). In addition we need an address for the virtual interface, which is specified by the “delegate_IP” variable. Let’s see if it works and start pgpool on both nodes:
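For reference, the watchdog-related excerpt of pgpool.conf on centos7_pgpool_m1 could look like this (the interface name eth0, the delegate IP and the ports are examples; on centos7_pgpool_m2 the wd_hostname and other_* parameters have to point back to m1):

use_watchdog = on
wd_hostname = 'centos7_pgpool_m1'
wd_port = 9000
delegate_IP = '192.168.22.38'
ping_path = '/bin'
if_cmd_path = '/sbin'
arping_path = '/usr/sbin'
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth0:0 down'
arping_cmd = 'arping -U $_IP_$ -w 1'
other_pgpool_hostname0 = 'centos7_pgpool_m2'
other_pgpool_port0 = 9999
other_wd_port0 = 9000

Once that is in place, pgpool can be started on both nodes (it daemonizes by default; add -n to keep it in the foreground for debugging):

pgpool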

Cool, now we have a VIP the application can connect to, which switches between the pgpool hosts automatically in case the host it currently runs on experiences an issue or is shut down intentionally. There is a pcp command which shows you more details regarding the watchdog:
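A sketch, assuming the pcp user created above and pgpool’s default pcp port 9898 (the command prompts for the pcp password):

pcp_watchdog_info -h centos7_pgpool_m1 -p 9898 -U postgres -v
# -v prints per-node watchdog details, including which node currently holds the VIP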

To summarize: we now have a pgpool instance running on two nodes. Only one of these nodes hosts the VIP, and the VIP switches to the other host in case there is an issue. From now on, client connections can go to the VIP and pgpool will redirect each connection to one of the PostgreSQL nodes (depending on whether it is a write or a pure read operation).
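For example, with the delegate IP from above and pgpool listening on its default port 9999, a client session could look like this. Note that “SHOW pool_nodes” is a pgpool-specific command: it only works when you are connected through pgpool, not when you are connected directly to one of the PostgreSQL instances:

psql -h 192.168.22.38 -p 9999 -U postgres postgres
postgres=# show pool_nodes;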

In the next post we’ll dig deeper into the pgpool configuration: how you can tell on which instance you actually landed, and how we can instruct pgpool to automatically promote a new master, disconnect the old master and rebuild the old master as a new standby that follows the new master.

10 Comments

Hi,
Thank you for writing this awesome article, it all works fine, but when I try to execute the query “SHOW pool_nodes”, I get this error:
ERROR: unrecognized configuration parameter “pool_nodes”
Can you please assist me with how to fix this?

1- heartbeat_destination0 =
I can see that you have specified the same value ('host0_ip1') for this parameter in the 'pgpool.conf' files of both pgpool instances. Is this correct?
I also want to understand the use of this parameter: should it point to the pgpool hosts (m1/m2) or to the DB hosts (primary/slave)?

2- heartbeat_device0 = ''
Should I leave this parameter empty? Just FYI, in my case the NIC name on both pgpool nodes (m1/m2) is 'eth1'.

Thank you very much, Daniel, for the clarification.
Could you please answer my first question too (I’m posting it here again):
————-
Hi Daniel,

Could you please help me with the two parameters below:

1- heartbeat_destination0 =
I can see that you have specified the same value ('host0_ip1') for this parameter in the 'pgpool.conf' files of both pgpool instances. Is this correct?
I also want to understand the use of this parameter: should it point to the pgpool hosts (m1/m2) or to the DB hosts (primary/slave)?

2- heartbeat_device0 = ''
Should I leave this parameter empty? Just FYI, in my case the NIC name on both pgpool nodes (m1/m2) is 'eth1'.