Create a Highly Available PostgreSQL Cluster Using Patroni and HAProxy

What is PostgreSQL?

PostgreSQL (Postgres) is an open-source, fully ACID compliant relational database that runs on all major operating systems. While Postgres is a highly versatile, feature-rich, and powerful database, it doesn’t have a built-in solution for high availability.

This guide shows you how to create a highly available Postgres cluster of three servers using Patroni.

Before You Begin

Familiarize yourself with our Getting Started guide and familiarize yourself with SSH and connecting to your linode.

This guide will use sudo wherever possible. Complete the sections of our Securing Your Server to create a standard user account and harden SSH access.

Update your system:

sudo apt update && sudo apt upgrade

Create five Linodes on your account, all within the same data center. Take note of each Linode’s private IP address

Note

This guide is written for a non-root user. Commands that require elevated privileges are prefixed with sudo. If you’re not familiar with the sudo command, see the Users and Groups guide.

Install PostgreSQL

Install Postgres on three Linodes in your setup. Because the configuration in this guide uses private IP addresses to communicate between Linodes in the same data center, this setup may not meet certain Highly Available requirements. For more information about private IPs, visit our Remote Access guide.

The examples in this guide assign the private IP addresses of the three Postgres Linodes 192.0.2.11, 192.0.2.12 and 192.0.2.13. To setup a private IP address on a Linode, refer to the Remote Access guide for more information.

On the three Linodes where you want to install Postgres, update the package lists:

sudo apt update

Install Postgres:

sudo apt install postgresql-9.5 -y

Upon installation, Postgres automatically runs as a service. Stop the Postgres service so that Patroni can manage it from this point on:

sudo systemctl stop postgresql

Patroni uses utilities that come installed with Postgres, located in the /usr/lib/postgresql/9.5/bin directory by default on Ubuntu 16.04. Create symbolic links in the PATH to ensure that Patroni can find the utilities:

sudo ln -s /usr/lib/postgresql/9.5/bin/* /usr/sbin/

Instead of creating symlinks, you can include the /usr/lib/postgresql/9.5/bin directory in your PATH.

Repeat these steps on each of the three Linodes.

Install Patroni

Patroni is an open-source Python package that manages Postgres configuration. It can be configured to handle tasks like replication, backups and restorations.

In this guide, you will use Patroni to:

Configure the Postgres instance running on the same server

Configure replication from master to slaves

Automatically failover to the best slave in case the master goes down.

Install python and pip:

sudo apt install python python-pip -y

Ensure that you have latest version of the setuptools python package:

sudo pip install --upgrade setuptools

Use pip to install Patroni:

sudo pip install patroni

Repeat these steps on each of the three Linodes.

Install etcd

Etcd is a fault-tolerant, distributed key-value store that is used to store the state of the Postgres cluster. Via Patroni, all of the Postgres nodes make use of etcd to keep the Postgres cluster up and running.

In this guide you use a single-server etcd cluster. However, in production, it may be best to use a larger etcd cluster so that one etcd node fails, it doesn’t affect your Postgres servers.

On the Linode where you want etcd installed, update the package lists:

sudo apt update

Install etcd:

sudo apt install etcd -y

The remainder of this guide uses 192.0.2.21 as the private IP address of this Linode.

Install HAProxy

When developing an application that uses a database, it can be cumbersome to keep track of the database endpoints if they keep changing. Using HAProxy simplifies this by giving a single endpoint to which you can connect the application.

HAProxy forwards the connection to whichever node is currently the master. It does this using a REST endpoint that Patroni provides. Patroni ensures that, at any given time, only the master Postgres node will appear as online, forcing HAProxy to connect to the correct node.

On the Linode where you want HAProxy installed, update the package lists:

sudo apt update

Install HAProxy:

sudo apt install haproxy -y

This guide uses 192.0.2.31 as the private IP address of this server and 203.0.113.1 as its public IP address.

Configure Patroni

Patroni can be configured using a YAML file which can be placed anywhere. In this guide, you will place this file at /etc/patroni.yml.

Create a patroni.yml file on all three Linodes that have Postgres and Patroni installed (192.0.2.11, 192.0.2.12, and 192.0.2.13 in this guide). Change name to something unique, and change listen and connect_address (under postgresql and restapi) to the appropriate values on each Linode.

INFO: no action. i am a secondary and i am following a leader
Lock owner: postgresql0; I am postgresql2

Repeat these steps on each of the three Linodes with Postgres installed to create a highly available Postgres cluster with one master and two slaves.

Configure HAProxy

With the Postgres cluster set up, you need a way to connect to the master regardless of which of the servers in the cluster is the master. This is where HAProxy comes in. All Postgres clients (your applications, psql, etc.) will connect to HAProxy which will make sure you connect to the master in the cluster.

On the Linode that has HAProxy installed, edit the configuration file at /etc/haproxy/haproxy.cfg to contain the following:

This configuration exposes HAProxy stats on a public URL. In a production setup, it might be better to restrict this to an internal network/localhost and access it via an SSH tunnel.

Restart HAProxy to use the new settings:

sudo systemctl restart haproxy

If HAProxy fails to start, check for syntax errors:

/usr/sbin/haproxy -c -V -f /etc/haproxy/haproxy.cfg

Test the Setup

Connect Postgres clients to the public IP address of the Linode on which you installed HAProxy (in this guide, 203.0.113.1) on port 5000.

You can also connect to the HAProxy Linode on port 7000 to see the HAProxy dashboard:

In the postgres section, the postgresql_192.0.2.11_5432 row is highlighted in green. This indicates that 192.0.2.11 is currently acting as the master.

If you kill the primary Linode (using sudo systemctl stop patroni or by shutting down the server), the dashboard will look similar to:

In the postgres section, the postgresql_192.0.2.11_5432 row is now red and the postgresql_192.0.2.13_5432 row is highlighted in green. This indicates that 192.0.2.13 is currently acting as the master.

Note

In this case, it just so happens that the third Postgres server is promoted to master. This might not always be the case. It is equally likely that the second server may be promoted to master.

When you now bring up the first server, it will rejoin the cluster as a slave and will sync up with the master.

You now have a robust, highly available Postgres cluster ready for use.

Possible Next Steps

While the setup in this guide should go far in making your Postgres deployment highly available, here are steps you can take to improve it further:

Use a larger etcd cluster to improve availability.

Use PgBouncer to pool connections.

Add another HAProxy server and configure IP failover to create a highly available HAProxy cluster.

More Information

You may wish to consult the following resources for additional information on this topic. While these are provided in the hope that they will be useful, please note that we cannot vouch for the accuracy or timeliness of externally hosted materials.