HammerDB Best Practice for PostgreSQL Performance and Scalability

This post is a HOWTO guide to system configuration for achieving top levels of performance with the HammerDB PostgreSQL TPC-C test. I am an Intel employee (#IAMINTEL), so the examples are taken from a PostgreSQL on Linux on Intel system. The approach is the same for whatever system you are testing, although some of the settings you see may be different.

First, for system choice, a 2 socket system is optimal for PostgreSQL OLTP performance at the time of writing. This limitation is at the database level rather than the hardware level. Nevertheless, with up-to-date hardware (from mid-2018), PostgreSQL on a 2 socket system can be expected to deliver more than 2M PostgreSQL TPM and 1M NOPM with the HammerDB TPC-C test.

I/O

On the 2 socket system you will need I/O that is able to keep up with writing to the WAL. An example is an NVMe PCIe SSD formatted as an XFS file system and mounted as below:

/dev/nvme0n1p1 /mnt/ssd xfs noatime

The OS username you use can be anything, but for people familiar with Oracle this becomes the equivalent of the system username. Many people use postgres.

CPU Configuration

Make sure that your CPU is configured for optimal performance. To do this, make sure that the Linux tools package is installed. Once it is, a number of tools are available under the /usr/lib/linux-tools directory, in a subdirectory named for the specific kernel.
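The command that applies the setting is not shown at this point. As a minimal sketch, assuming the cpupower tool from that directory is on your PATH and you are running as root, the performance governor could be selected on all CPUs like this (the GOVERNOR variable is only there for illustration):

```shell
# Assumption: cpupower is on PATH (in this post it lives under
# /usr/lib/linux-tools/<kernel-name>) and the script is run as root.
GOVERNOR=performance

if command -v cpupower >/dev/null 2>&1; then
    # Apply the governor to every CPU.
    cpupower -c all frequency-set -g "$GOVERNOR" ||
        echo "could not set governor (are you root?)" >&2
else
    echo "cpupower not found; install the linux-tools package for your kernel" >&2
fi
```

The setting does not persist across a reboot, so it is worth re-checking before each test run.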

Check that the settings have been applied and that the frequency settings are as expected. In the following output, the key things to check are that the driver is shown as intel_pstate (for Intel CPUs), the governor shows as performance, the frequency range goes to the maximum frequency for the CPU, and boost state is supported (if your CPU supports turbo boost).

# ./cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: Cannot determine or is not supported.
  hardware limits: 1000 MHz - 3.80 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 1000 MHz and 3.80 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 1.99 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes

Also check that the idle settings are enabled and (for Intel CPUs) that the intel_idle driver is used. Note that the pstate and cstate drivers work together to provide the most efficient use of turbo boost. Initially it is best to leave all of these states enabled. Disabling all C-states or idle states is likely to reduce overall performance by reducing the available turbo frequency. You can experiment with the C-states between ‘all or nothing’ to find the optimal settings.
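One way to make this check, sketched here under the assumption that the kernel exposes the cpuidle driver through sysfs (the helper name idle_driver is made up for this example):

```shell
# Print the active cpuidle driver, or "unknown" if sysfs does not expose it.
# On Intel CPUs the expected value is intel_idle.
idle_driver() {
    cat /sys/devices/system/cpu/cpuidle/current_driver 2>/dev/null || echo unknown
}
echo "cpuidle driver: $(idle_driver)"

# Per-state detail (which idle states exist and whether any are disabled),
# if cpupower is available.
if command -v cpupower >/dev/null 2>&1; then
    cpupower idle-info || true
fi
```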

You are not quite done. There is another tool in the directory called x86_energy_perf_policy that determines how the boost states are used. By default this is set to normal so you will want to set it to performance.
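A hedged sketch of that change follows. It assumes the tool is on your PATH, that you are root, and that your version accepts -r to read the current setting and a bare policy name to set it; option syntax differs between versions, so check the man page shipped with your kernel tools. The POLICY variable is only illustrative.

```shell
# Assumption: x86_energy_perf_policy is on PATH and run as root. The tool
# typically needs the msr kernel module to be loaded.
POLICY=performance

if command -v x86_energy_perf_policy >/dev/null 2>&1; then
    x86_energy_perf_policy -r || true      # show the current setting
    x86_energy_perf_policy "$POLICY" ||
        echo "could not set policy (root and the msr module needed?)" >&2
else
    echo "x86_energy_perf_policy not found in PATH" >&2
fi
```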

To reiterate, this will only be when you have reached the stage of starting the database, so first you will need to install the software.

Install PostgreSQL from Source

Make sure that you have the necessary software to compile PostgreSQL from source. Then download the PostgreSQL source and extract it to the PostgreSQL file system configured above. Although 9.6.5 has been used here, feel free to download and use the latest release.

Find and edit the file pg_config.h (after first running configure):

find . -name pg_config.h -print
./src/include/pg_config.h

Change it so that it looks like the example below. This sets the WAL file size to 1GB instead of the default 16MB.

/* XLOG_SEG_SIZE is the size of a single WAL file. This must be a power of 2
   and larger than XLOG_BLCKSZ (preferably, a great deal larger than
   XLOG_BLCKSZ). Changing XLOG_SEG_SIZE requires an initdb. */
#define XLOG_SEG_SIZE (1024 * 1024 * 1024)
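As a quick sanity check of the arithmetic, the following sketch evaluates the new value in shell, compares it with the 16MB default, and confirms the power-of-2 requirement from the comment. The variable names are made up for this example.

```shell
seg_bytes=$((1024 * 1024 * 1024))       # new XLOG_SEG_SIZE
default_bytes=$((16 * 1024 * 1024))     # default 16MB segment

echo "segment size: $seg_bytes bytes ($((seg_bytes / 1024 / 1024)) MB)"
echo "relative to default: $((seg_bytes / default_bytes))x"

# A power of 2 has exactly one bit set, so n & (n - 1) is zero.
if [ $((seg_bytes & (seg_bytes - 1))) -eq 0 ]; then
    echo "power of 2: yes"
fi
```

This prints a segment size of 1073741824 bytes (1024 MB), which is 64 times the default.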

The default location for the install is /usr/local/pgsql, so the easiest way to configure it is to create a symbolic link at this location that points to a directory on your SSD, using the “ln -s” command.
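The exact command is not shown, so here is a minimal sketch. The helper name link_pg_prefix and the directory /mnt/ssd/pgsql are assumptions for illustration; only the /mnt/ssd mount point and /usr/local/pgsql come from this post.

```shell
# link_pg_prefix TARGET LINK
# Create directory TARGET (on the SSD) and make LINK a symbolic link to it.
# LINK must not already exist.
link_pg_prefix() {
    mkdir -p "$1" && ln -s "$1" "$2"
}

# For the layout in this post, run as root
# (/mnt/ssd/pgsql is a hypothetical directory name):
#   link_pg_prefix /mnt/ssd/pgsql /usr/local/pgsql
```

Remember that the OS user that will own the database needs write access to the target directory.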

Now follow the steps to compile PostgreSQL from source. When you run “make install” it will place the binaries in this location. You also need to create a new directory called “data”. If all went well you should now have a directory that looks like this:

postgres:/usr/local/pgsql$ ls
bin data include lib share
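The build sequence that produces this layout can be sketched as follows. This is not the exact procedure used for this post: it assumes GNU sed, the default install prefix, and a 9.6-style source tree where XLOG_SEG_SIZE is defined in pg_config.h. The function names are made up. On PostgreSQL 11 and later the segment size is chosen at initdb time with --wal-segsize instead, so the header edit does not apply there.

```shell
# set_wal_seg_size FILE: rewrite the XLOG_SEG_SIZE define in FILE to 1GB.
set_wal_seg_size() {
    sed -i 's/^#define XLOG_SEG_SIZE .*/#define XLOG_SEG_SIZE (1024 * 1024 * 1024)/' "$1"
}

# build_postgres: run from the top of the extracted source tree as a user
# that can write to /usr/local/pgsql.
build_postgres() {
    ./configure &&                                  # generates pg_config.h
    set_wal_seg_size src/include/pg_config.h &&     # edit before compiling
    make &&
    make install &&
    mkdir -p /usr/local/pgsql/data
}
```

Calling build_postgres from the source directory runs the whole sequence and stops at the first step that fails.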

Configure PostgreSQL

Run initdb from the bin directory, specifying the data directory:

./bin/initdb -D ./data

This creates your database with the username of the OS user you are using as the superuser. Now edit two configuration files: postgresql.conf and pg_hba.conf.

For postgresql.conf you have a number of options. It is your test, so you can set the options as you see fit. Remember to set huge_pages=on, and note that whereas wal_level=minimal and synchronous_commit=off will give you the best WAL performance for a test, they may not be what you would want in a production environment (you can run further tests to quantify the impact of these options).
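As an illustration only, the three settings named above can be emitted in postgresql.conf syntax by a small helper (pg_test_settings is a made-up name). The max_wal_senders line is an addition of mine: releases whose default is non-zero refuse to start with wal_level=minimal unless it is set to 0. Everything else in the file, such as shared_buffers, is left for you to tune.

```shell
# pg_test_settings: print test-oriented settings in postgresql.conf syntax.
pg_test_settings() {
    cat <<'EOF'
huge_pages = on
wal_level = minimal
synchronous_commit = off
# Not from the original post: needed with wal_level minimal where the default is non-zero.
max_wal_senders = 0
EOF
}

pg_test_settings
```

To apply them, append the output to the file, for example with pg_test_settings >> /usr/local/pgsql/data/postgresql.conf, since later entries override earlier ones. With huge_pages=on the server will not start unless enough huge pages have been reserved in the OS.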

Configure the HammerDB Client

Download and install HammerDB on a test client system; another 2 socket server is ideal. You need the PostgreSQL client libraries, so you can either copy the postgres lib directory you have just compiled or compile from source again on this host. You don’t need to create a database or set the config files on the client. Then add the library directory to the library path.
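On Linux the library path is the LD_LIBRARY_PATH environment variable. A minimal sketch, assuming the client libraries ended up in /usr/local/pgsql/lib on the client (adjust PG_LIB_DIR, an illustrative variable, to wherever you copied or installed them):

```shell
# Assumption: the PostgreSQL client libraries are in /usr/local/pgsql/lib.
PG_LIB_DIR=/usr/local/pgsql/lib

# Prepend the directory, keeping any existing entries.
export LD_LIBRARY_PATH="$PG_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
```

Set this in the same shell session that starts HammerDB, or add it to the profile of the user running it.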

Create the Schema and Run the Test

You can now start HammerDB on the client and choose PostgreSQL from the Database options. Follow the HammerDB documentation to build the schema and run the test. Do not select the EnterpriseDB Oracle compatible schema, as you installed the software from source. Creating an 800 Warehouse schema is a good starting point. The data should load quickly, but note that creating indexes can take up to 20 minutes or so; this is to be expected. If you have a system with an up-to-date CPU (as of mid-2018), enough memory and everything installed on a fast SSD, then more than 2M PostgreSQL TPM and more than 1M NOPM should be achievable.