Give Codeship’s CI/CD Platform a Try

Want to learn more?

When it comes to performance tuning an environment, often the first place to start is with the database. The reason for this is that most applications rely very heavily on a database of some sort.

Unfortunately, databases can be one of the most complex areas to tune. The reason I say that is because tuning a database service properly often involves tuning more than the database service itself; it often requires making hardware, OS, or even application modifications.

On top of requiring a diverse skill set, one of the biggest challenges with tuning a database is creating enough simulated database traffic to stress the database service. Which is why today’s article will explore pgbench, a benchmarking tool used to measure performance of a PostgreSQL instance.

PostgreSQL is a highly popular open-source relational database. One of the nice things about PostgreSQL is that there are quite a few tools that have been created to assist with the management of PostgreSQL; pgbench is one such tool.

While exploring pgbench, we will also use it to measure the performance gains/loss for a common PostgreSQL tunable.

Setting Up a PostgreSQL Instance

Before we can use pgbench to tune a database service, we must first stand up that database service. The below steps will outline how to set up a basic PostgreSQL instance on an Ubuntu 16.04 server.

Installing with apt-get

Installing PostgreSQL on an Ubuntu system is fairly easy. The bulk of the work is accomplished by simply running the apt-get command.

# apt-get install postgresql postgresql-contrib

The above apt-get command installs both the postgresql and postgresql-contrib packages. The postgresql package installs the base PostgreSQL service.

The postgresql-contrib package installs additional contributions to PostgreSQL. These contributions have not yet been added to the official package but often provide quite a bit of functionality.

With the packages installed, we now have a running PostgreSQL instance. We can verify this by using the systemctl command to check the status of PostgreSQL.

The above indicates our instance started without any issues. We can now move on to our next step, creating a database.

Creating a database

When we installed the postgresql package, this package included the creation of a user named postgres. This user is used as the owner of the running instance. It also serves as the admin user for the PostgreSQL service.

In order to create a database, we will need to login to this user, which is accomplished by executing the su command.

# su - postgres

Once switched to the postgres user, we can log in to the running instance by using the PostgreSQL client, psql.

$ psql
psql (9.5.5)
Type "help" for help.
postgres=#

After executing the psql command, we were dropped into PostgreSQL’s command line environment. From here, we can issue SQL statements or use special client commands to perform actions.

As an example, we can list the current databases by issuing the \list command.

At this point, we now have an empty database named example. From this point, we will need to return to our bash shell to execute pgbench commands. We can do this by issuing the \q (quit) command.

postgres-# \q

Once logged out of the PostgreSQL command line environment, we can get started using pgbench to benchmark our database instance’s performance.

Using pgbench to Measure Performance

One of the most difficult things in measuring database performance is generating enough load. A popular option is to simply bombard test instances of the target application/s with test transactions. While this is a useful test that provides DB performance in relation to the application, it can be problematic sometimes as application bottlenecks can limit database testing.

For situations such as this, tools like pgbench come in handy. With pgbench, you can either use a sample database provided with pgbench or have pgbench run custom queries against an application database.

In this article, we will be using the example database that comes with pgbench.

Setting up the pgbench sample database

The set up of the sample database is quite easy and fairly quick. We can start this process by executing pgbench with the -i (initialize) option.

By default, pgbench will create the tables above with the number of rows shown above. This creates a simple 16MB database.

Since we will be using pgbench to measure changes in performance, a small 16MB database will not be enough to stress our instance. This is where the -s (scaling) option comes into play.

The -s option is used to multiply the number of rows entered into each table. In the command above, we entered a “scaling” option of 50. This told pgbench to create a database with 50 times the default size.

What this means is our pgbench_accounts table now has 5,000,000 records. It also means our database size is now 800MB (50 x 16MB).

To verify that our tables have been created successfully, let’s go ahead and run the psql client again.

$ psql -d example
psql (9.5.5)
Type "help" for help.
example=#

In the command above, we used the -d (database) flag to tell psql to not only connect to the PostgreSQL service but to also switch to the example database.

Since we are currently using the example database, we can issue the \dt command to list the tables available within that database.

From the table above, we can see that pgbench created the four expected tables. This means our database is now populated and ready to be used to measure our database instance’s performance.

Establishing a baseline

When doing any sort of performance tuning, it is best to first establish a baseline performance. This baseline will serve as a measurement as to whether or not the changes you have performed have increased or decreased performance.

Let’s go ahead and call pgbench to establish the baseline for our “out of the box” PostgreSQL instance.

When calling pgbench, we add quite a few options to the command. The first is -c (clients), which is used to define the number of clients to connect with. For this testing, I used 10 to tell pgbench to execute with 10 clients.

What this means is that when pgbench is executing tests, it opens 10 different sessions.

The next option is the -j (threads) flag. This flag is used to define the number of worker processes for pgbench. In the above command, I specified the value of 2. This will tell pgbench to start two worker processes during the benchmarking.

The third option used is -t (transactions), which is used to specify the number of transactions to execute. In the command above, I provided the value of 10,000. However this doesn’t mean that only 10,000 transactions will be executed against our database service. What it means is that each client session will execute 10,000 transactions.

To summarize, the baseline test run was two pgbench worker processes simulating 10,000 transactions from 10 clients for a total of 100,000 transactions.

With that understanding, let’s take a look at the results of this first test.

From these results, it seems our baseline is 2,394 database transactions per second. Let’s go ahead and see if we can increase this number by modifying a simple configuration parameter within PostgreSQL.

Adding More Cache

One of the go-to parameters for anyone tuning PostgreSQL is the shared_buffers parameter. This parameter is used to specify the amount of memory the PostgreSQL service can utilize for caching. This caching mechanism is used to store the contents of tables and indexes in memory.

To show how we can use pgbench for performance tuning, we will be adjusting this value to test performance gains/losses.

By default, the shared_buffers value is set to 128MB, a fairly low value considering the amount of available memory on most servers today. We can see this setting for ourselves by looking at the contents of the /etc/postgresql/9.5/main/postgresql.conf file. Within this file, we should see the following.

In our earlier baseline test, we were able to hit a rate of 2,394 transactions per second. In this last run, after updating the shared_buffers parameter, we were able to achieve 2,550 transactions per second, an increase of 156. While this is not a bad start, we can still go further.

While the shared_buffers parameter might start off at 128MB, the recommended value for this parameter is one-fourth the system memory. Our test system has 2GB of system memory, a value we can verify with the free command.

In the output above, we can see that the total column shows a value of 2000MB on the row for memory. This column shows the total physical memory available to the system. We can also see in the available column that 1223MB is showing available. This means we have up to 1.2 GB of free memory we can use for our tuning purposes.

If we change our shared_buffers parameter to the recommended value of one-fourth system memory, we would need to change it to 512MB. Let’s go ahead and make this change and rerun our pgbench test.

This time, our system was able to reach 2,662 transactions per second, an additional increase of 112 transactions per second. Since our transactions per second increased by at least 100 both times, let’s go a step further and see what happens when changing this value to 1GB.

This time, our transactions per second went from 2,662 to 2,671 and increase of 9 transactions per second. This is a situation where we are hitting diminishing returns.

While it is feasible for many environments to increase the shared_buffers value beyond the one-fourth guideline, doing so does not return the same results for this test database.

Summary

Based on the results of our testing, we can see that changing the value of the shared_buffers from 128MB to 512MB on our test system resulted in a 268 transactions per second increase in performance. Based on our baseline results, that is a 10 percent increase in performance.

We did this all on a base PostgreSQL instance using pgbench‘s sample database. Meaning, we did not have to load our application to get a baseline metric on how well PostgreSQL performs.

While we were able to increase our throughput by modifying the shared_buffers parameter within PostgreSQL, there are many more tuning parameters available. For anyone looking to tune a PostgreSQL instance, I would highly recommend checking out PostgreSQL’s wiki.

Subscribe via Email

Over 60,000 people from companies like Netflix, Apple, Spotify and O'Reilly are reading our articles. Subscribe to receive a weekly newsletter with articles around Continuous Integration, Docker, and software development best practices.

We promise that we won't spam you. You can unsubscribe any time.

Join the Discussion

Leave us some comments on what you think about this topic or if you like to add something.