Introducing PGXN Tester

Over the past few years, I've written a number of extensions, and most of them are available on PGXN. I also use extensions published by other users (and I even gave a talk about that at Nordic PostgreSQL Day 2014). My experience, both as a user and as a developer, is that it's quite difficult to keep extensions working, for a number of reasons.

For example, when an extension aims to support a range of PostgreSQL versions, the internal APIs change from time to time, and suddenly the extension is broken. The PGXN packaging adds further complexity, which sometimes leads to other issues (e.g. when you bump the version in the control file but forget to do the same in the META.json file).
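A trivial sanity check can catch that control-file/META.json mismatch before a release. This is just a hypothetical sketch (the function name and the crude sed-based JSON parsing are my assumptions, not part of any PGXN tooling):

```shell
#!/bin/sh
# Compare default_version in the .control file with the version in META.json.
# Usage: check_versions <extension-name>  (run from the distribution root)
check_versions() {
    ext="$1"
    # extract default_version = 'x.y.z' from the control file
    control_ver=$(sed -n "s/^default_version *= *'\(.*\)'/\1/p" "$ext.control")
    # crude extraction of the first "version" field (assumes one per line)
    meta_ver=$(sed -n 's/.*"version" *: *"\([^"]*\)".*/\1/p' META.json | head -n 1)
    if [ "$control_ver" = "$meta_ver" ]; then
        echo "OK: $ext version is $control_ver in both files"
    else
        echo "MISMATCH: control file says '$control_ver', META.json says '$meta_ver'"
        return 1
    fi
}
```

A real check would use a proper JSON parser (META.json may contain several "version" fields, e.g. in the "provides" section), but even something this dumb catches the common case.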

So eventually I started doing "semi-automatic" testing of my extensions - building, installing and testing them across multiple PostgreSQL versions, using dumb shell scripts. This improved the quality of my extensions, but had no impact on extensions published by others. I could probably have run the tests for all extensions and somehow reported the issues, but that seemed really tedious and time-consuming.
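For illustration, such a "dumb shell script" might look something like this (the /opt/pgsql install layout and the function name are assumptions - adjust to however you keep your PostgreSQL builds; versions that are not installed are simply skipped):

```shell
#!/bin/sh
# Build, install and run regression tests for the extension in the current
# directory against several PostgreSQL installations.
# Usage: run_tests <base-dir> <version>...   e.g. run_tests /opt/pgsql 9.2 9.3
run_tests() {
    base="$1"; shift
    for ver in "$@"; do
        pg_config="$base/$ver/bin/pg_config"
        if [ ! -x "$pg_config" ]; then
            echo "$ver: skipped (no pg_config at $pg_config)"
            continue
        fi
        # PGXS picks up the right headers/libs from PG_CONFIG
        if ! make clean all install PG_CONFIG="$pg_config" > "build-$ver.log" 2>&1; then
            echo "$ver: build failed (see build-$ver.log)"
            continue
        fi
        if make installcheck PG_CONFIG="$pg_config" > "check-$ver.log" 2>&1; then
            echo "$ver: OK"
        else
            echo "$ver: regression tests failed (see regression.diffs)"
        fi
    done
}
```

That's roughly it - the scripts knew nothing about reporting, so the results stayed on my machine.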

But what if there was something like pgbuildfarm.org, but for testing extensions? Something that would run the tests on a range of PostgreSQL versions and report the results, along with all the information useful for debugging a failure (like logs from compilation, from loading the extension, or diffs from the regression tests), to a central server. So I went and built that, and I'd like to present pgxn-tester.org to you.

Of course, it's still a work in progress, but I believe it may already be useful. I'd also like to get some feedback so that I can shape the system to serve its purpose better. Either post in the discussion here, post to pgxn-users, or send me an email.

Basic architecture

Just like the buildfarm, the system has two parts - a central repository of test results, and client machines that run the tests (and report the results). Both parts are available on GitHub:

The server stores the results and makes them available through a simple HTTP API and a simple HTML/AJAX user interface (built on top of that API), available at http://pgxn-tester.org. That's probably the part most people will find interesting.

User interface

The first thing you'll see is a summary of results by PostgreSQL major version and by release status of the distribution, so you can immediately see, for example, how well the "stable" extensions are doing on 9.3. This overview is probably a bit too high-level, though, because users and developers are usually interested in particular extensions - so click on Distributions in the top menu, and you'll see something like this:

For each PGXN distribution, the table summarizes results from all testing machines and supported PostgreSQL versions. How exactly that's done is a bit complicated (each distribution may have multiple versions, the machines may repeat the tests several times, etc.). For now, all you need to know is "green == tests passed == good", "red == tests failed == bad", and "orange == regression tests missing == not good".

You can also click on a user name to get an overview of the distributions published by that user. It's a bit more detailed (it shows the versions of each distribution); for example, if you choose user tomasv (which is me), you'll see something like this:

Apparently, I should look into some of my extensions, because there are still failures in the "pgxnclient check" phase (which runs the regression tests provided by the extension). So let's pick one of them - quantile-1.1.3 - and see why it fails. Just click on the version number, and you should get this:

This is quite a common failure pattern - the extension works fine on recent PostgreSQL versions, but fails on older ones (and does not reflect this properly in META.json). I can click on "details" to see where exactly the problem is, which brings up a page with the logs reported by the testing machines - it's often quite long, so I'm not posting a screenshot here.

But this part of the regression diff log clearly shows where the problem is: