Adam Williamson: Getting started with Pagure CI

I spent a few hours today setting up a couple of the projects I look after, fedfind and resultsdb_conventions, to use Pagure CI. It was surprisingly easy! Many thanks to Pingou and Lubomir for working on this, and of course Kevin for helping me out with the Jenkins side.

You really do just have to request a Jenkins project and then follow the instructions. I followed the step-by-step, submitted a pull request, and everything worked first time. So the interesting part for me was figuring out exactly what to run in the Jenkins job.

The instructions get you to the point where you’re in a checkout of the git repository with the pull request applied, and then you get to do…whatever you can given what you’re allowed to do in the Jenkins builder environment. That doesn’t include installing packages or running mock. So I figured what I’d do for my projects – which are both Python – is set up a good tox profile. With all the stuff discussed below, the actual test command in the Jenkins job – after the boilerplate from the guide that checks out and merges the pull request – is simply tox.

First things first, the infra Jenkins builders didn’t have tox installed, so Kevin kindly fixed that for me. I also convinced him to install all the variant Python version packages – python26, and the non-native Python 3 packages – on each of the Fedora builders, so I can be confident I get pretty much the same tox run no matter which of the builders the job winds up on.

Of course, one thing worth noting at this point is that tox installs all dependencies from PyPI: if something your code depends on isn’t in there (or installed on the Jenkins builders), you’ll be stuck. So another thing I got to do was start publishing fedfind on PyPI! That was pretty easy, though I did wind up cribbing a neat trick from this PyPI issue: I keep my README in Markdown format, and setup.py converts it to rst when using it as the long_description for PyPI, so it shows up properly formatted. The conversion only happens if pypandoc is installed, but setup.py works even if it isn’t, so you don’t need pandoc just to install the project.
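For illustration, here is a minimal sketch of that kind of fallback. It is not the exact snippet from the PyPI issue: the helper name read_long_description and the README.md default are assumptions made for this example, and it relies on pypandoc's convert_file function.

```python
import io


def read_long_description(path="README.md"):
    """Return the README converted to rst if possible, else unchanged.

    Hypothetical helper for illustration; the name and default path
    are assumptions, not taken from fedfind's setup.py.
    """
    try:
        # Optional dependency: only needed when preparing a PyPI upload.
        import pypandoc
        # convert_file shells out to pandoc and raises OSError when the
        # pandoc binary cannot be found.
        return pypandoc.convert_file(path, "rst")
    except (ImportError, OSError):
        # Fall back to the raw Markdown so a plain install still works.
        with io.open(path, encoding="utf-8") as readme:
            return readme.read()
```

The result would then be passed as long_description to setup(). The point of the design is that the import sits inside the try block, so neither pypandoc nor pandoc becomes a hard install-time requirement.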

After playing with it for a bit, I figured out that what I really wanted was to have two workflows. One is to run just the core test suite, without any unnecessary dependencies, with python setup.py test – this is important when building RPM packages, to make sure the tests pass in the exact environment the package is built in (and for). And then I wanted to be able to run the tests across multiple environments, with coverage and linting, in the CI workflow. There’s no point running code coverage or a linter while building RPMs, but you certainly want to do it for code changes.

So I put the install, test and CI requirements into three separate text files in each repo – install.requires, tests.requires and tox.requires – and adjusted the setup.py files to load the requirements from those files in their setup() calls.
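Roughly, reading requirements from those files could look like the sketch below. The helper names read_requirements and setup_kwargs are hypothetical, and the assumption that the files hold one requirement specifier per line (with optional # comments) is mine; install_requires and tests_require are the standard setuptools keywords.

```python
def read_requirements(path):
    """Return the requirement specifiers listed one per line in `path`."""
    with open(path) as reqfile:
        lines = [line.strip() for line in reqfile]
    # Skip blank lines and comment lines.
    return [line for line in lines if line and not line.startswith("#")]


def setup_kwargs(install_file="install.requires",
                 tests_file="tests.requires"):
    """Build the dependency-related keyword arguments for setup().

    Hypothetical helper: the file names come from the post, but the
    way they are wired into setup() here is an assumption.
    """
    return {
        "install_requires": read_requirements(install_file),
        "tests_require": read_requirements(tests_file),
    }
```

In a setup.py, the resulting dictionary would be unpacked into the setup() call alongside the usual metadata. Presumably tox.requires is consumed from tox.ini rather than from setup.py, which keeps the coverage and lint tools out of the package build entirely.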

By adding a few args to our py.test call we get a coverage report for our library with the pull request applied. The subsequent commands use the neat diff_cover tool to add some more information. diff-cover basically takes the full coverage report (coverage.xml is produced by --cov-report xml) and considers only the lines that are touched by the pull request; the --fail-under arg tells it to fail if there is less than 90% coverage of the modified lines. diff-quality runs a linter (in this case, pylint) on the code and, again, considers only the lines changed by the pull request. As you might expect, --fail-under=90 tells it to fail if the ‘quality’ of the changed code is below 90% (it normalizes all the linter scores to a percentage scale, so that really means a pylint score of less than 9.0).
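To make the --fail-under idea concrete, here is a toy model of the diff-cover check. It is not diff-cover's implementation, just the arithmetic as I understand it, and the function names are invented for this example. It assumes changed_lines already contains only the executable lines touched by the pull request.

```python
def diff_coverage_percent(changed_lines, covered_lines):
    """Percentage of changed lines that the test run executed."""
    changed = set(changed_lines)
    if not changed:
        # Nothing measurable was touched, so there is nothing to fail on.
        return 100.0
    hit = changed & set(covered_lines)
    return 100.0 * len(hit) / len(changed)


def passes(changed_lines, covered_lines, fail_under=90):
    """Mimic --fail-under: pass only at or above the threshold."""
    return diff_coverage_percent(changed_lines, covered_lines) >= fail_under


# A pull request touching ten lines, nine of which the tests execute,
# sits exactly on the 90% threshold and passes; eight of ten would fail.
changed = range(1, 11)
nine_covered = range(1, 10)
eight_covered = range(1, 9)
```

The useful property is that untouched code never counts against you: a project with poor overall coverage can still require that new changes are well tested.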

So without messing around with shipping all our stuff off to hosted services, we get a pretty decent indicator of the test coverage and code quality of the pull request, and it shows up as failing tests if they’re not good enough.

It’s kind of overkill to run the coverage and linter on all the tested Python environments, but it is useful to do it at least on both Python 2 and 3, since the pylint results may differ, and the code might hit different paths. Running them on every minor version isn’t really necessary, but it doesn’t take that long so I’m not going to sweat it too much.

But that does bring me to the last refinement I made, because you can vary what tox does in different environments. One thing I wanted for fedfind was to run the tests not just on Python 2.6, but with the ancient versions of several dependencies that are found in RHEL / EPEL 6. And there’s also an interesting bug in pylint which makes it crash when running on fedfind under Python 3.6. So my tox.ini really looks like this:

As you can probably guess, what’s going on there is we’re installing different dependencies and running different commands in different tox ‘environments’. pip doesn’t really have a proper dependency solver, which – among other things – unfortunately means tox barfs if you try and do something like listing the same dependency twice, the first time without any version restriction, the second time with a version restriction. So I had to do a bit more duplication than I really wanted, but never mind. What the files wind up doing is telling tox to install specific, old versions of some dependencies for the py26 environment:

tox.requires.py26 is just shorter, skipping the coverage and pylint bits, because it turns out to be a pain trying to provide old enough versions of various other things to run those checks with the older pytest, and there’s no real need to run the coverage and linter on py26 as long as they run on py27 (see above). As you can see in the commands section, we just run plain py.test and skip the other two commands on py26; on py36 and py37 we skip the diff-quality run because of the pylint bug.
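The per-environment selection described above boils down to something like this Python model. It only illustrates the logic and is not the actual tox.ini: in tox itself this kind of branching is expressed with factor-conditional settings. The function name and the exact command-line arguments (beyond --cov-report xml and --fail-under=90, which are mentioned in the post) are assumptions.

```python
def commands_for(envname, package="fedfind"):
    """Return the commands a given tox environment would run.

    Illustrative model only; the argument details are assumptions.
    """
    if envname == "py26":
        # Ancient pytest on py26: plain test run, no coverage, no lint.
        return ["py.test"]
    commands = [
        "py.test --cov-report term-missing --cov-report xml --cov " + package,
        "diff-cover coverage.xml --fail-under=90",
    ]
    if envname not in ("py36", "py37"):
        # Skipped on py36 and py37 because of the pylint crash.
        commands.append("diff-quality --violations=pylint --fail-under=90")
    return commands
```

So py26 runs one command, py36 and py37 run two, and every other environment runs all three.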

So now on every pull request, we check that the code (and tests – it’s usually the tests that break, because I use some pytest feature that didn’t exist in 2.3.5…) still works with the ancient RHEL 6 Python, pytest, mock, setuptools and six, check it on various other Python interpreter versions, and enforce some requirements for test coverage and code quality. And the package builds can still just do python setup.py test and not require coverage or pylint. Who needs GitHub and Coveralls?

Of course, after doing all this I needed a pull request to check it on. For resultsdb_conventions I just made a dumb fake one, but for fedfind, because I’m an idiot, I decided to write that better compose ID parser I’ve been meaning to do for the last week. So that took another hour and a half. And then I had to clean up the test suite…sigh.