Time for another installment of my ongoing mission
to convert the world to Python 3! This time, a little Debian
packaging-fu for modifying an existing Python 2 package to include
support for Python 3 from the same source package.

Today, I added a python3-feedparser package to Ubuntu Precise. What's interesting about this is that, despite various reported problems, upstream feedparser 5.1 claims to support Python 3, via 2to3 conversion. And indeed it does (although the test suite does not).

Before today, Ubuntu had feedparser 5.0.1 in its archive, and while some work has been done to update the Debian package to 5.1,
this has not been released. The uninteresting precursor to Python 3
packaging was to upgrade the Ubuntu version of the python-feedparser
source package to 5.1. I'll spare you the boring details about missing
data files in the upstream tarball, and other problems, since they don't
really relate to the Python 3 effort.

The first step was to verify that feedparser 5.1 works with Python 3.2
in a virtualenv, and indeed it does. This is good news because it means
that the setup.py
does the right thing, which is always the best way to start supporting
Python 3. I've found that it's much easier to build a solid Debian
package if you have a solid setup.py in upstream to begin with.

Now, what I'd like to do is to give you a recipe for modifying your existing debian/
directory files to add Python 3 support to a package that already
exists for Python 2. This is a little trickier for feedparser because
it used an older debhelper
standard, and carried some crufty old stuff in its rules file. My
first step was to update this to debhelper compatibility level 8 and
greatly simplify the debian/rules file. Here's what it might have looked like with just Python 2 support, so let's start there.

This is all pretty standard stuff. dh_python2 is used (the --with python2 option to dh),
and we just provide a couple of overrides for idiosyncrasies in the
feedparser package. We clean a couple of extra things that aren't
cleaned automatically, and we run the test suite in the slightly
non-standard way that upstream requires. Also, we override the
installation of a huge number of test files that would otherwise get
installed as documentation (they aren't docs).

So far so good. What do we have to do to add support for Python 3?

First, we need to make a few modifications to the debian/control file. The current convention with dh_python2 is to use an X-Python-Version header in the source package stanza, so we just need to add this header to the same stanza for Python 3:

X-Python3-Version: >= 3.2

This just says we support any Python 3 version from 3.2 onwards. You also need to add a few additional packages to the Build-Depends. In the feedparser case, I added the following build dependencies: python3, python3-chardet, python3-setuptools. Even though for Python 2 there are a couple of other build dependencies (e.g. python-libxml2 and python-utidylib) these aren't available for Python 3, but lucky for us, they are optional anyway.

Next, you need to add a new binary package stanza. There was already a python-feedparser
binary package stanza for Python 2 support. In Debian, Python 3 is
provided as a separate stack, meaning packages for Python 3 will always
start with the python3- prefix. Thus, it is pretty easy to just copy the python-feedparser stanza and paste it at the bottom of debian/control, changing the package name to python3-feedparser. You have to update the Depends line to use ${python3:Depends}, and I updated the Recommends line to name python3-chardet, and that was about it. Here's what the new stanza looks like:

Now for the debian/rules file. The first thing to do is to add support for dh_python3, which is analogous to dh_python2, and is the only accepted helper for Python 3. The rules line then becomes:

%:
	dh $@ --with python2,python3

Now, one problem with debhelper is that it doesn't have any built-in
support for Python 3 like it does for Python 2. This means dh will not automatically build or install any Python 3 packages, so you have to do this manually. Eventually, this will be fixed, and fortunately, with a solid setup.py file, you don't have to do too much, but it's something to be aware of. In the feedparser case, we need to add overrides for dh_auto_build and dh_auto_install. Here's what these rules look like:

Not too bad, eh? You'll notice that the first thing these rules do is call the standard dh_auto_build and dh_auto_install
respectively. This preserves the Python 2 support. Then we just loop
over all the available Python 3 versions, doing a fairly normal
equivalent of setup.py install
(split into a build step and an install step). The install rule looks a
little odd, but should be familiar to Debian Python hackers. It just
installs the package into the proper Debian locations, and will pretty
much be the same for any Python 3 package you build.

The one odd bit is the last line in the override_dh_auto_install rule. This is there just to work around a peculiarity in the feedparser 5.1 upstream package, where it depends on sgmllib.py,
but that is no longer in the Python standard library in Python 3.
Upstream provides an already 2to3 converted version of it, and
recommends you install the module as sgmllib.py somewhere on your Python 3 sys.path. Well, I don't like the namespace pollution that would cause, so I install the file as feedparser_sgmllib3.py and add a quilt patch to the package to try an import of that module if importing sgmllib fails (as it will on Python 3).
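To make that fallback concrete, here is a minimal sketch of the idea, not the actual quilt patch. The helper name import_first_available is hypothetical, and the demonstration call uses stand-in module names so that it runs on any Python. In the packaged feedparser the candidates would be sgmllib followed by feedparser_sgmllib3.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that can be imported.

    Hypothetical helper, for illustration only. The real patch can be a
    plain try/except around two import statements.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This candidate is missing, so try the next one.
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Stand-in names so the sketch runs anywhere: the first one does not exist,
# so the helper falls through to the second.
module = import_first_available("no_such_module_for_demo", "json")
print(module.__name__)

# Inside feedparser, the equivalent inline form would be roughly:
#     try:
#         import sgmllib
#     except ImportError:
#         import feedparser_sgmllib3 as sgmllib
```

Binding the fallback module to the name sgmllib means the rest of the code does not need to know which of the two it got.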

An aside: If you look in the debian/rules file for what I actually uploaded, you'll see some additional modifications to override_dh_auto_test.
This just works around the upstream bug where some test suite data
files were accidentally omitted from the release tarball. You can
pretty much ignore those lines for the purposes of this article.

We're almost done. The last thing we need to do is make sure that
debhelper installs the right files into the right binary packages. We
want the python-feedparser binary package to include only the Python 2 files, and the python3-feedparser
binary package to only include the Python 3 files. Keep in mind that
when a source package builds only a single binary package (as was the
case before I added Python 3 support), debhelper will include everything
under the build directory's debian/tmp subdirectory in the single binary package. That's why you see things get installed into $(CURDIR)/debian/tmp.
But when a source package builds multiple binary packages, as is now
the case here, we have to tell debhelper which files go into which
binary packages. We do this by adding two new files to the debian directory: python-feedparser.install and python3-feedparser.install.

Reading the manpage for dh_install
will explain the reasons for this, and describe the format of the file
contents. In our case, we're really lucky, because for Python 2,
everything gets installed under usr/lib/python2.* and in Python 3, everything gets installed under usr/lib/python3 (relative to $(CURDIR)/debian/tmp).
You'll notice a few things here. Because we could be building for
multiple versions of Python 2, we have to wildcard the actual directory
under usr/lib, e.g. it might be python2.6 or python2.7. But because we have PEP 3147 and PEP 3149
in Python 3.2, there's only one directory for all supported versions of
Python 3, so we don't need to wildcard the subdirectory. Also, if you
look at the actual .install files in the package, you'll see a few other trailing path components, so the actual contents of the files are:

usr/lib/python2.*/*-packages/*

and

usr/lib/python3/*-packages/*

for the python-feedparser.install and python3-feedparser.install files respectively. The trailing bits just wildcard what on a Debian system will always be dist-packages, just for safety (cargo culting FTW!).
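If you want to convince yourself that these two patterns split the files cleanly, here is a small sketch that checks some example paths against them. It uses Python's fnmatch module as a stand-in for the matching that dh_install does, which is an approximation: fnmatch lets * match a slash, while shell-style globbing does not. The example paths and the function name binary_package_for are made up for illustration and are not taken from a real build.

```python
from fnmatch import fnmatchcase

# The patterns from the two .install files, relative to debian/tmp.
PY2_PATTERN = "usr/lib/python2.*/*-packages/*"
PY3_PATTERN = "usr/lib/python3/*-packages/*"


def binary_package_for(path):
    """Return the binary package whose pattern matches path, or None."""
    if fnmatchcase(path, PY2_PATTERN):
        return "python-feedparser"
    if fnmatchcase(path, PY3_PATTERN):
        return "python3-feedparser"
    return None


# Illustrative paths only (assumed, not copied from an actual build tree).
example_paths = [
    "usr/lib/python2.6/dist-packages/feedparser.py",
    "usr/lib/python2.7/dist-packages/feedparser.py",
    "usr/lib/python3/dist-packages/feedparser.py",
    "usr/lib/python3/dist-packages/feedparser_sgmllib3.py",
]

for path in example_paths:
    print(path, "->", binary_package_for(path))
```

The Python 2 pattern needs its wildcard to cover both python2.6 and python2.7, while the Python 3 pattern names the single shared python3 directory outright.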

And that really is it! Of course, things could be a little more
complicated if you have extension modules, but maybe not that much more
so, and if the package you're adding Python 3 support to isn't setuptools-based,
you may have more work to do even still. The feedparser package has a
few other oddities that are really unrelated to adding Python 3 support,
so I'm ignoring them here, but feel free to ask for additional details
in the comments, in IRC, or in email.

Hopefully this gives you some insight into how to extend an existing
Python 2 Debian package to include Python 3 support, given that your
upstream already supports Python 3. Now, go forth and hack!

Addendum: my colleague Colin Watson just today packaged up Benjamin Peterson's very fine Python package called six.
This is a nice package that provides some excellent Python 2 and 3
compatibility utilities. You may find this helpful if you're trying to
support both Python 2 and Python 3 in a single code base, especially if
you have to support back to Python 2.4 (poor you :). This will be
available in Ubuntu Precise, although if you're submitting patches back
upstream, you may have to convince the upstream author to accept the
additional dependency. It's worth it to add a little more Python 3 love
to the world.