Debian Package Build Tools

One of the advantages of Debian packaging over that of other Linux
distributions is its extensive and excellent standards documentation
(independent of packaging method), together with a rich set of tools that
can produce packages that meet those standards. This abundance of riches
can be rather staggering and
confusing, though, and I keep discovering new tools that I wish I had
known about months ago. This document is therefore my attempt to catalog
the various useful tools and approaches to Debian packaging I've used, as
well as some suggestions and recommendations for approaches I've found to
work well.

I am also the upstream maintainer of many of the software packages that I
package for Debian, so I'll also discuss ways of maintaining the Debian
packaging in the same repository with the software package itself. This
poses a few interesting challenges.

In the following, I assume a basic knowledge of packaging terminology, and
much of this will likely only be useful to someone who has already done a
few basic packages and is familiar with the overall process. You should
probably have read the New Maintainer's Guide,
Debian Policy Manual, and the Debian
Developer's Reference before reading this, or at least be familiar with
the contents of those documents.

When I'm also the upstream maintainer of a software package that I'm
packaging for Debian, I find it most convenient to keep the packaging
files in the same revision control system as the rest of the package.
However, there are a few issues to be aware of:

The debian directory and the packaging files should not be included in
the upstream release of the package. There are several reasons for
this: you will often do multiple Debian package releases for the same
upstream source to fix packaging-only problems but won't want to do a
new source release each time you change something, having outdated
Debian packaging rules in the source distribution can be confusing to
users, and someone who wants to build the package for Debian is far
better off retrieving the Debian source package than using the files
in your regular release.

I use a few different methods for generating release tarballs (a topic
for another document), but the simplest one I've found for small
packages is to use rsync to copy the source into a release directory.
With that method, just pass --exclude /debian/ to rsync when
generating the release.

The .orig.tar.gz file in the Debian source package should be identical
to the upstream release tarball. In fact, you want to use a copy of
your last upstream release tarball (or one regenerated with
pristine-tar) to go with the Debian package rather than trying to
generate a new one, since it's too hard to get a newly generated
tarball to match byte for byte. That way, any signatures, MD5
checksums, and the like can still be verified.
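
The rsync step mentioned above can be sketched as a small shell function.
This is only an illustration: make_release_dir is a hypothetical helper
name, and the two directory arguments are assumptions, not part of any
existing tool.

```shell
# Copy a source tree into a release directory, leaving out the Debian
# packaging. make_release_dir is a hypothetical helper name.
make_release_dir() {
    # $1: top of the working tree; $2: release directory to create or refresh
    mkdir -p "$2" || return 1
    # The leading slash anchors the pattern at the top of the transfer, so
    # only the top-level debian directory is skipped; a debian directory
    # deeper in the tree is still copied. The trailing slash on the source
    # copies its contents rather than the directory itself.
    rsync -a --exclude /debian/ "$1"/ "$2"/
}
```

Running make_release_dir . ../foo-1.0 from the top of the working tree
would leave a copy without the packaging files in ../foo-1.0 (the
directory name here is hypothetical).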

I'm in the process of converting all of my packages to use Git, and with
Git I can maintain the Debian packaging on a separate branch that's merged
with a branch that holds the exact release software. I describe that
technique in my Git for Debian packaging notes.

When I was using Subversion or CVS for my packages, what I'd do to build a
new Debian package is pull the last released tarball, untar it, and then
export just the debian directory from the package revision control system
to form the build tree. This is easy to automate. The one drawback is
that all changes specific to Debian have to be localized in the debian
directory. Usually that's fine, and any changes necessary to the rest of
the code can be a reason for a new release of the package. Sometimes,
though, a change specific to Debian is necessary and can't be included in
the regular release. In that case, use a patch management
system that keeps all the files in the debian directory except during the
build.
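
The tarball-plus-export procedure described above might be automated
along these lines. This is a sketch under stated assumptions:
assemble_build_tree and its three arguments are hypothetical names, the
tarball is assumed to be gzip-compressed and to unpack into a single
top-level directory, and the debian directory is assumed to live at its
own URL in the Subversion repository.

```shell
# Assemble a build tree from a released tarball plus the debian directory
# exported from Subversion. assemble_build_tree is a hypothetical helper.
assemble_build_tree() {
    # $1: released upstream tarball (gzip-compressed)
    # $2: repository URL of the debian directory
    # $3: scratch directory in which to build
    mkdir -p "$3" || return 1
    # Assumes every member of the tarball sits under one top-level
    # directory, so the first path component of the first member names it.
    top=$(tar -tzf "$1" | head -n 1 | cut -d/ -f1)
    [ -n "$top" ] || return 1
    tar -xzf "$1" -C "$3" || return 1
    # svn export writes a clean copy without .svn metadata directories.
    svn export -q "$2" "$3/$top/debian" || return 1
    # Print the path of the finished build tree for the caller.
    printf '%s\n' "$3/$top"
}
```

The function prints the location of the build tree, so a caller can
change into that directory and run debuild or pdebuild there.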

You can still use systems like svn-buildpackage to build Debian packages
when using a combined repository, but it's a bit more awkward and
svn-buildpackage won't do as much for you. You don't want to use
svn-inject or svn-upgrade, only the svn-buildpackage component, and you'll
have to handle updating the tarballs directory yourself.

If this all sounds like too much hassle, consider switching to Git and
using Git branches (or using a similar branching idea with another
good distributed VCS, such as bzr). Failing that, try maintaining the
Debian packaging completely independently of the upstream package. This
means having two separate repositories, shuffling patches around, and
keeping multiple copies of the source tree. However, it also means that
you can use all of the same techniques that you use for any other package.

Git

I'm slowly converting nearly all of my software and Debian packaging to
Git. I prefer it these days for several reasons. It has a large, active
community. It has some excellent capabilities for editing recent history
that let one avoid the embarrassing "fix typo in previous commit" commits.
Its branching support is rich and capable of supporting nice workflows for
Debian packaging. pristine-tar requires it, and pristine-tar allows a
Debian repository to be self-contained, including the upstream tarball,
without serious bloat. It's extremely fast. And it has a very nice web
repository browser, much better than anything I've found for Subversion.

There's quite a bit to talk about with Debian packaging using Git, enough
that my notes are a separate document.

Subversion

Before switching to Git, I primarily used Subversion for Debian packaging,
and still do for some packages that I've not converted yet.

The basic and easiest way of maintaining Debian packages in Subversion is
to use svn-buildpackage. Start by creating a Subversion repository for
Debian packaging, create the initial package (either from scratch or by
downloading the existing package with apt-get source if you're adopting
one), and then put the package files in the directory you're planning on
using as your working area for packaging. Run svn-inject on the .dsc
file, passing it the URL of your repository as the second argument, and it
will set up the initial repository structure and tag the current Debian
package. It will also create a build-area and tarballs directory in the
current directory and give you an initial checkout. The upstream orig
tarball will be copied into tarballs. You can delete the Debian package
now.

When making changes to the Debian package, I generally change the
debian/changelog file at the same time and then run debcommit from the top
level of the working directory. It commits all modified files and uses
the changelog entry as the commit message.

To do the package builds, it's best to use pbuilder to make sure that the
package builds in an isolated environment. I use the following as my
~/.svn-buildpackage.conf file:
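
The listing itself is missing from this copy of the notes. A
reconstruction consistent with the description that follows, using the
svn-builder, svn-postbuild, and svn-noautodch options of svn-buildpackage
and the --buildresult option of pdebuild, might look like this. The paths
are assumptions and will need adjusting to wherever your build-area
directory sits relative to the directory the commands run in:

```text
svn-builder=pdebuild --buildresult `pwd`/../build-area
svn-postbuild=rm -f ../build-area/*_source.changes
svn-noautodch
```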

The second line cleans up the extra changes file created by pdebuild so
that you can run debsign and dput on *.changes in the build-area
directory. The last line prevents svn-buildpackage from changing
debian/changelog after tagging a release. I use pdebuild with cowbuilder
(from the cowdancer package) to improve the startup speed of the build.

svk with Subversion

The drawback to svn-buildpackage is that it doesn't have great merging
support for new upstream versions, because Subversion's own merge support
isn't particularly good. svk is a wrapper around Subversion (and other
revision control systems) that supports much better merging and also adds
distributed development.

You can use svk with the Subversion repository created by
svn-buildpackage, but that has the drawback that you have to keep the orig
tarball around outside of the revision control system and you can't really
use svn-upgrade as easily because you want to instead use svk to do the
merge. Normally, svn-buildpackage creates a package structure like:

branches/upstream/
tags/
trunk/

Instead, create a repository structure like:

branches/
tags/
trunk/
vendor/current/

and then import the original source into vendor/current/package.
In other words, don't put the source directly into vendor/current;
instead, create another level of subdirectory matching the name of the
package and import the source into that. I generally then svk cp the
vendor/current directory to vendor/version to tag that import of
the upstream source.

Then, use svk smerge to merge vendor/current to trunk. This will leave
you with trunk/package containing the package source. Check out
trunk, and then put the orig tarball (with the proper Debian name) in the
top level directory of your checkout (at the same level as the
package directory containing the package source). Remember to
mark the orig tarball as a binary with:

svk propset svn:mime-type application/octet-stream *.orig.tar.gz

Now, you can do development in your checkout of the trunk and commit
changes to the upstream source, relying on svk's smerge to handle merges
with new versions. And, because of the extra nesting in the directory
structure, you can then use pdebuild or debuild directly to build the
package. Better yet, anyone else using this repository can just check out
a working copy and have everything, including the orig tarball, needed to
build the package.

When a Debian package is finished and uploaded, svk cp trunk to
tags/version. When a new upstream version is released, follow
exactly the same procedure as given above for starting off the
repository. Branches can be used to maintain stable security updates,
long-lived experimental branches, or similar forks in development.

One other nice feature of svk over svn-buildpackage besides distributed
development and better merging is that svk gets rid of all the .svn
metadirectories. svk keeps that metadata elsewhere, so grep -r and
similar commands work as expected. (The tradeoff is that you can't just
move checkouts around; svk relies on knowing where they are.)

For many packages, it's sufficient to make any necessary modifications
directly to the upstream source: with Git, on branches merged into the
master or debian branches, or with Subversion, in the packaging
repository itself. Many packages don't require any modifications to
the upstream source, and even when they are required, they're often small.
The Subversion merge from svn-upgrade or the merging support in svk is
often sufficient. However, this results in a monolithic diff file and
nothing else for someone else who is trying to make changes to the package
(for an NMU, perhaps), and means that you have to take apart that large
diff file to submit patches upstream. The utilities in the patchutils
package can help with that process, but even those utilities don't help
much if there are overlapping patches to the same file that are
conceptually separate.

This is where more complex Debian build systems come in. I've used three:
quilt, dpatch, and dbs. All three have a similar basic concept, namely to
not include any modifications to the upstream source in the Debian package
diff and instead ship patches in a subdirectory of the debian directory
and apply them at build time. Then, each patch can be easily studied by
users and by other distributions and easily sent upstream.

Currently, using my Git packaging workflow, I don't use
any of these systems. The Git merge facilities are more powerful.
However, this has the disadvantage of requiring others to use my Git
repository rather than the resulting source package if they want to see
broken-out patches. There is therefore some merit to these systems even
if you love Git.

quilt

quilt is (in my perception) the current leader among patch build systems
in Debian. quilt isn't Debian-specific; it's a set of scripts for managing
a set of patches that was originally developed to work on the Linux
kernel. Used in a Debian package, it keeps all patches to the upstream
source (anything that touches files outside of the debian directory) in
the debian/patches directory. The order in which those patches
should be applied is controlled by debian/patches/series. The
patches are applied during the build, generally by including the quilt
makefile fragment and making the build target (or its stamp target) depend
on $(QUILT_STAMPFN). They're removed during the clean target, generally
by making clean depend on unpatch.

One of the advantages of quilt is that it works much like a source
control system and doesn't require a separate working tree. You can
create a new patch, add files to it, edit those files, and then run
quilt refresh to save those changes to the patch. quilt
header -e will let you edit the header on the patch, where what the patch
does should be explained. You can also easily apply all of the patches
with quilt push -a, remove them all with quilt pop -a, and
do many other useful things; see the quilt man page for all the details.
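
As a rough illustration of that cycle, the following shell function
records one edit as a new patch. new_quilt_patch is a hypothetical helper
name, and expressing the edit as a sed expression is only a stand-in for
opening the file in an editor; the quilt subcommands are the ones
described above.

```shell
# Record an edit to an upstream file as a new quilt patch.
# new_quilt_patch is a hypothetical helper; run it from the top of the
# unpacked source tree.
new_quilt_patch() {
    # $1: patch name, $2: upstream file to change, $3: sed expression
    # Point quilt at debian/patches rather than its default, ./patches.
    QUILT_PATCHES=debian/patches
    export QUILT_PATCHES
    quilt new "$1" || return 1    # start a new topmost patch
    quilt add "$2" || return 1    # snapshot the file before editing it
    # Make the change; replacing the file leaves quilt's snapshot intact.
    sed "$3" "$2" > "$2.tmp" || return 1
    mv "$2.tmp" "$2" || return 1
    quilt refresh                 # write the resulting diff into the patch
}
```

After this, quilt header -e can be used to document the patch, and
quilt pop -a returns the tree to its pristine upstream state.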

If you plan on using quilt for Debian packaging, I recommend the following
as ~/.quiltrc:
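
The listing is missing from this copy of the notes. The following is a
reconstruction from the description below rather than the original file.
~/.quiltrc is read as shell variable assignments, and the option names
are taken from quilt and GNU patch; --reject-format=unified in particular
assumes a reasonably recent GNU patch.

```shell
# Drop timestamps and Index: lines from diffs, and color quilt diff output.
QUILT_DIFF_ARGS="--no-timestamps --no-index --color=auto"
# Drop the same noise when refreshing a patch.
QUILT_REFRESH_ARGS="--no-timestamps --no-index"
# Ask patch to write reject files as unified diffs.
QUILT_PATCH_OPTS="--reject-format=unified"
# Keep patches where Debian packages expect them.
QUILT_PATCHES="debian/patches"
```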

The first two lines remove the extraneous bits of diffs that change each
time a patch is refreshed but serve no purpose. The
first line also prefers color for quilt diff. The third line saves
reject files as unified diffs instead of context diffs since I find them
more readable. The last line sets the correct patch directory for working
with Debian packages.

The quilt package format may become a native package format for Debian
source packages in the next release of Debian after lenny.

dpatch

dpatch is very similar to quilt, with one substantial advantage: it uses a
richer patch syntax that includes a shell script, so patches can rename
files and perform other more complex operations. However, it has the
substantial disadvantage of not having as many tools for managing the
patches, selectively applying and unapplying them to a working tree, and
easily creating or changing existing patches. There is a tool,
dpatch-edit-patch, to save changes as patches, but I don't find it as
intuitive as quilt's VCS-style command set.

Despite the richer patch format and some neat integration with cowbuilder,
my impression is that dpatch is becoming less popular and most people
using a patch system are moving towards quilt.

dbs

dbs is now obsolete and should not be used for new packages. However, I
wanted to describe it here for the sake of completeness, and for people
who run into packages in this format.

dbs is the most comprehensive in its changes to how the package is
maintained. When using dbs, the orig tarball isn't a copy of the upstream
tarball, but rather is a tarball containing the upstream tarball. dbs
then unpacks that tarball as part of the build process. There are
variants on this approach; some people roll their own rather than using
the dbs package itself.

Some people like this a great deal, in part because it handles multiple
upstream tarballs and other complex upstreams better than the default
Debian source package format. However, I find it complicated and don't
like how it makes it more difficult to just start hacking on the package
when testing or trying something.

Note that cdbs is unrelated to dbs. cdbs is a makefile-based framework
for building packages, similar to debhelper but using included makefiles
instead of separate programs. cdbs is orthogonal to the systems
described; it supports its own simple patch system, dpatch, or quilt, or
no patch system at all.

A lot of the packaging work I do is specific to Stanford, packaging local
software or Stanford variants of packages that we customize. I also
package some of my own software that isn't of broad enough interest to
warrant uploading to Debian, and it's sometimes nice to have a staging
area where I can point apt at a repository before I actually upload to
Debian. (This is particularly true before one is a Debian developer and
one needs someplace to put the package for a sponsor to download it to
build, sign, and upload.)

The best-of-breed system these days appears to be reprepro. It handles
the repository pool structure, can generate Contents files, has support
for migrating packages from one repository to another (simulating the
testing distribution or letting you pull architecture-independent packages
easily into a stable repository), and does a lot of repository validation.
The documentation is a bit incomplete, but what's there is quite good.
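
reprepro reads its per-distribution settings from the distributions file
in the repository's conf directory. The stanza for unstable is not
reproduced in this copy of the notes; a minimal sketch might look like
the following. The field names are reprepro's, but every value is a
hypothetical placeholder:

```text
Origin: Example
Label: Example
Suite: unstable
Codename: sid
Architectures: i386 amd64 source
Components: main
Description: Example local package repository
SignWith: yes
```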

I maintain a testing distribution which has a stanza like the one for
unstable (sid), with changes to Codename and Suite, and with the addition of:

Pull: everything

The corresponding stanza in the pulls file looks like:

Name: everything
From: sid

This automatically copies all packages uploaded to unstable to testing. I
do this so that systems that will use the next stable release can track
testing and get all new packages as they're uploaded, and then once
testing becomes the next stable release, I can just change the Pull
configuration to the configuration for stable and only upload new
backported packages.

Similarly, there's an entry for the current stable with changed Codename
and Suite lines and the addition of:

Pull: arch-all

The corresponding stanza in the pulls file looks like:

Name: arch-all
From: sid
FilterList: hold pull-list

This references a file named pull-list in the conf directory
which lists all the architecture-independent packages in my repository
that are safe to copy into stable whenever I upload a new version for
stable. Each line of that file looks like:

eyrie-keyring install

In other words, the name of a package and then the keyword install.

I maintain my local repository on a separate system from my normal working
system and don't want to forward my GnuPG key, so I upload packages via
scp to the repository system into an incoming directory and then use the
reprepro processincoming command. The incoming configuration file
looks like:
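
The listing is missing from this copy of the notes. A minimal sketch of
an incoming file might look like the following. The field names are
reprepro's; the rule name, the directories under /srv/debian, and the
Allow mapping are assumptions chosen to match the rest of this section:

```text
Name: default
IncomingDir: /srv/debian/incoming
TempDir: /srv/debian/tmp
Allow: unstable>sid
Cleanup: on_deny on_error
```

With a rule named default, uploads would then be processed with
reprepro -b /srv/debian processincoming default.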

I created the repository signing key (in my case, ftp@eyrie.org is
the e-mail address on the key) using the --homedir argument to gpg
to /srv/debian/keyring and gave it an empty passphrase so that I
don't need to enter a passphrase each time I upload. (I think this is an
acceptable security risk. Your mileage may vary.)

One very useful set of tools I've found, beyond the basic infrastructure
mentioned above, is the collection of utilities in the patchutils package.
These utilities manipulate diff files in various ways. Probably the most
useful is filterdiff, which allows one to extract or suppress sections of
a diff. This is extremely useful for taking apart a Debian package diff,
or removing all the debian directory modifications to find the changes to
the upstream source. There are also utilities to remove fuzz on diffs,
split diffs apart, and similar operations.
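
As an illustration of the second use, here is a small wrapper around
filterdiff. upstream_changes is a hypothetical helper name; the -z and -x
options are filterdiff's own.

```shell
# Print only the upstream-source changes from a Debian package diff by
# dropping every file whose path contains a debian directory.
# upstream_changes is a hypothetical helper name.
upstream_changes() {
    # -z transparently decompresses .gz and .bz2 input; -x excludes files
    # whose names match the shell-style pattern. Note that the pattern
    # also matches a debian directory nested deeper in the source tree.
    filterdiff -z -x '*/debian/*' "$1"
}
```

For example, upstream_changes foo_1.0-1.diff.gz > upstream.diff (with a
hypothetical file name) would leave only the changes to the upstream
source in upstream.diff.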

devscripts is the other obvious collection of utilities. I mentioned
debcommit above as a good way of committing changes to Debian packages
with an appropriate change message. debuild is the easiest way of doing a
quick package build, although I always use pbuilder and its pdebuild
command for doing real builds. uscan uses a debian/watch file to check
for a new upstream version (and including those watch files also makes
your Debian QA summary page
more useful). bts is a great command-line interface to the Debian
bug-tracking system. wnpp-alert lets you know about orphaned packages
that you use; rc-alert does the same for release-critical bugs.
tagpending tags bugs as pending based on the current debian/changelog
file. debc is a great way of checking the contents and metadata of a
newly built package.

Since I use Emacs to do all my development, I also use the dpkg-dev-el
package, which sets up major modes for the various Debian control files
and does some simple syntax highlighting. It's most useful for editing
the changelog file, but the syntax highlighting makes the control,
copyright, and README.Debian files look pretty, and it updates the date at
the bottom of README.Debian.