Making Packager-Friendly Software, Part 2

My previous article, Making
Packager-Friendly Software, explains why software packaging is sometimes
problematic due to real problems in the mainstream sources. It also discusses
many issues that affect the distribution files and the configuration scripts
(the most visible items when trying out a new program). This part explores the problems found in the build infrastructure and the code itself.

If you haven't yet read Part 1, it may be a good idea to do so
now. It introduces many concepts that you need to know to fully
understand the explanations that follow.

Recursive Dependencies

If your program has any runtime dependency, such as a shared library or a
helper utility, be careful not to introduce other direct dependencies
involuntarily. If you do, your program will end up depending on more packages
than you expected. That will make it more difficult to issue an update of an
installed package.

To get an idea of the real problem, consider a situation I faced a
while ago. GNOME VFS can use optional OpenSSL support. When building it with this
feature enabled, the linker flags stored in the pkgconfig file receive an extra
-lssl argument. When any other program requests the flags it
needs to link against GNOME VFS, it will see something like
-lgnomevfs-2 -lssl. What happens here? The program that requested
GNOME VFS support now directly depends on OpenSSL too, although it shouldn't (because it does not care about the internal behavior of GNOME
VFS). This has introduced a hidden dependency.

"Yeah, well, what's the problem with that?" you may ask. "Everything you've said seems harmless." Suppose that the GNOME VFS package has a future
modification to use GnuTLS
instead of OpenSSL. (This happened in pkgsrc a while ago.) You build the new
version of the GNOME VFS package and update the one installed on your system,
doing an in-place update, without removing any package using it. Assuming
nothing else depends on OpenSSL after replacing the old GNOME VFS package, you
remove OpenSSL. Oops! All those programs that previously used GNOME VFS with OpenSSL
support are now broken, because you've removed one of their dependencies.
(Remember? They linked against OpenSSL by using the -lssl flag.)
The removal was possible because the packaging system did not record that
dependency (and it shouldn't); this is another example of the hidden
dependencies explained earlier.

You may argue that the packages that use GNOME VFS need the
-lssl flag because some platforms do not support the recursive loading
of shared libraries. OK, fine, but this is something (as far as I know) that
GNU Libtool handles through the dependency_libs variable in
.la files. Leave that job to a tool that really knows what's going
on.
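To check whether a pkgconfig file leaks flags such as the -lssl above, a small audit script can help. The following is only a sketch: the function name leaked_libs and the sample file contents are made up for illustration. It prints every -l flag on the public Libs line other than the library's own. For flags that are needed only in some link modes, pkg-config also offers a Libs.private field, which the sketch deliberately ignores.

```shell
# leaked_libs PCFILE OWNLIB
# Print each -l flag found on the public "Libs:" line of PCFILE, except
# the library's own flag (OWNLIB is given without the -l prefix).
# "Libs.private:" lines are not matched by the sed pattern.
leaked_libs() {
    sed -n 's/^Libs:[[:space:]]*//p' "$1" |
        tr -s ' \t' '\n\n' |
        grep '^-l' |
        grep -v -x -- "-l$2"
}
```

For a file whose Libs line reads "-lgnomevfs-2 -lssl", calling leaked_libs with gnomevfs-2 as the second argument prints -lssl. When nothing leaks, the function prints nothing and returns a non-zero status (that of the final grep).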

Handling Configuration Files

Installing configuration files isn't easy, especially if you want to please
all packaging systems and users. Some of them don't care at all about how you
install these files, while others want to do the entire task for themselves to
control what happens.

Before continuing, let me explain how pkgsrc handles them, so that you can
see the big picture from a different perspective. I'll simplify and omit some
details to make the explanation easier.

Pkgsrc installs the configuration files provided by a package in an
examples directory, namely
${PREFIX}/share/examples/package-name/. Every installation
of the respective package always updates these files. The administrator never
modifies them. They are just examples.

When the package is finally installed, pkgsrc looks in the
examples/ directory. For each file it finds there, it checks whether a file of the same name exists in the system configuration directory (usually, but not always, /usr/pkg/etc). If it doesn't exist, pkgsrc copies it verbatim.
If it does exist, pkgsrc checks it for local modifications and overwrites it with the new version only if there are none. Finally, the administrator may choose to install the configuration files for a specific
package in a separate directory, outside the generic system configuration
directory.
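The copy-or-keep logic just described can be sketched in a few lines of shell. This is not pkgsrc's actual implementation. In particular, the old_examples directory (a copy of the examples shipped by the previously installed version, used to detect local modifications) and the function name install_config are assumptions made for the example.

```shell
# install_config EXAMPLES_DIR OLD_EXAMPLES_DIR SYSCONFDIR
# OLD_EXAMPLES_DIR is a hypothetical copy of the examples from the
# previously installed version of the package.
install_config() {
    examples=$1 old_examples=$2 sysconfdir=$3
    for src in "$examples"/*; do
        [ -f "$src" ] || continue
        name=$(basename "$src")
        dest=$sysconfdir/$name
        if [ ! -e "$dest" ]; then
            # Not present yet: copy the example verbatim.
            cp "$src" "$dest"
        elif [ -f "$old_examples/$name" ] &&
             cmp -s "$dest" "$old_examples/$name"; then
            # Identical to the old example, so never edited: upgrade it.
            cp "$src" "$dest"
        else
            # Locally modified (or of unknown origin): leave it alone.
            echo "keeping modified $dest" >&2
        fi
    done
}
```

The important property is the last branch: a file the administrator has touched is never replaced silently.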

As you can see, this particular packaging system wants to do all the
installation by itself. The advantage of this is that a generic and
consistent framework handles all configuration files. It also works
perfectly with binary packages. The disadvantage is that unfortunately, almost
all third-party utilities need patches to make them work within this framework.
I am sure that other packaging systems also have their own structure to manage
configuration files, and that they need to patch programs to work according to
their rules.

"Why do these programs need patching?" you ask. Some of the reasons:

Some packages install configuration files directly into the system
configuration directory (known as sysconfdir in GNU Autoconf
terminology). This is the most common problem: the
packaging system won't know about those files and won't be able to perform
its sanity checks or its usual handling of configuration files.

Some packages don't properly honor the --sysconfdir flag given to the
configuration script, or they don't have such a flag or a similar one. That means you can't choose, at configuration time, where the program will look
for its configuration files at runtime. The sources need
patching to remove the /etc string or similar hard-coded paths.

Here are some better ways to handle configuration files, in case you need
them. I know that some people will disagree with the last ones; in that case, consider following these rules only in
--enable-packager-mode.

Provide a --sysconfdir flag (or a similarly named one) in your
configuration script that takes an absolute path to where to place the
configuration files. GNU Autoconf already does this.

If your program installs a lot of configuration files, consider using a
directory under sysconfdir. However, if you do that, add a
--sysconfsubdir flag (or a similarly named one) to specify the name of
this subdirectory; this should be a path relative to sysconfdir, as
determined by the previous point.

Construct the whole path to look for files based on the values passed to
the previous arguments. Then make your program open the configuration files
from that directory exclusively. Try to keep the definition of the
path in a single place (for example, in the configuration script) so that it is
easy to handle.

Never touch the contents of the configuration directory by
yourself. Instead, install your sample configuration files, if any,
under PREFIX/share/examples/package-name/ (or a similar
directory of your choice).

Optionally, add a flag to the configuration script, say
--install-cfg-files, that copies example files from the examples
directory to sysconfdir, following whichever method you think is more
appropriate. However, try to keep this defaulting to no; or, if you still want
to set it to yes, add a big note somewhere saying so.
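For a hand-written configuration script, the rules above might translate into something like the following sketch. The flag names come from the text; the function name parse_args, the variable names, the /usr/local default prefix, and the package name myprog are placeholders chosen for the example.

```shell
# parse_args ARGS...
# Sets prefix, sysconfdir, sysconfsubdir, install_cfg_files, cfgdir and
# examplesdir from configure-style arguments.
parse_args() {
    prefix=/usr/local
    sysconfdir=
    sysconfsubdir=
    install_cfg_files=no          # copying into sysconfdir is opt-in
    for arg in "$@"; do
        case $arg in
            --prefix=*)          prefix=${arg#--prefix=} ;;
            --sysconfdir=*)      sysconfdir=${arg#--sysconfdir=} ;;
            --sysconfsubdir=*)   sysconfsubdir=${arg#--sysconfsubdir=} ;;
            --install-cfg-files) install_cfg_files=yes ;;
            *) echo "unknown option: $arg" >&2; return 1 ;;
        esac
    done
    : "${sysconfdir:=$prefix/etc}"
    # The single place where the runtime configuration path is computed.
    cfgdir=$sysconfdir${sysconfsubdir:+/$sysconfsubdir}
    # Sample files always go here, never directly into cfgdir.
    examplesdir=$prefix/share/examples/myprog
}
```

The value of cfgdir would then be substituted into the sources (for example, as a preprocessor definition) so that the program reads its configuration from that directory only.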

Unprivileged Builds

Many packaging systems (including pkgsrc) let you build packages as a
regular user and require superuser privileges only to install them (to set
the right permissions, ownerships, setuid flags, and so on). Therefore, you should
make sure that your program builds correctly without superuser privileges to
ease the packaging task. I can't think of an example in which a program requires
full privileges to build.

Furthermore, the installation stage should not trigger any rebuild rule
(causing the creation of new files) and must not leave temporary files in the
source tree.

The rationale behind this is the following: the only stage that happens as a
privileged user is the installation. Other stages, such as the build or the
cleaning, happen as a regular user. The stage most affected by incorrect
permissions is the cleaning that happens after an installation, because the
regular user will not be able to remove files owned by the
superuser (especially if the superuser owns a directory in the source tree).
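One way to verify that the installation stage creates nothing in the source tree is to compare a listing of the tree taken before the installation with one taken afterwards. The helper names below (snapshot_tree and new_paths_since) are hypothetical; this is only a sketch of the idea.

```shell
# snapshot_tree DIR
# Print every path under DIR, sorted in the C locale so that comm(1)
# can compare two snapshots reliably.
snapshot_tree() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# new_paths_since SNAPSHOT_FILE DIR
# Print the paths that exist under DIR now but are absent from the
# earlier snapshot stored in SNAPSHOT_FILE.
new_paths_since() {
    snapshot_tree "$2" | LC_ALL=C comm -13 "$1" -
}
```

Save the output of snapshot_tree to a file after the build, run the installation, and then call new_paths_since on that file. Any output points at a rebuild rule or a temporary file triggered by the install target.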

The Make Utility

It is very common to find makefiles that are GNU Make-specific and fail to
build with other tools. This can manifest in two ways: either the other make
utility produces a syntax error and aborts, or it shows strange errors such as
missing rules. The former is easy to diagnose and fix; the latter can lead
to headaches, especially if you're a novice packager.

When developing your program, you should try to build it with at least GNU
Make and BSD's make. If it builds only with GNU's, you can:

Try to fix your makefiles. That can be impossible, depending on where
the incompatibilities come from. If a third-party utility introduced them, or
if you genuinely need a GNU Make-specific feature, you won't be able to fix
them.

However, in many situations you can overcome the lack of special
functions by using the substitution features of configuration scripts, thus
generating portable makefiles. This may not be a trivial task, but the
portability gain is worth it.

Clearly document that your program requires GNU Make to build. For
example, add a note in the readme file.

Make your configuration script check whether the make utility is available
(you should always do this), check whether it is GNU Make, and fail if it isn't.
For example, if you use GNU Autoconf to generate your configuration script, you
could do something like this:
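A rough sketch of such a check follows, written as plain shell of the kind a configuration script could contain. It is an illustration rather than the author's own listing, and the function names is_gnu_make and find_gnu_make are made up for the example. With Autoconf, AC_PATH_PROGS can perform the search and AC_MSG_ERROR can report the failure.

```shell
# is_gnu_make COMMAND
# Succeed if COMMAND identifies itself as GNU Make.
is_gnu_make() {
    "$1" --version 2>/dev/null | grep 'GNU Make' >/dev/null
}

# find_gnu_make
# Print the make utility to use. A MAKE value set in the environment
# takes precedence; otherwise try gmake, then make. Return non-zero if
# no candidate turns out to be GNU Make.
find_gnu_make() {
    if test -n "${MAKE:-}"; then
        candidates=$MAKE
    else
        candidates="gmake make"
    fi
    for candidate in $candidates; do
        if is_gnu_make "$candidate"; then
            echo "$candidate"
            return 0
        fi
    done
    echo "error: GNU Make is required to build this program" >&2
    return 1
}
```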

This does a good job at automatic detection (which is acceptable here) and,
most importantly, lets the user override the check by setting the
MAKE variable in the environment prior to calling
configure.

In my opinion, you should try to fix your makefiles first, because the
incompatibilities often come from minor details. If that's impossible, go for the
configure check. Really. Please note that depending on GNU Make by itself is
not a problem, because good packaging systems can use it when needed. The
annoyance comes from having to check manually whether the package will build with
another tool, because some packaging systems don't use GNU Make by default.
Can you imagine building an enormous program such as Mozilla, only to realize after hours of
compilation that it failed because it requires GNU Make?

Another problem related to the make utility is that it may be present with a
name other than make; common names are gmake for GNU
Make, and bmake and pmake for BSD's make. If you call
the make utility within your makefiles using the make name, it may
pick up a different version than the one you need, causing further problems.
To solve this issue, simply avoid hard-coding make in
makefiles; use ${MAKE} instead.

Still, be aware that some old make utilities do not set the
${MAKE} variable while running. In such cases, you can use GNU
Autoconf's AC_PROG_MAKE_SET macro, which fills in an @SET_MAKE@ line in your makefile templates, to define it.
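To find hard-coded make invocations in an existing makefile, a crude search over its command lines is often enough. The sketch below (the function name find_hardcoded_make is hypothetical) reports lines that start with a tab and call make by name. It is only a heuristic and will also flag harmless lines such as an echo that happens to contain the word.

```shell
# find_hardcoded_make MAKEFILE
# Print, with line numbers, the command lines (those starting with a
# tab) that invoke "make" literally instead of through ${MAKE} or
# $(MAKE). Leading @, + and - command prefixes are skipped.
find_hardcoded_make() {
    tab=$(printf '\t')
    grep -nE "^${tab}[@+-]*(.*[ ;&|(])?make( |\$)" "$1"
}
```

Each reported line is a candidate for replacing make with ${MAKE}.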