Thursday, July 19, 2012

Creating portable Linux binaries

For some, the idea of creating a portable Linux binary is somewhat elusive.

In this article, we will be discussing how to create a Linux binary for a specific architecture, that you will have great success running on a large variety of Linux distros. This includes current releases, somewhat old ones, and hopefully far into the future.

A common problem facing those looking to deploy proprietary software on Linux, or for those trying to supply binaries to a very large user-base which will not compile your software themselves, is how to offer one binary that fits most normal scenarios.

There are generally four naive approaches to solving this problem.

The developers set up a bunch of specific distros, and compile the software on each of them, and give out distro specific binaries. This makes sense at first, till you run into some trouble.

You have to juggle many live CDs or maintain a bunch of installed distros which is painful and time consuming.

You end up only offering support for the distros you have handy, and you will get quite a few users on a more exotic distro nagging you for support, or a different and incompatible version of a distro you're already supporting.

The compilers or other build utilities on some distros are too old for your modern software, and you need to build them elsewhere, or figure out how to back port modern software to that old distro.

Builds break when a user upgrades their system.

Users end up needing to install some non standard system libraries, increasing everyone's frustration.

The developers just statically link the binaries. This isn't always a legal option due to some licenses that may be involved. Binaries which are fully statically linked also in many instances exhibit incorrect behavior (more on this later).

The developers just compile the software on one system, and pray that it works for everyone else.

Compile with a really really old distro and hope it works everywhere else. However this succumbs to the last three problems outlined in naive approach #1, and in many cases the binaries produced won't work with modern distros.

Now there are plenty of companies that supply those portable Linux binaries. You find on their website downloads for say Linux i386, AMD64, and PPC. Somehow that i386 binary manages to run on every i386 system you've tested, Red Hat, Debian, SUSE, Ubuntu, Gentoo, and both old and modern versions at that. What is their secret sauce?

Now let us dive into all the important information and techniques to accomplish this worthy goal.

First thing you want to know is what exactly is your binary linked to anyway? For this, the handy ldd command comes in.

The lines which are directed to a file are system libraries that you need to worry about. In this case, there's libstdc++, libm, libgcc, and libc. libc and libm are both part of (E)GLIBC, the C library that most Linux applications will be using. libstdc++ is GCC's C++ library. libgcc is GCC's implementation of some programming constructs that your program may be using, such as exception handling, and things like that.

In general (E)GLIBC is broken up into many sub libraries that your program may be linked against. Other notable examples are libdl for Dynamic Loading, libpthread for threading, librt for various real time functions, and a few others.

Your application will not run on a system unless all these dependencies are found, and are compatible. Therefore, versions of these also come into play. In general a newer minor version of a library will work, but not an older.

In order to find versions numbers, you want to use objdump. Here's an example with finding out what version of (E)GLIBC is needed:

In this case, 2.3 is the highest version number. Therefore this binary needs (E)GLIBC 2.3 or higher on the system. Note, the version numbers have nothing to do with the version installed on your system, rather (E)GLIBC marks each function with the minimum version that contains it.

Of course all this applies to other libraries as well, particularly libgcc and libstdc++.

Now that we know a little bit about what we're doing, I'm going to present the first bit of secret sauce.

If you're using C++, link with -static-libstdc++ this will ensure that libstdc++ is linked statically, but won't link every other lib statically like -static would. You want libstdc++ linked statically, because it's safe to do so, some systems may be using an older version (or have none at all, in the case of some servers), or you want your binary to remain compatible if a new major version of libstdc++ comes out which is no longer backwards compatible. Note that even though libstdc++ is GPL'd, it also offers a linking exception that allows you to link against it and even statically link it in any application.

If you see that your binary needs libgcc, also use -static-libgcc for the same reasons given above. Also, GCC is GPL'd, and has the same linking exception as above. I once had the unfortunate scenario where I sold a client an application without libgcc statically linked, that used exceptions. On his old server, as long as everything went absolutely perfectly, the application ran fine, but if any issue occurred, instead of gracefully handling the issue, the application terminated immediately. Since his libgcc was too old, the application saw the throws, but none of the catches. Statically linking libgcc fixed this issue.

Now you might be thinking, hey what about statically linking (E)GLIBC? Let me warn you that doing so is a bad idea. Some features in (E)GLIBC will only work if the statically linked (E)GLIBC is the exact same version of (E)GLIBC installed on the system, making statically linking pointless, if not downright problematic. (E)GLIBC's libdl is quite notable in this regard, as well as several networking functions. (E)GLIBC is also licensed under LGPL. Which essentially means that if you give out the source to your application, then in most cases you can distribute statically linked binaries with it, but otherwise, not. Also, since 99% of the functions are marked as requiring extremely old versions of (E)GLIBC, statically linking is hardly necessary in most cases.

The next bit of the secret sauce is statically linking those non standard libs your application needs but nothing else.

You probably never learned in school how to selectively static link those libraries you want, but it is indeed possible. Before the list of libraries you wish to static link, place -Wl,-Bstatic and afterwards -Wl,-Bdynamic.

Say in my application I want to statically link libcurl and OpenSSL, but want to dynamically link zlib, and the rest of my libs, such as other parts of (E)GLIBC, I would use the following as my link flags:gcc -o app *.o -static-libgcc -Wl,-Bstatic -lcurl -lssl -lcrypto -Wl,-Bdynamic -lz -ldl -lpthread -lrt

The next step is to ensure that your libraries pull in as few dependencies as possible. Here's the output from ldd on my libcurl.so:

This is quite unacceptable. Distros generally compile packages with everything enabled. Your application generally does not need everything a library has to offer. In the case of libcurl, you can compile it yourself, and disable the features you aren't using. In an example application, I only need HTTP and FTP support, so I could compile libcurl with very little, and now have this:

This is much more manageable. Refer to the documentation of your libraries in order to see how to compile them without features you don't need.

If you're going to be compiling your own libraries, you probably want to set up a second system, virtual machine, or a chroot for building your customized library versions and applications, to ensure it doesn't conflict with your main system. Especially for the upcoming tip.

It seems that OpenSSL would work on (E)GLIBC 2.3, except for one pesky function which needs 2.7+. This is a problem if I want to ship an application with this modern OpenSSL on say Red Hat Enterprise Linux 5 which comes with GLIBC 2.5, or say Debian Stable from ~4 years ago, which only has 2.4.

In this case OpenSSL is using a C99 version of sscanf(), but not actually by choice.

These two blocks of code make fscanf(), scanf(), sscanf(), vfscanf(), vscanf(), and vsscanf() use special C99 versions. Since older applications were already compiled against C89 versions, (E)GLIBC doesn't want to potentially break them and change how an existing function works. So instead, a new set of functions were created which only exist in (E)GLIBC 2.7+, and (E)GLIBC by default will direct all calls to these functions to the proper C99 versions when compiling.

Now there are some defines you can set in your library code and application code to ensure it uses the old more backwards compatible versions, but getting the exact right combination of defines without breaking anything else can be tricky. It may also be tedious to modify a code-base you're not familiar with.

Therefore, I recommend just deleting these two blocks from your <stdio.h> on your build system. You want your build system to be able to build everything for backwards compatibility, right?

If you're recompiling libraries like OpenSSL which are designed for massive portability with all kinds of systems, odds are, they're not looking for C99 support in basic scanf() family functions anyway. If you do happen to need C99 scanf() support in your application, I recommend that you add it manually with a specialized lib, for maximum portability. You can easily find a bunch online.

The last scenario that you may encounter is that you happen to want to use a modern library function. For most libs you can just statically link them, but that won't work for (E)GLIBC. Since some functions depend on system support, or that custom versions don't perform as well as the built in system ones, you definitely want to use the built in ones if they're available. The question is, how to once the binary has already been compiled?

So for our final bit of secret sauce, dynamically load any modern functions that you want to use, and work around them, or disable some functionality if not present.

Remember libdl that we mentioned above? It offers dlopen() for opening system libraries, and dlsym() for finding out if certain functions are present or not, and retrieving a pointer to them.

I'm going to post a full example that you can look at and play with. In this example, we have a program which tries to figure out how big system pipes are. In this application, we are going to see how much data we can stuff in a pipe before
we're told that the pipe is full, and the write would need to block.

Linux offers a function called pipe2() which has the crucial ability to create a pipe in non-blocking mode. If it doesn't exist, we can create it ourselves, but we prefer the built in one if possible.

Now pipe2() Was added to GLIBC in 2.9, yet this binary here according to objdump only needs (E)GLIBC 2.2.5+. Here's the output from an older system with GLIBC 2.7, using the exact same binary created on a newer system:

/tmp> ./pipe_test
Using our pipe2().
Pipe size is: 65536
/tmp>

Lastly, let me recap all the techniques we learned.

Use ldd and objdump to see version requirements of binaries and libraries.

Statically link compiler and language libraries, such as libgcc and libstdc++.

Statically link selected libraries, while dynamically linking others.

Compile selected libraries with as little needed functionality as possible.

Pushing (E)GLIBC requirements down, by being wary of functions which have changed over time, and (E)GLIBC redirects calls to them in newly compiled programs by default.

Pushing (E)GLIBC requirements down by not directly using new functions, and instead working around their presence.

Doing all this, you'll still need to make different builds for different operating systems, and different architectures like x86 and ARM, but at least you won't be forced to for all different distros and versions thereof.

One thing of note, it's possible to have Linux with different C libraries, and in those cases, you may as well be using a different Operating System. You'll be hard pressed to make complex programs compiled against one C library run on Linux which uses a different C library, where the needed one is not present. Thankfully though, all the mainstream desktop and server distros all use (E)GLIBC.

In any case, the techniques you've learned here can also be applied to other setups too. (E)GLIBC was only focused on in this article because of its popularity and its many gotchas, but many other libraries that you may use, particularly video and audio libraries have similar issues as well.

Nice article, but instead of deleting stuff from stdio.h (even it is in your buildsystem only) I recommend using a mapfile (or version-script) which specifes GLIBC_2.3 as highest usable version when linking the program.

Sorry to make an o/t comment, but I was wondering if you might be able to remark on the current state of sound in Linux.

I'm having issues where the stock installation of Ubuntu 12.04 (via Linux Mint because I HATE Unity) where ALSA/PulseAudio/whatever the hell they're using now won't see my sound card, but installing OSS4 will give me some sound.

I'm planning on following your idea to make changes to /etc/asound.conf in order to have ALSA apps use OSS natively, but I wanted to know if this is still accurate. Can you confirm?