Section 8: Compile and Install Source Code

Installing prebuilt binary packages, as I discussed in Lab 5.3, "Install and Upgrade RPMs," is a fine way to extend your SUSE Linux distribution. However, sometimes you'll come across an application for which no binary package is available.

Some people are put off the idea of installing from source, thinking that they will need to look at, and understand, the source code of the applications they are installing. This is hardly ever the case. There is a standard sequence of commands, discussed in this lab, that will build most source packages on your Linux platform. It is not really any more complicated than the commands needed to install a binary package. In the (unlikely) event that you start seeing incomprehensible error messages from the compiler as you try to build the package, my advice is: if you can't figure out what's wrong within 15 minutes, give up! Unless you're a seasoned developer, you have little chance of fixing it (and even if you are, you'd probably just cut and paste those error messages into a bug report anyhow). Lest you become discouraged by this, I should emphasize that most source packages will build without any difficulty on your SUSE Linux distribution.

Tip: The distinction between a binary (machine code) package and a source package is important. A binary package includes one or more executable binary programs—files that contain machine code for a specific processor such as a Pentium, or a SPARC, or an IBM Z-series mainframe. The machine code contained in these packages consists of instructions that can be executed by a specific processor. A binary package built for one processor architecture won't work on another. A source package, on the other hand, contains files of source code written in a specific programming language (often the C language in the case of Linux applications). These files cannot, as they stand, be run on any processor at all. They need to be translated (compiled) into machine code for a specific processor. To do this, you need to have the appropriate development packages installed on your machine.

The advantages of installing an application from source code instead of a binary are:

A single source package can be distributed that can be built and run on multiple architectures—various versions of Linux and Unix, and even (sometimes) on Windows.

You can get the latest and greatest versions of applications for which no one has (yet) built a binary package for your platform. Availability of binary packages always lags a little behind the source packages. For some applications, you cannot rely on someone building a binary package for your platform at all, though this is less likely to be true for a popular distribution such as SUSE.

If you build a package from source, you get a lot more flexibility in deciding which features of the application you want to include and exclude. You also get to decide where the various pieces will be installed—for example, the directory in which the executables themselves will go, where the config files will go, where the documentation will go, and so on. If you install a binary package, you must accept the configuration decisions made by whoever built the package.

How Do I Do That?

First, make sure that you have the development tools installed. The key packages are:

gcc

The GNU C Compiler

cpp

The C preprocessor

make

A tool for controlling the build process

These are all included with the SUSE distribution. The easiest way to make sure you have them is to open the YaST package management screen, select the Selections filter, and make sure that the "C/C++ Compiler and Tools" checkbox is checked. Some packages have installation scripts written in Perl, so it wouldn't do any harm to have the perl package installed as well.

Next, you'll need to find and download the source code archive of the package you want to install. You can find some advice on actually locating the package in Lab 5.2, "Finding the Packages You Need."

Most source code archives are delivered as compressed tar archives, commonly known as tarballs. There are two types of compression in common use. One is called gzip, and usually the filename will end in .tar.gz, or occasionally .tgz. The other is bzip2, and usually the filename will end in .tar.bz2. Bzip2 compression is a little more effective than gzip; it gives smaller file sizes. The GNU version of tar is able to decompress both formats on the fly, provided that you supply the right flags: z for gzip decompression and j for bzip2 decompression.

Tip: No, you're right, there is not much mnemonic value in using j for bzip2 decompression. However, tar has so many options that I think they were running out of letters.
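
As a rough illustration of the z and j flags, the following sketch archives a tiny throwaway directory with each compression type and lists the result with the matching flag. The names demo-1.0 and README are made up for the example, and the bzip2 part is skipped if the bzip2 program is not installed.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
work=$(mktemp -d)
cd "$work"

# A made-up "package" containing a single file.
mkdir demo-1.0
echo "read me first" > demo-1.0/README

# c = create, f = archive file name, z = compress with gzip.
tar czf demo-1.0.tar.gz demo-1.0

# t = list the contents without unpacking; z again selects gzip.
gz_listing=$(tar tzf demo-1.0.tar.gz)
echo "$gz_listing"

# The same two steps with j, which selects bzip2 instead.
if command -v bzip2 >/dev/null 2>&1; then
    tar cjf demo-1.0.tar.bz2 demo-1.0
    tar tjf demo-1.0.tar.bz2
fi
```

Listing an unfamiliar tarball with t before unpacking it with x is a cheap way to confirm that everything will land in a single subdirectory.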

You should adopt some sort of directory organization for downloading and building things. My own preference is for a directory ~/downloads into which I download the archive, and a directory called ~/builds into which I unpack the archive and do the build. Each archive unpacks and builds in its own subdirectory. There is nothing particularly special about these choices—I just find them convenient.

Tip: The tilde (~) is shorthand, understood by the shell (and hopefully by you, the reader!) to mean the user's home directory.

As an example in this lab, you will download and install a KDE-based image viewer called showimg. At the time of writing, the most recent version is 0.9.5. I found the archive showimg-0.9.5.tar.bz2 by first going to the site http://extragear.kde.org, then following the link to the project's home page at http://www.jalix.org/projects/showimg. The name showimg-0.9.5.tar.bz2 is a typical example of the naming convention for a tarball; the name consists of three components: the package name, the version information, and an extension denoting the file type. I downloaded this file into my ~/downloads directory.
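
To make the three components concrete, here is a small sketch that pulls them apart using the shell's parameter expansion. It assumes the simple name-version.tar.type pattern and a version string containing no hyphen; tarballs that deviate from this would need different handling.

```shell
file=showimg-0.9.5.tar.bz2

base=${file%.tar.*}     # drop the trailing ".tar.*" part
name=${base%-*}         # drop the last "-..." piece, leaving the package name
version=${base##*-}     # keep only what follows the last "-"
ext=${file#"$base".}    # whatever followed the base name

echo "name=$name version=$version type=$ext"
# prints: name=showimg version=0.9.5 type=tar.bz2
```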

Next, I changed to my builds directory and unpacked the archive:

$ cd ~/builds
$ tar jxf ~/downloads/showimg-0.9.5.tar.bz2

Don't forget to use the filename completion facility in the shell to avoid typing the whole name. You should now find that you have a directory called showimg-0.9.5. Change into that directory and take a look around. Most packages will provide a file called something like README or INSTALL that provides installation instructions. It's worth reading this through (twice) before you do anything else:

$ cd ~/builds/showimg-0.9.5
$ less README
$ less INSTALL

Some source packages come with their own custom scripts to handle the installation, but the majority are built using a utility called autoconf. This tool (which is run by the package developer, not by you) creates a very clever shell script called configure (which is run by you, not the package developer). The configure script probes your system to determine various features of the platform you're building on (including the libraries that are available and the characteristics of your compiler) and generates one or more Makefiles, which will control the actual build process. As this proceeds, you will see a large number of messages scrolling past as the script repeatedly probes the system. You do not need to read all of these, but you should check the last few lines for any evidence that the process has failed.

Tip: There is a well-known quote from Arthur C. Clarke that "any sufficiently advanced technology is indistinguishable from magic." By this definition, the configure script is magic.

If the configure script runs successfully, you're ready to go ahead and build the application. The command sequence will look something like this:

$ cd ~/builds/showimg-0.9.5
$ ./configure
$ make

The make command actually compiles and builds the application, and can take a while to run—anywhere from a minute to an hour depending on the size of the package and the speed of your machine. During this time, even more (mostly incomprehensible) messages will scroll past. Again, you should check the last few lines for any evidence that something went wrong. A few compilation warnings are nothing to worry about, but any actual errors need to be investigated. (In fact, the showimg compilation produces a huge number of warnings.) Once the make command has completed, all the components of the application have been built, but they all remain within the folder in which you did the build. If you changed your mind at this stage and deleted this folder and the original tarball, you would remove the package without a trace.
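
Because the messages scroll past quickly, it can help to capture them in a file and look at the tail only when something goes wrong. The helper below is one possible sketch of that idea; the name run_step and the log filenames are my own invention, not part of any package.

```shell
# run_step LOGFILE COMMAND [ARGS...]
# Runs the command, saving everything it prints to LOGFILE. On failure,
# shows the last 20 lines of the log and returns the command's exit status.
run_step() {
    log=$1
    shift
    if "$@" > "$log" 2>&1; then
        echo "ok: $*"
    else
        status=$?
        echo "FAILED (exit status $status): $*"
        tail -n 20 "$log"
        return "$status"
    fi
}

# Intended use, from inside the unpacked source directory:
#   run_step configure.log ./configure && run_step make.log make
```

The && between the two steps means that make is attempted only if configure succeeded. The full logs remain on disk, which is handy if you end up pasting error messages into a bug report.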

The final step is to install the package; that is, to copy the executables, libraries, configuration files, documentation, and so on, into the correct system directories. You need to be root for this. Just cd to the build directory and run the command:

# make install

This step is a little harder to back out from, should you change your mind. Some packages provide a make uninstall target, but many don't. In that case, you would need to manually find and delete each of the installed files. There are usually options available for the ./configure command to specify where the pieces will get put (see the following discussion). Running ./configure --help will generally show these and tell you what the defaults are. You may wish to get into the habit of saving a listing of the install directories prior to the installation, repeating the listing after the installation, and doing a diff on the two. For example, because I know that the package installs into /opt/kde3 on a SUSE system, I would list the contents of /opt/kde3 before and after running make install and compare the two listings.
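
The before-and-after listing might be sketched along these lines. The helper name snapshot and the /tmp filenames are assumptions for illustration; only the /opt/kde3 directory comes from the discussion above.

```shell
# snapshot DIR OUTFILE
# Writes a sorted list of everything under DIR (except directories) to OUTFILE.
snapshot() {
    find "$1" ! -type d 2>/dev/null | sort > "$2"
}

# Intended use (the make install step needs root):
#   snapshot /opt/kde3 /tmp/before.txt
#   make install
#   snapshot /opt/kde3 /tmp/after.txt
#   diff /tmp/before.txt /tmp/after.txt
# Lines that diff marks with ">" are files the installation added.
```

This catches only files that were added or removed. A file that the installation overwrote in place appears in both listings and so does not show up in the diff.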

Okay, so now you have the package installed. What next? Using a new package may simply be a case of running the binary. You may have to invoke it by name from a command prompt, because an installation from source is unlikely to add an entry to your desktop menus for you, though this is not hard to do manually. You can learn how to edit the KDE menus in Lab 3.3, "Configure the KDE Menus and Panel," in Chapter 3. Also, be aware that the directory you install the binaries into may not be on your search path, so you may have to specify a full pathname to the binary. Some packages have wrapper scripts to simplify the task of launching the applications. Other packages have nontrivial configuration files and will require further work before you can get the program to do anything useful.
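
If the install directory is not on your search path, one way to add it for the current shell session is sketched below. It assumes the binaries went into /opt/kde3/bin, as in this lab's example; substitute whatever directory you gave to configure.

```shell
bindir=/opt/kde3/bin

# Append bindir to the search path only if it is not already there.
case ":$PATH:" in
    *":$bindir:"*) ;;
    *) PATH="$PATH:$bindir"
       export PATH ;;
esac

# Report where the shell would now find the program, if anywhere.
command -v showimg || echo "showimg is not on the search path"
```

The change lasts only until you close the shell. To make it permanent, you would put the same lines into a shell startup file such as ~/.bashrc (assuming you use bash).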

What About...

...specifying build options? The configure script usually has a shedload of options for customizing the build process. The exact options depend on the package. Run the command:

$ ./configure --help

to see all available options. This example, obtained by running the configure script of the Apache server, is heavily edited:

$ ./configure --help
'configure' configures this package to adapt to many kinds of systems.
Usage: ./configure [OPTION]... [VAR=VALUE]...
To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE. See below for descriptions of some of the useful variables.
Defaults for the options are specified in brackets.
Configuration:
  -h, --help              display this help and exit
      --help=short        display options specific to this package
      --help=recursive    display the short help of all the included packages
  -V, --version           display version information and exit
  -q, --quiet, --silent   do not print 'checking...' messages
      --cache-file=FILE   cache test results in FILE [disabled]
  -C, --config-cache      alias for '--cache-file=config.cache'
  -n, --no-create         do not create output files
      --srcdir=DIR        find the sources in DIR [configure dir or '..']
Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX
                          [/usr/local/apache2]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX
                          [PREFIX]
By default, 'make install' will install all the files in
'/usr/local/apache2/bin', '/usr/local/apache2/lib' etc. You can specify
an installation prefix other than '/usr/local/apache2' using '--prefix',
for instance '--prefix=$HOME'.
For better control, use the options below.
Fine tuning of the installation directories:
  --bindir=DIR            user executables [EPREFIX/bin]
  --sbindir=DIR           system admin executables [EPREFIX/sbin]
  --libexecdir=DIR        program executables [EPREFIX/libexec]
  --datadir=DIR           read-only architecture-independent data [PREFIX/share]
...edited...
System types:
  --build=BUILD           configure for building on BUILD [guessed]
  --host=HOST             cross-compile to build programs to run on HOST [BUILD]
  --target=TARGET         configure for building compilers for TARGET [HOST]
Optional Features:
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --enable-layout=LAYOUT
  --enable-v4-mapped      Allow IPv6 sockets to handle IPv4 connections
  --enable-exception-hook Enable fatal exception hook
  --enable-maintainer-mode
                          Turn on debugging and compile time warnings
  --enable-modules=MODULE-LIST
                          Modules to enable
  --enable-mods-shared=MODULE-LIST
                          Shared modules to enable
  --disable-access        host-based access control
  --disable-auth          user-based access control
  --enable-info           server information
...edited...
Optional Packages:
...edited...
  --with-suexec-docroot   SuExec root directory
  --with-suexec-uidmin    Minimal allowed UID
  --with-suexec-gidmin    Minimal allowed GID
  --with-suexec-logfile   Set the logfile
  --with-suexec-safepath  Set the safepath
  --with-suexec-umask     umask for suexec'd process
...edited...

The range of options that we see in this output is typical for a configure script. Usually, there are options to:

Control the operation of the script itself (e.g., -q for the "quiet" option).

Specify the directories into which the various pieces will be installed. Typically, the option --prefix specifies the top-level directory, and the other install directories are derived from that. Optionally, you can fine-tune the directories with options such as --bindir, which determines where the executables will go.

Specify the platform for which the package will be built. Usually, the platform you're building on is the same as the platform you'll be running on, so you just let the script probe the system to determine the platform automatically.

Enable or disable optional features of the package. Apache has quite a lot of these options, because most of its functionality is provided through the use of modules that can be either statically built into the executable or loaded dynamically. For example, the directive --enable-info specifies inclusion of a module that provides a web page of information about the server itself.

A simple example of a configure command for Apache, which includes the info module in the build and installs into /opt/apache2, would be:

$ ./configure --prefix=/opt/apache2 --enable-info

The commands can get much longer than this.

What About...

...installing from source RPMs?

There is another format that's sometimes used for distributing source code: a source RPM. Source RPM files have names ending in .src.rpm. They contain an archive (usually a compressed tar archive) of the source code for a package, along with a spec file that specifies how to build the binary version of the package from the source. Source RPMs are what you'll find on the source code CD or DVD that comes as part of your SUSE Linux distribution. SUSE uses source RPMs to maintain a common code base for SUSE Linux and the many packages it includes, across the various hardware architectures that SUSE supports. These include 32-bit and 64-bit Intel architectures, IBM PowerPC, and IBM zSeries mainframes.

If you install a source RPM package (using the rpm -i command in the usual way), you will end up with the package files in the folder /usr/src/packages/SOURCES and the spec file in /usr/src/packages/SPECS. You can build a binary RPM from a source RPM using a command of the form rpmbuild -bb specfile.

How Does It Work?

The real magic behind installing from source is the configure script. This is a shell script generated by the utility called autoconf, which is run by the developer of the package. Autoconf creates the configure script from a template file that lists the operating system features that the package can use, in the form of m4 macro calls. (m4 is a macro text processor that has been part of Unix/Linux for many years.) The configure script does not actually build the package; it runs a sequence of feature tests to see what features the system can support and, based on the results of these tests, generates one or more "makefiles" that control the build itself. It is this process that enables the same source package to be built on a wide range of Unix and Linux platforms.

Makefiles define a set of rules that can be used to build one or more components (called targets) from their constituent parts. They define the dependencies of the targets on the constituent pieces (in this case, the source modules) and are interpreted by a tool called make, which figures out the commands that need to be executed to build the targets. Makefiles and the make utility are commonly used to control the building (and rebuilding) of software projects. After source code has been edited, make compares the time stamps on the source and binary files to figure out which binaries are out of date and need to be recompiled. In the case of building a package from source code you have just downloaded, none of the binaries exist of course, so everything needs to be compiled.

A full treatment of make would require a chapter (or book) to itself, but I will present a simple example to give you an idea of how it works. The problem that make is designed to solve is that most applications are built from many source files. When one or more of those source files has been edited and the application needs to be rebuilt, only those files that were changed need to be recompiled. The make utility helps to automate the process of figuring out exactly what needs to be done to rebuild the application. Make relies on the "time of last modification" time stamp on the files, and on the notion of a "dependency graph" (discussed shortly) that shows which files depend on which others. This approach is much smarter than (for example) writing a shell script that simply recompiles everything in sight, and can save substantial time in rebuilding an application.

make can be used in any situation where files must be built that depend on other files. For example, documentation files can be built automatically from comments in program source code. Indexes can be built from collections of text files. However, it is in the building and rebuilding of applications from their source code that make finds its widest application. This is the example that I consider here. Let's take a hypothetical (and much simplified) example based on the showimg application, which is distributed across three C source files:

ui.c

The code that creates the user interface

png.c

The code for displaying PNG images

jpg.c

The code for displaying JPEG images

In addition, there are two header files, png.h and jpg.h, that define the interfaces to the png and jpg modules. The build model for an application written in C is that each source file is separately compiled into an object file (for example, ui.c is compiled into ui.o), then the object files are linked together to create the application—in this case, showimg. Figure 5-11 shows the dependency graph of these pieces. This dependency graph shows that (for example) showimg is dependent upon the files ui.o, png.o, and jpg.o. This dependency implies that (a) all three of these .o files must exist before showimg can be built, and (b) if the last-modified time stamp on any of the .o files is more recent than the last-modified time stamp of showimg, then showimg is out of date and needs to be rebuilt. Similarly, working our way down the graph, ui.o is dependent on ui.c and the header files png.h and jpg.h, and so on.

Figure 5-11. Dependency graph for make
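
Rules (a) and (b) can be mimicked in a few lines of shell, using the -nt ("newer than") file test. This is only a sketch of the time stamp comparison, not a description of how make is implemented; the function name out_of_date is made up, and the files are empty stand-ins with artificial time stamps.

```shell
# out_of_date TARGET DEPENDENCY...
# Succeeds (exit status 0) when TARGET needs rebuilding: either it does not
# exist yet, or at least one dependency has a newer time stamp.
out_of_date() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for dep in "$@"; do
        [ "$dep" -nt "$target" ] && return 0
    done
    return 1
}

# Try it on stand-in files with explicit time stamps (touch -t YYYYMMDDhhmm).
work=$(mktemp -d)
cd "$work"
touch -t 202001010900 ui.o png.o jpg.o     # objects "compiled" at 09:00
touch -t 202001011000 showimg              # program "linked" at 10:00

if out_of_date showimg ui.o png.o jpg.o; then first=stale; else first=current; fi

touch -t 202001011100 png.o                # png.o "recompiled" at 11:00
if out_of_date showimg ui.o png.o jpg.o; then second=stale; else second=current; fi

echo "before: $first, after: $second"
# prints: before: current, after: stale
```

What make adds on top of this comparison is the walk through the graph: before it checks showimg against the .o files, it first applies the same test to each .o file and its own dependencies.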

To use make, a Makefile must be built that describes the dependency graph. The Makefile for this example might look like this:
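
What follows is a minimal sketch of such a Makefile, written out by a short shell session so that it can be tried in a scratch directory. The ui.o rule follows the dependency graph described above; the png.o and jpg.o rules and the cc commands are my assumptions about what "and so on" would look like. Rebuild commands are typed with a leading "> " marker, which sed converts into the tab character that make requires.

```shell
work=$(mktemp -d)
cd "$work"
tab=$(printf '\t')

# Each rule reads "target: dependencies", followed by its rebuild command.
sed "s/^> /$tab/" > Makefile <<'EOF'
showimg: ui.o png.o jpg.o
> cc -o showimg ui.o png.o jpg.o

ui.o: ui.c png.h jpg.h
> cc -c ui.c

png.o: png.c png.h
> cc -c png.c

jpg.o: jpg.c jpg.h
> cc -c jpg.c
EOF

# Empty stand-ins for the source files, so that a dry run has something to find.
touch ui.c png.c jpg.c png.h jpg.h

# make -n prints the commands it would run without executing any of them.
if command -v make >/dev/null 2>&1; then
    make -n showimg
fi
```

The dry run should list the three compile commands followed by the link command, which is the order in which make would bring the pieces up to date.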

Each entry in a Makefile consists of a target, a colon, and a list of dependencies, followed on the next line by the rebuild commands. The target simply names a file that make knows how to rebuild. The dependencies list the files that must be present (and, in their turn, up to date) in order to build the target. The rebuild commands (which appear on a separate line and must be indented by a tab character) specify the commands that must be executed to bring the target up to date, assuming that up-to-date copies of the dependencies exist.

Once the Makefile is created, make is invoked by specifying a target name as an argument, for example:

$ make showimg

If no argument is given, make defaults to building the first target, in this case showimg. It will begin by making sure that this target's dependencies are up to date (in this case the files ui.o, png.o, and jpg.o) by looking them up as targets and, in turn, verifying their own dependencies. In this way, make performs a "depth-first traversal of the dependency graph" (a phrase I have found to be a wonderful conversation stopper at parties), bringing everything up to date as it works its way back up to the top of the dependency tree.

Some of the targets in a Makefile define administrative actions rather than the creation of a file. There is often a target named install that installs the application into the appropriate system directories. In this example, it might look like this:

install: showimg
	mv showimg /opt/kde3/bin

Now I can type:

# make install

to move the program into its proper place. Defining showimg as a dependency of the install target will cause make to ensure that showimg is up to date before installing it into the system directory /opt/kde3/bin.

Another common target, conventionally named clean, is used to remove any intermediate files used in the build process. For example, you might add this target to the Makefile:

clean:
	rm ui.o png.o jpg.o

With this target in place, you can type:

$ make clean

to get rid of the .o files (thus forcing a full recompile next time).

You can also define variables in Makefiles. Variables are commonly used to centralize the specification of system-dependent commands or directories, or to give a name to a long list of files (often a list of .o files) to avoid repeating that list in multiple places in the file. In fact, almost 40 percent of the real Makefile for the showimg application consists of variable definitions. As a simple example, here's a revised version of the Makefile that uses three variables—one to define the system-specific command to run the C compiler, one to define the directory that the application should be installed into, and one to provide a shorthand for the list of object files. Notice the use of the notation $(CC), which is replaced by the value of the variable:
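
A sketch of what such a revised Makefile could look like is shown below, again written out by a shell session for experimenting in a scratch directory. Only the name CC is taken from the text; INSTDIR and OBJS are names I have picked for illustration, and the rules other than ui.o carry the same assumptions as before. Lines typed with a leading "> " marker become tab-indented rebuild commands.

```shell
work=$(mktemp -d)
cd "$work"
tab=$(printf '\t')

# The quoted 'EOF' stops the shell from trying to expand $(CC) and friends
# itself, so they reach the Makefile untouched.
sed "s/^> /$tab/" > Makefile <<'EOF'
CC = gcc
INSTDIR = /opt/kde3/bin
OBJS = ui.o png.o jpg.o

showimg: $(OBJS)
> $(CC) -o showimg $(OBJS)

ui.o: ui.c png.h jpg.h
> $(CC) -c ui.c

png.o: png.c png.h
> $(CC) -c png.c

jpg.o: jpg.c jpg.h
> $(CC) -c jpg.c

install: showimg
> mv showimg $(INSTDIR)

clean:
> rm $(OBJS)
EOF

# A dry run shows the variables after substitution.
if command -v make >/dev/null 2>&1; then
    make -n clean
fi
```

Changing the compiler or the install directory now means editing one line at the top of the file rather than hunting through every rule.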

This brief treatment barely scratches the surface of make. The automatically generated Makefiles that you'll come across if you install software from source represent much more advanced examples of the Makefile writer's art.

For information on building source RPMs (SRPMs), take a look at http://www.opensuse.org/SUSE_Build_Tutorial. You don't (generally) need to be a seasoned developer to package software, but you do need to know the system you are packaging for quite well.

For everything you always wanted to know about make, but were too embarrassed to ask, try Managing Projects with GNU Make by Robert Mecklenburg (O'Reilly).