Building Tiny Linux Systems with Busybox--Part I

Create a Busybox single-floppy Linux system that includes a kernel, command-line environment and your application.

Because Linux is small and easy to
customize, it's a fine kernel for embedded systems. But what about
all of the other programs that are needed for a minimum functional
GNU/Linux system? The minimum system installed by the Debian or Red
Hat set-up disks, exclusive of the kernel, is about 40 megabytes in
size. Busybox replaces the GNU/Linux distribution with a large set
of command-line tools--all that are needed to boot and run a
practical Linux system with networking--in a very small package.

The typical compiled size of Busybox on the i386 architecture
is 256 to 500K total for all tools, depending on the C library used
and how it is linked. This makes it easy to create a
single-floppy Linux system with a full-featured kernel, a
command-line environment, plus your application.

I originally wrote Busybox in 1996 for the Debian GNU/Linux
setup disk. The goal was to put a complete bootable system on a
single floppy that would be both a rescue disk and an installer for
the Debian system. A rescue disk is used to repair a Linux system
when that system has become unbootable. Thus, a rescue disk needs
to be able to boot the system and mount the hard disk file systems,
and it must provide the command-line tools needed to bring the
hard-disk root file system back to a bootable state. The Debian
installer at the time I wrote Busybox was a Bourne shell script
using dialog to provide a simple
graphical interface on an ANSI terminal or the Linux
console.

Since its creation, many people have added to and maintained
Busybox, including members of the Debian Boot-Floppies team, the
Linux Router Project and Lineo Corporation, where Erik Andersen
maintains Busybox today. In the tradition of Free Software
projects, contributions by other authors now make up the majority
of the project. Busybox is a part of almost every commercial
embedded Linux offering, and is found in such diverse projects as
the Kerbango Internet Radio and the IBM Wristwatch that runs
Linux.

The name Busybox comes from a child's
toy box with a telephone dial and a number of knobs, buttons and
other devices, all of which make noises when operated. This was
called a busy box in the past and is today
commonly referred to as an activity
center.

The Busybox source code can be found at
www.busybox.lineo.com.
By default, the Makefile provided
builds a dynamic-link executable using the default
libc library on your system.
However, it is easily adapted to cross-compilation, and one can
select static linking and other, specialized libc libraries by
editing Makefile variables. Before you exercise the Makefile
options or embed Busybox, you should build and run it on your host
system just so that you can get familiar with it. If you are on a
Linux system with the development tools installed, simply typing
make should build it.

Once you build Busybox, it's time to learn about
multi-call executable files. This is a trick
we use to make Busybox small. There is one executable called
busybox that is linked to 107
different names and provides the functions of 107 different
programs. To illustrate this, run the following shell
commands:

ln busybox ls
ln busybox uptime
ln busybox whoami

Now, run these commands:

./ls
./uptime
./whoami

Be sure to type the leading "./" as illustrated above, or you
will get the system version of these commands rather than the
Busybox version.
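
The idea behind a multi-call executable can be sketched in a few
lines of shell. This is only an illustration and not how Busybox
itself is implemented; the file name multicall.sh and the applet
names hello and today are made up for the example. The script looks
at the name it was invoked under and chooses its behavior from that
name:

```shell
#!/bin/sh
# Illustration only: a toy multi-call script. The file name multicall.sh
# and the applet names "hello" and "today" are hypothetical.
set -e

workdir=$(mktemp -d)
cd "$workdir"

cat > multicall.sh <<'EOF'
#!/bin/sh
# Choose a behavior from the name this file was invoked as.
case "$(basename "$0")" in
    hello) echo "hello, world" ;;
    today) date +%Y-%m-%d ;;
    *)     echo "unknown applet: $(basename "$0")" >&2; exit 1 ;;
esac
EOF
chmod +x multicall.sh

# Give the one file two additional names.
ln multicall.sh hello
ln multicall.sh today

# The same file now behaves differently depending on its name.
hello_out=$(./hello)
today_out=$(./today)
echo "$hello_out"
echo "$today_out"
```

Running the file under its original name, ./multicall.sh, falls
through to the "unknown applet" branch, because no behavior is
registered for that name.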

The ln command is used to
apply another name to a file. It does not copy the file. The only
space it uses in the file system is the small amount that is
necessary to store a name in a directory. You can illustrate this
using the following ls
command:

ls -il busybox ls uptime whoami

Be sure to use the -il argument to
ls as above. This causes ls to
print the inode number, a unique number identifying each file in a
file system, along with the usual long listing provided by the -l
argument. Linux allows the same file to have more than one name,
but a file only has one inode number. You should see something
similar to Listing 1.

Note that this is not a listing of four files. It is a
listing of one file with four names, as indicated by the inode
number in the first column and the link count
in the third column. The link count reports how many names a file
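has. You'll notice that directories in the common Linux file
systems always have a link count of at least two, because they have
at least two names: their entry in the parent directory and "."
within themselves. Each subdirectory adds one more through its
".." entry. Most regular files, however, only have one name and
their link count will be 1.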
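
If you want to check inode numbers and link counts without a
Busybox binary at hand, the following sketch does the same
experiment on a scratch file. The file names original, alias1 and
alias2 are made up for the example, and the script assumes a file
system that supports hard links:

```shell
#!/bin/sh
# Illustration only: the file names original, alias1 and alias2 are made up.
set -e

workdir=$(mktemp -d)
cd "$workdir"

echo "some data" > original
ln original alias1
ln original alias2

# Column 1 of "ls -i" is the inode number.
inode_original=$(ls -i original | awk '{print $1}')
inode_alias=$(ls -i alias2 | awk '{print $1}')

# Column 2 of "ls -l" is the link count.
links=$(ls -l original | awk '{print $2}')

echo "inodes: $inode_original $inode_alias"
echo "link count: $links"

# Removing one name leaves the data reachable through the others.
rm original
links_after=$(ls -l alias1 | awk '{print $2}')
echo "link count after rm: $links_after"
cat alias1
```

The two inode numbers are identical, the link count is 3 while all
three names exist, and it drops to 2 once one name is removed. The
data itself is only freed when the last name is gone.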

Because there is a fixed overhead of several kilobytes for
every executable program, compressing 107 commands into one file
saves a significant amount of space. So, just as we have linked
Busybox to four different names, we can link it to 107. This
provides us with a complete, bootable, runnable Linux system in a
very small space. Even when statically linked with GNU LIBC 6, which
has become the standard for Linux systems, Busybox occupies only
half a megabyte.

If you don't need the internationalization of LIBC 6, the old
LIBC 5 is significantly smaller. A new library intended for
embedded use, uC-Libc
(www.opensource.lineo.com),
is even smaller, but use caution if your application is
proprietary. Like Busybox, uC-Libc is covered by the GNU GPL
(General Public License), and can't be linked to proprietary
software. GNU LIBC 5 and 6, in contrast, are under the LGPL (the
Lesser General Public License) and can be linked to proprietary
applications. So, don't use uC-Libc for the libc library of a
non-free program. At this writing, uC-Libc doesn't quite provide
all of the functions required by Busybox, but it's only short a few
and these may be provided by the time you read this article.

On the Debian install floppy, I linked Busybox dynamically,
and then stripped down the shared libc library so that it only
provided the functions necessary to support Busybox and the other
executables on the floppy. This was the best way to provide a
library shared by several different executables, since the floppy
contained other programs besides Busybox. Stripping libc down to
only the functions that actually were used cut its size by half.
Rather than strip it by hand, I wrote a script that finds all of
the library functions referenced by a set of dynamic executables,
and then creates a library subset providing those functions (and
the functions they depend on). This script has since been
completely replaced by a version written by Marcus Brinkmann, which
can be found in the Debian boot-floppies package under
scripts/rootdisk/mklibs.sh. The
script and how it works are properly the subject of another article
the size of this one. However, until that article is written, one
can puzzle out how mklibs.sh works by installing the boot-floppies
package on a Debian system, building the floppies and then reading
the script carefully. Warning: mklibs.sh is probably the most
complex shell script you will ever examine.
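
As a rough idea of the first step only, the sketch below lists the
library symbols that a set of dynamic executables reference. It is
not the mklibs.sh algorithm and does not build a reduced library.
The helper name needed_symbols is made up for the example, and the
sketch assumes that nm from GNU binutils is installed and that
/bin/sh is a dynamically linked executable:

```shell
#!/bin/sh
# Illustration only, not the mklibs.sh algorithm. needed_symbols is a
# hypothetical helper name; nm is assumed to come from GNU binutils.

# Print the sorted, de-duplicated names of the dynamic symbols that the
# given executables reference but do not define themselves.
needed_symbols() {
    for exe in "$@"; do
        # -D reads the dynamic symbol table; --undefined-only keeps the
        # symbols that a shared library must supply. The last field is
        # the symbol name; sed drops any "@VERSION" suffix.
        nm -D --undefined-only "$exe" 2>/dev/null |
            awk '{print $NF}' | sed 's/@.*//'
    done | sort -u
}

if command -v nm >/dev/null 2>&1; then
    # Show the first few symbols needed by the system shell.
    needed_symbols /bin/sh | head -n 10
else
    echo "nm not found; install binutils to try this" >&2
fi
```

A real library reducer must also follow the dependencies between
the library's own functions, which is where most of the complexity
comes from.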

So, now that you know how to build and run Busybox, how do
you make a small Linux system containing it? You'll need a few
pieces: a static-linked Busybox executable, a skeleton root file
system and /dev directory populated with the proper special files,
and a Linux kernel with the features you need plus two features
that will be used to boot and run a small Linux system: RAM disks
and the compressed ROM file system. [Look for the details on how to
build a small Linux system containing Busybox in a future
issue--Ed.]
