Large-Scale Linux Configuration Management

Mr. Anderson presents some general principles and techniques for installing and maintaining configurations on a large number of hosts, then describes in detail the local configuration system at Edinburgh University.

The difficulty of installing and setting
up Linux is often mentioned as one of the reasons it is not more
widely used. People usually assume that editing the traditional
UNIX configuration files is more difficult than using the graphical
interfaces provided by operating systems like Microsoft Windows.
For a novice user with a single machine, this may be true, and most
commercial UNIX vendors now supply GUI-based tools for at least
some aspects of system configuration. Under Linux, projects like
COAS (see Resources 1) and the Red Hat distribution are starting to
cater to this need.

For a large installation with tens or hundreds of machines,
the GUI approach does not work—entering individual configuration
data for 200 machines is simply not practical. As well as the
ability to install large numbers of machines, big sites usually
need more control over the configuration; for example, they might
need to install new machines with a configuration which is
guaranteed to be identical to an existing one. Machines are also
likely to need periodic reconfiguration as their use changes, or
simply to keep up to date with the latest software and
patches.

To do this effectively requires a good deal of automation,
and large UNIX sites have been developing their own tools for many
years (see Resources 2). The flexibility and accessibility of UNIX
configuration files make Linux particularly suitable for
automation; sites attempting to install and manage large numbers
of NT systems are likely to find the process more difficult (see
Resources 3).

The Division of Informatics at Edinburgh University has over
500 UNIX machines with a wide variety of configurations.
Most of them are installed and maintained automatically using the
LCFG (Local ConFiGuration) system, originally
developed several years ago (see Resources 4). Both client and
server configurations can be easily reproduced to replace failed
machines or to create tens of identical systems for a new
laboratory. Reconfiguration is a continuous process; for
example, machines adjust every night to ensure they are running
the latest versions of the required software. Linux (we use a
version of the Red Hat distribution) has proven itself well-suited
to this environment, and it has recently overtaken Solaris to
become the most popular desktop system, both for staff use and
student laboratories.

Make-Up of a Good Configuration System

An automatic configuration system should be able to build
working machines from scratch with no manual intervention. This
includes configuration of the basic operating system (disk
partitions, network adaptors), loading of required software, and
configuration of application-specific services such as web servers.
This allows failed machines to be recreated quickly, using
replacement hardware, and new machines to be installed efficiently,
even by junior staff. As a side effect, it also avoids the need for
backups of any system partition.
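
Red Hat's Kickstart mechanism is a familiar example of such an
unattended build: a single file describes the disk layout, network
setup and package selection, and the installer does the rest. A
minimal fragment might look something like this (the values are
purely illustrative):

    # Illustrative Kickstart fragment -- values are examples only
    lang en_US
    keyboard us
    network --bootproto dhcp
    clearpart --all
    part / --size 1000 --grow
    part swap --size 128
    %packages
    @ Base
    apache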

The set of configuration information that drives this build
process defines the personality of an
individual machine, and it is extremely useful if this
specification is available in an explicit form
(such as a plaintext file or a database). Machines can then be
cloned simply by copying their specification
and applying the automatic build. This is important for installing
multiple similar machines, such as in a student laboratory. The
master copy of the specification should be held remotely from the
machine, so that it is available even when the machine is down.
This allows programs to automatically verify individual
configurations and even the relationships between machines, such as
ensuring every client's specified DNS server is actually configured
to run a name daemon. The specification can also be generated from
higher-level descriptions of a machine's function. An inheritance
model is very useful, since many machine configurations can be
conveniently described as small variations of a generic
configuration for a particular class.
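
To make this concrete, the following short Python sketch shows how
explicit, centrally held specifications reduce such a
cross-machine check to a few lines of script. The spec format,
file layout and resource names here are invented for the sake of
illustration; they are not LCFG's actual syntax:

    # verify_dns.py -- hypothetical sketch; spec format and file
    # layout are invented for illustration, not LCFG's syntax.
    import glob
    import os

    def load_spec(path):
        """Parse a plain-text specification of 'key value' lines."""
        spec = {}
        for line in open(path):
            line = line.strip()
            if line and not line.startswith('#'):
                key, _, value = line.partition(' ')
                spec[key] = value.strip()
        return spec

    # One specification file per machine, held on a central server.
    specs = {}
    for path in glob.glob('/var/spec/*'):
        specs[os.path.basename(path)] = load_spec(path)

    # Cross-machine check: every client's DNS server must itself
    # be specified to run a name daemon.
    for host, spec in specs.items():
        server = spec.get('dns.server')
        if server and specs.get(server, {}).get('named.run') != 'yes':
            print('%s: DNS server %s is not set up to run a name daemon'
                  % (host, server))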

Traditional configuration systems are often
static, in the sense that the configuration is
applied only at the time the machine is installed. Most
vendor-supplied installation processes fall into this category, as
do systems based on cloning by copying disk images. If subsequent
changes to the configuration have to be applied manually, the
configuration is almost certain to “rot”, and it is impossible to
be confident that all machines are correctly configured. Obvious
misconfigurations simply leave users with malfunctioning machines,
but more subtle ones may go unnoticed and pose serious problems,
such as security holes. Although a fully
dynamic system is not practical, an ideal
system will continually adjust the configuration to conform to the
specification. Some parameters can be changed immediately to track
a change in the specification; some, such as a network address, may
be changed only when the machine reboots; and others, such as
disk partitioning, may require a complete rebuild.
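
A nightly “conform” pass might therefore classify each out-of-date
parameter by how it can be brought into line. The sketch below is
hypothetical; the parameter names and their classification are
purely illustrative:

    # conform.py -- hypothetical sketch of a nightly conform pass.
    # Parameter names and their classification are illustrative.
    IMMEDIATE = {'dns.server', 'mail.relay'}   # apply on the spot
    ON_REBOOT = {'net.address'}                # waits for next boot
    REBUILD   = {'disk.partitions'}            # needs a full reinstall

    def conform(spec, actual):
        """Compare desired and actual values and decide what to do."""
        actions = []
        for key, wanted in spec.items():
            if actual.get(key) == wanted:
                continue                       # already conforms
            if key in IMMEDIATE:
                actions.append(('apply-now', key, wanted))
            elif key in ON_REBOOT:
                actions.append(('defer-to-reboot', key, wanted))
            elif key in REBUILD:
                actions.append(('flag-for-rebuild', key, wanted))
        return actions

    spec   = {'dns.server': 'dns1', 'net.address': '192.168.1.7'}
    actual = {'dns.server': 'dns0', 'net.address': '192.168.1.7'}
    for action in conform(spec, actual):
        print(action)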

If a configuration system is incomplete and manual
intervention is necessary, many of the benefits are lost. However,
constructing a comprehensive system to cover every conceivable
parameter is clearly impractical. The key problem is trying to
create an extensible framework flexible enough to allow new
parameters and components to be incorporated
with little effort. An individual instance of the system can then
evolve at a particular site to suit local requirements. If the
framework is to be extended on demand by working administrators, it
must be extremely lightweight and quick to understand. It must be
easy to create components in a
familiar language, and to interface them to new subsystems which
require configuration. Open-source software is an advantage, since
it is often easy to base a new extension on one that already
exists.
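
To give a feel for how lightweight a component can be, here is a
hypothetical sketch in Python (the configure/start interface and
the resource names are invented, not any actual framework API). A
component for a new subsystem simply translates its slice of the
specification into the subsystem's native configuration file and
hands it over:

    # cron_component.py -- hypothetical component sketch; the
    # interface and resource names are invented for illustration.
    import os

    class CronComponent:
        """Wraps the cron subsystem for the configuration framework."""

        def __init__(self, resources):
            # 'resources' is this component's slice of the machine spec.
            self.resources = resources

        def configure(self):
            # Translate the spec into the subsystem's native format.
            with open('/tmp/crontab.generated', 'w') as f:
                for job in self.resources.get('jobs', []):
                    f.write(job + '\n')

        def start(self):
            # Hand the generated file to the real subsystem.
            os.system('crontab /tmp/crontab.generated')

    c = CronComponent({'jobs': ['0 2 * * * /usr/local/sbin/nightly-update']})
    c.configure()    # normally the framework calls this, not a person
    # c.start()      # would install the crontab for real; left disabled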
