QAM Principles and Purpose

Introduction

QAM is a framework designed to aid in developing, building, installing, testing (including automated testing) and running a software package or group of packages and their associated open-source dependencies. This document provides an overview of what QAM does and why it does it that way.

The standard use case is that you run the top-level script called Test, which installs code from extsrc and src into release, leaving intermediate files in build, runs all of the unit tests, and then loads up databases, starts up servers, and runs a functional test of the whole system.

QAM is intended to support five main areas:

Development: Developers should be able to develop multiple versions at the same time, and work in an environment close to production to minimize release problems.

Building: QAM should support the building of a wide variety of software commonly used as components of projects, including libraries, web servers, and so on. This should help to minimize the amount of configuration one needs to do to deploy on a new host.

Installing: Installations should be easy and self-testing, and should minimize the amount of host configuration that needs to be done outside the project.

Testing: There should be support for a full automated test framework that can be run anywhere, if the release includes all source.

Running: There should be scripts to deal with the starting and stopping of servers in a common manner, using common configuration systems.

Supported Platforms and Requirements

QAM is primarily designed for use on Unix systems such as BSD and Linux. However, we are trying to provide as much Windows support as reasonably possible; depending on the software in the system, a project can generally be compiled and installed on Windows, and we intend to maintain that.

QAM is written purely in Ruby, in order to maximize portability whilst still using a language that’s reasonably powerful, concise and clear. There are places where Unix system utilities (such as sh or tar) are invoked by Ruby code; such invocations should be kept to a minimum.

Directories

The remainder of this overview is organized around the various directories in a QAM project. You may find it convenient, when reading this document, to take a checkout of a QAM-based project, build and test it using the top-level test script (“./Test” in the base directory) and examine the various directories and files as we go along.

One good example to use is qam itself, which can be checked out with:

git clone git://git.cynic.net./qam

This has several examples of servers which you would not normally include when importing QAM into your own project. Because of this, you will need to have lighttpd installed on your system to be able to run the tests in this checkout. If you don’t have lighttpd installed, running “./Install” at the top level should produce most of the files you’d want to inspect.

The Base Directory

The base directory, or project directory, is where everything related to the project is kept, including source, intermediate build files, installed files, and application data. Normally this directory and its contents would be checked into a version control system, such as Subversion or Darcs. Checkouts of this are used in the same way by both developers of the system and production versions of it; a production version, a staging version, and several different developer versions can all exist on a single machine without conflict (provided that no two servers try to listen on the same port; more on this later).

Principle: Developers should develop in an environment that is as close as possible to the environment in which the system will run in production. This helps them to develop the release procedures, and makes debugging problems in production easier.

Principle: Several different copies, including different versions, of the system should be able to co-exist on a single machine. This allows developers to have multiple copies with different types of work in each, in turn allowing developers easily to make small fixes, possibly to different branches, even while major work is in progress on another version.

The Release Directory

The release directory (under the base directory) contains the installed version of the system, all of its supporting software, and some data. It is laid out in the same way that a Unix /usr or /usr/local directory tree is laid out, with programs under release/bin, libraries under release/lib, and so on. Note that there is no release/var directory, for reasons described in the next paragraph.

The data stored under the release directory are those that are specific to a particular release of the program and unchanging for that release. For example, the template files used to build configuration files for web (and other) servers are stored under release/lib/server/server-name/. Changing data, or data specific to a particular instance of an application rather than to every instance, are stored in the instance directory (see below).

Principle: All of the programs in a project should run using data only from the release directory. In other words, one should be able to distribute just that directory to someone and, if it is installed in the same location on a different host with the same operating system and packages, it should run correctly. This configuration is not expected to be suitable for development or automated testing.

The idea behind the principle above is to allow for “binary” releases. The reason we do not insist that this work in a different location in the filesystem is that it’s very difficult to support ELF’s RPATH mechanism if the absolute path to a shared library changes, and ELF is an extremely popular format for shared libraries.

The Instance Directory

The instance directory is used for storing information specific to a particular instance of a running (or runnable) server or other program. (You may, in one project, run multiple instances of a server, say, on different ports; the test system takes advantage of this to avoid disturbing test servers you may be using for manual testing.) Typically a directory under the instance directory will be named something like instance/web.8080, indicating that this is an instance of src/server.web (see below) that runs on port 8080. This number need not be a port number; in particular, daemons that don’t listen on a port may just use 0, particularly on production servers.
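The naming convention above can be illustrated with a small Ruby sketch. This is not code from QAM itself: the `InstanceName` struct and `parse_instance_name` function are hypothetical names, and the sketch assumes that the server name contains no dots and that the trailing number is a non-negative integer.

```ruby
# Hypothetical helper, not part of QAM: split an instance directory name
# such as "web.8080" into a server name and its numeric identifier.
InstanceName = Struct.new(:server, :number) do
  # The source module this instance would correspond to,
  # e.g. "src/server.web" for "web.8080".
  def source_dir
    File.join("src", "server.#{server}")
  end

  def to_s
    "#{server}.#{number}"
  end
end

# Assumes the server name has no dots and the number is all digits.
def parse_instance_name(dirname)
  match = /\A(?<server>[^.]+)\.(?<number>\d+)\z/.match(File.basename(dirname))
  raise ArgumentError, "not an instance directory name: #{dirname}" unless match

  InstanceName.new(match[:server], match[:number].to_i)
end
```

Under these assumptions, `parse_instance_name("instance/web.8080")` yields the server name `web` and the number 8080, and a daemon that listens on no port would simply carry the number 0.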

Each server instance directory, as we call things such as instance/web.8080, has a fairly standard layout, though particular projects may modify this as they see fit. This layout is designed to partition carefully what data need to be backed up and restored when recreating a server instance (e.g., after loss of a host on which the server instance was running).

Principle: It should be very clear what data need to be backed up to protect against server loss, and what of these data need to be restored in order to reproduce a server after such a loss.

When an instance directory is initially created, the owner and group (which will default to the user running the program) will be given read/write permissions, and others will be given no permissions. This default avoids exposure of sensitive data, control sockets, and the like, so long as you ensure that the user’s primary group is not shared with unrelated users. However, the conf directory under the instance directory, since it often contains especially sensitive information (e.g., passwords), will always have ‘other’ permissions removed whenever ‘server setup’ is run, which includes every time the server is started or stopped.
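A minimal Ruby sketch of this permission policy follows. The function names are hypothetical and the code is not taken from QAM. The mode 0770 is an assumption: the text says only "read/write", but a Unix directory also needs the search (execute) bit to be usable.

```ruby
require "fileutils"

# Hypothetical helper: create an instance directory with full access for
# owner and group and no access for others. Mode 0770 is an assumption.
# Only the final directory is chmod'ed; any parents created along the
# way keep the default (umask-derived) permissions.
def create_instance_dir(path)
  FileUtils.mkdir_p(path)
  File.chmod(0o770, path)
  path
end

# Hypothetical helper: clear every "other" permission bit on the conf
# directory, leaving the owner and group bits as they were.
def strip_other_permissions(conf_dir)
  mode = File.stat(conf_dir).mode & 0o7777
  File.chmod(mode & ~0o007, conf_dir)
end
```

Calling something like `strip_other_permissions` on every setup, start and stop means that a conf directory whose permissions were loosened by hand is tightened again the next time the server is touched.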

In a particular subdirectory of the instance directory you will normally find:

* `db`: A directory for "permanent" information generated by the
server, such as the users who have registered at a web site. In a
production environment, this directory should always be backed up,
and restored when recreating the server instance.
* `conf`: Locally modifiable configuration files. As with the `db`
directory, this directory should always be backed up, and restored
when recreating the server instance. See below for further details
on how these files are set up. Note that this directory is always
reset to have no permissions for "other"; see above.
* `conf/qserver.conf`: The configuration file for QAM's server
configuration/startup/shutdown program. This is used to determine
which server (e.g., lighttpd or apache) to run, and to configure
the generation of the files under `run/genconf`. For the latter,
a typical entry might indicate whether the server should be
password-protected or not, or set up DBMS server passwords.
* `log`: For storage of log and other files. This would normally be
backed up (http server logs, for example, are valuable) but it is
not necessary to restore it in order to restore a server after a
disaster.
* `run`: Transient files, such as PID files, that need not be
backed up.
* `run/genconf`: Configuration files generated by the server
configure/start/stop system. The server startup script automatically
generates these from data in the `conf/qserver.conf` file and
templates in `release/libdata/server/{name}/genconf-template` (which
in turn is copied from `src/server.{name}/genconf-template`). These
files are regenerated every time, and as with anything else under
`run`, need not be backed up.
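The backup and restore rules in the list above can be summarized as data. The following Ruby sketch is purely illustrative: the `INSTANCE_LAYOUT` table and the two helper functions are hypothetical and do not exist in QAM.

```ruby
# Illustrative summary of the list above; not part of QAM.
# `run` also covers `run/genconf`, which is regenerated every time.
INSTANCE_LAYOUT = {
  "db"   => { backup: true,  restore: true  },
  "conf" => { backup: true,  restore: true  },
  "log"  => { backup: true,  restore: false },
  "run"  => { backup: false, restore: false },
}.freeze

# Subdirectories of an instance that a backup job should copy.
def paths_to_back_up(instance_dir)
  INSTANCE_LAYOUT.select { |_, policy| policy[:backup] }
                 .keys
                 .map { |sub| File.join(instance_dir, sub) }
end

# Subdirectories that must be restored to recreate the instance.
def paths_to_restore(instance_dir)
  INSTANCE_LAYOUT.select { |_, policy| policy[:restore] }
                 .keys
                 .map { |sub| File.join(instance_dir, sub) }
end
```

For `instance/web.8080` this gives `db`, `conf` and `log` as the directories to back up, and only `db` and `conf` as the directories to restore.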

The files in the conf directory are rather special in that they are installed from default versions when not present, but are never modified or overwritten by the system thereafter. The default versions are installed from src/server.{name}/conf-default into release/libdata/server/{name}/conf-default. The default versions must be the correct configuration of the system for use in automated tests; the automated test system relies on this property.
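The "install when absent, never overwrite" rule might be sketched in Ruby as follows. The function name is hypothetical, and the sketch assumes that conf-default is a flat directory of regular files; QAM's actual implementation may differ.

```ruby
require "fileutils"

# Hypothetical helper: copy each default configuration file into conf_dir
# only when no file of that name exists there. Existing files are never
# touched. Returns the names of the files that were installed.
def install_missing_defaults(default_dir, conf_dir)
  FileUtils.mkdir_p(conf_dir)
  Dir.children(default_dir).sort.select do |name|
    source = File.join(default_dir, name)
    target = File.join(conf_dir, name)
    next false unless File.file?(source) # skip subdirectories and the like
    next false if File.exist?(target)    # never overwrite an existing file

    FileUtils.cp(source, target)
    true
  end
end
```

Running this twice in a row installs nothing the second time, which is the property that protects an administrator's edits.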

When these files have been modified by an administrator (say, because they are to be used in production, and so password protection should be disabled), the server config/start/stop system will inform the administrator of differences between the default configuration and the actual configuration whenever the server is configured or started. This helps the sysadmin keep track of how the server is configured, and also shows when configuration parameters have been added or changed in new versions of the software.
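One simple way to report such differences is to compare the two files line by line. The Ruby sketch below does a crude set difference rather than a real diff (it ignores ordering and duplicate lines). It is an assumption about how the report could be produced, not a description of what QAM does, and `config_differences` is a hypothetical name.

```ruby
# Hypothetical helper: list the lines that appear in only one of the two
# configuration files. Ordering and duplicate lines are ignored.
def config_differences(default_file, actual_file)
  default_lines = File.readlines(default_file, chomp: true)
  actual_lines  = File.readlines(actual_file, chomp: true)
  {
    only_in_default: default_lines - actual_lines,
    only_in_actual:  actual_lines - default_lines,
  }
end
```

Lines listed under `only_in_default` after an upgrade are a hint that a new version has added or changed a parameter that the local configuration does not yet mention.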

Principle: A reasonable default configuration, suitable for development and testing, and also secure, should be installed when no configuration is available. This configuration is not expected to be suitable for a production system.

Principle: Configuration and other files that can be modified by the administrator must never be overwritten by automated systems.

Principle: When releasing a new system, there should be a way of notifying the administrator doing the release of changes to things such as configuration files and database schemas, so that he may modify the configuration of the production system appropriately.

The Build Directory

The build directory is for storage of intermediate files used during compilation. These may be used by developers for debugging and other purposes[1] but should never be used by anything expected to be running in a production system (see the principle under The Release Directory section above).

Generated files should never be stored under any directories but release, instance and build; removal of these three directories followed by a rebuild should result in a clean rebuild of every part of the system.
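Forcing a clean rebuild then amounts to deleting those three directories before building again. A minimal Ruby sketch, with a hypothetical function name, is given below. Note that instance holds the db and conf data of every server instance, so this is only appropriate in a checkout whose instance data you are prepared to lose.

```ruby
require "fileutils"

# The only directories that may contain generated files.
GENERATED_DIRS = %w[release instance build].freeze

# Hypothetical helper: delete the generated directories under base_dir.
# WARNING: this also removes instance data such as db and conf.
# Returns the paths that were removed.
def remove_generated_dirs(base_dir)
  GENERATED_DIRS.map { |name| File.join(base_dir, name) }
                .select { |path| File.exist?(path) }
                .each { |path| FileUtils.rm_rf(path) }
end
```

After this, running the usual top-level build or test script rebuilds everything from extsrc and src.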

Principle: It should be simple to force a clean rebuild of all parts of the system. This aids in the detection of problems and makes it easy to ensure that the production system is as similar as possible to a development system.

Principle: The number of paths marked “ignored” or “boring” in revision-controlled directories should be minimized, to reduce the risk of developers committing generated files or failing to commit source files.

The Extsrc Directory

The extsrc directory is used for (generally unmodified) software packages and libraries that the project uses. Software from the lib, bin, mod, haskell, perl, python, ruby and perhaps other subdirectories will be built and installed in release before going on to building the project itself. Examples might be lib/libpng, bin/lighttpd, mod/nagiosplugins, haskell/QuickCheck, and ruby/maruku.

Principle: It should be easy to include specific versions and configurations of common libraries and software with the project when the project relies on them. The project should use these in preference to any versions installed by the host’s operating system’s package system.

This is intended in particular to minimize configuration of a host and disruption of other applications running on the host. For example, by including Apache Httpd, PHP and some non-standard PHP libraries in extsrc, one avoids having to install any of them on a new host one wishes to use for development or production, and the project can co-exist with any existing systems using different versions and configurations, such as a legacy web system using Apache Httpd 1.x and PHP with a customized system-wide configuration file.

What you do and don’t choose to put into extsrc is a matter of judgement and the particular situations in which you find yourself deploying your project.

In the future we intend to implement a mechanism that can check the versions and configurations of software already installed by the operating system’s native package mechanism (or just installed by hand in /usr/local) and use those if available and appropriate, thus decreasing build time.

The Src Directory

The src directory is where the project itself lives, with various modules each having their own subdirectories under src. This is also frequently used for modified versions of open source or other applications and libraries. src/qam contains the copy of QAM used by the project, and is updated with the qu tool. This should be written up one day.

Not Covered Yet

This document does not yet cover:

* The systems for starting and stopping servers.
* The automated test framework.
* Probably a bunch of other stuff.

[1] For example, the wrappers for the Glasgow Haskell Compiler’s runghc and ghci programs will use compiled versions of files from the build directory if they are available and are not older than the source files from which they were presumably generated.
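The "available and not older than the source" check described here is the usual modification-time comparison. A minimal Ruby sketch follows; the function name is hypothetical, and the real wrappers may apply further checks.

```ruby
# Hypothetical helper: true when a compiled file exists and its
# modification time is not older than that of its source file.
def compiled_up_to_date?(source, compiled)
  File.exist?(compiled) && File.mtime(compiled) >= File.mtime(source)
end
```

A wrapper built on such a check would fall back to the source file whenever the compiled version is missing or stale.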