This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; see the file COPYING. If not, write to
the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.

Cons is a system for constructing, primarily, software, but is quite
different from previous software construction systems. Cons was designed
from the ground up to deal easily with the construction of software spread
over multiple source directories. Cons makes it easy to create build scripts
that are simple, understandable and maintainable. Cons ensures that complex
software is easily and accurately reproducible.

Cons uses a number of techniques to accomplish all of this. Construction
scripts are just Perl scripts, making them both easy to comprehend and very
flexible. Global scoping of variables is replaced with an import/export
mechanism for sharing information between scripts, significantly improving
the readability and maintainability of each script. Construction
environments are introduced: these are Perl objects that capture the
information required for controlling the build process. Multiple
environments are used when different semantics are required for generating
products in the build tree. Cons implements automatic dependency analysis
and uses this to globally sequence the entire build. Variant builds are
easily produced from a single source tree. Intelligent build subsetting is
possible when working on localized changes. Overrides can be set up to
replace build instructions without modifying any scripts. MD5
cryptographic signatures are associated with derived files, and are used
to accurately determine whether a given file needs to be rebuilt.

While offering all of the above, and more, Cons remains simple and easy to
use. This will, hopefully, become clear as you read the remainder of this
document.

Cons is a make replacement. In the following paragraphs, we look at a few
of the undesirable characteristics of make--and typical build environments
based on make--that motivated the development of Cons.

Traditional make-based systems of any size tend to become quite complex. The
original make utility and its derivatives have contributed to this tendency
in a number of ways. Make is not good at dealing with systems that are
spread over multiple directories. Various work-arounds are used to overcome
this difficulty; the usual choice is for make to invoke itself recursively
for each sub-directory of a build. This leads to complicated code, in which
it is often unclear how a variable is set, or what effect the setting of a
variable will have on the build as a whole. The make scripting language has
gradually been extended to provide more possibilities, but these have
largely served to clutter an already overextended language. Often, builds
are done in multiple passes in order to provide appropriate products from
one directory to another directory. This represents a further increase in
build complexity.

The bane of all makes has always been the correct handling of
dependencies. Most often, an attempt is made to do a reasonable job of
dependencies within a single directory, but no serious attempt is made to do
the job between directories. Even when dependencies are working correctly,
make's reliance on a simple timestamp comparison to determine whether a
file is out of date with respect to its dependencies is not, in general,
adequate for determining when a file should be rederived. If an external
library, for example, is rebuilt and then ``snapped'' into place, the
timestamps on its newly created files may well be earlier than the last
local build, since it was built before it became visible.

Make provides only limited facilities for handling variant builds. With the
proliferation of hardware platforms and the need for debuggable
vs. optimized code, the ability to easily create these variants is
essential. More importantly, if variants are created, it is important to
either be able to separate the variants or to be able to reproduce the
original or variant at will. With make it is very difficult to separate the
builds into multiple build directories, separate from the source. And if
this technique isn't used, it's also virtually impossible to guarantee at
any given time which variant is present in the tree, without resorting to a
complete rebuild.

Make provides only limited support for building software from code that
exists in a central repository directory structure. The VPATH feature of
GNU make (and some other make implementations) is intended to provide this,
but doesn't work as expected: it changes the path of the target file to the
VPATH name too early in its analysis, and therefore searches for all
dependencies in the VPATH directory. To ensure correct development builds,
it is important to be able to create a file in a local build directory and
have any files in a code repository (a VPATH directory, in make terms) that
depend on the local file get rebuilt properly. This isn't possible with
VPATH, without coding a lot of complex repository knowledge directly into
the makefiles.

Cons is Perl-based. That is, Cons scripts--Conscript and Construct
files, the equivalent to Makefile or makefile--are all written in
Perl. This provides an immediate benefit: the language for writing scripts
is a familiar one. Even if you don't happen to be a Perl programmer, it
helps to know that Perl is basically just a simple procedural language,
with a well-defined flow of control and familiar semantics. It has
variables that behave basically the way you would expect them to,
subroutines, conditionals, loops, and so on. There is no special syntax
introduced for Cons. The use of Perl as a scripting language simplifies
the task of expressing the appropriate solution to the often complex
requirements of a build.

A key simplification of Cons is the idea of a construction environment. A
construction environment is an object characterized by a set of key/value
pairs and a set of methods. In order to tell Cons how to build something,
you invoke the appropriate method via an appropriate construction
environment. Consider the following example:

$env = new cons(
    CC   => 'gcc',
    LIBS => 'libworld.a'
);

Program $env 'hello', 'hello.c';

In this case, rather than using the default construction environment, as is,
we have overridden the value of CC so that the GNU C Compiler equivalent
is used, instead. Since this version of Hello, World! requires a library,
libworld.a, we have specified that any program linked in this environment
should be linked with that library. If the library exists already, well and
good, but if not, then we'll also have to include the statement:

Library $env 'libworld', 'world.c';

Now if you type cons hello, the library will be built before the program
is linked, and, of course, gcc will be used to compile both modules:

This is a relatively simple example: Cons ``knows'' world.o depends upon
world.c, because the dependency is explicitly set up by the Library
method. It also knows that libworld.a depends upon world.o and that
hello depends upon libworld.a, all for similar reasons.

Now it turns out that hello.c also includes the interface definition
file, world.h:

How does Cons know that hello.c includes world.h, and that hello.o
must therefore be recompiled? For now, suffice it to say that when
considering whether or not hello.o is up-to-date, Cons invokes a scanner
for its dependency, hello.c. This scanner enumerates the files included
by hello.c to come up with a list of further dependencies, beyond those
made explicit by the Cons script. This process is recursive: any files
included by included files will also be scanned.
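
The recursive scan described above can be sketched in a few lines of Perl. This is only an illustration, not Cons' actual scanner: the %files table stands in for files on disk, and the names collect_includes and scan_includes are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical in-memory "file system" for this sketch: name => contents.
my %files = (
    'hello.c' => qq{#include <stdio.h>\n#include <world.h>\nint main(void) { return 0; }\n},
    'world.h' => qq{#include <types.h>\nvoid world(void);\n},
    'types.h' => qq{typedef int world_t;\n},
);

# Record in %$seen every known file that $file includes, directly or
# indirectly. Headers we have no contents for (system headers) are skipped.
sub collect_includes {
    my ($file, $seen) = @_;
    my $text = $files{$file};
    return unless defined $text;
    while ($text =~ /^\s*#\s*include\s*[<"]([^>"]+)[>"]/mg) {
        my $inc = $1;
        next if $seen->{$inc} || !exists $files{$inc};
        $seen->{$inc} = 1;
        collect_includes($inc, $seen);    # scan the included file as well
    }
    return;
}

# Return the sorted list of further dependencies found by scanning $file.
sub scan_includes {
    my ($file) = @_;
    my %seen;
    collect_includes($file, \%seen);
    return sort keys %seen;
}

print join(' ', scan_includes('hello.c')), "\n";    # prints "types.h world.h"
```

The %seen table keeps the scan from visiting a header twice, which also protects against headers that include each other.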

Isn't this expensive? The answer is--it depends. If you do a full build of a
large system, the scanning time is insignificant. If you do a rebuild of a
large system, then Cons will spend a fair amount of time thinking about it
before it decides that nothing has to be done (although not necessarily more
time than make!). The good news is that Cons makes it very easy to
intelligently subset your build, when you are working on localized changes.

Because Cons does full and accurate dependency analysis, and does this
globally, for the entire build, Cons is able to use this information to take
full control of the sequencing of the build. This sequencing is evident
in the above examples, and is equivalent to what you would expect for make,
given a full set of dependencies. With Cons, this extends trivially to
larger, multi-directory builds. As a result, all of the complexity involved
in making sure that a build is organized correctly--including multi-pass
hierarchical builds--is eliminated. We'll discuss this further in the next
sections.

A larger build, in Cons, is organized by creating a hierarchy of build
scripts. At the top of the tree is a script called Construct. The rest
of the scripts, by convention, are each called Conscript. These scripts
are connected together, very simply, by the Build, Export, and
Import commands.

This is a simple two-level hierarchy of build scripts: all the subsidiary
Conscript files are mentioned in the top-level Construct file. Notice
that not all directories in the tree necessarily have build scripts
associated with them.

This could also be written as a multi-level script. For example, the
Construct file might contain this command:

Build qw(
    parser/Conscript
    drivers/Conscript
    utilities/Conscript
);

and the Conscript file in the drivers directory might contain this:

Build qw(
    display/Conscript
    mouse/Conscript
);

Experience has shown that the former model is a little easier to understand,
since the whole construction tree is laid out in front of you, at the
top-level. Hybrid schemes are also possible. A separately maintained
component that needs to be incorporated into a build tree, for example,
might hook into the build tree in one place, but define its own construction
hierarchy.

By default, Cons does not change its working directory to the directory
containing a subsidiary Conscript file it is including. This behavior
can be enabled for a build by specifying, in the top-level Construct
file:

Conscript_chdir 1;

When enabled, Cons will change to the subsidiary Conscript file's
containing directory while reading in that file, and then change back
to the top-level directory once the file has been processed.

It is expected that this behavior will become the default in some future
version of Cons. To prepare for this transition, builds that expect
Cons to remain at the top of the build while it reads in a subsidiary
Conscript file should explicitly disable this feature as follows:

Conscript_chdir 0;

You may have noticed that the file names specified to the Build command are
relative to the location of the script it is invoked from. This is generally
true for other filename arguments to other commands, too, although we might
as well mention here that if you begin a file name with a hash mark, ``#'',
then that file is interpreted relative to the top-level directory (where the
Construct file resides). And, not surprisingly, if you begin it with ``/'',
then it is considered to be an absolute pathname. This is true even on
systems which use a backslash rather than a forward slash to name absolute
paths.
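
The three naming rules above can be summarized in a small Perl sketch. It is not taken from Cons itself: resolve_path, $top and $script_dir are hypothetical names used only to illustrate how each kind of file name would be interpreted.

```perl
use strict;
use warnings;

# Interpret a file name according to the rules in the text:
#   '#name' is relative to the top-level directory (where Construct lives),
#   '/name' is an absolute path name,
#   'name'  is relative to the directory of the script that mentions it.
# $top and $script_dir are hypothetical parameters for this sketch.
sub resolve_path {
    my ($name, $top, $script_dir) = @_;
    return "$top/" . substr($name, 1) if $name =~ /^#/;
    return $name                      if $name =~ m{^/};
    return "$script_dir/$name";
}

my ($top, $dir) = ('/work/proj', '/work/proj/src/hello');
print resolve_path('#export/lib',  $top, $dir), "\n";   # /work/proj/export/lib
print resolve_path('/usr/include', $top, $dir), "\n";   # /usr/include
print resolve_path('hello.c',      $top, $dir), "\n";   # /work/proj/src/hello/hello.c
```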

The top-level Construct file and all Conscript files begin life in
a common, separate Perl package. Cons controls the symbol table for
the package so that the symbol table for each script is empty, except
for the Construct file, which gets some of the command line arguments.
All of the variables that are set or used, therefore, are set by the
script itself--not by some external script.

Variables can be explicitly imported by a script from its parent
script. To import a variable, it must have been exported by the parent
and initialized (otherwise an error will occur).

The values of the simple variables mentioned in the Export list will be
squirreled away by any subsequent Build commands. The Export command
will only export Perl scalar variables, that is, variables whose name
begins with $. Other variables, objects, etc. can be exported by
reference--but all scripts will refer to the same object, and this object
should be considered to be read-only by the subsidiary scripts and by the
original exporting script. It's acceptable, however, to assign a new value
to the exported scalar variable--that won't change the underlying variable
referenced. This sequence, for example, is OK:

It doesn't matter whether the variable is set before or after the Export
command. The important thing is the value of the variable at the time the
Build command is executed. This is what gets squirreled away. Any
subsequent Export commands, by the way, invalidate the first: you must
mention all the variables you wish to export on each Export command.

Variables exported by the Export command can be imported into subsidiary
scripts by the Import command. The subsidiary script always imports
variables directly from the superior script. Consider this example:

Import qw( env INCLUDE );

This is only legal if the parent script exported both $env and
$INCLUDE. It also must have given each of these variables values. It is
OK for the subsidiary script to only import a subset of the exported
variables (in this example, $LIB, which was exported by the previous
example, is not imported).

All the imported variables are automatically re-exported, so the sequence:

Import qw ( env INCLUDE );
Build qw ( beneath-me/Conscript );

will supply both $env and $INCLUDE to the subsidiary file. If only
$env is to be exported, then the following will suffice:

Import qw ( env INCLUDE );
Export qw ( env );
Build qw ( beneath-me/Conscript );

The only constraint on the ordering of build scripts is that superior
scripts are evaluated before their inferior scripts. The top-level
Construct file, for instance, is evaluated first, followed by any
inferior scripts. This is all you really need to know about the evaluation
order, since order is generally irrelevant. Consider the following Build
command:

In any complex software system, a method for sharing build products needs to
be established. We propose a simple set of conventions which are trivial to
implement with Cons, but very effective.

The basic rule is to require that all build products which need to be shared
between directories are shared via an intermediate directory. We have
typically called this export, and, in a C environment, provided
conventional sub-directories of this directory, such as include, lib,
bin, etc.

These directories are defined by the top-level Construct file. A simple
Construct file for a Hello, World! application, organized using
multiple directories, might look like this:

To construct a Hello, World! program with this directory structure, go to
the top-level directory, and invoke cons with the appropriate
arguments. In the following example, we tell Cons to build the directory
export. To build a directory, Cons recursively builds all known products
within that directory (only if they need rebuilding, of course). If any of
those products depend upon other products in other directories, then those
will be built, too.

You'll note that the two Conscript files are very clean and
to-the-point. They simply specify products of the directory and how to build
those products. The build instructions are minimal: they specify which
construction environment to use, the name of the product, and the name of
the inputs. Note also that the scripts are location-independent: if you wish
to reorganize your source tree, you are free to do so: you only have to
change the Construct file (in this example), to specify the new locations
of the Conscript files. The use of an export tree makes this goal easy.

Note, too, how Cons takes care of little details for you. All the export
directories, for example, were made automatically. And the installed files
were really hard-linked into the respective export directories, to save
space and time. This attention to detail saves considerable work, and makes
it even easier to produce simple, maintainable scripts.

It's often desirable to keep any derived files from the build completely
separate from the source files. This makes it much easier to keep track of
just what is a source file, and also makes it simpler to handle variant
builds, especially if you want the variant builds to co-exist.

Cons provides a simple mechanism that handles all of these requirements. The
Link command is invoked as in this example:

Link 'build' => 'src';

The specified directories are ``linked'' to the specified source
directory. Let's suppose that you set up a source directory, src, with the
sub-directories world and hello below it, as in the previous
example. You could then substitute for the original build lines the
following:

Build qw(
    build/world/Conscript
    build/hello/Conscript
);

Notice that you treat the Conscript file as if it existed in the build
directory. Now if you type the same command as before, you will get the
following results:

Again, Cons has taken care of the details for you. In particular, you will
notice that all the builds are done using source files and object files from
the build directory. For example, build/world/world.o is compiled from
build/world/world.c, and export/include/world.h is installed from
build/world/world.h. This is accomplished on most systems by the simple
expedient of ``hard'' linking the required files from each source directory
into the appropriate build directory.

The links are maintained correctly by Cons, no matter what you do to the
source directory. If you modify a source file, your editor may do this ``in
place'' or it may rename it first and create a new file. In the latter case,
any hard link will be lost. Cons will detect this condition the next time
the source file is needed, and will relink it appropriately.

You'll also notice, by the way, that no changes were required to the
underlying Conscript files. And we can go further, as we shall see in the
next section.

Variant builds require just another simple extension. Let's take as an
example a requirement to allow builds for both the baNaNa and peAcH
operating systems. In this case, we are using a distributed file system,
such as NFS, to access the particular system, and only one or the other of
the systems has to be compiled for any given invocation of cons. Here's
one way we could set up the Construct file for our Hello, World!
application:

Other variations of this model are possible. For example, you might decide
that you want to separate out your include files into platform dependent and
platform independent files. In this case, you'd have to define an
alternative to $INCLUDE for platform-dependent files. Most Conscript
files, generating purely platform-independent include files, would not have
to change.

You might also want to be able to compile your whole system with debugging
or profiling, for example, enabled. You could do this with appropriate
command line options, such as DEBUG=on. This would then be translated
into the appropriate platform-specific requirements to enable debugging
(this might include turning off optimization, for example). You could
optionally vary the name space for these different types of systems, but, as
we'll see in the next section, it's not essential to do this, since Cons
is pretty smart about rebuilding things when you change options.

Whenever Cons creates a derived file, it stores a signature for that
file. The signature is stored in a separate file, one per directory. After
the previous example was compiled, the .consign file in the
build/peach/world directory looked like this:

The first number is a timestamp--on UNIX systems, this is typically the
number of seconds since January 1st, 1970. The second value is an MD5
checksum. MD5 is a message digest algorithm that, given an
input string, computes a cryptographic signature for that string. The
MD5 checksum stored in the .consign file is, in effect, a digest of all
the dependency information for the specified file. So, for example, for the
world.o file, this includes at least the world.c file, and also any
header files that Cons knows about that are included, directly or indirectly,
by world.c. Not only that, but the actual command line that was used to
generate world.o is also fed into the computation of the
signature. Similarly, libworld.a gets a signature which ``includes'' all
the signatures of its constituents (and hence, transitively, the signatures
of their constituents), as well as the command line that created the
file.
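
To make the idea concrete, here is a minimal Perl sketch of how such signatures could be composed with the standard Digest::MD5 module. The exact strings that Cons feeds into the digest are not given here, so the formats used by source_signature and derived_signature (both hypothetical names), as well as the command lines and timestamps, are assumptions made for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Signature of a non-derived file: here, a digest of its modification time
# and name. The exact input format is an assumption of this sketch.
sub source_signature {
    my ($name, $mtime) = @_;
    return md5_hex("$mtime $name");
}

# Signature of a derived file: a digest over the signatures of everything
# it depends on plus the command line used to create it (format assumed).
sub derived_signature {
    my ($command, @dep_sigs) = @_;
    return md5_hex(join("\n", @dep_sigs, $command));
}

my $c_sig = source_signature('world.c', 866355812);
my $h_sig = source_signature('world.h', 866355800);

my $o_sig = derived_signature('cc -c world.c -o world.o',    $c_sig, $h_sig);
my $o_dbg = derived_signature('cc -g -c world.c -o world.o', $c_sig, $h_sig);
my $a_sig = derived_signature('ar r libworld.a world.o', $o_sig);

# Touching world.h changes world.o's signature and, transitively, the library's.
my $h_new = source_signature('world.h', 866355999);
my $o_new = derived_signature('cc -c world.c -o world.o', $c_sig, $h_new);
my $a_new = derived_signature('ar r libworld.a world.o', $o_new);

print "command line matters\n"      if $o_sig ne $o_dbg;
print "header change propagates\n"  if $a_sig ne $a_new;
```

Because each derived signature folds in the signatures of its inputs, a change anywhere in the dependency chain shows up in every file built from it.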

The signature of a non-derived file is computed, by default, by taking the
current modification time of the file and the file's entry name (unless
there happens to be a current .consign entry for that file, in which case
that signature is used).

Notice that there is no need for a derived file to depend upon any
particular Construct or Conscript file--if changes to these files
affect the file in question, then this will be automatically reflected in
its signature, since relevant parts of the command line are included in the
signature. Unrelated changes will have no effect.

When Cons considers whether to derive a particular file, then, it first
computes the expected signature of the file. It then compares the file's
last modification time with the time recorded in the .consign entry, if
one exists. If these times match, then the signature stored in the
.consign file is considered to be accurate. If the file's previous
signature does not match the new, expected signature, then the file must be
rederived.
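
The decision just described can be sketched as a small Perl function. This is an illustration rather than Cons' own code: must_rederive and the layout of the $entry hash are hypothetical, and treating a missing file, a missing .consign entry, or a timestamp mismatch as "rederive" is an assumption of the sketch.

```perl
use strict;
use warnings;

# Decide whether a derived file must be rebuilt.
#   $file_mtime   - modification time of the file on disk (undef if absent)
#   $entry        - hypothetical .consign entry: { time => ..., sig => ... }
#   $expected_sig - signature computed from current dependencies and command
sub must_rederive {
    my ($file_mtime, $entry, $expected_sig) = @_;
    return 1 unless defined $file_mtime;          # nothing on disk yet
    return 1 unless defined $entry;               # no recorded signature
    return 1 if $entry->{time} != $file_mtime;    # stored signature not trusted
    return $entry->{sig} ne $expected_sig ? 1 : 0;
}

my $entry = { time => 866355856, sig => '23844be93f06d2ebc6e9c8a18e0fa6a7' };

print must_rederive(866355856, $entry, $entry->{sig}), "\n";    # 0: up-to-date
print must_rederive(866355856, $entry, 'a-different-sig'), "\n"; # 1: inputs changed
print must_rederive(866355900, $entry, $entry->{sig}), "\n";    # 1: file was touched
```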

Notice that a file will be rederived whenever anything about a dependency
changes. In particular, notice that any change to the modification
time of a dependency (forward or backwards in time) will force recompilation
of the derived file.

The use of these signatures is an extremely simple, efficient, and effective
method of improving--dramatically--the reproducibility of a system.

Many software development organizations will have one or more central
repository directory trees containing the current source code for one or
more projects, as well as the derived object files, libraries, and
executables. In order to reduce unnecessary recompilation, it is useful to
use files from the repository to build development software--assuming, of
course, that no newer dependency file exists in the local build tree.

Cons provides a mechanism to specify a list of code repositories that will
be searched, in-order, for source files and derived files not found in the
local build directory tree.

The following lines in a Construct file will instruct Cons to look first
under the /usr/experiment/repository directory and then under the
/usr/product/repository directory:

Repository qw (
    /usr/experiment/repository
    /usr/product/repository
);

The repository directories specified may contain source files, derived files
(objects, libraries and executables), or both. If there is no local file
(source or derived) under the directory in which Cons is executed, then the
first copy of a same-named file found under a repository directory will be
used to build any local derived files.

Cons maintains one global list of repository directories. Cons will
eliminate the current directory, and any non-existent directories, from the
list.

Cons will also search for Construct and Conscript files in the
repository tree or trees. This leads to a chicken-and-egg situation,
though: how do you look in a repository tree for a Construct file if the
Construct file tells you where the repository is? To get around this,
repositories may be specified via -R options on the command line:

% cons -R /usr/experiment/repository -R /usr/product/repository .

Any repository directories specified in the Construct or Conscript
files will be appended to the repository directories specified by
command-line -R options.

If the source code (including the Conscript file) for the library version
of the Hello, World! C application is in a repository (with no derived
files), Cons will use the repository source files to create the local object
files and executable file:

If a repository tree contains derived files (usually object files,
libraries, or executables), Cons will perform its normal signature
calculation to decide whether the repository file is up-to-date or a derived
file must be built locally. This means that, in order to ensure correct
signature calculation, a repository tree must also contain the .consign
files that were created by Cons when generating the derived files.

This would usually be accomplished by building the software in the
repository (or, alternatively, in a build directory, and then copying the
result to the repository):

(This is safe even if the Construct file lists the /usr/all/repository
directory in a Repository command because Cons will remove the current
directory from the repository list.)

Now if we want to build a copy of the application with our own hello.c
file, we only need to create the one necessary source file, and use the
-R option to have Cons use other files from the repository:

Notice that Cons has not bothered to recreate a local libworld.a library
(or recompile the world.o module), but instead uses the already-compiled
version from the repository.

Because the MD5 signatures that Cons puts in the .consign file contain
timestamps for the derived files, the signature timestamps must match the
file timestamps for a signature to be considered valid.

Some software systems may alter the timestamps on repository files (by
copying them, e.g.), in which case Cons will, by default, assume the
repository signatures are invalid and rebuild files unnecessarily. This
behavior may be altered by specifying:

Repository_Sig_Times_OK 0;

This tells Cons to ignore timestamps when deciding whether a signature is
valid. (Note that avoiding this sanity check means there must be proper
control over the repository tree to ensure that the derived files cannot be
modified without updating the .consign signature.)

Why does Cons say that the hello program is up-to-date when there is no
hello program in the local build directory? Because the repository (not
the local directory) contains the up-to-date hello program, and Cons
correctly determines that nothing needs to be done to rebuild this
up-to-date copy of the file.

There are, however, many times in which it is appropriate to ensure that a
local copy of a file always exists. A packaging or testing script, for
example, may assume that certain generated files exist locally. Instead of
making these subsidiary scripts aware of the repository directory, the
Local command may be added to a Construct or Conscript file to
specify that a certain file or files must appear in the local build
directory:

Local qw(
    hello
);

Then, if we re-run the same command, Cons will make a local copy of the
program from the repository copy (telling you that it is doing so):

Notice that, because the act of making the local copy is not considered a
``build'' of the hello file, Cons still reports that it is up-to-date.

Creating local copies is most useful for files that are being installed into
an intermediate directory (for sharing with other directories) via the
Install command. Accompanying the Install command for a file with a
companion Local command is so common that Cons provides an
Install_Local command as a convenient way to do both:

Install_Local $env '#export', 'hello';

is exactly equivalent to:

Install $env '#export', 'hello';
Local '#export/hello';

Both the Local and Install_Local commands update the local .consign
file with the appropriate file signatures, so that future builds are
performed correctly.

Due to its built-in scanning, Cons will search the specified repository
trees for included .h files. Unless the compiler also knows about the
repository trees, though, it will be unable to find .h files that only
exist in a repository. If, for example, the hello.c file includes the
hello.h file in its current directory:

Solving this problem forces some requirements onto the way construction
environments are defined and onto the way the C #include preprocessor
directive is used to include files.

In order to inform the compiler about the repository trees, Cons will add
appropriate -I flags to the compilation commands. This means that the
CPPPATH variable in the construction environment must explicitly specify all
subdirectories which are to be searched for included files, including the
current directory. Consequently, we can fix the above example by changing
the environment creation in the Construct file as follows:

The order of the -I flags replicates, for the C preprocessor, the same
repository-directory search path that Cons uses for its own dependency
analysis. If there are multiple repositories and multiple CPPPATH
directories, Cons will prefix each CPPPATH directory with each repository
directory in turn, rapidly multiplying the number of -I flags.
As an extreme example, a Construct file containing:

Because Cons relies on the compiler's -I flags to communicate the order
in which repository directories must be searched, Cons' handling of
repository directories is fundamentally incompatible with using
double-quotes on the #include directives in your C source code:

#include "file.h" /* DON'T USE DOUBLE-QUOTES LIKE THIS */

This is because most C preprocessors, when faced with such a directive, will
always first search the directory containing the source file. This
undermines the elaborate -I options that Cons constructs to make the
preprocessor conform to its preferred search path.
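
To see how quickly the -I options multiply, consider the following Perl sketch. It is not Cons' implementation: include_flags is a hypothetical helper, the repository names /u1 and /u2 and the CPPPATH value a:b:c are made up, and both the colon-separated CPPPATH format and the ordering (each directory locally first, then under each repository in order) are assumptions based on the search order described in this section.

```perl
use strict;
use warnings;

# Expand a CPPPATH into -I flags: each directory is listed as-is first,
# then once under every repository, in repository order (assumed ordering).
sub include_flags {
    my ($cpppath, @repositories) = @_;
    my @flags;
    for my $dir (split /:/, $cpppath) {
        push @flags, "-I$dir";
        push @flags, map { "-I$_/$dir" } @repositories;
    }
    return @flags;
}

print join(' ', include_flags('a:b:c', '/u1', '/u2')), "\n";
# prints: -Ia -I/u1/a -I/u2/a -Ib -I/u1/b -I/u2/b -Ic -I/u1/c -I/u2/c
```

Three CPPPATH directories and two repositories already produce nine flags, so it pays to keep both lists short.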

Consequently, when using repository trees in Cons,
always use angle-brackets for included files:

#include <file.h> /* USE ANGLE-BRACKETS INSTEAD */

Cons' handling of repository trees interacts correctly with other Cons
features--which is to say, it generally does what you would expect.

Most notably, repository trees interact correctly, and rather powerfully,
with the 'Link' command. A repository tree may contain one or more
subdirectories for version builds established via Link to a source
subdirectory. Cons will search for derived files in the appropriate build
subdirectories under the repository tree.

Until now, we've demonstrated invoking Cons with an explicit target
to build:

% cons hello

Normally, Cons does not build anything unless a target is specified,
but specifying '.' (the current directory) will build everything:

% cons # does not build anything

% cons . # builds everything under the top-level directory

Adding the Default method to any Construct or Conscript file will add
the specified targets to a list of default targets. Cons will build
these defaults if there are no targets specified on the command line.
So adding the following line to the top-level Construct file will mimic
Make's typical behavior of building everything by default:

Default '.';

The following would add the hello and goodbye commands (in the
same directory as the Construct or Conscript file) to the default list:

Default qw(
    hello
    goodbye
);

The Default method may be used more than once to add targets to the
default list.

Cons provides two methods for reducing the size of a given build. The first is
by specifying targets on the command line, and the second is a method for
pruning the build tree. We'll consider target specification first.

Like make, Cons allows the specification of ``targets'' on the command
line. Cons targets may be either files or directories. When a directory is
specified, this is simply a short-hand notation for every derivable
product--that Cons knows about--in the specified directory and below. For
example:

% cons build/hello/hello.o

means build hello.o and everything that hello.o might need. This is
from a previous version of the Hello, World! program in which hello.o
depended upon export/include/world.h. If that file is not up-to-date
(because someone modified src/world/world.h), then it will be rebuilt,
even though it is in a directory remote from build/hello.

In this example:

% cons build

Everything in the build directory is built, if necessary. Again, this may
cause more files to be built. In particular, both export/include/world.h
and export/lib/libworld.a are required by the build/hello directory,
and so they will be built if they are out-of-date.

If we do, instead:

% cons export

then only the files that should be installed in the export directory will be
rebuilt, if necessary, and then installed there. Note that cons build
might build files that cons export doesn't build, and vice-versa.

With Cons, make-style ``special'' targets are not required. The simplest
analog with Cons is to use special export directories, instead. Let's
suppose, for example, that you have a whole series of unit tests that are
associated with your code. The tests live in the source directory near the
code. Normally, however, you don't want to build these tests. One solution
is to provide all the build instructions for creating the tests, and then to
install the tests into a separate part of the tree. If we install the tests
in a top-level directory called tests, then:

% cons tests

will build all the tests.

% cons export

will build the production version of the system (but not the tests), and:

% cons build

should probably be avoided (since it will compile tests unnecessarily).

If you want to build just a single test, then you could explicitly name the
test (in either the tests directory or the build directory). You could
also aggregate the tests into a convenient hierarchy within the tests
directory. This hierarchy need not necessarily match the source hierarchy,
in much the same manner that the include hierarchy probably doesn't match
the source hierarchy (the include hierarchy is unlikely to be more than two
levels deep, for C programs).

If you want to build absolutely everything in the tree (subject to whatever
options you select), you can use:

% cons .

This is not particularly efficient, since it will redundantly walk all the
trees, including the source tree. The source tree, of course, may have
buildable objects in it--nothing stops you from doing this, even if you
normally build in a separate build tree.

In conjunction with target selection, build pruning can be used to reduce
the scope of the build. In the previous peAcH and baNaNa example, we have
already seen how script-driven build pruning can be used to make only half
of the potential build available for any given invocation of cons. Cons
also provides, as a convenience, a command line convention that allows you
to specify which Conscript files actually get ``built''--that is,
incorporated into the build tree. For example:

% cons build +world

The + argument introduces a Perl regular expression. This must, of
course, be quoted at the shell level if there are any shell meta-characters
within the expression. The expression is matched against each Conscript
file which has been mentioned in a Build statement, and only those
scripts with matching names are actually incorporated into the build
tree. Multiple such arguments are allowed, in which case a match against any
of them is sufficient to cause a script to be included.

In the example, above, the hello program will not be built, since Cons
will have no knowledge of the script hello/Conscript. The libworld.a
archive will be built, however, if need be.

There are a couple of uses for build pruning via the command line. Perhaps
the most useful is the ability to make local changes, and then, with
sufficient knowledge of the consequences of those changes, restrict the size
of the build tree in order to speed up the rebuild time. A second use for
build pruning is to actively prevent the recompilation of certain files that
you know will recompile due to, for example, a modified header file. You may
know that either the changes to the header file are immaterial, or that the
changes may be safely ignored for most of the tree, for testing
purposes. With Cons, the view is that it is pragmatic to admit this type of
behavior, with the understanding that on the next full build everything that
needs to be rebuilt will be. There is no equivalent to a ``make touch''
command, to mark files as permanently up-to-date. So any risk that is
incurred by build pruning is mitigated. For release quality work, obviously,
we recommend that you do not use build pruning (it's perfectly OK to use
during integration, however, for checking compilation, etc. Just be sure to
do an unconstrained build before committing the integration).

Cons provides a very simple mechanism for overriding aspects of a build. The
essence is that you write an override file containing one or more
Override commands, and you specify this on the command line, when you run
cons:

% cons -o over export

will build the export directory, with all derived files subject to the
overrides present in the over file. If you leave out the -o option,
then everything necessary to remove all overrides will be rebuilt.

The override file can contain two types of overrides. The first is incoming
environment variables. These are normally accessible by the Construct
file from the %ENV hash variable. These can trivially be overridden in
the override file by setting the appropriate elements of %ENV (these
could also be overridden in the user's environment, of course).

The second type of override is accomplished with the Override command,
which looks like this:

Override <regexp>, <var1> => <value1>, <var2> => <value2>, ...;

The regular expression regexp is matched against every derived file that
is a candidate for the build. If the derived file matches, then the
variable/value pairs are used to override the values in the construction
environment associated with the derived file.
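As a minimal sketch, suppose the Construct file creates an environment whose CFLAGS is assembled from two variables, here called COPT and CDBG (both names are assumptions made for illustration), and that the override file, named over as in the command above, holds a single Override command:

```perl
# In the Construct file. COPT and CDBG are illustrative variable names,
# not variables that Cons defines by default.
$CONS = new cons(
    COPT   => '',
    CDBG   => '-g',
    CFLAGS => '%COPT %CDBG',
);

# In the override file "over": for every derived file whose name ends
# in .o, turn optimization on and debugging off.
Override '\.o$', COPT => '-O', CDBG => '';
```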

With an override that replaces the -g debugging option with -O for files
matching \.o$, any cons invocation with -o over that creates .o files via
the affected environment will cause them to be compiled with -O and no -g. The
override could, of course, be restricted to a single directory by the
appropriate selection of a regular expression.

When the original version of the Hello, World! program is built with this
environment, Cons rebuilds the appropriate pieces whenever the
override is applied or removed.

It's important that the Override command only be used for temporary,
on-the-fly overrides necessary for development because the overrides are not
platform independent and because they rely too much on intimate knowledge of
the workings of the scripts. For temporary use, however, they are exactly
what you want.

Note that it is still useful to provide, say, the ability to create a fully
optimized version of a system for production use--from the Construct and
Conscript files. This way you can tailor the optimized system to the
platform. Where optimizer trade-offs need to be made (particular files may
not be compiled with full optimization, for example), then these can be
recorded for posterity (and reproducibility) directly in the scripts.

We have mentioned, and used, the concept of a construction environment,
many times in the preceding pages. Now it's time to make this a little more
concrete. With the following statement:

$env = new cons();

a reference to a new, default construction environment is created. This
contains a number of construction variables and some methods. At the present
writing, the default list includes variables such as CC, CFLAGS, CCCOM,
CPPPATH, LINK, LINKCOM, LIBPATH, LIBS, AR, ARFLAGS, ARCOM, RANLIB, SUFEXE,
SUFLIB, and ENV.

These variables are used by the various methods associated with the
environment, in particular any method that ultimately invokes an external
command will substitute these variables into the final command, as
appropriate. The Objects method, for instance, takes a number of source
files and arranges to derive, if necessary, the corresponding object
files:

Objects $env 'foo.c', 'bar.c';

This will arrange to produce, if necessary, foo.o and bar.o. The
command invoked is simply %CCCOM, which expands through substitution, to
the appropriate external command required to build each object. We will
explore the substitution rules further under the Command method, below.

The construction variables are also used for other purposes. For example,
CPPPATH is used to specify a colon-separated path of include
directories. These are intended to be passed to the C preprocessor and are
also used by the C-file scanning machinery to determine the dependencies
involved in a C compilation. Variables beginning with an underscore are
created by various methods, and should normally be considered ``internal''
variables. For example, when a method is called which calls for the creation
of an object from a C source, the variable _IFLAGS is created: this
corresponds to the -I switches required by the C compiler to represent
the directories specified by CPPPATH.

Note that, for any particular environment, the value of a variable is set
once, and then never reset. To change a variable, you must create a new
environment; methods are provided for copying existing environments for this
purpose. Some internal variables, such as _IFLAGS, are created on demand,
but once set, they remain fixed for the life of the environment.

The CFLAGS, LDFLAGS, and ARFLAGS variables all supply a place
for passing options to the compiler, loader, and archiver, respectively.
Less obviously, the INCDIRPREFIX variable specifies the option string
to be prepended to each include directory so that the
compiler knows where to find .h files. Similarly, the LIBDIRPREFIX
variable specifies the option string to be prepended to
each directory that the linker should search for libraries.

Another variable, ENV, is used to determine the system environment during
the execution of an external command. By default, the only environment
variable that is set is PATH, which is the execution path for a UNIX
command. For the utmost reproducibility, you should really arrange to set
your own execution path, in your top-level Construct file (or perhaps by
importing an appropriate construction package with the Perl use
command). The default variables are intended to get you off the ground.

Expansion of construction variables is recursive--that is, the file
name(s) will be re-expanded until no more substitutions can be made. If
a construction variable is not defined in the environment, then the null
string will be substituted.

The new method is a Perl object constructor. That is, it is not invoked
via a reference to an existing construction environment,
but rather statically, using the name of the Perl package where the
constructor is defined. The method is invoked like this:

$env = new cons(<overrides>);

The environment you get back is blessed into the package cons, which
means that it will have associated with it the default methods described
below. Individual construction variables can be overridden by providing
name/value pairs in an override list. Note that to override any command
environment variable (i.e. anything under ENV), you will have to override
all of them. You can get around this difficulty by using the copy method
on an existing construction environment.

The clone method creates a clone of an existing construction environment,
and can be called as in the following example:

$env2 = $env1->clone(<overrides>);

You can provide overrides in the usual manner to create a different
environment from the original. If you just want a new name for the same
environment (which may be helpful when exporting environments to existing
components), you can just use simple assignment.
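A short sketch of both cases follows. The variable names and flag values are chosen for illustration only:

```perl
# A debugging environment, and an optimized variant derived from it.
$env1 = new cons(CFLAGS => '-g');
$env2 = $env1->clone(CFLAGS => '-O');

# A second name for the same environment: plain assignment copies the
# reference, so $alias and $env1 refer to one environment.
$alias = $env1;
```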

The copy method extracts the externally defined construction variables
from an environment and returns them as a list of name/value
pairs. Overrides can also be provided, in which case, the overridden values
will be returned, as appropriate. The returned list can be assigned to a
hash, as shown in the prototype, below, but it can also be manipulated in
other ways:

%env = $env1->copy(<overrides>);

The value of ENV, which is itself a hash, is also copied to a new hash,
so this may be changed without fear of affecting the original
environment. So, for example, if you really want to override just the
PATH variable in the default environment, you could do the following:
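One way to do this, sketched under the assumption that the copied ENV hash can be modified in place as described above (the path value is a placeholder, not a recommendation):

```perl
# Copy the default construction variables into an ordinary hash.
$default = new cons();
%vars = $default->copy();

# Change only PATH; the other ENV entries keep their copied values.
$vars{ENV}{PATH} = '/usr/local/bin:/usr/bin:/bin';   # placeholder path

# Build a new environment from the modified variables.
$env = new cons(%vars);
```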

The Install method arranges for the specified files to be installed in
the specified directory. The installation is optimized: the file is not
copied if it can be linked. If this is not the desired behavior, you will
need to use a different method to install the file. It is called as follows:

Install $env <directory>, <names>;

Note that, while the files to be installed may be arbitrarily named,
only the last component of each name is used for the installed target
name. So, for example, if you arrange to install foo/bar in baz,
this will create a bar file in the baz directory (not foo/bar).
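The naming rule can be illustrated with the foo/bar case just described. This is a sketch in which $env stands for any construction environment:

```perl
$env = new cons();

# Installs foo/bar as baz/bar: only the last component of the source
# name is used for the installed target.
Install $env 'baz', 'foo/bar';
```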

The InstallAs method arranges for the specified source file(s) to be
installed as the specified target file(s). Multiple files should be
specified as a file list. The installation is optimized: the file is not
copied if it can be linked. If this is not the desired behavior, you will
need to use a different method to install the file. It is called as follows:
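A plausible calling form is sketched here. The argument order (target first, then source) is an assumption made by analogy with Install, which names its destination first, and the file names are hypothetical; check it against your version of Cons:

```perl
# Single file: install hello.sh under a different name.
InstallAs $env 'export/bin/hello', 'hello.sh';

# Several files: a list of targets paired with a list of sources.
InstallAs $env ['tgt1', 'tgt2'], ['src1', 'src2'];
```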

The Precious method asks cons not to delete the specified file or
list of files before building them again. It is invoked as:

Precious <files>;

This is especially useful for allowing incremental updates to libraries
or debug information files which are updated rather than rebuilt anew each
time. Cons will still delete the files when the -r flag is specified.

The Command method is a catchall method which can be used to arrange for
any external command to be called to update the target. For this command, a
target file and list of inputs is provided. In addition a construction
command line, or lines, is provided as a string (this string may have
multiple commands embedded within it, separated by new lines). Command is
called as follows:

Command $env <target>, <inputs>, <construction command>;

The target is made dependent upon the list of input files specified, and the
inputs must be built successfully or Cons will not attempt to build the
target.

Within the construction command, any variable from the construction
environment may be introduced by prefixing the name of the construction
variable with %. This is recursive: the command is expanded until no more
substitutions can be made. If a construction variable is not defined in the
environment, then the null string will be substituted. A doubled %%
will be replaced by a single % in the construction command.

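Several pseudo variables are also expanded: %> names the target, %1, %2,
and so on name individual input files, and %< expands to the full set of
inputs. If any of the inputs have been used anywhere else in the
current command line (via %1, %2, etc.), then those will be deleted
from the list provided by %<. Consider the following command found in a
Conscript file in the test directory: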
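A sketch of such a command, assuming three input files named foo, bar, and baz and a target named tgt (all names chosen for illustration):

```perl
Command $env 'tgt', qw(foo bar baz), q(
    echo %< -i %1 > %>
    echo %< -i %2 >> %>
    echo %< -i %3 >> %>
);

# Assuming no remapping has been set up for the test directory, the
# first line should expand to something like:
#   echo test/bar test/baz -i test/foo > test/tgt
# because foo is named by %1 and is therefore dropped from %<.
```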

Any of the above pseudo variables may be followed immediately by one of
the following suffixes to select a portion of the expanded path name:

:a the absolute path to the file name
:b the directory plus the file name stripped of any suffix
:d the directory
:f the file name
:s the file name suffix
:F the file name stripped of any suffix

Continuing with the above example, %<:f would expand to foo bar baz,
and %>:d would expand to test.

It is possible to programmatically rewrite part of the command by
enclosing part of it between %[ and %]. This will call the
construction variable named as the first word enclosed in the brackets
as a Perl code reference; the results of this call will be used to
replace the contents of the brackets in the command line. For example,
given an existing input file named tgt.in:
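The following sketch assumes a construction variable named X_COMMA (a hypothetical name) that holds a code reference joining its arguments with commas:

```perl
@keywords = qw(foo bar baz);

# X_COMMA is an illustrative variable holding a Perl code reference.
$env = new cons(X_COMMA => sub { join(",", @_) });

# qq() interpolates @keywords, so the bracketed text becomes
# "X_COMMA foo bar baz"; Cons then calls X_COMMA with the rest.
Command $env 'tgt', 'tgt.in', qq(
    echo '# Keywords: %[X_COMMA @keywords %]' > %>
    cat %< >> %>
);
```

Under these assumptions the first command would write a line of the form "# Keywords: foo,bar,baz" to tgt before the contents of tgt.in are appended.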

After substitution occurs, strings of white space are converted into single
blanks, and leading and trailing white space is eliminated. It is therefore
not possible to introduce variable length white space in strings passed into
a command, without resorting to some sort of shell quoting.

If a multi-line command string is provided, the commands are executed
sequentially. If any of the commands fails, then none of the rest are
executed, and the target is not marked as updated, i.e. a new signature is
not stored for the target.

Normally, if all the commands succeed, and return a zero status (or whatever
platform-specific indication of success is required), then a new signature
is stored for the target. If a command erroneously reports success even
after a failure, then Cons will assume that the target file created by that
command is accurate and up-to-date.

The first word of each command string, after expansion, is assumed to be an
executable command looked up on the PATH environment variable (which is,
in turn, specified by the ENV construction variable). If this command is
found on the path, then the target will depend upon it: the command will
therefore be automatically built, as necessary. It's possible to write
multi-part commands to some shells, separated by semi-colons. Only the first
command word will be depended upon, however, so if you write your command
strings this way, you must either explicitly set up a dependency (with the
Depends method), or be sure that the command you are using is a system
command which is expected to be available. If it isn't available, you will,
of course, get an error.

If any command (even one within a multi-line command) begins with
[perl], the remainder of that command line will be evaluated by the
running Perl instead of being forked by the shell. If an error occurs
in parsing the Perl or if the Perl expression returns 0 or undef, the
command will be considered to have failed. For example, here is a simple
command which creates a file foo directly from Perl:
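A minimal sketch of such a command. The trailing 1 makes the Perl expression return a true value so that the command is treated as successful:

```perl
$env = new cons();

# q() keeps the text literal; the parentheses inside are balanced,
# so they do not end the quoted string early.
Command $env 'foo',
    q([perl] open(FOO, '>foo'); print FOO "hi\n"; close(FOO); 1);
```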

Note that when the command is executed, you are in the same package as
when the Construct or Conscript file was read, so you can call
Perl functions you've defined in the same Construct or Conscript
file in which the Command appears:
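For instance, a helper subroutine (named create_file here) could be defined alongside the Command that calls it. This is a sketch, not the only way to structure it:

```perl
$env = new cons();

# Helper defined in the same Construct or Conscript file.
# Returns 1 on success and 0 on failure, as [perl] commands require.
sub create_file {
    my $file = shift;
    open(FILE, ">$file") or return 0;
    print FILE "hi\n";
    close(FILE);
    return 1;
}

# %> is expanded by Cons to the target name before Perl evaluates it.
Command $env 'foo', "[perl] &create_file('%>')";
```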

The Perl string will be used to generate the signature for the derived
file, so if you change the string, the file will be rebuilt. The contents
of any subroutines you call, however, are not part of the signature,
so if you modify a called subroutine such as create_file above,
the target will not be rebuilt. Caveat user.

Cons normally prints a command before executing it. This behavior is
suppressed if the first character of the command is @. Note that
you may need to separate the @ from the command name or escape it to
prevent @cmd from looking like an array to Perl quote operators that
perform interpolation:

# The first command line is incorrect,
# because "@cp" looks like an array
# to the Perl qq// function.
# Use the second form instead.
Command $env 'foo', 'foo.in', qq(
@cp %< tempfile
@ cp tempfile %>
);

If there are shell meta characters anywhere in the expanded command line,
such as <, >, quotes, or semi-colon, then the command
will actually be executed by invoking a shell. This means that a command
such as:

cd foo

alone will typically fail, since there is no command cd on the path. But
the command string:

cd %<:d; tar cf %>:f %<:f

when expanded will still contain the shell meta character semi-colon, and a
shell will be invoked to interpret the command. Since cd is interpreted
by this sub-shell, the command will execute as expected.

To specify a command with multiple targets, you can specify a reference to a
list of targets. In Perl, a list reference can be created by enclosing a
list in square brackets. Hence the following command:

Command $env ['foo.h', 'foo.c'], 'foo.template', q(
gen %1
);

could be used in a case where the command gen creates two files, both
foo.h and foo.c.

The Objects method arranges to create the object files that correspond to
the specified source files. It is invoked as shown below:

@files = Objects $env <source or object files>;

Under Unix, source files ending in .s and .c are currently
supported, and will be compiled into a name of the same file ending
in .o. By default, all files are created by invoking the external
command which results from expanding the CCCOM construction
variable, with %< and %> set to the source and object
files, respectively (see the Command method for expansion details).
The variable CPPPATH is also used when scanning source files for
dependencies. This is a colon separated list of pathnames, and is also
used to create the construction variable _IFLAGS, which will contain
the appropriate list of -I options for the compilation. Any relative
pathnames in CPPPATH are interpreted relative to the directory in
which the associated construction environment was created (absolute
and top-relative names may also be used). This variable is used by
CCCOM. The behavior of this command can be modified by changing any
of the variables which are interpolated into CCCOM, such as CC,
CFLAGS, and, indirectly, CPPPATH. It's also possible to replace
the value of CCCOM, itself. As a convenience, this method returns the
list of object filenames.

The Program method arranges to link the specified program with the
specified object files. It is invoked in the following manner:

Program $env <program name>, <source or object files>;

The program name will have the value of the SUFEXE construction
variable appended (by default, .exe on Win32 systems, nothing on Unix
systems) if the suffix is not already present.

Source files may be specified in place of object files--the Objects
method will be invoked to arrange the conversion of all the files into
object files, and hence all the observations about the Objects method,
above, apply to this method also.

The actual linking of the program will be handled by an external command
which results from expanding the LINKCOM construction variable, with
%< set to the object files to be linked (in the order presented),
and %> set to the target (see the Command method for expansion
details). The user may set additional variables in the construction
environment, including LINK, to define which program to use for
linking, LIBPATH, a colon-separated list of library search paths,
for use with library specifications of the form -llib, and LIBS,
specifying the list of libraries to link against (in either -llib
form or just as pathnames). Relative pathnames in both LIBPATH and
LIBS are interpreted relative to the directory in which the associated
construction environment is created (absolute and top-relative names may
also be used). Cons automatically sets up dependencies on any libraries
mentioned in LIBS: those libraries will be built before the program
is linked.

The Library method arranges to create the specified library from the
specified object files. It is invoked as follows:

Library $env <library name>, <source or object files>;

The library name will have the value of the SUFLIB construction
variable appended (by default, .lib on Win32 systems, .a on Unix
systems) if the suffix is not already present.

Source files may be specified in place of object files--the Objects
method will be invoked to arrange the conversion of all the files into
object files, and hence all the observations about the Objects method,
above, apply to this method also.

The actual creation of the library will be handled by an external
command which results from expanding the ARCOM construction variable,
with %< set to the library members (in the order presented),
and %> to the library to be created (see the Command method
for expansion details). The user may set variables in the construction
environment which will affect the operation of the command. These
include AR, the archive program to use, ARFLAGS, which can be
used to modify the flags given to the program specified by AR, and
RANLIB, the name of an archive index generation program, if needed
(if the particular need does not require the latter functionality,
then ARCOM must be redefined to not reference RANLIB).

The Library method allows the same library to be specified in multiple
method invocations. All of the contributing objects from all the invocations
(which may be from different directories) are combined and generated by a
single archive command. Note, however, that if you prune a build so that
only part of a library is specified, then only that part of the library will
be generated (the rest will disappear!).

The Module method is a combination of the Program and Command
methods. Rather than generating an executable program directly, this command
allows you to specify your own command to actually generate a module. The
method is invoked as follows:
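By analogy with Program and Command, a plausible form is the module name, then the source or object files, then the construction command. That argument order, the file names, and the presence of %LD and %LDFLAGS in the environment are all assumptions to be checked against your version of Cons:

```perl
$env = new cons();

# Assumed order: target, inputs, then the command that builds the module.
# %LD and %LDFLAGS are assumed to be defined in the environment.
Module $env 'mymodule', 'source.c', 'linkage.o', q(
    %LD -r %LDFLAGS %< -o %>
);
```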

The Depends method allows you to specify additional dependencies for a
target. It is invoked as follows:

Depends $env <target>, <dependencies>;

This may be occasionally useful, especially in cases where no scanner exists
(or is writable) for particular types of files. Normally, dependencies are
calculated automatically from a combination of the explicit dependencies set
up by the method invocation or by scanning source files.

A set of identical dependencies for multiple targets may be specified
using a reference to a list of targets. In Perl, a list reference can
be created by enclosing a list in square brackets. Hence the following
command:

Depends $env ['foo', 'bar'], 'input_file_1', 'input_file_2';

specifies that both the foo and bar files depend on the listed
input files.

The Ignore method allows you to explicitly ignore dependencies that
Cons infers on its own. It is invoked as follows:

Ignore <patterns>;

This can be used to avoid recompilations due to changes in system header
files or utilities that are known to not affect the generated targets.

If, for example, a program is built in an NFS-mounted directory on
multiple systems that have different copies of stdio.h, the differences
will affect the signatures of all derived targets built from source files
that #include <stdio.h>. This will cause all those targets to
be rebuilt when changing systems. If this is not desirable behavior, then
the following line will remove the dependencies on the stdio.h file:

Ignore '^/usr/include/stdio\.h$';

Note that the arguments to the Ignore method are regular expressions,
so special characters must be escaped and you may wish to anchor the
beginning or end of the expression with ^ or $ characters.

The SplitPath method looks up multiple path names in a string separated
by the default path separator for the operating system (':' on UNIX
systems, ';' on Windows NT), and returns the fully-qualified names.
It is invoked as follows:

@paths = SplitPath <pathlist>;

The SplitPath method will convert names prefixed with '#' to the
appropriate top-level build name (without the '#') and will convert
relative names to top-level names.

The Help method specifies help text that will be displayed when the
user invokes cons -h. This can be used to provide documentation
of specific targets, values, build options, etc. for the build tree.
It is invoked as follows:

Help <helptext>;

The Help method may only be called once, and should typically be
specified in the top-level Construct file.
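A sketch of a Help call in a top-level Construct file, using a here-document for the text. The target names follow the directory layout used in earlier examples and are otherwise illustrative:

```perl
# Text shown by "cons -h". The quoted 'EOF' prevents interpolation.
Help <<'EOF';
Targets:
  export    build and install the production system
  tests     build the unit tests
  .         build everything in the tree
EOF
```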

There are several ways of extending Cons, which vary in degree of
difficulty. The simplest method is to define your own construction
environment, based on the default environment, but modified to reflect your
particular needs. This will often suffice for C-based applications. You can
use the new constructor, and the clone and copy methods to create
hybrid environments. These changes can be entirely transparent to the
underlying Conscript files.

For slightly more demanding changes, you may wish to add new methods to the
cons package. Here's an example of a very simple extension,
InstallScript, which installs a tcl script in a requested location, but
edits the script first to reflect a platform-dependent path that needs to be
installed in the script:
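A sketch of such a method follows. The global $BINDIR, the placeholder string your-path-here, and the use of sed and chmod are assumptions made for illustration:

```perl
# cons::InstallScript - install a script after replacing the string
# "your-path-here" with the platform-specific path in $BINDIR.
# $BINDIR is assumed to be a global set elsewhere in the Construct file.
sub cons::InstallScript {
    my ($env, $dst, $src) = @_;
    # qq() interpolates $BINDIR now; %< and %> are left for Cons.
    Command $env $dst, $src, qq(
        sed s+your-path-here+$BINDIR+ %< > %>
        chmod oug+x %>
    );
}
```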

Notice that this method is defined directly in the cons package (by
prefixing the name with cons::). A change made in this manner will be
globally visible to all environments, and could be called as in the
following example:

InstallScript $env "$BIN/foo", "foo.tcl";

For a small improvement in generality, the BINDIR variable could be
passed in as an argument or taken from the construction environment--as
%BINDIR.

Instead of adding the method to the cons name space, you could define a
new package which inherits existing methods from the cons package and
overrides or adds others. This can be done using Perl's inheritance
mechanisms.

The following example defines a new package cons::switch which overrides the
standard Library method. The overridden method builds linked library
modules, rather than library archives. A new constructor is
provided. Environments created with this constructor will have the new
library method; others won't.
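A sketch of such a package, assuming the environment defines %LD and %LDFLAGS and that the linker accepts -r to produce a relocatable module:

```perl
package cons::switch;
BEGIN { @ISA = 'cons' }     # inherit all methods from cons

# Constructor: build an ordinary cons environment, then rebless it
# into this package so that the overridden Library method is found.
sub new {
    shift;
    bless new cons(@_);
}

# Replacement Library method: link the objects into a single module
# instead of creating an archive.
sub Library {
    my($env) = shift;
    my($lib) = shift;
    my(@objs) = Objects $env @_;
    Command $env $lib, @objs, q(
        %LD -r %LDFLAGS %< -o %>
    );
}
```

An environment created with new cons::switch(<overrides>) would then use this Library method, while environments created with new cons() keep the standard one.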

The cons command is usually invoked from the root of the build tree. A
Construct file must exist in that directory. If the -f argument is
used, then an alternate Construct file may be used (and, possibly, an
alternate root, since cons will cd to the Construct file's containing
directory).

If cons is invoked from a child of the root of the build tree with
the -t argument, it will walk up the directory hierarchy looking for a
Construct file. (An alternate name may still be specified with -f.)
The targets supplied on the command line will be modified to be relative
to the discovered Construct file. For example, from a directory
containing a top-level Construct file, the following invocation:

% cd libfoo/subdir
% cons -t target

is exactly equivalent to:

% cons libfoo/subdir/target

If there are any Default targets specified in the directory hierarchy's
Construct or Conscript files, only the default targets at or below
the directory from which cons -t was invoked will be built.

Show command that would have been executed, when retrieving from cache. No
indication that the file has been retrieved is given; this is useful for
generating build logs that can be compared with real build logs.

And construct-args can be any arguments that you wish to process in the
Construct file. Note that there should be a -- separating the arguments
to cons and the arguments that you wish to process in the Construct file.

Processing of construct-args can be done by any standard package like
Getopt or its variants, or any user defined package. cons will pass in
the construct-args as @ARGV and will not attempt to interpret anything
after the --.

Note that cons -r . is equivalent to a full recursive make clean,
but requires no support in the Construct file or any Conscript
files. This is most useful if you are compiling files into source
directories (if you separate the build and export directories,
then you can just remove the directories).

The options -p, -pa, and -pw are extremely useful as an aid
in reading scripts or debugging them. If you want to know what script
installs export/include/foo.h, for example, just type
cons -pw export/include/foo.h.

QuickScan allows simple target-independent scanners to be set up for source
files. Only one QuickScan scanner may be associated with any given source
file and environment.

QuickScan is invoked as follows:

QuickScan CONSENV CODEREF, FILENAME [, PATH]

The subroutine referenced by CODEREF is expected to return a list of
filenames included directly by FILENAME. These filenames will, in turn, be
scanned. The optional PATH argument supplies a lookup path for finding
FILENAME and/or files returned by the user-supplied subroutine. The PATH
may be a reference to an array of lookup-directory names, or a string of
names separated by the system's separator character (':' on UNIX systems,
';' on Windows NT).

The subroutine is called once for each line in the file, with $_ set to the
current line. If the subroutine needs to look at additional lines, or, for
that matter, the entire file, then it may read them itself, from the
filehandle SCAN. It may also terminate the loop, if it knows that no further
include information is available, by closing the filehandle.

Whether or not a lookup path is provided, QuickScan first tries to look up
the file relative to the current directory (for the top-level file supplied
directly to QuickScan), or from the directory containing the file which
referenced the file. This is not very general, but seems good
enough--especially if you have the luxury of writing your own utilities and
can control the use of the search path in a standard way. Finally, the
search path is, currently, colon separated. This may not make the NT camp
happy.
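As a sketch of how QuickScan might be combined with Command, consider a hypothetical method cons::SMFgen that runs a generator called smfgen on .smf files. The method name, the output file names, and the SMF_INCLUDE_PATH variable are all assumptions made for illustration:

```perl
sub cons::SMFgen {
    my($env, @tables) = @_;
    foreach my $t (@tables) {
        # Scanner: return every word ending in .smf found on the line
        # held in $_; these are treated as included files.
        $env->QuickScan(sub { /\b\S*?\.smf\b/g }, "$t.smf",
                        $env->{SMF_INCLUDE_PATH});
        # Generate the outputs from the table description.
        $env->Command(["$t.smdb.cc", "$t.smdb.h"], "$t.smf", q(
            smfgen %<
        ));
    }
}
```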

[NOTE that the form $env->QuickScan ... and $env->Command
... should not be necessary, but, for some reason, is required
for this particular invocation. This appears to be a bug in Perl or
a misunderstanding on my part; this invocation style does not always
appear to be necessary.]

This finds all names of the form <name>.smf in the file. It will return the
names even if they're found within comments, but that's OK (the mechanism is
forgiving of extra files; they're just ignored on the assumption that the
missing file will be noticed when the program, in this example, smfgen, is
actually invoked).

A scanner is only invoked for a given source file if it is needed by some
target in the tree. It is only ever invoked once for a given source file.

Here is another way to build the same scanner. This one uses an
explicit code reference, and also (unnecessarily, in this case) reads
the whole file itself:
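A sketch of such a scanner. The loop test comes at the end because the first line has already been read into $_ when the subroutine is called; the subroutine name and the file name in the usage line are illustrative:

```perl
sub myscan {
    my(@includes);
    do {
        # Collect every .smf name on the current line.
        push(@includes, /\b\S*?\.smf\b/g);
    } while (defined($_ = <SCAN>));   # read the remaining lines ourselves
    @includes;
}

# Attach the scanner to a source file (hypothetical name).
$env->QuickScan(\&myscan, 'tables.smf');
```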