Configurable User Documentation; or, How I Came to Write a Language with a Future
Conditional

In 1984 we decided that the Ohio State University Computer and Information Science
department (OSU-CIS) needed an introduction to the department computing facilities. We
wanted a document that pulled together all the information that a new graduate student or
faculty member might need: site policies, electronic mail addresses, a list of facilities
provided, and basic introductions to the various operating systems the department
supported. Creating such a document was particularly important since there were no good
introductions to our two primary platforms: a DECsystem-20 running TOPS-20, and a VAX
running BSD UNIX.

Much of this important information was getting passed student to student in the form of
oral traditions. This is fine, except that information was not spread around evenly (you
needed to know one of the system gurus or you were missing important information), and
once oral traditions were started it was almost impossible to kill them, even when the
information became incorrect or irrelevant.

Being firm believers in not working any harder than we needed to, we looked around for
a suitable existing document that we could take and adapt to our local requirements. We
found a document which had been written for the Computer Science department at
Carnegie-Mellon. Years went by, and various people at OSU-CIS edited the document, and
edited the document, and edited the document. It grew to be three times its original size,
changed text formatter, developed its own layout, grew an annotated bibliography, added
its own font, and otherwise consumed time and resources.

And then the two of us who had maintained it at OSU-CIS left, to go to other jobs. Lo
and behold, the new sites we were at had no equivalent user documentation. But the old
OSU-CIS Facilities Guide didn't quite work for them. On the other hand, there was not a
chance that either of us was going to throw it out and start over, having invested several
years into it already. Therefore, in a spirit of enlightened self-interest, we set out to
build it into a user-configurable book -- something that we could use at both our sites,
take to new sites, and let other people use as well.

We found the available documentation inadequate for our users. There are three main
types of documentation that you can get without writing it yourself: manual pages, other
documentation from the manufacturer, and books.

Manual pages are not useful for truly naive users. They don't have the right sort of
information (you can't find out what your e-mail address is, or what editor you should
use); the information they do have is often in words that new users don't understand; and
they do have all sorts of irrelevant information that acts as noise.

Next is the manufacturer's documentation. How good the manufacturer's documentation is
varies; unfortunately, the range is from ``almost good enough'' to ``you're not certain
whether to laugh or to cry''. [The manufacturer that shipped an installation manual
and a third-generation photocopy of the Berkeley 4.2 docs -- complete with font tables for
Berkeley's Versatec printer -- falls into the latter category]. Even the best quality
documentation from manufacturers tends to fall short of what the average user needs.

Most often the manufacturer provides a very simplistic ``getting started''
guide, and then a stack of documentation which includes all the man pages in
printed form and stacks of detailed manuals describing major software systems such as troff,
program development tools, etc. As a result, accumulating enough
manufacturer-supplied documentation to introduce a new user to a system generally produces
a foot-high stack -- the ``getting started'' guide is never enough. The new user's
response to the foot-high stack is to put it in a corner and whimper, never looking at the
documentation because it is too overwhelming. Furthermore, the site that uses only and
exactly the software that their hardware manufacturer supplies has yet to be found.

Because of these well-understood problems with vendor documentation, you can go to your
neighborhood bookstore and buy an introduction to UNIX. These days, you can even buy an
introduction to Berkeley and Berkeley-derived versions of UNIX. Unfortunately, the quality
of these introductions varies as much as that of the manufacturer-supplied documentation. Even the
best of the books will tell you how to use only and exactly what most hardware
manufacturers supply. For instance, it will almost certainly tell you how to use vi
and troff, which is all very well if that's what your users are supposed to
prefer. If your standard is emacs and LaTeX, or FrameMaker, these books will
merely confuse your users. By the time you have put together the documentation that lists
all the ways your site is different from what the book says, not only have you done as
much work as it would have taken to write the documentation from scratch, but you are also
back to the foot-high stack of documentation.

Many of these goals were met by the OSU-CIS Facilities Guide; it had most of the
information in it, the writing style was accessible, there was a good index, it was
fanatically structured, and it had a good bibliography. In most cases, it already had
macros in place to avoid requiring multiple changes of the same information, and it even
had some macros set to allow chapters or sections to be used as stand-alone documents.
Unfortunately, it failed pretty badly on portability. What portability it had was the
result of adapting to changes in the department's configuration over the years; there had
never been any hesitation about coding assumptions about being a university in the middle
of Ohio, for instance.

The changes that had already been made were in the form of LaTeX macros. After the year
in which the staff offices moved 4 times (and some staff members moved 12), all the names
and addresses had been split out into a configuration file, for instance. We kept those,
and added to them where appropriate.

Simple string replacement was not enough to fix all the changeable parts of the
document, however, and we had to add two further methods of customizing it. The first was
taking text that was clearly unsalvageable -- so changeable that it was going to have to
be rewritten for every site -- and isolating it in individual files. Examples of what we
put into separate files include site policies for accounts and user behavior; descriptions
of printing devices; dialup information; and tables showing supported programming
languages. All of these files go in a ``Local'' subdirectory, and examples are provided
with the document. Pulling the files into the main document is done with normal LaTeX
input primitives.

For other parts of the document, we needed a solution intermediate between string
replacement and file insertion; in fact, we needed something like conditional compilation,
where you could have the appearance of text depend on the value of variables. This could
be very simple, like omitting the VMS chapter for sites that don't have VMS, or
considerably more complex. Unfortunately, tools for text production don't provide features
like conditional compilation, except to a limited extent. TeX conditionals really are not
well suited for trying to comment out entire blocks of code; it's possible, but it isn't
pretty.

Tools that are intended for programs do conditionals well, but break other things. For
instance, cpp leaves blank lines where all its directives are, which is lovely
for preserving line numbering, but really upsets life when blank lines change the
semantics of the language -- as they do in TeX and troff. Furthermore, it's
highly inconvenient to have #defined variables interpreted wherever they occur in the
text. Capitalization may suffice to distinguish between a preprocessor directive and a
language element in C, but in English you don't get free choice of how to capitalize
things, and you end up making outrageously ugly #defines to avoid having your text diced.
(Instead of using ``#ifdef UNIX'', you must use something like ``#ifdef UNIXP''. Or
``#ifdef unix'' -- after all, ``unix'' is not legal in text unless it happens to be the
name of a UNIX command -- but that really upsets people who are accustomed to C.)
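
The dicing problem is easy to reproduce. What follows is a minimal Python sketch, not cpp itself: the function name expand_macros and the replacement value ``1'' are illustrative assumptions, and whole-word matching only approximates the way a C preprocessor recognizes identifiers.

```python
import re


def expand_macros(text, macros):
    """Replace every whole-word occurrence of a macro name with its value.

    This imitates, roughly, what a C-style preprocessor does to any
    identifier that matches a defined macro, wherever it appears.
    """
    if not macros:
        return text
    names = "|".join(re.escape(name) for name in macros)
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda match: macros[match.group(1)], text)


sentence = "UNIX is an operating system."

# A macro named UNIX rewrites ordinary prose that mentions UNIX.
print(expand_macros(sentence, {"UNIX": "1"}))

# An ugly name such as UNIXP never occurs in the prose, so the text survives.
print(expand_macros(sentence, {"UNIXP": "1"}))
```

With the macro named UNIX, the sentence loses its subject; with UNIXP, it passes through untouched, which is exactly why the ugly names become necessary.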

Furthermore, there are some structures in English that are not well handled by ``if''
and ``if not''. Using cpp to try to control a sentence that is supposed to end up
saying something like ``There are three operating systems you can make your home on; UNIX,
TOPS-20, and VMS'' when you don't know how many operating systems will be involved is
very, very nasty. When I tried it, I ended up with a screen full of intricately nested
ifs, and a bad headache.

Our solution to this was to build a text pre-processor, which we call tpp. tpp
carefully gets rid of lots of the features of cpp, and introduces a few new ones
instead. The future conditionals mentioned in the title are among them; in sentences like
the one above, you can create a case statement that fixes the ``are three'' to ``is one''
or ``are many'' depending on how many things are going to be true when you get to the end
of the sentence.

Tpp is currently a very small language, containing a whole 7 directives. It is
implemented as a Perl program. The 7 tpp directives are define, undef, if,
ifndef, conj, number and bynumber. Tpp currently
considers any line beginning ``%#'' to be information for it; it pays no attention to
other lines, and either emits them unchanged or doesn't emit them at all. ``%#'' was
chosen for LaTeX's benefit, to allow a tpp document to be processed by LaTeX without first
having been run through tpp. Future versions of tpp will allow you to choose an attention
sequence.

In the current version of tpp, variables are either set or unset; they don't take
values. By default, they are unset; it is perfectly legal to unset a variable that is
already unset, set one that's already set, or reference one that was never explicitly set
or unset. if and ifndef control emission of text, and have matching else
and endif. The following code produces ``Perl is fun.'' as its only output:

%#define fun
%#if fun
Perl is fun.
%#else
Perl is not fun.
%#endif
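
The behaviour described above can be sketched in a few lines. This is a hedged Python illustration, not the Perl implementation of tpp: the function name tpp_conditionals is hypothetical, and it assumes that directives inside a suppressed branch are ignored and that every else and endif has a matching if or ifndef.

```python
def tpp_conditionals(text, attention="%#"):
    """Handle define, undef, if, ifndef, else and endif lines.

    Lines that do not start with the attention sequence are either
    emitted unchanged or dropped; directive lines never produce output,
    so no stray blank lines are left behind.
    """
    defined = set()   # variables are only set or unset; they carry no value
    stack = []        # one flag per open conditional: is its branch live?
    out = []
    for line in text.splitlines():
        if not line.startswith(attention):
            if all(stack):            # every enclosing branch is live
                out.append(line)
            continue
        words = line[len(attention):].split()
        if not words:
            continue
        name, args = words[0], words[1:]
        live = all(stack)
        if name == "define" and live:
            defined.add(args[0])
        elif name == "undef" and live:
            defined.discard(args[0])  # unsetting an unset variable is legal
        elif name == "if":
            stack.append(args[0] in defined)
        elif name == "ifndef":
            stack.append(args[0] not in defined)
        elif name == "else":
            stack[-1] = not stack[-1]
        elif name == "endif":
            stack.pop()
    return "\n".join(out)


source = "\n".join([
    "%#define fun",
    "%#if fun",
    "Perl is fun.",
    "%#else",
    "Perl is not fun.",
    "%#endif",
])
print(tpp_conditionals(source))
```

Keeping a stack of flags rather than a single flag is what lets nested conditionals work: a line is emitted only when every enclosing branch is live.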

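conj is the conjunction statement, used to output lists in English; it takes a
conjunction as an argument, and then cases by variable. A sentence built around a conj
with ``and'' as its argument and one case per animal, with every variable set,
produces ``It's true; mares eat oats, does eat oats, and little lambs eat ivy.'' If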
``mares'' and ``does'' are unset, the sentence reads ``It's true; little lambs eat ivy.''
This may seem unimpressive, until you try to produce this effect with only cpp
directives.
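
The joining rule that conj applies can be sketched as follows. This is an illustrative Python version, not tpp's own code: the names conjoin and conj_block and the variable name ``lambs'' are hypothetical, and the two-item form (``a and b'', with no comma) is an assumption, since the text only shows the one-item and three-item results.

```python
def conjoin(items, conjunction="and"):
    """Join phrases into an English list using the given conjunction."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)          # zero or one item needs no conjunction
    if len(items) == 2:
        # Assumed form for two items; the source does not show this case.
        return f"{items[0]} {conjunction} {items[1]}"
    return ", ".join(items[:-1]) + f", {conjunction} {items[-1]}"


def conj_block(cases, defined, conjunction="and"):
    """Keep the text of each case whose variable is set, then join them."""
    return conjoin([text for var, text in cases if var in defined], conjunction)


cases = [
    ("mares", "mares eat oats"),
    ("does", "does eat oats"),
    ("lambs", "little lambs eat ivy"),   # variable name is a guess
]
print("It's true; " + conj_block(cases, {"mares", "does", "lambs"}) + ".")
print("It's true; " + conj_block(cases, {"lambs"}) + ".")
```

The point of the construct is that the punctuation and the placement of the conjunction depend on how many cases survive, which is information no single if can see.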

number takes ``last'' or ``next'' as an argument, and returns the number of
cases that were true in the most recent conj, or are going to be true in the next
one. This allows you to say

There are
%#number next
main weapons of the Spanish Inquisition:
%#conj and
%#case fear
fear
%#case surprise
surprise
%#case devotion
a fanatical devotion to the pope
%#endconj
.

and not have the usual problem with getting the number right.
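
The ``next'' form is the future conditional: the count has to be known before the conj it describes has been read. One way to get it is to scan ahead, as in this Python sketch. It is an assumption about the approach, not a description of how tpp does it; the function name count_next_conj is hypothetical, nested conj blocks are not handled, and the sketch returns an integer because the text does not say whether tpp writes the number as a digit or a word.

```python
def count_next_conj(lines, start, defined, attention="%#"):
    """Count the cases in the next conj block whose variable is set.

    Scanning begins at index `start`, which would be the line just after
    a "number next" directive, and stops at the first endconj.
    """
    in_conj = False
    count = 0
    for line in lines[start:]:
        if not line.startswith(attention):
            continue                    # ordinary text does not affect the count
        words = line[len(attention):].split()
        if not words:
            continue
        if words[0] == "conj":
            in_conj = True
        elif words[0] == "case" and in_conj and len(words) > 1:
            if words[1] in defined:
                count += 1
        elif words[0] == "endconj" and in_conj:
            break
    return count


document = [
    "There are",
    "%#number next",
    "main weapons of the Spanish Inquisition:",
    "%#conj and",
    "%#case fear",
    "fear",
    "%#case surprise",
    "surprise",
    "%#case devotion",
    "a fanatical devotion to the pope",
    "%#endconj",
    ".",
]
print(count_next_conj(document, 2, {"fear", "surprise", "devotion"}))
```

The ``last'' form needs no look-ahead at all; it only has to remember the count from the conj that was most recently processed.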

bynumber also takes ``last'' or ``next'' as an argument, but it then cases on
the result. It allows ranges, and also the use of the keyword ``many'' to mean ``a bigger
number than I've got a case for'', allowing

There
%#bynumber next
%#case 0
are no weapons
%#case 1-3
are a few main weapons
%#case 4-10
are several main weapons
%#case many
are lots and lots of main weapons
%#endnumber
of the Spanish Inquisition.
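
Choosing a case by number, with ranges and ``many'', can be sketched like this. The Python below is illustrative only: the function name pick_case is hypothetical, and it assumes that ``many'' applies only when the count is larger than every numbered case, following the description of it as ``a bigger number than I've got a case for''.

```python
def pick_case(n, cases):
    """Return the text of the case matching the count n.

    Each label is a single number ("0"), an inclusive range ("1-3"),
    or the keyword "many".
    """
    many_text = None
    highest = -1                        # largest number named by any case
    for label, text in cases:
        if label == "many":
            many_text = text
            continue
        low, _, high = label.partition("-")
        low, high = int(low), int(high or low)
        if low <= n <= high:
            return text
        highest = max(highest, high)
    # "many" covers only counts beyond every numbered case.
    return many_text if n > highest else None


weapon_cases = [
    ("0", "are no weapons"),
    ("1-3", "are a few main weapons"),
    ("4-10", "are several main weapons"),
    ("many", "are lots and lots of main weapons"),
]
for count in (0, 3, 4, 11):
    print("There", pick_case(count, weapon_cases), "of the Spanish Inquisition.")
```

A count that falls into a gap between numbered cases yields None here; what tpp does in that situation is not stated in the text.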

Our first goal was for the Facilities Guide to be non-threatening. One part of being
non-threatening was keeping the Facilities Guide down to a reasonable size without a
novice user needing other reference material. Unfortunately, that's only so achievable;
the Facilities Guide configured for the Ohio State Physics department produces 180 pages
of text. This is enough to be scary to a new user. We try to ease these concerns with our
introduction, which tells new users that they don't have to learn everything that is
contained in the Facilities Guide. The first chapter is a roadmap, which explains what
each of the chapters is about and tells a new (or experienced) user which chapters
should be helpful.

We wrote the second chapter of the Facilities Guide for the computer neophyte: someone
who had no idea what a text editor was, much less why using electronic mail is useful.
Having this introduction makes the rest of the Guide more useful, since we can assume
a basic level of understanding. The first chapter advises experienced users to skip or
skim this chapter, since they presumably do not want to be told what a text formatter does
and why you would want to use one.

The middle chapters of the guide address specific operating systems (currently these
are UNIX, VMS, TOPS-20, and the Macintosh OS, although an MS-DOS chapter is in progress).
Each of the operating system chapters follows the same basic outline (as does the
introductory chapter), although the details change from operating system to operating
system. The hope is that the information comes in the order that people need it. Roughly,
the structure is:

How to get in.

How to change your password.

How to get out.

How to give commands.

How to use electronic mail (including the information about what your mail address is).

The file system; how it is arranged, and how you deal with files.

Text editing.

Formatting systems.

Printers.

Programming languages and tools.

Other topics.

Games.

In some cases, not all the useful information will fit into this relatively rigid
structure. The UNIX chapter, for instance, has long since grown into multiple chapters,
with a separate chapter to deal with window systems. Where possible, these additional
chapters echo the same structure (for instance, the description of each window system
starts by telling you how to start it up, how to shut it down, and how to get help).

In order to keep the operating system chapters relatively short and non-repetitive,
information about significant tools that occur on multiple operating systems is pulled out
into separate chapters. For instance, a separate text formatting chapter is provided with
the information about LaTeX. The operating system chapters provide the operating system
dependent information (how to start the executable, how to print a dvi file, where to find
sample documents) and then refer to the specific chapter.

We have created an extensive index and a detailed table of contents, and a large
annotated bibliography. The layout of the Facilities Guide is designed to make specific
sections easy to find by using a lot of white space and horizontal rules.

The use of tpp and various TeX primitives permits us to intersperse
site-specific information within the body of a very generalized section. Examples given in
the text use host names and the command line prompts that are set locally. Rather than
talking in general about printer support, we can say what printer is preferred for high
quality output, and which printer is the fastest. In the section discussing remote access
we can tell users what phone numbers to use and what they need to type to gain access to
our machine via a dialup.

The version of the Facilities Guide I am looking at now takes up nearly three megabytes
of disk; that is for a copy built for my configuration as far as the LaTeX source. That does not include
either LaTeX or Perl, which work together to create the manual. Along with the actual
text, the manual distribution contains not only tpp, but also an indexing program
and, believe it or not, a PostScript font -- the font is only actually required for the
Macintosh chapter.

All in all, it's a lot of trouble to go to for a manual, especially when you consider
that you may still end up creating sections for programs or operating systems that we
don't run. On the other hand, it's a lot less trouble than writing your own manual from
scratch, and we will happily merge other people's sections in, with credit. (The
acknowledgments section is now well into its second page.)

As it turns out, there are programs besides LaTeX and troff that care deeply
about empty lines and may have almost anything in them -- sendmail, for instance.
While the more esoteric text processing features of tpp are out of place in a
sendmail.cf, the ability to do cpp-style preprocessing to create multiple
sendmail.cfs out of a single master can be extremely handy at a large site, and in fact
OSU-CIS, which is not presently using the Facilities Guide, is using tpp for that purpose.

Mark Verber was a system programmer for the Physics Department at The Ohio State
University. He discovered UNIX in 1978 as a high school student and has been working for
OSU since 1980. Reach him via U.S. Mail at The Ohio State University; Physics Department;
174 W. 18th Avenue; Columbus, Ohio 43210. Reach him electronically at . Mark now works
for WebTV.

Elizabeth Zwicky was a system administrator for the Information, Telecommunication and
Automation Division at SRI International, where she is known for writing peculiar programs
in languages beginning with the letter ``P''. Reach her via U.S. Mail at SRI
International; 333 Ravenswood Avenue; Menlo Park, CA 94025. Reach her electronically at zwicky@erg.sri.com.
Elizabeth now works for SGI.