Dial-Up Network of UNIX Systems SMM:21-1
A Dial-Up Network of UNIX(TM) Systems
D. A. Nowitz
M. E. Lesk
AT&T Bell Laboratories
Murray Hill, New Jersey 07974
ABSTRACT
A network of over eighty UNIX* computer systems
has been established using the telephone system as its
primary communication medium. The network was designed
to meet the growing demands for software distribution
and exchange. Some advantages of our design are:
- The startup cost is low. A system needs only a
dial-up port, but systems with automatic calling
units have much more flexibility.
- No operating system changes are required to
install or use the system.
_________________________
* UNIX is a registered trademark of AT&T Bell Laboratories
in the USA and other countries.
- The communication is basically over dial-up lines;
however, hardwired communication lines can be used
to increase speed.
- The command for sending/receiving files is simple
to use.
Keywords: networks, communications, software dis-
tribution, software maintenance
1. Purpose
The widespread use of the UNIX system[1] within Bell Labora-
tories has produced problems of software distribution and mainte-
nance. A conventional mechanism was set up to distribute the
operating system and associated programs from a central site to
the various users. However, this mechanism alone does not meet all
software distribution needs. Remote sites generate much software
and must transmit it to other sites. Some UNIX systems are them-
selves central sites for redistribution of a particular special-
ized utility, such as the Switching Control Center System. Other
sites have particular, often long-distance, needs for software
exchange; switching research, for example, is carried on in New
Jersey, Illinois, Ohio, and Colorado. In addition, general pur-
pose utility programs are written at all UNIX system sites. The
UNIX system is modified and enhanced by many people in many
places and it would be very constricting to deliver new software
in a one-way stream without any alternative for the user sites to
respond with changes of their own.
Straightforward software distribution is only part of the
problem. A large project may exceed the capacity of a single
computer, and several machines may be used by one group of
people. It then becomes necessary for them to pass messages, data,
and other information back and forth between computers.
Several groups with similar problems, both inside and out-
side of Bell Laboratories, have constructed networks built of
hardwired connections only.[1,2] Our network, however, uses both
dial-up and hardwired connections so that service can be provided
to as many sites as possible.
2. Design Goals
Although some of our machines are connected directly, others
can only communicate over low-speed dial-up lines. Since the
dial-up lines are often unavailable and file transfers may take
considerable time, we spool all work and transmit in the back-
ground. We also had to adapt to a community of systems which are
independently operated and resistant to suggestions that they
should all buy particular hardware or install particular operat-
ing system modifications. Therefore, we make minimal demands on
the local sites in the network. Our implementation requires no
operating system changes; in fact, the transfer programs look
like any other user entering the system through the normal dial-
up login ports, and obeying all local protection rules.
We distinguish ``active'' and ``passive'' systems on the
network. Active systems have an automatic calling unit or a
hardwired line to another system, and can initiate a connection.
Passive systems do not have the hardware to initiate a connec-
tion. However, an active system can be assigned the job of cal-
ling passive systems and executing work found there; this makes a
passive system the functional equivalent of an active system,
except for an additional delay while it waits to be polled. Also,
people frequently log into active systems and request copying
from one passive system to another. This requires two telephone
calls, but even so, it is faster than mailing tapes.
Where convenient, we use hardwired communication lines.
These permit much faster transmission and multiplexing of the
communications link. Dial-up connections are made at either 300
or 1200 baud; hardwired connections are asynchronous up to 9600
baud and might run even faster on special-purpose communications
hardware.[3,4] Thus, systems typically join our network first as
passive systems and when they find the service more important,
they acquire automatic calling units and become active systems;
eventually, they may install high-speed links to particular
machines with which they handle a great deal of traffic. At no
point, however, must users change their programs or procedures.
The basic operation of the network is very simple. Each par-
ticipating system has a spool directory, in which work to be done
(files to be moved, or commands to be executed remotely) is
stored. A standard program, uucico, performs all transfers. This
program starts by identifying a particular communication channel
to a remote system with which it will hold a conversation. Uucico
then selects a device and establishes the connection, logs onto
the remote machine and starts the uucico program on the remote
machine. Once two of these programs are connected, they first
agree on a line protocol, and then start exchanging work. Each
program in turn, beginning with the calling (active system) pro-
gram, transmits everything it needs, and then asks the other what
it wants done. Eventually neither has any more work, and both
exit.
In this way, all services are available from all sites; pas-
sive sites, however, must wait until called. A variety of proto-
cols may be used; this conforms to the real, non-standard world.
As long as the caller and called programs have a protocol in com-
mon, they can communicate. Furthermore, each caller knows the
hours when each destination system should be called. If a desti-
nation is unavailable, the data intended for it remain in the
spool directory until the destination machine can be reached.
The implementation of this Bell Laboratories network between
independent sites, all of which store proprietary programs and
data, illustrates the pervasive need for security and administrative
controls over file access. Each site, in configuring its
programs and system files, limits and monitors transmission. In
order to access a file a user needs access permission for the
machine that contains the file and access permission for the file
itself. This is achieved by first requiring the user to log into his
local machine with his password; the local machine then
logs into the remote machine whose files are to be accessed. In
addition, records are kept identifying all files that are moved
into and out of the local system, and how the requestor of such
accesses identified himself. Some sites may arrange to permit
users only to call up and request work to be done; the calling
users are then called back before the work is actually done. It
is then possible to verify that the request is legitimate from
the standpoint of the target system, as well as the originating
system. Furthermore, because of the call-back, no site can
masquerade as another even if it knows all the necessary pass-
words.
Each machine can optionally maintain a sequence count for
conversations with other machines and require a verification of
the count at the start of each conversation. Thus, even if
call-back is not in use, a successful masquerade requires the calling
party to present the correct sequence number. A would-be
impersonator must steal not just the correct phone number, user name,
and password, but also the sequence count, and must call in
promptly enough to precede the next legitimate request from
either side. Even a successful masquerade will be detected on the
next correct conversation.
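The sequence-count check can be illustrated with a minimal sketch.
The names SequenceLedger and verify_and_advance are hypothetical, and
the rule that the count advances by one per conversation is an
assumption; the text above says only that the count is verified at
the start of each conversation.

```python
class SequenceLedger:
    """Per-remote conversation counters kept by one machine (hypothetical)."""

    def __init__(self):
        self.counts = {}  # remote system name -> last agreed sequence count

    def verify_and_advance(self, remote, presented):
        """Accept the call only if the presented count is the next expected one."""
        expected = self.counts.get(remote, 0) + 1  # assumption: advance by one
        if presented != expected:
            return False  # mismatch: refuse the conversation
        self.counts[remote] = expected
        return True


called = SequenceLedger()
# The legitimate caller "usg" makes its first two calls.
ok_first = called.verify_and_advance("usg", 1)
ok_second = called.verify_and_advance("usg", 2)
# An impersonator replaying an old count (2) is refused.
stale_replay = called.verify_and_advance("usg", 2)
# An impersonator who obtained the current count (3) gets in ...
masquerade = called.verify_and_advance("usg", 3)
# ... but the real "usg", which still presents 3, is then refused,
# so the masquerade is exposed on the next legitimate conversation.
legit_after = called.verify_and_advance("usg", 3)
```

The last two calls show why a successful masquerade does not stay
hidden: the counts held by the two legitimate parties no longer agree.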
3. Processing
The user has two commands which set up communications: uucp
to set up file copying, and uux to set up command execution where
some of the required resources (system and/or files) are not on
the local machine. Each of these commands will put work and data
files into the spool directory for execution by uucp daemons.
Figure 1 shows the major blocks of the file transfer process.
File Copy
The uucico program is used to perform all communications
between the two systems. It performs the following functions:
- Scan the spool directory for work.
- Place a call to a remote system.
- Negotiate a line protocol to be used.
- Start program uucico on the remote system.
- Execute all requests from both systems.
- Log work requests and work completions.
Uucico may be started in several ways:
a) by a system daemon,
b) by one of the uucp or uux programs,
c) by a remote system.
Scan For Work
The file names in the spool directory are constructed to
allow the daemon programs (uucico, uuxqt) to determine the files
they should look at, the remote machines they should call and the
order in which the files for a particular remote machine should
be processed.
Call Remote System
The call is made using information from several files which
reside in the uucp program directory. At the start of the call
process, a lock is set on the system being called so that another
call will not be attempted at the same time.
The system name is found in a ``systems'' file. The informa-
tion contained for each system is:
[1] system name,
[2] times to call the system (days-of-week and times-of-
day),
[3] device or device type to be used for call,
[4] line speed,
[5] phone number,
[6] login information (multiple fields).
The time field is checked against the present time to see if
the call should be made. The phone number may contain abbrevia-
tions (e.g. ``nyc'', ``boston'') which get translated into dial
sequences using a ``dial-codes'' file. This permits the same
``phone number'' to be stored at every site, despite local varia-
tions in telephone services and dialing conventions.
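As a rough illustration of the ``systems'' and ``dial-codes'' lookups,
the sketch below parses one entry and expands an abbreviated phone
number. The whitespace-separated layout, the sample entry, the dial
sequences, and the names SystemEntry, parse_entry and expand_phone are
all assumptions made for the example; only the order of the six fields
is taken from the list above.

```python
import re
from dataclasses import dataclass


@dataclass
class SystemEntry:
    name: str    # [1] system name
    times: str   # [2] times to call
    device: str  # [3] device or device type
    speed: int   # [4] line speed
    phone: str   # [5] phone number, possibly with an abbreviation
    login: list  # [6] login information (multiple fields)


def parse_entry(line):
    """Split one whitespace-separated line (assumed layout) into its fields."""
    f = line.split()
    return SystemEntry(f[0], f[1], f[2], int(f[3]), f[4], f[5:])


def expand_phone(phone, dial_codes):
    """Replace a leading alphabetic abbreviation with the local dial sequence."""
    m = re.match(r"([A-Za-z]+)(.*)", phone)
    if m is None:
        return phone  # no abbreviation present
    prefix, rest = m.groups()
    # An unknown abbreviation is left untouched in this sketch.
    return dial_codes.get(prefix, prefix) + rest


# Hypothetical data: neither the entry nor the dial sequences are from the paper.
entry = parse_entry("usg Any ACU 1200 nyc5551234 login: uucp password: secret")
site_a_codes = {"nyc": "91212", "boston": "91617"}  # site that dials 9 first
site_b_codes = {"nyc": "212", "boston": "617"}

dialed_a = expand_phone(entry.phone, site_a_codes)
dialed_b = expand_phone(entry.phone, site_b_codes)
```

Both sites store the same ``phone number'' (nyc5551234); only the
local ``dial-codes'' table differs.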
A ``devices'' file is scanned using fields [3] and [4] from
the ``systems'' file to find an available device for the connec-
tion. The program will try all devices which satisfy [3] and [4]
until a connection is made, or no more devices can be tried. If a
non-multiplexable device is successfully opened, a lock file is
created so that another copy of uucico will not try to use it. If
the connection is complete, the login information is used to log
into the remote system. Then a command is sent to the remote sys-
tem to start the uucico program. The conversation between the two
uucico programs begins with a handshake started by the called,
SLAVE, system. The SLAVE sends a message to let the MASTER know
it is ready to receive the system identification and conversation
sequence number. The response from the MASTER is verified by the
SLAVE and if acceptable, protocol selection begins.
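The device search and lock-file step at the start of this paragraph
might look roughly like the following. This is a sketch under stated
assumptions: the device table, the line names, and the lock-file
naming are invented for the example, and creating the lock file here
stands in for both opening the device and locking it.

```python
import os
import tempfile


def try_lock(lock_dir, name):
    """Create a lock file atomically; return False if it already exists."""
    path = os.path.join(lock_dir, "LCK." + name)  # hypothetical naming scheme
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def pick_device(devices, want_type, want_speed, lock_dir):
    """Return the first unlocked device matching fields [3] and [4], else None."""
    for dev_type, line, speed in devices:
        if dev_type == want_type and speed == want_speed:
            if try_lock(lock_dir, line):
                return line
    return None


# Hypothetical "devices" table: (type, line name, speed).
devices = [("ACU", "cul0", 300), ("ACU", "cul1", 1200), ("ACU", "cul2", 1200)]

with tempfile.TemporaryDirectory() as lock_dir:
    first = pick_device(devices, "ACU", 1200, lock_dir)   # takes cul1
    second = pick_device(devices, "ACU", 1200, lock_dir)  # cul1 locked, takes cul2
    third = pick_device(devices, "ACU", 1200, lock_dir)   # nothing left to try
```

Exclusive creation (O_CREAT with O_EXCL) is used so that two copies of
the program cannot both believe they hold the same line.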
Line Protocol Selection
The remote system sends a message
Pproto-list
where proto-list is a string of characters, each representing a
line protocol. The calling program checks the proto-list for a
letter corresponding to an available line protocol and returns a
use-protocol message. The use-protocol message is
Ucode
where code is either a one-character protocol letter or an N, which
means there is no common protocol.
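The exchange can be sketched in a few lines. The function names and
the protocol letters used below are illustrative, and choosing the
first shared letter in the remote system's list is an assumption; the
text does not say how the caller picks among several common protocols.

```python
def offer_protocols(supported):
    """Called side: build the P message listing its protocol letters."""
    return "P" + "".join(supported)


def choose_protocol(p_message, available):
    """Calling side: answer with U<letter>, or UN if nothing is shared."""
    if not p_message.startswith("P"):
        raise ValueError("expected a P message")
    for letter in p_message[1:]:
        if letter in available:
            return "U" + letter  # first letter both sides support
    return "UN"                  # no common protocol


offer = offer_protocols(["g", "x"])          # hypothetical protocol letters
reply_shared = choose_protocol(offer, {"x"})
reply_none = choose_protocol(offer, {"t"})
```

Because N is reserved for the no-common-protocol reply, it cannot also
serve as a protocol letter in this scheme.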
Greg Chesson designed and implemented the standard line pro-
tocol used by the uucp transmission program. Other protocols may
be added by individual installations.
Work Processing
During processing, one program is the MASTER and the other
is SLAVE. Initially, the calling program is the MASTER. These
roles may switch one or more times during the conversation.
There are four messages used during the work processing,
each specified by the first character of the message. They are
S send a file,
R receive a file,
C copy complete,
H hangup.
The MASTER will send R or S messages until all work from the
spool directory is complete, at which point an H message will be
sent. The SLAVE will reply with SY, SN, RY, RN, HY, or HN,
corresponding to yes or no for each request.
The send and receive replies are based on permission to
access the requested file/directory. After each file is copied
into the spool directory of the receiving system, a copy-complete
message is sent by the receiver of the file. The message CY will
be sent if the UNIX cp command, used to copy from the spool
directory, is successful. Otherwise, a CN message is sent. The
requests and results are logged on both systems, and, if
requested, mail is sent to the user reporting completion (or the
user can request status information from the log program at any
time).
The hangup response is determined by the SLAVE program by a
work scan of the spool directory. If work for the remote system
exists in the SLAVE's spool directory, an HN message is sent and
the programs switch roles. If no work exists, an HY response is
sent.
A sample conversation is shown in Figure 2.
Conversation Termination
When an HY message is received by the MASTER it is echoed
back to the SLAVE and the protocols are turned off. Each program
sends a final "OO" message to the other.
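The work-processing and termination messages can be traced with a
small simulation. It is a sketch only: the request format (a letter
followed by a file name), the converse function, and the file names
are assumptions, every permitted copy is assumed to succeed (so CN
never appears), and no data is actually transferred.

```python
def converse(caller_work, called_work, denied=()):
    """Return the message transcript for one conversation.

    caller_work / called_work: lists of ("S" or "R", filename) requests.
    denied: filenames the answering side refuses access to.
    """
    queues = {"caller": list(caller_work), "called": list(called_work)}
    master, slave = "caller", "called"  # the calling program starts as MASTER
    log = []
    while True:
        for kind, name in queues[master]:
            log.append((master, kind + " " + name))
            if name in denied:
                log.append((slave, kind + "N"))  # permission refused
                continue
            log.append((slave, kind + "Y"))
            # The receiver of the file reports the result of the final copy.
            receiver = slave if kind == "S" else master
            log.append((receiver, "CY"))
        queues[master] = []
        log.append((master, "H"))
        if queues[slave]:
            log.append((slave, "HN"))  # SLAVE has work: switch roles
            master, slave = slave, master
        else:
            log.append((slave, "HY"))   # neither side has work left
            log.append((master, "HY"))  # MASTER echoes HY back
            log.append((master, "OO"))  # final message from each program
            log.append((slave, "OO"))
            return log


transcript = converse(
    caller_work=[("S", "prog.c"), ("R", "secret")],
    called_work=[("S", "fix.c")],
    denied={"secret"},
)
```

In this run the caller sends one file, is refused another, and is then
answered with HN; the roles switch, the called system sends its own
file, and the conversation ends with the HY and OO exchange.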
4. Present Uses
One application of this software is remote mail. Normally, a
UNIX system user writes ``mail dan'' to send mail to user
``dan''. By writing ``mail usg!dan'' the mail is sent to user
``dan'' on system ``usg''.
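The addressing convention amounts to splitting the recipient at the
exclamation mark. The helper name split_address is hypothetical, and
the sketch handles only the two forms shown above.

```python
def split_address(address):
    """Split 'system!user' into (system, user); a bare name is local."""
    system, sep, user = address.partition("!")
    if not sep:
        return None, address  # no "!": deliver on the local system
    return system, user


local = split_address("dan")
remote = split_address("usg!dan")
```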
The primary uses of our network to date have been in
software maintenance. Relatively few of the bytes passed between
systems are intended for people to read. Instead, new programs
(or new versions of programs) are sent to users, and potential
bugs are returned to authors. Aaron Cohen has implemented a
``stockroom'' which allows remote users to call in and request
software. He keeps a ``stock list'' of available programs, and
new bug fixes and utilities are added regularly. In this way,
users can always obtain the latest version of anything without
bothering the authors of the programs. Although the stock list is
maintained on a particular system, the items in the stockroom may
be warehoused in many places; typically each program is distri-
buted from the home site of its author. Where necessary, uucp
does remote-to-remote copies.
We also routinely retrieve test cases from other systems to
determine whether errors on remote systems are caused by local
misconfigurations or old versions of software, or whether they
are bugs that must be fixed at the home site. This helps identify
errors rapidly. For one set of test programs maintained by us,
over 70% of the bugs reported from remote sites were due to old
software, and were fixed merely by distributing the current ver-
sion.
Another application of the network for software maintenance
is to compare files on two different machines. A very useful
utility on one machine has been Doug McIlroy's ``diff'' program
which compares two text files and indicates the differences, line
by line, between them.[5] Only lines which are not identical are
printed. Similarly, the program ``uudiff'' compares files (or
directories) on two machines. One of these directories may be on
a passive system. The ``uudiff'' program is set up to work simi-
larly to the inter-system mail, but it is slightly more compli-
cated.
To avoid moving large numbers of usually identical files,
uudiff computes file checksums on each side, and only moves files
that are different for detailed comparison. For large files, this
process can be iterated; checksums can be computed for each line,
and only those lines that are different actually moved.
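The checksum strategy can be sketched as follows. The text does not
name the checksum algorithm, so CRC-32 is used here as a stand-in; the
function names and sample files are likewise invented. The per-line
pass compares lines by position, a simplification that would mark
every following line as different after an insertion.

```python
import zlib


def checksums(files):
    """Map each filename to a checksum of its contents (CRC-32 stand-in)."""
    return {name: zlib.crc32(data) for name, data in files.items()}


def files_to_move(local_sums, remote_sums):
    """Names whose checksums differ, or that exist on only one side."""
    names = set(local_sums) | set(remote_sums)
    return sorted(n for n in names if local_sums.get(n) != remote_sums.get(n))


def lines_to_move(local_text, remote_text):
    """Second pass for a large file: indices of lines whose checksums differ."""
    a = [zlib.crc32(line) for line in local_text.splitlines()]
    b = [zlib.crc32(line) for line in remote_text.splitlines()]
    longest = max(len(a), len(b))
    a += [None] * (longest - len(a))  # pad the shorter side
    b += [None] * (longest - len(b))
    return [i for i in range(longest) if a[i] != b[i]]


# Hypothetical file contents on the two machines.
local = {"a.c": b"int x;\n", "b.c": b"main(){}\n", "c.c": b"old\n"}
remote = {"a.c": b"int x;\n", "b.c": b"main(){ }\n"}

# Only the checksum tables cross the line; identical files stay put.
to_move = files_to_move(checksums(local), checksums(remote))
changed_lines = lines_to_move(b"one\ntwo\nthree\n", b"one\n2\nthree\n")
```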
The ``uux'' command has been useful for providing remote
output. There are some machines which do not have hard-copy dev-
ices, but which are connected over 9600 baud communication lines
to machines with printers. The uux command allows the formatting
of the printout on the local machine and printing on the remote
machine using standard UNIX command programs.
5. Performance
Throughput, of course, is primarily dependent on transmis-
sion speed. The table below shows the real throughput of charac-
ters on communication links of different speeds. These numbers
represent actual data transferred; they do not include bytes used
by the line protocol for data validation such as checksums and
messages. At the higher speeds, contention for the processors on
both ends prevents the network from driving the line full speed.
The range of speeds represents the difference between light and
heavy loads on the two systems. If desired, operating system
modifications can be installed that permit full use of even very
fast links.
Nominal speed     Characters/sec.
300 baud          27
1200 baud         100-110
9600 baud         200-850
In addition to the transfer time, there is some overhead for mak-
ing the connection and logging in ranging from 15 seconds to 1
minute. Even at 300 baud, however, a typical 5,000 byte source
program can be transferred in four minutes instead of the 2 days
that might be required to mail a tape.
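The four-minute figure can be checked with a short calculation from
the table. The function name is illustrative, and taking the low end
of each throughput range together with the worst-case one-minute
connection overhead is an assumption chosen to give a conservative
estimate.

```python
def transfer_seconds(size_bytes, chars_per_sec, overhead_sec):
    """Connection overhead plus time on the line at the measured throughput."""
    return overhead_sec + size_bytes / chars_per_sec


# Measured throughput from the table (low end of each range), characters/sec.
measured = {300: 27, 1200: 100, 9600: 200}

# A 5,000-byte source program with the worst-case 1-minute setup overhead.
estimates = {baud: transfer_seconds(5000, cps, 60) for baud, cps in measured.items()}
minutes_at_300 = estimates[300] / 60
```

At 300 baud the estimate is a little over four minutes, of which about
three are spent moving data and up to one on dialing and login.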
Traffic between systems is variable. Between two closely
related systems, we observed 20 files moved and 5 remote commands
executed in a typical day. More typical traffic out of a single
system would be around a dozen files per day.
The total number of sites at present in the main network is
82, which includes most of the Bell Laboratories full-size
machines which run the UNIX operating system. Geographically, the
machines range from Andover, Massachusetts to Denver, Colorado.
Uucp has also been used to set up another network which con-
nects a group of systems in operational sites with the home site.
The two networks touch at one Bell Labs computer.
6. Further Goals
Eventually, we would like to develop a full system of remote
software maintenance. Conventional maintenance (a support group
which mails tapes) has many well-known disadvantages.[6] There
are distribution errors and delays, resulting in old software
running at remote sites and old bugs continually reappearing.
These difficulties are aggravated when there are 100 different
small systems, instead of a few large ones.
The availability of file transfer on a network of compatible
operating systems makes it possible just to send programs
directly to the end user who wants them. This avoids the
bottleneck of negotiation and packaging in the central support
group. The ``stockroom'' serves this function for new utilities
and fixes to old utilities. However, it is still likely that dis-
tributions will not be sent and installed as often as needed.
Users are justifiably suspicious of the ``latest version'' that
has just arrived; all too often it features the ``latest bug.''
What is needed is to address both problems simultaneously:
1. Send distributions whenever programs change.
2. Have sufficient quality control so that users will install
them.
To do this, we recommend systematic regression testing both on
the distributing and receiving systems. Acceptance testing on the
receiving systems can be automated and permits the local system
to ensure that its essential work can continue despite the con-
stant installation of changes sent from elsewhere. The work of
writing the test sequences should be recovered in lower counsel-
ing and distribution costs.
Some slow-speed network services are also being implemented.
We now have inter-system ``mail'' and ``diff,'' plus the many
implied commands represented by ``uux.'' However, we still need
inter-system ``write'' (real-time inter-user communication) and
``who'' (list of people logged in on different systems). A slow-
speed network of this sort may be very useful for speeding up
counseling and education, even if not fast enough for the distri-
buted data base applications that attract many users to networks.
Effective use of remote execution over slow-speed lines, however,
must await the general installation of multiplexable channels so
that long file transfers do not lock out short inquiries.
7. Lessons
The following is a summary of the lessons we learned in
building these programs.
1. By starting your network in a way that requires no hardware
or major operating system changes, you can get going
quickly.
2. Support will follow use. Since the network existed and was
being used, system maintainers were easily persuaded to help
keep it operating, including purchasing additional hardware
to speed traffic.
3. Make the network commands look like local commands. Our
users have a resistance to learning anything new: all the
inter-system commands look very similar to standard UNIX
system commands so that little training cost is involved.
4. An initial error was not coordinating enough with existing
communications projects: thus, the first version of this
network was restricted to dial-up, since it did not support
the various hardware links between systems. This has been
fixed in the current system.
Acknowledgements
We thank G. L. Chesson for his design and implementation of
the packet driver and protocol, and A. S. Cohen, J. Lions, and P.
F. Long for their suggestions and assistance.
References
1. D. M. Ritchie and K. Thompson, "The UNIX Time-Sharing Sys-
tem," Bell Sys. Tech. J., vol. 57, no. 6, pp. 1905-1929,
1978.
2. G. L. Chesson, "The Network UNIX System," Operating Systems
Review, vol. 9, no. 5, pp. 60-66, 1975. Also in Proc. 5th
Symp. on Operating Systems Principles.
3. A. G. Fraser, "Spider - An Experimental Data Communications
System," Proc. IEEE Conf. on Communications, p. 21F, June
1974. IEEE Cat. No. 74CH0859-9-CSCB.
4. A. G. Fraser, "A Virtual Channel Network," Datamation, pp.
51-56, February 1975.
5. J. W. Hunt and M. D. McIlroy, "An Algorithm for Differential
File Comparison," Comp. Sci. Tech. Rep. No. 41, Bell Labora-
tories, Murray Hill, New Jersey, June 1976.
6. F. P. Brooks, Jr., The Mythical Man-Month, Addison-Wesley,
Reading, Mass., 1975.