USING THE DR11-W DMA DEVICE
FOR INTERPROCESSOR COMMUNICATIONS IN RT-11.
Mark Pyatetsky, Peter Heinicke,
David Ritchie, Vicky White
Fermi National Accelerator Laboratory
Batavia, Illinois
ABSTRACT
________
At Fermilab, DR11-W's have been used as high speed
data links to interconnect PDP-11's (under both
RT-11 and RSX-11M) in data acquisition
applications. Using this hardware, several
processors can be interconnected to provide
distributed data collection, data monitoring and
control for High Energy Physics experiments.
This paper discusses the implementation of the
DR11-W link under RT-11. It describes the RT-11
handler and discusses some problems (and
solutions) associated with its design, among
them: implementation of the internal queues in
the handler, serialization of the completion
routines, time-out support, multiple DR11-W
devices, the interface with the application
programs, and error reporting.
INTRODUCTION
The Data Acquisition (DA) systems used at
Fermilab are designed to process large
volumes of data in very short periods of
time. These DA systems run on various
PDP-11 and VAX configurations under either
RT-11, RSX-11M or VMS. Interconnecting the
DA systems via high-speed data links
increases event data rates, provides extra
memory and processing time and other
benefits. These data links must allow, of
course, communications among the DA systems
running under different operating systems.
At Fermilab, these data links are
implemented by using DR11-W DMA devices.
In this paper we will discuss the
implementation of the DR11-W link under
RT-11. We will show how an RT-11
application program can communicate with an
RT-11 or RSX-11M program running in a
different PDP-11 processor, using a DR11-W
interprocessor link.
We will first briefly describe the levels
of protocols for such a communication.
Next, we will discuss why we implemented
our link communication driver as an RT-11
device handler. We will also talk about
some additional features not generally
provided by RT-11 handlers. Implementation
of internal queues in the handler,
serialization of the completion routines,
time-out support and multiple DR11-W
devices will be covered here. Finally, we
will describe the interface with the
application programs, error reporting, and
the FORTRAN interface routines. Our (so far
limited) experience in using the handler
will also be discussed, together with some
performance data.
THE INTERPROCESSOR LINK
LAYERED ARCHITECTURE.
We will now briefly describe the layers of
interfaces and protocols necessary to
implement an interprocessor link
architecture. A more detailed discussion
is presented in the associated paper (1).
On this project, we implemented a
point-to-point interprocessor communication
link. Figure 1 depicts some possible link
configurations.
An RT-11 application program running in one
processor "talks" to an RT-11 or RSX-11M
application program running in another
processor within a framework of a
three-layered architecture (Figure 2).
Layer 1 implements the physical link (hardware)
protocol. It is, in essence, two DMA
controllers (DR11-W) connected end-to-end
over a parallel link, interfacing each
processor's UNIBUS. Layer 2 provides data
link control. It is a communication driver
which is implemented as a device handler
under RT-11 (it is a device driver under
RSX-11M; see the associated paper (2)).
Two communication drivers (one per
processor) interact with each other
(horizontally) in accordance with the data
transmission protocol, developed at
Fermilab (1). Hardware interrupts and
DR11-W registers provide for a (vertical)
interface between layer 2 and layer 1.
The application programs constitute layer 3
in this hierarchy. Two application
programs running in different processors
interact with each other in accordance with
their own logical protocol which may be
different for different types of
applications. RT-MULTI is one example of
an application program (3). A file
transfer program is another example. The
application programs can be written in
MACRO-11 or in FORTRAN.
The application programs interface
(vertically) with the layer 2 communication
driver via a software driver interface.
Different interfaces are provided under
RT-11 and under RSX-11M. The RT-11
software driver interface is discussed in
more detail later in this paper. We
have also developed a set of
FORTRAN-callable subroutines, CDPACK, which
converts the RT-11 or RSX-11M software
driver interface into a standard
application program interface. CDPACK
allows the same application program,
written in FORTRAN, to run under either
RT-11 or RSX-11M. CDPACK is described in
more detail in (1).
THE COMMUNICATION DRIVER (CD:)
The communication driver in the RT-11
environment is written as an RT-11 device
handler. This approach has several
advantages. It allows standard I/O to the
link (writes, reads, special functions)
from the application program. It also
provides all application programs with a
single standard interface to the DR11-W,
relieves them from having to implement
the data transmission protocol (layer 2),
and will hopefully insulate them from
possible future changes in the DR11-W (by
DEC) and in the data transmission protocol.
In addition, a mechanism of completion
routines can be used with the device
handler. The completion routines are very
helpful in preventing possible
communication deadlocks, or in performing
several I/O's concurrently (e.g. to a disk
and the communication link). Finally, the
device handler approach adds a greater
degree of flexibility: being written in
PIC (Position Independent Code), the handler
can be moved around in memory very easily;
logical unit numbers can be assigned to
physical links; the "set" function allows
various handler features to be set or reset
from the console or the application program;
and I/O requests are queued and can be
executed concurrently with the application
program.
It turned out, however, that our
communication driver needed some additional
features not generally provided by RT-11.
Before we get to the implementation of the
internal queues, serialization of the
completion routines, time-out support and
multiple DR11-W devices, let us first
describe the functionality of our
interprocessor communication driver.
HOW I/O REQUESTS ARE PROCESSED
BY THE DRIVER.
There are several types of I/O requests
processed by the driver. As a typical
example, we will consider here the two most
frequently used: sending and receiving
messages of several 16-bit words each
('several' means from 1 to 32000 words).
Each message sent over the link has a
destination address which we call PTC
(Packet Type Code). The DR11-W device has
two modes of transfer: single word and
DMA. In accordance with our transmission
protocol, the driver uses single word mode
to tell the other driver the word count and
the PTC of a message that it wants to send.
Then both drivers simultaneously set up a
DMA transfer of the message over the link.
It should be noted that the driver does not
have a buffer space for the messages it
sends or receives - the driver sets up the
DMA transfer directly from or into the
buffer space provided by the application
program in its write or read request.
Therefore, the driver will not accept
unsolicited messages. The application
program must have issued a READ request,
providing the driver with the buffer area
in which to put the message.
Several read and/or write requests can be
outstanding at any time. The driver
processes all READ requests randomly, i.e.
as soon as the message with the requested
PTC is received on the link. All WRITE
requests are processed sequentially, i.e.
in the order they are posted by the
application program. This type of processing
of the I/O requests is drastically different
from that commonly used by other RT-11
handlers, such as a disk handler or a magtape
handler. A
disk handler, for example, processes all
READ/WRITE requests sequentially, i.e. in
the order they are received. Therefore,
our communication driver maintains
internally two separate queues for the
outstanding READs and WRITEs. It turns out
that our driver has to maintain yet another
internal queue, which we call the EXIT queue.
Later in this paper we will show why this
queue is necessary and how a queue element
travels from one queue to another.
Let us now switch our attention to another
quite common communication problem - a
message timeout. In the course of an exchange
of messages between the processors, either
processor may hang or be stopped, or the
link hardware may break down. Therefore, each
step (transaction as we call it) of the
message exchange is timed out. That is,
most transactions sent over the link must
be acknowledged by the other side within a
certain period of time (timeout window).
If no acknowledgement is received during
this time, a timeout mechanism triggers
execution of the timeout routine. The
length of the timeout and the timeout
routine may be different for different
transactions. The timeout problem is
compounded by the fact that for multiple
DR11-W links, several timeouts may be
outstanding at any given time. The
implementation of the timeout mechanism is
discussed later in this paper.
INTERNAL QUEUES.
Each RT-11 device handler is normally
assigned one I/O queue, which is
administered by the RT-11 Monitor.
Whenever the application program issues an
I/O request (e.g. in MACRO-11: .READ,
.WRITE, .SPFUN), the RT-11 Monitor picks up
a queue element from the pool of idle queue
elements, fills it out with the parameters,
supplied in the I/O request, and places it
on the device handler I/O queue.
If the device handler I/O queue was
previously empty, the Monitor calls the
handler so that the handler can start
processing of the I/O request. When the
handler finishes processing of the I/O
request, it returns to the Monitor by
executing the .DRFIN macro. The Monitor
reclaims the queue element from the
handler's I/O queue and either assigns it
immediately to the pool of idle queue
elements, or first arranges for the
execution of the completion routine and
assigns the queue element to the idle pool
afterwards. (We say "arranges", because
this step is done differently in SJ and FB
or XM Monitors). If, at this time, the
handler I/O queue is not empty, the Monitor
calls the handler to start processing of
the next I/O request.
This works just fine when the I/O requests
are to be processed sequentially, in the
order they are received by the Monitor. As
we already explained, we needed a different
order of the I/O processing. Therefore, we
implemented internal queues in our handler.
The handler has three internal queues.
They are shown in Figure 4. Here, the
queue marked 'CD' is the standard handler
I/O queue, normally administered only by
the Monitor. This queue is always kept
empty by our handler. Hence, whenever the
Monitor receives a 'write' or 'read'
request, it queues it up in the handler
'CD' queue and always calls the handler.
The handler immediately transfers the queue
element from the 'CD' queue to either
'writes' or 'reads' queue. When the I/O
request completes, its queue element must
be removed from the 'writes' or 'reads'
queue and placed back on the 'CD' queue, so
that it can be eventually reclaimed by the
Monitor.
Well, the removal is easy. However, the
immediate placement on the 'CD' queue may
create a problem. The I/O request may
complete on an interrupt level at priority
higher than zero. It may have interrupted
when a new I/O request was just placed on
the 'CD' queue. Executing .DRFIN or its
equivalent at this moment may cause a mix-up
of the queue elements with unpredictable
results. This is why the completed I/O
requests are first transferred into the
'EXIT' queue. They are transferred to the
'CD' queue one at a time, whenever the 'CD'
queue is empty.
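The life cycle of a request described above
can be modelled in a few lines. This is only
an illustrative sketch in Python, not the
handler code: the class and method names are
hypothetical, and the hand-back to the Monitor
is reduced to appending to a list.

```python
from collections import deque


class HandlerQueues:
    """Toy model of the 'CD', 'reads', 'writes' and 'EXIT' queues."""

    def __init__(self):
        self.cd = deque()                      # queue administered by the Monitor
        self.internal = {"read": deque(), "write": deque()}
        self.exit = deque()                    # completed, not yet handed back
        self.returned = []                     # requests reclaimed by the Monitor

    def monitor_enqueue(self, name, kind):
        """The Monitor places a new request on the 'CD' queue."""
        self.cd.append((name, kind))

    def handler_entry(self):
        """Handler entry: empty 'CD' into 'reads'/'writes', then drain 'EXIT'."""
        while self.cd:
            request = self.cd.popleft()
            self.internal[request[1]].append(request)
        self.drain_exit()

    def complete(self, request):
        """I/O done (possibly at interrupt level): park the element on 'EXIT'."""
        self.internal[request[1]].remove(request)
        self.exit.append(request)
        self.drain_exit()

    def drain_exit(self):
        """Move elements to 'CD' one at a time, only while 'CD' is empty."""
        while self.exit and not self.cd:
            self.cd.append(self.exit.popleft())
            self.returned.append(self.cd.popleft())  # stand-in for .DRFIN


q = HandlerQueues()
q.monitor_enqueue("A", "read")
q.handler_entry()
q.monitor_enqueue("B", "write")        # 'CD' now holds a new request
q.complete(("A", "read"))              # completion arrives at this moment
deferred = list(q.exit)                # A has to wait on 'EXIT'
q.handler_entry()                      # B is parked, then A is handed back
```

In this model the completion of A that arrives
while B sits on the 'CD' queue is deferred, and
A is returned only after 'CD' has been emptied.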
When a queue element is transferred from one
queue to another, say from queue A to queue
B, it is always removed from the beginning
of queue A and linked at the end of queue
B. The handler processes 'write' requests
sequentially, always working with the
request at the beginning of the 'writes' queue.
The 'read' requests are processed randomly.
As soon as the message with the requested
PTC is received on the communication link,
the handler searches the 'reads' queue.
When it finds a queue element with the
requested PTC, the handler moves this
element to a position at the beginning of
the 'reads' queue and works with it until
this READ completes.
Implementation.
______________
The data structures for the 'writes',
'reads' and 'EXIT' queues are shown in
Figure 5. Note that each internal queue is
represented by a queue count, showing the
number of queue elements in the queue; the
current queue element pointer, pointing to
the beginning of the queue; and the last
queue element pointer, pointing to the end
of the queue.
The movement of the queue elements from one
queue to another is done by macro 'QUEXFR'
shown in Fig. 6. The calls to this macro,
which perform moves 1,2,3,4 and 5 in Fig.
4, are shown in Fig. 7. Figure 8 shows
macro 'CHKPTC', which searches a queue for
an element with the requested PTC and moves
that element to the beginning of the queue.
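As a rough illustration of the queue
representation (count, current pointer, last
pointer) and of the two operations just
described, here is a small Python model. It is
a sketch under our own assumptions, not a
transcription of the MACRO-11 macros: the
functions quexfr and chkptc only mimic the
described behaviour of 'QUEXFR' and 'CHKPTC',
and the element layout is hypothetical.

```python
class QueueElement:
    def __init__(self, ptc):
        self.ptc = ptc      # Packet Type Code of the request
        self.link = None    # next element, or None at the end of the queue


class InternalQueue:
    def __init__(self):
        self.count = 0      # number of elements in the queue
        self.current = None # element at the beginning of the queue
        self.last = None    # element at the end of the queue

    def append(self, element):
        element.link = None
        if self.last is None:
            self.current = element
        else:
            self.last.link = element
        self.last = element
        self.count += 1

    def remove_first(self):
        element = self.current
        if element is None:
            return None
        self.current = element.link
        if self.current is None:
            self.last = None
        element.link = None
        self.count -= 1
        return element

    def ptcs(self):
        out, element = [], self.current
        while element is not None:
            out.append(element.ptc)
            element = element.link
        return out


def quexfr(src, dst):
    """Unlink the first element of src and link it at the end of dst."""
    element = src.remove_first()
    if element is not None:
        dst.append(element)
    return element


def chkptc(queue, ptc):
    """Move the first element with this PTC to the beginning of the queue."""
    prev, element = None, queue.current
    while element is not None and element.ptc != ptc:
        prev, element = element, element.link
    if element is None:
        return False                    # no outstanding request for this PTC
    if prev is not None:                # not already at the beginning
        prev.link = element.link
        if queue.last is element:
            queue.last = prev
        element.link = queue.current
        queue.current = element
    return True


reads, exit_queue = InternalQueue(), InternalQueue()
for code in (3, 7, 5):
    reads.append(QueueElement(code))
found = chkptc(reads, 5)                # message with PTC 5 arrives
order_after_search = reads.ptcs()
quexfr(reads, exit_queue)               # the READ completes
```

Moving the matching element to the front means
that the rest of the code can always work with
the element at the beginning of a queue, for
reads as well as for writes.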
SERIALIZATION OF THE
COMPLETION ROUTINES.
The RT-11 SJ (Single Job) Monitor schedules
completion routines differently from the FB
or XM Monitors. Whereas the FB and XM
Monitors execute the completion routines
serially, in the order in which they are
released by the device handler, the SJ
Monitor does not provide this feature. It
may, in fact, interrupt one (running)
completion routine to start another. This
imposes an unwanted restriction: the
completion routines must be re-entrant. That
is a problem if the completion routines are
to be written in FORTRAN (since FORTRAN-IV
code is not re-entrant), or when new I/O
requests are to be posted by the completion
routine (.READ, .WRITE and .SPFUN are not
re-entrant).
Implementation.
______________
We already mentioned ".DRFIN or its
equivalent" when we talked about the 'EXIT'
queue. We will now describe it in more
detail.
.DRFIN is the macro a standard RT-11
handler executes to pass a queue element
with the completed I/O to the Monitor. By
executing .DRFIN, the handler also exits
(to the Monitor). Our communication
handler executes a .DRFIN substitute which
we call 'JSRFIN' (see Fig. 9). This
.DRFIN substitute is described in Chapter
7.4 of the RT-11 Software Support Manual.
The 'JSRFIN' macro returns the queue
element to the Monitor without exiting.
That is, after the Monitor is called by the
'JSRFIN', it reclaims the queue element
and, if necessary, calls the completion
routine. After the completion routine
finishes, the Monitor returns to the
'JSRFIN'. The serialization is done (see
Fig. 10) by setting a flag 'D$FFIN' before
the 'JSRFIN' and resetting it after the
'JSRFIN'.
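The effect of such a guard flag can be sketched
as follows. This is a Python model, not the
handler code, and it rests on an assumption the
text does not spell out: that while the flag is
set, a newly completed request is simply left
on the 'EXIT' queue and is picked up after the
running completion routine has returned. The
names release, jsrfin and finishing are
hypothetical.

```python
from collections import deque

exit_queue = deque()
finishing = False          # plays the role of the 'D$FFIN' flag
log = []


def jsrfin(request):
    """Stand-in for 'JSRFIN': run the completion routine, then return."""
    name, completion = request
    log.append("start " + name)
    completion()
    log.append("end " + name)


def release(request):
    """Hand a completed request back, one completion routine at a time."""
    global finishing
    exit_queue.append(request)
    if finishing:
        return             # a completion routine is active: just queue it
    finishing = True
    try:
        while exit_queue:
            jsrfin(exit_queue.popleft())
    finally:
        finishing = False


def completion_of_a():
    # Simulate request B completing while A's completion routine runs.
    release(("B", lambda: None))


release(("A", completion_of_a))
```

Without the flag, the nested call would start
B's completion routine before A's had finished;
with it, the two run strictly one after the
other.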
THE TIMEOUT SUPPORT
RT-11 supports time-out in its device
handlers with two macros: .TIMIO and
.CTIMIO. These macros and their use are
described in Chapter 7.6 of the RT-11
Software Support Manual. We will repeat
some of this information here, and we will
also discuss some problems with the
implementation of the time-out support in
our handler. Finally, we will show our
implementation of the time-out support.
The .TIMIO and .CTIMIO macros can only be
used by device handlers. The handler
requests a timeout with the .TIMIO macro
and cancels its previous timeout request
with the .CTIMIO macro. For each timeout
request, the handler must allocate a 7-word
block in memory which cannot be re-used
until either the timeout expires and the
Monitor executes the handler's timeout
completion routine, or the timeout is
cancelled by the handler. This timer block
contains, among other data, the time
interval in number of clock ticks (one tick
is 1/60 of a sec), the timer block number
(from 177400 to 177477 octal) and the
address of the completion routine.
The .TIMIO and .CTIMIO requests can only be
made at priority 0 and so must be preceded
by a .FORK macro call if made at an
interrupt level. If the device has already
timed out by the time the handler issues a
.CTIMIO request, the .CTIMIO call returns
with the carry bit set (.CTIMIO fail
condition). This means that either the
completion routine has already executed or
is about to execute (and it is too late for
the handler to stop it).
Our communication handler must issue most
of its timeout requests at an interrupt
level. The interrupts may come in as often
as once every 1/1000 of a sec. Almost on
each interrupt, the handler must cancel the
old timeout request and issue a new one.
The time interval (to be set) varies with
each interrupt from 2 to 60 ticks, but most
of the time it is 2-4 ticks. On each
interrupt or timeout, the handler sends a
transaction over the DR11-W link, and the
content of the transaction is defined by
our link transmission protocol (see Fig.
2). This content depends on the handler's
internal state and is different for
different interrupts and timeouts. Issuing
.CTIMIO/.TIMIO on each interrupt occurrence
would involve forking, and forking means
delays. These delays may be comparable
with our transmission delays. Besides,
forking would require special care at
fork level: code re-entrancy or disabling
of device interrupts.
Implementation.
______________
We use the concept of a free-running timer.
The timer is not associated with any
particular transaction's interrupt or
DR11-W unit. It is just one timer (and
timer block) for all occasions. When the
message exchange on the link starts, the
timer is started; when the message exchange
ceases, the timer eventually stops. The
.TIMIO macro is used to start or re-start
the timer; .CTIMIO is not used at all.
A non-zero completion routine address in
the timer block means that the timer is
running; otherwise the timer is not
running.
Each DR11-W unit is assigned a timeout
counter in the handler (Fig. 11). Every
time the Monitor fires up the timer's
completion routine, all non-zero timeout
counters are decremented by the routine
(see Fig. 12). Next, the routine
re-starts the timer (with .TIMIO) if there
was at least one non-zero counter prior to
decrementing. Finally, the routine
executes the timeout actions for all units
whose counters reached zero (after
decrementing).
Whenever the handler needs to start (or
re-start) a timeout for a particular DR11-W
unit, at interrupt level or elsewhere, it
just sets the unit's timeout counter to the
required number of ticks. This is fast and
convenient, and involves no delays. Fig. 13
shows macro 'ENTIME', which does that; Fig.
14 shows an example of how this macro is
used.
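The free-running timer scheme can be modelled
compactly. The sketch below is in Python and
is illustrative only: the class and its
methods are hypothetical names, the 'running'
attribute stands in for the non-zero completion
routine address in the timer block, and we
assume that the timer is re-started for one
clock tick at a time.

```python
class FreeRunningTimer:
    """One shared timer driving a timeout counter per DR11-W unit."""

    def __init__(self, units):
        self.counters = [0] * units   # remaining ticks per unit; 0 = idle
        self.running = False          # True while a .TIMIO request is pending

    def start(self):
        """Issue the first .TIMIO when a message exchange begins."""
        self.running = True

    def entime(self, unit, ticks):
        """Arm or re-arm a unit's timeout: a single store, no Monitor call."""
        self.counters[unit] = ticks

    def tick(self):
        """Timer completion routine; returns the units that timed out."""
        self.running = False          # the pending timer request has fired
        active = [u for u, c in enumerate(self.counters) if c > 0]
        for unit in active:
            self.counters[unit] -= 1
        if active:
            self.running = True       # re-start the timer with .TIMIO
        return [u for u in active if self.counters[u] == 0]


timer = FreeRunningTimer(units=2)
timer.start()
timer.entime(0, 2)
timer.entime(1, 3)
history = [timer.tick() for _ in range(4)]
```

Here unit 0 times out on the second tick and
unit 1 on the third; on the fourth tick no
counter is active, so the timer is not
re-started. Re-arming a unit before its counter
reaches zero simply overwrites the counter,
which is what replaces the .CTIMIO/.TIMIO pair.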
MULTIPLE DR11-W DEVICES
WITH SOME COMMENTS ON PIC.
PIC of course stands for Position
Independent Code. In this section we will
discuss some problems of supporting several
units with the same handler, what is
available in RT-11 and the additional
efforts necessary to do the job.
Basically, multiple units (in our case -
multiple DR11-W's) as compared to a single
unit, may impose the following requirements
on the device handler:
(1) separate interrupt vectors, say, one
per unit
(2) separate sets of registers (CSR, I/O
data registers, etc.), one per
unit
The first feature is fully supported by the
RT-11 macro .DRVTB and is described in
Chapter 7.2.2.4 of the RT-11 Software
Support Manual. We will not discuss it any
further. The second feature is not
supported by RT-11 (there are no device
block data structures on a per-unit basis).
In our handler we have data structures for
the CSR's, output data registers, timeout
counters (see Fig. 11), etc. The problem
is with accessing these data structures,
since the handler code is written in PIC.
One way of doing it is PC-relative, which
is explained in Appendix G of the Macro-11
Language Reference Manual (see also Fig.
12, "PC-relative"). This usually involves
2-4 instructions per access, which for
large handlers may add up to considerable
overhead. In our handler we do it
base-relative. We first establish a
PC-relative base at the beginning of the
handler (see Fig. 3). Then, say, at the
interrupt or timeout level, after the unit
number of the device is established, say,
in register R1, we add 'base' to it (see
Fig. 12, "1"). After that, the data
structures can be accessed with a single
instruction (see Fig. 12, "2") using R1 as
an index register.
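The idea of base-relative access can be
illustrated with a toy memory model in Python.
This is only an analogy for the addressing
scheme, not the handler's code: memory is a
list of words, the table offsets and CSR values
are made up for the example, and the function
names are hypothetical.

```python
UNITS = 4
CSR_TABLE = 0                       # word offset of the per-unit CSR table
TIMEOUT_TABLE = CSR_TABLE + UNITS   # word offset of the timeout counters


def load_handler(memory, load_address, csr_addresses):
    """Place the per-unit tables wherever the handler happens to be loaded."""
    for unit, csr in enumerate(csr_addresses):
        memory[load_address + CSR_TABLE + unit] = csr
        memory[load_address + TIMEOUT_TABLE + unit] = 0
    return load_address             # the 'base', established once


def unit_index(base, unit):
    """The one-time step: add 'base' to the unit number held in R1."""
    return base + unit


def csr_of(memory, r1):
    return memory[r1 + CSR_TABLE]   # one indexed access


def set_timeout(memory, r1, ticks):
    memory[r1 + TIMEOUT_TABLE] = ticks   # one indexed access


csrs = [0o164000, 0o164010, 0o164020, 0o164030]   # placeholder values
results = []
for load_address in (100, 300):     # same code, two load addresses
    memory = [0] * 512
    base = load_handler(memory, load_address, csrs)
    r1 = unit_index(base, 2)
    set_timeout(memory, r1, 4)
    results.append((csr_of(memory, r1),
                    memory[load_address + TIMEOUT_TABLE + 2]))
```

Once the index has been formed, every per-unit
table is reached with a fixed offset from it,
regardless of where the handler was loaded.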
THE HANDLER - APPLICATION PROGRAM
INTERFACE.
Our communication handler (or driver as we
also called it) makes the DR11-W link
appear to the application program as a
non-file structured device, similar to a
magtape filled with card images. As we
already mentioned, the driver writes into
and reads from the application program
buffer area directly, i.e. the driver does
not maintain its own buffers for sending
and receiving the DMA messages over the
DR11-W link. The following RT-11 standard
programming requests are available to the
application programs for interfacing with
the driver (requests marked ** below do not
interface directly with the driver):
** .FETCH (fetch the driver into the
memory)
** .LOOKUP (open logical channel)
* .READC/.READ/.READW (reads)
* .WRITC/.WRITE/.WRITW (writes)
* .SPFUN (special function requests)
** .WAIT (wait for request to complete)
** .CLOSE/.PURGE/.SRESET/.HRESET,etc.
(close logical channel)
** .QSET (allocate additional queue
elements)
Care should be taken in using .READW or
.WAIT after a .READ request, since the
application program could wait indefinitely
if no message of the requested PTC arrives.
The use of .READC and .WRITC requests is
preferred, in order to speed things up (no
waits!) and to avoid confusion or possible
deadlocks. When using .READC or .WRITC,
the application program specifies a
completion routine, which can perform error
handling, transmission restart, etc.
There are several types of .SPFUN requests
available to the application programs. The
.SPFUN 'kill' stops the driver cleanly,
that is, it returns all queue elements to
the Monitor (with an appropriate error
message for the application program),
disables the DR11-W interrupt and puts the
driver into its initial state. This .SPFUN
must be called before closing the logical
channel to prevent a system crash. The other
.SPFUN requests provide some specific
communication functions (see (1)). They
will not be covered in this paper.
Message Block Area.
_______ _____ ____
It is well known that RT-11 is not very
generous in providing its device handlers
with the means of reporting errors and the
status of the I/O requests to the
application programs. Therefore, when
issuing I/O requests (read, write, spfun) to
our communication driver, the application
programs supply an additional 4-word
Message Block Area (MBA for short) with each
I/O request. Each outstanding I/O request
must have its own MBA, which cannot be
re-used before the I/O request completes.
The address of the MBA is placed in the
'block' field of the I/O request. For
example, the .READC call is of the form
(see also the RT-11 Programmers Manual):
.READC area,chan,buf,wcnt,crtn,blk
where blk = address of MBA
If the I/O request provides an invalid MBA
address, the driver returns a 'hard' error
(in the channel status word), otherwise the
driver uses the MBA for error and status
reporting. The MBA has the following
layout:
word 1: PTC of the request (1-255)
word 2: Message block number (1-32)
word 3: Error status returned by the
driver
word 4: Word count, actually
sent/received by the driver
The Message Block Number is used by the
application program to distinguish between
its various read/write requests, on
completion. It is returned to the
application program in bits 8 to 12 of the
channel status word upon completion of the
read/write (this status word is available
to the completion routine). The
application program can write zero into
word 3 of the MBA before placing its I/O
request. Since the status (error or
success) returned by the driver on
completion of the request is never
zero, the application program can
periodically check this word to see whether
or not the I/O has completed.
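The MBA layout and the polling convention can
be sketched as follows. This is a Python model
for illustration only: the function names are
hypothetical, the status value 1 used below is
a placeholder (the actual codes are defined by
the driver), and bit 0 is taken to be the least
significant bit of the channel status word.

```python
import struct

MBA_FORMAT = "<4H"    # four 16-bit words, low byte first as on the PDP-11


def make_mba(ptc, block_number):
    """Build an MBA with words 3 and 4 cleared before the request is issued."""
    if not 1 <= ptc <= 255:
        raise ValueError("PTC must be in the range 1-255")
    if not 1 <= block_number <= 32:
        raise ValueError("message block number must be in the range 1-32")
    return bytearray(struct.pack(MBA_FORMAT, ptc, block_number, 0, 0))


def read_mba(mba):
    ptc, block_number, status, word_count = struct.unpack(MBA_FORMAT, mba)
    return {"ptc": ptc, "block": block_number,
            "status": status, "words": word_count}


def is_complete(mba):
    """Word 3 stays zero until the driver stores a (non-zero) status."""
    return read_mba(mba)["status"] != 0


def block_number_field(csw):
    """Extract bits 8 to 12 of the channel status word."""
    return (csw >> 8) & 0o37


mba = make_mba(ptc=17, block_number=5)
before = is_complete(mba)
# Simulate the driver finishing the request: status 1, 128 words moved.
struct.pack_into(MBA_FORMAT, mba, 0, 17, 5, 1, 128)
after = is_complete(mba)
```

A completion routine would use the field from
the channel status word to find the MBA of the
request that has just finished, and then read
words 3 and 4 from it.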
USING THE HANDLER,
SOME PERFORMANCE DATA.
So far we have had very limited experience
in using the handler (primarily in a test
environment). We have successfully
performed several tests. In particular, we
have had two processors, either both
running RT-11, or one running RT-11 and the
other running RSX-11M (or both running
RSX-11M), talking to each other. They
successfully exchanged tens of thousands of
messages of various lengths (from 1 word to
4000 words per message). In each case, the
data communications were very stable and
reliable.
According to its specs, the DR11-W has a
burst rate of 500000 words/sec (16-bit
words). We have observed a DMA rate of
about 330000 words/sec. The RT-11 driver
has a delay of about 7 msec per DMA
message transfer across the link. Our
plans call for using the handler in a
limited production environment by the end
of this year.
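To see what these two figures imply for
messages of different sizes, one can combine
them in a simple estimate. The Python sketch
below assumes that the 7 msec driver delay is a
fixed cost that simply adds to the DMA time of
each message; that additive model is our
assumption, not a measured result.

```python
DMA_RATE = 330_000      # observed DMA rate, words/sec
DRIVER_DELAY = 0.007    # driver delay per DMA message, sec


def transfer_time(words):
    """Estimated time to move one message of the given length."""
    return DRIVER_DELAY + words / DMA_RATE


def effective_rate(words):
    """Estimated throughput in words/sec for messages of this length."""
    return words / transfer_time(words)


# Message length at which the DMA time equals the fixed driver delay.
BREAK_EVEN_WORDS = DRIVER_DELAY * DMA_RATE

rate_4000 = effective_rate(4000)
```

Under this model a 4000-word message takes
about 19 msec, for an effective rate of roughly
209000 words/sec, and the fixed delay dominates
for messages shorter than about 2300 words.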
CONCLUSIONS.
In this paper, we have presented some
results of our interprocessor communications
project. We have found, among other
things, that RT-11 is capable of supporting
a high-speed interprocessor link. We have
developed a quite elaborate communication
protocol (discussed in detail in (1)),
which takes full advantage of the
characteristics of the DR11-W, and
implemented this protocol in our RT-11
device handler.
We hope that the problems and solutions we
have discussed in this paper will be
helpful to all those who design
non-standard RT-11 device handlers,
particularly in the areas of internal
queues, timeout support, multiple devices
and error/status reporting to the
application program.
REFERENCES
1. J. Biel, D. Burch, R. Dosen, P. Heinicke,
M. Pyatetsky, D. Ritchie, V. White, 1982
"High Speed Interprocessor Data Links
Using The DR11-W", 1982 Fall DECUS U.S.
Symposium, Anaheim, Ca
2. D. Burch, V. White, R. Dosen, 1982 "An
RSX-11M Device Driver Implementing A
Network Protocol For The DR11-W", 1982
Fall DECUS U.S. Symposium, Anaheim, Ca
3. J. Bartlett, J. Biel, D. Curtis, R. Dosen,
T. Lagerlund, D. Ritchie, L. Taff, 1979
"RT/RSX MULTI: Packages For Data
Acquisition And Analysis In High Energy
Physics", IEEE Transactions On Nuclear
Science, Vol. NS-26, No. 4, August 1979