Combine messages

As it turns out, this concept of combine messages isn't useful just for saving
bandwidth (as in the chown() case, above).
It's also critical for ensuring atomic completion of operations.

Suppose the client process has two or more threads and one file descriptor.
One of the threads in the client does an lseek() followed by a
read().
Everything works as we expect.
If another thread in the client performs the same sequence of operations on the same
file descriptor, we can run into problems.
Since the lseek() and read() functions don't know about each other,
it's possible that the first thread would do the lseek(), and then
get preempted by the second thread.
The second thread gets to do its lseek(), and then its read(),
before giving up the CPU.
The problem is that since the two threads are sharing the same file descriptor,
the file position that the first thread set with its lseek() has been lost: the offset is now at
the position left by the second thread's read() function!
The same problem affects file descriptors that are dup()'d across processes,
let alone across the network.
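
To make the failure concrete, here's a minimal sketch that replays the interleaving described above in a single thread, so that the outcome is deterministic.
The helper name replay_interleaving(), the scratch-file template under /tmp, and the file contents are all assumptions made for illustration; only the standard POSIX calls are real.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: replays the problematic interleaving on one shared
 * descriptor. On success it stores the 4 bytes that "thread 1" ends up
 * reading (plus a terminating NUL) in got and returns 0; it returns -1 on
 * any I/O error. The caller must supply at least 5 bytes.
 */
int replay_interleaving(char *got)
{
    char path[] = "/tmp/combine_demo_XXXXXX";  /* assumed writable location */
    char other[4];
    int rc = -1;
    int fd;

    assert(got != NULL);
    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                              /* file disappears on close */

    if (write(fd, "AAAABBBBCCCC", 12) == 12 &&
        lseek(fd, 0, SEEK_SET) == 0 &&         /* thread 1: wants "AAAA"   */
        lseek(fd, 4, SEEK_SET) == 4 &&         /* thread 2 preempts...     */
        read(fd, other, 4) == 4 &&             /* ...and reads "BBBB"      */
        read(fd, got, 4) == 4) {               /* thread 1 resumes         */
        got[4] = '\0';
        rc = 0;
    }
    close(fd);
    return rc;
}
```

Thread 1 asked for the data at offset 0, but its read() starts at offset 8, where the second thread's read() left the file position, so got ends up holding "CCCC" rather than "AAAA".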

An obvious solution to this is to put the lseek() and read()
functions within a mutex—when the first thread obtains the mutex, we now know that
it has exclusive access to the file descriptor.
The second thread has to wait until it can acquire the mutex before it can go and mess around
with the position of the file descriptor.
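
The mutex approach might be sketched as follows.
This is only an illustration under stated assumptions: fd_lock, locked_read_at(), and check_locked_read() are hypothetical names, and the sketch assumes that every thread in the process goes through locked_read_at() for this file descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical process-wide lock guarding the shared descriptor's offset. */
static pthread_mutex_t fd_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: seek and read inside one critical section. */
ssize_t locked_read_at(int fd, off_t offset, void *buf, size_t nbytes)
{
    ssize_t n = -1;

    assert(buf != NULL);
    pthread_mutex_lock(&fd_lock);
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1)
        n = read(fd, buf, nbytes);      /* offset can't move under us here */
    pthread_mutex_unlock(&fd_lock);
    return n;
}

/* Hypothetical self-check: returns 0 if bytes 4..6 of a scratch file come back. */
int check_locked_read(void)
{
    char path[] = "/tmp/combine_demo_XXXXXX";  /* assumed writable location */
    char buf[3];
    ssize_t n;
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    unlink(path);                              /* file disappears on close */
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        return -1;
    }
    n = locked_read_at(fd, 4, buf, sizeof buf);
    close(fd);
    return (n == 3 && buf[0] == '4' && buf[1] == '5' && buf[2] == '6') ? 0 : -1;
}
```

Note that the lock protects only callers that actually take it; nothing stops other code from calling lseek() on the same file descriptor directly.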

Unfortunately, if someone forgot to obtain the mutex for each and every file descriptor
operation, there'd be a possibility that the unprotected access
would cause a thread to read or write data at the wrong location.

Even code that wraps its lseek() and read() calls in a mutex is still vulnerable
to unprotected access; if some other
thread in the process does a simple non-mutexed lseek() on the file descriptor,
we've got a bug.

The solution to this is to use a combine message, as we discussed above for the
chown() function.
In this case, the C library implementation of readblock() puts both
the lseek() and the read() operations into a single
message and sends that off to the resource manager:

Figure 1. The readblock() function's combine message.

This works because message passing is atomic.
From the client's point of view, either the entire message has gone to the resource
manager, or none of it has.
Therefore, an intervening unprotected lseek() is irrelevant: when the
readblock() operation is received by the resource manager, it's done in one shot.
(Obviously, the damage will be to the unprotected lseek(), because after the
readblock() the file descriptor's offset is at a different place than where
the original lseek() put it.)
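
The "one shot" idea can be illustrated with a toy model.
This is not the real message layout or the real resource manager code: the structures toy_seek, toy_read, toy_combined, and toy_open, and the function toy_handle_combined(), are all hypothetical.
The mutex is just a stand-in for whatever mechanism keeps the two components together.
The point is only that the seek and the read travel in one request and are applied as a unit.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical message parts; this is NOT the real message layout. */
struct toy_seek { size_t offset; };     /* absolute position to seek to */
struct toy_read { size_t nbytes; };     /* number of bytes to read      */
struct toy_combined {
    struct toy_seek seek;               /* component 1 */
    struct toy_read read;               /* component 2 */
};

/* Toy stand-in for the per-open state kept on the server side. */
struct toy_open {
    pthread_mutex_t lock;
    const char *data;                   /* in-memory "file" contents */
    size_t size;
    size_t offset;                      /* current file position */
};

/* Applies both components as one unit; returns the number of bytes copied. */
size_t toy_handle_combined(struct toy_open *o, const struct toy_combined *m,
                           char *reply)
{
    size_t n;

    assert(o != NULL && m != NULL && reply != NULL);
    pthread_mutex_lock(&o->lock);

    /* Component 1: the seek, clamped to the end of the "file". */
    o->offset = m->seek.offset < o->size ? m->seek.offset : o->size;

    /* Component 2: the read, starting at the offset we just set. */
    n = o->size - o->offset;
    if (n > m->read.nbytes)
        n = m->read.nbytes;
    memcpy(reply, o->data + o->offset, n);
    o->offset += n;

    pthread_mutex_unlock(&o->lock);
    return n;
}
```

Whatever value the offset had before the request arrived (for example, from a stray lseek()) makes no difference, because the seek component overwrites it before the read component runs.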

But what about the resource manager? How does it ensure that it processes
the entire readblock() operation in one shot?
We'll see this shortly, when we discuss the operations performed for each
message component.