Abstract

Lightweight Remote Procedure Call (LRPC) is a communication facility designed and optimized for communication between protection domains on the same machine. In contemporary small-kernel operating systems, existing RPC systems incur an unnecessarily high cost when used for the type of communication that predominates: between protection domains on the same machine. This cost leads system designers to coalesce weakly related subsystems into the same protection domain, trading safety for performance. By reducing the overhead of same-machine communication, LRPC encourages both safety and performance. LRPC combines the control transfer and communication model of capability systems with the programming semantics and large-grained protection model of RPC. LRPC achieves a factor-of-three performance improvement over more traditional approaches based on independent threads exchanging messages, reducing the cost of same-machine communication to nearly the lower bound imposed by conventional hardware. LRPC has been integrated into the Taos operating system of the DEC SRC Firefly multiprocessor workstation.

Citations

... communication models: independent threads exchanging messages containing (potentially) large, structured values. In this paper, though, we show that most communication traffic in operating systems is =-=(1)-=- between domains on the same machine (cross-domain), rather than between domains located on separate ACM Transactions on Computer Systems, Vol. 8, No. 1, February 1990. Lightweight Remote Procedure Ca...

... improvement over more traditional approaches. The granularity of the protection mechanisms used by an operating system has a significant impact on the system's design and use. Some operating systems =-=[10, 13]-=- have large, monolithic kernels insulated from user programs by simple hardware boundaries. Within the operating system itself, though, there are no protection boundaries. The lack of strong fire wall...

...UNIX system call is not implemented as a cross-domain RPC, in a more decomposed operating system most calls would result in at least one such RPC. On a diskless Sun-3 workstation running Sun UNIX+NFS =-=[15]-=-, during a period of four days we observed over 100 million operating system calls, but fewer than one million RPCs to file servers. Inexpensive system calls, encouraging ’ UNIX is a trademark of AT&T...
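The counts in the excerpt above give a rough upper bound on how rare cross-machine traffic was in that trace. A minimal sketch of the arithmetic, assuming the round figures quoted ("over 100 million" and "fewer than one million") are used as bounds; the variable names are illustrative only:

```python
# Counts quoted in the excerpt; both are bounds, not exact measurements.
system_calls = 100_000_000    # "over 100 million" operating system calls
file_server_rpcs = 1_000_000  # "fewer than one million" RPCs to file servers

# Fewer RPCs over more system calls can only lower this ratio,
# so it is an upper bound on the cross-machine share.
remote_fraction_upper_bound = file_server_rpcs / system_calls

print(f"cross-machine share: under {remote_fraction_upper_bound:.0%}")
```

Because the numerator is an overestimate and the denominator an underestimate, the true share is below the printed bound.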

...lnerable to a large mass of complicated operating system software. Capability systems supporting fine-grained protection were suggested as a solution to the problems of large-kernel operating systems =-=[5]-=-. In a capability system, each fine-grained object exists in its own protection domain, but all live within a single name or address space. A process in one domain can act on an object in another only...

...d can be well served by optimization. 2.1 Frequency of Cross-Machine Activity We examined three operating systems to determine the relative frequency of cross-machine activity: (1) The V System. In V =-=[2]-=-, only the basic message primitives (Send, Receive, etc.) are accessed directly through kernel traps. All other system functions are accessed by sending messages to the appropriate server. Concern for...

...B. N. Bershad et al. -updates the thread’s user stack pointer to run off of the new E-stack, -reloads the processor’s virtual memory registers with those of the server domain, and -performs an upcall =-=[3]-=- into the server’s stub at the address specified in the PD for the registered procedure. Arguments are pushed onto the A-stack according to the calling conventions of Modula2+ [14]. Since the A-stack ...

...speedup potential of a multiprocessor. We have demonstrated the viability of LRPC by implementing and integrating it into Taos, the operating system for the DEC SRC Firefly multiprocessor workstation =-=[17]-=-. The simplest cross-domain call using LRPC takes 157 μs on a single C-VAX processor. By contrast, SRC RPC, the Firefly’s native communication system [16], takes 464 μs to do the same call; though SRC...

...he DEC SRC Firefly multiprocessor workstation [17]. The simplest cross-domain call using LRPC takes 157 μs on a single C-VAX processor. By contrast, SRC RPC, the Firefly’s native communication system =-=[16]-=-, takes 464 μs to do the same call; though SRC RPC has been carefully streamlined and outperforms peer systems, it is a factor of three slower than LRPC. The Firefly virtual memory and trap handling m...
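The "factor of three" in the excerpt above follows directly from the two Null call latencies it quotes. A small sketch of that check; the variable names are illustrative, and the ratio does not depend on the time unit:

```python
# Null cross-domain call latencies quoted in the excerpt (same time unit).
lrpc_null_call = 157     # LRPC on a single C-VAX processor
src_rpc_null_call = 464  # SRC RPC, the Firefly's native RPC system

# How many times slower SRC RPC is than LRPC for the simplest call.
speedup = src_rpc_null_call / lrpc_null_call

print(f"SRC RPC / LRPC = {speedup:.2f}")
```

The ratio comes out just under 3, consistent with the rounded claim in the text.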

... -performs an upcall [3] into the server’s stub at the address specified in the PD for the registered procedure. Arguments are pushed onto the A-stack according to the calling conventions of Modula2+ =-=[14]-=-. Since the A-stack is mapped into the server’s domain, the server procedure can directly access the parameters as though it had been called directly. It is important to note that this optimization re...

...actual Null call time reflects the overhead of a particular RPC system. Table II shows this overhead for six systems. The data in Table II come from measurements of our own and from published sources =-=[6, 18, 19]-=-. The high overheads revealed by Table II can be attributed to several aspects of conventional RPC: Stub overhead. Stubs provide a simple procedure call abstraction, concealing from programs the inter...

...tection and programming models used in distributed computing environments and have demonstrated these to be appropriate for managing subsystems, even those not primarily intended for remote operation =-=[11]-=-. In these small-kernel systems, separate components of the operating system can be placed in disjoint domains (or address spaces), with messages used for all interdomain communication. The advantages...

... effort to improve cross-domain performance. The DASH system [18] eliminates an intermediate kernel copy by allocating messages out of a region specially mapped into both kernel and user domains. Mach =-=[7]-=- and Taos rely on handoff scheduling to bypass the general, slower scheduling path; instead, if the two concrete threads cooperating in a domain transfer are identifiable at the time of the transfer, ...

...ss system interfaces and that the majority of interface procedures move only small amounts of data. Others have noticed that most interprocess communication is simple, passing mainly small parameters =-=[2, 4, 8]-=-, and some have suggested optimizations for this case. V, for example, uses a message protocol that has been optimized for fixed-size messages of 32 bytes. Karger describes compiler-driven techniques ...

...sages to the appropriate server. Concern for efficiency, though, has forced the implementation of many of these servers down into the kernel. In an instrumented version of the V System, C. Williamson =-=[20]-=- found that 97 percent of calls crossed protection, but not machine, boundaries. Williamson’s measurements include message traffic to kernel-resident servers. (2) Taos. Taos, the Firefly operating sys...