RUMPUSER(3) Library Functions Manual RUMPUSER(3)
NAME
rumpuser -- rump kernel hypercall interface
LIBRARY
rump User Library (librumpuser, -lrumpuser)
SYNOPSIS
#include <rump/rumpuser.h>
DESCRIPTION
The rumpuser hypercall interfaces allow a rump kernel to access host
resources. A hypervisor implementation must implement the routines
described in this document to allow a rump kernel to run on the host.
The implementation included in NetBSD is for POSIX hosts. This document
is divided into sections based on the functionality group of each
hypercall.
Since the hypercall interface is a C function interface, both the rump
kernel and the hypervisor must conform to the same ABI. The interface
itself attempts to assume as little as possible from the type systems,
and for example off_t is passed as int64_t and enums are passed as ints.
It is recommended that the hypervisor converts these to the native types
before starting to process the hypercall, for example by assigning the
ints back to enums.
UPCALLS AND RUMP KERNEL CONTEXT
A hypercall is always entered with the calling thread scheduled in the
rump kernel. In case the hypercall intends to block while waiting for an
event, the hypervisor must first release the rump kernel scheduling
context. In other words, the rump kernel context is a resource and
holding on to it while waiting for a rump kernel event/resource may lead
to a deadlock. Even when there is no possibility of deadlock in the
strict sense of the term, holding on to the rump kernel context while
performing a slow hypercall such as reading a device will prevent other
threads (including the clock interrupt) from using that rump kernel
context.
Releasing the context is done by calling the hyp_backend_unschedule()
upcall which the hypervisor received from rump kernel as a parameter for
rumpuser_init(). Before a hypercall returns back to the rump kernel, the
returning thread must carry a rump kernel context. In case the hypercall
unscheduled itself, it must reschedule itself by calling
hyp_backend_schedule().
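The release-and-reacquire pattern described above can be illustrated roughly as below. This is only a sketch: the upcall table is a simplified, hypothetical stand-in with argument-free callbacks, whereas the real struct rump_hyperup and the actual upcall signatures are defined in <rump/rumpuser.h>. The stubs at the end exist only so the pattern can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified upcall table (not the real struct rump_hyperup). */
struct sketch_upcalls {
	void (*backend_unschedule)(void);
	void (*backend_schedule)(void);
};

/* Assumed to be filled in by the hypervisor's initialization routine. */
static struct sketch_upcalls sketch_hyp;

/*
 * Run a potentially blocking host operation without holding the rump
 * kernel context: release it, block, then reacquire it before returning.
 */
static int
sketch_blocking_call(int (*host_op)(void *), void *arg)
{
	int rv;

	assert(host_op != NULL);
	sketch_hyp.backend_unschedule();	/* give up the context */
	rv = host_op(arg);			/* may sleep for a long time */
	sketch_hyp.backend_schedule();		/* hold the context on return */
	return rv;
}

/* Demo stubs standing in for the rump kernel's upcalls. */
static int sketch_held = 1;	/* 1 while this thread holds the context */
static void sketch_unsched(void) { sketch_held = 0; }
static void sketch_sched(void) { sketch_held = 1; }

/* Demo host operation: reports whether the context was held while it ran. */
static int sketch_op(void *arg) { (void)arg; return sketch_held; }
```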
HYPERCALL INTERFACES
Initialization
int rumpuser_init(int version, struct rump_hyperup *hyp)
Initialize the hypervisor.
version hypercall interface version number that the kernel
expects to be used. In case the hypervisor cannot
provide an exact match, this routine must return a non-
zero value.
hyp pointer to a set of upcalls the hypervisor can make into
the rump kernel
Memory allocation
int rumpuser_malloc(size_t len, int alignment, void **memp)
len amount of memory to allocate
alignment size the returned memory must be aligned to. For
example, if the value passed is 4096, the returned
memory must be aligned to a 4k boundary.
memp return pointer for allocated memory
void rumpuser_free(void *mem, size_t len)
mem memory to free
len length of allocation. This is always equal to the
amount the caller requested from the rumpuser_malloc()
which returned mem.
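On a POSIX host, an allocator with these semantics could be built on posix_memalign(3). The following is a minimal sketch under that assumption; the sketch_ names are hypothetical, and raising small alignments to pointer size is a choice made here, not something the interface mandates.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Sketch of an allocation hypercall.  Returns 0 or an errno value; the
 * allocated memory is passed back through memp.
 */
static int
sketch_malloc(size_t len, int alignment, void **memp)
{
	void *mem = NULL;
	size_t align = (size_t)alignment;
	int rv;

	assert(memp != NULL);
	/* posix_memalign() needs at least pointer-sized alignment. */
	if (align < sizeof(void *))
		align = sizeof(void *);
	rv = posix_memalign(&mem, align, len);
	assert(rv != 0 || (uintptr_t)mem % align == 0);
	if (rv == 0)
		*memp = mem;
	return rv;
}

/* len is unused here because free() tracks the allocation size itself. */
static void
sketch_free(void *mem, size_t len)
{
	(void)len;
	free(mem);
}
```

A host allocator that does not track sizes could use the len argument of the free routine instead.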
Files and I/O
int rumpuser_open(const char *name, int mode, int *fdp)
Open name for I/O and associate a file descriptor with it. Notably,
there need not be any mapping between name and the host's file system
namespace. For example, it is possible to associate the file descriptor
with device I/O registers for special values of name.
name the identifier of the file to open for I/O
mode combination of the following:
RUMPUSER_OPEN_RDONLY open only for reading
RUMPUSER_OPEN_WRONLY open only for writing
RUMPUSER_OPEN_RDWR open for reading and writing
RUMPUSER_OPEN_CREATE do not treat missing name as an
error
RUMPUSER_OPEN_EXCL combined with
RUMPUSER_OPEN_CREATE, flag an
error if name already exists
RUMPUSER_OPEN_BIO the caller will use this file for
block I/O, usually used in
conjunction with accessing file
system media. The hypervisor
should treat this flag as
advisory and possibly enable some
optimizations for *fdp based on
it.
Notably, the permissions of the created file are left up
to the hypervisor implementation.
fdp An integer value denoting the open file is returned
here.
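For a hypervisor that backs name with host files, the mode argument has to be translated to open(2) flags. A possible sketch follows. The SK_OPEN_* constants, including the access-mode mask, are placeholders chosen for this example; the real RUMPUSER_OPEN_* values come from <rump/rumpuser.h>.

```c
#include <assert.h>
#include <fcntl.h>

/* Placeholder flag values for this sketch only. */
#define SK_OPEN_RDONLY	0x0000
#define SK_OPEN_WRONLY	0x0001
#define SK_OPEN_RDWR	0x0002
#define SK_OPEN_ACCMODE	0x0003
#define SK_OPEN_CREATE	0x0004
#define SK_OPEN_EXCL	0x0008
#define SK_OPEN_BIO	0x0010

/* Translate hypercall open flags to host open(2) flags, or -1 if invalid. */
static int
sketch_open_flags(int mode)
{
	int flags;

	assert(mode >= 0);
	switch (mode & SK_OPEN_ACCMODE) {
	case SK_OPEN_RDONLY:	flags = O_RDONLY; break;
	case SK_OPEN_WRONLY:	flags = O_WRONLY; break;
	case SK_OPEN_RDWR:	flags = O_RDWR; break;
	default:		return -1;
	}
	if (mode & SK_OPEN_CREATE)
		flags |= O_CREAT;
	if (mode & SK_OPEN_EXCL)
		flags |= O_EXCL;
	/* SK_OPEN_BIO is advisory; this sketch ignores it. */
	return flags;
}
```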
int rumpuser_close(int fd)
Close a previously opened file descriptor.
int rumpuser_getfileinfo(const char *name, uint64_t *size, int *type)
name file for which information is returned. The namespace
is equal to that of rumpuser_open().
size If non-NULL, size of the file is returned here.
type If non-NULL, type of the file is returned here. The
options are RUMPUSER_FT_DIR, RUMPUSER_FT_REG,
RUMPUSER_FT_BLK, RUMPUSER_FT_CHR, or RUMPUSER_FT_OTHER
for directory, regular file, block device, character
device or unknown, respectively.
void rumpuser_bio(int fd, int op, void *data, size_t dlen, int64_t off,
rump_biodone_fn biodone, void *donearg)
Initiate block I/O and return immediately.
fd perform I/O on this file descriptor. The file
descriptor must have been opened with RUMPUSER_OPEN_BIO.
op Transfer data from the file descriptor with
RUMPUSER_BIO_READ and transfer data to the file
descriptor with RUMPUSER_BIO_WRITE. Unless
RUMPUSER_BIO_SYNC is specified, the hypervisor may cache
a write instead of committing it to permanent storage.
data memory address to transfer data to/from
dlen length of I/O. The length is guaranteed to be a
multiple of 512.
off offset into fd where I/O is performed
biodone To be called when the I/O is complete. Accessing data
is not legal after the call is made.
donearg opaque arg that must be passed to biodone.
int rumpuser_iovread(int fd, struct rumpuser_iovec *ruiov, size_t iovlen,
int64_t off, size_t *retv)
int rumpuser_iovwrite(int fd, struct rumpuser_iovec *ruiov,
size_t iovlen, int64_t off, size_t *retv)
These routines perform scatter-gather I/O which is not block I/O by
nature and therefore cannot be handled by rumpuser_bio().
fd file descriptor to perform I/O on
ruiov an array of I/O descriptors. It is defined as follows:
struct rumpuser_iovec {
void *iov_base;
size_t iov_len;
};
iovlen number of elements in ruiov
off offset of fd to perform I/O on. This can either be a
non-negative value or RUMPUSER_IOV_NOSEEK. The latter
denotes that no attempt to change the underlying object's
offset should be made. Using both types of offsets on a
single instance of fd results in undefined behavior.
retv number of bytes successfully transferred is returned
here
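The difference between the two offset modes can be shown with a portable loop over read(2) and pread(2). This is a sketch only: struct sk_iovec mirrors the structure above, SK_IOV_NOSEEK is a placeholder for RUMPUSER_IOV_NOSEEK, and a host with a native vectored call might well use that instead.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

struct sk_iovec {		/* mirrors struct rumpuser_iovec */
	void *iov_base;
	size_t iov_len;
};

#define SK_IOV_NOSEEK	(-1)	/* placeholder for RUMPUSER_IOV_NOSEEK */

/* Scatter read: fill each element in turn; *retv gets the byte total. */
static int
sketch_iovread(int fd, struct sk_iovec *iov, size_t iovlen, int64_t off,
    size_t *retv)
{
	size_t total = 0;

	assert(iov != NULL && retv != NULL);
	for (size_t i = 0; i < iovlen; i++) {
		size_t done = 0;

		while (done < iov[i].iov_len) {
			char *p = (char *)iov[i].iov_base + done;
			size_t left = iov[i].iov_len - done;
			ssize_t n;

			if (off == SK_IOV_NOSEEK)	/* use the fd's own offset */
				n = read(fd, p, left);
			else				/* explicit offset, no seek */
				n = pread(fd, p, left,
				    (off_t)(off + (int64_t)total));
			if (n <= 0) {			/* error or end of file */
				*retv = total;
				return n == -1 ? errno : 0;
			}
			done += (size_t)n;
			total += (size_t)n;
		}
	}
	*retv = total;
	return 0;
}
```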
int rumpuser_syncfd(int fd, int flags, uint64_t start, uint64_t len)
Synchronizes fd with respect to backing storage. The other arguments
are:
flags controls how synchronization happens. It must contain
one of the following:
RUMPUSER_SYNCFD_READ Make sure that the next read
sees writes from all other
parties. This is useful for
example in the case that fd
represents memory into which a
DMA read is being performed.
RUMPUSER_SYNCFD_WRITE Flush cached writes.
The following additional parameters may be passed in
flags:
RUMPUSER_SYNCFD_BARRIER Issue a barrier. Outstanding
I/O operations which were
started before the barrier
complete before any operations
after the barrier are
performed.
RUMPUSER_SYNCFD_SYNC Wait for the synchronization
operation to fully complete
before returning. For
example, this could mean that
the data to be written to a
disk has hit either the disk
or non-volatile memory.
start offset into the object.
len the number of bytes to synchronize. The value 0 denotes
until the end of the object.
Clocks
The hypervisor should support two clocks, one for wall time and one for
monotonically increasing time, the latter of which may be based on some
arbitrary time (e.g. system boot time). If this is not possible, the
hypervisor must make a reasonable effort to retain semantics.
int rumpuser_clock_gettime(int enum_rumpclock, int64_t *sec, long *nsec)
enum_rumpclock specifies the clock type. In case of
RUMPUSER_CLOCK_RELWALL the wall time should be returned.
In case of RUMPUSER_CLOCK_ABSMONO the time of a
monotonic clock should be returned.
sec return value for seconds
nsec return value for nanoseconds
int rumpuser_clock_sleep(int enum_rumpclock, int64_t sec, long nsec)
enum_rumpclock In case of RUMPUSER_CLOCK_RELWALL, the sleep should last
at least as long as specified. In case of
RUMPUSER_CLOCK_ABSMONO, the sleep should last until the
hypervisor monotonic clock hits the specified absolute
time.
sec sleep duration, seconds. Exact semantics depend on
enum_rumpclock.
nsec sleep duration, nanoseconds. Exact semantics depend on
enum_rumpclock.
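On a POSIX host the two clocks map naturally onto CLOCK_REALTIME and CLOCK_MONOTONIC. A rough sketch of the retrieval routine under that assumption is given below; the SK_CLOCK_* enumerators are placeholders for the RUMPUSER_CLOCK_* values, and returning EINVAL for an unknown clock is this sketch's choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Placeholder enumerators mirroring the two clocks named above. */
enum sk_clock { SK_CLOCK_RELWALL, SK_CLOCK_ABSMONO };

/* Returns 0 or an errno value; the time is passed back in sec and nsec. */
static int
sketch_clock_gettime(int enum_clock, int64_t *sec, long *nsec)
{
	struct timespec ts;
	clockid_t clk;

	assert(sec != NULL && nsec != NULL);
	switch (enum_clock) {
	case SK_CLOCK_RELWALL:	clk = CLOCK_REALTIME; break;
	case SK_CLOCK_ABSMONO:	clk = CLOCK_MONOTONIC; break;
	default:		return EINVAL;
	}
	if (clock_gettime(clk, &ts) == -1)
		return errno;
	*sec = (int64_t)ts.tv_sec;
	*nsec = ts.tv_nsec;
	return 0;
}
```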
Parameter retrieval
int rumpuser_getparam(const char *name, void *buf, size_t buflen)
Retrieve a configuration parameter from the hypervisor. It is up to the
hypervisor to decide how the parameters can be set.
name name of the parameter. If the name starts with an
underscore, it means a mandatory parameter. The
mandatory parameters are RUMPUSER_PARAM_NCPU which
specifies the number of virtual CPUs bootstrapped by the
rump kernel and RUMPUSER_PARAM_HOSTNAME which returns a
preferably unique instance name for the rump kernel.
buf buffer to return the data in as a string
buflen length of buffer
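Since the hypervisor decides how parameters are set, one simple option is to read them from the host environment. The sketch below assumes exactly that; the function name and the choice of ENOENT and E2BIG as error values are illustrative, not prescribed by the interface.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up a parameter in the host environment and copy it out as a
 * NUL-terminated string.  Returns 0 or an errno value.
 */
static int
sketch_getparam(const char *name, void *buf, size_t buflen)
{
	const char *val;
	size_t need;

	assert(name != NULL && buf != NULL);
	val = getenv(name);
	if (val == NULL)
		return ENOENT;		/* parameter not set */
	need = strlen(val) + 1;		/* include the terminator */
	if (need > buflen)
		return E2BIG;		/* caller's buffer is too small */
	memcpy(buf, val, need);
	return 0;
}
```

Mandatory parameters would need a fallback, for example a computed default, rather than a plain lookup failure.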
Termination
void rumpuser_exit(int value)
Terminate the rump kernel with exit value value. If value is
RUMPUSER_PANIC the hypervisor should attempt to provide something akin to
a core dump.
Console output
Console output is divided into two routines: a per-character one and a
printf-like one. The former is used e.g. by the rump kernel's internal
printf routine. The latter can be used for direct debug prints e.g. very
early on in the rump kernel's bootstrap or when using the in-kernel
routine causes too much skew in the debug print results (the hypercall
runs outside of the rump kernel and therefore does not cause any locking
or scheduling events inside the rump kernel).
void rumpuser_putchar(int ch)
Output ch on the console.
void rumpuser_dprintf(const char *fmt, ...)
Do output based on printf-like parameters.
Signals
A rump kernel should be able to send signals to client programs because
some standard interfaces include signal delivery in their
specifications. Examples of these interfaces include setitimer(2) and
write(2). The rumpuser_kill() function advises the hypercall
implementation to raise a signal for the process containing the rump
kernel.
int rumpuser_kill(int64_t pid, int sig)
pid The pid of the rump kernel process that the signal is
directed to. This value may be used by the hypervisor
as a hint on how to deliver the signal. The value
RUMPUSER_PID_SELF may also be specified to indicate no
hint. This value will be removed in a future version of
the hypercall interface.
sig Number of signal to raise. The value is in NetBSD
signal number namespace. In case the host has a native
representation for signals, the value should be
translated before the signal is raised. In case there
is no mapping between sig and native signals (if any),
the behavior is implementation-defined.
A rump kernel will ignore the return value of this hypercall. The only
implication of not implementing rumpuser_kill() is that some application
programs may not experience expected behavior for standard interfaces.
As an aside, the rump_sp(7) protocol provides equivalent functionality for
remote clients.
Random pool
int rumpuser_getrandom(void *buf, size_t buflen, int flags, size_t *retp)
buf buffer that the randomness is written to
buflen number of bytes of randomness requested
flags The value 0 or a combination of RUMPUSER_RANDOM_HARD
(return true randomness instead of something from a
PRNG) and RUMPUSER_RANDOM_NOWAIT (do not block in case
the requested amount of bytes is not available).
retp The number of random bytes written into buf.
Threads
int rumpuser_thread_create(void *(*fun)(void *), void *arg,
const char *thrname, int mustjoin, int priority, int cpuidx,
void **cookie)
Create a schedulable host thread context. The rump kernel will call this
interface when it creates a kernel thread. The scheduling policy for the
new thread is defined by the hypervisor. In case the hypervisor wants to
optimize the scheduling of the threads, it can perform heuristics on the
thrname, priority and cpuidx parameters.
fun function that the new thread must call. This call will
never return.
arg argument to be passed to fun
thrname Name of the new thread.
mustjoin If 1, the thread will be waited for by
rumpuser_thread_join() when the thread exits.
priority The priority that the kernel requested the thread to be
created at. Higher values mean higher priority. The
exact kernel semantics for each value are not available
through this interface.
cpuidx The index of the virtual CPU that the thread is bound
to, or -1 if the thread is not bound. The mapping
between the virtual CPUs and physical CPUs, if any, is
hypervisor implementation specific.
cookie In case mustjoin is set, the value returned in cookie
will be passed to rumpuser_thread_join().
void rumpuser_thread_exit(void)
Called when a thread created with rumpuser_thread_create() exits.
int rumpuser_thread_join(void *cookie)
Wait for a joinable thread to exit. The cookie matches the value from
rumpuser_thread_create().
void rumpuser_curlwpop(int enum_rumplwpop, struct lwp *l)
Manipulate the hypervisor's thread context database. The possible
operations are create, destroy, and set as specified by enum_rumplwpop:
RUMPUSER_LWP_CREATE Inform the hypervisor that l is now a valid thread
context which may be set. A currently valid value
of l may not be specified. This operation is
informational and does not mandate any action from
the hypervisor.
RUMPUSER_LWP_DESTROY Inform the hypervisor that l is no longer a valid
thread context. This means that it may no longer
be set as the current context. A currently set
context or an invalid one may not be destroyed.
This operation is informational and does not
mandate any action from the hypervisor.
RUMPUSER_LWP_SET Set l as the current host thread's rump kernel
context. A previous context must not exist.
RUMPUSER_LWP_CLEAR Clear the context previously set by
RUMPUSER_LWP_SET. The value passed in l is the
current thread and is never NULL.
struct lwp *rumpuser_curlwp(void)
Retrieve the rump kernel thread context associated with the current host
thread, as set by rumpuser_curlwpop(). This routine may be called when a
context is not set and the routine must return NULL in that case. This
interface is expected to be called very often. Any optimizations
pertaining to the execution speed of this routine should be done in
rumpuser_curlwpop().
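One way to keep the lookup cheap is to store the context in thread-local storage, so that all bookkeeping happens in the set and clear operations and retrieval is a single load. The following is a sketch under that assumption, using C11 _Thread_local; the SK_LWP_* enumerators are placeholders for the RUMPUSER_LWP_* values and the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct lwp;	/* opaque to the hypervisor */

/* Placeholder operation codes for this sketch only. */
enum sk_lwpop { SK_LWP_CREATE, SK_LWP_DESTROY, SK_LWP_SET, SK_LWP_CLEAR };

/* One slot per host thread (C11 thread-local storage). */
static _Thread_local struct lwp *sk_curlwp;

static void
sketch_curlwpop(int enum_lwpop, struct lwp *l)
{
	switch ((enum sk_lwpop)enum_lwpop) {
	case SK_LWP_CREATE:
	case SK_LWP_DESTROY:
		break;				/* informational only */
	case SK_LWP_SET:
		assert(sk_curlwp == NULL);	/* no previous context */
		sk_curlwp = l;
		break;
	case SK_LWP_CLEAR:
		assert(sk_curlwp == l);		/* l is the current thread */
		sk_curlwp = NULL;
		break;
	}
}

/* Hot path: returns NULL when no context is set. */
static struct lwp *
sketch_curlwp(void)
{
	return sk_curlwp;
}
```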
void rumpuser_seterrno(int errno)
Set an errno value in the calling thread's TLS. Note: this is used only
if rump kernel clients make rump system calls.
Mutexes, rwlocks and condition variables
The locking interfaces have standard semantics, so we will not discuss
each one in detail. The data types struct rumpuser_mtx, struct
rumpuser_rw and struct rumpuser_cv used by these interfaces are opaque to
the rump kernel, i.e. the hypervisor has complete freedom over them.
Most of these interfaces will (and must) relinquish the rump kernel CPU
context in case they block (or intend to block). The exceptions are the
"nowrap" variants of the interfaces, which must not relinquish rump kernel
context.
void rumpuser_mutex_init(struct rumpuser_mtx **mtxp, int flags)
void rumpuser_mutex_enter(struct rumpuser_mtx *mtx)
void rumpuser_mutex_enter_nowrap(struct rumpuser_mtx *mtx)
int rumpuser_mutex_tryenter(struct rumpuser_mtx *mtx)
void rumpuser_mutex_exit(struct rumpuser_mtx *mtx)
void rumpuser_mutex_destroy(struct rumpuser_mtx *mtx)
void rumpuser_mutex_owner(struct rumpuser_mtx *mtx, struct lwp **lp)
Mutexes provide mutually exclusive locking. The flags, of which at least
one must be given, are as follows:
RUMPUSER_MTX_SPIN Create a spin mutex. Locking this type of mutex
must not relinquish rump kernel context even when
rumpuser_mutex_enter() is used.
RUMPUSER_MTX_KMUTEX The mutex must track and be able to return the rump
kernel thread that owns the mutex (if any). If
this flag is not specified, rumpuser_mutex_owner()
will never be called for that particular mutex.
void rumpuser_rw_init(struct rumpuser_rw **rwp)
void rumpuser_rw_enter(int enum_rumprwlock, struct rumpuser_rw *rw)
int rumpuser_rw_tryenter(int enum_rumprwlock, struct rumpuser_rw *rw)
int rumpuser_rw_tryupgrade(struct rumpuser_rw *rw)
void rumpuser_rw_downgrade(struct rumpuser_rw *rw)
void rumpuser_rw_exit(struct rumpuser_rw *rw)
void rumpuser_rw_destroy(struct rumpuser_rw *rw)
void rumpuser_rw_held(int enum_rumprwlock, struct rumpuser_rw *rw,
int *heldp)
Read/write locks provide either shared or exclusive locking. The
possible values for enum_rumprwlock are RUMPUSER_RW_READER and
RUMPUSER_RW_WRITER.
Upgrading means trying to migrate from an already owned shared lock to an
exclusive lock and downgrading means migrating from an already owned
exclusive lock to a shared lock.
void rumpuser_cv_init(struct rumpuser_cv **cvp)
void rumpuser_cv_destroy(struct rumpuser_cv *cv)
void rumpuser_cv_wait(struct rumpuser_cv *cv, struct rumpuser_mtx *mtx)
void rumpuser_cv_wait_nowrap(struct rumpuser_cv *cv, struct rumpuser_mtx *mtx)
int rumpuser_cv_timedwait(struct rumpuser_cv *cv,
struct rumpuser_mtx *mtx, int64_t sec, int64_t nsec)
void rumpuser_cv_signal(struct rumpuser_cv *cv)
void rumpuser_cv_broadcast(struct rumpuser_cv *cv)
void rumpuser_cv_has_waiters(struct rumpuser_cv *cv, int *waitersp)
Condition variables wait for an event. The mtx interlock eliminates a
race between checking the predicate and sleeping on the condition
variable; the mutex should be released for the duration of the sleep in
the normal atomic manner. The timedwait variant takes a specifier
indicating a relative sleep duration after which the routine will return
with ETIMEDOUT. If a timedwait is signaled before the timeout expires,
the routine will return 0.
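Host primitives such as pthread_cond_timedwait(3) take an absolute deadline, so a hypervisor built on them has to convert the relative duration first. The arithmetic might look like the sketch below; the helper name is hypothetical, and which clock supplies "now" is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SK_NSEC_PER_SEC	1000000000L

/*
 * Turn a relative (sec, nsec) timeout into an absolute deadline based on
 * now, normalizing so that 0 <= tv_nsec < 1e9.
 */
static struct timespec
sketch_deadline(struct timespec now, int64_t sec, int64_t nsec)
{
	struct timespec ts;

	assert(sec >= 0 && nsec >= 0 && nsec < SK_NSEC_PER_SEC);
	ts.tv_sec = now.tv_sec + (time_t)sec;
	ts.tv_nsec = now.tv_nsec + (long)nsec;
	if (ts.tv_nsec >= SK_NSEC_PER_SEC) {	/* carry into seconds */
		ts.tv_sec += 1;
		ts.tv_nsec -= SK_NSEC_PER_SEC;
	}
	return ts;
}
```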
The order in which the hypervisor reacquires the rump kernel context and
interlock mutex before returning into the rump kernel is as follows. In
case the interlock mutex was initialized with both RUMPUSER_MTX_SPIN and
RUMPUSER_MTX_KMUTEX, the rump kernel context is scheduled before the
mutex is reacquired. In case of a purely RUMPUSER_MTX_SPIN mutex, the
mutex is acquired first. In the final case the order is implementation-
defined.
RETURN VALUES
All routines which return an integer return an errno value. The
hypervisor must translate the value to the native errno namespace
used by the rump kernel. Routines which do not return an integer may
never fail.
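On a non-NetBSD host the translation amounts to a lookup from host errno values to the numbers the rump kernel expects. A deliberately incomplete sketch is shown below. It covers only four codes, with target numbers as they appear in NetBSD's errno.h; the function name is hypothetical, and passing other values through unchanged is a simplification that a complete table would not make.

```c
#include <assert.h>
#include <errno.h>

/* NetBSD errno numbers for a few codes whose values differ on some hosts. */
#define SK_NB_EDEADLK	11
#define SK_NB_EAGAIN	35
#define SK_NB_ETIMEDOUT	60
#define SK_NB_ENOSYS	78

/* Map a host errno value into the rump kernel's errno namespace. */
static int
sketch_errno_to_rump(int host_errno)
{
	assert(host_errno >= 0);
	switch (host_errno) {
	case EDEADLK:	return SK_NB_EDEADLK;
	case EAGAIN:	return SK_NB_EAGAIN;
	case ETIMEDOUT:	return SK_NB_ETIMEDOUT;
	case ENOSYS:	return SK_NB_ENOSYS;
	default:
		/* Sketch only: other values pass through unchanged. */
		return host_errno;
	}
}
```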
SEE ALSO
rump(3)
Antti Kantee, "Flexible Operating System Internals: The Design and
Implementation of the Anykernel and Rump Kernels", Aalto University
Doctoral Dissertations, 2012, Section 2.3.2: The Hypercall Interface.
HISTORY
The rump kernel hypercall API was first introduced in NetBSD 5.0. The
API described above first appeared in NetBSD 7.0.
NetBSD 7.1.2 February 19, 2014 NetBSD 7.1.2