DESCRIPTION

Direct Memory Access (DMA) is a method of transferring data without
involving the CPU, thus providing higher performance. A DMA transaction
can take place from device to memory, device to device, or memory to
memory.
The bus_dma API is a bus, device, and machine-independent (MI) interface
to DMA mechanisms. It provides the client with flexibility and
simplicity by abstracting machine-dependent issues such as setting up
DMA mappings, handling cache issues, and accommodating bus-specific
features and limitations.

STRUCTURES AND TYPES

bus_dma_tag_t
A machine-dependent (MD) opaque type that describes the
characteristics of DMA transactions. DMA tags are organized into
a hierarchy, with each child tag inheriting the restrictions of
its parent. This allows all devices along the path of DMA
transactions to contribute to the constraints of those
transactions.
bus_dma_filter_t
Client specified address filter having the format:
int client_filter(void *filtarg, bus_addr_t testaddr)
Address filters can be specified during tag creation to allow for
devices whose DMA address restrictions cannot be specified by a
single window. The filtarg argument is specified by the client
during tag creation to be passed to all invocations of the
callback. The testaddr argument contains a potential starting
address of a DMA mapping. The filter function operates on the
set of addresses from testaddr to ‘trunc_page(testaddr) +
PAGE_SIZE - 1’, inclusive. The filter function should return
zero if any mapping in this range can be accommodated by the
device and non-zero otherwise.
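The filter contract described above can be illustrated with a minimal
userland sketch. It assumes a hypothetical device that cannot reach one
contiguous hole in the address space; struct dma_hole, the typedef for
bus_addr_t, and the PAGE_SIZE and trunc_page() definitions are stand-ins
introduced here so the example builds outside the kernel.

```c
#include <stdint.h>

/* Userland stand-ins for the kernel definitions (assumptions). */
typedef uint64_t bus_addr_t;
#define PAGE_SIZE	4096u
#define trunc_page(a)	((a) & ~(bus_addr_t)(PAGE_SIZE - 1))

/* Hypothetical description of a range the device cannot access. */
struct dma_hole {
	bus_addr_t start;	/* first unreachable address */
	bus_addr_t end;		/* last unreachable address, inclusive */
};

/*
 * Returns zero if the device can use every address from testaddr to
 * the end of its page, and non-zero if that range touches the hole.
 */
static int
client_filter(void *filtarg, bus_addr_t testaddr)
{
	const struct dma_hole *hole = filtarg;
	bus_addr_t last = trunc_page(testaddr) + PAGE_SIZE - 1;

	return (testaddr <= hole->end && last >= hole->start);
}
```

In a driver, a pointer to the hole description would be supplied as the
filter argument when the tag is created.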
bus_dma_segment_t
A machine-dependent type that describes individual DMA segments.
It contains the following fields:
bus_addr_t ds_addr;
bus_size_t ds_len;
The ds_addr field contains the device visible address of the DMA
segment, and ds_len contains the length of the DMA segment.
Although the DMA segments returned by a mapping call will adhere
to all restrictions necessary for a successful DMA operation,
some conversion (e.g. a conversion from host byte order to the
device’s byte order) is almost always required when presenting
segment information to the device.
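As an illustration of the conversion step, the following userland
sketch copies one segment into a little-endian hardware descriptor. The
descriptor layout (struct hw_desc), the helper names, and the typedef
widths are assumptions made for the example; a real driver would follow
its device's documented layout and would typically use the kernel's
byte order helpers instead of put_le().

```c
#include <stdint.h>

/* Userland stand-ins for the kernel types (assumed widths). */
typedef uint64_t bus_addr_t;
typedef uint64_t bus_size_t;

typedef struct bus_dma_segment {
	bus_addr_t ds_addr;	/* device visible address */
	bus_size_t ds_len;	/* length of the segment */
} bus_dma_segment_t;

/* Hypothetical descriptor: 64-bit address, 32-bit length, little-endian. */
struct hw_desc {
	uint8_t addr[8];
	uint8_t len[4];
};

/* Store the low nbytes of v into dst, least significant byte first. */
static void
put_le(uint8_t *dst, uint64_t v, int nbytes)
{
	for (int i = 0; i < nbytes; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Returns 0 on success, -1 if the length does not fit the descriptor. */
static int
fill_desc(struct hw_desc *d, const bus_dma_segment_t *seg)
{
	if (seg->ds_len > UINT32_MAX)
		return (-1);
	put_le(d->addr, seg->ds_addr, 8);
	put_le(d->len, seg->ds_len, 4);
	return (0);
}
```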
bus_dmamap_t
A machine-dependent opaque type describing an individual mapping.
One map is used for each memory allocation that will be loaded.
Maps can be reused once they have been unloaded. Multiple maps
can be associated with one DMA tag. While the value of the map
may evaluate to NULL on some platforms under certain conditions,
it should never be assumed that it will be NULL in all cases.
bus_dmamap_callback_t
Client specified callback for receiving mapping information
resulting from the load of a bus_dmamap_t via bus_dmamap_load().
Callbacks are of the format:
void client_callback(void *callback_arg, bus_dma_segment_t *segs, int nseg, int error)
The callback_arg is the callback argument passed to dmamap load
functions. The segs and nseg arguments describe an array of
bus_dma_segment_t structures that represent the mapping. This
array is only valid within the scope of the callback function.
The success or failure of the mapping is indicated by the error
argument. More information on the use of callbacks can be found
in the description of the individual dmamap load functions.
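Because the segment array is only valid inside the callback, a callback
usually copies out what the driver needs. The sketch below shows one
way to do that in userland; struct load_result, MAX_SEGS, and the
typedefs are hypothetical stand-ins and not part of the bus_dma API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userland stand-ins for the kernel types (assumed widths). */
typedef uint64_t bus_addr_t;
typedef uint64_t bus_size_t;

typedef struct bus_dma_segment {
	bus_addr_t ds_addr;
	bus_size_t ds_len;
} bus_dma_segment_t;

#define MAX_SEGS	4	/* hypothetical per-request limit */

/* Hypothetical per-request state handed in as callback_arg. */
struct load_result {
	int error;
	int nseg;
	bus_dma_segment_t segs[MAX_SEGS];
};

static void
client_callback(void *callback_arg, bus_dma_segment_t *segs, int nseg,
    int error)
{
	struct load_result *res = callback_arg;

	res->error = error;
	res->nseg = 0;
	if (error != 0)
		return;		/* segs must not be examined on failure */
	if (nseg > MAX_SEGS) {
		res->error = EFBIG;
		return;
	}
	/* Copy the segments; the array is gone once we return. */
	for (int i = 0; i < nseg; i++)
		res->segs[i] = segs[i];
	res->nseg = nseg;
}
```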
bus_dmamap_callback2_t
Client specified callback for receiving mapping information
resulting from the load of a bus_dmamap_t via
bus_dmamap_load_uio() or bus_dmamap_load_mbuf().
Callback2s are of the format:
void client_callback2(void *callback_arg, bus_dma_segment_t *segs, int nseg, bus_size_t mapsize, int error)
Callback2’s behavior is the same as bus_dmamap_callback_t with
the addition that the length of the data mapped is provided via
mapsize.
bus_dmasync_op_t
Memory synchronization operation specifier. Bus DMA requires
explicit synchronization of memory with its device visible
mapping in order to guarantee memory coherency. The
bus_dmasync_op_t allows the type of DMA operation that will be or
has been performed to be communicated to the system so that the
correct coherency measures are taken. The operations are
represented as bitfield flags that can be combined together,
though it only makes sense to combine PRE flags or POST flags,
not both. See the bus_dmamap_sync() description below for more
details on how to use these operations.
All operations specified below are performed from the host memory
point of view, where a read implies data coming from the device
to the host memory, and a write implies data going from the host
memory to the device. Alternatively, the operations can be
thought of in terms of driver operations, where reading a network
packet or storage sector corresponds to a read operation in
bus_dma.
BUS_DMASYNC_PREREAD Perform any synchronization required prior
to an update of host memory by the device.
BUS_DMASYNC_PREWRITE Perform any synchronization required after
an update of host memory by the CPU and
prior to device access to host memory.
BUS_DMASYNC_POSTREAD Perform any synchronization required after
an update of host memory by the device and
prior to CPU access to host memory.
BUS_DMASYNC_POSTWRITE Perform any synchronization required after
device access to host memory.
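The rule that PRE and POST flags should not be mixed in one operation
can be expressed as a small check. This is only a sketch: the numeric
flag values and the int typedef below are stand-ins chosen for the
example, and sync_op_is_sensible() is a hypothetical helper rather than
a bus_dma function.

```c
/* Stand-in definitions; the real ones come from the kernel headers. */
typedef int bus_dmasync_op_t;
#define BUS_DMASYNC_PREREAD	1
#define BUS_DMASYNC_POSTREAD	2
#define BUS_DMASYNC_PREWRITE	4
#define BUS_DMASYNC_POSTWRITE	8

#define SYNC_PRE_MASK	(BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE)
#define SYNC_POST_MASK	(BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE)

/*
 * Returns 1 if op contains only PRE flags or only POST flags,
 * and 0 if it is empty or mixes the two groups.
 */
static int
sync_op_is_sensible(bus_dmasync_op_t op)
{
	int has_pre = (op & SYNC_PRE_MASK) != 0;
	int has_post = (op & SYNC_POST_MASK) != 0;

	return (has_pre != has_post);
}
```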
bus_dma_lock_t
Client specified lock/mutex manipulation method. This will be
called from within busdma whenever a client lock needs to be
manipulated. In its current form, the function will be called
immediately before the callback for a DMA load operation that has
been deferred with BUS_DMA_LOCK and immediately after with
BUS_DMA_UNLOCK. If the load operation does not need to be
deferred, then it will not be called since the function loading
the map should be holding the appropriate locks. This method is
of the format:
void lockfunc(void *lockfunc_arg, bus_dma_lock_op_t op)
The lockfuncarg argument is specified by the client during tag
creation to be passed to all invocations of the callback. The op
argument specifies the lock operation to perform.
Two lockfunc implementations are provided for convenience.
busdma_lock_mutex() performs standard mutex operations on the
sleep mutex provided via lockfuncarg. dflt_lock() will generate
a system panic if it is called. It is substituted into the tag
when lockfunc is passed as NULL to bus_dma_tag_create() and is
useful for tags that should not be used with deferred load
operations.
bus_dma_lock_op_t
Operations to be performed by the client-specified lockfunc().
BUS_DMA_LOCK Acquires and/or locks the client locking
primitive.
BUS_DMA_UNLOCK Releases and/or unlocks the client locking
primitive.
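The lock function protocol can be modeled in userland as follows. The
sketch assumes a toy lock (struct toy_mutex) in place of a real sleep
mutex, stand-in values for the lock operations, and a hypothetical
run_deferred() helper that imitates how a deferred callback is
bracketed by BUS_DMA_LOCK and BUS_DMA_UNLOCK calls.

```c
#include <stdlib.h>

/* Stand-in values; the real definitions come from the kernel headers. */
typedef enum {
	BUS_DMA_LOCK = 1,
	BUS_DMA_UNLOCK = 2
} bus_dma_lock_op_t;

/* Toy lock standing in for the driver's mutex (hypothetical). */
struct toy_mutex {
	int held;
};

static void
lockfunc(void *lockfunc_arg, bus_dma_lock_op_t op)
{
	struct toy_mutex *m = lockfunc_arg;

	switch (op) {
	case BUS_DMA_LOCK:
		m->held++;
		break;
	case BUS_DMA_UNLOCK:
		m->held--;
		break;
	default:
		abort();	/* unknown lock operation */
	}
}

/* Hypothetical request that records whether the lock was held. */
struct request {
	struct toy_mutex *lock;
	int lock_was_held;
};

static void
deferred_callback(void *arg)
{
	struct request *r = arg;

	r->lock_was_held = (r->lock->held == 1);
}

/* Model of a deferred load: lock, run the callback, unlock. */
static void
run_deferred(void (*lock)(void *, bus_dma_lock_op_t), void *lockarg,
    void (*cb)(void *), void *cbarg)
{
	lock(lockarg, BUS_DMA_LOCK);
	cb(cbarg);
	lock(lockarg, BUS_DMA_UNLOCK);
}
```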

FUNCTIONS

bus_dma_tag_create(parent, alignment, boundary, lowaddr, highaddr,
*filtfunc, *filtfuncarg, maxsize, nsegments, maxsegsz, flags,
lockfunc, lockfuncarg, *dmat)
Allocates a device specific DMA tag, and initializes it according
to the arguments provided:
parent Indicates restrictions between the parent bridge,
CPU memory, and the device. Each device must use a
master parent tag by calling bus_get_dma_tag().
alignment Alignment constraint, in bytes, of any mappings
created using this tag. The alignment must be a
power of 2. Hardware that can DMA starting at any
address would specify 1 for byte alignment.
Hardware requiring DMA transfers to start on a
multiple of 4K would specify 4096.
boundary Boundary constraint, in bytes, of the target DMA
memory region. The boundary indicates the set of
addresses, all multiples of the boundary argument,
that cannot be crossed by a single
bus_dma_segment_t. The boundary must be a power of
2 and must be no smaller than the maximum segment
size. ‘0’ indicates that there are no boundary
restrictions.
lowaddr, highaddr
Bounds of the window of bus address space that
cannot be directly accessed by the device. The
window contains all addresses greater than lowaddr
and less than or equal to highaddr. For example, a
device incapable of DMA above 4GB would specify a
highaddr of BUS_SPACE_MAXADDR and a lowaddr of
BUS_SPACE_MAXADDR_32BIT. Similarly, a device that
can only DMA to addresses below 16MB would specify
a highaddr of BUS_SPACE_MAXADDR and a lowaddr of
BUS_SPACE_MAXADDR_24BIT. Some implementations
require that some region of device visible address
space, overlapping available host memory, be outside
the window. This area of ‘safe memory’ is used to
bounce requests that would otherwise conflict with
the exclusion window.
filtfunc Optional filter function (may be NULL) to be called
for any attempt to map memory into the window
described by lowaddr and highaddr. A filter
function is only required when the single window
described by lowaddr and highaddr cannot adequately
describe the constraints of the device. The filter
function will be called for every machine page that
overlaps the exclusion window.
filtfuncarg Argument passed to all calls to the filter function
for this tag. May be NULL.
maxsize Maximum size, in bytes, of the sum of all segment
lengths in a given DMA mapping associated with this
tag.
nsegments Number of discontinuities (scatter/gather segments)
allowed in a DMA mapped region. If there is no
restriction, BUS_SPACE_UNRESTRICTED may be
specified.
maxsegsz Maximum size, in bytes, of a segment in any DMA
mapped region associated with dmat.
flags Are as follows:
BUS_DMA_ALLOCNOW Pre-allocate enough resources to
handle at least one map load
operation on this tag. If
sufficient resources are not
available, ENOMEM is returned.
This should not be used for tags
that only describe buffers that
will be allocated with
bus_dmamem_alloc(). Also, due to
resource sharing with other tags,
this flag does not guarantee that
resources will be allocated or
reserved exclusively for this tag.
It should be treated only as a
minor optimization.
lockfunc Optional lock manipulation function (may be NULL) to
be called when busdma needs to manipulate a lock on
behalf of the client. If NULL is specified,
dflt_lock() is used.
lockfuncarg Optional argument to be passed to the function
specified by lockfunc.
dmat Pointer to a bus_dma_tag_t where the resulting DMA
tag will be stored.
Returns ENOMEM if sufficient memory is not available for tag
creation or allocating mapping resources.
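To make the alignment, boundary, and exclusion window rules concrete,
the following userland sketch tests whether a mapping made of a single
segment would satisfy a set of tag constraints. struct tag_limits and
segment_ok() are hypothetical names, and the BUS_SPACE_MAXADDR values
are stand-ins for the machine-dependent kernel constants; this is not
how bus_dma itself is implemented.

```c
#include <stdint.h>

/* Userland stand-ins for the kernel types and constants (assumptions). */
typedef uint64_t bus_addr_t;
typedef uint64_t bus_size_t;
#define BUS_SPACE_MAXADDR_24BIT	0xFFFFFFULL
#define BUS_SPACE_MAXADDR_32BIT	0xFFFFFFFFULL
#define BUS_SPACE_MAXADDR	0xFFFFFFFFFFFFFFFFULL

/* Hypothetical subset of the constraints given at tag creation. */
struct tag_limits {
	bus_size_t alignment;	/* power of 2 */
	bus_size_t boundary;	/* power of 2, or 0 for none */
	bus_addr_t lowaddr;	/* window is (lowaddr, highaddr] */
	bus_addr_t highaddr;
	bus_size_t maxsegsz;
};

/* Returns 1 if [addr, addr + len) meets every constraint, else 0. */
static int
segment_ok(const struct tag_limits *t, bus_addr_t addr, bus_size_t len)
{
	bus_addr_t last;

	if (len == 0 || len > t->maxsegsz)
		return (0);
	last = addr + len - 1;
	if (last < addr)
		return (0);	/* address wrapped around */
	if ((addr & (t->alignment - 1)) != 0)
		return (0);	/* start is not aligned */
	if (t->boundary != 0 &&
	    (addr & ~(t->boundary - 1)) != (last & ~(t->boundary - 1)))
		return (0);	/* crosses a boundary multiple */
	if (last > t->lowaddr && addr <= t->highaddr)
		return (0);	/* overlaps the exclusion window */
	return (1);
}
```

With a lowaddr of BUS_SPACE_MAXADDR_32BIT and a highaddr of
BUS_SPACE_MAXADDR, a segment ending exactly at the 4GB limit passes,
while one that starts above it is rejected.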
bus_dma_tag_destroy(dmat)
Deallocate the DMA tag dmat that was created by
bus_dma_tag_create().
Returns EBUSY if any DMA maps remain associated with dmat or ‘0’
on success.
bus_dmamap_create(dmat, flags, *mapp)
Allocates and initializes a DMA map. Arguments are as follows:
dmat DMA tag.
flags The value of this argument is currently undefined and
should be specified as ‘0’.
mapp Pointer to a bus_dmamap_t where the resulting DMA map
will be stored.
Returns ENOMEM if sufficient memory is not available for creating
the map or allocating mapping resources.
bus_dmamap_destroy(dmat, map)
Frees all resources associated with a given DMA map. Arguments
are as follows:
dmat DMA tag used to allocate map.
map The DMA map to destroy.
Returns EBUSY if a mapping is still active for map.
bus_dmamap_load(dmat, map, buf, buflen, *callback, callback_arg, flags)
Creates a mapping in device visible address space of buflen bytes
of buf, associated with the DMA map map. This call will always
return immediately and will not block for any reason. Arguments
are as follows:
dmat DMA tag used to allocate map.
map A DMA map without a currently active mapping.
buf A kernel virtual address pointer to a contiguous (in KVA)
buffer, to be mapped into device visible address space.
buflen The size of the buffer.
callback, callback_arg
The callback function, and its argument. This function
is called once sufficient mapping resources are available
for the DMA operation. If resources are temporarily
unavailable, this function will be deferred until later,
but the load operation will still return immediately to
the caller. Thus, callers should not assume that the
callback will be called before the load returns, and code
should be structured appropriately to handle this. See
below for specific flags and error codes that control
this behavior.
flags Are as follows:
BUS_DMA_NOWAIT The load should not be deferred in case
of insufficient mapping resources, and
instead should return immediately with an
appropriate error.
Return values to the caller are as follows:
0 The callback has been called and completed. The
status of the mapping has been delivered to the
callback.
EINPROGRESS The mapping has been deferred for lack of resources.
The callback will be called as soon as resources are
available. Callbacks are serviced in FIFO order.
To ensure that ordering is guaranteed, all
subsequent load requests will also be deferred until
all callbacks have been processed.
ENOMEM The load request has failed due to insufficient
resources, and the caller specifically used the
BUS_DMA_NOWAIT flag.
EINVAL The load request was invalid. The callback has been
called and has been provided the same error. This
error value may indicate that dmat, map, buf, or
callback were invalid, or buflen was larger than the
maxsize argument used to create the DMA tag dmat.
When the callback is called, it is presented with an error value
indicating the disposition of the mapping. Error may be one of
the following:
0 The mapping was successful and the segs callback
argument contains an array of bus_dma_segment_t
elements describing the mapping. This array is only
valid during the scope of the callback function.
EFBIG A mapping could not be achieved within the segment
constraints provided in the tag even though the
requested allocation size was less than maxsize.
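A caller has to combine the value returned by the load with the error
delivered to the callback. The sketch below shows one possible way to
fold the two into a single outcome; enum load_state and load_outcome()
are hypothetical names, and cb_error is assumed to be the error the
driver's callback recorded (it is only meaningful once the callback has
run).

```c
#include <errno.h>

enum load_state {
	LOAD_DONE,	/* callback ran and the mapping is usable */
	LOAD_DEFERRED,	/* callback will run later */
	LOAD_FAILED	/* no mapping was established */
};

/*
 * ret is the value returned by the load call; cb_error is the error
 * argument the callback received, if it has been called.
 */
static enum load_state
load_outcome(int ret, int cb_error)
{
	if (ret == EINPROGRESS)
		return (LOAD_DEFERRED);
	if (ret != 0)
		return (LOAD_FAILED);	/* e.g. ENOMEM or EINVAL */
	if (cb_error != 0)
		return (LOAD_FAILED);	/* e.g. EFBIG from the callback */
	return (LOAD_DONE);
}
```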
bus_dmamap_load_mbuf(dmat, map, mbuf, callback2, callback_arg, flags)
This is a variation of bus_dmamap_load() which maps mbuf chains
for DMA transfers. A bus_size_t argument is also passed to the
callback routine, which contains the mbuf chain’s packet header
length. The BUS_DMA_NOWAIT flag is implied, thus no callback
deferral will happen.
Mbuf chains are assumed to be in kernel virtual address space.
Besides the error values listed for bus_dmamap_load(), EINVAL will
be returned if the size of the mbuf chain exceeds the maximum
limit of the DMA tag.
bus_dmamap_load_mbuf_sg(dmat, map, mbuf, segs, nsegs, flags)
This is just like bus_dmamap_load_mbuf() except that it returns
immediately without calling a callback function. It is provided
for efficiency. The scatter/gather segment array segs is
provided by the caller and filled in directly by the function.
The nsegs argument is returned with the number of segments filled
in. Returns the same errors as bus_dmamap_load_mbuf().
bus_dmamap_load_uio(dmat, map, uio, callback2, callback_arg, flags)
This is a variation of bus_dmamap_load() which maps buffers
pointed to by uio for DMA transfers. A bus_size_t argument is
also passed to the callback routine, which contains the size of
uio, i.e. uio->uio_resid. The BUS_DMA_NOWAIT flag is implied,
thus no callback deferral will happen. Returns the same errors
as bus_dmamap_load().
If uio->uio_segflg is UIO_USERSPACE, then it is assumed that the
buffer described by uio is in uio->uio_td->td_proc’s address space. User
space memory must be in-core and wired prior to attempting a map
load operation. Pages may be locked using vslock(9).
bus_dmamap_unload(dmat, map)
Unloads a DMA map. Arguments are as follows:
dmat DMA tag used to allocate map.
map The DMA map that is to be unloaded.
bus_dmamap_unload() will not perform any implicit synchronization
of DMA buffers. This must be done explicitly by a call to
bus_dmamap_sync() prior to unloading the map.
bus_dmamap_sync(dmat, map, op)
Performs synchronization of a device visible mapping with the CPU
visible memory referenced by that mapping. Arguments are as
follows:
dmat DMA tag used to allocate map.
map The DMA mapping to be synchronized.
op Type of synchronization operation to perform. See the
definition of bus_dmasync_op_t for a description of the
acceptable values for op.
The bus_dmamap_sync() function is the method used to ensure that
the CPU’s accesses and the device’s direct memory accesses (DMA) to
shared memory are coherent. For example, the CPU might be used to set up the
contents of a buffer that is to be made available to a device.
To ensure that the data are visible via the device’s mapping of
that memory, the buffer must be loaded and a DMA sync operation
of BUS_DMASYNC_PREWRITE must be performed after the CPU has
updated the buffer and before the device access is initiated. If
the CPU modifies this buffer again later, another
BUS_DMASYNC_PREWRITE sync operation must be performed before an
additional device access. Conversely, suppose a device updates
memory that is to be read by a CPU. In this case, the buffer
must be loaded, and a DMA sync operation of BUS_DMASYNC_PREREAD
must be performed before the device access is initiated. The CPU
will only be able to see the results of this memory update once
the DMA operation has completed and a BUS_DMASYNC_POSTREAD sync
operation has been performed.
If read and write operations are not preceded and followed by the
appropriate synchronization operations, behavior is undefined.
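The pairing of operations described above can be summarized by a helper
that picks the operation to issue before a transfer starts and the one
to issue after it completes. This is a sketch under stated assumptions:
the flag values and the int typedef are stand-ins, and the XFER_*
constants and both functions are hypothetical names, not part of the
bus_dma API.

```c
/* Stand-in definitions; the real ones come from the kernel headers. */
typedef int bus_dmasync_op_t;
#define BUS_DMASYNC_PREREAD	1
#define BUS_DMASYNC_POSTREAD	2
#define BUS_DMASYNC_PREWRITE	4
#define BUS_DMASYNC_POSTWRITE	8

/* Hypothetical transfer directions; may be OR-ed together. */
#define XFER_TO_DEVICE		1	/* CPU filled the buffer, device reads it */
#define XFER_FROM_DEVICE	2	/* device fills the buffer, CPU reads it */

/* Operation to perform before the device access is initiated. */
static bus_dmasync_op_t
sync_before(int dir)
{
	bus_dmasync_op_t op = 0;

	if (dir & XFER_TO_DEVICE)
		op |= BUS_DMASYNC_PREWRITE;
	if (dir & XFER_FROM_DEVICE)
		op |= BUS_DMASYNC_PREREAD;
	return (op);
}

/* Operation to perform after the DMA operation has completed. */
static bus_dmasync_op_t
sync_after(int dir)
{
	bus_dmasync_op_t op = 0;

	if (dir & XFER_TO_DEVICE)
		op |= BUS_DMASYNC_POSTWRITE;
	if (dir & XFER_FROM_DEVICE)
		op |= BUS_DMASYNC_POSTREAD;
	return (op);
}
```

A driver would pass the first result to bus_dmamap_sync() before
starting the device and the second once the device reports completion.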
bus_dmamem_alloc(dmat, **vaddr, flags, *mapp)
Allocates memory that is mapped into KVA at the address returned
in vaddr and that is permanently loaded into the newly created
bus_dmamap_t returned via mapp. Arguments are as follows:
dmat DMA tag describing the constraints of the DMA mapping.
vaddr Pointer to a pointer that will hold the returned KVA
mapping of the allocated region.
flags Flags are defined as follows:
BUS_DMA_WAITOK The routine can safely wait (sleep)
for resources.
BUS_DMA_NOWAIT The routine is not allowed to wait for
resources. If resources are not
available, ENOMEM is returned.
BUS_DMA_COHERENT
Attempt to map this memory such that
cache sync operations are as cheap as
possible. This flag is typically set
on memory that will be accessed
frequently by both a CPU and a DMA
engine. Use of this flag does not
remove the requirement of using
bus_dmamap_sync(), but it may reduce the
cost of performing these operations.
The BUS_DMA_COHERENT flag is currently
implemented on sparc64 and arm.
BUS_DMA_ZERO Causes the allocated memory to be set
to all zeros.
mapp Pointer to a bus_dmamap_t where the resulting DMA map
will be stored.
The size of memory to be allocated is maxsize as specified in the
call to bus_dma_tag_create() for dmat.
The current implementation of bus_dmamem_alloc() will allocate
all requests as a single segment.
An initial load operation is required to obtain the bus address
of the allocated memory, and an unload operation is required
before freeing the memory, as described below in
bus_dmamem_free(). Maps are automatically handled by this
function and should not be explicitly allocated or destroyed.
Although an explicit load is not required for each access to the
memory referenced by the returned map, the synchronization
requirements as described in the bus_dmamap_sync() section still
apply and should be used to achieve portability on architectures
without coherent buses.
Returns ENOMEM if sufficient memory is not available for
completing the operation.
bus_dmamem_free(dmat, *vaddr, map)
Frees memory previously allocated by bus_dmamem_alloc(). Any
mappings will be invalidated. Arguments are as follows:
dmat DMA tag.
vaddr Kernel virtual address of the memory.
map DMA map to be invalidated.

RETURN VALUES

Behavior is undefined if invalid arguments are passed to any of the above
functions. If sufficient resources cannot be allocated for a given
transaction, ENOMEM is returned. All routines that are not of type void
will return 0 on success or an error code on failure as discussed above.
All void routines will succeed if provided with valid arguments.

LOCKING

Two locking protocols are used by bus_dma. The first is a private global
lock that is used to synchronize access to the bounce buffer pool on the
architectures that make use of them. This lock is strictly a leaf lock
that is only used internally to bus_dma and is not exposed to clients of
the API.
The second protocol involves protecting various resources stored in the
tag. Since almost all bus_dma operations are done through requests from
the driver that created the tag, the most efficient way to protect the
tag resources is through the lock that the driver uses. In cases where
bus_dma acts on its own without being called by the driver, the lock
primitive specified in the tag is acquired and released automatically.
An example of this is when the bus_dmamap_load() callback function is
called from a deferred context instead of the driver context. This means
that certain bus_dma functions must always be called with the same lock
held that is specified in the tag. These functions include:
bus_dmamap_load()
bus_dmamap_load_uio()
bus_dmamap_load_mbuf()
bus_dmamap_load_mbuf_sg()
bus_dmamap_unload()
bus_dmamap_sync()
There is one exception to this rule. It is common practice to call some
of these functions during driver start-up without any locks held. So
long as there is a guarantee of no possible concurrent use of the tag by
different threads during this operation, it is safe to not hold a lock
for these functions.
Certain bus_dma operations should not be called with the driver lock
held, either because they are already protected by an internal lock, or
because they might sleep due to memory or resource allocation. The
following functions must not be called with any non-sleepable locks held:
bus_dma_tag_create()
bus_dmamap_create()
bus_dmamem_alloc()
All other functions do not have a locking protocol and can thus be called
with or without any system or driver locks held.

HISTORY

The bus_dma interface first appeared in NetBSD 1.3.
The bus_dma API was adopted from NetBSD for use in the CAM SCSI
subsystem. The alterations to the original API were aimed to remove the
need for a bus_dma_segment_t array stored in each bus_dmamap_t while
allowing callers to queue up on scarce resources.