Re: write alignment matters?

On Jun 22, 2011, at 08:28 , der Mouse wrote:
> The issue for me is not that the hardware does or doesn't have
> alignment restrictions. It's that they show through to userland (and
> in a very peculiar way). As someone mentioned upthread, it's possible
> what's going on is that this hardware has alignment issues (at least
> when used with our sequencer program) the driver _doesn't_ deal with.

Back in the day (my memory is that this goes back to v7 UNIX, and probably
earlier) "raw" device driver access had various significant restrictions
imposed by the device which were more or less unmitigated by the kernel. The
classic example is disk drivers - the man pages basically said, "don't expect
to be able to do I/O to these devices with buffers and I/O sizes of anything
other than integer multiples of 512 bytes." Sometimes the man pages were
explicit: if you wanna do disk I/O in any other random size, use the block
interface and let the kernel handle it for you.
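
The size half of that rule can be sketched in a few lines of C. This is
illustrative only: SECTOR_SIZE and both helper names are assumptions made up
for the example, and a real program should take the sector size from the
device's documentation rather than hard-coding 512.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumed; check the device's man page */

/* Nonzero if len is a whole number of sectors. */
static int
is_sector_multiple(size_t len)
{
	return len % SECTOR_SIZE == 0;
}

/* Round len up to the next whole number of sectors. */
static size_t
round_up_sector(size_t len)
{
	/* Caller error if the rounded value would not fit in a size_t. */
	assert(len <= SIZE_MAX - (SECTOR_SIZE - 1));
	return (len + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}
```

An application would pad its final short transfer out to round_up_sector(len)
before handing it to the raw device, or fall back to the block interface.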

Now, at minimum, these restrictions need to be documented, on a per-device
basis, right there in the man page for each device. UNIX has a tradition of man
pages which discuss device limitations (bugs) frankly, and we should continue
that to encourage better device design and purchases. Call a spade a spade.

After that comes a series of design decisions over who has to do what in order
to make this work, and workable: the kernel, the libraries, and the user space
application, again, on a per-device basis (or at least on a per "class of
device" basis).

If the kernel does nothing at all, the application programmer is stuck more or
less writing a device driver for user space. If we're going to impose that on
them, our libraries (and the documentation thereof) had better be very helpful;
e.g., calloc(3) mentions alignment on a more or less "tell me what you wanna
store, and I'll align for it" basis, but isn't more explicit than that; we
might need a new API for explicitly aligned buffers, even if all that's
required is a macro wrapper on calloc(3).
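
As a rough idea of what such an API might look like, here is a minimal sketch
built on posix_memalign(3) rather than on calloc(3) itself. The name
aligned_calloc and its argument order are hypothetical; nothing like it exists
in libc under that name as far as this example assumes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return nmemb * size zeroed bytes whose address is a
 * multiple of align, or NULL on failure. Release the result with free(3).
 */
static void *
aligned_calloc(size_t align, size_t nmemb, size_t size)
{
	void *p = NULL;
	size_t total;

	/* posix_memalign(3) wants a power of two no smaller than a pointer. */
	assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

	/* Refuse nmemb * size overflow, as a careful calloc(3) does. */
	if (size != 0 && nmemb > SIZE_MAX / size)
		return NULL;

	total = nmemb * size;
	if (total == 0)
		total = align;	/* avoid a zero-byte request */

	if (posix_memalign(&p, align, total) != 0)
		return NULL;
	memset(p, 0, total);
	return p;
}
```

A caller doing raw disk I/O might ask for aligned_calloc(512, nsectors, 512);
the right alignment is whatever the device's man page says it is.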

There are endian swapping macros from the BSD networking code ... but those are
specified in terms of Internet endianness (big-endian) versus ... whatever you
have on the host. Not quite what you'd need for dealing with a little-endian
device on a big-endian host, for example. NetBSD 5's byteorder(3) man page
doesn't have a reference to bswap(3), but I see that -current now does, as of a
bit more than a month ago (thank you, jruoho).
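
For the little-endian-device case, one host-independent approach is to
assemble the value a byte at a time, which gives the same answer on big- and
little-endian hosts and does not care how the field is aligned. This is only a
sketch; dev_get_le32 and dev_put_le32 are names invented for the example.
Systems that provide le32toh()/htole32() style macros can use those instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit field at byte offset off in a device buffer. */
static uint32_t
dev_get_le32(const unsigned char *buf, size_t len, size_t off)
{
	assert(off <= len && len - off >= 4);	/* field must fit in buf */
	return (uint32_t)buf[off] |
	    (uint32_t)buf[off + 1] << 8 |
	    (uint32_t)buf[off + 2] << 16 |
	    (uint32_t)buf[off + 3] << 24;
}

/* Store v as a little-endian 32-bit field at byte offset off. */
static void
dev_put_le32(unsigned char *buf, size_t len, size_t off, uint32_t v)
{
	assert(off <= len && len - off >= 4);	/* field must fit in buf */
	buf[off] = (unsigned char)(v & 0xff);
	buf[off + 1] = (unsigned char)((v >> 8) & 0xff);
	buf[off + 2] = (unsigned char)((v >> 16) & 0xff);
	buf[off + 3] = (unsigned char)((v >> 24) & 0xff);
}
```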

I'd rather not think about processor cache coherency & flushes to RAM from user
space.

Also, given that we often go to raw device access for performance, what
warnings should the kernel give if what the application has asked for will
hurt, e.g., when the kernel has to do byte copies into bounce buffers just to
make a given I/O request work? Or, for der Mouse's case, unaligned I/O that
might result in garbage?
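
To make the bounce-buffer cost concrete, here is a user-space sketch of the
same idea: write directly when the caller's buffer is suitably aligned,
otherwise copy through an aligned temporary and say so. It is not how any
kernel does it; write_bounced and its "bounced" flag are hypothetical, and the
required alignment is assumed to come from the device's documentation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around write(2). *bounced is set to 1 when the data
 * had to be copied into an aligned bounce buffer first, so the caller can
 * log (or complain about) the slow path.
 */
static ssize_t
write_bounced(int fd, const void *buf, size_t len, size_t align, int *bounced)
{
	void *bb = NULL;
	ssize_t n;
	int error;

	/* posix_memalign(3) wants a power of two no smaller than a pointer. */
	assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

	*bounced = 0;
	if (len == 0 || (uintptr_t)buf % align == 0)
		return write(fd, buf, len);	/* fast path: no copy */

	error = posix_memalign(&bb, align, len);
	if (error != 0) {
		errno = error;
		return -1;
	}
	memcpy(bb, buf, len);			/* the byte copy that hurts */
	*bounced = 1;
	n = write(fd, bb, len);
	free(bb);
	return n;
}
```

Whether the slow path should be silent, logged, or refused outright is exactly
the policy question; the flag just makes the choice visible to the caller.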

The question at the bottom (even if der Mouse's problem turns out merely to be
a bug) is: to what degree do we want even the kernel raw device interface to a
particular device to correct for that device's idiosyncrasies and thus make a
flawed device behave at least as well from a programmer's perspective as all
other devices in its class (disk drivers, serial devices, network interfaces,
etc.)? How much abstraction will we provide in raw device access?

Erik <fair%netbsd.org@localhost>