Continue untangling the disklabel. Have most disk device drivers fill out
and install a generic disk_info structure instead of filling out random
fields in the disklabel.
The generic disk_info structure uses a 64 bit integer to represent
the media size in bytes or total sector count.

Change the kernel dev_t, representing a pointer to a specinfo structure,
to cdev_t. Change struct specinfo to struct cdev. The name 'cdev' was taken
from FreeBSD. Remove the dev_t shim for the kernel.
This commit generally removes the overloading of 'dev_t' between userland and
the kernel.
Also fix a bug in libkvm where a kernel dev_t (now cdev_t) was not being
properly converted to a userland dev_t.

MASSIVE reorganization of the device operations vector. Change cdevsw
to dev_ops. dev_ops is a syslink-compatible operations vector structure
similar to the vop_ops structure used by vnodes.
Remove a huge number of instances where a thread pointer is still being
passed as an argument to various device ops and other related routines.
The device OPEN and IOCTL calls now take a ucred instead of a thread pointer,
and the CLOSE call no longer takes a thread pointer.

Make the entire BUF/BIO system BIO-centric instead of BUF-centric. Vnode
and device strategy routines now take a BIO and must pass that BIO to
biodone(). All code which previously managed a BUF undergoing I/O now
manages a BIO.
The new BIO-centric algorithms allow BIOs to be stacked, where each layer
represents a block translation, completion callback, or caller or device
private data. This information is no longer overloaded within the BUF.
Translation layer linkages remain intact as a 'cache' after I/O has completed.
The VOP and DEV strategy routines no longer make assumptions as to which
translated block number applies to them. The use the block number in the
BIO specifically passed to them.
Change the 'untranslated' constant to NOOFFSET (for bio_offset), and
(daddr_t)-1 (for bio_blkno). Rip out all code that previously set the
translated block number to the untranslated block number to indicate
that the translation had not been made.
Rip out all the cluster linkage fields for clustered VFS and clustered
paging operations. Clustering now occurs in a private BIO layer using
private fields within the BIO.
Reformulate the vn_strategy() and dev_dstrategy() abstraction(s). These
routines no longer assume that bp->b_vp == the vp of the VOP operation, and
the dev_t is no longer stored in the struct buf. Instead, only the vp passed
to vn_strategy() (and related *_strategy() routines for VFS ops), and
the dev_t passed to dev_dstrateg() (and related *_strategy() routines for
device ops) is used by the VFS or DEV code. This will allow an arbitrary
number of translation layers in the future.
Create an independant per-BIO tracking entity, struct bio_track, which
is used to determine when I/O is in-progress on the associated device
or vnode.
NOTE: Unlike FreeBSD's BIO work, our struct BUF is still used to hold
the fields describing the data buffer, resid, and error state.
Major-testing-by: Stefan Krueger

Device layer rollup commit.
* cdevsw_add() is now required. cdevsw_add() and cdevsw_remove() may specify
a mask/match indicating the range of supported minor numbers. Multiple
cdevsw_add()'s using the same major number, but distinctly different
ranges, may be issued. All devices that failed to call cdevsw_add() before
now do.
* cdevsw_remove() now automatically marks all devices within its supported
range as being destroyed.
* vnode->v_rdev is no longer resolved when the vnode is created. Instead,
only v_udev (a newly added field) is resolved. v_rdev is resolved when
the vnode is opened and cleared on the last close.
* A great deal of code was making rather dubious assumptions with regards
to the validity of devices associated with vnodes, primarily due to
the persistence of a device structure due to being indexed by (major, minor)
instead of by (cdevsw, major, minor). In particular, if you run a program
which connects to a USB device and then you pull the USB device and plug
it back in, the vnode subsystem will continue to believe that the device
is open when, in fact, it isn't (because it was destroyed and recreated).
In particular, note that all the VFS mount procedures now check devices
via v_udev instead of v_rdev prior to calling VOP_OPEN(), since v_rdev
is NULL prior to the first open.
* The disk layer's device interaction has been rewritten. The disk layer
(i.e. the slice and disklabel management layer) no longer overloads
its data onto the device structure representing the underlying physical
disk. Instead, the disk layer uses the new cdevsw_add() functionality
to register its own cdevsw using the underlying device's major number,
and simply does NOT register the underlying device's cdevsw. No
confusion is created because the device hash is now based on
(cdevsw,major,minor) rather then (major,minor).
NOTE: This also means that underlying raw disk devices may use the entire
device minor number instead of having to reserve the bits used by the disk
layer, and also means that can we (theoretically) stack a fully
disklabel-supported 'disk' on top of any block device.
* The new reference counting scheme prevents this by associating a device
with a cdevsw and disconnecting the device from its cdevsw when the cdevsw
is removed. Additionally, all udev2dev() lookups run through the cdevsw
mask/match and only successfully find devices still associated with an
active cdevsw.
* Major work on MFS: MFS no longer shortcuts vnode and device creation. It
now creates a real vnode and a real device and implements real open and
close VOPs. Additionally, due to the disk layer changes, MFS is no longer
limited to 255 mounts. The new limit is 16 million. Since MFS creates a
real device node, mount_mfs will now create a real /dev/mfs<PID> device
that can be read from userland (e.g. so you can dump an MFS filesystem).
* BUF AND DEVICE STRATEGY changes. The struct buf contains a b_dev field.
In order to properly handle stacked devices we now require that the b_dev
field be initialized before the device strategy routine is called. This
required some additional work in various VFS implementations. To enforce
this requirement, biodone() now sets b_dev to NODEV. The new disk layer
will adjust b_dev before forwarding a request to the actual physical
device.
* A bug in the ISO CD boot sequence which resulted in a panic has been fixed.
Testing by: lots of people, but David Rhodus found the most aggregious bugs.

proc->thread stage 2: MAJOR revamping of system calls, ucred, jail API,
and some work on the low level device interface (proc arg -> thread arg).
As -current did, I have removed p_cred and incorporated its functions
into p_ucred. p_prison has also been moved into p_ucred and adjusted
accordingly. The jail interface tests now uses ucreds rather then processes.
The syscall(p,uap) interface has been changed to just (uap). This is inclusive
of the emulation code. It makes little sense to pass a proc pointer around
which confuses the MP readability of the code, because most system call code
will only work with the current process anyway. Note that eventually
*ALL* syscall emulation code will be moved to a kernel-protected userland
layer because it really makes no sense whatsoever to implement these
emulations in the kernel.
suser() now takes no arguments and only operates with the current process.
The process argument has been removed from suser_xxx() so it now just takes
a ucred and flags.
The sysctl interface was adjusted somewhat.