Remove the pre-NEWPCM sound drivers and the speaker-based emulations.
In detail, the devices css, gus, gusxvi, mpu, mss, opl, pas, sb, snd,
speaker, sscape, sscape_mss, trix and uart are no longer supported.
They are ISA-only devices and the more common devices are supported by
NEWPCM too. speaker and pca collide with the timer use of the default
kernel and have been broken for a while now.

Device layer rollup commit.
* cdevsw_add() is now required. cdevsw_add() and cdevsw_remove() may specify
a mask/match indicating the range of supported minor numbers. Multiple
cdevsw_add()'s using the same major number, but distinctly different
ranges, may be issued. All devices that failed to call cdevsw_add() before
now do.
* cdevsw_remove() now automatically marks all devices within its supported
range as being destroyed.
* vnode->v_rdev is no longer resolved when the vnode is created. Instead,
only v_udev (a newly added field) is resolved. v_rdev is resolved when
the vnode is opened and cleared on the last close.
* A great deal of code was making rather dubious assumptions with regards
to the validity of devices associated with vnodes, primarily due to
the persistence of a device structure due to being indexed by (major, minor)
instead of by (cdevsw, major, minor). In particular, if you run a program
which connects to a USB device and then you pull the USB device and plug
it back in, the vnode subsystem will continue to believe that the device
is open when, in fact, it isn't (because it was destroyed and recreated).
Note also that all the VFS mount procedures now check devices
via v_udev instead of v_rdev prior to calling VOP_OPEN(), since v_rdev
is NULL prior to the first open.
* The disk layer's device interaction has been rewritten. The disk layer
(i.e. the slice and disklabel management layer) no longer overloads
its data onto the device structure representing the underlying physical
disk. Instead, the disk layer uses the new cdevsw_add() functionality
to register its own cdevsw using the underlying device's major number,
and simply does NOT register the underlying device's cdevsw. No
confusion is created because the device hash is now based on
(cdevsw,major,minor) rather than (major,minor).
NOTE: This also means that underlying raw disk devices may use the entire
device minor number instead of having to reserve the bits used by the disk
layer, and also means that we can (theoretically) stack a fully
disklabel-supported 'disk' on top of any block device.
* The new reference counting scheme prevents the stale vnode/device
associations described above by tying each device to a cdevsw and
disconnecting the device from its cdevsw when the cdevsw is removed.
Additionally, all udev2dev() lookups run through the cdevsw mask/match
and only successfully find devices still associated with an active cdevsw.
* Major work on MFS: MFS no longer shortcuts vnode and device creation. It
now creates a real vnode and a real device and implements real open and
close VOPs. Additionally, due to the disk layer changes, MFS is no longer
limited to 255 mounts. The new limit is 16 million. Since MFS creates a
real device node, mount_mfs will now create a real /dev/mfs<PID> device
that can be read from userland (e.g. so you can dump an MFS filesystem).
* BUF AND DEVICE STRATEGY changes. The struct buf contains a b_dev field.
In order to properly handle stacked devices we now require that the b_dev
field be initialized before the device strategy routine is called. This
required some additional work in various VFS implementations. To enforce
this requirement, biodone() now sets b_dev to NODEV. The new disk layer
will adjust b_dev before forwarding a request to the actual physical
device.
* A bug in the ISO CD boot sequence which resulted in a panic has been fixed.
Testing by: lots of people, but David Rhodus found the most egregious bugs.
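
The mask/match registration described above can be pictured with a small
sketch. This is not the kernel's implementation: the struct and the function
names (cdevsw_range, range_covers, range_lookup) are hypothetical, and the
only idea taken from the text is that several registrations may share one
major number as long as their minor ranges are distinct, and that lookups
only succeed against a still-active registration.

```c
#include <stddef.h>

/* Hypothetical model of one cdevsw_add() registration (not kernel code). */
struct cdevsw_range {
    const char *name;   /* driver name, for illustration only */
    int         major;  /* device major number */
    unsigned    mask;   /* bits of the minor number that are compared */
    unsigned    match;  /* required value of (minor & mask) */
    int         active; /* cleared when the registration is removed */
};

/* A minor number belongs to a registration when its masked bits match. */
static int
range_covers(const struct cdevsw_range *r, int major, unsigned minor)
{
    return r->active && r->major == major && (minor & r->mask) == r->match;
}

/* Return the first active registration covering (major, minor), or NULL. */
static const struct cdevsw_range *
range_lookup(const struct cdevsw_range *tab, size_t n, int major,
             unsigned minor)
{
    for (size_t i = 0; i < n; i++) {
        if (range_covers(&tab[i], major, minor))
            return &tab[i];
    }
    return NULL;
}
```

In this model, removing a registration amounts to clearing `active`, after
which lookups for minors in its range fail instead of returning a stale
entry.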

device switch 1/many: Remove d_autoq, add d_clone (where d_autoq was).
d_autoq was used to allow the device port dispatch to mix old-style synchronous
calls with new style messaging calls within a particular device. It was never
used for that purpose.
d_clone will be more fully implemented as work continues. We are going to
install d_port in the dev_t (struct specinfo) structure itself and d_clone
will be needed to allow devices to 'revector' the port on a minor-number
by minor-number basis, in particular allowing minor numbers to be directly
dispatched to distinct threads. This is something we will be needing later
on.

This commit represents a major revamping of the clock interrupt and timebase
infrastructure in DragonFly.
* Rip out the existing 8254 timer 0 code, and also disable the use of
Timer 2 (which means that the PC speaker will no longer go beep). Timer 0
used to represent a periodic interrupt and a great deal of code was in
place to attempt to obtain a timebase off of that periodic interrupt.
Timer 0 is now used in software retriggerable one-shot mode to produce
variable-delay interrupts. A new hardware interrupt clock abstraction
called SYSTIMERS has been introduced which allows threads to register
periodic or one-shot interrupt/IPI callbacks at approximately 1uS
granularity.
Timer 2 is now set in continuous periodic mode with a period of 65536
and provides the timebase for the system, abstracted to 32 bits.
All the old platform-integrated hardclock() and statclock() code has
been rewritten. The old IPI forwarding code has been #if 0'd out and
will soon be entirely removed (the systimer abstraction takes care of
multi-cpu registrations now). The architecture-specific clkintr() now
simply calls an entry point into the systimer and provides a Timer 0
reload and Timer 2 timebase function API.
* On both UP and SMP systems, cpus register systimer interrupts for the Hz
interrupt, the stat interrupt, and the scheduler round-robin interrupt.
The abstraction is carefully designed to allow multiple interrupts occurring
at the same time to be processed in a single hardware interrupt. While
we currently use IPI's to distribute requested interrupts from other cpu's,
the intent is to use the abstraction to take advantage of per-cpu timers
when available (e.g. on the LAPIC) in the future.
systimer interrupts run OUTSIDE THE MP LOCK. Entry points may be called
from the hard interrupt or via an IPI message (IPI messages have always
run outside the MP lock).
* Rip out timecounters and disable alternative timecounter code for other
time sources. This is temporary. Eventually other time sources, such as
the TSC, will be reintegrated as independent, parallel-running entities.
There will be no 'time switching' per se; subsystems will be able to
select which timebase they wish to use. It is desirable to reintegrate
at least the TSC to improve [get]{micro,nano}[up]time() performance.
WARNING: PPS events may not work properly. They were not removed, but
they have not been retested with the new code either.
* Remove spl protection around [get]{micro,nano}[up]time() calls; they are
now internally protected.
* Use uptime instead of realtime in certain CAM timeout tests.
* Remove struct clockframe. Use struct intrframe everywhere where clockframe
used to be used.
* Replace most splstatclock() protections with crit_*() protections, because
such protections must now also protect against IPI messaging interrupts.
* Add fields to the per-cpu globaldata structure to access timebase related
information using only a critical section rather than a mutex. However,
the 8254 Timer 2 access code still uses spin locks. More work needs to
be done here: the 'realtime' correction is still done in a single global
'struct timespec basetime' structure.
* Remove the CLKINTR_PENDING icu and apic interrupt hacks.
* Augment the IPI Messaging code to make an intrframe available to callbacks.
* Document 8254 timing modes in i386/isa/timerreg.h. Note that at the
moment we assume an 8254 instead of an 8253 as we are using TIMER_SWSTROBE
mode. This may or may not have to be changed to an 8253 mode.
* Integrate the NTP correction code into the new timebase subsystem.
* Separate boottime from basetime. Once boottime is believed to be stable
it is no longer affected by NTP or other time corrections.
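
As an illustration of the "abstracted to 32 bits" timebase mentioned above,
here is a minimal sketch of widening a free-running 16-bit counter. It is
not the code in this commit: the struct and function names are hypothetical,
and the sketch assumes the raw value comes from a down-counting 8254 counter
that wraps every 65536 ticks and is sampled at least once per wrap.

```c
#include <stdint.h>

/* Hypothetical state for widening a 16-bit down-counter to 32 bits. */
struct timebase {
    uint16_t last_hw;  /* last raw counter value that was read */
    uint32_t count;    /* accumulated 32-bit tick count */
};

/*
 * Fold a fresh raw reading into the 32-bit count.  The hardware counter
 * counts DOWN, so the elapsed ticks are (last - now) modulo 2^16.  The
 * result is only correct if this runs at least once per 65536 ticks.
 */
static uint32_t
timebase_update(struct timebase *tb, uint16_t hw_now)
{
    uint16_t delta = (uint16_t)(tb->last_hw - hw_now);

    tb->last_hw = hw_now;
    tb->count += delta;
    return tb->count;
}
```

The unsigned 16-bit subtraction is what makes the wrap case work: a reading
that is numerically larger than the previous one is treated as having passed
through zero.
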
CAVEATS:
* PC speaker no longer works
* Profiling interrupt rate not increased (it needs work to be
made operational on a per-cpu basis rather than system-wide).
* The native timebase API is function-based, but currently hardwired.
* There might or might not be issues with 486 systems due to the
timer mode I am using.
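
The one-shot scheme above, where several software timers share one
retriggerable hardware interrupt, can be sketched roughly as follows. This
is an illustrative model only: struct soft_timer and both functions are
hypothetical and are not the SYSTIMERS API. It shows two ideas from the
text: all timers due at the same moment are handled in one pass, and the
next hardware reload is the delay until the earliest armed timer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software timer; not the kernel's systimer structure. */
struct soft_timer {
    uint32_t next;    /* absolute tick at which to fire */
    uint32_t period;  /* reload interval in ticks, 0 for one-shot */
    int      armed;   /* nonzero while the timer is registered */
    int      fired;   /* callbacks delivered so far (for illustration) */
};

/* Run every timer that is due at 'now'; return how many fired. */
static int
timers_run(struct soft_timer *t, size_t n, uint32_t now)
{
    int ran = 0;

    for (size_t i = 0; i < n; i++) {
        /* Signed difference keeps the comparison valid across wrap. */
        if (t[i].armed && (int32_t)(now - t[i].next) >= 0) {
            t[i].fired++;
            ran++;
            if (t[i].period != 0)
                t[i].next += t[i].period;   /* periodic: rearm */
            else
                t[i].armed = 0;             /* one-shot: done */
        }
    }
    return ran;
}

/*
 * Ticks until the earliest armed timer, capped at max_delay.  In a real
 * system this value would be programmed into the one-shot hardware timer.
 */
static uint32_t
timers_next_delay(const struct soft_timer *t, size_t n, uint32_t now,
                  uint32_t max_delay)
{
    uint32_t best = max_delay;

    for (size_t i = 0; i < n; i++) {
        if (!t[i].armed)
            continue;
        int32_t d = (int32_t)(t[i].next - now);
        uint32_t ud = d > 0 ? (uint32_t)d : 0;  /* overdue: fire at once */
        if (ud < best)
            best = ud;
    }
    return best;
}
```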

kernel tree reorganization stage 1: Major cvs repository work (not logged as
commits) plus a major reworking of the #include's to accommodate the
relocations.
* CVS repository files manually moved. Old directories left intact
and empty (temporary).
* Reorganize all filesystems into vfs/, most devices into dev/,
sub-divide devices by function.
* Begin to move device-specific architecture files to the device
subdirs rather than throwing them all into, e.g., i386/include.
* Reorganize files related to system busses, placing the related code
in a new bus/ directory. Also move cam to bus/cam though this may
not have been the best idea in retrospect.
* Reorganize emulation code and place it in a new emulation/ directory.
* Remove the -I- compiler option in order to allow #include file
localization, rename all config generated X.h files to use_X.h to
clean up the conflicts.
* Remove /usr/src/include (or /usr/include) dependencies during the
kernel build, beyond what is normally needed to compile helper
programs.
* Make config create 'machine' softlinks for architecture specific
directories outside of the standard <arch>/include.
* Bump the config rev.
WARNING! after this commit /usr/include and /usr/src/sys/compile/*
should be regenerated from scratch.

DEV messaging stage 1/4: Rearrange struct cdevsw and add a message port
and auto-queueing mask. The mask will tell us which message functions
can be safely queued to another thread and which still need to run in the
context of the caller. Primary configuration fields (name, cmaj, flags,
port, autoq mask) are now at the head of the structure. Function vectors,
which may eventually go away, are at the end. The port and autoq fields
are non-functional in this stage.
The old BDEV device major number support has also been removed from cdevsw,
and code has been added to translate the bootdev passed from the boot code
(the boot code has always passed the now defunct block device major numbers
and we obviously need to keep that compatibility intact).
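
A rough sketch of the layout and of how an auto-queueing mask could be
consulted is given below. Everything here is hypothetical: the struct name,
the command enumeration, and the helper are invented for illustration, and
only the field ordering (configuration fields first, function vectors last)
and the purpose of the mask come from the description above.

```c
#include <stdint.h>

/* Hypothetical command indices; the real message set is not listed here. */
enum dev_cmd { DEV_OPEN, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL };

#define DEV_CMD_BIT(cmd) (1u << (cmd))

struct dev_port;  /* opaque message port; non-functional at this stage */

/* Sketch of the field ordering described in the commit message. */
struct cdevsw_sketch {
    /* Primary configuration fields come first. */
    const char      *name;
    int              cmaj;
    unsigned         flags;
    struct dev_port *port;
    uint32_t         autoq;  /* one bit per command that may be queued */
    /* Function vectors, which may eventually go away, come last. */
    int            (*d_open)(int minor);
    int            (*d_close)(int minor);
};

/* Nonzero if 'cmd' may be queued to another thread instead of being run
 * in the context of the caller. */
static int
dev_cmd_queueable(const struct cdevsw_sketch *sw, enum dev_cmd cmd)
{
    return (sw->autoq & DEV_CMD_BIT(cmd)) != 0;
}
```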

Remove the priority part of the priority|flags argument to tsleep(). Only
flags are passed now. The priority was a user scheduler thingy that is not
used by the LWKT subsystem. For process statistics assume sleeps without
P_SINTR set to be disk-waits, and sleeps with it set to be normal sleeps.
This commit should not contain any operational changes.
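
For callers being converted, the change amounts to dropping the priority
bits from the old combined argument. The sketch below assumes the
traditional 4.x-era BSD encoding, in which the priority occupies the low
bits covered by PRIMASK and PCATCH is a flag above them; the numeric values
and the helper name are assumptions for illustration and should be checked
against the tree.

```c
/* Assumed values modeled on 4.x-era BSD <sys/param.h>; verify locally. */
#define PRIMASK 0x0ff   /* low bits that formerly held the priority */
#define PCATCH  0x100   /* flag: allow signals to interrupt the sleep */
#define PRIBIO  16      /* example legacy priority value */

/* Strip the legacy priority from an old-style priority|flags argument,
 * leaving only the flags that tsleep() still accepts. */
static int
tsleep_flags_only(int old_arg)
{
    return old_arg & ~PRIMASK;
}
```

Under these assumptions, a call that used to pass PRIBIO | PCATCH now passes
just PCATCH, and a call that passed only a priority now passes 0.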

MP Implementation 1/2: Get the APIC code working again, sweetly integrate the
MP lock into the LWKT scheduler, replace the old simplelock code with
tokens or spin locks as appropriate. In particular, the vnode interlock
(and most other interlocks) are now tokens. Also clean up a few curproc/cred
sequences that are no longer needed.
The APs are left in degenerate state with non IPI interrupts disabled as
additional LWKT work must be done before we can really make use of them,
and FAST interrupts are not managed by the MP lock yet. The main thing
for this stage was to get the system working with an APIC again.
buildworld tested on UP and 2xCPU/MP (Dell 2550)

proc->thread stage 2: MAJOR revamping of system calls, ucred, jail API,
and some work on the low level device interface (proc arg -> thread arg).
As -current did, I have removed p_cred and incorporated its functions
into p_ucred. p_prison has also been moved into p_ucred and adjusted
accordingly. The jail interface tests now use ucreds rather than processes.
The syscall(p,uap) interface has been changed to just (uap). This is inclusive
of the emulation code. It makes little sense to pass a proc pointer around
which confuses the MP readability of the code, because most system call code
will only work with the current process anyway. Note that eventually
*ALL* syscall emulation code will be moved to a kernel-protected userland
layer because it really makes no sense whatsoever to implement these
emulations in the kernel.
suser() now takes no arguments and only operates with the current process.
The process argument has been removed from suser_xxx() so it now just takes
a ucred and flags.
The sysctl interface was adjusted somewhat.
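
The shape of the new calling convention can be sketched with simplified
stand-ins. None of the types below are the kernel's real definitions, and
suser_sketch() and demo_syscall() are hypothetical names; the sketch only
illustrates a system call body that receives the argument block alone and a
privilege check that implicitly uses the current process.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for kernel types. */
struct ucred { int cr_uid; };
struct proc  { struct ucred *p_ucred; };

static struct proc *curproc;  /* stand-in for the current process */

/* New-style privilege check: takes no arguments and consults the current
 * process.  Returns 0 for the superuser, EPERM otherwise. */
static int
suser_sketch(void)
{
    if (curproc != NULL && curproc->p_ucred->cr_uid == 0)
        return 0;
    return EPERM;
}

struct demo_args { int value; };

/* New-style system call body: only the argument block is passed in. */
static int
demo_syscall(struct demo_args *uap)
{
    int error = suser_sketch();

    if (error)
        return error;
    return uap->value < 0 ? EINVAL : 0;
}
```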