* Fix a rare race condition where the acquisition of p_token in the
tsleep callout code can delay the setting of TDF_TIMEOUT, potentially
causing the timeout to skip the current tsleep entirely and trigger
on a later tsleep.

If this occurs, the later callout is not terminated and tsleep() can
return with it still active. Since the callout is declared on the
kernel stack, this leads to the assertion and crash.

* During evaluation I noticed that the corrupted callout structure in
Rumko's crash dump contained information that indicated it was part of
a stack frame. I think only tsleep() declares callout structures on the
stack.
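The hazard can be sketched with a minimal user-space model. This is a
hypothetical single-threaded simulation, not the real tsleep() code; only
the names TDF_TIMEOUT and callout are borrowed from the kernel:

```c
#include <stdbool.h>

/* If the handler's setting of TDF_TIMEOUT is delayed (by p_token
 * acquisition) past the point where tsleep checks the flag, tsleep
 * returns with its stack-declared callout still armed. */
#define TDF_TIMEOUT 0x0001

struct callout {
    bool active;
};

static int td_flags;

static void timeout_handler(void)
{
    td_flags |= TDF_TIMEOUT;    /* normally fires before the check */
}

/* Returns true if the stack-declared callout outlives this frame. */
static bool tsleep_model(bool handler_delayed_by_token)
{
    struct callout co = { .active = true }; /* declared on the stack */

    if (!handler_delayed_by_token)
        timeout_handler();                  /* normal ordering */

    if (td_flags & TDF_TIMEOUT) {           /* timeout consumed in time */
        td_flags &= ~TDF_TIMEOUT;
        co.active = false;
    }

    if (handler_delayed_by_token)
        timeout_handler();  /* too late: stale flag hits a LATER tsleep */

    return co.active;
}
```

In the delayed case the function returns with the callout still armed and
the stale TDF_TIMEOUT pending, mirroring how the real bug left an active
callout pointing into a dead stack frame.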

* Fix an issue where a cpu-bound user process running in a vkernel cannot
be interrupted from within the vkernel.

The problem occurs because the timer interrupt was not marked MPSAFE,
causing the interrupt thread to hold the MP token which then prevented
the thread preemption code from letting the timer interrupt thread
preempt the currently running user process.

* Fixed by marking the timer interrupt and other vkernel interrupt
handlers as being MPSAFE.

* This is a problem for the vkernel but not for normal kernels. Normal
kernels have a doreti function which 'catches' pending flags on
any attempt to return to userland.

The vkernel does not, instead relying on the preemption mechanic to
catch pending flags.
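The preemption rule involved can be sketched as follows. This is a
hypothetical model, not the actual lwkt preemption code; the field name
td_mpsafe is illustrative:

```c
#include <stdbool.h>

/* A non-MPSAFE interrupt thread must acquire the MP token to run.
 * If the token is already held, preemption of the currently running
 * user process is refused -- which is exactly how the cpu-bound
 * vkernel process ended up uninterruptible. */
struct thread {
    bool td_mpsafe;     /* illustrative: handler marked MPSAFE? */
};

static bool mp_token_held;

static bool can_preempt(const struct thread *itd)
{
    if (!itd->td_mpsafe && mp_token_held)
        return false;   /* would stall on the MP token */
    return true;
}
```

Marking the timer interrupt MPSAFE removes the token dependency, so the
preemption check always succeeds regardless of who holds the token.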

The reason why we had to lower WARNS to 0 for xlint/lint1 was that
gcc 4.4.2 was unable to fit the value for LDBL_MAX from <float.h>
into a long double on i386 due to some misconfiguration of the
compiler.

* The code to generate the section __start_set and __end_set symbols
was using exp_provide() instead of exp_assign(), and exp_provide()
appears to silently discard the symbol due to the symbol being
assigned to '.' (the origin).
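In linker-script terms the two internal helpers correspond roughly to the
following forms. This is an illustrative fragment only (the set name is a
common example, not taken from this change; the actual fix was in the
linker's internal symbol-assignment code, not in a script):

```
PROVIDE(__start_set_sysctl_set = .);    /* exp_provide(): may be discarded */
__start_set_sysctl_set = .;             /* exp_assign(): always defined    */
```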

* Mark buffers created via write() as B_NOTMETA if hammer's double buffer
mode is enabled. When both the hammer double buffer mode and swapcache
are enabled, this will cause the system to re-read the file from disk
once (via the block device) before attempting to swapcache it.

* This allows swapcache to more efficiently cache file data without
vnode recycling from a limited kern.maxvnodes value getting in the way.

If you have a large dataset spread across many small files that would
normally overwhelm maxvnodes, or even on a large system handling very
large data sets where you wish to cache the file data for only some of
the files (using the use_chflags=1 mode), this makes it possible to
cache ALL the file data AND meta-data on the SSD even though the
related vnodes cached by the kernel get recycled.

* Whereas it may have been inefficient to turn on vm.swapcache.data_enable
before, due to filesystem scans and such, it may now be possible to turn
this feature on with double buffering also enabled.

Note that you must still be cognizant of the aggregate amount of file
data being accessed by your system if you have set use_chflags to 0;
you simply no longer need to worry about how many files that data
belongs to.
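For reference, the knobs discussed above expressed as sysctl.conf-style
entries. Treat the exact variable names and values as assumptions to
verify against your system before applying:

```
vfs.hammer.double_buffer=1      # HAMMER double buffer mode
vm.swapcache.data_enable=1      # allow swapcache to cache file data
vm.swapcache.use_chflags=0      # cache all file data, not only chflags-marked files
```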

* Enabling HAMMER's double_buffer mode will reduce performance somewhat for
the normal best-case file caching, but it will also greatly improve
performance once you start blowing out your memory caches.

* Allow this flag to be set for VM pages associated with regular files
too; the flag prevents the related VM page from being swapcache'd.

The flag is set by HAMMER on normal file buffer cache buffers when
double buffering is enabled to prevent swapcache from caching the
data twice.

* This also fixes an issue when a large number of files exceeding the
maxvnode limit are recycled and double buffering is enabled along
with vm.swapcache.data_enable. We do not want swapcache to try to
cache the pages via the vnode; instead we'd rather it cache them
via the block device (whose vnode doesn't get recycled).

This problem was triggered by clamav. As the comment in zconf.h states,
we'd prefer to always define Z_HAVE_UNISTD_H, but libstand has some issues
with this, which is why we originally had the change to the vendor
source to include <unistd.h> in gzguts.h.

While we're here, there's no point in defining HAVE_MEMCPY in
Makefile.stand, since it's already defined elsewhere in zconf.h.

* Flapping on one of the member interfaces can cause the entire bridge
to go down due to all member interfaces entering a transient state.
For example, if openvpn is flapping, the related tap interfaces will
go up and down without any actual packet traffic making it across.

With these changes openvpn flapping no longer makes the bridge
effectively non-operational.

* When a port is disabled or enabled either manually or due to a TAP
process going away / attaching, only issue a configuration update
when transitioning out of an active state.

Thus disabled<->l1blocking flip-flopping does not cause the other
member interfaces to change state.

* Also change the initial state setup when LINK1 is flagged.
Go into the L1BLOCKING state instead of the BLOCKING state.
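The transition rule above can be sketched as follows. This is a
hypothetical model; the enum and function names are illustrative, not the
actual if_bridge code:

```c
#include <stdbool.h>

/* A port state change triggers a bridge-wide configuration update only
 * when the port leaves an active (learning/forwarding) state, so
 * disabled <-> l1blocking flip-flopping is a no-op for the other
 * member interfaces. */
enum port_state {
    PS_DISABLED,
    PS_L1BLOCKING,
    PS_BLOCKING,
    PS_LEARNING,
    PS_FORWARDING
};

static bool state_is_active(enum port_state s)
{
    return s == PS_LEARNING || s == PS_FORWARDING;
}

static bool needs_reconfig(enum port_state oldst, enum port_state newst)
{
    return state_is_active(oldst) && !state_is_active(newst);
}
```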

If a thread has a hold on a vm_object and enters hold_wait (via either
vm_object_terminate or vm_object_collapse), it will wait forever for the hold
count to hit 0. Record the threads holding an object in a per-object array.
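A minimal sketch of that bookkeeping, with hypothetical names and a
fixed-size array (the real change may differ in detail):

```c
#include <stddef.h>

#define VMOBJ_DEBUG_HOLDERS 16      /* illustrative array size */

struct thread { int td_dummy; };    /* stand-in for the kernel's thread */

struct vm_object {
    int hold_count;
    /* Records which threads currently hold the object, so a stuck
     * hold_wait can identify the holder from a crash dump. */
    struct thread *debug_hold_thrs[VMOBJ_DEBUG_HOLDERS];
};

static void vm_object_hold(struct vm_object *obj, struct thread *td)
{
    for (int i = 0; i < VMOBJ_DEBUG_HOLDERS; ++i) {
        if (obj->debug_hold_thrs[i] == NULL) {
            obj->debug_hold_thrs[i] = td;
            break;
        }
    }
    obj->hold_count++;
}

static void vm_object_drop(struct vm_object *obj, struct thread *td)
{
    for (int i = 0; i < VMOBJ_DEBUG_HOLDERS; ++i) {
        if (obj->debug_hold_thrs[i] == td) {
            obj->debug_hold_thrs[i] = NULL;
            break;
        }
    }
    obj->hold_count--;
}
```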

* Consolidate the unicast target interface selection code into a single
routine.

* bridge_start() now calls the unicast target interface selection routine.
Prior to this, packets originating on the machine containing the bridge
did not select the proper target interface when bonding was operational,
and would also not fall back to a backup interface if the learned target
interface went offline.

* Fix an issue where we were assuming that a root bridge receiving a
configuration packet from a remote bridge would get a path cost that
already includes the root bridge's path cost for that port.
In fact the target bridge only includes an aggregate path cost to
root (typically the lowest path cost of all the target bridge's
ports), which is a fixed value.
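The corrected accounting amounts to the following. This is illustrative
arithmetic, not the actual if_bridge code:

```c
/* A remote bridge advertises only its aggregate cost to the root; the
 * receiving bridge must add its own port's path cost on receipt rather
 * than assume it was already included. */
static unsigned int path_cost_to_root(unsigned int advertised_cost,
                                      unsigned int local_port_cost)
{
    return advertised_cost + local_port_cost;
}
```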

8254: Don't set up the 8254 interrupt if it is not selected as the
interrupt cputimer

The 8254 interrupt is set up mainly to support C-states deeper than C1;
however, on some systems it can cause the system to freeze during boot.
Change the default value of hw.i8254.intr_disable to 1 so that more
systems can boot by default.
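For systems that do want the 8254 interrupt (e.g. for C-states deeper
than C1), the new default can presumably be overridden at boot with a
loader tunable (illustrative fragment):

```
hw.i8254.intr_disable="0"   # re-enable the 8254 interrupt despite the new default
```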

* Fix a conditional that was testing the wrong field when determining
how to reflush a directory or regular file. This could prevent sync
sequences from properly syncing the entire dependency chain during
heavy filesystem activity.

* No known bugs are related to this fix, as the chains would eventually
get flushed by the filesystem syncer anyway.

* Rearrange the handling of TDF_RUNNING, making lwkt_switch() responsible
for it instead of the assembly switch code. Adjust td->td_switch() to
return the previously running thread.

This allows lwkt_switch() to process thread migration between cpus after
the thread has been completely and utterly switched out, removing the
need to loop on TDF_RUNNING on the target cpu.

* Fixes lwkt_setcpu_remote livelock failure

* This required major surgery on the core thread switch assembly, testing
is needed. I tried to avoid doing this but the livelock problems persisted,
so the only solution was to remove the need for the loops that were causing
the livelocks.

* NOTE: The user process scheduler is still using the old giveaway/acquire
method. More work is needed here.
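The TDF_RUNNING handover described above can be sketched as follows.
This is a hypothetical user-space model; td_switch and the flag handling
only mimic the real code:

```c
#define TDF_RUNNING 0x0001

struct thread {
    int td_flags;
};

static struct thread *current_td;   /* models the cpu's current thread */

/* Models td_switch(): performs the switch and, per the change above,
 * returns the previously running thread to its caller. */
static struct thread *model_td_switch(struct thread *ntd)
{
    struct thread *otd = current_td;
    current_td = ntd;
    return otd;
}

static void lwkt_switch_model(struct thread *ntd)
{
    struct thread *otd;

    ntd->td_flags |= TDF_RUNNING;
    otd = model_td_switch(ntd);
    /* The old thread is completely switched out by this point, so C
     * code clears TDF_RUNNING here instead of the assembly path -- the
     * target cpu no longer needs to loop waiting for the flag. */
    otd->td_flags &= ~TDF_RUNNING;
}
```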