* Rip out most of the VAGEx stuff. It might come back in another form
later.

* Split the vnode_free_list into three parts, separated by two markers
(vnode_free_mid1 and vnode_free_mid2).

* Insert vnodes on the free list based on the following. New vnodes
are allocated from the base of the list.

    at the HEAD           - If the vnode is VRECLAIMED (i.e. dead)
    end of first section  - If the vnode has no cached VM or SWAP data
    end of second section - If the vnode has cached SWAP data and no cached VM
    at the TAIL           - If the vnode has cached VM data
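
The placement rule above can be sketched as a small classifier. This is
only an illustration: the enum, the struct and the function name are
hypothetical and not taken from the kernel source, and checking the
reclaimed state first is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical section names; the notes only name the two markers. */
enum free_section {
    FREE_HEAD,           /* reclaimed (dead) vnodes */
    FREE_END_OF_FIRST,   /* no cached VM or SWAP data */
    FREE_END_OF_SECOND,  /* cached SWAP data, no cached VM */
    FREE_TAIL            /* cached VM data */
};

/* Simplified stand-in for the vnode state the rule looks at. */
struct vnode_state {
    bool reclaimed;  /* VRECLAIMED is set */
    bool has_vm;     /* vnode has cached VM data */
    bool has_swap;   /* vnode has cached SWAP data */
};

/* Pick the free-list section a vnode would be inserted into. */
enum free_section
free_section_for(const struct vnode_state *vs)
{
    assert(vs != NULL);
    if (vs->reclaimed)          /* assumption: dead vnodes win outright */
        return FREE_HEAD;
    if (vs->has_vm)
        return FREE_TAIL;
    if (vs->has_swap)
        return FREE_END_OF_SECOND;
    return FREE_END_OF_FIRST;
}
```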

* Implement a rover to slowly scan vnodes in the list when allocating
and shift them to the appropriate section. This fixes a degenerate
condition in the placement of the markers.

* A vnode is removed and usually immediately reinserted whenever it
is accessed by userland but not held open, giving us an LRU-like
algorithm within each section of the list but non-LRU-like transits
between sections of the list.

Transits between sections are determined more by how the VM system
recycles related VM cache pages. Cached SWAP data only occurs if
the swapcache is turned on.

* Future: Might use VAGE to implement a second go-around in the queue
or a burst re-placement in the queue when the data set is found to
be too big to fit.

* Adjust pmap_inval_init() to enter a critical section and add
a new pmap_inval_done() function which flushes and exits it.

It was possible for an interrupt or other preemptive action to
come along during a pmap operation and issue its own pmap operation,
potentially leading to races which corrupt the pmap.

This case was tested and could actually occur, though the damage (if any)
is unknown. x86_64 machines have had a long-standing and difficult to
reproduce bug where a program would sometimes seg-fault for no reason.
It is unknown whether this fixes the bug or not.

* Interlock the pmap structure when invalidating pages using a bit
in the pm_active field.

Check for the interlock in swtch.s when switching into threads
and print a nice warning if it occurs.

It was possible for one cpu to initiate a pmap modifying operation
while another switches into a thread using the pmap the first cpu
was in the middle of modifying. The case is extremely rare but can
occur if the cpu doing the modifying operation receives an SMI
interrupt, stalling it long enough for the other cpu to switch
into the thread and resume running in userspace.
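
A userland sketch of the interlock idea, using C11 atomics. The bit
position, the struct and all function names are assumptions made for
illustration; the real check lives in swtch.s and is not reproduced
here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: reserve the top bit of pm_active as the interlock. */
#define PMAP_LOCK_BIT (UINT32_C(1) << 31)

struct pmap_sketch {
    _Atomic uint32_t pm_active;  /* low bits: cpus using this pmap */
};

/* Take the interlock before invalidating; false if already held. */
bool
pmap_interlock_try(struct pmap_sketch *pm)
{
    uint32_t old = atomic_fetch_or(&pm->pm_active, PMAP_LOCK_BIT);
    return (old & PMAP_LOCK_BIT) == 0;
}

/* Drop the interlock once the invalidation is complete. */
void
pmap_interlock_release(struct pmap_sketch *pm)
{
    atomic_fetch_and(&pm->pm_active, ~PMAP_LOCK_BIT);
}

/*
 * Switch-in path: mark the cpu active and report whether an
 * invalidation was in progress, so the caller can warn and wait.
 */
bool
pmap_switch_in(struct pmap_sketch *pm, unsigned cpu)
{
    assert(cpu < 31);
    uint32_t old = atomic_fetch_or(&pm->pm_active, UINT32_C(1) << cpu);
    return (old & PMAP_LOCK_BIT) != 0;
}
```

Keeping the interlock in the same word as the active-cpu bits lets the
switch-in path set its bit and observe the interlock in one atomic
operation.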

* pmap_protect() assumed no races when clearing PG_RW and PG_M due
to the pmap_inval operations it runs. This should in fact be
true with the above fixes. However, the rest of the pmap code
uses atomic operations so adjust pmap_protect() to also use atomic
operations.
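
A minimal sketch of clearing the two bits atomically, written with C11
atomics rather than the kernel's own atomic primitives. The function
name is hypothetical; the bit values are the standard x86 page table
entry bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PG_RW UINT64_C(0x002)  /* x86 PTE: page is writable */
#define PG_M  UINT64_C(0x040)  /* x86 PTE: page has been modified */

/*
 * Clear PG_RW and PG_M in a single atomic read-modify-write and
 * report whether the page had been dirtied beforehand.
 */
bool
pte_write_protect(_Atomic uint64_t *pte)
{
    assert(pte != NULL);
    uint64_t old = atomic_fetch_and(pte, ~(PG_RW | PG_M));
    return (old & PG_M) != 0;
}
```

A plain load, mask and store could lose a PG_M bit that the hardware
sets between the load and the store; the atomic form returns the exact
value that was replaced.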

Merge SUBDIRS and SUBDIRS3 and their LSYM* versions.
This also fixes three error messages caused by the wrong symlink
for dev/disk/mpt/mpilib, which doesn't actually belong to SUBDIRS3
because it has four directory components in the path.

We need to pack struct hammer_pseudofs_data because its size differed
between 32-bit and 64-bit machines. As this structure is sent over the
wire, this led to an early abort in the hammer mirroring code
(cmd_mirror.c) when mirroring a PFS from a 32-bit machine to a 64-bit
machine or vice versa, because it sanity-checks the packets it receives.

Even though the structure is also stored on-media, the change in size is
not an issue because the tail is zero-padded with reserved fields.
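
The size difference can be shown with a much smaller structure. The
layout below is hypothetical and is not the real
hammer_pseudofs_data; the packed attribute is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Natural layout: 12 bytes where uint64_t only needs 4-byte alignment
 * inside a struct (i386), 16 bytes where it needs 8 (x86_64), because
 * of tail padding.
 */
struct wire_rec_natural {
    uint64_t sync_tid;
    uint32_t flags;
};

/* Packed layout: no padding, so the size is the same on both. */
struct wire_rec_packed {
    uint64_t sync_tid;
    uint32_t flags;
} __attribute__((packed));

static_assert(sizeof(struct wire_rec_packed) == 12,
    "packed size must not depend on the ABI");
```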

WARNING:

After this change, mirroring between 64-bit machines predating
this commit and updated 64-bit machines will no longer work.
PLEASE UPDATE them all in one go or leave them unmodified.
32-bit machines are not affected at all, as this commit does
not change the size of the structure for 32-bit machines.

acpi_thinkpad(4): revert the half-done rename of acpi_thinkpad to acpi_ibm

* 10f976 ("Sync ACPI with FreeBSD 7.2") from 2009-11-08 broke a good deal of
functionality in acpi_thinkpad, whereas there were no interesting changes
between acpi_ibm.c#rev1.15 and acpi_ibm.c#rev1.19 on FreeBSD
* `git show 20b3fb 32af04 10f976 acpi_thinkpad.c | patch -R -p5 acpi_thinkpad.c`
brings us back to da42c7

* The manual page is still a work in progress but I'm pushing everything
I learn about SSDs into it as I learn it.

At least insofar as the Intel X25-V 40G SSD goes the vendor-specified
40TB write endurance limit appears to assume high write amplification
and significant inefficiencies in write patterns. The theoretical
write endurance limit for this SSD with static wear leveling is 400TB.

My expectation is a practical endurance somewhere between 150-250TB
when configuring 32G of swap on the 40G X25-V. The manual page will be
updated as I get better numbers from testing.

* Specify that disklabel64 should be used when labeling an SSD, so
the partitions are properly aligned. Kernels as of id 4921cba1f6
(late 2.5.x) will align the partition base for virgin disklabel64
labels to a 1MB boundary.

* Someone suggested that instead of using a 32K alignment we use a larger
alignment. I forgot who suggested it but after thinking about it a bit
and messing around with swapcache on an SSD I decided it was a good idea.

SSDs using MLC flash have a physical block size of 128K. SLC flash has
a physical block size of 64K. Most typical cluster operations in
DragonFly are 64K to 128K but clustered writes are often linear on disk
leading to larger linear writes from the point of view of the physical
drive's write cache.

swapcache and swap operations tend to have even larger write linearities
and write amplification effects on SSDs can be reduced to better than 1:2
(versus the 1:10 the vendor typically assumes).
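
As a rough check on these ratios, practical endurance can be modelled
as the theoretical limit divided by the write amplification factor.
This is a simplification and the function name is hypothetical. With
the 400TB theoretical figure quoted for the X25-V, 1:10 gives the
vendor's 40TB and 1:2 gives 200TB.

```c
#include <assert.h>

/*
 * Practical endurance in TB, assuming it scales inversely with the
 * write amplification factor (10 for 1:10, 2 for 1:2).
 */
unsigned
endurance_tb(unsigned theoretical_tb, unsigned amplification)
{
    assert(amplification > 0);
    return theoretical_tb / amplification;
}
```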

* Virgin disklabel64 labels laid down by the kernel will now align the
start of the partition space to 1MB (1024 * 1024). In for a penny,
in for a pound.
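
Rounding a byte offset up to the 1MB boundary is the usual
power-of-two mask trick. A minimal sketch, with a hypothetical macro
and function name rather than the kernel's own:

```c
#include <assert.h>
#include <stdint.h>

#define LABEL_ALIGN (UINT64_C(1024) * 1024)  /* 1MB, a power of two */

/* Round a byte offset up to the next 1MB boundary. */
uint64_t
align_partition_base(uint64_t offset)
{
    assert(offset <= UINT64_MAX - (LABEL_ALIGN - 1));  /* no wraparound */
    return (offset + LABEL_ALIGN - 1) & ~(LABEL_ALIGN - 1);
}
```

Since 1MB is a multiple of both the 64K and 128K flash block sizes
mentioned above, a base aligned this way is aligned for either.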

* Adjust the manual page and note the benefits of using a larger alignment,
particularly when swapcache is used with SSDs.

The algorithm selects a wait point based on the process's perceived
contribution to the inode load. The greater the contribution, the
more readily we stall the process in order to wait for related reclaims
to process.

Processes with lower loads have higher reclaim points and do not stall
as readily as they did before.

* Remove waitreclaims calls based on B-Tree scans. I'm not sure why I had
this in there but it was creating an excessive number of unnecessary
stalls, so if any problems crop up I'll have to find another way to deal
with them.

* These changes (particularly the first) should reduce unnecessary stalls
for programs not doing heavy inode operations. Hopefully that means
rm -rf and tar extractions will not have quite as detrimental an effect
on other processes as they did before.

* When a vi session managed by a git commit is ^Z'd and then resumed, git
for some reason will put stdin into O_NONBLOCK mode. Not only will it do
this, but the git process will do it in parallel with the resume so
the point at which stdin becomes non-blocking from the point of view
of vi is completely non-deterministic.

* Do an end-run around badly behaving parent processes by using extpread()
to explicitly override the blocking/non-blocking mode. Now nvi doesn't
care whether stdin is in non-blocking mode or not.
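
extpread() is DragonFly-specific, so the sketch below swaps in a
portable technique with the same goal: a read that behaves as blocking
even when someone else has set O_NONBLOCK on the descriptor. It uses
poll() and read() instead of extpread(), and both function names are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Helper for experiments: flip fd to non-blocking, as the parent does. */
int
set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Read that waits for data regardless of the O_NONBLOCK setting. */
ssize_t
read_blocking(int fd, void *buf, size_t nbytes)
{
    assert(buf != NULL || nbytes == 0);
    for (;;) {
        ssize_t n = read(fd, buf, nbytes);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        /* Non-blocking and empty: sleep until the fd is readable. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
}
```

Unlike an explicit per-call override, this loop still depends on the
descriptor's mode for the read() itself and only hides the EAGAIN from
the caller.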

* Before this patch:
o The temperature values were absolutely random, since
the /data/ variable was never initialised from temp[i].
o The braces were missing from the fan error path of
an if statement.
o Spaces were used instead of tabs.

asctime.c: Set errno to EINVAL and return "??? ??? ?? ??:??:?? ????\n" if
asctime_r() is called with a NULL struct tm pointer.
(Note that asctime_r() is called by ctime_r() and asctime();
asctime() is called by ctime().)

localtime.c: Set errno to EINVAL and return WRONG if time1() is called with
a NULL struct tm pointer; avoid dereference if a NULL struct
tm pointer is passed to timelocal(), timegm(), or timeoff().
(Note that time1 is called by mktime(), timegm(), and timeoff();
mktime is called by timelocal().)
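
The NULL handling described for asctime_r() can be sketched as a
wrapper. The wrapper and macro names are hypothetical; the real change
is made inside asctime.c itself.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for asctime_r() */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* 25 characters plus the terminating NUL: fits the 26-byte buffer. */
#define ASCTIME_UNKNOWN "??? ??? ?? ??:??:?? ????\n"

/* buf must have room for at least 26 bytes. */
char *
checked_asctime_r(const struct tm *tm, char *buf)
{
    assert(buf != NULL);
    if (tm == NULL) {
        errno = EINVAL;
        return strcpy(buf, ASCTIME_UNKNOWN);
    }
    return asctime_r(tm, buf);
}
```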

* Allow directory hierarchies to be selected for data caching when
using vm.swapcache.data_enable.

* Add the vm.swapcache.use_chflags sysctl which defaults to ON and
enables use of the new chflags flags to determine what directory
trees the swapcache will cache data from.

* Add chflags cache and noscache. The flags are tracked recursively
by the namecache and do *NOT* have to be set recursively in the
directory tree. Setting a flag in a top-level directory is sufficient
to cover the entire subtree.

chflags cache - Any regular file in the subtree will be cached
by swapcache.

chflags noscache - Disables any swapcaching of data in the subtree and
overrides any use of chflags cache in the subtree.

NOTE: Only applies to file data. The caching of file meta-data by
swapcache is controlled globally by vm.swapcache.meta_enable and
ignores chflags flags.
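
The inheritance rule can be modelled by walking from a directory up to
the root. This is only a sketch: the flag values, the struct and the
function name are hypothetical, and the kernel tracks the state in the
namecache rather than walking parents on each lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_CACHE    0x1u  /* stand-in for "chflags cache" */
#define FLAG_NOSCACHE 0x2u  /* stand-in for "chflags noscache" */

struct dir_node {
    const struct dir_node *parent;  /* NULL at the root */
    unsigned flags;
};

/* Should regular-file data under this directory be swapcached? */
bool
swapcache_data_wanted(const struct dir_node *dir)
{
    bool cache = false;

    for (; dir != NULL; dir = dir->parent) {
        assert((dir->flags & ~(FLAG_CACHE | FLAG_NOSCACHE)) == 0);
        if (dir->flags & FLAG_NOSCACHE)
            return false;   /* noscache at or above this level wins */
        if (dir->flags & FLAG_CACHE)
            cache = true;   /* cache set here or on any ancestor */
    }
    return cache;
}
```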

* Adjust the manual pages for swapcache and chflags.

* NOTE! The default has been changed to require the use of chflags. Data
caching will not occur unless you either turn off the
vm.swapcache.use_chflags sysctl (which enables data caching globally)
or do something like 'chflags cache /'. Of course vm.swapcache.read_enable
must also be turned on for swapcache to cache file data.

* NOTE! World must be rebuilt for libc, chflags, and ls to understand the
new flags.

* Non-blocking reads from a BPF device not in immediate mode would not
rotate the buffers even if there was data in the store buffer but none in
the hold buffer. So until the store buffer filled up, the reads would
return -1 and set errno to EWOULDBLOCK.
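
The corrected behaviour can be modelled with just the two fill levels.
This is a simplified sketch and not the driver code: the struct and
function names are hypothetical and the buffers themselves are left
out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Simplified model: only the buffer fill levels are tracked. */
struct bpf_sketch {
    size_t slen;  /* bytes in the store buffer (being filled) */
    size_t hlen;  /* bytes in the hold buffer (ready to be read) */
};

/* Non-blocking read: bytes handed over, or -1 with EWOULDBLOCK. */
ssize_t
bpf_read_nonblock(struct bpf_sketch *d)
{
    assert(d != NULL);
    if (d->hlen == 0 && d->slen != 0) {
        /* Rotate: the store buffer becomes the hold buffer. */
        d->hlen = d->slen;
        d->slen = 0;
    }
    if (d->hlen == 0) {
        errno = EWOULDBLOCK;
        return -1;
    }
    ssize_t n = (ssize_t)d->hlen;
    d->hlen = 0;  /* hold buffer has been handed to the reader */
    return n;
}
```

Without the rotation step, the first branch is skipped and a reader
sees EWOULDBLOCK even though captured data is sitting in the store
buffer.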

* vinitvmio() is responsible for assigning the initial VM object size based
on the file size. Adjust vinitvmio() to conform to the new nvextendbuf()
and nvtruncbuf() API.

* vinitvmio() has been given two additional parameters, blksize and boff,
to allow it to determine how much larger the VM object must be relative
to the byte-granular file size passed to it.

* Remove vm_page_alloc() and remove the pgo_alloc vector from struct
pagerops. Convert all the VM pager allocation procedures into global
procedures which are called directly. Trying to feed everything through
a single function was a joke when all the callers knew precisely what
kind of VM object they were creating anyway.

Add the extra arguments to vnode_pager_alloc() which vinitvmio() needs
to pass in.