jme: Don't immediately recycle the TX descriptor even if it is owned by us.

This chip always updates a TX descriptor's 32-bit fields in order, so
even if the status field has been updated, i.e. OWN is cleared, it still
does not mean that the buflen field has been updated. To avoid this race
we don't immediately recycle the TX descriptor currently being checked.
Instead, the next TX descriptor's OWN bit is checked; if it is cleared,
then the update of the currently checked TX descriptor is really done.
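
The check described above can be sketched roughly as follows. This is an
illustration only, not the driver code: struct txd, TXD_OWN (including its
bit position) and txd_can_recycle() are hypothetical names, and the sketch
does not cover how the last queued descriptor is handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXD_OWN 0x80000000u	/* hypothetical bit position of the OWN flag */

/* Hypothetical, simplified TX descriptor; the chip writes fields in order. */
struct txd {
	uint32_t flags;		/* status word carrying OWN */
	uint32_t buflen;	/* written back by the chip after flags */
};

/*
 * Return true only if descriptor idx has OWN cleared AND the next
 * descriptor in the ring has OWN cleared as well.  The second condition
 * is what tells us the chip has finished writing back descriptor idx.
 */
static bool
txd_can_recycle(const struct txd *ring, int ndesc, int idx)
{
	int next;

	assert(ndesc > 0 && idx >= 0 && idx < ndesc);

	if (ring[idx].flags & TXD_OWN)
		return false;		/* chip still owns this descriptor */

	next = (idx + 1) % ndesc;	/* ring wraps around */
	return (ring[next].flags & TXD_OWN) == 0;
}
```

The cost of this approach is that a descriptor is reclaimed one step later
than strictly necessary, in exchange for never reusing one whose buflen is
still being written.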

This is intended to fix the occasional watchdog timeouts that were observed
on this chip.

Many thanks to devinchiu@jmicron.com for providing the necessary information.

* Add uiomovebp(), a version of uiomove() which is aware of a locked bp
representing the to or from buffer and can work-around issues related
to VM faults causing recursions and deadlocks on the user buffer.

uiomovebp() does not yet detect or handle deadlocks. Implementing
deadlock handling will require a certain degree of finesse related to
the vnode and bp locks and we don't want to have to do it unless we
actually deadlock. TODO.

* When obtaining a LK_SHARED lock in a situation where you already own the
lock LK_EXCLUSIVE, lockmgr would downgrade the lock to shared.

This creates a very serious problem when large procedural recursions get
a lock that is already being held exclusively but request a shared lock.
When these recursions return the original top level will find its lock is
no longer exclusive.

* This problem occurred with vnode locks when a VOP_WRITE operation on an
mmap'd space caused a VM fault which then turned around and issued a
read(). When the fault returned, the vnode wound up locked shared instead
of exclusive.

* Fix the problem by NOT downgrading an exclusive lock to shared when
recursing on LK_SHARED. Simply add another count to the exclusive
lock.
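
A toy model of the fixed behavior, for illustration only. This is not the
lockmgr implementation; struct toy_lock, toy_acquire() and toy_release()
are hypothetical, and the model never sleeps (it returns false where a real
lock would block).

```c
#include <assert.h>
#include <stdbool.h>

enum toy_req { TOY_SHARED, TOY_EXCLUSIVE };

/* Hypothetical, single-threaded model of a recursive shared/exclusive lock. */
struct toy_lock {
	int excl_owner;		/* id of the exclusive holder, or -1 */
	int excl_count;		/* recursion count of the exclusive holder */
	int shared_count;	/* number of shared holders */
};

static bool
toy_acquire(struct toy_lock *lk, int who, enum toy_req req)
{
	if (lk->excl_owner == who) {
		/*
		 * Recursion by the exclusive holder.  Even if the request
		 * is TOY_SHARED, do NOT downgrade; just add another count
		 * to the exclusive lock.
		 */
		lk->excl_count++;
		return true;
	}
	if (lk->excl_owner != -1)
		return false;		/* held exclusively by someone else */
	if (req == TOY_SHARED) {
		lk->shared_count++;
		return true;
	}
	if (lk->shared_count != 0)
		return false;		/* shared holders block exclusive */
	lk->excl_owner = who;
	lk->excl_count = 1;
	return true;
}

static void
toy_release(struct toy_lock *lk, int who)
{
	if (lk->excl_owner == who) {
		assert(lk->excl_count > 0);
		if (--lk->excl_count == 0)
			lk->excl_owner = -1;
	} else {
		assert(lk->shared_count > 0);
		lk->shared_count--;
	}
}
```

In this model, when the recursion returns and releases its count, the top
level still holds the lock exclusively, which is the property the old
downgrade behavior broke.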

* vm_object_page_collect_flush() was trying to re-protect VM pages that
were still marked dirty after pageout I/O was initiated without owning
the BUSY bit on the page. This operation could race whatever I/O was
going on and cause multiple issues. Remove the re-protect.

Just don't do it. It's an unnecessary operation. We still re-set
PG_CLEANCHK on the page and that should be fine insofar as the pageout
daemon goes.

With disks >2TB this step will lead to some whining from fdisk(8), but
since we'll boot anyway (with 34ea800d, we resort to the media size
when a maxed out slice size is detected), just ignore any issues which
fdisk(8) reports in this case.

After this commit, installing and booting from that installation on a
disk which is >2TB will work (tested with a 3TB ahci attached drive as
well as with a 4.5TB hptrr(4) RAID).

However, we are subtracting 1 from it (presumably because it's a "last
sector on the device" value starting at 0) so in CAM, it ended up being
0xfffffffe, resulting in disks attached via ahci(4) and sili(4) being
limited to 2TB.

To fix, set the local var to 0 in this case, so that after subtracting 1
from the value (cast to 32 bit) CAM gets 0xffffffff.
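
The arithmetic can be checked with a small standalone sketch. The helper
name last_lba32() is hypothetical and only mirrors the "subtract 1, then
cast to 32 bit" step described above; it is not the CAM code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the value a 32-bit "last sector" field receives
 * after subtracting 1 from a 64-bit local variable and truncating.
 */
static uint32_t
last_lba32(uint64_t val)
{
	return (uint32_t)(val - 1);
}

/* The two cases discussed in the text. */
static void
last_lba32_selfcheck(void)
{
	/* Before the fix: 0xffffffff - 1 yields 0xfffffffe. */
	assert(last_lba32(0xffffffffULL) == 0xfffffffeU);
	/* After the fix: 0 - 1 wraps, and truncation yields 0xffffffff. */
	assert(last_lba32(0) == 0xffffffffU);
}
```

Unsigned wraparound is well defined in C, so the 0 - 1 case relies on
standard behavior.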

* When boot loader support is compiled with UFS and HAMMER together, which
is the default (note: HAMMER booting has never worked well), the probe
order was to check for the HAMMER volume header first and UFS second.

* Change the probe order to check for UFS first and HAMMER second. The
reason is that a 'newfs' (for UFS) doesn't wipe the HAMMER volume header
because the UFS newfs tries to 'skip' the partition reserved area of
the disk.

This is a huge throwback to the original BSD fdisk/disklabel which put
the boot code INSIDE the 'a' partition.

* The DragonFly disklabel64 (which is now the default) does not have this
problem so we could probably at some point adjust the UFS 'newfs' code to
wipe the old 'reserve' area to really put a cap on the problem.

With the following modification:
These chips can handle ip.ip_len and tcphdr.th_sum, if they are set up
according to the Microsoft LSO specification, so ip.ip_len should not be
cleared and tcphdr.th_sum should be left as it is.

Having a man page disappear before you have read to the end (if it indeed
appears at all) is pretty annoying. Equally annoying is using the "more"
command on a file with only a few lines and seeing nothing. The default
behavior of the xterm console had few fans.

This removes the smcup and rmcup codes from xterm-basic, the basis for
the xterm console definitions. Now man pages are left on the screen
after viewing, and the pagers work as expected.

The RX max coalesce BDs is limited to 255, which means that the chip will
generate ~5800 interrupts/s when it sinks 1.48Mpps tiny packets. However,
an interrupt rate of 4000Hz is already enough for the chip to sink 1.48Mpps
tiny packets, so tick-based RX interrupt moderation should be preferred.
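
The numbers above can be reproduced with a short calculation. The helper
names are hypothetical, and the sketch assumes one tiny packet per RX BD,
which the text does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupts per second if one interrupt fires every coal_bds packets. */
static uint32_t
intr_rate_per_sec(uint32_t pps, uint32_t coal_bds)
{
	assert(coal_bds > 0);
	return (pps + coal_bds - 1) / coal_bds;	/* round up */
}

/* Packets that pile up between interrupts at a fixed interrupt rate. */
static uint32_t
pkts_per_intr(uint32_t pps, uint32_t hz)
{
	assert(hz > 0);
	return pps / hz;
}
```

With pps = 1480000 and coal_bds = 255, intr_rate_per_sec() gives 5804,
matching the ~5800 interrupts/s figure. At 4000Hz, pkts_per_intr() gives
370 packets per interrupt, which is more than the 255 BD limit can express;
this is why the lower rate has to come from tick-based moderation rather
than from BD coalescing.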