Add a dummy offset to the arrays generated by genassym to avoid ary[0]

The dummy offset avoids the generation of dummy arrays of size zero.
This whole code path is a hack, but after a lot of messing around
Alex and I determined that it was easier to hack it than to try to
redo the code due to complications introduced by cross-compiled
environments.
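
A minimal sketch of the idea, assuming constants are encoded as the size
of a char array (this is an illustration, not the actual genassym output;
ASSYM_BIAS, ASSYM and ASSYM_VALUE are hypothetical names). Without the
bias, a constant of 0 would ask for a zero-length array:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bias added to every encoded value so that a value of 0
 * never produces a zero-length array, which ISO C does not allow. */
#define ASSYM_BIAS 1

/* Encode a compile-time constant as the size of a char array. */
#define ASSYM(name, value) char name##_sym[(value) + ASSYM_BIAS]

/* Recover the constant by subtracting the bias from the array size. */
#define ASSYM_VALUE(name) (sizeof(name##_sym) - ASSYM_BIAS)

struct demo {
    int first;   /* offset 0: the case that needs the bias */
    int second;
};

ASSYM(DEMO_FIRST, offsetof(struct demo, first));
ASSYM(DEMO_SECOND, offsetof(struct demo, second));
```

Whatever reads the array sizes back has to subtract the same bias.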

1) remove uses of __label__, which is not supported by llvm/clang
2) remove uses of register type var __asm("ecx") and other variable
register bindings, as they are not supported by llvm/clang and are superfluous
3) add an ugly hack, conditionalized on __clang__, to allow correct
compilation of atomic_intr_cond_try()

Record that a vm_map_entry is a stack mapping. When locating free space
do not allow non-MAP_STACK mappings to use space reserved by MAP_STACK
mappings, unless MAP_FIXED is used of course.

Previously MAP_STACK mappings implied MAP_FIXED, which is not how they are
supposed to work. Implement proper hinting without MAP_FIXED.

Do not allow a normal mmap() call to use space reserved by a MAP_STACK
mapping (unless MAP_FIXED is used of course).

The proper method of making a MAP_STACK mapping inside another MAP_STACK
mapping is to use MAP_STACK | MAP_TRYFIXED. For now, though, we silently
imply MAP_TRYFIXED when MAP_STACK is specified and it will work without it.

Document MAP_TRYFIXED and make it also relax additional requirements imposed
by MAP_STACK mappings inside of MAP_STACK mappings.

Fix libthread_xu's use of MAP_STACK. Guards were not being set up properly.

MAP_STACK mappings do not immediately extend down to their base, so calling
mprotect() on the base is basically a NOP. Instead of calling mprotect() we
call mmap() with MAP_FIXED to force the guard.
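
A rough userland sketch of forcing a guard with mmap() and MAP_FIXED.
It is not the libthread_xu code: a plain anonymous mapping stands in for
MAP_STACK so that the example stays portable, and stack_with_guard() is a
hypothetical helper name.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/*
 * Map `size` bytes for a stack and force a PROT_NONE guard over the
 * lowest page.  Returns the base of the region (which is the guard
 * page), or NULL on failure.
 */
static void *
stack_with_guard(size_t size)
{
    size_t guard = (size_t)sysconf(_SC_PAGESIZE);
    void *base, *g;

    if (size <= guard)
        return NULL;
    base = mmap(NULL, size, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_PRIVATE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    /* MAP_FIXED replaces whatever is mapped at base with the guard. */
    g = mmap(base, guard, PROT_NONE,
        MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
    if (g == MAP_FAILED) {
        munmap(base, size);
        return NULL;
    }
    assert(g == base);
    return base;
}
```

Everything above the first page remains readable and writable; touching
the first page faults.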

Properly use MAP_FIXED when setting up the primary guard on the original
user stack. The address specified in the mmap() is only a hint when MAP_FIXED
is not used, and will not properly map the anonymous area. Also, new kernels
do not allow non-MAP_STACK mappings to override MAP_STACK mappings and the
user stack area is a MAP_STACK mapping, so use of MAP_FIXED is mandatory here.

signalintr() was improperly entering a critical section, preventing
sched_ithd() from being able to preempt the current thread. Adjust
so the code matches the pc32 code.

lwp0 was being assigned cpu_heavy_switch instead of cpu_lwkt_switch,
which works fine on pc32 but blows up the vkernel if process 0 gets
preempted, because vkernel LWKT processes are not assigned vmspaces.
Properly use cpu_lwkt_switch() to fix the problem.

We were not checking for pending reschedule requests when the
vmspace_ctl() call got interrupted by a signal. NOTE: There is
still a race after the check prior to re-entry into vmspace_ctl()
which needs to be closed.

This should give us a better base with which we can work up a
more thread-friendly user malloc. Buildworld performance is about
the same (just slightly faster). malloc performance is about twice as
fast as the original.

For an IPv6 v6-only address, or if the inp's address family is not known
yet (e.g. before connect(2) is called on the INET6 socket), this function
acts exactly the same as cpu0_soport() (the soport function before
this commit). If an INET6 socket is connected to an IPv4-mapped address,
then this function simply falls back to tcp_soport().
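
The decision described above could be sketched in userland as follows.
This is only an illustration: choose_soport(), its parameters and the
enum labels are hypothetical, and the real function works on an inpcb
rather than on a bare address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical labels for the two outcomes named in the text. */
enum soport_choice { SOPORT_CPU0, SOPORT_TCP };

/*
 * faddr:     foreign address of the INET6 socket
 * v6only:    nonzero if the address is v6-only
 * connected: nonzero once connect(2) has fixed the address family
 */
static enum soport_choice
choose_soport(const struct in6_addr *faddr, int v6only, int connected)
{
    if (v6only || !connected)
        return SOPORT_CPU0;     /* behave like cpu0_soport() */
    if (IN6_IS_ADDR_V4MAPPED(faddr))
        return SOPORT_TCP;      /* fall back to tcp_soport() */
    return SOPORT_CPU0;
}
```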

* Add restrict qualifier.

* Add rewind() to the list of functions which may fail and set errno.

* Improve documentation for fgetpos() and fsetpos(), and discourage
users from assuming that fpos_t is an integral type.

* Describe the restrictions on seeking on wide character streams, and also
point out that fseek() clears the ungetwc() buffer.

* Save errno from getting clobbered where appropriate.

* The resulting fseek() offset must fit in a long, as required by POSIX,
so add LONG_MAX and final tests for it.

* Disallow negative seek as POSIX requires for fseek{o}.

* Catch a few possible off_t overflows.

* Make fseek(... SEEK_CUR) fail if the current file position is unspecified.

* Move all stdio internal flags processing and setting out of __sread(),
__swrite() and __sseek() to a higher level. According to funopen(3) they
are all just wrappers around something like the standard read(2), write(2)
and lseek(2), i.e. they must not touch stdio internals because they can be
replaced with any other functions that know nothing about stdio internals.

* Rename cantwrite() to prepwrite(). The latter is less confusing,
since the macro isn't really a predicate, and it has side-effects.
Also, don't set errno if prepwrite() fails, since this is done in
prepwrite() now.

* Fix a potential deadlock in _fwalk in a threaded environment.
A file flag (__SIGN) was added to stdio.h that, when set, tells
_fwalk to skip that file during its walk. This seemed to be needed in
refill.c because each file needs to be locked when flushing.

* Document dependence of mktemp(3) on the non-reentrant arc4random(3).

* Fix a few bugs with the _gettemp() routine which implements mkstemp(),
mkstemps(), and mkdtemp().
- Add proper range checking for the 'slen' parameter passed to mkstemps().
- Try all possible permutations of a template if a collision is encountered.
Previously, once a single template character reached 'z', it would not wrap
around to '0' and keep going until it encountered the original starting
letter. In the edge case that the randomly generated starting name used
all 'z' characters, only that single name would be tried before giving up.
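
One way to picture the wrap-around is an odometer over the padding
alphabet that stops only when it has cycled back to the starting name.
The sketch below is an assumption-laden illustration, not the _gettemp()
code: the alphabet in padchar[] and the helper next_candidate() are
hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical padding alphabet, running from '0' to 'z'. */
static const char padchar[] = "0123456789abcdefghijklmnopqrstuvwxyz";

/*
 * Advance name[0..len) to the next candidate, wrapping each position
 * from the last alphabet character back to the first and carrying into
 * the next position.  Returns 0 once the name has cycled back to
 * `start` (every permutation has been tried), 1 otherwise.
 */
static int
next_candidate(char *name, const char *start, size_t len)
{
    const size_t n = sizeof(padchar) - 1;
    size_t i;

    for (i = 0; i < len; i++) {
        const char *p = memchr(padchar, name[i], n);

        assert(p != NULL);      /* name must use the alphabet */
        name[i] = padchar[((size_t)(p - padchar) + 1) % n];
        if (name[i] != padchar[0])
            break;              /* no wrap, so no carry */
    }
    return memcmp(name, start, len) != 0;
}
```

Starting from "zz", the first call yields "00" instead of giving up, and
the walk ends only after all 36 * 36 two-character names have been seen.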

* Use arc4random_uniform(3) since the modulus is not a power of 2 in _gettemp.

* Write the message to stderr, not file descriptor 2, so that perror()
writes to the correct stream if stderr has been redirected with freopen().

* Use strerror_r() to format the error message so that strerror()'s static
buffer does not get clobbered in perror().

* Move the positional argument handling code for vfprintf() to a new file,
printf-pos.c, and move common definitions to printflocal.h.

* Remove advertising clause in the copyrights.

* In rewind.c:
1) add the missing __sinit() call, as in the fseek() it pretends to be.
2) use clearerr_unlocked() since we already lock the stream before _fseeko()
3) don't zero errno at the end; it is explicitly required by POSIX as the
only method of testing for a rewind() error condition.
4) don't clearerr() if an error happens in _fseeko()
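
From the caller's side, the errno convention for rewind() looks roughly
like this. checked_rewind() is a hypothetical helper written for
illustration, not part of stdio:

```c
#include <errno.h>
#include <stdio.h>

/*
 * rewind() returns void, so errno is the only error channel: clear it
 * before the call and inspect it afterwards.  Returns 0 on success or
 * the errno value left behind by rewind().  On success the caller's
 * previous errno is restored.
 */
static int
checked_rewind(FILE *fp)
{
    int saved = errno;
    int err;

    errno = 0;
    rewind(fp);
    err = errno;
    if (err == 0)
        errno = saved;
    return err;
}
```

This only works if rewind() itself leaves errno alone on the way out,
which is the point of item 3.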

* When __SOPT is cleared, clear __SOFF too.

* Save a few cycles and don't initialize the locking fields in FILE if
they aren't going to be used later.

* Add ENVIRONMENT section to tmpnam(3) and mention there that TMPDIR is
ignored when issetugid(3) is true. Also add a SECURITY CONSIDERATIONS
section.

* In vasprintf, free the buffer when __vfprintf() fails and don't bother
trying to shrink the buffer with realloc() before returning it.

* Rework the floating point code in printf(). Significant changes:
- We used to round long double arguments to double. Now we print
them properly.
- Bugs involving '%F', corner cases of '#' and 'g' format
specifiers, and the '.*' precision specifier have been fixed.
- Added support for the "'" specifier to print thousands' grouping
characters in a locale-dependent manner.
- Implement the __vfprintf() side of hexadecimal floating point
support. All that is still needed is a routine to convert the
mantissa to hex digits one nibble at a time in the style of ultoa().

* %e conversions with precision 0 should not cause a decimal point to
be printed.
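
A small check of the expected behaviour, using only standard snprintf().
e0_has_point() is a hypothetical helper; per the C standard the decimal
point appears with precision 0 only when the '#' flag is given.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format v with precision 0 using %e (or %#e when alt is nonzero) and
 * report whether the result contains a decimal point.
 */
static int
e0_has_point(double v, int alt)
{
    char buf[32];

    snprintf(buf, sizeof(buf), alt ? "%#.0e" : "%.0e", v);
    return strchr(buf, '.') != NULL;
}
```

In the default "C" locale, 1.0 formats as "1e+00" with "%.0e" and as
"1.e+00" with "%#.0e".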

* Fix %f conversions where the number of significant digits is < expt.

* Implement __hdtoa() and __hldtoa() and enable printf() support for %a
and %A, which print floating-point numbers in hexadecimal.
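
As a usage sketch: hexadecimal output is exact for binary floating
point, so formatting with %a and parsing the result back with strtod()
should reproduce the value. hexfloat_roundtrip() is a hypothetical
helper used only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Format v with %a (no precision, so all mantissa bits are printed)
 * and parse it back.  Returns 1 if the round trip reproduces v.
 */
static int
hexfloat_roundtrip(double v)
{
    char buf[64];

    snprintf(buf, sizeof(buf), "%a", v);
    return strtod(buf, NULL) == v;
}
```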

* Add an extensible printf implementation compatible with GLIBC.

* Add support for multibyte decimal_point encodings, e.g., U+066B.

* Add support for multibyte thousands_sep encodings, e.g., U+066C.
The integer thousands' separator code is rewritten in order to
avoid having to preallocate a buffer for the largest possible
digit string with the most possible instances of the longest
possible multibyte thousands' separator. The new version inserts
thousands' separators for integers using the same code as floating point.

* Introduce a local variable and use it instead of the passed-in parameter
to get rid of restrict qualifier discarding in vswscanf().

* Set the error indicator on an attempt to write to a read-only stream
in wsetup.c.

* Move the format_arg() attribute handling to <sys/cdefs.h> where it
belongs.

Split ifnet serializer step 4/many: Add IFNET_SERIALIZE_MAIN, which is
used by polling(4) code. Now polling(4) no longer tries to hold all
of the serializers of the driver; it just holds the driver's main serializer.

Split ifnet serializer step 2/many: Add if_serialize_assert() function
pointer to ifnet, so that upper layers can assert ifnet's serialization
state. Remove the serialization state assertion on the ifnet.if_input()
path, since the serialization state normally has nothing to do with the
input processing.

These three function pointers accept an ifnet struct and an ifnet_serialize
enumeration value.

The ifnet_serialize enumeration indicates the serialization type:
IFNET_SERIALIZE_ALL:
    All of the serializers should be held. Except for if_start and if_input,
    this enumeration must be used when calling ifnet function pointers.
IFNET_SERIALIZE_TX:
    Only the transmit serializer should be held. This enumeration could be
    used when calling ifnet.if_start.
IFNET_SERIALIZE_RX:
    Only the receive serializer should be held. This enumeration could be
    used when calling ifnet.if_input.

If the NIC driver does not set these three function pointers, then if_attach()
will set them to the default ones: only one serializer (if_serializer) is used
and the ifnet_serialize parameter is ignored.

The following inline functions are added; they are thin wrappers around the
three ifnet serialize function pointers:
ifnet_serialize_{all,tx,rx}()
ifnet_deserialize_{all,tx,rx}()
ifnet_tryserialize_{all,tx,rx}()
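
A userland mock of the arrangement, to show how the defaults and the
wrappers fit together. This is a sketch under assumptions: the function
pointer field names, if_attach_defaults(), and the int used as a stand-in
lock are all hypothetical, and only three of the wrappers are shown.

```c
#include <stddef.h>

enum ifnet_serialize {
    IFNET_SERIALIZE_ALL,
    IFNET_SERIALIZE_TX,
    IFNET_SERIALIZE_RX
};

/* Mock of struct ifnet; the pointer field names are assumptions. */
struct ifnet {
    int if_serializer;          /* stand-in lock: 1 when held */
    void (*if_serialize)(struct ifnet *, enum ifnet_serialize);
    void (*if_deserialize)(struct ifnet *, enum ifnet_serialize);
    int (*if_tryserialize)(struct ifnet *, enum ifnet_serialize);
};

/* Defaults: a single serializer, and the enumeration is ignored. */
static void
if_default_serialize(struct ifnet *ifp, enum ifnet_serialize slz)
{
    (void)slz;
    ifp->if_serializer = 1;
}

static void
if_default_deserialize(struct ifnet *ifp, enum ifnet_serialize slz)
{
    (void)slz;
    ifp->if_serializer = 0;
}

static int
if_default_tryserialize(struct ifnet *ifp, enum ifnet_serialize slz)
{
    (void)slz;
    if (ifp->if_serializer)
        return 0;               /* already held */
    ifp->if_serializer = 1;
    return 1;
}

/* What attach does when the driver left the pointers unset. */
static void
if_attach_defaults(struct ifnet *ifp)
{
    if (ifp->if_serialize == NULL) {
        ifp->if_serialize = if_default_serialize;
        ifp->if_deserialize = if_default_deserialize;
        ifp->if_tryserialize = if_default_tryserialize;
    }
}

/* Thin wrappers around the function pointers. */
static inline void
ifnet_serialize_tx(struct ifnet *ifp)
{
    ifp->if_serialize(ifp, IFNET_SERIALIZE_TX);
}

static inline void
ifnet_deserialize_tx(struct ifnet *ifp)
{
    ifp->if_deserialize(ifp, IFNET_SERIALIZE_TX);
}

static inline int
ifnet_tryserialize_rx(struct ifnet *ifp)
{
    return ifp->if_tryserialize(ifp, IFNET_SERIALIZE_RX);
}
```

With the defaults in place, TX and RX contend for the same serializer; a
driver that splits its serializers would install its own handlers and
act on the enumeration.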

All of the protocol layers and most of the pseudo drivers are converted.