Merge shrink-tcr branch. This enables the 32-bit Windows lisp to run
on 64-bit Windows.

On 32-bit x86 ports, we expect to use a segment register to point to a
block of thread-local data called the TCR (thread context record).
This has always been kind of a bother on 32-bit Windows: we have been
using a kludge that allows us to use the %es segment register
(conditionalized on WIN32_ES_HACK).

Unfortunately, 64-bit Windows doesn't support using an LDT. This is
why the 32-bit lisp wouldn't run on 64-bit Windows.

The new scheme is to use some of the TlsSlots? (part of the Windows
TEB) for the most important parts of the TCR, and to introduce an "aux
vector" for the remaining TCR slots. Since %fs points to the TEB, we
can make this work. We reserve the last 34 (of 64) slots for our use,
and will die if we don't get them.

This change introduces some ugliness. In lisp kernel C files, there's
a TCR_AUX(tcr) macro that expands to "tcr->aux" on win32, and to "tcr"
elsewhere.

If lisp or lap code has a pointer to a TCR, it's necessary to subtract
off target::tcr-bias (which on Windows/x86 is #xe10, the offset from
%fs to the TlsSlots? in the Windows TEB). We also sometimes have to load
target::tcr.aux to get at data which has been moved there.

These changes should only affect Windows/x86. The story on the other
platforms is just the same as before.