This means that "./sync-all commit" will record in submodules first,
and then prompt you to record a patch updating the version of the
submodules last. This should make it less likely that we forget to update
the submodules' versions.

Allow different customizations per cross target
by obtaining GlobalCrossCompilePrefix from mk/config.mk and
using that to include mk/$(GlobalCrossCompilePrefix)build.mk
instead of mk/build.mk when present.

Note: GlobalCrossCompilePrefix is basically the same
as CrossCompilePrefix, but does not depend on $(phase).

We were doing it in two different ways and asserting that the results
were the same. In most cases they were, but I found one case where
they weren't: the GC itself allocates some memory for running
finalizers, and this memory was accounted for one way but not the
other.

It was simpler to remove the old way of counting allocation than to
try to fix it up, so I did that.

In particular, when there are only a few nullary constructors, generate
regular pattern matching code rather than using con2Tag. This avoids
generating unnecessary join points, which can make the code noticeably
worse in the few-constructors case.
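
As a rough illustration of the two styles, here is a hand-written sketch
using an equality test on a small enumeration. The type Signal and the
functions eqByMatch, signalTag and eqByTag are hypothetical names chosen
for this example; signalTag only stands in for the con2Tag helper, and
the code GHC actually generates may differ.

```haskell
-- A small enumeration: every constructor is nullary.
data Signal = Stop | Caution | Go

-- Style 1: regular pattern matching, one clause per constructor.
eqByMatch :: Signal -> Signal -> Bool
eqByMatch Stop    Stop    = True
eqByMatch Caution Caution = True
eqByMatch Go      Go      = True
eqByMatch _       _       = False

-- Style 2: map each constructor to an integer tag, then compare tags.
-- signalTag is a hand-written stand-in for the con2Tag helper.
signalTag :: Signal -> Int
signalTag Stop    = 0
signalTag Caution = 1
signalTag Go      = 2

eqByTag :: Signal -> Signal -> Bool
eqByTag a b = signalTag a == signalTag b
```

Both functions compute the same result; the difference discussed above
is only in the shape of the code the compiler has to optimise.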

This patch makes the Data.Typeable.Typeable class work with arguments of any
kind. In particular, this removes the Typeable1..7 class hierarchy, greatly
simplifying the whole Typeable story. Also added is the AutoDeriveTypeable
language extension, which will automatically derive Typeable for all types and
classes declared in that module. Since there is now no good reason to give
handwritten instances of the Typeable class, those are ignored (for backwards
compatibility), and a warning is emitted.
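
A minimal sketch of what the kind-polymorphic class allows: one function,
typeRep, can be applied to type constructors of different kinds. The
bindings intRep, maybeRep, eitherRep and appliedRep are names made up
for this example, and it uses only types from base, so it does not
depend on how instances for user-defined types are obtained.

```haskell
import Data.Proxy (Proxy (..))
import Data.Typeable (TypeRep, typeRep)

-- A type of kind *.
intRep :: TypeRep
intRep = typeRep (Proxy :: Proxy Int)

-- A type constructor of kind * -> *; this used to need Typeable1.
maybeRep :: TypeRep
maybeRep = typeRep (Proxy :: Proxy Maybe)

-- A type constructor of kind * -> * -> *; this used to need Typeable2.
eitherRep :: TypeRep
eitherRep = typeRep (Proxy :: Proxy Either)

-- A fully applied type still works as before.
appliedRep :: TypeRep
appliedRep = typeRep (Proxy :: Proxy (Maybe Int))
```

All four bindings go through the same Typeable class; no numbered
variant of the class is involved.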

The old, kind-* Typeable class is now called OldTypeable, and lives in the
Data.OldTypeable module. It is deprecated, and should be removed in some future
version of GHC.