On Mon, 2004-12-20 at 21:56 +1100, Manuel M T Chakravarty wrote:
> > For some reason (as yet undiscovered) the serialisation is very slow and
> > memory hungry. On my machine it takes 16 seconds to parse all of gtk/gtk
> > but 45 seconds to serialise all that to disk.
>
> My only *guess* would be that to serialise, you force some/all of the
> semantic analysis of the C AST that usually only occurs lazily for those
> parts of the header that are needed for the binding of the currently
> compiled .chs file. It depends on what information exactly you
> serialise.
Actually it turns out not to be that. It was my first suspicion too, so
I generated DeepSeq instances for everything (with DrIFT) and ran that
before serialising. I inserted timing points in key places. It turned
out that the DeepSeq took very little time at all (some time, so the
deepSeq was actually working) but the serialisation still took forever.
It seems that the serialisation allocates enormous amounts of garbage
which is why it takes so long. Simon M reckons that ghc's Binary module
should run in constant space (well, log stack space) when the right
optimisations are used. I'll probably have to analyse the optimised core
code to see what's really going on, if it is doing allocation anywhere.
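
For anyone wanting to repeat the measurement, here is a minimal sketch of
the force-then-time approach. It is not the c2hs code: it assumes the
deepseq package (Control.DeepSeq) in place of the DrIFT-generated DeepSeq
instances, and the Decl type, the timed helper and the serialise function
are hypothetical stand-ins for the real C AST and for ghc's Binary module.

```haskell
import Control.DeepSeq (NFData (..), force)
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Hypothetical stand-in for the C AST; the real c2hs types are far larger.
data Decl = Decl String [Decl]

-- Hand-written here; the original instances were generated with DrIFT.
instance NFData Decl where
  rnf (Decl name kids) = rnf name `seq` rnf kids

-- Run an action and return its result plus the CPU time used, in seconds.
-- getCPUTime reports picoseconds, hence the division by 1e12.
timed :: IO a -> IO (a, Double)
timed act = do
  t0 <- getCPUTime
  x  <- act
  t1 <- getCPUTime
  return (x, fromIntegral (t1 - t0) / 1e12)

-- Hypothetical stand-in serialiser (assumption: NOT ghc's Binary module).
serialise :: Decl -> String
serialise (Decl name kids) = name ++ "(" ++ concatMap serialise kids ++ ")"

main :: IO ()
main = do
  let ast = Decl "root" [Decl ("d" ++ show i) [] | i <- [1 .. 100000 :: Int]]
  -- Phase 1: force the whole structure, so no lazy work is left over.
  (ast', tForce) <- timed (evaluate (force ast))
  -- Phase 2: serialise the already-forced structure and time that alone.
  (n, tSer) <- timed (evaluate (length (serialise ast')))
  putStrLn ("force:     " ++ show tForce ++ "s")
  putStrLn ("serialise: " ++ show tSer ++ "s (" ++ show n ++ " chars)")
```

If the second figure dominates even though the structure was fully forced
beforehand, the cost is in the serialiser itself rather than in deferred
lazy evaluation. Running the program with +RTS -s should also show how much
was allocated in total.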
Duncan