Another issue that I have come across while investigating memory use is the
heuristic used to decide whether a given 32-bit blob is a plausible reference
to a heap address.
The collector will treat a blob that "points" slightly above the current end
of the heap as plausible, assuming that the heap will soon grow such that the
reference becomes real. However, the heuristic used does not work well for
me.
Around line 927 in alloc.c, a slop factor is computed:

    expansion_slop = 8 * WORDS_TO_BYTES(min_words_allocd());

I am not sure about the reasoning behind this heuristic, but I found that
expansion_slop was just under 3x the total heap size for me. For a 350 MB
heap, the plausible ending heap address actually surpassed 0x50000000. Once
you get to this point, the probability of an arbitrary 32-bit blob
"referencing" a plausible heap address is 0.25 or greater. Uppercase letters
in strings in the text/data segment (root set) now "reference" locations in
the 0x41000000-0x5a000000 range. Once the heap size hits 1.5 GB (entirely
possible on a 32-bit machine) then effectively the entire address space is
considered plausible.
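
To make the arithmetic concrete, here is a minimal sketch in C of how the
plausible fraction of the address space can be estimated. The names
plausible_limit and plausible_fraction are hypothetical helpers written for
this message, not functions in the collector, and the heap start address is
an input I am assuming rather than something I measured.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not collector code): upper bound of the range the
 * collector would treat as plausible, i.e. heap start + heap size + slop.
 * Computed in 64 bits so that it cannot wrap on a 32-bit address space. */
uint64_t plausible_limit(uint32_t heap_start, uint64_t heap_bytes,
                         uint64_t slop_bytes)
{
    return (uint64_t)heap_start + heap_bytes + slop_bytes;
}

/* Hypothetical helper: fraction of all 32-bit values that land in
 * [heap_start, limit), with limit clamped to the 4 GB address space. */
double plausible_fraction(uint32_t heap_start, uint64_t limit)
{
    const uint64_t address_space = UINT64_C(1) << 32;

    assert(limit >= heap_start);
    if (limit > address_space)
        limit = address_space;
    return (double)(limit - heap_start) / (double)address_space;
}
```

With an assumed heap start of 0x10000000 and a plausible limit of
0x50000000, plausible_fraction returns 0.25. With a 1.5 GB heap and a slop
of 3x the heap size, plausible_limit exceeds 4 GB, so the fraction clamps
to 1.0.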
There are two consequences of such a wide plausible range:
1. Unless -DLARGE_CONFIG is used, the blacklist cache will alias a page onto
   the pages at +/- multiples of 256 MB from it. Once the heap gets over
   100 MB, it will start blacklisting itself to death, as every page tends
   to alias to a blacklisted page. -DLARGE_CONFIG effectively prevents
   aliasing on 32-bit systems, which lets you live until...
2. At 1.5 GB, again you blacklist yourself to death. I have not actually
   tried this, as my machine is not endowed with that much physical RAM. :-)
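
The aliasing in point 1 can be sketched as follows. This is an illustration
only, not the collector's actual hash: blacklist_slot and alias_stride are
hypothetical names, and I am assuming 4 KB pages and a table indexed by the
low bits of the page number. A table of 2^16 entries is what the 256 MB
figure implies under that assumption; 2^20 entries is simply the size that
would cover the whole 4 GB address space.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumption: 4 KB pages */

/* Hypothetical slot function: keep the low log_entries bits of the page
 * number. Two pages with the same low bits share a blacklist slot. */
uint32_t blacklist_slot(uint32_t addr, unsigned log_entries)
{
    assert(log_entries >= 1 && log_entries <= 20);
    return (addr >> PAGE_SHIFT) & ((UINT32_C(1) << log_entries) - 1u);
}

/* Distance in bytes between two addresses that map to the same slot. */
uint64_t alias_stride(unsigned log_entries)
{
    assert(log_entries >= 1 && log_entries <= 20);
    return UINT64_C(1) << (PAGE_SHIFT + log_entries);
}
```

With log_entries = 16 the stride is 256 MB, so 0x08001000 and 0x18001000
share a slot, and a blacklisted entry for one also blacklists the other.
With log_entries = 20 the stride is 4 GB, so no two pages in a 32-bit
address space collide.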
I am noticing plausible addresses over 1 GB for heaps in the 350-400 MB range.
Given that FreeBSD/i386 supports 3 GB of user address space by default, I
expect blacklist-related problems once my heap hits 1 GB, even with
-DLARGE_CONFIG.