(Adding sbcl-help CC back.)
> Is there any documentation on SB-INTROSPECT or on how to interpret the
> MAP-ROOT output? I've done as you suggested and nothing really jumps out at
> me, but I'm not sure I have the context to understand what I'm seeing.
You're seeing objects visible on the thread stack at that point. Let's
assume DOIT calls FOO, and FOO calls BAR. If you take a look at the
stack during execution of BAR from DOIT 2013, and see stuff that looks
like it belongs to the invocation of DOIT 2012, then case closed. (That
is NOT a bug as such, just the way the stack works: it is cheap scratch
space that is not always explicitly cleaned, just overwritten when
needed.)
(I think we're scrubbing stack slots better these days, though, so
this should not be happening much. Playing tricks with stack allocated
vectors is one possibility to manually scrub the stack - which might
actually be a better way of detecting this issue, now that I think of
it...)
...but really, if sticking (sb-ext:gc :full t) between DOIT calls
fixes things, then digging deeper isn't going to provide an easy /
prettier solution.
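A minimal sketch of that workaround, assuming SBCL. The DOIT below is only
a stand-in that allocates a large vector; the real DOIT is whatever your
program does.

```lisp
;; Stand-in for the real DOIT (hypothetical body): allocate something big
;; and return only a small summary, so nothing large escapes the call.
(defun doit (year)
  (let ((scratch (make-array 1000000 :element-type '(unsigned-byte 8)
                                     :initial-element 0)))
    (+ year (length scratch))))

;; Same sequence of calls as the original DOALL, with a full collection
;; forced after each one.
(defun doall ()
  (loop for year from 2009 to 2014
        collect (doit year)
        do (sb-ext:gc :full t)))
```

If the heap exhaustion goes away with this version, the garbage was
collectable all along and only the timing of collection was the problem.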
It also takes care of option #4: different compilation policies + a
side effect of generational GC. If during an early collection a huge
object happens to get promoted to an old generation, it might not be
considered for collection again until it is too late. As to *why* that
would happen with the single function and not at the REPL? If DOIT is
inlined into the body of DOALL the actual code executed might just be
different enough to affect this -- especially if DOALL has a high
debug policy.
> Likewise, any pointers to documentation and/or what I should be looking for
> in the logfile?
I recall having documented it... Oh! No, not the output... I think
mailing list archives and the source code are your best bets if you
want to understand it -- I don't think we have proper docs on it. :/
> I turned on the GC logfile and the first test run didn't crash, so maybe
> this is a heisenbug of some sort.
This makes me suspect different compilation policies. If you just want
to get on with your life, that's understandable, but the first rule in
chasing things like this is to make sure you're comparing oranges from
the same tree...
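One way to get oranges from the same tree, as a sketch: pin a single
policy before defining anything, and rule out inlining of DOIT. The
policy values and the stub body of DOIT are assumptions for illustration.
Note that a toplevel DECLAIM like this changes the global policy for
everything compiled afterwards in the same image.

```lisp
;; One explicit policy for everything being compared (example values).
(declaim (optimize (speed 1) (safety 1) (debug 1)))

;; Make sure DOALL really calls DOIT instead of inlining it.
(declaim (notinline doit))

;; Stub standing in for the real DOIT.
(defun doit (year)
  year)

(defun doall ()
  (list (doit 2009) (doit 2010) (doit 2011)
        (doit 2012) (doit 2013) (doit 2014)))

;; Print the policy currently in effect, to check that the REPL run and
;; the DOALL run are compiled the same way.
(sb-ext:describe-compiler-policy)
```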
Cheers,
-- nikodemus

On 4 February 2014 04:14, Scott Turner <srt19170@...> wrote:
> Here's where it gets puzzling. (At least to me.) The top-level function of
> my program does this:
>
> (defun doall ()
> (doit 2009)
> (doit 2010)
> (doit 2011)
> (doit 2012)
> (doit 2013)
> (doit 2014))
>
> If instead of using this, I simply execute all the "doit"s sequentially from
> the REPL, they work fine and there's no crash. Here's (room) at the end of
> executing sequentially:
I can think of at least three reasons.
1. A conservative root holding on to data, possibly due to
insufficient stack-scrubbing. (SBCL version might be interesting here.
1.1.15 is the latest that had a noteworthy improvement in this
regard.) Requiring SB-INTROSPECT and trying something like
(sb-introspect:map-root #'print sb-thread:*current-thread*)
somewhere inside the DOIT might show something interesting -- provided
you can identify objects from an earlier DOIT call that should not
show up during the later one. I'm not promising MAP-ROOT on threads is
infallible, though. (Maybe it is, but it's been too long and I don't
remember offhand.)
In this case a complete GC between DOIT calls should help.
2. If DOIT starts a thread but doesn't join it, reaping that thread
might not be quite complete by the time the next one starts, leaving
the stack of the other thread live long enough for it to be
tenured. If putting a (sleep 1) call between each DOIT helps, then
something asynchronous like that is probably going on.
3. Funkiness involving *, **, ***, etc. (I.e. the various REPL
variables.) Silly example:
> (defun foo (x) (push x *) (format t "* = ~S" *) nil)
FOO
> (foo 1)
* = (1 . FOO)
NIL
> (foo 2)
* = (2)
NIL
> (foo 3)
* = (3)
NIL
> (progn (foo 1) (foo 2) (foo 3))
* = (1)* = (2 1)* = (3 2 1)
NIL
See?
Anyone coming up with a fourth reason gets a cookie.
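For reason 2, a more direct test than (sleep 1) is to join the thread
before returning. A sketch, assuming an SBCL built with thread support;
DOIT-WITH-WORKER and its trivial body are made up for illustration.

```lisp
;; Hypothetical DOIT variant that does its work in a thread.
(defun doit-with-worker (year)
  (let ((worker (sb-thread:make-thread
                 (lambda () (* 2 year))
                 :name "doit-worker")))
    ;; JOIN-THREAD blocks until the worker has finished and returns the
    ;; worker function's value, so no thread is left running when the
    ;; next DOIT starts.
    (sb-thread:join-thread worker)))
```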
Also, using SB-EXT:GC-LOGFILE may yield more information re. the
manner of the heap filling.
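Turning the log on and off looks roughly like this. The file name is just
an example.

```lisp
;; Start logging: generation statistics are appended to this file
;; around each collection.
(setf (sb-ext:gc-logfile) "/tmp/sbcl-gc.log")

;; ... run the workload here; a forced collection stands in for it ...
(sb-ext:gc :full t)

;; Stop logging again (NIL is the default and means "no log").
(setf (sb-ext:gc-logfile) nil)
```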
Cheers,
-- nikodemus

Though I'm an absolute beginner in CL, I took a look at the code. I was
missing some information needed to describe the problem, and I couldn't
actually run the code, since the unit tests use some files that are not
in the repo (currently myfavlibrary.exe).
I think the main problem is the file-to-bytes function: you are loading
the entire file into memory (though I don't know the size of the specific
file that made the test crash). The bytes function, which splices the big
vector into a list of bytes, is another suspect.
As I said, I'm a beginner, so all of this may be wrong. I would use arrays
instead of lists in the bytes function, since you always pass the size (in
the count param). Maybe you don't need to copy slices of the big array
into new objects at all, since they become unreachable soon afterwards
(they get translated to ints, longs, or so).
As for avoiding loading the entire file into memory, I don't know a nice
solution. You can always use file-position to seek in the file, but I
don't know whether reads are still buffered when you do that.
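One possibility, sketched here with standard CL only, is to read the file
through a fixed-size buffer with read-sequence, so only one chunk is in
memory at a time. The function name and chunk size are made up; the loop
body only counts bytes and is where the real parsing would go.

```lisp
(defun count-bytes-chunked (path &key (chunk-size 65536))
  "Walk the file at PATH in CHUNK-SIZE pieces and return its length in bytes."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((buffer (make-array chunk-size :element-type '(unsigned-byte 8)))
          (total 0))
      ;; READ-SEQUENCE returns the index of the first element of BUFFER
      ;; that was not updated, i.e. the number of bytes read this round.
      (loop for end = (read-sequence buffer in)
            until (zerop end)
            do (incf total end))  ; process bytes 0..END-1 of BUFFER here
      total)))
```

The same buffer is reused for every chunk, so the garbage produced does
not grow with the size of the file.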
Also, I noticed you use loop with an index for arrays. You can use across
for vectors and subseq for slicing sequences. (I'm calling a subsequence a
"splice"; sorry if that is misleading.) A simple test over a (make-array
10000000 :element-type 'integer) ran 20% faster with across than with the
index approach when summing all the zeros in the array. Oh, and a tip:
(loop for i to (1- x) ...) == (loop for i below x ...), well, maybe you
type more this way...
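To make the suggestions concrete, here is a small sketch. The function
names are made up, and little-endian byte order in bytes-to-integer is an
assumption; the original code may need the opposite order.

```lisp
;; Index-based summing, written with BELOW instead of TO (1- n).
(defun sum-indexed (vec)
  (loop for i below (length vec)
        sum (aref vec i)))

;; The same sum with ACROSS, which works on any vector.
(defun sum-across (vec)
  (loop for x across vec
        sum x))

;; Build an integer from COUNT bytes starting at START by indexing
;; straight into the big vector, without copying a slice or building
;; a list first. Little-endian order is assumed.
(defun bytes-to-integer (vec start count)
  (loop for i from 0 below count
        sum (ash (aref vec (+ start i)) (* 8 i))))
```

When a copy really is needed, (subseq vec start end) returns a fresh
vector holding elements START up to, but not including, END.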
2014-02-04 Jacek Podkanski <ruby.object@...>:
> Adding this code at the end of my tests bring heap use back to normal.
>
> (setf bytes nil
> mem nil)
> (sb-ext:gc :full T)

Scott Turner <srt19170@...> writes:
> On Tue, Feb 4, 2014 at 11:30 AM, Jacek Podkanski <ruby.object@...
>> wrote:
>
>> So my question is: Is it OK to use explicit GC?
>
>
> Is there any difference between "sb-ext:gc" and "gc"? The latter is at
> least CL compliant.
gc is not "CL compliant", it is the same symbol as sb-ext:gc.
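This can be checked at the REPL. The sketch assumes a stock SBCL, where
the CL-USER package uses SB-EXT:

```lisp
;; GC as read in CL-USER is inherited from SB-EXT, so both names denote
;; one and the same symbol. The COMMON-LISP package itself has no symbol
;; named GC, so FIND-SYMBOL returns NIL there.
(list (eq 'cl-user::gc 'sb-ext:gc)
      (find-symbol "GC" "COMMON-LISP")
      (package-name (symbol-package 'cl-user::gc)))
```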
--
With best regards, Stas.

On Tue, Feb 4, 2014 at 11:30 AM, Jacek Podkanski <ruby.object@...
> wrote:
> So my question is: Is it OK to use explicit GC?
Is there any difference between "sb-ext:gc" and "gc"? The latter is at
least CL compliant.
-- Scott

It seems that with explicit GC, (sb-ext:gc :full T), it is possible to
run my test comfortably on a machine with 1GB RAM.
If I don't use explicit GC, the system runs out of memory.
I know it's not portable, but I can live with SBCL-specific code if it
lets me use my code on all machines.
Version I use:
SBCL 1.1.1.0.debian
So my question is: Is it OK to use explicit GC? What other options do I
have for dealing with peaks in memory use?

Sanel Zukan <sanelz@...> writes:
> I'm curious, shouldn't GC be automatically called when heap near limit
> was reached?
Yes, but: the garbage collector strategy is a "mostly copying" strategy:
data that is still live needs to be copied to a new location rather than
remaining in place. One problem that this can cause is that if there is
not much reclaimable space, the copying as part of garbage collection
itself can cause the heap to run out of space.
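A rough way to see how much room the collector has to copy into is to
compare the size of the dynamic space with what is currently in use. This
is only a sketch: HEAP-HEADROOM is a made-up helper, and
SB-KERNEL:DYNAMIC-USAGE is an internal function that may change between
SBCL versions.

```lisp
(defun heap-headroom ()
  "Return free bytes, used bytes and total bytes of the dynamic space."
  (let ((total (sb-ext:dynamic-space-size))   ; total size in bytes
        (used (sb-kernel:dynamic-usage)))     ; bytes currently allocated
    (values (- total used) used total)))
```

If the free figure is small compared to the amount of live data, a
collection may not have enough space to copy the survivors into.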
Cheers,
Christophe

On Tue, Feb 4, 2014 at 4:15 PM, Sanel Zukan <sanelz@...> wrote:
> I'm curious, shouldn't GC be automatically called when heap near limit
> was reached?
>
It definitely should. However, calling it manually is harmless, apart from
wasting CPU time.
Calling (GC :full t) first and then (ROOM) is a way to find out whether you
have generated uncollectable garbage, that is, objects you no longer need
but still hold a reference to somewhere.
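A sketch of that test, with a weak pointer added so that one particular
object can be checked in addition to the totals printed by ROOM. The
names *LEFTOVER* and STILL-REACHABLE-P are made up for illustration.

```lisp
(defvar *leftover* nil
  "Hypothetical global that still points at data we no longer need.")

(defun still-reachable-p (weak)
  "True if the object behind the weak pointer WEAK survives a full GC."
  (sb-ext:gc :full t)
  ;; WEAK-POINTER-VALUE returns the object and a second value that is
  ;; true only while the object has not been collected.
  (nth-value 1 (sb-ext:weak-pointer-value weak)))

(let* ((big (make-array 1000000 :initial-element 0))
       (weak (sb-ext:make-weak-pointer big)))
  (setf *leftover* big)                ; the forgotten reference
  (print (still-reachable-p weak)))    ; prints T: *LEFTOVER* keeps it alive

;; Heap totals after the full collection.
(room)
```

After (setf *leftover* nil) the same check will usually report NIL, though
a conservative stack root can still keep the object alive for a while.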