digitalmars.D - Re: GC Precision

A moving GC, one that doesn't stop the world on collection,
and one that's fully precise including stack would be nice, but they're several
orders of magnitude less important and would also have more ripple effects.

I agree that here doing something simple now is better than doing nothing or
doing something very refined in an unknown future.
And in the future things may be improved. In D objects are always managed by
reference, and I think that most programs don't alter or cast such references
to something else (objects allocated in memory specified by the programmer, and
scoped objects allocated on the stack may be excluded from this). So I think
that it can be safe to move objects, to compact the heap.
So you may have 5 memory zones:
- C heap. (The type system of the D compiler may see the C-heap pointers and
D-heap pointers as two different types, as I have proposed in the past. So you
need casts if you want to mix them, and the compiler can use the D moving heap
in a safer way).
- Pinned D heap for everything that can't be moved, like structs managed by
pointers (eventually a D compiler may infer at compile time that some structs
can be moved too, because their pointers are used only in clean ways, so you
don't need to introduce struct references for this). I think SafeD modules will
not use this heap much (but they can use unsafe modules that may use pinned
objects. Is D safety transitive? I think it is not, so from a SafeD module you
can call and use an unsafe module);
- Old object generation managed with compaction (I think there's no need for
the permanent objects zone in D);
- Two "from" and "to" zones for the young generation, that don't use a true
compaction strategy: young objects bounce between them;
- New generation Eden, where new objects are allocated, managed as a memory
arena.
All this has the disadvantage of requiring a more complex GC, and probably
of requiring 2-4 times more RAM at runtime. It hopefully has the advantage of
allowing new programmers, who have learnt Java at university, to program in D
almost as they do in Java. (I have found a non-synthetic Java benchmark program
that, converted to D, is something like 18 times slower on LDC. I'll put the
code on my site in the following days).
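
As a rough illustration of the Eden zone from the list above ("managed as a
memory arena"), here is a minimal sketch of a bump-pointer region. The Eden
struct, its fields and the 16-byte rounding are all hypothetical names and
choices made for this example; this is not how the D runtime GC is implemented,
and a real collector would also have to copy survivors out before a reset.

```d
// Hypothetical Eden region: one fixed buffer plus a bump offset.
struct Eden
{
    ubyte[] buffer; // backing storage for the whole region
    size_t top;     // offset of the next free byte

    this(size_t capacity)
    {
        buffer = new ubyte[](capacity);
    }

    // Returns a block of `size` bytes, or null when the region is full
    // (the point where a real collector would run a minor collection).
    void[] allocate(size_t size)
    {
        enum size_t granule = 16; // assumed rounding unit for block sizes
        immutable rounded = (size + granule - 1) & ~(granule - 1);
        if (rounded > buffer.length - top)
            return null;
        void[] block = buffer[top .. top + size];
        top += rounded;
        return block;
    }

    // Once survivors have been moved elsewhere, the region is reused at once.
    void reset()
    {
        top = 0;
    }
}

void main()
{
    auto eden = Eden(64);
    auto a = eden.allocate(10); // consumes 16 bytes of the region
    auto b = eden.allocate(20); // consumes 32 bytes of the region
    assert(a.length == 10 && b.length == 20);
    assert(eden.top == 48);
    assert(eden.allocate(32) is null); // only 16 bytes are left
    eden.reset();
    assert(eden.top == 0);
}
```

Allocation here is just an addition and a bounds check, which is why a design
like this makes allocating many short-lived objects cheap.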
Bye,
bearophile

A moving GC, one that doesn't stop the world on collection,
and one that's fully precise including stack would be nice, but they're several
orders of magnitude less important and would also have more ripple effects.

I agree that here doing something simple now is better than doing nothing or
doing something very refined in an unknown future.
And in the future things may be improved. In D objects are always managed by
reference, and I think that most programs don't alter or cast such references
to something else (objects allocated in memory specified by the programmer, and
scoped objects allocated on the stack may be excluded from this). So I think
that it can be safe to move objects, to compact the heap.

The current implementation of toHash in Object does that:

    return cast(hash_t)cast(void*)this;
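
Since that default hash is the object's address, it would change if a moving
collector relocated the object. One way around it, sketched below, is to derive
the hash from the object's contents instead. The Point class is hypothetical,
and the signatures assume a present-day druntime where Object.toHash returns
size_t (hash_t in the snippet above is the older name).

```d
// Hypothetical class, used only to illustrate an address-independent hash.
class Point
{
    int x, y;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Hash computed from field values, so it stays the same
    // even if the object were moved to a different address.
    override size_t toHash() @trusted nothrow
    {
        return cast(size_t) x * 31 + cast(size_t) y;
    }

    // Equality has to agree with toHash for associative-array keys.
    override bool opEquals(Object o)
    {
        auto other = cast(Point) o;
        return other !is null && x == other.x && y == other.y;
    }
}

void main()
{
    auto a = new Point(1, 2);
    auto b = new Point(1, 2);
    assert(a !is b);                  // two distinct objects
    assert(a.toHash() == b.toHash()); // same contents, same hash

    int[Point] counts;
    counts[a] = 1;
    assert(b in counts);              // lookup goes through toHash/opEquals
}
```

The catch is that such a hash changes when the fields change, so objects used
as keys should not be mutated while they are in the table.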
