I just got curious after reading the GC analysis blog post. What
kind of features would people generally want for the GC (in the
distant murky future of 1999)?
Here are some of my nice-to-haves:
1. Thread-local GCs. D is thread-local by default, so it would
kind of make sense, and goodbye stop-the-world GC.
2. Composable custom memory block GC. The ability to malloc a
128MB memory block and create a new GC instance to manage that
block. It would only need to scan that 128MB block and not worry
about the rest of the memory and resources (with complex
destruction orders) in a 16GB heap. This way you could probably
guarantee good collection times for some subsystems in your
program and use your favorite allocator for others.
3. Callbacks to GC operations. I have a timeline profiler
implemented for my project. It would be quite cool to have GC
collection starts and stops record a timestamp for the timeline.
(Can this be done already? Hopefully without recompiling the GC.
I tried to look, but I couldn't find any hooks in the docs.)
4. Incremental GC with collection time limit. Is this even viable
for D?

1. Thread local GCs. D is by default thread local, so it kind of would
make sense and goodbye stop everything GC.

Yes, yes and yes. The moment we fix the spec and code that casts shared
to TLS at will.

2. Composable custom memory block GC. The ability to mallocate 128MB
memory block and create a new GC instance to manage that block. It would
only need to scan that 128MB block and not worry about rest of memory
and resources (with complex destruction orders) in 16GB heap. This way
you probably could guarantee good collection times for some subsystems
in your program and use your favorite allocator for others.

Not sure what benefit this has compared to just limiting the GC heap to 128MB.

3. Callbacks to GC operations. I have timeline profiler implemented for
my project. It would be quite cool to have GC collection starts and
stops record a timestamp for the timeline.
(Can this be done already? Hopefully without recompiling the GC. I tried
to look but I couldn't find any hooks in the docs.)

Hooks for statistics sound nice. You can do a poll-style check
periodically right now.
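
A minimal sketch of what such a poll-style check might look like, assuming a reasonably recent druntime where `core.memory.GC` exposes `GC.stats()` and `GC.profileStats()`. The `GcPoller` struct is a hypothetical helper for illustration, not part of druntime; a profiler could call `poll()` once per frame and put a marker on its timeline whenever the count changes.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: remembers the last observed collection count
/// so that a caller can detect collections by polling.
struct GcPoller
{
    size_t lastCollections;

    /// Returns how many collections completed since the previous poll.
    size_t poll()
    {
        auto stats = GC.profileStats();
        auto delta = stats.numCollections - lastCollections;
        lastCollections = stats.numCollections;
        return delta;
    }
}

void main()
{
    GcPoller poller;
    poller.poll(); // establish a baseline

    // Produce some garbage, then force a collection.
    foreach (i; 0 .. 1000)
    {
        auto tmp = new ubyte[](64 * 1024);
    }
    GC.collect();

    writefln("collections since last poll: %s", poller.poll());
    writefln("used: %s bytes, free: %s bytes",
             GC.stats().usedSize, GC.stats().freeSize);
}
```

Polling only tells you that a collection happened between two checks, not when it started or stopped, which is why real hooks would still be nicer.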

4. Incremental GC with collection time limit. Is this even viable for D?

Concurrent is more likely than incremental without barriers.
--
Dmitry Olshansky

1. Thread local GCs. D is by default thread local, so it kind of
would make sense and goodbye stop everything GC.

Yes, yes and yes. The moment we fix the spec and code that casts
shared to TLS at will.

Hmm. I'm not familiar with the intricacies of shared; could you
elaborate on how casting from shared causes problems with a thread-local
GC? Or is the problem casting *from* shared?
T
--
Gone Chopin. Bach in a minuet.

Hmm. I'm not familiar with the intricacies of shared; could you
elaborate on how casting from shared causes problems with a thread-local
GC? Or is the problem casting *from* shared?
T

Seems like my first post went into the aether, sorry if double posting.
The problem is generally with the transfer of things from one thread to
another. Currently this is done with good-natured casts such as
assumeUnique. The GC needs to know what is transferred to whom.
--
Dmitry Olshansky
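
To illustrate the kind of transfer described above, here is a small sketch using `std.exception.assumeUnique` and `std.concurrency`. The function name `sumInWorker` is made up for the example. The array is allocated by the main thread, yet after the cast and the `send` it is read by another thread, and nothing tells the GC that this happened.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.exception : assumeUnique;

/// Runs in the spawned thread: receives the array and reports its sum.
void worker(Tid owner)
{
    auto data = receiveOnly!(immutable(int)[])();
    long sum = 0;
    foreach (x; data)
        sum += x;
    owner.send(sum);
}

/// Hypothetical helper: hands `data` to a new thread and waits for the sum.
long sumInWorker(immutable(int)[] data)
{
    auto tid = spawn(&worker, thisTid);
    tid.send(data);
    return receiveOnly!long();
}

void main()
{
    // Built as ordinary mutable, thread-local data...
    int[] buf = new int[](4);
    foreach (i, ref x; buf)
        x = cast(int) i + 1;

    // ...then "frozen" by a cast the GC never sees.
    immutable(int)[] frozen = assumeUnique(buf);
    assert(buf is null); // assumeUnique nulls the original reference

    assert(sumInWorker(frozen) == 10);
}
```

With a per-thread GC, the memory behind `frozen` would still belong to the main thread's heap while the worker holds the only live reference to it.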

Seems like my first post went into aether, sorry if double posting.
The problem is generally with transfer of things from one thread to
another. Currently this is done with good natured casts such as
assumeUnique. The GC needs to be in the know of what is transferred to
who.

Yeah, the compiler would have to insert additional instructions when casting
to or from either shared or immutable. Otherwise, a thread-local object
could easily be on a thread other than the one that it was created on. Also,
you have the issue of it being easier to construct something as thread-local
or mutable and then make it shared or immutable, in which case, it's fine
for the object to be treated as shared or immutable by most of the code, but
it means that its type would not match the GC that was used to allocate it.
Passing the object between GCs as part of the cast would almost certainly be
required to solve that problem.
However, you do have the issue that in order to operate on shared data, you
typically have to cast the object to thread-local (after locking the
appropriate mutex, of course) and then do stuff on it as thread-local
before getting rid of all of the thread-local references and releasing the
lock. And moving the object between GCs for that would be an unnecessary
performance hit.
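
For readers unfamiliar with the lock-then-cast idiom mentioned above, a rough sketch follows. It assumes a druntime in which `core.sync.mutex.Mutex` can be constructed and locked as `shared`; the names `Counter`, `counterLock`, `increment`, and `readCounter` are invented for the example.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

struct Counter
{
    int value;
}

shared Counter counter;    // visible to all threads
shared Mutex counterLock;  // protects `counter`

shared static this()
{
    counterLock = new shared Mutex();
}

void increment()
{
    counterLock.lock();
    scope (exit) counterLock.unlock();

    // While the lock is held, temporarily treat the data as thread-local.
    Counter* local = cast(Counter*) &counter;
    local.value += 1;
    // `local` must not outlive the lock.
}

int readCounter()
{
    counterLock.lock();
    scope (exit) counterLock.unlock();
    return (cast(Counter*) &counter).value;
}

void worker()
{
    foreach (n; 0 .. 1000)
        increment();
}

void main()
{
    Thread[] workers;
    foreach (i; 0 .. 4)
    {
        auto t = new Thread(&worker);
        t.start();
        workers ~= t;
    }
    foreach (t; workers)
        t.join();

    assert(readCounter() == 4000);
}
```

The cast in `increment` is exactly the kind of short-lived shared-to-thread-local conversion for which handing the object over to another GC would be wasted work.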
And all of that is when you're just talking about code that is reasonably
well behaved and not even considering the consequences of folks being idiots
about casting to or from shared and shooting themselves in the foot by
having shared objects passed around as thread-local or vice versa without
doing the appropriate stuff with mutexes - and it doesn't take into account
all of the cases where folks keep using __gshared when they should be using
shared, potentially resulting in fun problems if the GC is then
thread-local.
And IIRC, Daniel Murphy pointed out to me at one point that there was some
issue with classes that even shot the idea of transferring objects between
GCs in the foot. But unfortunately, I don't remember the details now.
In any case, past conversations on this have led me to believe that while it
would theoretically be nice to be able to take advantage of the fact that
D's type system marks objects as being shared or thread-local and have
separate GCs per thread, it isn't actually something that would be tenable
in practice, because the corner cases are too costly to deal with if not
actually outright intractable.
A more advanced type system that has some concept of thread ownership might
be able to solve the problem, but that would almost certainly complicate the
language too much to be worth it.
- Jonathan M Davis

2. Composable custom memory block GC. The ability to mallocate
128MB memory block and create a new GC instance to manage that
block. It would only need to scan that 128MB block and not worry
about rest of memory and resources (with complex destruction
orders) in 16GB heap. This way you probably could guarantee good
collection times for some subsystems in your program and use your
favorite allocator for others.

Not sure what benefit this has compared to just limiting GC heap to 128Mb.

More flexibility, control and tools for doing mixed memory
management.
I was thinking that you could have several of these (thread
local or single threaded; edge cases and safety would be the
user's responsibility).
Basically turning the GC into a custom allocator similar to "bump
the pointer" allocators or object pools. This way a few of these
GCs could be used as a poor man's incremental GC.
I think it could be useful for applications that have tight
memory and timing budgets.
For example, in games, you typically have a couple of bigger
subsystems. Some game architectures preallocate all the memory
they need and distribute it to each subsystem using custom
allocators.
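
As a point of comparison, the preallocate-and-distribute pattern can already be sketched with the non-GC building blocks in `std.experimental.allocator`. The example below assumes that package's `Region` and `Mallocator`; the 1 MB budget is an arbitrary number for illustration. It is a plain bump-the-pointer allocator, not the block-local GC being proposed.

```d
import std.experimental.allocator.building_blocks.region : Region;
import std.experimental.allocator.mallocator : Mallocator;

void main()
{
    enum budget = 1024 * 1024; // assumed per-subsystem budget: 1 MB

    // One malloc'ed block, handed out by bumping a pointer.
    auto scratch = Region!Mallocator(budget);

    void[] a = scratch.allocate(256);
    void[] b = scratch.allocate(4096);
    assert(a.length == 256 && b.length == 4096);

    // Requests beyond the budget fail instead of growing the block.
    assert(scratch.allocate(2 * budget) is null);

    // Everything is released at once, e.g. at the end of a frame.
    scratch.deallocateAll();
    assert(scratch.allocate(budget / 2).length == budget / 2);
}
```

What the region cannot do is reclaim individual dead objects in the middle of its lifetime, which is the gap a GC confined to the same block would fill.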
Maybe some systems with a lot of small allocations could use a
"memory block local GC" and be fast enough. For example, some
sort of scripting or debug console subsystem could be fine with
32MB, but you wouldn't need to care about releasing your temp
objects. The small heap size would guarantee fast collections.
And for some sort of incremental emulation you could manually
trigger the collections and cycle between different local GCs per
frame or timestep. Also, if any of them paused for too long,
you could easily see which one and fix it.
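
Separate GC instances per memory block are hypothetical, but the "trigger collections manually at frame boundaries and see how long they take" part can be approximated today with the single global GC. A minimal sketch, assuming only `core.memory.GC` and `core.time.MonoTime`; the helper name `timedCollect` and the frame counts are made up:

```d
import core.memory : GC;
import core.time : Duration, MonoTime;
import std.stdio : writefln;

/// Hypothetical helper: runs a full collection and returns how long it took.
Duration timedCollect()
{
    immutable start = MonoTime.currTime;
    GC.collect();
    return MonoTime.currTime - start;
}

void main()
{
    // Suppress automatic collections; the runtime may still collect
    // if it would otherwise run out of memory.
    GC.disable();
    scope (exit) GC.enable();

    foreach (frame; 0 .. 10)
    {
        // Simulated per-frame temporaries.
        foreach (i; 0 .. 100)
        {
            auto tmp = new ubyte[](1024);
        }

        // Collect only at chosen frame boundaries and log the pause.
        if (frame % 5 == 4)
            writefln("frame %s: collection took %s", frame, timedCollect());
    }
}
```

The measured pause still covers the whole heap, so this does not give the bounded collection times that small per-block heaps would.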