Comments


We need object pooling support at a deeper level than is currently possible. The strongest evidence: Roslyn, the C# compiler itself, ended up implementing multiple object pools to meet its performance goals. See here:

ASP.NET Core

Discussion

In today's world of increasing concurrency, shared objects that are treated as readonly/immutable during parallel processing always need deterministic cleanup when the last task completes.

Arrays/Buffers - We end up pooling these every single time.
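The buffer case at least has first-class library support today. A minimal sketch of the rent/return pattern we keep re-implementing, using the real `System.Buffers.ArrayPool<T>` API:

```csharp
using System;
using System.Buffers;

class BufferExample
{
    static void Main()
    {
        // Rent a buffer of at least 4096 bytes from the shared pool.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            // Note: the rented array may be larger than requested.
            Console.WriteLine(buffer.Length >= 4096); // True
            // ... fill and process buffer ...
        }
        finally
        {
            // Return it so the next caller reuses the same memory.
            ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
        }
    }
}
```

This works well for raw arrays; the complaints below are about everything that is *not* an array.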

POCOs - Small message-like objects that hover dangerously in Gen 1 are used over and over again, yet are treated no differently by the GC. (POCOs need to be reference types for polymorphism/pattern matching without boxing when they are put in a queue.) Readonly structs, ref returns, Span, and stackalloc are great steps for processing on the stack, but they do not address the inevitable need to call some form of .ReturnToPool(). Things will need to get buffered and will end up stored off the stack; it's unavoidable in queued scenarios. Value types are not your friends here either, as you're going to be boxing and unboxing like crazy. The actor model is alive and well and happening more and more with pattern matching and increased parallelism.
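To make the POCO case concrete, here is a minimal sketch of the pattern described above (the `PooledMessage` type and its fields are hypothetical, not from the original text): a reference type that can sit in queues and be pattern-matched without boxing, but must be explicitly handed back to its pool.

```csharp
using System.Collections.Concurrent;

// Hypothetical pooled message POCO. A reference type so it can be
// queued and pattern-matched without boxing, recycled by hand
// because the runtime offers no first-class support for this.
sealed class PooledMessage
{
    private static readonly ConcurrentQueue<PooledMessage> Pool = new();

    public int OrderId { get; private set; }
    public decimal Price { get; private set; }

    public static PooledMessage Rent(int orderId, decimal price)
    {
        if (!Pool.TryDequeue(out var msg))
            msg = new PooledMessage();
        // Re-initialize recycled state by hand; there is no way to
        // re-run the constructor on an existing instance.
        msg.OrderId = orderId;
        msg.Price = price;
        return msg;
    }

    public void ReturnToPool()
    {
        OrderId = 0;
        Price = 0m;
        Pool.Enqueue(this);
    }
}
```

The unsolved part is not the pool itself but deciding *when* the last consumer may call `ReturnToPool()`, which is the reference-counting problem raised below.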

There needs to be some way to achieve pause-free memory control in our ecosystem, with first-class support for reference counting as well as custom allocators. It's not just one feature or keyword that will solve this. Further, let's tell the GC to treat certain objects as memory critical and that they must be handled differently. This includes banning certain object instances from entering Gen 1/Gen 2/LOH during garbage collection. (This could be a type-based policy, but it should also be applicable to just a subset of instances.) Even defining our own "phase" with a GC-as-a-service paradigm would be lovely. In other words, let's write code to help the GC, not replace or fight it. This is not a place for being declarative.

Destructor-like behavior - (not finalizers, as we need to access managed memory) We need tight, deterministic cleanup if we're working with pools, and we need to be able to set a guarantee on when it runs.

GC Policy - Set a GC config on an object that says: "Do not ever promote this object to Gen 2. Run a delegate/destructor as either a callback or a Task/ValueTask on the ThreadPool, or on a thread reserved by the application. Do not pause the world for this object. It's the application's job to return it to a pool. The pool is marked as not to be compacted; do not put it in the LOH." This will not be solved by new keywords similar to using() blocks, and it likely won't be able to be declarative like other solutions.
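For contrast, the closest the runtime currently gets to this kind of policy is process-wide, not per-object. A hedged sketch using the real `GC.TryStartNoGCRegion` and `GCSettings.LatencyMode` APIs (the class name and budget size here are illustrative):

```csharp
using System;
using System.Runtime;

class NoGcWindow
{
    static void Main()
    {
        // Closest existing knob: ask the runtime to pre-reserve enough
        // memory that no collection occurs inside the region. This is
        // best-effort and process-wide -- nothing like the per-object
        // policy proposed above.
        if (GC.TryStartNoGCRegion(16 * 1024 * 1024))
        {
            try
            {
                // ... latency-critical work that must not be paused ...
                // (If allocations exceed the budget, the region ends
                // early and EndNoGCRegion will throw.)
            }
            finally
            {
                if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
                    GC.EndNoGCRegion();
            }
        }

        // Trades throughput for fewer blocking Gen 2 collections,
        // again for the whole process, not for chosen objects.
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
    }
}
```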

MemoryPool<T> - Doesn't get the job done, unfortunately. IMemoryOwner<T> is a reference type, so the owner objects themselves need to be pooled if we acquire and release our objects frequently, and we have to roll our own reference counting on top of them. Looking at the implementations, the best shot is a heuristic to avoid CPU cache thrashing via thread-local storage, which ends up causing local cache starvation in pipelined producer/consumer scenarios. We can try to wrap IMemoryOwner<T> in a struct to avoid further allocations, yet we end up treating that as a mutable handle. (Mutable value types are evil, yet looking at the pooling implementations above... you see handles that are stateful structs with comments warning you.)
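A sketch of the struct-wrapper workaround described above (the `PooledMemoryHandle` type is hypothetical). It saves the per-acquire allocation but is exactly the kind of stateful mutable struct the text warns about: copy the struct, and two handles now race to dispose one owner.

```csharp
using System;
using System.Buffers;

// Hypothetical mutable struct handle over a pooled IMemoryOwner<byte>.
// Avoids allocating a wrapper object per acquire, at the cost of
// copy-semantics hazards: a copied handle can double-dispose.
struct PooledMemoryHandle : IDisposable
{
    private IMemoryOwner<byte>? _owner;

    public static PooledMemoryHandle Rent(int size) =>
        new PooledMemoryHandle { _owner = MemoryPool<byte>.Shared.Rent(size) };

    public Memory<byte> Memory =>
        _owner?.Memory
        ?? throw new ObjectDisposedException(nameof(PooledMemoryHandle));

    public void Dispose()
    {
        _owner?.Dispose(); // returns the buffer to the pool
        _owner = null;     // only clears THIS copy of the struct
    }
}
```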

When the ever-looming day comes that we hit a pause from Gen 2, there is absolutely no good solution to this problem given current runtime or language support. WeakReference does not get it done. ConditionalWeakTable still needs our own form of dirty GC or pooling of the WeakReferences themselves, because as you add WeakReferences, you end up with a ton of them in the finalizer queue.

The Snowflake GC modification paper mentions this:

Finally, we have also built shareable reference counted objects, RefCount, but we are considering API extensions in this space as important future work.

Reference counting needs to be solved at the same time to support queuing scenarios and immediate release of scarce buffers. What we have right now with pooled objects on the stack is at least manageable. It all really breaks when we go to shared pointers. Having an immutable, pooled, object in a logging queue and a network queue immediately sends us back to square one. There is an argument to be made for incremental improvement and doing deterministic cleanup later. However, for such a fundamental change to memory management, leaving clean reference counting as a TODO would be a mistake as it has not historically worked out well.
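What hand-rolled reference counting for the multi-queue scenario looks like today, as a minimal sketch (the `RefCounted<T>` type is hypothetical): each queue takes a reference before enqueueing, and the last `Release()` returns the payload to its pool.

```csharp
using System;
using System.Threading;

// Hypothetical hand-rolled reference counting for an immutable,
// pooled payload shared by several consumers (e.g. a logging queue
// and a network queue). The last Release() returns it to the pool.
sealed class RefCounted<T> where T : class
{
    private T? _value;
    private int _count;
    private readonly Action<T> _returnToPool;

    public RefCounted(T value, Action<T> returnToPool)
    {
        _value = value;
        _returnToPool = returnToPool;
        _count = 1;
    }

    public T Value =>
        _value ?? throw new ObjectDisposedException(nameof(RefCounted<T>));

    public void AddRef() => Interlocked.Increment(ref _count);

    public void Release()
    {
        if (Interlocked.Decrement(ref _count) == 0)
        {
            var v = _value!;
            _value = null;
            _returnToPool(v); // deterministic return of the scarce resource
        }
    }
}
```

This is exactly the fragility the text is arguing against leaving to application code: one missed `Release()` leaks the buffer, one extra `Release()` returns it while another queue still holds it.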

For writing pools and factories, we really need support for treating constructors as general delegates. We need, and use, this pattern every day: list.Select(x => new Foo(x)), Factory.Create(x => new Foo(x)), or we learn the hard way that the new() generic constraint uses Activator.CreateInstance and can't take any constructor arguments. I wish I could write Factory.Create(Foo.constructor) and have the constructor converted to an open delegate. Most importantly, you end up having to give pooled instances an Initialize(x, y) method for when they are recycled; otherwise, they have no way to be stateful. Let me call a constructor on an existing object as many times as I like, in the same memory location, without invalidating references to that object (foo.constructor(x)). Last I checked, we could hack this together through IL if we wanted to. (The memory layout is deterministic after all, right?)
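The factory half of this can be approximated today with compiled expression trees, which is roughly what libraries do to avoid Activator.CreateInstance. A hedged sketch (the `ConstructorDelegate` helper and `Foo` type are hypothetical names for illustration):

```csharp
using System;
using System.Linq.Expressions;

static class ConstructorDelegate
{
    // Builds a Func<TArg, T> that invokes T's (TArg) constructor
    // directly: one-time reflection + compile, no per-call
    // Activator.CreateInstance cost afterwards.
    public static Func<TArg, T> Create<TArg, T>()
    {
        var ctor = typeof(T).GetConstructor(new[] { typeof(TArg) })
            ?? throw new MissingMethodException(typeof(T).Name, ".ctor");
        var arg = Expression.Parameter(typeof(TArg));
        return Expression.Lambda<Func<TArg, T>>(
            Expression.New(ctor, arg), arg).Compile();
    }
}

// Hypothetical example type:
sealed class Foo
{
    public int X { get; }
    public Foo(int x) => X = x;
}

// Usage:
//   var make = ConstructorDelegate.Create<int, Foo>();
//   var foo = make(42);   // foo.X == 42
```

This only covers creating *new* instances, though; re-running a constructor on an existing instance to recycle it, as described above, still has no supported equivalent.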

Lastly, over almost 16 years of working in .NET, every single project has needed an object pool at some point. It's no longer premature optimization but an inevitability. In financial software, you end up with a gazillion tiny objects for orders and quotes that love ending up in Gen 2. For video, audio, and big unmanaged resources, you will end up pooling, and it will happen after you've written a lot of great code, with value types turning into classes and making things worse. (But hey, it's not a buffer!) For gaming, you'd better hope your unpredictable GC pause finishes while still leaving time for your physics processing. (You're just going to drop frames, because you only have 15 ms that you're already squeezing as much work into as you can.)

For C# language feature discussions - "The runtime doesn't support IL for ___" seems to come up whenever this area is discussed. That's why I wrote it all here: the problem is too big not to bring it all together, because that's how we write programs - runtime and language.

I can't express enough how much fixing this area will benefit the community. It's been my priority 0 since 2008.


Thank you for the detailed feedback. I do not think it will help to have yet another issue with a general discussion that mixes multiple topics together. These discussions never go anywhere. I would recommend focusing on the more concrete proposals:

Snowflake: A lot of energy was spent on this project. It was hard to demonstrate performance benefits that would justify the complex programming model. I agree that it is a promising idea and it would be nice to iterate on it further. Maybe it can be turned into something real eventually.