"Both of these articles allude to the fact that I'm working on putting the D-Bus protocol into the kernel, in order to help achieve these larger goals of proper IPC for applications. And I'd like to confirm that yes, this is true, but it's not going to be D-Bus like you know it today. Our goal (and I use 'goal' in a very rough term, I have 8 pages of scribbled notes describing what we want to try to implement here), is to provide a reliable multicast and point-to-point messaging system for the kernel, that will work quickly and securely. On top of this kernel feature, we will try to provide a 'libdbus' interface that allows existing D-Bus users to work without ever knowing the D-Bus daemon was replaced on their system."

I should have said "execution units" instead of processes. Obviously multithreading is an improvement over multiprocessing, and multiplexing coroutines on a non-blocking thread pool is an improvement over multithreading.

But an in-kernel message bus would still ease the implementation and accelerate the performance of a modern concurrent runtime platform such as Go, whose channel type would map nicely to AF_BUS.

"I should have said 'execution units' instead of processes. Obviously multithreading is an improvement over multiprocessing, and multiplexing coroutines on a non-blocking thread pool is an improvement over multithreading."

Yeah, multithreading is a large improvement over multiprocessing in terms of efficiency. According to the next link, the minimum stack size is 16K.

So there is still a lot of per-client overhead that cannot be eliminated in the blocking thread model. This is why I'm a huge fan of nginx's concurrency model. If you're not familiar with it, it uses a number of worker processes equal to the number of CPU cores, and each process multiplexes its clients with an asynchronous readiness mechanism like epoll. This gives full concurrency across CPUs while handling each client with asynchronous IO, so each client consumes only as many resources (CPU and memory) as it actually needs, without the overhead of the synchronization primitives threads require.

I'm so happy with this model that I try to encourage others to adopt it, but implementations often compromise it (especially by using blocking file IO; Linux doesn't even support truly asynchronous, non-blocking file IO). So an application that makes heavy use of uncached file IO will probably do better with more threads, to prevent clients from blocking each other and leaving the async loop idle.

Anyways, my personal programming philosophy aside...

"But an in-kernel message bus would still ease the implementation and accelerate the performance of a modern concurrent runtime platform such as Go, whose channel type would map nicely to AF_BUS."

I'm afraid I haven't learned much about Go yet, despite a suggestion that I should. I am curious, though: how would Go incorporate a kernel bus into a language feature that ordinary Go programs would use?

In Go, goroutines are the fundamental execution units, which are scheduled on a thread pool of an appropriate size for the hardware, much like nginx. Goroutines interact using channels which exchange values of a type.

Channels may be bidirectional or unidirectional, and they may be buffered to accept a particular number of values before blocking the sending goroutines and unblocking the receiving goroutines.

When a goroutine is about to block, the runtime schedules any other unblocked goroutine on that thread rather than blocking the entire thread. When the I/O request is complete, the goroutine is unblocked and reinserted into the runqueue of goroutines.

Since you're already familiar with nginx, that should be enough to get the gist of goroutines and how the runtime might use AF_BUS to implement channels.

"So there is still a lot of per-client overhead that cannot be eliminated in the blocking thread model. This is why I'm a huge fan of nginx's concurrency model. If you're not familiar with it, it uses a number of worker processes equal to the number of CPU cores, and each process multiplexes its clients with an asynchronous readiness mechanism like epoll. This gives full concurrency across CPUs while handling each client with asynchronous IO, so each client consumes only as many resources (CPU and memory) as it actually needs, without the overhead of the synchronization primitives threads require."

Be all the fan you want, but you have to see its limitations. It's great for stateless retrieval protocols; for everything else, it depends on the case at hand.