In my previous post, I did some rough “benchmarks” to see how message passing options behave. I got some great comments, and I thought I’d expand on that.

The baseline for this was a blocking queue, and using that we managed to get:

145,271,000 msgs in 00:00:10.4597977 for 13,888,510 ops/sec

And the async BufferBlock, using which we got:

43,268,149 msgs in 00:00:10 for 4,326,815 ops/sec.

Using LMAX Disruptor we got a disappointing:

29,791,996 msgs in 00:00:10.0003334 for 2,979,100 ops/sec

However, it was pointed out that I could significantly improve this by changing the code to:

var disruptor = new Disruptor.Dsl.Disruptor<Holder>(() => new Holder(), new SingleThreadedClaimStrategy(256), new YieldingWaitStrategy(), TaskScheduler.Default);

After which we get a very nice:

141,501,999 msgs in 00:00:10.0000051 for 14,150,193 ops/sec

Another request I got was for testing this with a concurrent queue, which is actually what it is meant for. The code is the same as for the blocking queue; we just changed Bus<string> to ConcurrentQueue<string>.

Using that, we got:

170,726,000 msgs in 00:00:10.0000042 for 17,072,593 ops/sec

And yes, this is pretty much just because I could. Any of these options delivers significantly more throughput than anything close to what I actually need.
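For reference, the concurrent queue variant can be sketched roughly as follows. This is not the exact benchmark code from the previous post: the class name, the "hello" payload, and the run duration are assumptions made for illustration, and the producer is unbounded, so memory use grows if it outpaces the consumer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: one producer, one consumer, ConcurrentQueue<string> in between.
public static class ConcurrentQueueSketch
{
    // Runs both sides for roughly `duration` and returns
    // the number of messages the consumer managed to dequeue.
    public static long Run(TimeSpan duration)
    {
        var queue = new ConcurrentQueue<string>();
        using var cts = new CancellationTokenSource(duration);
        var token = cts.Token;

        // Producer: enqueue as fast as possible until the time is up.
        var producer = Task.Run(() =>
        {
            while (!token.IsCancellationRequested)
                queue.Enqueue("hello");
        });

        // Consumer: spin on TryDequeue and count successful reads.
        long received = 0;
        var consumer = Task.Run(() =>
        {
            while (!token.IsCancellationRequested)
            {
                if (queue.TryDequeue(out _))
                    received++;
            }
        });

        Task.WaitAll(producer, consumer);
        return received;
    }

    public static void Main()
    {
        // The runs in this post lasted about 10 seconds; 1 second keeps memory use modest.
        var sw = Stopwatch.StartNew();
        long msgs = Run(TimeSpan.FromSeconds(1));
        sw.Stop();
        Console.WriteLine(
            $"{msgs:N0} msgs in {sw.Elapsed} for {msgs / sw.Elapsed.TotalSeconds:N0} ops/sec");
    }
}
```

The consumer spins instead of blocking, which is part of why this kind of test only gives a rough order of magnitude for throughput between two threads.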

Comments

Yesterday, I was curious about the Disruptor comment as well. But from a quick peek through the code, I could not see how one could get thread safety using the SingleThreadedClaimStrategy and YieldingWaitStrategy. Digging through the code made it look to me as if these were doing nothing to allow multiple threads to post/enqueue safely. Am I wrong about this? And if so, can someone please explain?

I'm pretty sure the area where you would see the greatest benefit from Disruptor was when you had multiple consumers with a dependency network between them [1][2]. Typically you would use multiple queues in the network, whereas you would only use one Disruptor ring-buffer. I'd like to see performance tests that test that scenario, which comes up more often in message passing scenarios.

@tyler: Yes, as the name suggests, SingleThreadedClaimStrategy is usable when there is a single thread acquiring entries from the ring buffer, which was the benchmarked scenario. Other claim strategies must be used when you acquire entries from multiple threads. Note: according to the Disruptor terminology, "acquiring entries" means acquiring entries to publish new values to the ring buffer.

Anyway, this kind of benchmark (single publisher/single consumer, monitor-based synchronization with no real contention, no latency measurement, etc.) is only meaningful if you just want to get a vague order of magnitude of what you can achieve in terms of throughput between two threads, relative to your use case. I think that's the point of these posts.

@Romain, thanks for the clarification. I misread Ayende's original code and thought, incorrectly, that he had two threads publishing. Somehow I missed the fact that one was posting and one was reading/receiving. I must learn to read more carefully next time.

@Jordan, not only dependent consumers but also concurrent ones, where both of them are read-only.
@Romain, currently I'm not aware of a public Disruptor port for .NET which includes the newest MultiProducerSequencer introduced in Java Disruptor 3.0. This is hugely beneficial, and speeds up multi-producer scenarios a lot.

@ayende I'm pretty sure the NuGet package is out of date and is a debug build. I'm using the 2.10 release build of the Disruptor port and it is beating even the concurrent queue [1]. Again, it's rare that anyone needs to process millions of messages a second, but since we are comparing raw performance, I think Disruptor comes out on top.