Memory Stream Multiplexer: write and read from many threads simultaneously

MemoryStreamMultiplexer is a MemoryStream-like buffer manager where one thread can write and many threads can read from it simultaneously. It supports blocking reads, so that reader threads can call .Read() and wait for some data to be written. Handy for loading data in one thread that is consumed concurrently by one or more reader threads.

Introduction

Here’s an implementation of a MemoryStream-like buffer manager where
one thread can write and many threads can read simultaneously. Each reading
thread gets its own reader and reads from the shared stream at its own pace,
without blocking the write operation or other parallel read operations. It
supports blocking Read calls, so that reader threads can call Read(…) and wait
until some data is available, exactly the way you would expect a Stream to
behave. You can use this to read content from the network or a file in one
thread and have it consumed by one or more threads simultaneously. Readers do
not block writing, so reads and writes happen concurrently. This is handy for
building an HTTP proxy where you are downloading a file and multiple clients
are asking for the same file at the same time: you can download it in one
thread and let one or more client threads read from the same buffer at the
same time. You can also use it to serve the same file on disk to multiple
clients at once, or to implement a server-side cache where the same buffer is
read by multiple clients simultaneously.

How does it work

First you create a MemoryStreamMultiplexer object that holds the
shared buffer. It has a Write(…) method to write a
byte[] to the shared buffer. Then you call GetReader()
to get a MemoryStreamReader that can read the content from
the shared buffer. You can call GetReader() from a different thread
so that you can read and write simultaneously. Whenever you call
Write(…), it signals all the MemoryStreamReader
instances that new content is available to read. The readers that were
waiting on a Read(…) call get the signal and read from the shared
buffer.
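To make the flow concrete, here is a hypothetical usage sketch. The exact signatures of Write, GetReader and Read are assumed to follow the usual Stream conventions, and DownloadNextChunk/Process are placeholder names, not part of the library:

```csharp
// Hypothetical usage sketch; signatures are assumed, not taken verbatim
// from the article's class.
var mux = new MemoryStreamMultiplexer();

// Writer thread: load data from somewhere and push it into the shared buffer.
var writer = new Thread(() =>
{
    byte[] chunk;
    while ((chunk = DownloadNextChunk()) != null)   // placeholder data source
        mux.Write(chunk, 0, chunk.Length);          // signals every reader
    mux.Finish();                                   // tell readers no more data is coming
});

// Each reader thread gets its own MemoryStreamReader with its own position.
var reader = new Thread(() =>
{
    using (var r = mux.GetReader())
    {
        var buffer = new byte[4096];
        int read;
        while ((read = r.Read(buffer, 0, buffer.Length)) > 0)  // blocks until data or Finish()
            Process(buffer, read);                             // placeholder consumer
    }
});

writer.Start();
reader.Start();
```

You can start any number of reader threads this way; each one tracks its own position in the shared buffer.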

It maintains a list of ManualResetEvents that are used to signal
the readers. Each reader gets two ManualResetEvents passed to it: one
is signaled whenever a Write() happens, so that it can unblock the
Read call made by the reader thread and let it process the newly
available content; the other signals the reader that writing has finished,
so that it can stop expecting more content from the buffer.
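The two-event scheme can be illustrated with a small self-contained sketch (not the article's code; the names are made up), where a reader waits on both events at once with WaitHandle.WaitAny:

```csharp
using System;
using System.Threading;

class TwoEventDemo
{
    // One event pulses "new data written"; the other latches "writing finished".
    static readonly ManualResetEvent dataEvent = new ManualResetEvent(false);
    static readonly ManualResetEvent finishEvent = new ManualResetEvent(false);
    static int bytesWritten = 0;

    static void Main()
    {
        var reader = new Thread(() =>
        {
            var handles = new WaitHandle[] { dataEvent, finishEvent };
            while (true)
            {
                // Blocks until a Write or the finish signal; returns the index
                // of the handle that was signaled.
                int signaled = WaitHandle.WaitAny(handles);
                Console.WriteLine("reader sees {0} bytes", Volatile.Read(ref bytesWritten));
                if (signaled == 1) break;   // finishEvent: stop expecting data
                dataEvent.Reset();          // re-arm for the next Write
            }
        });
        reader.Start();

        for (int i = 0; i < 3; i++)
        {
            Interlocked.Add(ref bytesWritten, 100);  // "write" 100 bytes
            dataEvent.Set();                         // wake the blocked reader
            Thread.Sleep(50);                        // let the reader catch up
        }
        finishEvent.Set();                           // writing is done
        reader.Join();
    }
}
```

Because finishEvent is never reset, a reader that starts after writing has completed still unblocks immediately instead of waiting forever.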

Next is the MemoryStreamReader, where most of the complicated code lies.

I have tested it thoroughly on a quad-core PC to make sure parallel reads
really happen and no thread overlaps another. I also made sure the number of
locks held is minimal. You can see parallel Write and Read operations
happening in the console output:

The console output above shows that reads and writes happen concurrently.

Performance testing the library

Here’s the Visual Studio 2010 Profiler report. It shows that the most
expensive code is GetReader, and no other function comes anywhere
close to it. This is a good indication that the implementation is efficient.

Even in the GetReader function, the most expensive line of code is
creating the MemoryStreamReader:

When you do the Concurrency analysis to see which thread is doing what, it shows
that the reader threads read available content as soon as the writer thread
writes to the shared buffer. There’s no delay in reader threads getting the
signal and reading the recently added content.

The green bars on the threads show that as soon as the writer thread (Thread
7784) signals (the yellow bars), the reader threads execute and pick up the
data. There is just one thread, 8524, that seems to struggle to pick up the
signal for some reason. But the rest of the threads pick up the signal and
read the recently added bytes within 0.02 ms on average.

I'm new to C#, and while using this class to read large amounts of data I have been getting a NullReferenceException from this line of code:

byte[] currentBuffer = _bufferList[_bufferIndex];

Here are my thoughts on the reason.
This line of code can return a null reference due to concurrency and memory-allocation issues.
If we read really long data using this class, at some point the underlying List can decide to reallocate its data to increase its size. If at that exact moment a reader is trying to read data via ReadInternal, a NullReferenceException will be thrown, because the old memory location no longer holds valid data.
A few moments later, when the List finishes reallocating, the data is available for reading again.
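One possible mitigation, sketched under the assumption that _bufferList and _bufferIndex are the fields from MemoryStreamReader, is to guard every access to the shared list with one lock, so a reader can never observe the list mid-reallocation; the writer's Add would have to take the same lock:

```csharp
// Sketch only: serialize list access so the indexer never races Add's
// internal reallocation. The writer side must lock on the same object.
byte[] currentBuffer;
lock (_bufferList)
{
    currentBuffer = _bufferList[_bufferIndex];
}
```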

Reading from and writing to the same buffer simultaneously has been a problem since forever!
Most operating systems have an accessible implementation of some sort for this issue.
In Windows and .NET it is called ReaderWriterLock (System.Threading.ReaderWriterLock); ReaderWriterLockSlim (System.Threading.ReaderWriterLockSlim) can also be used.
To me, your code is trying to do what ReaderWriterLockSlim does. Please feel free to correct me if I am wrong.
I have not benchmarked these against your code, but I have used them all the time with no performance issues so far.
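For reference, here is a minimal self-contained example of ReaderWriterLockSlim protecting a shared List&lt;byte[]&gt; (the names are illustrative, not from the article's code):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class RwLockDemo
{
    static readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();
    static readonly List<byte[]> bufferList = new List<byte[]>();

    // Writer: exclusive lock while the list may reallocate its backing array.
    static void Append(byte[] chunk)
    {
        rw.EnterWriteLock();
        try { bufferList.Add(chunk); }
        finally { rw.ExitWriteLock(); }
    }

    // Readers: shared lock; any number can read concurrently, but never
    // while a writer holds the lock, so the reallocation race disappears.
    static byte[] GetChunk(int index)
    {
        rw.EnterReadLock();
        try { return bufferList[index]; }
        finally { rw.ExitReadLock(); }
    }

    static void Main()
    {
        Append(new byte[] { 1, 2, 3 });
        Append(new byte[] { 4, 5 });
        Console.WriteLine(GetChunk(1).Length);   // prints 2
    }
}
```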

I saw that you have 255 ManualResetEvents in your main class.
Well, there are many problems with that:
1) It will cause problems if you have 256 or more "clients".
2) It will allocate 255 ManualResetEvents even if you have only 2 clients.
3) You could synchronize all clients using only a single ManualResetEvent (or even an AutoResetEvent).
4) If you really need one ManualResetEvent per client, you should put it in the "client" class (the one used to read the data) and put that class inside the multiplexer list. So, one client = 1 ManualResetEvent (or 2 if you still use two per client, but not 255 or 255*2).

I can also suggest that you look at my article Managed Thread Synchronization to create slim synchronization objects, but I really think you should refactor your class (though simply replacing ManualResetEvent with ManagedManualResetEvent may already do some good).

Declaring an array of objects does not allocate the objects. You have to call new before the actual construction of an object happens. So, no ManualResetEvent is created until you do a new ManualResetEvent().

255 is an upper limit I have set. You could go for a List&lt;&gt; implementation, but then you would have to lock the list while doing a foreach (var item in list) item.Set(). I needed to avoid that locking.
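One way around that locking concern is to snapshot the list under the lock and call Set() outside it, so the lock is held only for the copy. A small self-contained sketch (the names are made up, not from the article):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class SnapshotSignalDemo
{
    static readonly object sync = new object();
    static readonly List<ManualResetEvent> readers = new List<ManualResetEvent>();

    static ManualResetEvent AddReader()
    {
        var ev = new ManualResetEvent(false);
        lock (sync) readers.Add(ev);
        return ev;
    }

    // Hold the lock only long enough to snapshot the list; the (possibly
    // slow) Set() calls then run without blocking AddReader callers.
    static void SignalAll()
    {
        ManualResetEvent[] snapshot;
        lock (sync) snapshot = readers.ToArray();
        foreach (var ev in snapshot) ev.Set();
    }

    static void Main()
    {
        var a = AddReader();
        var b = AddReader();
        SignalAll();
        Console.WriteLine(a.WaitOne(0) && b.WaitOne(0));   // prints True
    }
}
```

A reader added after the snapshot simply gets signaled on the next Write, which matches the stream semantics here.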

One ManualResetEvent did not work for me, as different threads have to call .Reset() at different times. Maybe there's a way to overcome that.

Very good implementation of a managed wait-event alternative. I will give that a shot.

About the array, you are right. I really thought the 255 events were already allocated. My error.

And about your case, I think you need a hybrid solution.
Considering each client can be at a different position in the read, each one keeps its own position, right?
So, you could do this:
When you write to the multiplexer, it writes to its effective memory stream and does something like:

lock(_pulser)
Monitor.PulseAll(_pulser);

The PulseAll will release all waiting threads.
Then, as no state is kept, the clients only wait when they reach the end, something like:
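The elided wait loop could be sketched like this (the field names _pulser, _length, _finished and the reader's local position are assumed):

```csharp
// Sketch of the reader side: wait only after catching up with the writer.
lock (_pulser)
{
    while (position >= _length && !_finished)
        Monitor.Wait(_pulser);   // released by Monitor.PulseAll in Write
}
// Here either new data is available (position < _length) or writing finished.
```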

So, in fact, you will have:
From 0 to an unlimited number of threads waiting on the multiplexer's pulser.
As soon as new data arrives (and is stored), the multiplexer pulses all threads.
Each thread then continues. New threads arriving, before waiting, check whether they can simply continue reading and, if they can, they read without waiting.

You will then have synchronization of any number of threads using a single object (the _pulser).

----

Edit: As a side note, the actual implementation of Finish sets the finished flag only after iterating through all the ManualResetEvents... I really think that if that variable is used to tell the clients that there is no more writing coming, you should set the variable before any call to ManualResetEvent.Set().