> Howard Hinnant <hinnant_at_[hidden]> writes:
>
>> With both the previous release and this one it looks like those threads
>> waiting on upgradable or exclusive access are essentially polling if
>> another thread already enjoys such access. What I'm not seeing is a
>> signal or condition variable upon which a thread sleeps after it enters
>> the first gate and tries to acquire upgradable or exclusive access but
>> discovers such access is not immediately available. I fear the polling
>> behavior will defeat the optimizations that read/write/upgradable exist
>> for in the first place. But I have not measured your code, so I could
>> well be mistaken.
>
> You're half right! If the threads block on trying to enter a new state, then
> they wait on the mutex_state_sem. This semaphore is triggered every time a
> lock is released, and again if a waiting thread is awakened but cannot satisfy
> it's condition.
>
> This can lead to a thread polling, if it is the only waiting thread, because
> once it has been awakened, it will keep releasing the semaphore and
> reacquiring it until it's condition can be satisfied.

With thanks to Roland Schwarz, who's condition implementation gave me the
inspiration I needed, I attach a new implementation of read-write-mutexes for
win32.

The idea is to have a second gate on the mutex states --- if a thread is
unable to make the change it requires immediately, then it must enter the
gate. This allows the thread to again check whether it can enter the new
state, and if not, increment the "waiting threads" count without fear of
interruption, or of missing a wakeup. It can then exit the gate, and wait for
notification.

When a thread transitions into a state that allows other threads in other
states (i.e. every state except the exclusive writing state, and the
acquiring_XXX or releasing_XXX states), then it must notify waiting
threads. To do this, it must first enter the gate, and check for waiting
threads. If there are none, it exits the gate. If there are some, then it
leaves the gate locked, and notifies the waiting threads (using a semaphore to
release the appropriate number of threads). When a thread is woken up, it
decreases the waiting thread count. If it is the last thread to be woken, then
it unlocks the gate. This ensures that only one notification is in progress at
once.

Whereas the problems Howard described above were real --- in long running
tests, my old implementation consumed near 100% CPU usage --- the new
implementation does not suffer from this problem, and consumes very little CPU
time. Not only that, but the overall runtime is less, too.