Choice over Events using STM

I’m currently writing a paper on CHP’s performance for conjunction, which I have been optimising recently. The problem with a new feature like conjunction is that there is nothing else to benchmark it against! But I can benchmark the effect that supporting conjunction has on performance for simple channel communications and other things that don’t feature conjunction.

Two of my comparisons are simple synchronous channels based on MVars and STM. These don’t support choice between events — you can’t choose between writing to two synchronous channels built on MVars or STM without more machinery on top. But they are fast. Another comparison is the CML package, which does support such choice between events — the performance of CML merits its own post some time (in short: fine normally, but terrible if you use its choice operator a lot — unless I have made a mistake in the benchmark).

I also wanted to benchmark an implementation that supported choice but not conjunction, based on STM. Version 1.0.0 of CHP fulfils this criteria, but was badly designed and totally unoptimised — and I know from my recent optimisations how bad the performance might be. So I constructed an optimised version of channels with choice but no conjunction. I was surprised at how short the algorithm was, and realised that it could be explained in a blog post. So here it is.

Implementing Event Choice with STM

Let’s be clear on the problem, first. I want an Event type such that I can say “wait for event A or event B, whichever happens first, but only engage in one of them”. Then I want someone else to be able to concurrently say “wait for event B or event C or event D, whichever happens first, but only engage in one of them” and have the algorithm resolve it all. STM doesn’t achieve this by itself; you can use orElse to choose between writing to two variables, but that doesn’t suffice for multiple people engaging in events with each other.

We begin with a helper function — one of those functions that is general enough that it might almost feature in a standard library. Here it is:

Next we’ll declare our data-types. Our Event contains a constant enrollment count (the number of processes required to synchronise together), and a transactional variable holding a list of current offers, each with an associated STM action. An offer is in turn a list of events which uses a ThreadId as a unique identifier; think of an offer as saying: I offer to engage in exactly one of the events in the list, and I’m waiting until I do:

Adding an offer to an event is as simple as adding it to the list of offers. My modifyTVar' function has type (a -> a) -> TVar a -> STM () and applies the modification function to the contents of the TVar, but it adds a little strictness that helps performance:

We also define a function for checking if an event is able to complete (when we are making an offer on it). This takes an event, and an action to perform if the event can complete. It then reads the current offers from the event’s transactional variable. If the enrollment count is equal to the number of current offers plus one (the offer now being made), it can complete. Completion involves performing all the associated STM actions, and then using a revoke function to remove the offers (which have now chosen this event, say: A) from all the events that they had offered on (e.g. event A, event B, event C):

We only require one further function. This function, offerAll, handles the creation of an offer, checks if any of the events in the offer can complete immediately, and otherwise records the offers in the event then waits for one of them to be completed by a later participants. It must use two transactions for this; one to make the offers (this transaction needs to finish for it to be visible to the other participants) and one to wait for an event to be completed. A crucial part of the function is not just knowing that an offer completed, but also knowing which one. For this we construct a TVar of our own into which a result can be written. This starts off as Nothing, and we later wait for it to become a Just value. We augment the user-supplied action-on-completion to also write into this TVar. The design of the algorithm as a whole ensures that this variable will only ever be written to once. Here is offerAll:

This is all that is needed for events with choice at both ends. You call offerAll with a list of offers and it gives you back the value you associated with that offer.

The Public API

To wrap this into a communication channel with a CML-like API, we wrap it up as follows. First we declare an SEvent type (named after CML, hence the re-use of the term event for another meaning) that represents a synchronisation action; this is a list (of choices), each containing an internal event, an action to perform during the completion of the offer, and one to perform afterwards that will yield a return value (which we can use for a Functor instance):

You can see that the choose function simply joins lists of choices together. We define our synchronisation function using offerAll, which will return the corresponding afterwards-STM-action for the chosen event, which we then execute using atomically:

The send function puts the value to send into the variable during the original event completion, and then afterwards the reader takes the value out of the variable at its leisure. (The implementation assumes that the same participants will use the channel each time; an extra level of indirection could be added to make the implementation more flexible in this regard.)

The code in this post provides nearly the same functionality as the CML library, but my tests indicate it is faster. I have now uploaded this code (with some documentation) as the sync package on Hackage. This provides a useful “lite” alternative to CHP that runs in the IO monad, and an alternative implementation of most of the features of the CML package.