The LVar that wasn't

Dec 24th, 2013

Suppose we want to write a program in which two threads, say, t1 and t2, each contribute a Boolean value, either True or False, to a shared result. If both threads write True, we want the result to be True; if either writes False (or if both do), then we want the result to be False. That is, we want the shared result to be the outcome of a logical “and” operation.

Let’s further suppose that we want the threads to be able to run in parallel, with their writes arriving in arbitrary order, and we want to be able to guarantee a deterministic result in spite of parallel execution. Sounds like a job for LVars!

Update (February 19, 2014): I’ve updated the links in this post to point to code that runs against the 1.1.2 release of LVish.

Six states

Consider the states that the shared result might take on as the program runs. There seem to be six such states, falling into four equivalence classes:

No information: First of all, there’s the state in which neither thread has written yet, so we have no information. We always begin in this state.

Some information, but not enough: Then there’s the state in which t1 has written True but t2 hasn’t written anything, and conversely, the state in which t2 has written True and t1 has been silent so far. If we’re in one of these states, we’re not ready to conclude that the result is True or False yet, because although we have some information, it’s still not enough to draw a conclusion.

Exactly enough information: Then there’s the state in which at least one thread has written False. In this case, we can go ahead and conclude that the result is False. On the other hand, there’s the state in which both threads have written True, in which case we can conclude that the result is True.

Too much information: Finally, there’s one more possibility — a state that represents the situation in which conflicting information has arrived. For instance, suppose that the shared result is in the “both are true” state, but then a write of False comes along. This couldn’t actually happen under the assumptions we’re making — which are that each thread can only write True or False, and that threads cannot change their answers — but let’s throw this state in anyway. If our program ever did enter this state, it would be the equivalent of shouting, “TMI!”, indicating that something had gone horribly wrong. Why would we need to include such a state? We’ll come back to this in a little while.

We can give the states names, order them by how much information we have when we’re in each one, and make a diagram showing us the relationships between states according to that order. Bot, short for “bottom”, is the state in which we have no information. TrueBot and BotTrue are, respectively, the state in which t1 has written True but t2 hasn’t written anything, and the state in which t1 hasn’t written anything but t2 has written True. TrueTrue is the state in which two writes of True have arrived, and F is the state in which at least one write of False has arrived. Finally, Top is the “TMI!” state.

Whenever a state is lower down in the diagram than another, we can interpret it as meaning that the lower state has less information than the higher state. For instance, Bot has less information than BotTrue, but TrueTrue does not have less information than F. Therefore, we can say that our six states are ordered by a “has less information than” relation. This relation is a partial order, rather than a total order, because some pairs of states, such as TrueTrue and F, have no relationship according to it — that is, neither one has less information than the other. But other pairs of states, like Bot and BotTrue, do belong to the relation. Together, the set of states and the partial order are called a partially ordered set, or poset.

Writing it down with LVish

How are we going to express all of this in code? Well, we want a shared data structure whose states are a partially ordered set, and whose contents grow monotonically with respect to that partial order. Using the LVish Haskell library, we can define an LVar type for the shared result. For instance, we might define it like this:

12

typeResult=PureLVarStatedataState=Bot|TrueBot|BotTrue|TrueTrue|F|Top

Here, we’re using the PureLVar type constructor provided by the LVish library’s Data.LVar.Internal.Pure module, in order to define a type called Result. The states that a Result can take on are the six states we just discussed.

So far, our code says nothing about how the partial order relates these states to each other; there’s no indication that, for instance, Bot has less information than F. In LVish, the way we express “has less information than” relationships is by writing a join operation that takes two states and returns a state. The join of two states x and y will be y if x has less information than y, and it will be x if y has less information than x. If neither has less information than the other — for instance, if we take the join of TrueTrue and F — then the result will be neither one of them; instead, it will be the least state that is greater than or equal to them both. In our case, by “least”, we mean having the least information, and by “greater than or equal to”, we mean having more information or the same amount of information. In other words, when we take the join of two states, the result will be a state that has just enough information to subsume both of them, and no more than that.

For example, TrueTrue is the join of TrueBot and BotTrue, because it’s at or above both TrueBot and BotTrue in our diagram, and (unlike Top) it’s the least element in the poset that meets that condition. Likewise, TrueBot is the join of Bot and TrueBot; Top is the join of TrueTrue and F; and so on.

Here’s the entire definition of the join operation, which we’ll call joinStates:

1234567891011121314151617181920

joinStates::State->State->State-- Joining an element with itself results in that element.joinStatesxy|x==y=x-- Joining an element with `Bot` results in that element.joinStatesBotx=x-- Joining an element with `Top` results in `Top`, in either order.joinStatesTop_=TopjoinStates_Top=Top-- Interesting cases involving `TrueTrue` or `F`.joinStatesTrueBotBotTrue=TrueTruejoinStatesTrueTrueTrueBot=TrueTruejoinStatesTrueTrueBotTrue=TrueTruejoinStatesFTrueTrue=TopjoinStatesF_=F-- Join is commutative.joinStatesxy=joinStatesyx

That’s kind of a mouthful, so we might want to take a moment to convince ourselves that it corresponds to the diagram up above. Happily, we only have six states, and so there are only thirty-six possible joins that could occur, since each state could be joined with itself or any of the others. That’s few enough that it’s tractable to just write out all thirty-six possibilities and verify by hand that they agree with the diagram. The printAllJoins function defined in this file can help by printing out the thirty-six possibilities, and there’s example output from printAllJoins in the comments in that file.

Using joinStates

Assuming that joinStates does, in fact, express the same relationships between states as the ordering shown in the diagram, then we should be good to go. But why did we have to define joinStates, anyway?

Since we defined Result to be a PureLVar, we can now manipulate Results using the operations on PureLVars that are provided by the Data.LVar.Internal.Pure module. For example, the getPureLVar operation allows us to do a “threshold read” of the contents of a PureLVar. Here’s the signature of getPureLVar:

1

getPureLVar::(JoinSemiLatticet,Eqt)=>PureLVart->[t]->Part

getPureLVar takes two arguments: a PureLVar, whose state is of type t, and a list of possible states (again, of type t) that the PureLVar might be in. It returns a state wrapped in a Par computation.

Notice that there’s a JoinSemiLattice typeclass constraint on t. We need that constraint because the implementation of getPureLVar is written in terms of a join function on states of PureLVars, like the joinStates that we defined up above. Through the JoinSemiLattice typeclass constraint, getPureLVar’s type expresses its expectation that such a join function exists. So we need to declare State to be an instance of JoinSemiLattice and indicate that the joinStates function we wrote is to be used as join.

12

instanceJoinSemiLatticeStatewherejoin=joinStates

So, that’s why we had to define joinStates — so that we could use it to declare State to be an instance of JoinSemiLattice, allowing us to make use of library functions like getPureLVar.

How about the Eq constraint on t? This part is slightly more subtle. We mentioned that getPureLVar is written in terms of join, but that’s actually not quite right; it’s actually written in terms of another function, joinLeq, and that is written in terms of join.

What is joinLeq? Suppose we have two elements x and y, and we want to know if x is “less than or equal to” y — which would imply that x has no more information than y. Can we determine the answer to this question if all we have is the join operation? Yes! All we have to do is take the join of x and y. If the result is y, then yes, x does indeed have no more information than y. In fact, the Algebra.Lattice library provides just such a joinLeq function.

But wait: in the implementation of joinLeq, once we’ve done the join of x and y, we need to compare the result of it with y, and we can’t do that comparison unless we have an == operation defined on States! This operation is provided by Haskell’s Eq typeclass. So, the type of joinLeq in Algebra.Lattice has an Eq constraint, along with a JoinSemiLattice constraint, and therefore our own getPureLVar function needs the Eq constraint, too — because getPureLVar calls joinLeq. We’ll see in a moment why getPureLVar needs to do that!

Reading a Result

There are two things we want to be able to do with a Result: write to it from our two parallel threads, and read from it when all the writes that are going to matter are finished. Let’s consider the latter first.

We only want to read from a Result when it’s in one of the two “exactly enough information” states: TrueTrue or F. Together, we call the states TrueTrue and F a “threshold set”, because they represent the threshold that a Result has to reach before we read from it. The getResult operation just takes a Result, in this case res, and calls getPureLVar on res with the threshold set of [TrueTrue, F] as the second argument to getPureLVar.

12

getResult::Result->ParStategetResultres=getPureLVarres[TrueTrue,F]

The semantics of getPureLVar are that it will block until res’s state reaches or surpasses an element of the threshold set. It’s because of this “reaches or surpasses” check that getPureLVar has to call joinLeq. If getPureLVar returns, it will return the member of the threshold set that the current state of res is at or above, wrapped in Par. getResult then returns that Par-wrapped State.

Recall that res is going to be written into from multiple threads that share access to it, and we don’t know what order those writes will occur in. The nice thing about getResult is that it will always behave the same way for a given set of writes to res, regardless of the order in which those writes occur. This is guaranteed to be the case because only one of TrueTrue or F can ever be the state of res.

If we had provided, say, [TrueTrue, TrueBot] as the threshold set, then we wouldn’t have that guarantee. Our program would allow schedule nondeterminism to be observed: getResult might return different values depending on the order in which t1 and t2 ran. For example, if t1 and t2 were both going to write True, and t1 ran first, then getResult might unblock and return TrueBot (wrapped in Par) immediately, without waiting for t2 to run. But if t2’s write of True happened first, followed by t1’s,, then getResult would unblock and return TrueTrue wrapped in Par.

So, we shouldn’t use [TrueTrue, TrueBot] as a threshold set. In fact, if we want to ensure that schedule nondeterminism is not observable, then the only valid threshold sets are those in which, for every two elements in the threshold set, their join is Top. (This is one explanation for why we need a Top state: beause the validity of threshold sets is defined in terms of it.) Notice that [TrueTrue, F], the threshold set we actually used, is just fine, since join TrueTrue F is Top.

Writing a Result

Next, let’s consider how we’re going to write into a Result in the first place. For that, we’ll write a function called asyncAnd. asyncAnd takes two Par-wrapped Bools, which are the two Boolean values that threads t1 and t2 respectively contribute to the result, and it returns a Par-wrapped Bool: True if the result was TrueTrue, and False otherwise.

asyncAnd creates a new Result, named res, by calling newPureLVar. It then forks two threads, each of which looks at a particular one of the provided Bools and writes the corresponding State into res. All this is followed by a call to getResult. Notice that if either thread writes F, then getResult can unblock immediately thereafter, without waiting for the other thread to finish. Therefore asyncAnd does a “short-circuit” computation: as soon as it encounters something that would make the result false, it doesn’t bother to wait for further results. (This is the reason for the “async” in the name.)

Also worth pointing out is that True and False don’t belong to the same universe as TrueTrue; True and False are mere Bool values, while TrueTrue is one of the six inhabitants of the State type. So, asyncAnd has to do the work of converting a TrueTrue back into a True, so that it can return the correct type. It just does this by returning the result of a == test that checks if the contents of the Result are TrueTrue.

Running asyncAnd

We’re finally ready to actually try calling asyncAnd with some arguments and see what happens. Since asyncAnd returns a Par computation, we’ll need to run that computation with the runPar function provided by LVish to see its results.

We can load up our code in GHCi and try out some example calls. Here’s a simple one:

12

*Main>runPar(asyncAnd(returnTrue)(returnFalse))False

Here, we’re giving asyncAnd two arguments, True and False, each wrapped in a return to give it the Par Bool type that asyncAnd needs. Unsurprisingly, the result of combining True and False is False. We can go ahead and check the rest of the truth table, too:

How about something a little more interesting? asyncAnd only takes two arguments, but we can fold it over a much longer list of arguments, which will give us the conjunction of them all. Here we’re folding it over a long list of alternating Trues and Falses, with the result being False, of course. We use (return True) as the starting value because True is the identity element of logical “and”.

Finally, while it’s good that these examples seem to work in GHCi, to really exercise asyncAnd we’ll want to compile our code with GHC using the -threaded flag, and then use the -N option to run the program on multiple processors.

An embarrassing mistake

The determinism guarantee of LVars is premised on the expectation that the states an LVar can take on form a partially ordered set, and that every two elements in that set have a join. In other words, the set of states it can take on are a join-semilattice. If that isn’t true, then all determinism bets are off.

There are two equivalent ways to define a join-semilattice. In fact, if S is a poset, the following three properties are equivalent, meaning that they’re either all true or all false:

S is a join-semilattice.

For all elements a and b of S, the join of a and b exists.

For all elements a, b, and c of S, the following identities hold:

Commutativity: a join b = b join a.

Associativity: a join (b join c) = (a join b) join c.

Idempotence: a join a = a.

Property (2) certainly seems like it ought to be true of Result. After all, we defined joinStates, and it’s a total function: give it any two States and it’ll give us an answer. Surely we can say that all joins exist.

How about property (3)? In some cases, we can write code to check that the commutativity, associativity, and idempotence identities hold. In particular, for a finite set of states like the one we’re dealing with — and particularly for a small finite set, like ours — it’s no problem to just iterate over all combinations of a, b, and c and check that commutativity, associativity, and idempotence hold. In fact, the current head-of-tree version of Data.LVar.Internal.Pure exposes a function called verifyFiniteJoin that does just such a check. It’s a quick-and-dirty function that takes as arguments a list of states and a join operation and returns an error message if any of the three properties fail. Here’s the code:

123456789101112131415161718

-- | Takes a finite set of states and a join operation (e.g., for an-- instance of JoinSemiLattice) and returns an error message if the-- join-semilattice properties don't hold.verifyFiniteJoin::(Eqa,Showa)=>[a]->(a->a->a)->MaybeStringverifyFiniteJoinallStatesjoin=case(isCommutative,isAssociative,isIdempotent)of(hd:_,_,_)->Just$"commutativity violated!: "++showhd(_,hd:_,_)->Just$"associativity violated!: "++showhd(_,_,hd:_)->Just$"idempotency violated!: "++showhd([],[],[])->NothingwhereisCommutative=[(a,b)|a<-allStates,b<-allStates,a`join`b/=b`join`a]isAssociative=[(a,b,c)|a<-allStates,b<-allStates,c<-allStates,a`join`(b`join`c)/=(a`join`b)`join`c]isIdempotent=[a|a<-allStates,a`join`a/=a]

When I wrote verifyFiniteJoin a month ago, I felt smug about how easy it would now be to verify that finite posets were join-semilattices. I couldn’t wait to try it on the first finite set of states I had handy, which happened to be the states of Result. When I did, here’s what I saw.

What went wrong here?

It turns out that joinStates really isn’t associative. In particular, we have:

12

*Main>TrueBot`joinStates`(BotTrue`joinStates`F)F

since BotTrue joined with F is F, which, joined with TrueBot, is again F. But:

12

*Main>(TrueBot`joinStates`BotTrue)`joinStates`FTop

since TrueBot joined with BotTrue is TrueTrue, which, joined with F, is Top! Therefore, the set of states that Result can take on isn’t a join-semilattice, and so, despite all our efforts, Result is not “really” an LVar.

Even more embarrassingly, this isn’t the first time I’ve made this mistake. The three-way parallel “or” computation that I wrote about in the second half of this post has the same associativity bug! Consider, for instance, the three states yes, { horace_no, franz_no }, and { horace_no, kat_no }. The join of yes and { horace_no, franz_no } is supposed to be yes, which when joined with { horace_no, kat_no }, is still yes. But the join of { horace_no, franz_no } and { horace_no, kat_no } is { horace_no, franz_no, kat_no }, which, when joined with yes, results in ⊤, which represents an error. Most of the post still makes sense, but with that particular parallel “or” example, just like with asyncAnd, I messed up because I was reaching too hard for an interesting threshold set.

Property (2) doesn’t hold, either

Returning to property (2) up above: since we’ve shown that property (3) doesn’t hold, it must also be the case the property (2) is false. But how is that possible? Doesn’t joinStates return a state for every pair of states it’s given as arguments? It never goes into an infinite loop or raises an error! Isn’t that enough to satisfy property (2)?

The trouble is that, although joinStates computes a total function, it doesn’t actually compute a join. Recall that the join of two elements is supposed to be the least element that is greater than or equal to them both. Because of this, the join of two elements is also known as their least upper bound. While an element may, in general, have many upper bounds — for instance, BotTrue has TrueTrue and Top as upper bounds — only one of those can be its least upper bound.

How can we know whether a given join function actually computes a least upper bound? For a give poset S, it would suffice for the following two properties to hold:

For all elements v1, v2, and v of S,
if v1 <= v and v2 <= v, then (v1 join v2) <= v.

The first of these properties says that, of all the upper bounds v1 and v2 might share, then whatever their join is, it had better be no bigger than any of those upper bounds. This property captures the “least” part of “least upper bound”. The second property just says that, when you join two elements, you have to get something that’s an upper bound on both of them; this property captures the “upper bound” part of “least upper bound”.

Again, with a finite set of states like we’re dealing with, it’s easy to check both of these properties, and as it turns out, joinStates fails the first one. TrueTrue and F are both upper bounds of TrueBot and BotTrue, and neither TrueTrue nor F is less then the other. Therefore, TrueBot and BotTrue don’t actually have a least upper bound, regardless of what the definition of joinStates says.

Does it matter?

So, joinStates breaks associativity, and so the set of states that a Result can take on is not a join-semilattice. But, is this a problem? Does breaking associativity actually matter in practice?

In the case of the asyncAnd example, the lack of associativity doesn’t seem to make a difference in practice, since every call to asyncAnd allocates one LVar and then writes to it twice. Because there are never more than two writes to any LVar, the associativity-breaking case of joinStates never occurs. Therefore we have a notion of “observable associativity”.

One thing to make note of is that we could have avoided any question about associativity by defining the States and the partial order on them differently. Instead of conflating all the states involving a write of False into a single state F, we could have defined a poset that looks like this:

With this poset, the join operation would have been associative, and we could have written getResult with a threshold set of all four final states: [TrueTrue, FalseTrue, TrueFalse, FalseFalse]. The disadvantage of doing it this way is that we lose the ability to get early answers from the computation. If a single False is written right away, the advantage of our original version is that it allows getResult to unblock immediately, since no later write can change the fact that the whole thing is going to be False. With this version, we have to wait for the second write regardless of what the first write is. Is associativity at odds with the ability to get early answers?

Try it yourself

All the code from this post, plus some additional scaffolding, is available on GitHub. If you want to play with it yourself, be sure to check the README for details on how to get set up.

Compared to the version on GitHub, a few of the types in this post are simplified. In the real code, you’ll notice two type variables, s and d, in a number of places:

The definition of the Result type is really type Result s = PureLVar s State.