Haskell API Design: Scoped Resources

A common idiom in any programming language is that of acquiring a resource, using it, and then releasing (de-acquiring?) it. The resource may be a file, a network socket or all sorts of other things. The main challenges with supporting this programming idiom in an API are three-fold:

Making sure that the programmer cannot use the resource before acquisition or after release.

Making sure that the programmer remembers to release the resource once they are finished with it.

Make sure that the resource is released if a problem occurs while it is acquired.

In Java you might implement this as a try/finally block (takes care of 3; not so good on 1 or 2). In C++ you might use the RAII pattern: acquiring in a constructor and releasing in a destructor (takes care of all three reasonably well). In Haskell, you can use things like the bracket function (written about it in detail elsewhere) to write what we might call “with..” APIs. Some classic examples:

These functions take as an argument the block to execute while the resource is acquired. If only this API is offered, it is impossible to forget to release the resource: release happens as soon as the block terminates. And if an exception occurs, that is also taken care of. There is a slight hole with respect to challenge 1 above: see if you can figure it out (we’ll return to it at the end of the post).

In this post, I thought I’d explore a few uses of this “scoped resource” pattern in my CHP library which use Haskell’s types to indicate the acquisition status.

Sharing

CHP features communication channels that allow you to send values between different threads. For example, something of type Chanout Integer allows you to send integer values. CHP’s channels are designed to be used by a single reader and a single writer. Sometimes, however, you want multiple writers and/or multiple readers to be able to use the same channel. For example, you may want to have many worker processes communicating results to a single harvester process. This can be achieved using shared “any-to-one” channels, which can be constructed simply by adding a mutex to the sending channel end. This is another case of the scoped resource pattern: we want to allow claiming of the shared channel (acquisition), writing to the channel while it is claimed, and to make sure the channel is released afterwards.

We can define a Shared wrapper for any type that simply holds a mutex alongside it:

data Shared a = Shared Mutex a

When we export this type we will not export its constructor, meaning that users of our shared API will not have access to the mutex, and will not be able to take the channel out of its shared wrapper, except using our API function, namely:

claim :: Shared a -> (a -> CHP b) -> CHP b

(I perhaps should have called it withClaimed to fit in with the other Haskell APIs, but never mind.) This specialises in the case of our Chanout Integer into:

Note that Chanout is a channel that can be used for sending, with functions such as:

writeChannel :: Chanout a -> a -> CHP ()

This sharing mechanism uses an opaque Shared wrapper type (i.e. one that has hidden internals) to stop the channel being written to unless the claim function is used to acquire/unwrap it into a normal channel for the duration of the block.

Enrolling

A barrier in CHP is a communication primitive that supports a single primitive operation: synchronising on the barrier. The rule for using them is straightforward: when you want to synchronise on a barrier, you must wait until all other processes that are members of the barrier also synchronise, at which point you all do synchronise and continue onwards. CHP supports dynamic membership of barriers: processes can enroll on the barrier (become members) and resign (leave the barrier) at any time. This is again an instance of the scoped resource pattern: enrollment is acquisition, you can synchronise while you are a member, and you should leave again at the end.

With barriers, we wrap them when you are enrolled on them, and only allow synchronisation on enrolled barriers:

As before, we don’t expose the details of the Enrolled type, so you can’t just wrap something up yourself and pretend that you Enrolled. The enroll function does the acquisition and release (aka enrolling and resigning), and during the block in the middle you can synchronise on the enrolled barrier.

Different Ways to Wrap

We’ve seen two examples of scoped resources. In the Shared example, we used our wrapper to block access to the key functionality: we might call this negative wrapping. With negative wrapping you need to acquire the resource to remove the wrapper and use the object inside. In contrast, in the Enrolled example, we used our wrapper to enable access to the key functionality: we could correspondingly call this positive wrapping. With positive wrapping you need to acquire the resource to add the wrapper and use the object inside.

So which is better: positive or negative wrapping? With our negative wrapping, we were able to make use of the existing functions for channels once the value was unwrapped. Consider for a moment the possibility of adding a barrier type that had a fixed membership: no need to enroll. To fit with our existing API for synchronising on barriers, we would have to use the Enrolled Barrier type for our fixed-membership barrier, but that seems a little misleading. It might have been better to have a (negative) Unenrolled wrapper on barriers that is removed by enrolling, which would allow our API to allow synchronisation on a plain Barrier. Negative wrappers also allow for resources to be acquired in different fashions. You could have an Unenrolled (Chanout Integer) that can only be used by enrolling, which is orthogonal to a Shared (Chanout Integer); with positive wrappers this is not the case. So overall I think it is probably better to use negative wrappers.

(The other way to do it is to have totally separate types, e.g. FilePath and Handle in withFile, above. This is fine if you only have one specific scoped resource to deal with, but loses out on the chance to have a simple type-class like Enrollable, above, that encompasses the same operation (here: enroll) on multiple different types being wrapped.)

The Hole

I said earlier in the post that these scoped APIs have a problem (which exists for both positive and negative wrappers). Consider again the claim function:

If you look at the function above, you’ll see it type-checks: types “a” and “b” are both Chanout String. What’s happened is that we have claimed the channel, and then passed it back out of the claim block. So now the mutex has been released, but we have the inner channel which we then use on the second line, outside the claim block. In general, I trust the users not to make this mistake. But there are ways to fix it. The technique known as lightweight monadic regions allows just that: paper here, subsequent library here. I’m going to try to convey the key idea here (as best I understand it).

The idea behind improving the scoped resource API is to parameterise the acquired resource by the monad in which they are allowed to be used. Here’s a simple example intended to illustrate the idea:

What’s happening here is that the acquired resource has an extra type parameter that indicates which monad it can be used in. The type signature of useResource makes sure you can only use the resource in its corresponding monad. Our withResource function uses a forall to make sure that the block you pass must work with any appropriate monad it’s given: we can actually use the plain IO monad! In fact, we have ended up with two levels of protection in this API: the forall makes it hard to return the resource from the block (see the rank 2 types section of this post for more information), and even if the user returns the resource from the block, the type signature of useResource means it can’t be used afterwards outside the monad it originated from.

This is only a very basic version of lightweight regions: the actual implementation supports nesting of these scopes, and even to release in a different order than scope would usually enforce (e.g. acquire A, acquire B, release A, release B). The only problem with lightweight regions is that they do create a formidable API, e.g. glance at the types and see if you can follow them. As often in Haskell, the cleverer API that makes extensive use of the type system ends up being quite hard to understand: CHP is also guilty of this in places. There is sometimes a trade-off between type-safety and usability — and I don’t think either should always win.

Advertisements

Like this:

LikeLoading...

Related

In practice you may want to separate out the quantification on the monad from the type level brand, otherwise all code within the quantified fragment winds up using dictionaries for everything including (>>=), return, etc, rather than being able to inline or determine those statically.

This unfortunately means that your brand infects your monad as an extra parameter as well, like with ST s.

I used the ‘quantify only the phantom type approach’ in ‘rad’ on hackage, but when I switched over to ‘ad’ I used the approach you describe here. I will be splitting the quantification on infinitesimal from the quantification on the AD mode, explicitly because there is a noticable performance difference between the two alternatives.

I.e. always work under “Acquired” and do whatever combination you want. When you acatually want to put it on synchronization and get result you have to use “claim”. And you have no need to restrict your work description to sequential form. Declare way to orginize parallel access to same resources through “zip” (if for example you have resource with rwlock and doing read).