Friday, November 4, 2011

I spoke before of using global locks in Haskell to protect a thread-unsafe C library. And I wrote about a GHC bug which breaks the most straightforward way to get a global lock.

My new solution is to store an MVar lock in a C global variable via StablePtr. I've implemented this, and it seems to work. I'd appreciate if people could bang on this code and report any issues.

You can get the library from Hackage or browse the source, including a test program. You can also use this code as a template for including a similar lock in your own Haskell project.

The C code

On the C side, we declare a global variable and a function to read that variable.

staticvoid* global = 0;

void* hs_globalzmlock_get_global(void) {return global;}

To avoid name clashes, I gave this function a long name based on the z-encoding of my package's name. The variable named global will not conflict with another compilation unit, because it's declared static.

Another C function will set this variable, if it was previously 0. Two threads might execute this code concurrently, so we use a GCC built-in for atomic memory access.

The unsafe import of c_get_global demands justification. This wrinkle arises from the fact that GHC runs many Haskell threads on the same OS thread. A long-running foreign call from that OS thread might block unrelated Haskell code. GHC prevents this by moving the foreign call and/or other Haskell threads to a different OS thread. This adds latency to the foreign call — about 100 nanoseconds in my tests.

In most cases a 100 ns overhead is negligible. But it matters for functions which are guaranteed to return in a very short amount of time. And blocking other Haskell threads during such a short call is fine. Marking the import unsafe tells GHC to ignore the blocking concern, and generate a direct C function call.

Our function c_get_global is a good use case for unsafe, because it simply returns a global variable. In my tests, adding unsafe decreased the overall latency of locking by about 50%. We cannot use unsafe with c_set_global because, in the worst case, GCC implements atomic operations with blocking library functions. That's okay because c_set_global will only be called a few times anyway.

The Haskell code

Now we have access to a C global of type void*, and we want to store a Haskell value of type MVar (). The StablePtr module is just what we need. A StablePtr is a reference to some Haskell expression, which can be converted to Ptr (), aka void*. There is no guarantee about this Ptr () value, except that it can be converted back to the original StablePtr.

It's fine for two threads to enter set concurrently. In one thread, the assignment will be dropped, and c_set_global will return 0. In that case we free the unused StablePtr, and the MVar will eventually be garbage-collected. StablePtrs must be freed manually, because the GHC garbage collector can't tell if some C code has stashed away the corresponding void*.

In the common path, we do an unsynchronized read on the global variable. Only if the variable appears to contain NULL do we allocate an MVar, perform a synchronized compare-and-swap, etc. This keeps overhead low, and makes this library suitable for fine-grained locking.

All that's left is the user-visible locking interface:

lock ::IO a ->IO alock act = get >>=flip withMVar (const act)

Inspecting the machine code

Just for fun, let's see how GCC implements __sync_val_compare_and_swap on the AMD64 architecture.