This is a pretty messy job :-) I'd imagine the answer, if it exists, is too painful for real use. Have you looked at hsc2hs? It's quite powerful, and can generate the sorts of signatures that you'd like as a preprocessing step.
– sclv, Aug 11 '11 at 18:29

One solution I've been considering is to make something like a convertNth function, which would take a number and a function, and do the conversion to that position. I think I sort of get how that would work, though I haven't tried it yet, so maybe it would present some difficulty I haven't thought of. The plus side would be that I'd still be able to use my existing function for non-strings and only have to explicitly call out strings. Ideally, of course, I or someone else would just figure out how to automatically handle the strings.
– ricree, Aug 11 '11 at 18:38

The problem seems to be fitting in IO functions, since everything that converts to CStrings, such as newCString or withCString, is in IO.

Right. The thing to observe here is that there are two somewhat interrelated matters with which to concern ourselves: A correspondence between two types, allowing conversions; and any extra context introduced by performing a conversion. To deal with this fully, we'll make both parts explicit and shuffle them around appropriately. We also need to take heed of variance; lifting an entire function requires working with types in both covariant and contravariant position, so we'll need conversions going in both directions.
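To make the "extra context" concrete, here is a minimal hand-written sketch for a single function. It assumes the C library's strlen as the foreign function; the names c_strlen and cStringLength are mine, not from the question. The type correspondence is String/CString, and the context introduced by the conversion is IO, which has to end up on the wrapper's result.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- The raw foreign import, using only foreign types.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical hand-written wrapper: the String -> CString conversion
-- lives in IO (via withCString), so the IO context is pushed onto the
-- result, and the CSize result is converted back to a native Int.
cStringLength :: String -> IO Int
cStringLength s = withCString s $ \p -> fromIntegral <$> c_strlen p
```

Doing this by hand for every signature is exactly the boilerplate the rest of the answer tries to derive automatically.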

Now, given a function we wish to translate, the plan goes something like this:

1. Convert the function's argument, receiving a new type and some context.

2. Defer the context onto the function's result, to get the argument how we want it.

3. Collapse redundant contexts where possible.

4. Recursively translate the function's result, to deal with multi-argument functions.

This says we have a context f, and some type t with that context. The Cxt type function extracts the plain context from t, and Collapse tries to combine contexts if possible. The collapse function lets us use the result of the type function.

Simple enough. Handling various combinations of contexts is a bit tedious, but the instances are obvious and easy to write.

We'll also need a way to determine the context given a type to convert. Currently the context is the same going in either direction, but it's certainly conceivable for it to be otherwise, so I've treated them separately. Thus, we have two type families, supplying the new outermost context for an import/export conversion:

This says that two types ext and int are uniquely convertible to each other. I realize that it might not be desirable to always have only one mapping for each type, but I didn't feel like complicating things further (at least, not right now).

As noted, I've also put off handling recursive conversions here; probably they could be combined, but I felt it would be clearer this way. Non-recursive conversions have simple, well-defined mappings that introduce a corresponding context, while recursive conversions need to propagate and merge contexts and deal with distinguishing recursive steps from the base case.

Oh, and you may have noticed by now the funny wiggly tilde business going on up there in the class contexts. That indicates a constraint that the two types must be equal; in this case it ties each type function to the opposite type parameter, which gives the bidirectional nature mentioned above. Er, you probably want to have a fairly recent GHC, though. On older GHCs, this would need functional dependencies instead, and would be written as something like class Convert ext int | ext -> int, int -> ext.
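As a rough illustration of the tilde constraints, here is a simplified sketch rather than the class discussed above. It assumes the conversion context is always IO (the real thing computes the context with type families), and the names ForeignOf, NativeOf, toForeign, toNative and roundTrip are hypothetical.

```haskell
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}
import Foreign.C.String (CString, newCString, peekCString)
import Foreign.C.Types (CDouble)
import Foreign.Marshal.Alloc (free)

-- Type functions mapping each side of the correspondence to the other.
type family ForeignOf int
type family NativeOf ext

-- The equality constraints tie each type function to the opposite
-- class parameter, which is what makes the mapping bidirectional.
class (ForeignOf int ~ ext, NativeOf ext ~ int) => Convert ext int where
  toForeign :: int -> IO ext   -- simplification: context fixed to IO
  toNative  :: ext -> IO int

type instance ForeignOf String = CString
type instance NativeOf CString = String

instance Convert CString String where
  toForeign = newCString       -- allocates; the caller must free
  toNative  = peekCString

type instance ForeignOf Double = CDouble
type instance NativeOf CDouble = Double

instance Convert CDouble Double where
  toForeign = return . realToFrac
  toNative  = return . realToFrac

-- Convert a String out and back, freeing the buffer afterwards.
roundTrip :: String -> IO String
roundTrip s = do
  p <- toForeign s :: IO CString
  r <- toNative p
  free p
  return r
```

The explicit annotation in roundTrip is deliberate: it avoids relying on the compiler to work out the foreign type from the class constraint alone.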

The term-level conversion functions are pretty simple--note the type function application in their result; application is left-associative as always, so that's just applying the context from the earlier type families. Also note the cross-over in names, in that the export context comes from a lookup using the native type.

Now to strike at the heart of the matter, and translate whole functions recursively. It should come as no surprise that I've introduced yet another type class. Actually, two, as I've separated import/export conversions this time.

Nothing interesting here. You may be noticing a common pattern by now--we're doing roughly equal amounts of computing at both the term and type level, and we're doing them in tandem, even to the point of mimicking names and expression structure. This is pretty common if you're doing type-level calculation for things involving real values, since GHC gets fussy if it doesn't understand what you're doing. Lining things up like this reduces headaches significantly.

Anyway, for each of these classes, we need one instance for each possible base case, and one for the recursive case. Alas, we can't easily have a generic base case, due to the usual bothersome nonsense with overlapping. It could be done using fundeps and type equality conditionals, but... ugh. Maybe later. Another option would be to parameterize the conversion function by a type-level number giving the desired conversion depth, which has the downside of being less automatic, but gains some benefit from being explicit as well, such as being less likely to stumble on polymorphic or ambiguous types.

For now, I'm going to assume that every function ends with something in IO, since IO a is distinguishable from a -> b without overlap.
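A toy sketch of why that assumption helps, separate from the conversion classes themselves: with IO a as the only base case, the base and recursive instance heads can never unify, so no overlapping-instance extensions are needed. The Arity class and exampleArity are hypothetical names used only for this illustration.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))

-- Count the arguments of a function type that ends in IO.
class Arity f where
  arity :: Proxy f -> Int

-- Base case: anything in IO. This head never unifies with (a -> r).
instance Arity (IO a) where
  arity _ = 0

-- Recursive case: peel off one argument and recurse on the result.
instance Arity r => Arity (a -> r) where
  arity _ = 1 + arity (Proxy :: Proxy r)

-- Two arguments before the IO result.
exampleArity :: Int
exampleArity = arity (Proxy :: Proxy (String -> Double -> IO ()))
```

The import and export classes have the same shape: one instance for the IO base case and one for a -> r.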

The constraints here assert a specific context using a known instance, and that we have some base type with a conversion. Again, note the parallel structure shared by the type function Import and term function ffImport. The actual idea here should be pretty obvious--we map the conversion function over IO, creating a nested context of some sort, then use Collapse/collapse to clean up afterwards.

We've added an FFImport constraint for the recursive call, and the context wrangling has gotten more awkward because we don't know exactly what it is, merely specifying enough to make sure we can deal with it. Note also the contravariance here, in that we're converting the function to native types, but converting the argument to a foreign type. Other than that, it's still pretty simple.

Now, I've left out some instances at this point, but everything else follows the same patterns as the above, so let's just skip to the end and scope out the goods. Some imaginary foreign functions:

Nice! Two issues though -- first, since you're just using newCString and not withCString, this will leak like an unnamed source at the Pentagon. Second, without undecidable instances, I'm assuming this code also can't default to let through arbitrary values (without Convert instances) unchanged?
– sclv, Aug 12 '11 at 14:28

@sclv: Good point about the allocation--using withCString actually makes for an interesting example, as well. As for default instances, that's only possible in the general case with overlapping instances and, hence, fundeps. Undecidable instances are already needed for the recursion in Import and a few others.
– C. A. McCann, Aug 12 '11 at 15:27

@sclv: Keep in mind that I was intentionally avoiding overlapping, and fundeps as a whole, specifically to illustrate type families and how much nicer they are for things like this. At least, I think they're nicer, with the symmetry between the type and term functions.
– C. A. McCann, Aug 12 '11 at 17:29

This can be done with Template Haskell. In many ways it is simpler than the
alternatives involving classes, since it is easier to pattern match on
Language.Haskell.TH.Type than to do the same thing with instances.
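As a small sketch of what that pattern matching looks like (not the code from this answer; splitFunction and nativeType are hypothetical helpers, and matching CString by its base name is a simplifying assumption):

```haskell
import Language.Haskell.TH (Type (..), mkName, nameBase)

-- Split a function type into its argument types and final result type.
splitFunction :: Type -> ([Type], Type)
splitFunction (ForallT _ _ t) = splitFunction t
splitFunction (AppT (AppT ArrowT arg) rest) =
  let (args, result) = splitFunction rest
  in (arg : args, result)
splitFunction t = ([], t)

-- Map a foreign argument type to the native type the wrapper should
-- accept. Only CString is handled here; everything else is unchanged.
nativeType :: Type -> Type
nativeType (ConT n)
  | nameBase n == "CString" = ConT (mkName "String")
nativeType t = t
```

A catch-all case like the last line of nativeType is the part that is awkward to express with instances, and trivial here.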

Checking the generated code by loading test.hs with -ddump-splices (note that
ghc still seems to miss some parentheses in the pretty printing) shows that
foreign_2 writes a definition which after some prettying up looks like:

Generating code the first way is simpler in that there are fewer variables to
track. While foldl ($) f [x,y,z] doesn't type check in ordinary Haskell, where it would mean
((f $ x) $ y) $ z = f x y z
(the accumulator would need a different type at every step), the equivalent fold
is acceptable in Template Haskell, which involves only a handful of different
types.
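A minimal sketch of that fold at the Template Haskell level, where the function and every argument are all plain Exp values, so the list is homogeneous and the fold type checks. The names applyAll and callF are hypothetical.

```haskell
import Language.Haskell.TH (Exp (..), mkName)

-- Apply an expression to a list of argument expressions, left to right.
-- AppE :: Exp -> Exp -> Exp, so the accumulator type never changes.
applyAll :: Exp -> [Exp] -> Exp
applyAll = foldl AppE

-- The syntax tree for the application of f to x, y and z.
callF :: Exp
callF = applyAll (VarE (mkName "f")) (map (VarE . mkName) ["x", "y", "z"])
```

The result is only a syntax tree; whether f can actually accept those arguments is checked later, when the splice is compiled.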

Nice! Good to see worked examples of doing things with TH. In many ways I prefer type classes for things like this, where conceptually it really is a function on types that carries the terms along, but as you point out, TH makes some parts easier to work with.
–
C. A. McCannAug 12 '11 at 12:38

Here's a horrible two typeclass solution. The first part (named, unhelpfully, foo) will take things of types like Double -> Double -> CString -> IO () and turn them into things like IO (Double -> IO (Double -> IO (String -> IO ()))). So each conversion is forced into IO just to keep things fully uniform.

The second part (named cio, for "collapse IO") will take those things and shove all the IO bits to the end.
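A sketch of how such a collapsing class might look. This is a reconstruction under assumptions rather than the exact code: it uses functional dependencies, and nestedAdd and flatAdd are hypothetical test values. Note the separate base instances for IO () and IO Int.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

-- Collapse IO (a -> IO (b -> IO r)) into a -> b -> IO r.
class CIO a b | a -> b where
  cio :: a -> b

-- Base cases: one instance per concrete result type.
instance CIO (IO ()) (IO ()) where
  cio = id

instance CIO (IO Int) (IO Int) where
  cio = id

-- Recursive case: take the argument first, then run the outer IO
-- action, apply the function it yields, and keep collapsing.
instance CIO (IO b) c => CIO (IO (a -> IO b)) (a -> c) where
  cio m = \x -> cio (m >>= ($ x))

-- A fully IO-wrapped two-argument function.
nestedAdd :: IO (Int -> IO (Int -> IO Int))
nestedAdd = return (\x -> return (\y -> return (x + y)))

-- The same function with every IO layer pushed to the end.
flatAdd :: Int -> Int -> IO Int
flatAdd = cio nestedAdd
```

Without the IO Int instance, flatAdd would not compile, because no base case would match the innermost result.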

Aside from being a generally terrible thing to do, there are two specific limitations. The first is that a catchall instance of Foo can't be written. So for every type you want to convert, even if the conversion is just id, you need an instance of Foo. The second limitation is that a catchall base case of CIO can't be written because of the IO wrappers around everything. So this only works for things that return IO (). If you want it to work for something returning IO Int you need to add that instance too.

I suspect that with sufficient work and some typeCast trickery these limitations can be overcome. But the code is horrible enough as is, so I wouldn't recommend it.