Sunday, June 14, 2009

Perfect strictness analysis (Part 2)

Strictness analysis fails to see through polymorphic functions whose strictness is in whole or in part determined by arguments supplied by callers. The example I gave was the function foldl, whose seed parameter can be made strict so long as the binary function used for the fold is also strict.

Can we do better? Maybe, but I can't think of a clean solution. Here, I'll sketch out an algorithm that doesn't quite work. Note that there is a runtime cost for implementing such an approach, but I believe the runtime cost can be made less than the cost of unnecessarily allocating a thunk on the heap - which requires at the very least a handful of instructions.

Before giving the algorithm, let's be clear about the goal: the goal of perfect strictness analysis is to avoid allocating a thunk for any argument that could be made strict without altering the callee's semantics. We don't care whether this determination can be done statically or whether it must wait until runtime.

In the case that it must wait until runtime, here is a sketch of an algorithm that seems like it might work, but doesn't. I'm going to loosely assume a stack-based machine, in which we push a function then its arguments onto a stack, then either call the function or create a thunk:

Each function stores (at runtime) an annotation indicating which arguments to it are strict, or, more generally, how the strictness of each argument is dependent on the strictness of other arguments.

Rather than building thunks bottom up, we build them top-down: when creating a thunk, we first push the callee onto a stack.

Before building a thunk for an argument to the callee, we ask the callee whether it is strict in that argument. If it is, we evaluate the argument. Note that the callee may examine its caller, recursively, to determine strictness, or it may examine other arguments that are being passed to the callee. In the case that the information is not yet known or the callee is lazy in that argument, we drop back to building a thunk for that argument.

Let's look out our definition for foldl again, and see how this would work. Recall the definition for foldl:

I'll assume we can perform some analysis to determine that foldl is strict in its third parameter (since we must always evaluate the third parameter to distinguish between the empty list case and the non-empty list case), lazy in its first parameter (since we won't necessarily apply that function to any arguments), and strict in its second parameter iff the first parameter is a binary function strict in both its arguments. Notice that the strictness of f actually depends not just on the type of the third argument, but on its runtime value - if the list is empty, f will not be evaluated; if it is nonempty, f will be evaluated. This is going to cause problems, as we'll see in a moment.

In the recursive call, foldl f (f s h) t, we would push foldl onto a stack. Now we are at the point where we are passing the first argument to foldl, so we would ask foldl whether it is strict in this argument. It responds 'no', so we leave f unevaluated. Next we are at the point we we are about to pass the second argument to foldl. We again query foldl, asking if it is strict in its second argument. foldl checks its first argument - except if f is unevaluated, its strictness is unknown, so we have to fall back to thunking the second argument. We're back where we started - the strictness of the function f won't become known until we reach the end of the list, so we end up with the same behavior as before. Ouch! Note that if f were already evaluated, foldl would be able to determine that the second parameter could be evaluated before being supplied.

It seems we would need another level of analysis to determine that the first parameter, f, can be made strict if the list being folded over is nonempty. In principle, I don't see why we couldn't do this sort of analysis, but this is starting to get ugly. In addition, we have the problem that when we go to supply f to foldl, the list argument hasn't been supplied yet, so we can't inspect the list to determine if it is nonempty. This might not be a fundamental problem since we obviously have a reference to the list even if we haven't formally supplied to foldl yet. But still, this is getting a lot more complex than I'd initially hoped.

4 comments:

Again commenting on the poly-strictness features of Haskell: We can actually gain a lot of leverage by making poly-strictness explicit. To do this we introduce a special kind to distinguish strict and lazy:

kind ! = S | L

And to make use of this new kind we have to redefine the function type constructor to take an additional argument of kind ! which determines whether to evaluate the argument strictly or not.

tycon _ {_}-> _ :: * -> ! -> * -> *

Now, we can make the strictness properties of foldl explicit by allowing type variables to range over ! the way they rank over other kinds:

foldl :: (a {s}-> b {t}-> a) {u}-> a {s}-> [b] {S}-> a

Thus, the strictness of the first-argument's first argument is the same as the strictness of the second argument. The third argument is strict (because of the pattern match), and the rest we don't care about. From here we may decide to specialize two different versions of foldl for each type of s and choose the appropriate one to use based on the call site. Such decisions are doable by pragmas, but the compiler might be able to be made smart enough to figure it out itself.

In order to make the syntax more usable, we could leave the ! parameter optionally implicit for the compiler to to figure out.

If we had difference types then we could push down a bit further to recognize that we should have s strictness on the first argument when the third argument is non-null at the non-recursive invocation. This requires unrolling the recursion one step via a worker-wrapper transformation in order to take advantage of the new strictness.

@winterkoninkje - This approach looks very similar to the one described by Stefan Holdermans here on the Haskell list: http://old.nabble.com/Runtime-strictness-analysis-for-polymorphic-HOFs--tt24026972.html#a24070861

I'm still sort of undecided on the merits of this approach. I do think it is interesting to explicitly represent strictness information in the type system. I'm unsure if you can actually build a convenient, easy-to-reason-about system around that.