How to restrict a monad without breaking it

The winding road to the Set monad

By eliminating duplicate possible worlds, a set monad can dramatically
accelerate non-deterministic computations. But such an efficient monad
does not exist -- or so it has been believed. We describe the challenges
and two ways to overcome them, taking hints from partial evaluation and
monadic reflection.

Introduction

Non-determinism was first introduced into computing in 1959, in the
Turing-award paper by Rabin and Scott: ``A nondeterministic automaton
has, at each stage of its operation, several choices of possible
actions... A nondeterministic automaton is not a probabilistic machine
but rather a machine with many choices in its moves. At each stage of
its motion across the tape it will be at liberty to choose one of
several new internal states''. Rabin and Scott also introduced a
method to deterministically reason about (and, eventually, implement)
such non-deterministic machines: keep track of all currently possible
states -- of all `possible worlds'. Each internal state may transition
to zero, one or more new internal states. The collection of currently
possible states is updated as the computation progresses. Such a
collection could be a list -- hence the famous List monad. However,
if two new states of the machine turn out the same, surely it is
redundant to track both. The order among possible states may have no
significance either. A better choice for the collection of currently
possible states is a set. That is exactly what Rabin and Scott
chose in their now-textbook conversion of the non-deterministic
finite automaton to a deterministic one.

The possible worlds can be tracked in `parallel' (in the breadth-first
fashion) or one after another (backtracking). To detect duplicate
intermediate states, one has to store the ones already tracked. The
Set monad hence may be regarded as a memoization, trading space for time.
When the intermediate states are very numerous, the opposite trade-off
may be advisable. In other cases, with many duplicates,
the Set monad is exponentially worthwhile.

Perhaps for these theoretical and natural reasons, the question of Set
monads comes up regularly on blogs and mailing lists. The answers
have been disappointing. We recount the challenges after the
background section. First, there is a misconception that the Set
monad is impossible. Rather, Set has to be a restricted monad. The
restrictions, however, are unpalatable. It turns out that a genuine,
unrestricted Set monad exists -- but in its naive formulation is a
joke. Its CPS version is more efficient, to a limit. It does make a
point that a restricted monad can be realized as a regular,
restriction-free continuation monad. Yet efficiency may suffer. Some
computations that should take linear time in a Set monad take
exponential time in the naive CPS version. Furthermore, it seems this
inefficiency is inherent and there is nothing we can do.

The last two sections present approaches to attain the expected
efficiency. The first insight comes from partial evaluation: take
advantage of statically known information and propagate it. For example,
the choice choose [1,2] signals that all intermediate values at that
point in the computation are integers and hence may be collected in a
set. The second insight is that the composition of monadic reflection
and reification is often an optimization. Efficient and
restriction-free Set monads exist after all.

References

Michael O. Rabin and Dana Scott: Finite Automata and Their Decision Problems
IBM Journal of Research and Development 3 (2), April 1959, pp. 114-125.
doi:10.1147/rd.32.0114

Michal Armoni and Mordechai Ben-Ari: The concept of nondeterminism: its development and implications for teaching
ACM SIGCSE Bulletin, 41(2), June 2009, pp. 141-160.
doi:10.1145/1595453.1595495

Monads for non-determinism and the List monad

For the sake of reference, we recall the MonadPlus abstraction for
non-deterministic computations in Haskell and its most known
implementation, the List monad. The running examples will show off the
duplication of possible worlds, which clearly blows up in one
example. The efficient Set monads described later on the page speed up
such computations exponentially.

As with other effects, non-determinism is represented
in Haskell as a Monad. Non-deterministic computations are built
with the standard monadic operations return and (>>=) and
two primitives: mplus m1 m2
for the non-deterministic choice between computations m1 and m2; mzero for failure, the choice among zero alternatives.
The type class MonadPlus specifies the operations as
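In its classic (pre-Applicative-Monad) form the class reads as follows, with the built-in list instance shown for illustration:

```haskell
-- The classic form of the class, as in Control.Monad
class Monad m => MonadPlus m where
  mzero :: m a                 -- failure: the choice among zero alternatives
  mplus :: m a -> m a -> m a   -- choice between two computations

-- The best-known instance: lists
instance MonadPlus [] where
  mzero = []
  mplus = (++)
```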

thus defining the `monad for non-determinism'. All our Set monads (except
the restricted version) are instances of MonadPlus. The operation
to non-deterministically choose a value among several is generically
defined as
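A plausible generic definition (the original may differ in details) folds mplus over singleton computations:

```haskell
import Control.Monad (MonadPlus(..))

-- Non-deterministically choose a value among several
choose :: MonadPlus m => [a] -> m a
choose = foldr (mplus . return) mzero
```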

(One could have picked choose to be the primitive non-deterministic
operation.) The first running example computes the indices, among the first
30, of those triangular numbers that are sums of two triangles:
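A reconstruction of the example: the helper triang and the use of guard are our choices, but the name ptriang is the one used in the text; choose is as introduced above.

```haskell
import Control.Monad (MonadPlus(..), guard)

choose :: MonadPlus m => [a] -> m a   -- choose as introduced above
choose = foldr (mplus . return) mzero

triang :: Int -> Int                  -- our helper: the n-th triangular number
triang n = n * (n + 1) `div` 2

ptriang :: MonadPlus m => m Int
ptriang = do
  k <- choose [1..30]
  i <- choose [1..k]
  j <- choose [1..i]
  guard (triang k == triang i + triang j)
  return k
```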

The example is defined generically for any MonadPlus implementation.
The code plainly matches the description of the problem, a triangular
analogue of Pythagorean triples: pick two triangles that sum to a
triangle and return the
index of the latter. The unlucky choice is a failure. Such clarity of
expression was the chief motivation for non-determinism: ``The main
advantage of these [non-deterministic] machines is the small number of
internal states that they require in many cases and the ease in which
specific machines can be described'' (Rabin and Scott, 1959).

One implementation of non-determinism is to collect all `states' --
all values that could possibly be produced at some point in
the computation -- in a List. Hence the List monad:
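Since the list instance is built into the standard library, here it is written out as standalone functions that mirror it:

```haskell
-- The list Monad instance, as standalone functions
returnList :: a -> [a]
returnList x = [x]              -- a single possible world

bindList :: [a] -> (a -> [b]) -> [b]
bindList m f = concatMap f m    -- continue from every possible world
```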

There is much to like about the List monad, not the least is simplicity.
Thanks to laziness, the possible worlds are tracked one at a time;
hence the List monad realizes non-determinism as backtracking.
In this implementation, our example gives
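At the list type, ptriang amounts to the following comprehension (the comprehension form is our rendering); the result is shown in the comment:

```haskell
triang :: Int -> Int                  -- the n-th triangular number
triang n = n * (n + 1) `div` 2

ptriang_list :: [Int]
ptriang_list = [ k | k <- [1..30], i <- [1..k], j <- [1..i]
               , triang k == triang i + triang j ]
-- [3,6,8,10,11,13,15,16,18,20,21,21,23,23,26,27,28,28,28]
```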

The result, shown in the comments, has many duplicates: Some triangular
numbers can be decomposed into two triangles in more than one way.
Since we are interested only in the existence of the decomposition,
computing the duplicates is a waste. This waste, tracking duplicate
possible worlds, can blow up, as the following example by
Petr Pudlak makes clear.
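The example is, plausibly, of the following shape (our reconstruction): n two-way steps produce 2^n possible worlds, although only n+1 distinct values are ever reachable.

```haskell
import Control.Monad (MonadPlus(..))

choose :: MonadPlus m => [a] -> m a   -- choose as introduced above
choose = foldr (mplus . return) mzero

-- n two-way steps: 2^n possible worlds, only n+1 distinct values
stepN :: MonadPlus m => Int -> m Int
stepN 0 = return 0
stepN n = do
  x <- choose [0, 1]
  y <- stepN (n - 1)
  return (x + y)
```

In the List monad, stepN n takes time (and space) proportional to 2^n.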

It is natural to collect the possible states (values reached at some point)
in a set. Rabin and Scott did exactly that in their 1959 paper.
They studied finite automata, whose states are easy to compare, by
their labels. The comparison may be difficult or impossible
in general.

Restricted Set `monad'

A Set monad, analogous to the List monad, is allegedly impossible. The
best we can do is to define Set as a restricted monad for
non-determinism. It is efficient as expected. However, it is not a
monad at all. Later sections will reveal that the Set monad is
not as impossible as it appears.

The list monad tracks all values that a non-deterministic computation
may produce at some point -- from the initial point through the
completion. Duplicate values mean redundant tracking. To eliminate
the duplicates, the intermediate values should be collected in a set
rather than a list. By analogy with the List monad, we need the Set
monad. Alas, we cannot write instance Monad Set because we cannot
define its required method return :: a -> Set a. Only comparable
values can be put into a set; therefore Set a requires the Ord a
constraint. However, in a monad, return must be polymorphic, without
any constraints (after all, return is meant to be a natural
transformation).

Therefore, one often hears of restricted monads -- whose return
and (>>=) operations are not fully polymorphic; the types of
their arguments are constrained, e.g., to be members of certain classes.
A Set a data type can be made an instance of a restricted
monad. Incidentally, Set is the main (and, it seems, the only) motivation
for restricted monads. Several implementations of restricted monads
have been suggested. Without committing to any
in particular, we define the return and bind operations as mere functions
rather than methods.
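A sketch of such function-level definitions (the names returnSet, bindSet, etc. are ours); note the Ord constraints, which a genuine Monad instance could not carry:

```haskell
import qualified Data.Set as S

returnSet :: Ord a => a -> S.Set a
returnSet = S.singleton

bindSet :: Ord b => S.Set a -> (a -> S.Set b) -> S.Set b
bindSet m f = S.foldr (S.union . f) S.empty m

failSet :: S.Set a
failSet = S.empty

mplusSet :: Ord a => S.Set a -> S.Set a -> S.Set a
mplusSet = S.union
```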

As expected, collecting the possible choices for the intermediate results
in a set has eliminated redundant computations and
improved efficiency.

However, the restricted Set monad, like any restricted monad, is quite
restricted. It is not a monad at all. For example, the monad
law return x >>= f == f x
does not generally hold: while the right-hand side is well-typed
for any x::a and f :: a -> m b, the left-hand side is well-typed
only for x of certain restricted types. Many monadic idioms and
functions become impossible -- such as the function ap :: Monad m => m (a -> b) -> m a -> m b used in a pattern like return f `ap` x1 `ap` ... `ap` xn. Since functions are not
generally comparable, we cannot return them.

We do not have to sacrifice the monad laws however.
Below, the efficiency of the restricted Set monad is attained without
imposing restrictions.
We can have the restriction-free monad for non-determinism,
and eliminate the duplicates too.

Unrestricted but inefficient Set monads

We cannot make Data.Set.Set an instance of the Monad type
class, as we have just seen. Yet a set monad -- a non-determinism
monad that produces the set of possible results -- does exist. We show
two. One of them is a joke; the other is more efficient. Both can be
significantly accelerated while still remaining the ordinary,
unrestricted monads. The acceleration is the subject of the next two
sections. This section is the starting point. It shows that a
restricted monad, a monad with restrictions on the types of
produced values, can be implemented as an ordinary, continuation
monad. Restricted monads are avoidable.

We start with the List monad, which collects the
currently possible values in a list and produces the list
of possible results. Converting the latter list to a set gives
a non-determinism monad in which the results are observed as a
set.
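A sketch of this naive monad (the newtype name SetN, for `naive', is ours; modern GHC also wants the Functor, Applicative and Alternative instances):

```haskell
import Control.Applicative (Alternative(..))
import Control.Monad (MonadPlus(..))
import qualified Data.Set as S

newtype SetN a = SetN { listN :: [a] }

instance Functor SetN where
  fmap f = SetN . map f . listN
instance Applicative SetN where
  pure x = SetN [x]
  SetN fs <*> SetN xs = SetN [f x | f <- fs, x <- xs]
instance Monad SetN where
  SetN m >>= f = SetN (concatMap (listN . f) m)
instance Alternative SetN where
  empty = SetN []
  SetN x <|> SetN y = SetN (x ++ y)
instance MonadPlus SetN

-- The Ord constraint appears only here, at the very end
runSetN :: Ord a => SetN a -> S.Set a
runSetN = S.fromList . listN
```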

Compared to ptriang_list earlier, there are no longer duplicates
in the result. This `solution' must be a joke: the duplicates are
still present during the computation and are removed only at the very end.
(That is why the Ord constraint, requiring values to be comparable,
appears just in the run function.)
Therefore, stepN n still runs in time exponential in n. Here is
an even simpler
example, with an exponential number of identical choices:
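The example is, plausibly, of the following shape (our reconstruction): at every level, a two-way choice between identical alternatives, so 2^n possible worlds all hold the same value.

```haskell
import Control.Monad (MonadPlus(..))

choose :: MonadPlus m => [a] -> m a   -- choose as introduced above
choose = foldr (mplus . return) mzero

-- 2^n possible worlds, all holding the same value 0
dups :: MonadPlus m => Int -> m Int
dups 0 = return 0
dups n = choose [(), ()] >> dups (n - 1)
```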

SetC is the familiar, ordinary Cont monad, with a set as the
answer-type. The Ord constraint falls on the answer-type but not on
the type a of the produced values.
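A reconstruction of SetC and its instances (Functor, Applicative and Alternative are required by modern GHC; the choice operations match the description below):

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Applicative (Alternative(..))
import Control.Monad (MonadPlus(..), ap, liftM)
import qualified Data.Set as S

-- The Cont monad whose answer-type is a set: Ord constrains the
-- answer-type r, but not the type a of the produced values
newtype SetC a =
  SetC { unSetC :: forall r. Ord r => (a -> S.Set r) -> S.Set r }

instance Functor SetC where fmap = liftM
instance Applicative SetC where
  pure x = SetC ($ x)
  (<*>)  = ap
instance Monad SetC where
  m >>= f = SetC (\k -> unSetC m (\x -> unSetC (f x) k))
instance Alternative SetC where
  empty     = SetC (const S.empty)                            -- mzero
  m1 <|> m2 = SetC (\k -> unSetC m1 k `S.union` unSetC m2 k)  -- mplus
instance MonadPlus SetC
```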
The run function builds the answer-set, supplying the initial continuation singleton and the Ord constraint:

runSetC :: Ord r => SetC r -> Set r
runSetC m = unSetC m singleton

The failing computation mzero produces the empty set, and
the non-deterministic choice mplus pursues both alternatives and
unions the results.

We have thus converted the restricted Set monad into an
ordinary, unrestricted monad.

The SetC monad may look like the same joke as the un-CPS, naive
version. Although the output of our examples has no duplicates, the examples
themselves do not run any faster. Actually, this is not quite true.
Some examples in the SetC monad can run exponentially faster. The example dups does run faster in GHC, in some circumstances.

In the naive Set monad, we obtain with a bit of inlining and substitution:
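Writing list m for the underlying list of the naive monad, the inlining proceeds roughly as follows (a sketch, not to be compiled):

```
  run (dups (n+1))
= fromList (list (dups (n+1)))
= fromList (list (dups n) ++ list (dups n))   -- the two-way choice concatenates
= fromList (x ++ x)                           -- where x = list (dups n)
```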

The last step relied on a clever optimizer noticing the
two occurrences of dups n :: [Int] and lifting that
common sub-expression.
Alas, it does not help the overall time complexity. Although the 2^(n+1)-element list x ++ x will be computed faster, fromList still
needs to traverse it completely when converting to a set. The overall complexity
remains O(2^n). What would have helped are the properties of S.fromList
and of the set union, which give fromList (x++x) === fromList x. Alas,
this domain-specific knowledge is not available to a general-purpose
optimizer.

Running the same example in the CPS version SetC brings an interesting
twist:
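Writing k for the continuation of the two-way choice, the same inlining now gives (a sketch):

```
  runSetC (dups (n+1))
= k () `union` k ()   -- k () computes runSetC (dups n)
= x `union` x         -- where x = k () = runSetC (dups n)
```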

Since runSetC (dups n) is a singleton set for any n, x `union` x takes constant time. Therefore, runSetC (dups (n+1))
runs in linear time. The elimination of the common sub-expression k () makes a huge difference in time complexity now. The GHC optimizer
indeed does such a common-subexpression-elimination in certain
cases (when the answer-type is specialized).

Unfortunately, the lucky cases like the above are isolated. The stepN example explodes exponentially with SetC. The cause is not
hard to see: runSetC (stepN n) will have to compute runSetC (choose [0,1] >>= k), which is runSetC (k 0) `union` runSetC (k 1). There are no longer
common sub-expressions; runSetC (k i), which is
essentially runSetC (stepN (n-1)),
produces different, although mostly overlapping results, for different i.

We have seen two proper Set monads, the naive one and its CPS
version. The latter is less of a joke and
takes some advantage of sets to speed up some computations. The claim that
a restricted Set monad can be converted to a proper, non-restricted monad
gets a bit of credence -- only a small bit so far.
The principal flaw of both Set monads is the independent tracking
of the computations forked from a non-deterministic
choice. The forked computations may go through many common states, but they
are not shared. Hence the performance remains sub-optimal.
The following two sections fix the problem.

Efficient Set monad through knowledge propagation

The first approach to improve the naive Set monad takes a cue from
partial evaluation: use the statically available knowledge and
propagate it. This section develops the first efficient and
restriction-free Set monad.

The naive Set monad is a joke because the set is produced only at the
end of the computation. The possible intermediate results are being
collected in a list, which lets duplicates accumulate. To remove
the duplicates, we should be able to compare the possible
intermediate results.
They could be functions, however, or other incomparable data.
Hence we have to deal with the values that are known to be comparable,
and the ones that might not be. The collection for the possible
intermediate results has to be `a blend', a disjoint union of a set
and a list:

data SetM a where
SMOrd :: Ord a => S.Set a -> SetM a
SMAny :: [a] -> SetM a

SetM is a GADT: Ord a constrains the type of the collected values,
but only within the SMOrd alternative. Just like List, SetM is an ordinary, restriction-free monad:
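A plausible reconstruction of the instances, built around a collect function that merges a list of collections (Functor, Applicative and Alternative are required by modern GHC):

```haskell
{-# LANGUAGE GADTs #-}
import Control.Applicative (Alternative(..))
import Control.Monad (MonadPlus(..), ap, liftM)
import qualified Data.Set as S

data SetM a where                     -- as defined above
  SMOrd :: Ord a => S.Set a -> SetM a
  SMAny :: [a] -> SetM a

toListSM :: SetM a -> [a]
toListSM (SMOrd s)  = S.toList s
toListSM (SMAny xs) = xs

-- Merge a list of collections into one, upgrading lists to sets as
-- soon as a single SMOrd reveals that the values are comparable
collect :: [SetM a] -> SetM a
collect = foldr merge (SMAny [])
  where
  merge (SMAny x) (SMAny y) = SMAny (x ++ y)
  merge (SMAny x) (SMOrd y) = SMOrd (S.union y (S.fromList x))
  merge (SMOrd x) (SMAny y) = SMOrd (S.union x (S.fromList y))
  merge (SMOrd x) (SMOrd y) = SMOrd (S.union x y)

instance Functor SetM where fmap = liftM
instance Applicative SetM where
  pure x = SMAny [x]
  (<*>)  = ap
instance Monad SetM where
  m >>= f = collect (map f (toListSM m))
instance Alternative SetM where
  empty = SMAny []
  m1 <|> m2 = collect [m1, m2]
instance MonadPlus SetM
```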

If all SetM a in the list [SetM a] are SMAny, then the collection is
indeed concatenation. If at least one of them is SMOrd, we know that all type a values are comparable, even those contained in SMAny. In other words, if we somehow know that some of the
currently possible intermediate values are comparable, then all of
them are. This is the principle of propagating knowledge. The MonadPlus instance uses that principle too.

If at least one choice yields comparable results, the results of
the other choice are comparable too, even if we did not know that
(see the SMAny alternative) until now.

We learn which intermediate results are comparable from the
programmer. We ask the programmer
to use the optimized version to choose among comparable
alternatives:

chooseOrd :: Ord a => [a] -> SetM a
chooseOrd x = SMOrd (S.fromList x)

One may argue that choose must always have the Ord
constraint. (We could instead ask the programmer
to use the specialized mplusOrd :: Ord a => m a -> m a -> m a
where appropriate.) All that remains is to extract the
set as the final result:
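A sketch of the extraction function (runSetM is our name for it):

```haskell
{-# LANGUAGE GADTs #-}
import qualified Data.Set as S

data SetM a where                     -- as defined above
  SMOrd :: Ord a => S.Set a -> SetM a
  SMAny :: [a] -> SetM a

runSetM :: Ord a => SetM a -> S.Set a
runSetM (SMOrd s)  = s
runSetM (SMAny xs) = S.fromList xs
```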

If we rewrite our examples with the efficient chooseOrd, we find that
they indeed run much faster; stepN n runs in linear time, as expected.

Thus from program annotations -- by asking the programmer to tell us
of some comparable values -- we gain static knowledge. We then arrange
to propagate that static knowledge throughout the computation at run-time.
As in partial evaluation, the speedup can be exponentially impressive.

Efficient Set monad through reflection-reification

We have developed one serious, efficient, restriction-free Set monad
by improving the naive version. This section improves the CPS variant
-- the SetC monad -- illustrating the principle
of monadic reflection and reification.

Reification represents an effectful computation as a value.
For example, a non-deterministic computation may be
represented by the (lazy) tree of its choices -- or by the set
of possible results. The SetC monad already has a function
to reify a SetC computation into a set:

reifySet :: Ord r => SetC r -> Set r
reifySet = runSetC

Monadic reflection turns the representation of a computation
into the computation itself. For example, the reflection takes
a set of values and non-deterministically selects one.
Our choose was essentially such a function:
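A plausible definition, unioning the continuation's answers over the elements of the set (the SetC newtype is repeated so the snippet stands alone):

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.Set as S

newtype SetC a =
  SetC { unSetC :: forall r. Ord r => (a -> S.Set r) -> S.Set r }

-- Non-deterministically choose an element of the given set
reflectSet :: S.Set a -> SetC a
reflectSet s = SetC (\k -> S.unions (map k (S.toList s)))
```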

From the types, reifySet and reflectSet look like inverses.
They indeed are, as one can easily check.

reifySet (reflectSet set) == set
reflectSet (reifySet m)  ~  m

Reflecting and then reifying a set gives back the same set. Reifying and
then reflecting a computation gives an observationally equivalent
computation. Any reification-reflection pair must satisfy these
laws.

The second law says that the composition reflectSet . reifySet
is a sound program transformation that preserves equivalence.
It is often an optimization transformation. Indeed, the
reified computation has likely done calculations between the choices.
The reification result, a set, records only the results of the choices.
Reflecting the set gives a computation that chooses among the
already computed results.

We apply this optimization recipe to SetC. Rather than changing the monad,
we ask the programmer to add reflectSet . reifySet `annotations'
at appropriate places, for instance, around recursive calls. The stepNR computation will now read:
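One way the annotated computation might read (our reconstruction; the supporting definitions are repeated so the snippet stands alone):

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Applicative (Alternative(..))
import Control.Monad (MonadPlus(..), ap, liftM)
import qualified Data.Set as S

newtype SetC a =
  SetC { unSetC :: forall r. Ord r => (a -> S.Set r) -> S.Set r }

instance Functor SetC where fmap = liftM
instance Applicative SetC where
  pure x = SetC ($ x)
  (<*>)  = ap
instance Monad SetC where
  m >>= f = SetC (\k -> unSetC m (\x -> unSetC (f x) k))
instance Alternative SetC where
  empty     = SetC (const S.empty)
  m1 <|> m2 = SetC (\k -> unSetC m1 k `S.union` unSetC m2 k)
instance MonadPlus SetC

choose :: MonadPlus m => [a] -> m a
choose = foldr (mplus . return) mzero

reifySet :: Ord r => SetC r -> S.Set r
reifySet m = unSetC m S.singleton

reflectSet :: S.Set a -> SetC a
reflectSet s = SetC (\k -> S.unions (map k (S.toList s)))

-- The reflection-reification `annotation' around the recursive call
-- merges the duplicated intermediate results at every step
stepNR :: Int -> SetC Int
stepNR 0 = return 0
stepNR n = do
  x <- choose [0, 1]
  y <- reflectSet . reifySet $ stepNR (n - 1)
  return (x + y)
```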

We have thus presented the second proper, efficient Set monad. The
bare SetC monad is inefficient because the results of
non-deterministic choices are unioned only at the very end. The
reflection-reification `annotation' splits the computation in
chunks. The sets of results are unioned sooner, at chunk boundaries,
hence eliminating the duplicates among the intermediate results. The
performance improves as expected. Although the reflection-reification
looks quite different from the first optimization principle of
knowledge propagation, it, too, takes and follows through
on the hints left by the programmer. The appearance of reflectSet . reifySet in the code is such a hint, telling us that the
intermediate values at that point in the program are comparable and
hence the duplicates can be detected and removed.