Sometimes it is useful to enumerate in increasing order
programs that have a given type. A
simple example is test
generation for compilers: we want to test a new optimising phase and
are interested in generating programs that trigger that phase (we
use types for this purpose). Since we want to find 'minimal'
examples exhibiting bugs in the new optimising phase, it makes
sense to start with the smallest programs as test cases first.
Another (somewhat trivial) example is superoptimising compilation, where
we enumerate all possible assembly programs in order of increasing length. Here the typing system is trivial in the sense that every assembly program is admissible.

There is a trivial algorithm solving this problem: lazily enumerate
all untyped programs and run type inference on each, rejecting those
that fail to type-check. This trivial algorithm is unlikely to be
efficient, since for reasonable typing systems most programs do not have the target type. Surely one can do better by interleaving typing and
enumeration. Let's express this problem in a more abstract setting.

Assume a fixed untyped programming language $L$ and a well-founded partial order (or
preorder) $\sqsubseteq$ on $L$-programs. We are also given a typing
system for $L$. We write $\Gamma \vdash P : \tau$ to indicate that
program $P$ has type $\tau$ assuming $P$'s free variables are typed
as described in the typing environment $\Gamma$.

Note that my formulation of the problem is deliberately vague, for it is likely
that much depends on $\sqsubseteq$ and the typing
system. Most orders $\sqsubseteq$ will, I imagine, be too complicated to admit
efficient algorithms. The orders I have in mind are program size, or weighted program size (where some program constructors are 'heavier' than others).

Question. What is the state of the art in research on efficient algorithms for
such (and related) problems?

2 Answers

For ordered enumeration, as opposed to random generation, you are getting into the realm of combinatorics. I don't know of any generic results, but the paper "Counting and Generating Lambda Terms" describes an enumeration of untyped terms and gives empirical data on the sieve approach to enumerating typed lambda terms. It looks like they use a Hindley-Milner type system, so no annotations are needed.
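For concreteness, here is a minimal Python sketch of that sieve, under the assumption that programs are closed lambda terms in de Bruijn form ordered by node count, with unification-based simple-type inference as the filter (all names are illustrative; this is not the paper's implementation):

```python
from itertools import count

# Untyped lambda terms in de Bruijn form:
#   ('var', i)  |  ('lam', body)  |  ('app', f, x)
# Types: type variables are ints, arrows are ('->', a, b).

def terms(size, depth):
    """Yield every term with exactly `size` nodes and `depth` binders in scope."""
    if size == 1:
        for i in range(depth):
            yield ('var', i)
    elif size >= 2:
        for body in terms(size - 1, depth + 1):
            yield ('lam', body)
        for k in range(1, size - 1):
            for f in terms(k, depth):
                for x in terms(size - 1 - k, depth):
                    yield ('app', f, x)

def find(t, s):
    """Resolve a type through the substitution `s`."""
    while isinstance(t, int) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check: does type variable `v` appear in `t` under `s`?"""
    t = find(t, s)
    if isinstance(t, int):
        return t == v
    return occurs(v, t[1], s) or occurs(v, t[2], s)

def unify(a, b, s):
    """Unify two types; return the extended substitution, or None on failure."""
    a, b = find(a, s), find(b, s)
    if a == b:
        return s
    if isinstance(a, int):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, int):
        return unify(b, a, s)
    s = unify(a[1], b[1], s)
    return None if s is None else unify(a[2], b[2], s)

def infer(t, env, s, fresh):
    """Return (type, substitution) for `t`, or None if it is untypable."""
    if t[0] == 'var':
        return env[t[1]], s
    if t[0] == 'lam':
        a = next(fresh)
        r = infer(t[1], [a] + env, s, fresh)
        return None if r is None else (('->', a, r[0]), r[1])
    rf = infer(t[1], env, s, fresh)          # application: infer the function,
    if rf is None:
        return None
    rx = infer(t[2], env, rf[1], fresh)      # then the argument,
    if rx is None:
        return None
    res = next(fresh)                        # then unify with an arrow type.
    s = unify(rf[0], ('->', rx[0], res), rx[1])
    return None if s is None else (res, s)

def typable(t):
    return infer(t, [], {}, count()) is not None

def typed_terms(max_size):
    """The sieve: enumerate all closed terms by size, keep the typable ones."""
    for n in range(1, max_size + 1):
        for t in terms(n, 0):
            if typable(t):
                yield t
```

Already at size 4 the sieve rejects the self-application λx. x x via the occurs check; as the question notes, for most type systems the vast majority of enumerated terms are filtered out this way.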

On the other hand, if you want to generate typed terms directly, there are libraries like SciFe (website, paper) and data/enumerate (docs, draft paper) that support "dependent enumeration", where you enumerate one thing and then select which enumeration to use based on it (essentially, enumeration of Sigma types); this is essential for enumerating typed terms in non-trivial languages.
Dependent enumeration isn't fast either, but it might be faster than a sieve.
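As a toy illustration of the idea (this is an assumed minimal interface, not the actual SciFe or data/enumerate API): an index stream is enumerated, each index selects its own dependent stream, and the live streams are interleaved so no single infinite stream starves the others:

```python
from itertools import count, islice  # count/islice only used in the example below

def dep_enum(first, second):
    """Enumerate pairs (x, y) where the stream of ys depends on x.
    `first` is an iterable of indices; `second(x)` is an iterable of
    the ys available for that x. Live streams are interleaved, a
    simple diagonalization."""
    streams, source, exhausted = [], iter(first), False
    while streams or not exhausted:
        if not exhausted:
            try:
                x = next(source)
                streams.append((x, iter(second(x))))
            except StopIteration:
                exhausted = True
        alive = []
        for x, it in streams:
            try:
                y = next(it)
            except StopIteration:
                continue              # this dependent stream is done
            alive.append((x, it))
            yield (x, y)
        streams = alive
```

For example, `dep_enum(count(0), lambda n: range(n))` pairs each natural number with the numbers below it. Note that when `second(x)` is usually empty, the loop spins looking for productive indices, which is exactly the degeneration to a search procedure.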

Dependent enumeration is fastest when the dependent clause is always infinite (because you can do a diagonalization); if the dependent clause is finite it can be very slow: for instance, if it is often empty, it degenerates to a search procedure. This is detailed somewhat in the documentation for cons/de in data/enumerate, but there should probably be more. On the other hand, diagonalizing is fast, but doesn't give an ordering with a nice combinatorial interpretation.
– Max New, Mar 22 '17 at 14:33

I have used the "randomly generate terms and check that they are well-typed" approach (you mention generating "untyped" terms; you can also randomly generate terms of a Church-style grammar with explicit type annotations), and it worked very well in practice: it revealed all the bugs there were to find in this particular part of the project. For practical purposes I would recommend trying this first. (The generator was, however, aware of scoping rules: it maintained a set of currently bound variables for top-down generation, which is easy to implement.)
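A scoping-aware top-down generator of this kind might look like the following sketch (the fuel-based termination policy and all names are my assumptions, not the actual code used in the project):

```python
import random

# de Bruijn terms: ('var', i) | ('lam', body) | ('app', f, x)

def random_term(fuel, bound, rng):
    """Top-down random generation of a well-scoped (not necessarily
    well-typed) term; `bound` is the number of lambdas in scope."""
    if fuel <= 0:
        # Out of fuel: emit a variable, or the identity if none is in scope.
        return ('var', rng.randrange(bound)) if bound else ('lam', ('var', 0))
    c = rng.choice(['var', 'lam', 'app'] if bound else ['lam', 'app'])
    if c == 'var':
        return ('var', rng.randrange(bound))
    if c == 'lam':
        return ('lam', random_term(fuel - 1, bound + 1, rng))
    return ('app', random_term(fuel - 1, bound, rng),
                   random_term(fuel - 1, bound, rng))

def well_scoped(t, depth=0):
    """Check that every de Bruijn index points at an enclosing lambda."""
    if t[0] == 'var':
        return 0 <= t[1] < depth
    if t[0] == 'lam':
        return well_scoped(t[1], depth + 1)
    return well_scoped(t[1], depth) and well_scoped(t[2], depth)
```

Every generated term is well-scoped by construction; well-typedness is then checked separately, as the answer describes.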

The choice of order depends a lot on the application you have in mind: two distinct applications will require different orders. For example, people who work on type-directed program synthesis are usually not interested in enumerating equivalent programs, so they adopt simplifying assumptions such as "any term of type A -> B can be chosen to start with a lambda-abstraction" -- in your framework, ordering lambda-abstractions before all other terms at this type. This order is very bad for compiler testing: if you only generate programs in normal form, you are unlikely to exercise much of your optimization strategies or runtime semantics. So I believe that using an order to abstract away from the intended usage is not a good modeling choice: the order is tightly coupled with the intended usage, and I would rather reason about the usage than about the corresponding order.
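The synthesis-style restriction above, that a term of arrow type starts with a lambda-abstraction, amounts to enumerating beta-normal forms only. Here is a small, assumed sketch of that type-directed enumeration for simple types (illustrative, not taken from either cited paper):

```python
# Types: base types are strings, arrows are ('->', a, b).
# Normal forms: N ::= lam N | S;  spines: S ::= var | app S N.

def normal(ty, ctx, size):
    """Yield the beta-normal forms with exactly `size` nodes and type `ty`
    in context `ctx` (the list of types of the variables in scope)."""
    if isinstance(ty, tuple) and size >= 2:
        # Arrow type: per the synthesis heuristic, introduce a lambda.
        for body in normal(ty[2], [ty[1]] + ctx, size - 1):
            yield ('lam', body)
    for i, vty in enumerate(ctx):
        # Otherwise build a spine headed by a variable in scope.
        yield from spine(('var', i), vty, ty, ctx, size - 1)

def spine(head, hty, ty, ctx, budget):
    """Apply `head` (of type `hty`) to arguments until it reaches `ty`."""
    if hty == ty and budget == 0:
        yield head
    if isinstance(hty, tuple):
        for k in range(1, budget):        # argument of size k; the app node costs 1
            for arg in normal(hty[1], ctx, k):
                yield from spine(('app', head, arg), hty[2], ty, ctx,
                                 budget - 1 - k)
```

At the identity type a -> a, the only size-2 inhabitant is the identity itself; at the Church-numeral type (a -> a) -> a -> a, the enumeration produces one term each at sizes 2, 3, and 5 and none at size 4, reflecting how sparse normal forms are at a fixed type.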

Now for some references.

For typed-term generation for testing purposes, you may be interested in "Making random judgments: Automatically generating well-typed terms from the definition of a type-system" by Burke Fetscher, Koen Claessen, Michał Pałka, John Hughes, and Robert Bruce Findler, 2015.

This paper presents a generic method for randomly generating well-typed expressions. It starts from a specification of a typing judgment in PLT Redex and uses a specialized solver that employs randomness to find many different valid derivations of the judgment form.

Our motivation for building these random terms is to more effectively falsify conjectures as part of the tool-support for semantics models specified in Redex. Accordingly, we evaluate the generator against the other available methods for Redex, as well as the best available custom well-typed term generator. Our results show that our new generator is much more effective than generation techniques that do not explicitly take types into account and is competitive with generation techniques that do, even though they are specialized to particular type-systems and ours is not.

For typed-term enumeration for program synthesis, two recent works that I know of are "Example-Directed Synthesis: A Type-Theoretic Interpretation" by Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic, 2016, and "Program Synthesis from Polymorphic Refinement Types" by Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama, 2016. Note that both works use refinement types as a way to encode more of the desired properties of the searched-for program, such as a test suite that it should pass. These refinements often constrain the search space substantially more than types alone, but they are also harder to invert, that is, to interleave with the search.

When you generate a-la-Church terms: do you generate type annotations randomly, or does the choice of type annotation inform the generation of the abstraction's body?
– Martin Berger, Mar 18 '17 at 8:57

I was aware of the work on randomly generating programs, such as Fetscher et al. The key difference is that if you don't want a uniform distribution (e.g. over programs of size $n$), you can just abandon the search and backtrack whenever you are down a hopeless path. That's what Fetscher et al. do. But that's not possible in enumeration.
– Martin Berger, Mar 18 '17 at 8:57

One thing that I forgot to mention is the (natural) idea of writing an enumerator in a logic programming language. I am not sure that it would actually make it easier to push the type-system constraints into the generation (so it may be similar to naive random generation), but a good Prolog implementation may support extra features that make it efficient, such as tabling. I have not tried it myself.
– gasche, Mar 18 '17 at 16:05