Research Blog of 穆信成 Shin-Cheng Mu

Menu

Category Archives: Research Blog

Before learning dependent types, all I knew about program termination was from the work of Henk Doornbos and Roland Backhouse [DB95, DB96]. I am curious about the connection between their formulation of well-foundedness by relational calculation and the type theoretic view.

It is generally accepted that a relation _<_ is well-founded if, for all x, there is no infinite chain x > x₁ > x₂ > x₃ > …. . A related concept is inducvitity. We say that a relation admits induction of we can use it to perform induction. For example, the less-than ordering on natural number is inductive.

Most of the type theoretical paper I have seen equate well-foundedness to the concept of accessibilty, which I talked about in a previous post. For example, Bengt Nordström [Nor88] introduced the datatype Acc as a “constructively more useful description” of well-foundedness.

Doornbos and Backhouse generalised well-foundedness and other related notions to an arbitrary F-algebra. Well-foundedness, however, was generalised to “having a unique solution.” A relation R : FA ← A is F-well-founded if there is a unique solution for X = S . FX . R.

Reductivity

Inductivity, on the other hand, is generalised to F-reductivity. While F-reductivity is a generalisation of strong induction, Doornbos and Backhouse called it reductivity “in order to avoid confusion with existing notions of inductivity.” [DB95] (Nevertheless, an equivalent concept in Algebra of Programming modelled using membership relation rather than monotype factors, calls itself “inductivity”. Different usage of similar terminology can be rather confusing sometimes.)

A relation R : FA ← A is F-reductive if

μ(λP . R⧷FP) = id

where ⧷ is the monotype factor defined by ran(R . A) ⊆ B ≡ A ⊆ R⧷B. The function λP . R⧷FP takes coreflexive (a relation that is a subset if id) P to coreflexive R⧷FP, and μ takes its least fixed-point.

Well, it is a definition that is very concise but also very hard to understand, unless you are one of the few who really like these sort of things. If we take P to be a predicate, R\FP expands to a predicate on x that is true if

∀y . y R x ⇒ FP y

Taking the fixed point means to look for the minimum P such that

∀x . (∀y . y R x ⇒ FP y) ⇒ P x

The expression now resembles strong induction we all know very well. Computationally, how do we take the fixed point? Of course, by defining a recursive datatype:

where f : (a -> Set) -> (F a -> Set) is the arrow-operation of functor F. When we take F to be the identity functor, it is exactly the definition of accessibility in my previous post! Finally, the requirement that μ(λP . R⧷FP) = id means every element in a is accessible.

So, it seems that reductivity of Doornbos and Backhouse is in fact accessibility, which is often taken by type theorists to be an alternative definition of well-foundedness. Doornbos and Backhouse, however, showed that reductivity guarantees uniqueness solution of hylo equations (which is their definition of F-well-foundedness). The converse, however, is not true in general.

New Reductive Relations from Old

Doornbos and Backhouse also developed some principles to build new reductive relations from existing ones. Let us try to see their Agda counterparts. For simplicity, we take F to be the identity functor in the discussion below.

All initial algebras are reductive. For example, given Pred relating n and suc n, one can show that Pred is reductive.

In dependently typed programming, we would like to ensure that proof terms are terminating. Apart from the type checker, Agda employs a separate termination checker. How it exactly works is still at times mysterious to me, but basically it recognises a program as terminating if some of its arguments become “structurally” smaller after each recursive call. This is already far more expressive than I thought — quite a number of programs can be written by structural recursion. Still, we occasionally need some more flexibility. One of such occasion was when I wished to express terminating unfolds in AoPA. For another example, consider this function mentioned in the paper Terminating General Recursion by Bengt Nordström:

f (n, k) = if k ≤ n then n else f(f (n + 1, k), k)

It is a variation of McCarthy’s 91 function, used as a popular example due to its non-trivial recursion pattern. Can we express this function in Agda while informing the termination checker that it does terminate?

Given a relation < that has been shown to be well-founded, we know that a recursively defined function is terminating if its arguments becomes “smaller” with respect to < at each recursive call. We can encode such well-founded recursion in Agda’s style of structural recusion. The earliest paper I could find about such encoding is Terminating General Recursion by Bengt Nordström. It may have appeared in other forms, however.

Accessibility

Given a binary relation _<_ : a -> a -> Set, an element x is in the well-founded subset of < if there is no infinite chain of x₁, x₂, x₃ ... such that x > x₁ > x₂ > x₃ > .... The relation is well-founded if all elements of a are in its well-founded subset. The less-than ordering on natural number is well-founded, while the greater-than ordering is not. A more constructive way to talk about well-foundedness is via accessibility. Given a relation _ a -> Set, the proposition Acc _ states that x is accessible under relation <. An element x : a is accessible if all y : a "smaller" than x are accessible:

A relation _ a -> Set is well-founded if all elements in a are accessible:

well-found : {a : Set} -> (a -> a -> Set) -> Set
well-found _ Acc _

For an example, consider the less-than-or-equal-to ordering on natural numbers. It is usually modelled by a datatype stating that 0 ≤ n for all n and that 1+m ≤ 1+n follows from m ≤ n. The strictly-less-than ordering is defined by m < n = 1+m ≤ n. For well-foundedness proofs, however, the following definitions turn out to be more convenient:

Its type says that if we can provide a function Φ that proves some property P x for all x provided that P y has been proved for all y < x, then P holds for all accessible elements in a. Recall strong induction.

When < is well-founded, all elements are accessible. Therefore we can define:

where sm-sn<sm : forall m n -> (m ∸ n) <′ suc m. Notice how rec-wf ℕ behaves like a fixed-point operator taking the "fixed-point" of body. The recursive call is replaced by a function rec, having type (m : ℕ) -> m ℕ. To call rec, the programmer has to provide a proof that the argument is indeed getting smaller with respect to the relation _<_.

The function could also be written without explicit fixed-point, but with explicit pattern matching with the accessibility proof:

Upper-Bounded Reverse Ordering

As a practice, let us consider another ordering. For a given natural number k, define n ◁ m = m < n ≤ k. That is, for m and n upper-bounded by k, the one closer to k is considered smaller. Apparently ◁ is well-founded. I tried to encode it in Agda by:

_ ℕ -> ℕ -> Set
m < n ≤ k = (m

◁ : ℕ -> ℕ -> ℕ -> Set
(◁ k) m n = n < m ≤ k

However, the accessibility proof became very difficult to construct. I ended up defining yet another variation of the less-than-or-equal-to ordering:

It is perhaps quite hard to understand unless you try to construct it yourself. The reader only needs to notice that in the two recursive calls to ◁-access (in the second and the last clause), the argument m is incremented to suc m. That is why we need to include a proof term of m as an argument, which becomes structurally smaller in the recursive calls. This is how we inform the termination checker that, although m increases, the distance between m and k shrinks. The proof of n ≤ k, on the other hand, is used to establish the contradictions in the first and the third clause.

A Variation of McCarthy's 91 Function

Back to the function f mentioned in the beginning, whose native Agda translation:

f k n with k ≤? n
... | yes _ = n
... | no _ = f k (f k (suc n))

certainly does not pass the termination check.

In fact, the second argument (n) to f becomes smaller with respect to ◁ k. This is obvious for the inner recursive call (f k (suc n)) since k ≰ n implies n < 1 + n ≤ k. To prove the termination of the outer recursive call, we have to show that f k n actually computes the maximum of k and n. Since k ≰ n, we have f k (suc n) = k. Therefore, f k (suc n) is smaller than n with respect to ◁ k.

To prove the termination of f we have to prove something about its result. One of the possible ways to do so is to define:

P : ℕ -> ℕ -> Set
P k n = ∃ ℕ (\fn -> n ≤′′ k -> fn ≡ k)

and extend the type of f to:

f : (k : ℕ) -> (n : ℕ) -> P k n

Now that f returns an existential type, we have to pattern math its result. A definition of f may look like:

After seeing our code, Nils Anders Danielsson suggested two improvements. Firstly, to wrap the bound in the seed. The terminating unfoldr↓ would thus have a simpler type as well as a simpler implemenation:

The function split↓ I eventually come up with, however, is much more complicated than I had wished. Even worse, it is now split↓ that fails to pass the termination check.

Ulf Norell suggested some possible fixes. The difficulties, however, is probably a hint that there is something wrong in my approach in the first place. Rather than trying to fix it, Nils Anders showed me how he would tackle the problem from the beginning.

Using Well-Founded Recursion

Nils Anders showed me how to define unfoldT↓ using well-founded recursion. For a simplified explanation, the Standard Library provides a function having type (after normalisation):

With one can define functions on natural numbers by recursion, provided that the argument strictly decreases in each recursive call. P is the type of the result, parameterised by the input. The function f is a template that specifies the body of the recursion which, given i, computes some result of type P i. The functional argument rec is supposed to be the recursive call. The constraint j , however, guarantees that it accepts only inputs strictly smaller than i (the ordering is a variation of < that is more suitable for this purpose). One may perhaps think of as a fixed-point operator computing the fixed-point of f, only that f has to take i before rec because the latter depends on the former.

Let us try to define an unfold on trees using . The "base-functor" of the datatype Tree⁺ a is F b = a ⊎ b × b. One of the lessons we have learnt is that it would be more convenient for the generating function to return the bound information. We could use a type like this:

F a b k = a ⊎ ∃ (ℕ × ℕ) (\(i , j) -> b i × b j × i

But it is perhaps more reader-friendly to define the base functor as a datatype:

All the definition above makes no recursive calls. All the tricks getting us through the termination check are hidden in . How is it defined?

Well-Founded Recursion Defined on

Currently, well-founded recursion are defined in Induction and its sub-modules. They are very interesting modules to study. The definitions there, however, are very abstract. To understand how works, let's try to implement our own.

Recall that the recursion template f has type forall i -> (forall j -> j P j) -> P i. That is, given i, f computes P i, provided that we know of a method to compute P j for all j . Let us try to construct such a method using f. The function guard f i computes f, provided that the input is strictly smaller than i:

Observe that guard terminates because it is defined inductively on i. If we discard the type information, all what guard f i j is to make a call to f j. Before doing so, however, it checks through the proof to make sure that j is strictly smaller than i. In f j (guard f j), the call to guard f j ensures that subsequent calls to f are given arguments smaller than j.

Now can be defined by:

Set) ->
(forall i -> (forall j -> j P j) -> P i) ->
forall n -> P n

In Induction.Nat, is an instance of well-founded recursion defined using the concept of accessibility, defined in Induction.WellFounded. I find them very interesting modules about which I hope to understand more.

The type of sort-der is a proposition that there exists a program of type [ Val ] → [ Val ] that is contained in ordered ? ◦ permute, a relation mapping a list to one of its ordered permutations. The combinator fun embeds a function in a relation. Relations on each line are related by relational inclusion (⊒), while the proof of inclusion are given in angle brackets. In the final step, the relation foldR (fun (uncurry insert)) nil is refined to a function using the rule:

which says that a relational fold is a functional fold if the algebra (f) is a function. When the proof is completed, an algorithm isort is obtained by extracting the witness of the proposition. It is an executable program that is backed by the type system to meet the speciﬁcation.

One may wonder: why insertion sort, rather than something more interesting such as mergesort or quicksort? Because we have not yet figured out how to properly deal with hylomorphism yet. One of the things still missing in AoPA is a treatment of unfolds. AoPA attempts to model the category of sets and relations. In this context one can talk about unfolds because they are simply the converses of folds. We perform the derivations in terms of folds, and refine the converse of a relational fold to a functional unfold provided that the coalgebra is a function:

The only problem is that this unfoldr cannot pass the termination check of Agda.

My first thought was to wait until Agda supports codata and coinduction. However, the reason I wanted an unfold in AoPA in the first place was to compose it with a fold to form a hylomorphism, which is not allowed if the unfold returns codata. I then attempted to model unfolds in a way similar to what we did in relational program derivation. We know that the recursive equation:

R = S ○ FR ○ T

has a unique solution R = ⦇S⦈ ○ ⦇T⦈˘ if and only if T is well-founded. In program derivation, the proof of well-foundedness is usually given separately. Therefore I hope to come up with a variation of unfoldr which, given a proof of well-foundedness, somehow passes the termination check.

During DTP 2008, I consulted a number of people and was suggested at least three solutions. I am going to summarise them in turn.

Canonical Examples

With each approach I would like to tackle two tasks. Given the following function dec:

dec : ℕ -> ⊤ ⊎ (ℕ × ℕ)
dec zero = inj₁ tt
dec (suc n) = inj₂ (n , n)

One may attempt to define a function down using unfoldr such that down n produces a list from n to 0. As mentioned above, unfoldr cannot pass the termination check. In fact, even the following definition fails the termination check:

If we apply, for example, foldT merge wrap, we get merge sort (I use non-empty lists to save the constructor for empty trees). In Haskell, one might first define a function that splits a list to roughly two halfs, something equivalent to this:

and use an unfold on Tree⁺. The second task is to construct a function expand : {a : Set} -> List⁺ a -> Tree⁺ a that builds a roughly balanced tree, using some kind of unfold that makes use of split⁺, or at least some variation of it.

Josh Ko’s Approach

Josh and I tried to extend unfolds with extra arguments representing a bound and proofs that the bound decreases with each call to f:

Apart from the generating function f and the seed b, the function unfoldr↓ takes a bound t, a natural number. The seed and the bound is related by ≾, to be defined for each specific problem. Before producing the list, unfoldr↓ demands that the current seed is bounded by the bound (y ≾ t), a proof that f must yield inj₁ when the bound is zero, and a proof that if f returns inj₂ (x , y'), the new seed y' is bounded by a strictly smaller bound.

To define unfoldr↓ we need the inspect idiom (see the explanation in Relation.Binary.PropositionalEquality):

One would wonder whether the condition (y₁ ≼ i × y₂ ≼ i) is too restrictive. When the proposition has to be proved inductively, we may need a stronger inductive hypothesis, for example, that both y₁ and y₂ are bounded by half of suc i. The current definition luckily works for our second task. We may need some generalisation later.

Repeated calls to split⁺ eventually terminates because the lengths of the lists are strictly decreasing. Therefore we define:

Definition of lemma1 is a simple application of contradiction. On the other hand, lemma2 is very tedious to define. It appears to me that the definition inevitably becomes cumbersome because split⁺ and lemma2 are defined separately. It could be much easier if split⁺ is defined to return the computed results as well as some proofs about them. For my application in AoPA, however, I would prefer to be able to reuse the old split⁺. I don’t know a good solution yet.

Nils Anders Danielsson suggested that it would be nicer to integrate the bound into the type of the seed, while Conor McBride suggested yet another approach. I will try to summarise them next time.

Back in 2003 when I just started my postdoc in PSD Lab in University of Tokyo, my colleagues there were discussing about the third homomorphism theorem. The theorem says that if a function f can be expressed both as a foldr and a foldl:

f = foldr (≪) e
f = foldl (≫) e

there exists some associative binary operator ⊚ such that

f [] = e
f [a] = a ≪ e = e ≫ a
f (xs ⧺ ys) = f xs ⊚ f ys

Being a list homomorphism implies the potential of parallel computation.

The paper we studied was, of course, The Thrid Homomorphism Theorem by Jeremy. The aim then was to automatically, or semi-automatically, derive ⊚, given ≪, ≫, and e. The motivating examples include:

I myself was not particularly interested in automation. I was interested in the challenge for two reasons. Firstly, it appears that some kind of inverse function is needed. Secondly, when I looked at Jeremy’s proof, I felt there is a relational proof inside waiting to be discovered. So I tried.

Setting the Stage

For the ease of point-free calculation, we define alternatives of folds where the input is paired with the base-cases:

Once we have the lemma proved, the theorem follows by taking ⊚ = f . cat . (f °× f °). It has to be a (partial) function because ⊚ . (f × f) = f . cat is a function. Futhermore, ⊚ is associative because cat is.

The proof of Lemma 1 is simply some applications of (1), (2), and S . S° . S = S:

The proof is still the same as that of Jeremy’s, but in a relational disguise.

Refining to Functions

To derive actual algorithms, we have yet to refine ⊚ = f . cat . (f°× f°) so that it uses functional components only. We shall pick some g ⊆ f° whose domain equals the range of f, such that ⊚ = f . cat . (g × g). (An equality, rather than inclusion, holds because ⊚ in both definitions are partial functions having the same domain.)

For sort, define sorted = ran sort. It is a partial function such that sorted xs = xs iff xs is sorted. Notice that sorted ⊆ sort°. Therefore, a possible choice for ⊚ would be sort . cat . (sorted × sorted) — concatenating two sorted lists, and sort them again. Jeremy then went on with this definition and proved that ⊚ = merge, taking advantage of the fact that both input lists are sorted.

Some more Properties

Some more properties are needed to deal with scanr. The following properties allow foldrp and foldrp to proceed by shifting elements of the list to the base case:

When simple approaches of refining ⊚ = f . cat . (f°× f°) does not work, the following approach sometimes does. Since f is a fold, one may attempt to take one of the f° as an unfold, thus forming an “unfold, then fold” pattern:

Prologue

Given the complexity of the proof above, I did not think there was a hope to automatically construct ⊚ for a reasonably useful set of list homomorphisms. My colleagues were talking about “weak inverses” — their attempts to look for a refinement of f°, which I considered too ad-hoc.

Being just graduated from AoP, I was perhaps too arrogant and proud of all the clever derivation we did to care about automatic program construction. (Like Jeremy, in the end of my thesis I quoted “善數不用籌策 (Those who good at calculation need no mechanical aids)” from 老子 Lao Tzu. And didn’t Dijkstra say “I hardly used the computer they gave me. I was perfectly happy without it.”?) The relational method, which seemed to cover everything, gave me a false feeling that I knew the problem inside out.

Last year, however, my colleagues and I met again and I was told that they eventually published a paper on this subject:

They focused on weak inverses that returns only lists with fixed lengths. The advantage is that calculations like the one above are no longer necessary — since the lists are of fixed-length, ⊚ always takes constant time. On the other hand, scanr and scanl are special cases dealt with on a different level. Such distinction is enforced by the language they allow to construct f. No, they do not seem to have handled sort. But their approach still covered a reasonably useful collection of functions.

Well, I think the moral is that we cannot just stop when it appears that is no elegant solution that suits our taste. It sometimes pays to get our hands dirty, through which we may eventually discover the beauty within.

By chance, I came upon a blog entry by Masahiro Sakai (酒井政裕) in which he, after reading my short comment on my previous homepage “Do you know that the S combinator is injective? I have a simple algebraic proof!”, tried to construct the inverse of S and showed that S⁻¹ ○ S = id in, guess what, Agda!

Wow, people were actually reading what I wrote! This made me feel that I have to be a bit more responsible and, at least, provide the proof when I claim I have one. So, here it is.

S is injective

Recall the definition of S:

S : (A -> B -> C) -> (A -> B) -> A -> C
S = λ x y a -> x a (y a)

I am assuming a simple semantics of sets and functions, and by S being injective I mean that S x = S x' ⇒ x = x', which can be trivially shown below:

I would usually leave out the ∀‘s in derivations, but in this case, they are explicitly written to emphasise that b is indeed universally quantified.

So, what is the inverse of S?

Now that we know S is injective, there ought to exist some function S⁻¹ such that S⁻¹ ○ S = id. Nakano san claimed that a definition would be:

S⁻¹ : ((A -> B) -> A -> C) -> A -> B -> C
S⁻¹ = λ x a b -> x (K b) a

That S⁻¹ ○ S = id is easy to see:

S⁻¹ (S x) a b
= (S x) (K b) a
= x a (K b a)
= x a b

From another direction, we have only S ○ S⁻¹ ⊆ id since S is not a surjective function. How the range of S look like? Inspecting the definition of S. Since y is applied only once to a, the value of y on other input does not matter. That is, the range of S consists of only functions e such that:

Inverting higher-order functions?

Some endnotes. Once Nakano and I thought we discovered how to invert higher-order functions (and even derive the necessary side conditions, like the one on e above) by drawing λ expressions as trees and manipulating them. It turned out that I overlooked something obvious in the very beginning and the result was a disaster.

Does anyone know of some related work on inverting higher order functions? The only one I know is Samson Abramsky’s paper A Structural Approach to Reversible Computation, but I am not sure whether it is about reversibility/invertibility in the same sense.

After reading Atsushi Igarashi’s paper Deriving Compilers and Virtual Machines for a Multi-Level Language, I gave myself the following exercise: given an interpreter of a small language, derive a virtual machine and a corresponding compiler. Such techniques probably date back to Ager et al., where they built connections between not one but five pairs of denotational semantics and existing abstract machines (Krivine’s, CEK, CLS, SECD, and Categorical Abstract Machine). They continued to explore a number of variations in subsequent papers. Igarashi’s paper also deals with a more difficult topic — to derive an abstract machine for a language with quotes/unquotes. As a starting point to appreciating their work, I tried this small example one could do in one afternoon.

Let us start with this simple language with numerical values, a binary operator for addition, lambda abstraction and variables represented by de Bruijn notation, and application:

Like in many derivations, we have cheated a little from the beginning. The use of de Bruijn indices makes env act like a stack, while CPS abstracts the control flow of the virtual machine to be derived.

Defunctionalising the Continuation

The derivation relies on Reynolds’ defunctionalisation: to enumerate the instances where a higher order function is used, and represent them by first-order structures. As the first step, we aim to replace the continuation Cont, a synonym of Val -> Val, by some datatype:

data Cont = Cont 0 | Cont1 ... | Cont2 ... | ...

and to define a function appK :: Cont -> Val -> Val pattern-matching Cont and performing the actions that were to be done in eval. Where there used to be a continuation application in eval, we replace it with an application of appK:

Separating the Evaluator from the Virtual Machine

The functions eval and appK are mutually recursive. Later, however, appK is to evolve into a part of the virtual machine and eval the compiler. We do not want to mix the two stages. The next task is therefore to separate them.

In the definition of appK, the function eval is called in three places. In all these cases, eval is applied to an expression captured in Cont. Rather than performing the application in appK, we can also make it happen earlier in eval and store the partially applied result in Cont:

Notice that while Env is a data stack, Cont, having a list-like structure, acts as a control stack. Constructors Cont1, Cont3, and Cont4 store a pointer (Env) into the data stack. What about Compt? As we will soon see, by applying another defunctionalisation, Compt becomes a list of instructions. The pointer in Cont with type Compt can therefore be seen as an instruction pointer.

Defunctionalising the Computation

By defunctionalising, we replace Compt by a first-order data structure, and create a function appC :: Compt -> Env -> Cont -> Val. This time, however, we try to choose more telling names for the introduced instructions. The first two clauses of eval, for example, is split into:

Once you understand how it works, the derived result is not so surprising. Igarashi sort of mentioned that it is not too difficult to add recursion. See his paper for more interesting results, such as extension to multi-level languages — programs that generate programs.

To practise using the Logic.ChainReasoning module in Agda, Josh proved the foldr-fusion theorem, which he learnt in the program derivation lecture in FLOLAC where we used the maximum segment sum (MSS) as the main example. Seeing his proof, I was curious to know how much program derivation I can do in Agda and tried coding the entire derivation of MSS. I thought it would be a small exercise I could do over the weekend, but ended up spending the entire week.

As a reminder, given a list of (possibly negative) numbers, the MSS is about computing the maximum sum among all its consecutive segments. Typically, the specification is:

If it works well, we can probably use Agda as a tool for program derivation!

Currently, however, I find it harder to use than expected, perhaps due to my being unfamiliar with the way Agda reports type errors. Nevertheless, Agda does force me to make every details right. For example, the usual definition of max I would use in a paper would be:

max = foldr _↑_ -∞

But then I would have to define numbers with lower bound -∞. A sloppy alternative definition:

max = foldr _↑_ 0

led me to prove a base case 0 ↑ max x ≡ max x, which is not true. That the definition does work in practice relies on the fact that segs always returns empty list as one of the possible segment. Alternatively, I could define max on non-empty lists only:

max : List⁺ Val -> Val
max = foldr⁺ _↑_ id

where List⁺ A is defined by:

data List⁺ (A : Set) : Set where
[_]⁺ : A -> List⁺ A
_::⁺_ : A -> List⁺ A -> List⁺ A

and refine the types of inits, tails, etc, to return non-empty lists. This is the approach I took.

ListProperties.agda: some properties I need about lists, such as fold fusion, concat ○ map (map f) = map f ○ concat, etc. Later in the development I realised that I should switch to non-empty lists, so not all of the properties here are used.

Derivation.agda: the main derivation of MSS. The derivation is parameterised over any numerical data and operators + and ↑ such that + is associative, and a + (b ↑ c) = (a ↑ b) + (a ↑ c). The reason of this parameterisation, however, was that I did not know how to prove the properties above, until Josh showed me the proof.

The reason I studied the thinning theorem again is because I need a slightly generalised variation. The following seems to be what I need. The general idea and the term “pruning” emerged from discussion with Sharon Curtis. The term “lax preorder” is invented by myself. I am not good at naming, and welcome suggestions for better names.

The notation below are mostly taken from the book Algebra of Programming. Not many people, even among functional programmers, are familiar with these notations involving relational intersection, division, etc. One starts to appreciate their benefits once he/she gets used to using their calculation rules. Most of the time when I was doing the proof, I was merely manipulating the symbols. I could not have managed the complexity if I had to fall back to the semantics and think about what they “mean” all the time.

A relation Q :: PA ← A between a set of A and an element is called a lax preorder if it is

Any preorder R induces a lax preorder ∋ . R. If a relation S is monotonic on R˘, it is monotonic on lax preorder ∋ . R. Furthermore, prune (∋ . R) = thin R. Therefore, pruning is a generalisation of thinning. We need the notion of lax preorders because, for some problems, the generating relation S is monotonic on a lax preorder, but not a preorder.

A while ago when I needed to consider some variations of the thinning theorem I tried to do the proof again. And, horrifyingly, I could not do it anymore! Have my skills become rusty due to lack of practice in the past few years? In panic, I spent an entire afternoon fighting with it, until I realised that it was just a typical copying error from the very beginning: when I copied a property from the book I put in an extra Λ. Then I trapped myself in the maze of expanding ΛR into ∈\R ∩ (R\∈) and using modular law and ….

Having fixed the error, I get my trivial and easy proof back again. Anyway, I am going to record it below, in case I run into the same panic again.