[...] Richard had completed his analysis of the behavior of the
router, and much to our surprise and amusement, he presented his
answer in the form of a set of partial differential equations. To a
physicist this may seem natural, but to a computer designer, treating
a set of boolean circuits as a continuous, differentiable system is a
bit strange. Feynman's router equations were in terms of variables
representing continuous quantities such as "the average number of 1
bits in a message address." I was much more accustomed to seeing
analysis in terms of inductive proof and case analysis than taking the
derivative of "the number of 1's" with respect to time. Our discrete
analysis said we needed seven buffers per chip; Feynman's equations
suggested that we only needed five. We decided to play it safe and
ignore Feynman.

The decision to ignore Feynman's analysis was made in September, but
by next spring we were up against a wall. The chips that we had
designed were slightly too big to manufacture and the only way to
solve the problem was to cut the number of buffers per chip back to
five. Since Feynman's equations claimed we could do this safely, his
unconventional methods of analysis started looking better and better
to us. We decided to go ahead and make the chips with the smaller
number of buffers.

Fortunately, he was right. When we put together the chips the machine worked.

I've put the relevant bit in bold. To most mathematicians,
inductive proofs are the tool of last resort. You always try to weave
together the known properties of your object together in a creative
way, and do an exhaustive case analysis only if all else fails.
However, to logicians and computer scientists, induction is the tool
of first resort – as soon as we have an object, we grind it to dust
with induction and reassemble it bit by bit with the properties we
want.

Regular mathematicians find this process roughly as appealing as
filling in tax forms, but it does come with its compensations. One of
those compensations is that we get to collect a wide variety of fixed
point theorems, to justify an equally wide variety of recursive
constructions.

The modal version
of Löb's
theorem is one of the more exotic fixed point theorems in logic.
It is often glossed as representing the "abstract content" of Gödel's
incompleteness theorem, which is the sort of explanation that conceals
as much as it reveals. If you look at the proof, though, you'll see
that its structure is a very close match for something very familiar
to computer scientists – the proof of Löb's theorem is (almost) a
derivation of the Y combinator!

To understand this, let's try formulating the fragment of provability
logic we need as a natural deduction system. The types we will use are
the following:
$$
\begin{array}{lcl}
A & ::= & b \bnfalt A \to B \bnfalt \mu a.\,A(a) \bnfalt \Box A
\end{array}
$$

This looks like the simply-typed lambda calculus, extended with
recursive types and a special $\Box A$ type former. This is almost,
but not quite, correct! $\Box A$ is indeed a modal operator -- in
particular, it satisfies the axioms of the K4 modality. However, the
recursive type $\mu a.\,A$ is not an ordinary recursive type, because
unrestricted recursive types lead to logical inconsistency. Instead,
$\mu a.\,A$ is a modal fixed point. That is, it satisfies
the isomorphism:
$$
\mu a.\,A \simeq A{[\Box(\mu a.\,A)/a]}
$$

The two key rules are the introduction and elimination rules for $\Box
A$.

The elimination rule is the standard Davies-Pfenning rule. It says that
if you can prove $\Box A$, you can add an $A$ to the modal context $\Delta$
and carry on with your proof.

The introduction rule is different, and weaker than the DP S4
rule. It says you can introduce a $\Box A$, if you can prove $A$ only
using the modal hypotheses in $\Delta$. So the variables in $\Gamma$
are deleted, and the ones in $\Delta$ are copied into the ordinary
hypothetical context. This setup is weaker than the DP introduction rule,
and captures the fact that in K4 you can't derive the comonadic unit
rule that $\Box A \to A$.

Once we have this, it becomes possible to derive Löb's theorem. Before
we do that, let's look at the derivation of the Y combinator. To derive a
fixed point at a type $P$, we introduce a recursive type $μa.\, a \to P$,
and then we can add fold's and unfold's to the standard definition of the
Y combinator:

You could also write this more compactly as λf. let g = λx. f
(unfold x x) in g (fold g), but I have introduced some spurious
let-bindings for a reason which will become apparent in a second.

Namely, virtually the same construction works with Löb's theorem!
We will take the modal recursive type $\mu a.\,a \to P$, and
then add some boxing and unboxing to manage the boxes (I'll annotate
the variables with their types so that you can see what's going on):

Note that if you took $\Box$ to be the identity constructor, and
erased all the $\box$'s from this term, then this proof of Löb's
theorem is actually exactly the same as Curry's Y combinator!

However, people more typically take $\Box$ to mean something like
"quotation" or "provability". What proof shows is that there's nothing
mysterious going on with self-reference and quotation -- or rather, it
is precisely the same mystery as the universality of the lambda-calculus.

Postscript: One of the nice things about blogging is that you can be more informal than in a paper, but it's also one of the pitfalls. I got an email asking about the history of this system, and of course a blog post doesn't have a related work section! Basically, a few years ago, I worked out a calculus giving a syntax for applicative functors. An applicative is basically a monoidal functor with a strength, and I noticed that the modal logic K is basically a monoidal functor without a strength. There matters stood until last year I read Kenneth Foner's Haskell Symposium paper Getting a Quick Fix on Comonads, which was about programming with Löb's theorem. This was interesting, and after Wikipedia's proof of Löb's theorem, I saw that (a) the proof was basically Curry's paradox, and (b) I could adapt the rules for applicatives to model the modal logic K4, which could (with a small bit of tinkering) have a very similar calculus to applicatives. (In fact, the proof I give of Löb's theorem is a transliteration of the proof on Wikipedia into the system in this blog post!)

I should emphasize that I haven't really proved anything about these systems: I think the proofs should be straightforward, but I haven't done them. Luckily, at least for K, I don't have to -- recently, Alex Kavvos has put a draft paper on the arXiv,
System K: a simple modal λ-calculus, which works out the metatheory of K with a dual-zone Davies/Pfenning-style calculus. If you need syntax for applicative functors minus strength, do take a look!

Friday, May 6, 2016

I lived in Cambridge when I did a postdoc with Nick Benton at Microsoft Research, so it'll be nice to be able to reconnect with friends I haven't been able to see regularly a while. At the same time, of course, I will miss the friends I have made here, such as Dan Ghica and Paul Levy.