Monday, October 20, 2014

Ever since I learned about them, I've thought of
call-by-push-value and focusing (aka polarization) as
essentially two different views of the same problem: they both give a
fine-grained decomposition of higher-order effectful programs which
permits preserving the full βη-theory of the language.

Until this morning, I had thought that the differences were merely
cosmetic, with CBPV arising from Paul Levy's analysis of the
relationship between denotational semantics and operational semantics,
and focusing arising an analysis of the relationship between
operational semantics and proof theory (a lot of people have looked
at this, but I learned about it from Noam Zeilberger). Both systems decompose a
Moggi-style computational monad into a pair of adjoint operators,
which mediate between values and computations (in CBPV) and positive
and negative types (in focusing). So I thought this meant that “value
type” and “positive type” were synonyms, as were “computation type”
and “negative type”.

This morning, I realized I was wrong! Focusing and call-by-push-value
make precisely the opposite choices in their treatment of variables!
To understand this point, let's first recall the syntax of types for
a call-by-push-value (on top) and a polarized (on bottom) calculus.

At first glance, these two grammars look identical, save only for the
renamings and . But this is
misleading! If they are actually the same idea, the reason has to be
much more subtle. The reason for this is that the typing judgements
for these two systems are actually quite different.

In call-by-push-value, the idea is that is a functor which
is left adjoint to . As a result, values are interpreted in a
category of values , and computations are interpreted in a
category of computations . The adjunction between values and
computations means that the hom-set is ismorphic
to the hom-set .
This adjunction gives rise to the two basic judgement forms of
call-by-push-value, the value judgement
and the computation judgement . The idea
is that and
.

The key bit is in the interpretation of contexts in computations, so
let me highlight that:

Note that we interpret contexts as , and so this says that
variables refer to values.

However, in a polarized type theory, we observe that positive types
are “left-invertible”, and negative types are “right-invertible”. In
proof theory, a rule is invertibile when the conclusion implies the
premise. For example, the right rule for implication introduction in
intuitionistic logic reads

This is invertible because you can prove, as a theorem, that

is an admissible rule of the system. Similarly, sums have a
left rule:

such that the following two rules are admissible:

The key idea behind polarization is that one should specify the
calculus modulo the invertible rules. That is, the judgement on the
right should fundamentally be a judgement that a term has a positive
type, and the hypotheses in the context should be negative. That is,
the two primary judgements of a polarized system are the positive introduction
judgement

which explains how introductions for positive types work, and the
negative elimination (or spine judgement)

which explains how eliminations for negative types work. The
eliminations for positive types are derived and the introductions for
negative types are derived judgements (which end up being rules for
pattern matching and lambda-abstractions) which make cut-elimination
hold, plus a few book-keeping rules to hook these two judgements
together. The critical point is that the grammar for consists
of negative types:

This is because positive types are (by definition) left-invertible,
and so there is no reason to permit them to appear as hypotheses. As
a result, the context clearly has a very different character than in
call-by-push-value.

I don't have a punchline for this post, in the sense of “and therefore
the following weird things happen as a consequence”, but I would be
astonished if there weren't some interesting consequences! Both
focalization and call-by-push-value teach us that it pays large
dividends to pay attention to the fine structure of computation, and
it's really surprising that they are apparently not looking at the
same fine structure, despite apparently arising from the same
dichotomy at the type level.