Friday, February 24, 2006

Despite being moderately good at mathematics, even managing to scrape together a PhD, there are certain topics that have always been brick walls to me, so that I find it hard to get started even at the most elementary level. In algebraic topology I always had problems with spectral sequences, but they're not so elementary and are notoriously tricky. In logic, though, I can barely get off the ground. Here's an example of a sentence from an introduction to linear logic that baffles the hell out of me: "One of the most important properties of the proof-rules for Classical Logic is that the cut-rule is redundant". This is one of the most ridiculous things I have read in mathematical writing. If it's redundant then don't study it. Excise it from the list of derivation rules and don't bother with it ever again.

I'm sure that when set theorists first tried to write down the axioms that became ZF they found lots of redundant axioms. Over the years these were whittled down to the list we have today, so that I bet you can't even name the axioms that were jettisoned for redundancy. Not so in "Gentzen style" logic. Every document ever written on the subject seems to introduce this rule and then, with a flourish, show how it can be eliminated. They develop the whole subject and then proceed to demolish the earlier work by showing how they can rewrite everything they did earlier without using this rule. The only explanation I can come up with is that authors of books on logic are paid by the word, and that this allows them a few extra chapters for very little work.

Of course the problem here must be me. I'm sure there's a perfectly good reason for harping on about the cut rule, I just don't see it. And I think this points to a difficulty with trying to read mathematics texts outside of academia. When you're a student you're sharing extratextual material all the time. People tell you that such and such a result is interesting because it has an application somewhere and then you go and read the formal details knowing what the motivation was. Or someone will give an informal seminar where they write equations on the board, but speak much more informally between the equations. But most mathematics texts hide this material from you and present the bare facts. This is fine when you're in an academic environment, but I have to confess to finding it difficult when working on my own.

One thing I'd love to see online is the equivalent of the reading seminars we did during my PhD work. Each week we'd read a chapter and then have a seminar to discuss what we'd just read. Does anyone do this? Blogs seem like a great way to do this but I've seen no evidence of groups working like this.

Hi Derek. At one point that paper asks "what is the point of sequents?" and proceeds to answer. And it asks "what is the price for eliminating cuts?", which is the subject of the paper. In other words, it has informal text that asks the kinds of questions that I've been asking myself. This is excellent. Exactly what I'm looking for. It even talks a little about linear logic which is my longer range target. Many thanks!

I'm glad Derek posted as he did. It seemed to me that this had to be the reason: namely that the cut-rule, though redundant, makes things a lot easier.

In theoretical computer science, we often wave our hands about machine models because we know that somewhere it can be shown that they are mostly equivalent. And so a result showing that a particular machine model is not special is very important, even if one then goes on to prove results in that machine model.

Cut elimination basically says 'you can use lemmas'. So we're using cut all the time, and it's essentially a fundamental part of our logic. It's the fact that you don't *have* to use it that's interesting - even though it's in principle unnecessary, we're still going to be using it all the time. Some classical logics do indeed take the principle that because it's redundant you shouldn't take it as an axiom, but they then have to immediately go on to prove it as a metatheorem before they can do anything especially useful!

(Which isn't to say that they're wrong in doing so, just that cut elimination will come up regardless of how you slice it).

Oh, idle note. The ZF axioms are redundant. Specification follows from replacement, and pairing follows from replacement + power set. I think that's it, but I'm not really sure. They're kept in for more or less the same reason - were they not axioms, you would need to immediately make them theorems.
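For what it's worth, here's a sketch of the standard derivation of pairing from replacement plus power set (assuming the empty set is already available, say via infinity and separation):

```latex
% \mathcal{P}(\mathcal{P}(\emptyset)) = \{\emptyset, \{\emptyset\}\}
% is a two-element set. Apply replacement to it with the definable
% class function
F(x) =
  \begin{cases}
    a & \text{if } x = \emptyset \\
    b & \text{if } x = \{\emptyset\}
  \end{cases}
% The image is \{F(\emptyset), F(\{\emptyset\})\} = \{a, b\},
% which is exactly what the pairing axiom asserts to exist.
```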

Yes, my first thought was "why not make it a metatheorem?" if it's useful.

Derek's recommendation has made things a lot clearer.

I did already understand the analogy with programming: the reuse of a lemma is similar to the reuse of a function, and cut elimination is like inlining a function call. (And Curry-Howard etc. formalises this.) But I couldn't understand why logicians would be so hung up on cut elimination.
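To make the analogy concrete, here's a minimal sketch (the function names are mine, purely for illustration): a proof that uses a lemma twice is a program that calls a helper twice, and eliminating the cuts corresponds to inlining the helper's body at each call site.

```python
# A "lemma": a helper defined once and reused.
def double(n):
    return n + n

def with_cut(x):
    # A proof that appeals to the lemma twice: two "cuts".
    return double(x) + double(x + 1)

def cut_free(x):
    # The same proof after cut elimination: the lemma's body is
    # inlined at every use site, so the proof is larger but
    # self-contained.
    return (x + x) + ((x + 1) + (x + 1))

assert with_cut(5) == cut_free(5)  # same theorem, different proofs
```

The cut-free version proves nothing new, but notice how it duplicates work - which is exactly why eliminating cuts can blow proofs up.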

The point of cut elimination is basically that we want this metatheorem to hold for any logical system if we'll really consider it a logic. Just like the deduction theorem (that A,B|=C is a valid sequent iff A|=(B->C) is). Since cut is unidirectional, it makes sense to phrase it as a rule that we then prove is "admissible" from the others, while the deduction theorem goes two ways, and the converse direction just _is_ the only rule of inference in many systems.

Another point about cut elimination (assuming that lack of sleep isn't messing me up right now) is that cut can be eliminated in proofs of sequents just using the rules of inference, but I think it can't be eliminated in some cases where extra non-logical axioms are assumed.

As for the redundancy of ZF (pairing is redundant, and many phrasings of replacement make separation redundant as well - some phrasings also include the existence of the empty set, which is redundant given infinity and separation), there's another reason for that. The system Z, which consists of all the axioms except replacement, is quite natural, as is the system ZF-Inf, which has all the axioms except infinity. In each of these theories, some of the previously redundant axioms become necessary. And in many particular consistency proofs, we construct a structure and then show it satisfies ZF plus whatever else we want. However, we often need the whole machinery of Z to prove that the structure satisfies replacement, so it makes sense to prove that the axioms hold one by one, even though at the end we won't need some of the earlier ones.

One point that I haven't quite understood, though, is why all the axioms seem to have biconditionals built in, where conditionals would be sufficient given separation. (That is, rather than phrasing the union axiom as saying that the union of a set of sets exists, we could phrase it as saying that there is a set having every member of the family as a subset, and then use separation to cut the union itself out of this big set.)
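To spell the union example out (my formalisation, just to fix notation):

```latex
% Usual (biconditional) union axiom: \bigcup F exists exactly.
\forall F\,\exists U\,\forall x\,
  \bigl(x \in U \leftrightarrow \exists y\,(y \in F \wedge x \in y)\bigr)
% Conditional form: merely some superset of the union exists.
\forall F\,\exists V\,\forall y\,(y \in F \rightarrow y \subseteq V)
% Separation then carves the union out of V:
U = \{\,x \in V : \exists y\,(y \in F \wedge x \in y)\,\}
```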

I think there's some disagreement about the biconditionals (well, not actual disagreement in the sense that people have arguments about it, but different people do different things). Certainly the first set theory course I took that introduced the ZF axioms only had conditionals there (but was immediately followed by the comment that the biconditionals are an easy consequence of this and separation).

I'd tend to use biconditionals for the sake of convenience - if the two axioms are so obviously equivalent, it probably doesn't matter much which you choose.

I'm not a proof theorist, but I'm a programming languages guy, so I can fake it pretty well.

Anyway, the motivation is this: proof theorists don't really care about theorems! Instead, they care about proofs considered as mathematical objects. Inference rules that are "redundant" from the point of view of (say) a model theorist, who only cares about truth or falsity, can be tremendously interesting from a proof-theoretic point of view, because rules of inference that prove exactly the same theorems can give rise to radically different sets of proofs.

You can convert any theorem in propositional classical logic into something that only uses the single connective NAND (aka the Sheffer stroke), but to do so is perverse -- no one would willingly reason with classical logic that way, and the reason is that you destroy all the symmetry and combinatorial structure of the logic when you fuse all the connectives into a single monster.
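Just to illustrate (a throwaway sketch, nothing from any particular text): every connective collapses into NAND, and you can check the fused definitions against the usual ones by brute force over all truth assignments.

```python
# The Sheffer stroke: the single "monster" connective.
def nand(p, q):
    return not (p and q)

# Every classical connective fused into NAND.
def NOT(p):        return nand(p, p)
def AND(p, q):     return nand(nand(p, q), nand(p, q))
def OR(p, q):      return nand(nand(p, p), nand(q, q))
def IMPLIES(p, q): return nand(p, nand(q, q))

# Verify against the usual connectives over all truth assignments.
for p in (False, True):
    for q in (False, True):
        assert NOT(p) == (not p)
        assert AND(p, q) == (p and q)
        assert OR(p, q) == (p or q)
        assert IMPLIES(p, q) == ((not p) or q)
```

Perfectly adequate for truth values, and perfectly useless as something to actually reason in - the structure of the logic has been destroyed.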

Anyway, back to cut elimination. Cut elimination is interesting for several reasons. First, the proof of cut elimination gives you an estimate of the logical strength of the cut principle. You can look at the cut elimination procedure and see that it can expand a cut-full proof into a cut-free one that is possibly hyper-exponentially larger.

Secondly, the cut rule has the strange property that it includes a formula in its premises that doesn't appear in the conclusion. That is, in the cut rule

If Gamma ==> A and Gamma, A ==> B are derivable, then Gamma ==> B is derivable.

the formula A doesn't appear in the consequent. All of the other rules do have this "subformula property" - every formula in a premise appears as a subformula of the conclusion (even the rules for forall and exists, once you generalize "subformula" appropriately) - and having the subformula property gives you a very useful structure to do induction on when proving things about logics. So if you prove that cut is eliminable, then you can frequently "port" properties from the cut-free variant of the logic, in which the proof is easy, to the cut-full version, where it is harder.

The computational version of this is that cut-free logics are vastly easier to do proof search in. With cut, you have to guess which A to use when applying the cut rule, and this creates an infinite amount of nondeterminism in your proof search.
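Here's a toy illustration of that point (my own sketch, not a serious prover): backward-chaining search in a cut-free classical propositional sequent calculus. Because every rule's premises are built only from subformulas of its conclusion, blind backward search terminates; adding cut would force the search to conjure an arbitrary cut formula A out of thin air.

```python
# Formulas are nested tuples: ('atom', 'p'), ('and', A, B),
# ('or', A, B), ('imp', A, B).  A sequent is Gamma ==> Delta.
def provable(gamma, delta):
    # Decompose the first non-atomic formula on the left.
    for i, f in enumerate(gamma):
        rest = gamma[:i] + gamma[i+1:]
        if f[0] == 'and':
            return provable(rest + [f[1], f[2]], delta)
        if f[0] == 'or':
            return (provable(rest + [f[1]], delta) and
                    provable(rest + [f[2]], delta))
        if f[0] == 'imp':
            return (provable(rest, delta + [f[1]]) and
                    provable(rest + [f[2]], delta))
    # Then the first non-atomic formula on the right.
    for i, f in enumerate(delta):
        rest = delta[:i] + delta[i+1:]
        if f[0] == 'and':
            return (provable(gamma, rest + [f[1]]) and
                    provable(gamma, rest + [f[2]]))
        if f[0] == 'or':
            return provable(gamma, rest + [f[1], f[2]])
        if f[0] == 'imp':
            return provable(gamma + [f[1]], rest + [f[2]])
    # Only atoms left: it's an axiom iff the two sides share one.
    return bool(set(gamma) & set(delta))

p, q = ('atom', 'p'), ('atom', 'q')
assert provable([], [('imp', p, p)])             # |- p -> p
assert provable([('and', p, q)], [q])            # p & q |- q
assert provable([], [('or', p, ('imp', p, q))])  # classical only
assert not provable([p], [q])
```

Every recursive call works on strictly smaller formulas, which is exactly the induction that the subformula property licenses - and exactly what the cut rule would break.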

Also, seeing which features destroy cut-eliminability is itself interesting. For example, adding an induction axiom to first-order logic will cause cut-elimination to fail. When you do a proof that requires strengthening an induction hypothesis, you see this phenomenon in action. This is also why writing theorem provers that can automatically find inductive proofs is so hard.

So, I guess I'll ask a question in turn: what is the appeal of Hilbert-style axiomatic presentations of logic and ZF-style set theory? I'm afraid I don't get it; type-theoretic presentations make vastly more sense to me, since the combinatorial properties of proofs are much more apparent.

Actually, I think your question almost answers itself: the reason axiomatic Hilbert-style methods are preferred to type-theoretic ones is that mathematicians don't like thinking about their proofs in a combinatorial manner!

Slightly more seriously, I think it's because this is the natural way of looking at things if you're used to doing mathematics in the classical manner. You have a bunch of assumptions (axioms) and from them you want to prove a conclusion (theorem). The method of going from left to right is not nearly as important as the fact that you got there (as long as the method in question is valid).

One reason that you might want to prove that cut is redundant, even while you use it, is that proving cut elimination can be an economical means of proving consistency.

Since we can't have a cut-free proof of falsity, we merely have to prove that cut is redundant and we get consistency for free.

One curious thing about proofs without cut is that they are so much larger, yet they are easier to find in an automated fashion. This suggests that it is easier to verify a cut-full proof, yet easier to find a cut-free one. Presumably cut introduction would be a means of optimising functional programs; I guess this just corresponds to finding good lemmas, or perhaps refactoring.

I think you should start a reading club. I'm new to logic and category theory, and I would find it extremely helpful.