Many well-known theorems have lots of "different" proofs. Often a new proof of a theorem arises, surprisingly, from a branch of mathematics other than that of the theorem itself.

When are two proofs really the same proof? What I mean is this. Suppose two different proofs of a theorem are first presented formally and then expanded out so that the formal proofs are presented starting from first principles, that is, starting from the axioms. Then in some sense the two proofs are the same if there are trivial operations on the sequence of steps of the first formal proof that transform it into the second formal proof. (I'm not sure what I mean by "trivial".)

16 Answers

You've hit on an area of research that's picking up some momentum at the moment. It involves connections between proof theory, homotopy theory and higher categories. The idea is that a proof or deduction is something like a path (from the premiss to the conclusion), and when you "deform" one proof into another by a sequence of trivial steps, that's something like a homotopy between paths. Or, in the language of higher-dimensional categories, a deduction is a 1-morphism, and a deformation of deductions is a 2-morphism. You can keep going to higher deductions.

There are close connections with type theory too. If you have the right kind of background, the following papers might be helpful:

Has anyone proposed a specific proof theory (even a toy model) where the "proof spaces" can be seen not to be homotopy discrete?
– Reid Barton, Nov 3 '09 at 4:36

Presumably if you take intuitionistic propositional calculus then you get such a model. The proof terms correspond to the simply-typed $\lambda$-calculus. A $\beta$-reduction from one proof to another is a path between proofs, or a 2-cell in a suitable 2-categorical sense. I think it was Robert Seely who wrote about this originally.
– Andrej Bauer, Dec 4 '10 at 12:11

This of course is a deep question in the philosophy of mathematics. The program mentioned by Tom Leinster is certainly a very interesting contribution to this, but if it proceeds at a purely mathematical level then at most it can define an equivalence relation on the class of proofs. There's still a further question whether this equivalence relation really is "the right one" to capture the notion of "same" or "different" proofs.

Also, note that there's an open question as to whether mathematical proofs really are the sort of thing studied by proof theorists. Certainly the sort of thing that is published in a math journal is not the sort of thing that is studied by proof theorists. To cite the most obvious differences, the former have words of English in them (or French or Japanese or Russian or some other language) while the latter don't. But for more significant differences, note that the former also cite well-known results from the literature, and skip steps that are sufficiently obvious to the reader, while the latter don't.

You can avoid this problem by assuming that published proofs are converted into formal proofs by means of spelling out all the steps in the proof of the well-known theorem, or the obvious fact. But this might not preserve the notion of "same proof".

For instance, consider a theorem that in some sense only has one proof, which happens to rely essentially on quadratic reciprocity. Do we really want to say that this theorem actually has just as many distinct proofs as quadratic reciprocity does?

There are lots of interesting questions here about the relation of proof theory to actual proofs, and what light it can shed on this intuitive notion of sameness of proof. And of course, there is probably also light to be shed in the other direction too, as our technical mathematical results in proof theory and category theory absorb results from the intuitive ideas we have about proof sameness.

Right. For example, one intuition I have about proofs is that they have "tangent vectors" associated to them: different proofs of quadratic reciprocity, for example, point in different directions for further generalization. But it's unclear how one would go about formalizing this notion.
– Qiaochu Yuan, Nov 6 '09 at 15:57

This doesn't affect your main point, but it's not true of the HoTT approach that ‘at a purely mathematical level then at most it can define an equivalence relation on the class of proofs’. It makes the class of proofs into a space with a homotopy type, and each equivalence class (as a connected space) may still have interesting structure. But you are correct that it remains to be seen if ‘this equivalence relation really is "the right one"’.
– Toby Bartels, Jan 12 '13 at 0:08

The HoTT approach should also be able to handle this: ‘For instance, consider a theorem that in some sense only has one proof, which happens to rely essentially on quadratic reciprocity. Do we really want to say that this theorem actually has just as many distinct proofs as quadratic reciprocity does?’. It is simply relative homotopy. But this may be an example of ‘our technical mathematical results in proof theory and category theory absorb results from the intuitive ideas we have about proof sameness’ if it might not otherwise occur to HoTT people to consider that!
– Toby Bartels, Jan 12 '13 at 0:11

It is indeed an open task of proof theory to give a good formal definition of when two proofs should be considered equivalent.

A usual thing is to consider a category with formulas as objects and equivalence classes of proofs as morphisms, where two proofs are considered equivalent if they have the same normal form (in many logics every proof can be brought into a unique normal form, i.e. a chain of deductions of which the first half are e.g. elimination rules and the second half introduction rules). Moreover, this transformation of a proof into normal form can often be done algorithmically and is then described by a rewriting system. This provides the link between syntactic proof theory and homotopy theory mentioned by Tom Leinster; it can be made very plausible via rewriting systems, see e.g. Y. Lafont's homepage or the corresponding sections of P.-A. Mellies' homepage. Also check out the "Categorical Semantics of Linear Logic" paper on Mellies' page - there he considers invariants of proofs, each of which should yield a notion of equivalence!

However all of these are syntactical notions of equivalence and, as Terry Tao mentions in his comment at Gowers' blog (see the link in Justin's answer), there is also a semantic notion of equivalence saying that two proofs are equivalent if they have the same degree of generalizability. And while the syntactic notions of equivalence capture quite well the formal operations by which one can relate different proofs, the real challenge is (imho) to give a formal definition of semantic equivalence and recognize it syntactically!

The earliest published attempts I know of are two articles by Lambek, this and J. Lambek, Deductive systems and categories II, in: Lecture Notes in Mathematics 86 (Springer, Berlin, 1969), especially the second where, if I remember well, he does in fact try to give a syntactic characterization of semantic equivalence.

My opinion, and it's only an opinion, is that it would be very difficult to formalise what it means for two proofs to be different. Here's an intuitive reason why. If I give you two proofs of theorem X, and both proofs are exactly the same, except that one proof has a couple of extra lines in the middle which prove an intermediate result of no relevance, then surely these two proofs would be "the same". So surely any sort of "sameness" equivalence relation that one is trying to formally set up on the set of proofs of a statement would have to allow for deleting or adding lines to a proof that aren't used. But now there's perhaps a problem, because proof A and proof B of theorem X might both be "the same" as proof C, where proof C is the disjoint union of proofs A and B.
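The worry in the last few sentences can be made concrete with a toy model. The sketch below is only an illustration under assumptions that are not in the answer: a proof is a hypothetical dictionary mapping each step name to the list of earlier steps it cites, and `prune` is a made-up helper that deletes steps the conclusion never uses.

```python
def prune(steps, conclusion):
    """Keep only the steps that the conclusion depends on, directly or indirectly."""
    needed, stack = set(), [conclusion]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(steps[name])
    return {name: deps for name, deps in steps.items() if name in needed}

# Hypothetical proofs of X: each step lists the steps it cites.
proof_a = {"A": [], "X": ["A"]}                 # argument A, then conclude X
proof_b = {"B": [], "X": ["B"]}                 # argument B, then conclude X
proof_c_via_a = {"A": [], "B": [], "X": ["A"]}  # both arguments, X cites A
proof_c_via_b = {"A": [], "B": [], "X": ["B"]}  # both arguments, X cites B
```

In this model `prune(proof_c_via_a, "X")` gives back `proof_a` and `prune(proof_c_via_b, "X")` gives back `proof_b`. So deleting unused lines alone does not identify A with B; that would also require treating the change of which argument the last step cites as a trivial move.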

On the other hand it's manifestly clear that sometimes two proofs of a fact are "different" on an intuitive level. For example I remember doing the exercise as an undergraduate that the map SL(2,Z)-->SL(2,Z/nZ) is surjective, but I used the fact that there are infinitely many primes in an arithmetic progression. A couple of days later I found a proof that didn't use this and was entirely elementary. Clearly the proofs were "different". All I'm saying is that although this is in some sense obvious, it might be tough to formalise.

Perhaps this issue could be explained using homotopy theory (as Tom mentioned). Adding an irrelevant couple of lines certainly doesn't change the homotopy type of a path, and it's possible that the disjoint union of two sufficient proofs could be interpreted as the concatenation of their paths. (I'll readily admit that I wasn't able to make much headway in the papers Tom linked, so I have no idea if this is actually how things are done...)
– Aaron Mazel-Gee, Nov 3 '09 at 2:49

I follow you up to the point where you transform "(1) argument A; (2) argument B; (3) conclude X from 1" to "(1) argument A; (2) argument B; (3) conclude X from 2". I can imagine that not being allowed as a "proof homotopy". But certainly you'd have to be careful to ensure that that was the case.
– Reid Barton, Nov 3 '09 at 4:33

My understanding is that the difficulty is well-known, and the standard solution has been to switch from classical logic to intuitionistic. The references are in my answer in this thread, mathoverflow.net/questions/3776/…
– Sergey Melikhov, Dec 5 '10 at 2:53

Thanks for referring to this. Since people who look only at the title and abstract wouldn't see anything about proofs, it may be worthwhile to refer specifically to Section 7.2 (starting at the bottom of page 20 of the version you linked to), which discusses equality of proofs.
– Andreas Blass, Dec 4 '10 at 19:43

Thank you for the interesting blog. I was thinking of "sameness" in a more formal, proof-theoretic manner.
– Martyguy, Nov 2 '09 at 10:41

There is some discussion in the comments of that blog post about the formal version of this question. Kenny's comment (near the beginning) is especially illuminating, concerning technical difficulties.
– S. Carnahan♦, Nov 3 '09 at 20:37

Some other answers have alluded to this, but just to spell it out explicitly: The Curry-Howard isomorphism, in one of its simpler forms, says that objects of the free cartesian closed category CCC[S] on a set S of objects correspond to statements of the multiplicative fragment of intuitionistic logic (things we can build from /\ and ⇒) with free variables from S, and there is at least one morphism P → Q in CCC[S] iff P ⇒ Q is a theorem. Thus we can regard a morphism P → Q as a "proof" of P ⇒ Q. There may be several morphisms from P to Q; for instance if A ∈ S and P = A × A, Q = A, then there are exactly two morphisms from P to Q (projection to the first or second factor), which we can regard as two different proofs of the theorem (A /\ A) ⇒ A.

Probably the easiest way to see what the different proofs are in this system is to use the third part of the Curry-Howard isomorphism: morphisms P → Q in CCC[S] correspond to functions in the simply typed lambda calculus of type P ⇒ Q, where × in CCC[S] is interpreted as the product of types and the internal Hom as a function type. For instance there are two functions of type (A * A) → A, namely λ(a, b). a and λ(a, b). b. A more interesting example: the theorem (A ⇒ A) ⇒ (A ⇒ A) has one proof for every natural number, corresponding to λ f. λ x. f (f (... (f x)...)). See This Week's Finds, week 240, for more along these lines.
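As a rough illustration of the two examples above, here is a minimal sketch in Python. Python is untyped, so the types (A * A) → A and (A ⇒ A) ⇒ (A ⇒ A) are only intended readings, and the names `proj_first`, `proj_second`, `church` and `count_applications` are made up for this sketch.

```python
# Proofs as programs: a term of type (A * A) -> A is read as a proof of (A /\ A) => A.

def proj_first(pair):
    """The term  λ(a, b). a"""
    a, b = pair
    return a

def proj_second(pair):
    """The term  λ(a, b). b"""
    a, b = pair
    return b

def church(n):
    """The term  λf. λx. f (f (... (f x)...))  with n applications of f."""
    def numeral(f):
        def apply_n(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply_n
    return numeral

def count_applications(proof):
    """Tell the proofs apart by instantiating A = int, f = successor, x = 0."""
    return proof(lambda k: k + 1)(0)
```

The two projections are told apart by feeding them a pair with distinct components, and `count_applications(church(n))` recovers `n`, so different natural numbers really do give different terms.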

Not that I understand much of these - certainly not enough to see a down-to-earth example of two explicit proofs of some elementary statement that are not homotopic, as detected by some invariant. Is anyone able to provide such an example?

Then, there are two very basic things to note. First, the question itself dates back to Hilbert's (would-be) 24th Problem, which R. Thiele discovered in Hilbert's notebooks a century later (translation and remarks by Thiele, boldface mine):

The 24th problem in my Paris lecture was to be: Criteria of simplicity, or proof of the greatest simplicity of certain proofs. Develop a theory of the method of proof in mathematics in general. Under a given set of conditions there can be but one simplest proof. Quite generally, if there are two proofs for a theorem, you must keep going until you have derived each from the other, or until it becomes quite evident what variant conditions (and aids) have been used in the two proofs. Given two routes, it is not right to take either of these two or to look for a third; it is necessary to investigate the area lying between the two routes. Attempts at judging the simplicity of a proof are in my examination of syzygies and syzygies between syzygies. The use or the knowledge of a syzygy simplifies in an essential way a proof that a certain identity is true. Because any process of addition [is] an application of the commutative law of addition etc. [and because] this always corresponds to geometric theorems or logical conclusions, one can count these [processes], and, for instance, in proving certain theorems of elementary geometry (the Pythagoras theorem, [theorems] on remarkable points of triangles), one can very well decide which of the proofs is the simplest. [Part of the last sentence is not only barely legible in Hilbert’s notebook but also grammatically incorrect. Corrections and insertions that Hilbert made in this entry show that he wrote down the problem in haste.]

Second, there is a good reason why the question has been traditionally treated in an intuitionistic setup, from Seely to Awodey. (Note that intuitionistic proofs are perhaps less scary if thought of as computer programs, via the Curry-Howard correspondence.) The reason is that in classical logic, with a standard formalization of the notion of "proof", any two proofs of the same statement must be equivalent under every reasonable notion of equivalence. The idea is in Kevin Buzzard's answer. For a rigorous explanation see Yves Lafont's Appendix B in Girard's Proofs and Types. (The standard sequent calculus notation used in that appendix is introduced at the very beginning of the book.) It looks like Alessio Guglielmi has some way of overcoming this difficulty by using a non-standard proof-theoretic setup which I wish I understood better.

There is a very nice paper of Wagon, which can serve as a sort of case study. The paper presents fourteen different proofs of the following theorem.

Theorem.
If a rectangle $R$ is tiled by rectangles, each of which has at least one integer side, then $R$ itself has at least one integer side.

If you have not thought about the problem, you may want to think about it before reading the paper. At first glance some of the proofs certainly appear different. For example, there is a proof using a complex double integral, and another which uses Sperner's lemma.

In fact, all fourteen proofs are shown to be different by considering generalizations of the problem. It turns out that no two of the fourteen proofs work for the same set of generalizations. I do not know if this can be formalized in general.

The paper contains an amusing Appendix titled Appendix to justify that the proofs are different, listing the generalizations that each proof works for.

You can express any proof as a typed lambda term, viewing the theorem as a type. This term can be normalized. I would say that if two of these proof terms have the same normal form, then they name the same proof.
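To make this criterion concrete, here is a minimal sketch of a β-normalizer for untyped lambda terms with de Bruijn indices. The term encoding and the names `shift`, `subst`, `normalize` and `same_proof` are assumptions of this sketch, not a standard library. It only performs β-reduction (no η), and termination is only guaranteed when the terms really are simply typed, which the code does not check.

```python
# Terms use de Bruijn indices: ("var", i), ("lam", body), ("app", fun, arg).

def shift(term, d, cutoff=0):
    """Add d to every variable index that is >= cutoff (the free variables)."""
    tag = term[0]
    if tag == "var":
        return ("var", term[1] + d) if term[1] >= cutoff else term
    if tag == "lam":
        return ("lam", shift(term[1], d, cutoff + 1))
    return ("app", shift(term[1], d, cutoff), shift(term[2], d, cutoff))

def subst(term, j, s):
    """Replace variable j in term by s, adjusting s when passing under a binder."""
    tag = term[0]
    if tag == "var":
        return s if term[1] == j else term
    if tag == "lam":
        return ("lam", subst(term[1], j + 1, shift(s, 1)))
    return ("app", subst(term[1], j, s), subst(term[2], j, s))

def normalize(term):
    """Compute the beta-normal form (assumes the term is strongly normalizing)."""
    tag = term[0]
    if tag == "var":
        return term
    if tag == "lam":
        return ("lam", normalize(term[1]))
    fun = normalize(term[1])
    if fun[0] == "lam":
        # Beta step: (λ. body) arg  ->  body with variable 0 replaced by arg.
        reduct = shift(subst(fun[1], 0, shift(term[2], 1)), -1)
        return normalize(reduct)
    return ("app", fun, normalize(term[2]))

def same_proof(p, q):
    """The proposed criterion: two proof terms are the same iff their normal forms agree."""
    return normalize(p) == normalize(q)

identity = ("lam", ("var", 0))                   # λx. x, a proof of A => A
detour = ("app", ("lam", ("var", 0)), identity)  # (λg. g) (λx. x), same proof with a detour
first = ("lam", ("lam", ("var", 1)))             # λa. λb. a, a proof of A => (A => A)
second = ("lam", ("lam", ("var", 0)))            # λa. λb. b, another proof of A => (A => A)
```

Under this criterion `same_proof(identity, detour)` holds, because the detour reduces away, while `same_proof(first, second)` fails, because both terms are already normal and differ.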

In combinatorics it is often useful to find bijections between two combinatorial structures one is studying. An example is a bijection between 321-avoiding permutations and 132-avoiding permutations. A lot of different bijections have been shown to exist, and the paper Classification of bijections between 321- and 132-avoiding permutations by Claesson and Kitaev shows that some of these are related by "trivial" bijections. Maybe this is a very special case of what Tom Leinster mentions in his answer, about one proof (a bijection in this case) being deformed into another by a sequence of trivial steps (trivial bijections in this case).

The other posters have rightly pointed out that the proof identity problem can be approached from various directions. If you're interested in proof theory and are willing to delve into natural deduction and category theory, you might be interested in two proposals for addressing the proof identity problem: the Normalization Conjecture and the Generality Conjecture. See Došen's "Identity of proofs based on normalization and generality" for a nice introduction to these two ways of viewing the proof identity problem.

If we have two proofs of the same theorem such that the proofs have different normal forms, can we modify the set of axioms so that there is only one normal form proof of the theorem, yet the universe of theorems remains unchanged from the original set of axioms?

More generally, can we select a set of axioms that minimizes the number of normal form proofs for each and every theorem in the original axiom set?

Taking this train of thought to the limit, for any axiom system, does there exist another axiom system with the same universe of theorems but which admits only one normal form proof of each theorem? Such a system of axioms could be called a "tight" set of axioms for a given universe of theorems.

How can you tell if a proof really relies on CFSG in an essential way?
– JBL, Dec 5 '10 at 5:40

Jonathan, you feel that when you look at a proof, you understand immediately whether it uses CFSG or not. Many people feel when they look at a proof that they understand whether it uses diagonalization (or whatever), but it turns out that it's very hard to actually make precise what it means for a proof to rely on one step in it. I think the question of identifying when a proof relies on CFSG is probably not much easier than identifying when any proof relies on any particular step, which is one possible reformulation of the question asked.
– JBL, Dec 5 '10 at 15:03

This is completely missing the point. The problem is to define a notion of proof equivalence that adequately formalizes the notion that the proofs differ in inessential details or presentation. Your definition has the opposite property that any two proofs can be made equivalent by an inessential modification (e.g., throwing in axioms of a sufficiently strong theory).
– Emil Jeřábek, Sep 28 '14 at 9:53

Let me repeat THE QUESTION: When are two proofs of the same theorem really different proofs?
– Włodzimierz Holsztyński, Sep 28 '14 at 16:06

Let me make it even simpler. Assume we formalize mathematics in a theory $T$ axiomatized by a single axiom, e.g., GBC. Then any nontrivial proof must at some point invoke this axiom, hence by your definition, all nontrivial proofs are equivalent (or rather, no proof is more general than another). And conversely, consider proofs in, say, ZFC. Take an arbitrary proof $P$, choose an axiom of ZFC not implied by those occurring in $P$, and let $Q$ be the proof that starts with that axiom, and then continues the same as $P$. According to your definition, $Q$ is essentially more general than $P$, ...
– Emil Jeřábek, Sep 28 '14 at 18:47

... even though they are really the same proof. In short, circumstantial syntactic properties of this sort are completely useless as a criterion for being essentially different.
– Emil Jeřábek, Sep 28 '14 at 18:50

If you think so, feel free to express yourself better; I was just following what you wrote in the answer. However, rest assured that the question is way too subtle to be solvable by a simple-minded two-paragraph formal definition.
– Emil Jeřábek, Sep 28 '14 at 19:49