I do not know exactly how to characterize the class of proofs that interests me, so let me give some examples and say why I would be interested in more. Perhaps what the examples have in common is that a powerful and unexpected technique is introduced that comes to seem very natural once you are used to it.

Example 1. Euler's proof that there are infinitely many primes.

If you haven't seen anything like it before, the idea that you could use analysis to prove that there are infinitely many primes is completely unexpected. Once you've seen how it works, that's a different matter, and you are ready to contemplate trying to do all sorts of other things by developing the method.
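For reference, the analytic core of Euler's argument can be compressed into a few lines (the standard presentation, not a quotation from any particular source):

```latex
% If p_1, ..., p_k were all the primes, then unique factorization and
% the geometric series would give
\prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right)^{-1}
  \;=\; \prod_{i=1}^{k} \sum_{j \ge 0} \frac{1}{p_i^{\,j}}
  \;=\; \sum_{n \ge 1} \frac{1}{n} \;=\; \infty,
% a contradiction, since the left-hand side is a finite product of
% finite factors. The same identity with exponents s > 1 is the
% Euler product for \zeta(s), the starting point of Example 2.
```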

Example 2. The use of complex analysis to establish the prime number theorem.

Even when you've seen Euler's argument, it still takes a leap to look at the complex numbers. (I'm not saying it can't be made to seem natural: with the help of Fourier analysis it can. Nevertheless, it is a good example of the introduction of a whole new way of thinking about certain questions.)

Example 3. Variational methods.

You can pick your favourite problem here: one good one is determining the shape of a heavy chain in equilibrium.
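As a reminder of how the variational method dispatches this problem (a standard textbook calculation, sketched here for convenience): the chain minimizes potential energy at fixed length, and because the integrand has no explicit $x$-dependence, the Beltrami identity applies.

```latex
% Minimize \int y \sqrt{1 + y'^2}\, dx subject to \int \sqrt{1 + y'^2}\, dx = L.
% With a Lagrange multiplier \lambda, set F = (y - \lambda)\sqrt{1 + y'^2}.
% F has no explicit x-dependence, so the Beltrami identity
% F - y' \, \partial F/\partial y' = c gives
\frac{y - \lambda}{\sqrt{1 + y'^2}} = c
\quad\Longrightarrow\quad
y = \lambda + c \cosh\!\left(\frac{x - x_0}{c}\right),
% i.e. the catenary.
```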

Example 4. Erdős's lower bound for Ramsey numbers.

One of the very first results in probabilistic combinatorics (Shannon's bound for the size of a separated subset of the discrete cube being another very early one).
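The counting at the heart of Erdős's argument is short enough to check mechanically. The sketch below (function name mine) verifies the union bound that drives the proof: with $n = \lfloor 2^{k/2} \rfloor$, the expected number of monochromatic $K_k$'s in a uniformly random 2-colouring of $K_n$ is $\binom{n}{k}2^{1-\binom{k}{2}} < 1$, so some colouring has none, hence $R(k,k) > 2^{k/2}$.

```python
from math import comb

def erdos_bound_holds(k):
    """Check, in exact integer arithmetic, that the expected number of
    monochromatic K_k's in a random 2-colouring of K_n is < 1, where
    n = floor(2^(k/2)).  Each of the C(n, k) potential cliques is
    monochromatic with probability 2^(1 - C(k, 2)), so the condition
    C(n, k) * 2^(1 - C(k, 2)) < 1 becomes the comparison below."""
    n = int(2 ** (k / 2))
    return comb(n, k) * 2 < 2 ** comb(k, 2)

# The bound holds for every k >= 3, giving R(k,k) > 2^(k/2).
assert all(erdos_bound_holds(k) for k in range(3, 60))
```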

Example 5. Roth's proof that a dense set of integers contains an arithmetic progression of length 3.

Historically this was by no means the first use of Fourier analysis in number theory. But it was the first application of Fourier analysis to number theory that I personally properly understood, and that completely changed my outlook on mathematics. So I count it as an example (because there exists a plausible fictional history of mathematics where it was the first use of Fourier analysis in number theory).

Example 6. Use of homotopy/homology to prove fixed-point theorems.

Once again, if you mount a direct attack on, say, the Brouwer fixed point theorem, you probably won't invent homology or homotopy (though you might do if you then spent a long time reflecting on your proof).

The reason these proofs interest me is that they are the kinds of arguments where it is tempting to say that human intelligence was necessary for them to have been discovered. It would probably be possible in principle, if technically difficult, to teach a computer how to apply standard techniques, the familiar argument goes, but it takes a human to invent those techniques in the first place.

Now I don't buy that argument. I think that it is possible in principle, though technically difficult, for a computer to come up with radically new techniques. Indeed, I think I can give reasonably good Just So Stories for some of the examples above. So I'm looking for more examples. The best examples would be ones where a technique just seems to spring from nowhere -- ones where you're tempted to say, "A computer could never have come up with that."

Edit: I agree with the first two comments below, and was slightly worried about that when I posted the question. Let me have a go at it though. The difficulty with, say, proving Fermat's last theorem was of course partly that a new insight was needed. But that wasn't the only difficulty at all. Indeed, in that case a succession of new insights was needed, and not just that but a knowledge of all the different already existing ingredients that had to be put together.

So I suppose what I'm after is problems where essentially the only difficulty is the need for the clever and unexpected idea. I.e., I'm looking for problems that are very good challenge problems for working out how a computer might do mathematics. In particular, I want the main difficulty to be fundamental (coming up with a new idea) and not technical (having to know a lot, having to do difficult but not radically new calculations, etc.).

Also, it's not quite fair to say that the solution of an arbitrary hard problem fits the bill. For example, my impression (which could be wrong, but that doesn't affect the general point I'm making) is that the recent breakthrough by Nets Katz and Larry Guth in which they solved the Erdős distinct distances problem was a very clever realization that techniques that were already out there could be combined to solve the problem. One could imagine a computer finding the proof by being patient enough to look at lots of different combinations of techniques until it found one that worked. Now their realization itself was amazing and probably opens up new possibilities, but there is a sense in which their breakthrough was not a good example of what I am asking for.

While I'm at it, here's another attempt to make the question more precise. Many many new proofs are variants of old proofs. These variants are often hard to come by, but at least one starts out with the feeling that there is something out there that's worth searching for. So that doesn't really constitute an entirely new way of thinking. (An example close to my heart: the Polymath proof of the density Hales-Jewett theorem was a bit like that. It was a new and surprising argument, but one could see exactly how it was found since it was modelled on a proof of a related theorem. So that is a counterexample to Kevin's assertion that any solution of a hard problem fits the bill.) I am looking for proofs that seem to come out of nowhere and seem not to be modelled on anything.

Further edit. I'm not so keen on random massive breakthroughs. So perhaps I should narrow it down further -- to proofs that are easy to understand and remember once seen, but seemingly hard to come up with in the first place.

Perhaps you could make the requirements a bit more precise. The most obvious examples that come to mind from number theory are proofs that are ingenious but also very involved, arising from a rather elaborate tradition, like Wiles' proof of Fermat's last theorem, Faltings' proof of the Mordell conjecture, or Ngo's proof of the fundamental lemma. But somehow, I'm guessing that such complicated replies are not what you have in mind.
–
Minhyong KimDec 9 '10 at 15:18

9

Of course, there was apparently a surprising and simple insight involved in the proof of FLT, namely Frey's idea that a solution triple would give rise to a rather exotic elliptic curve. It seems to have been this insight that brought a previously eccentric seeming problem at least potentially within the reach of the powerful and elaborate tradition referred to. So perhaps that was a new way of thinking at least about what ideas were involved in FLT.
–
roy smithDec 9 '10 at 16:21

10

Never mind the application of Fourier analysis to number theory -- how about the invention of Fourier analysis itself, to study the heat equation! More recently, if you count the application of complex analysis to prove the prime number theorem, then you might also count the application of model theory to prove results in arithmetic geometry (e.g. Hrushovski's proof of Mordell-Lang for function fields).
–
D. SavittDec 9 '10 at 16:42

7

I agree that they are difficult, but in a sense what I am looking for is problems that isolate as well as possible whatever it is that humans are supposedly better at than computers. Those big problems are too large and multifaceted to serve that purpose. You could say that I am looking for "first non-trivial examples" rather than just massively hard examples.
–
gowersDec 9 '10 at 18:04

4

It seems to me that this question has been around a long time and is unlikely to garner new answers of high quality. It also seems unlikely that most would even read new answers. Furthermore, nowadays I imagine a question like this would be closed as too broad, and if we close this then we'll discourage questions like it in the future. So I'm voting to close.
–
David WhiteOct 13 '13 at 18:52

Fermat's little theorem is group-theoretic in its nature. I don't see anything surprising in proving it via Lagrange. Application of fixed point methods to differential equations is applying analysis to analysis. It only asks one to realize that differential and integral operators are maps... :-) But I think the 3rd and 4th points of your answer definitely suit the original question.
–
efqDec 9 '10 at 19:00

Or in the original here:
Bernard Bolzano (1817). Purely analytic proof of the theorem that between any two values which give results of opposite sign, there lies at least one real root of the equation. In Abhandlungen der königlichen böhmischen Gesellschaft der Wissenschaften, Vol. V, pp. 225-48.

Not fully rigorous, according to today's standards, but perhaps his method of proof could be considered a breakthrough nonetheless.

More specifically, Bolzano was first to recognize that a completeness property of the real numbers was needed, and he proposed the principle that any bounded set of real numbers has a least upper bound.
–
John StillwellAug 29 '11 at 16:51

I am always impressed by proofs that reach outside the obvious tool-kit.
For example, the proof that the dimensions of the irreducible representations of a finite group divide the order of the group relies on the fact that the character values are algebraic integers.
In particular, given a finite group $G$ and an irreducible character $\chi$ of dimension $n,$
$$\frac{1}{n} \sum_{s \in G} \chi(s^{-1})\chi(s) = \frac{|G|}{n}.$$
However, since $\frac{|G|}{n}$ is an algebraic integer (it is the image of an algebra homomorphism) lying in $\mathbb{Q},$ it in fact lies in $\mathbb{Z}.$

Novikov's proof of the topological invariance of rational Pontryagin classes, for which he was awarded the 1970 Fields Medal. Fundamentally new (complicating a fundamental group to simplify geometry), and also fundamentally important. Here is what Sir Michael Atiyah had to say (as cited in the introduction to Ranicki's Higher Dimensional Knot Theory):

Undoubtedly the most important single result of Novikov, and one which combines in a remarkable degree both algebraic and geometric methods, is his famous proof of the topological invariance of (rational) Pontryagin classes of a differentiable manifold...
As is well-known many topological problems are very much easier if one is dealing with simply-connected spaces. Topologists are very happy when they can get rid of the fundamental group and its algebraic complications. Not so Novikov! Although the theorem above involves only simply-connected spaces, a key step in its proof consists in perversely introducing a fundamental group, rather in the way that (on a much more elementary level) puncturing the plane makes it non-simply-connected. This bold move has the effect of simplifying the geometry at the expense of complicating the algebra, but the complication is just manageable and the trick works beautifully. It is a real master stroke and completely unprecedented.

I was about to go post about the 3-5 switch in the proof of Fermat's Last Theorem when I read this answer. It made me realize both are tricks rather than "proofs which require a fundamentally new way of thinking." Indeed, I think a computer could easily come up with the idea of introducing a new variable to simplify what needs to be proven (Rabinowitsch) or of switching tactics to deal with one case separately (Wiles). I'd go so far as to say computers are much better than humans at this kind of equational reasoning.
–
David WhiteMay 19 '11 at 20:01

I work in automated theorem proving. I certainly agree, in principle, that there are no proofs that are inherently beyond the ability of a computer to find, but I also think that there are fundamental methodological problems in addressing the problem as posed.

The problem is to come up with a solution that would not be regarded as 'cheating', i.e., somehow building the solution into the automated prover to start with. New proof methods can be captured by what are called 'tactics', i.e., programs that guide a prover through a proof. Clearly, it would not be satisfactory to analyse the original proof, extract a tactic from it (even a generic one) that captures the novel proof structure and then demonstrate that the enhanced prover could now 'discover' the novel proof. Rather, we want the prover to invent the new tactic itself, perhaps by some analysis of the conjecture to be proved, and then apply it. So we need an automated prover that learns. But anticipating what kind of tactic we want to be learnt may well influence the design of the learning mechanism. We've now kicked the 'cheating' problem up a level.

Methodologically, what we want is a large class of problems of this form. Some we can use for development of the learning mechanism, and some we can use to test it. Success on a previously unseen test set would demonstrate the generality of the learning mechanism and, hence, the absence of cheating. Unfortunately, these challenges are usually posed as 'can a prover prove this theorem' rather than 'can it solve this wide range of theorems, each requiring a different form of novelty'. Clearly, this latter form of the question is hugely challenging and we're unlikely to see it solved in the foreseeable future.

The Lebesgue integral seems to have been a fundamentally new way of thinking about the integral. It's hard to prove the convergence theorems if you have the Riemann integral in mind. I suppose there are probably many instances where you can give a computer a very ineffective definition of something and ask that it prove theorems. Ask it to prove anything about the primes where you start with the converse of Wilson's theorem as the definition of a prime. Can the computer figure out that its definition is terrible? Can it figure out what a prime really "is"?
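To make the thought experiment concrete, here is what such an "ineffective definition" might look like in practice (a toy illustration of my own, not from the original post): a primality test read straight off the converse of Wilson's theorem, which is correct but absurdly expensive next to trial division.

```python
def is_prime_wilson(n):
    """n > 1 is prime iff (n-1)! is congruent to -1 mod n (Wilson's theorem
    and its converse).  Correct, but needs about n multiplications, versus
    about sqrt(n) divisions for naive trial division."""
    fact = 1
    for i in range(2, n):
        fact = fact * i % n  # reduce mod n at each step to keep numbers small
    return n > 1 and fact == n - 1

assert [n for n in range(2, 30) if is_prime_wilson(n)] == [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The definition is extensionally right, yet a prover working from it has no obvious route to, say, Euclid's proof of the infinitude of primes; that gap is exactly the point of the question above.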

Lovász's proof of cancellation in certain classes of finite structures still bewilders me; I can only imagine that he found the proof first and then came up with the theorem afterwards. The basic idea is to look at homomorphisms between a given structure and a sequence of other structures. A comparison of two such sequences involving structures of the form $A \times C$ and $B \times C$ can be taken to a comparison between $A$ and $B$. The condition that there exists a one-element substructure is used to show a certain nontriviality of the comparison, and a few more details result in showing $A$ is isomorphic to $B$ if (and only if) $A \times C$ is isomorphic to $B \times C$.

I should have asked Lovász how he came up with the proof; I am confident that most people would not be able to come close to the method independently if they were only given the theorem statement. (Not to mention the analogous statement of unique $n$th roots in the same class.)

Though the topological proofs are beautiful and slick, you don't need topology to prove this fact (and the original proofs were completely algebraic; see Lyndon and Schupp's book on combinatorial group theory for nice accounts of Nielsen's original proof as well as the later Reidemeister-Schreier approach).
–
Andy PutmanDec 18 '10 at 6:54

There are two ways to prove the compactness theorem for propositional logic: either using the completeness theorem and going from semantic entailment to syntactic proof, or by a topological argument in Stone spaces. The latter, I feel, is an unexpected way of doing it, but I don't know the history of the subject so I'm probably not qualified to comment on whether it was fundamentally new or not. Certainly in light of Stone's representation theorem, it seems unsurprising that there could be a topological proof of a theorem in logic, and as I understand it this connection is further investigated in topos theory.
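For what it's worth, the topological proof can be summarized in a few lines (my paraphrase of the standard argument via Tychonoff's theorem):

```latex
% Give the set 2^P of all truth assignments (P the set of propositional
% variables) the product topology; by Tychonoff's theorem it is compact.
% For each formula \varphi, the set of assignments satisfying it,
[\varphi] = \{\, v \in 2^{P} : v \models \varphi \,\},
% is clopen, since \varphi mentions only finitely many variables.
% If every finite subset of a theory T is satisfiable, the family
% \{[\varphi] : \varphi \in T\} has the finite intersection property,
% so by compactness \bigcap_{\varphi \in T} [\varphi] \neq \emptyset,
% i.e. T is satisfiable.
```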

The topology on Stone spaces is precisely the Zariski topology on the spectrum of a Boolean ring. It is one of three historical sources of the idea that commutative rings can be thought of as topological spaces, along with various work in algebraic geometry and Gelfand's work on C*-algebras. I have never been very clear on the historical relationship between the three.
–
Qiaochu YuanJan 18 '11 at 16:27

Here are two more candidates for new ways of thinking in proofs, but I am not sure about the historical picture. One is the Brun sieve, which led to new results in number theory. The other is Kummer's method, which has led to proofs of many cases of FLT. (Frey's new way of thinking regarding FLT was already mentioned in Roy Smith's comment.)

I am surprised that no one has mentioned Hilbert's proof of Hilbert's Basis Theorem yet. It says that every ideal in $\mathbb{C}[x_1,\ldots,x_n]$ is finitely generated; the proof is nonconstructive in the sense that it does not give an explicit set of generators of an ideal. When P. Gordan (a leading algebraist at that time) first saw Hilbert's proof, he said, "This is not Mathematics, but theology!"

However, in 1899, Gordan published a simplified proof of Hilbert's theorem and commented with "I have convinced myself that theology also has its advantages."

A. Hermann Weyl's proof of the equidistribution modulo $1$ of the sequence $(n\alpha)$, $\alpha$ irrational. Weyl proves the stronger statement that
$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} f(n\alpha)=\int_0^1 f(x)\,dx$$
for any Riemann integrable (periodic) $f$. (The uniform distribution follows by setting $f=$ the characteristic function of an interval.) To prove the more general statement he observes that the space $X$ of $f$'s satisfying the above equality is a vector space and it is closed with respect to a natural topology. He then observes that $X$ contains all the trigonometric polynomials (trivial computation) and thus $X$ must contain all the functions that can be approximated by trig polynomials. This implies that $X$ contains all the Riemann integrable functions. This is a soft-touch proof, with no brute force computation, using ideas of functional analysis at a time when the ideas of functional analysis were not part of the mathematical arsenal.

B. Forty years later A. Grothendieck gave a beautiful proof of the (then) recently discovered Riemann–Roch–Hirzebruch formula. He formulated a more complicated problem, observed that the more complicated problem has a rich structure encoded in the object he invented, now called the $K$-theory of coherent sheaves, and then used functoriality to show that to prove the most general case it suffices to prove it for two special classes of examples.

Malliavin's proof of Hörmander's theorem is very interesting in the sense that one of the basic ingredients in the language of the proof is a derivative operator, with respect to a Gaussian process, acting on a Hilbert space. The adjoint of the derivative operator is known as the divergence operator, and with these two definitions one can establish the so-called "Malliavin calculus", which has been used to recover classical probabilistic results as well as give new insight into current research in stochastic processes, such as developing a stochastic calculus with respect to fractional Brownian motion. What makes his proof more interesting is that Malliavin was trained in geometry and only used the language of probability in a somewhat marginal sense at times; a lot of his ideas are very geometric in nature, as can be seen for example in his very dense book: P. Malliavin, Stochastic Analysis, Grundlehren der Mathematischen Wissenschaften 313, Springer-Verlag, Berlin, 1997.

How about Goodwillie Calculus? I'm not an expert in this field, but it seems to capture a lot of very deep ideas in stable homotopy theory and in category theory more generally. Here is a stub which includes some of the traditional concepts you can get back from Goodwillie Calculus: http://ncatlab.org/nlab/show/Goodwillie+calculus

Here are some lecture notes which go over the Goodwillie calculus and use it to derive the James splitting of $\Sigma^\infty \Omega \Sigma X$ and the Snaith splittings of $\Sigma^\infty \Omega^n \Sigma^n X$ in a new way (this is an example of the "proof" the question is asking for):
http://noether.uoregon.edu/~sadofsky/gctt/goodwillie.pdf

Finally, I recently saw an amazing talk given by Mark Behrens which used the Goodwillie Calculus to lift differentials in a particular spectral sequence to differentials in the EHP Spectral Sequence, meaning this abstract machinery could also lead to powerful new computational tools. This is discussed in a recent paper:
http://www-math.mit.edu/~mbehrens/papers/GoodEHPmem.pdf

Barwise compactness and $\alpha$-recursion theory. The idea is that many properties of the following notions are captured by thinking of how to define analogs of them in $V_\omega$:

(1) Finite sets are elements of $V_{\omega}$.

(2) Computable sets are the sets $\Delta_1$ definable over $V_{\omega}$.

(3) Computably enumerable sets are the sets $\Sigma_1$ definable over $V_{\omega}$.

(4) First order logic is $L_{\infty, \omega} \cap V_\omega$.

Then, if we replace $V_\omega$ by a different countable admissible set $A$, many of the results relating these classes have analogs. E.g. Barwise compactness, completeness, the existence of an $A$-Turing jump, ...

I would like to propose the theorem of J.H.C. Whitehead that if $X$ is a path connected space, and $Y$ is formed from $X$ by attaching $2$-cells, i.e. $Y=X \cup_{f_i}e^2_i$ for a family of maps $f_i: S^1 \to X$, then the crossed module $\partial: \pi_2(Y,X,x) \to \pi_1(X,x) \;$ is the free crossed module on the characteristic maps of the $2$-cells.

The essential geometric content of the proof uses transversality and knot theory, and was in his paper 1. The definition of crossed module was given in his paper 2. Finally the definition of free crossed module was given in his paper 3, together with an outline of the proof, referring back to paper 1. You can find my own exposition of the proof here. The referee wrote that: "The theorem is not new. The proof is not new. But the paper should be published since these papers of Whitehead are notoriously obscure." I explained the proof once to Terry Wall, and he said it was a good 1960's type proof! What my paper does is repackage Whitehead's proof for a modern audience, and with pictures and consistent notation.

It seems to me pretty good to give the essence of a proof years before you have the right definitions for the theorem!

The notion of crossed module has over recent years become more widespread, partly because of its relation to $2$-groupoids and double groupoids. This is discussed a little in a seminar I gave in Chicago last year. See also the Wikipedia entry and that from the nlab.

I think of the concepts of Archimedes which are at the birth of infinitesimal calculation, such as the definition of the length of a circle (hence the concept of $\pi$), and how to calculate the area of a circle from $\pi$.
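Archimedes' actual computation, squeezing the circle between inscribed and circumscribed polygons, translates directly into a two-term recurrence. The sketch below (names mine) starts from a hexagon and doubles the number of sides each step, tracking the semi-perimeters of the two polygons around the unit circle.

```python
from math import sqrt

def archimedes_pi(doublings=20):
    """Squeeze pi between the semi-perimeters of inscribed (b) and
    circumscribed (a) regular polygons of a unit circle, starting from
    hexagons and doubling the number of sides at each step."""
    a, b = 2 * sqrt(3), 3.0  # circumscribed and inscribed hexagon
    for _ in range(doublings):
        a = 2 * a * b / (a + b)  # circumscribed 2n-gon: harmonic mean
        b = sqrt(a * b)          # inscribed 2n-gon: geometric mean
    return b, a  # pi lies between these two values

# Archimedes stopped at the 96-gon, i.e. four doublings, which already
# pins pi between roughly 3 10/71 and 3 1/7.
lo, hi = archimedes_pi(4)
assert lo < 3.1416 < hi
```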

Applications of low-dimensional geometric topology (the fundamental group) to abstract group theory.

Archimedes' applications of mechanics to geometry (he considered them to be but heuristics, and provided additional "purely geometric" proofs for his results, but it was, as we know now, profound mathematics, something like affine geometry with linear coefficients of points adding to $1$, more or less normalized weights).

Banach's method of applying Baire Theorem (Baire Property).

(I have one more recent example too, which took specialists by surprise, but 5 is a nice number).

Though I am not going to answer in exactly the way required, I believe occasions where new insights helped to lend support to a great unsolved problem are worth noting. For instance, we can consider a case in Random Matrix Theory: the statistical interplay between the distribution of the zeros of the Riemann Zeta function and the eigenvalues of a random Hermitian matrix, which has provided a basis for the Hilbert–Pólya conjecture.

I think it's fair to say that this interplay has led to a lot of conjectures but no proofs; whether random matrix theory can actually be used to prove theorems about L-functions remains to be seen...
–
David HansenDec 12 '10 at 9:28