Mathematics is rife with the fruit of abstraction. Many problems which first are solved via "direct" methods (long and difficult calculations, tricky estimates, and gritty technical theorems) later turn out to follow beautifully from basic properties of simple devices, though it often takes some work to set up the new machinery. I would like to hear about some examples of problems which were originally solved using arduous direct techniques, but were later found to be corollaries of more sophisticated results.

I am not as interested in problems which motivated the development of complex machinery that eventually solved them, such as the Poincare conjecture in dimension five or higher (which motivated the development of surgery theory) or the Weil conjectures (which motivated the development of l-adic and other cohomology theories). I would also prefer results which really did have difficult solutions before the quick proofs were found. Finally, I insist that the proofs really be quick (it should be possible to explain it in a few sentences granting the machinery on which it depends) but certainly not necessarily easy (i.e. it is fine if the machinery is extremely difficult to construct).

In summary, I'm looking for results that everyone thought were really hard but which turned out to be almost trivial (or at least natural) when looked at in the right way. I'll post an answer which gives what I would consider to be an example.

I decided to make this a community wiki, and I think the usual "one example per answer" guideline makes sense here.

31 Answers

Here is my example. In the 1930's (I think), Wiener gave a proof that if $f$ is a continuous nonvanishing function on the circle with absolutely convergent Fourier series, then so is $1/f$. The proof was a long piece of hard analysis, involving detailed local calculations and complicated estimates. Later (in the 1940's?), Gelfand found that the statement follows from the basic theory of Banach algebras as follows. The functions on the circle with absolutely convergent Fourier series can be characterized as the image of the Gelfand transform $\Gamma: l^1(\mathbb{Z}) \to C(S^1)$. In general if $\Gamma: B \to C(M)$ is the Gelfand transform from a commutative Banach algebra to the ring of continuous functions on its maximal ideal space, then $x$ is invertible in $B$ if and only if $\Gamma(x)$ is invertible in $C(M)$. So the hypotheses on $f$ imply that $f = \Gamma(x)$ for some invertible $x$ in $l^1(\mathbb{Z})$, and a simple calculation shows that $1/f = \Gamma(x^{-1})$.

Gelfand's result was first published in 1939 ("On normed rings" announces the basics of the new theory, "To the theory of normed rings II" has this application and more), although a 1941 paper ("Normierte Ringe") which provides more details and proofs seems to be more often cited. These are in the Collected papers, volume 1.
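As a numerical illustration (mine, not part of the historical story): take $f(\theta) = 2 + \cos\theta$, which is nonvanishing with a finite (hence trivially absolutely convergent) Fourier series, and watch Wiener's conclusion hold for $1/f$, whose coefficients decay geometrically; the $\ell^1$ norm in this example works out to exactly $1$.

```python
import numpy as np

# f(θ) = 2 + cos θ is nonvanishing on the circle and its Fourier series
# (c_0 = 2, c_{±1} = 1/2) is trivially absolutely convergent.
N = 1024
theta = 2 * np.pi * np.arange(N) / N
f = 2 + np.cos(theta)

# DFT coefficients of 1/f approximate its Fourier coefficients up to an
# exponentially small aliasing error, since the coefficients of 1/f decay
# geometrically (here c_k = (sqrt(3) - 2)^{|k|} / sqrt(3)).
c = np.fft.fft(1.0 / f) / N

# Wiener's theorem predicts sum_k |c_k| < ∞; in this example the l^1 norm
# comes out to exactly 1.
l1_norm = np.abs(c).sum()
print(round(l1_norm, 6))   # 1.0
```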
– Jonas Meyer, May 17 '10 at 1:01


It's a great example, of course. The Banach algebra in question is $l^1$ with the algebra multiplication being convolution. I haven't looked at this in a while, but when I did look at Wiener's original work, it struck me that he was using convolution in a very modern way -- the whole argument was based on properties of convolution, and the algebraic properties in particular were key. I've occasionally wondered if Gelfand noticed this aspect and if it was in any way an inspiration for what he did.
– Carl Offner, May 30 '10 at 14:59

There is a theorem in finite group theory that if $a$, $b$, and $c$ are integers all greater than $1$, then there exists a finite group $G$ with elements $x$ and $y$ such that:
$x$ has order $a$, $y$ has order $b$, and $xy$ has order $c$. I think the first person to prove this was G.A. Miller, whose proof looked at lots of separate cases and had tons of long, tedious calculations in symmetric groups (I will try to find the paper and post the reference later). I don't know who discovered the more modern proofs, but Derek Holt posted a proof on the group-pub that is one of the most elegant things I've ever seen. Unfortunately, it doesn't seem to be available in the archive of the list, so I will just post it here verbatim:

Let q be a prime power such that q-1 is divisible by 2a, 2b, and 2c.
We will construct elements x,y of SL(2,q) such that x, y, and xy have orders
2a, 2b, and 2c, and then the images of x,y,xy in PSL(2,q) will have orders
a, b, and c as required.

An element of SL(2,q) with distinct eigenvalues is diagonalizable in GL(2,q),
and so its order is determined by its characteristic polynomial which is
determined by its trace. In particular, since 2a,2b,2c > 2, this applies
to elements with these orders.

Let u and v be elements of the field F_q with multiplicative orders 2a and
2b, and let x = [ [u, 1], [0, u^-1] ] and y = [ [v, 0], [t, v^-1] ] be
in SL(2,q), where t remains to be chosen. Then x and y have orders 2a and 2b.

The trace of xy is uv + t + u^-1v^-1, and so by suitable choice of t, we
can make this equal to any value we like. So we can make it equal to the
trace of an element of SL(2,q) with order 2c, and then xy will have order 2c.
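To see the construction in action, here is a quick computational check (my own illustration, not part of Holt's post): for $(a,b,c) = (2,3,7)$ one can take $q = 337$, a prime with $q - 1 = 336$ divisible by $2a = 4$, $2b = 6$ and $2c = 14$.

```python
p, a, b, c = 337, 2, 3, 7

def mul(A, B):
    # 2x2 matrix multiplication over GF(p)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))

I2 = ((1, 0), (0, 1))

def order(M):
    P, n = M, 1
    while P != I2:
        P, n = mul(P, M), n + 1
    return n

def root_of_unity(n):
    # an element of exact multiplicative order n in GF(p)*
    for g in range(2, p):
        u = pow(g, (p - 1) // n, p)
        if all(pow(u, n // q, p) != 1 for q in (2, 3, 7) if n % q == 0):
            return u

inv = lambda z: pow(z, p - 2, p)   # inverse mod p via Fermat's little theorem

u, v, w = root_of_unity(2 * a), root_of_unity(2 * b), root_of_unity(2 * c)

# choose t so that tr(xy) = uv + t + (uv)^{-1} equals w + w^{-1},
# the trace of an element of SL(2,p) of order 2c
t = (w + inv(w) - u * v - inv(u * v)) % p

x = ((u, 1), (0, inv(u)))
y = ((v, 0), (t, inv(v)))

print(order(x), order(y), order(mul(x, y)))   # 4 6 14
```

The images of $x$, $y$, $xy$ in PSL(2,337) then have orders 2, 3, 7, as the proof promises.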

Yes, you can do it with triangle groups; the hyperbolic case follows since those triangle groups are residually finite (basically by Malcev's theorem), and then you can do the spherical (already finite) and Euclidean cases by hand.
– Steve D, May 16 '10 at 22:41


(Or observe that the spherical and Euclidean cases are also residually finite. All these groups are linear!)
– HJRW, May 16 '10 at 23:07


Do you think that the theorem is hard, or is it just a case of a messy initial proof? After all, both of the approaches above rely on methods from the 19th century.
– Victor Protsak, May 17 '10 at 23:20

Cantor's proof of the existence of transcendental numbers. With a (now) obvious one-line argument he showed that there are uncountably many of them --- when Liouville, Hermite and others had to take (putative) transcendental numbers one at a time ...

Newman's argument (especially Korevaar's and Zagier's version of it) turned the Prime Number Theorem, which took a century to be proved, into something that can be explained in a few minutes to any graduate student.

In 1917 Hardy and Ramanujan proved that all but $o(x)$ integers $n \leq x$ have $\log\log n + O((\log\log n)^{1/2 + \epsilon})$ distinct prime factors. The proof was long and relied on establishing (by induction!) a precise bound for the number of integers with exactly $k$ distinct prime factors (with $k$ arbitrary, and possibly tending to infinity with $x$). A short "two-line" proof was found by Turan in 1934.

Hardy disliked Turan's proof because, as he claimed, it did not give proper insight. However, as it turned out, it was Turan's method that lent itself to generalization. Twenty years later his inequality became the more general Turan-Kubilius inequality. Curiously enough, it was later realized by Elliott that taking the "dual" of the Turan-Kubilius inequality immediately yields the arithmetic large sieve inequality! :-)
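For reference, the second-moment computation behind Turan's proof runs as follows (a standard paraphrase). From $\sum_{p \le x} 1/p = \log\log x + O(1)$ one gets

$$\sum_{n \le x} \omega(n) = \sum_{p \le x} \left\lfloor \frac{x}{p} \right\rfloor = x\log\log x + O(x), \qquad \sum_{n \le x} \omega(n)^2 = x(\log\log x)^2 + O(x\log\log x),$$

hence $\sum_{n \le x} (\omega(n) - \log\log x)^2 = O(x\log\log x)$, and Chebyshev's inequality bounds the number of $n \le x$ with $|\omega(n) - \log\log x| > (\log\log x)^{1/2+\epsilon}$ by $o(x)$.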

As Persi Diaconis puts it (when discussing Hardy-Ramanujan's proof): "Impressive as the argument is, to a probabilist, the project seems out of focus; they are proving the weak law of large numbers by using the local central limit theorem. If all that is wanted is their theorem, there are much easier arguments. With all their work, one could reach much stronger conclusions". See www-stat.stanford.edu/~cgates/PERSI/papers/Hardy.pdf for the rest of Persi's nice article.
– maks, May 16 '10 at 23:42

The Nielsen-Schreier subgroup theorem: subgroups of free groups are free. This has a very quick proof using the fact that a group is free precisely when it acts freely and without inversions on a tree.

Lomonosov's 1973 proof that every compact operator $T$ has a hyperinvariant subspace (i.e., a subspace that is invariant for every operator that commutes with $T$) was much simpler than the proofs existing then that every compact operator has an invariant subspace. See http://en.wikipedia.org/wiki/Invariant_subspace_problem. However, Wikipedia fails to mention that Lomonosov's proof was further simplified to replace the Schauder fixed point theorem by the spectral radius formula $r(T) = \lim_{n\to\infty} \|T^n\|^{1/n}$ (see e.g. Rudin's Functional Analysis), so that Lomonosov's theorem is taught (or assigned as an exercise) in classes in which the spectral radius formula is introduced.

The Cayley-Hamilton theorem. Apparently, Cayley only proved it for $2\times2$ and - in a horrendous calculation - for $3\times3$ matrices and then wrote something outrageous in the spirit of "and similarly, we can prove it for any $n$". Hamilton then proved another special case in a paper on linear operators on the space of quaternions. Nowadays, it is proven in full generality in just a couple of lines, using the fact that the set of diagonalisable matrices of a given dimension, for which the theorem is trivially true, is dense in the set of all matrices of the same dimension.
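A numerical illustration of the density argument (my sketch, not the proof itself): $p_A(A)$ depends continuously on $A$ and vanishes whenever $A$ has distinct eigenvalues, so it vanishes everywhere; below, the non-diagonalisable Jordan block $J$ is reached as a limit of diagonalisable perturbations.

```python
import numpy as np
from numpy.linalg import matrix_power

def p_at_self(A):
    # evaluate the characteristic polynomial of A at A itself
    coeffs = np.poly(A)   # char poly coefficients, leading coefficient first
    n = len(A)
    return sum(cf * matrix_power(A, n - k) for k, cf in enumerate(coeffs))

J = np.array([[0.0, 1.0], [0.0, 0.0]])      # nilpotent, not diagonalizable
for eps in (1e-1, 1e-3, 1e-6):
    A = J + np.diag([0.0, eps])             # distinct eigenvalues 0 and eps
    assert np.abs(p_at_self(A)).max() < 1e-12

print(np.abs(p_at_self(J)).max())           # vanishes for J as well
```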

Is there an easy and elementary proof that the diagonal matrices are Zariski dense?
– Lennart Meier, Oct 5 '10 at 12:17


The map that assigns to a matrix the discriminant of its characteristic polynomial is a polynomial function in the entries, hence continuous. Over an algebraically closed field, the set of non-diagonalisable matrices is contained in the pre-image of 0 under this map, since any matrix with distinct eigenvalues is diagonalisable. The assertion now easily follows. You have two options: either working over the complex numbers (a pretty cute little argument, also only 3 lines, reduces to this case) and then you can even use the Euclidean topology on C^{n^2}, or working with the Zariski topology.
– Alex B., Oct 5 '10 at 13:08


@Alex: I think you mean the discriminant of its minimal polynomial.
– Guillermo Mantilla, Oct 11 '10 at 7:41


I think I like the argument using the Zariski topology better. It is very direct. Let $c(t)$ be the characteristic polynomial function, let $f_{ij}$ be the polynomial map $a \mapsto c(a)_{ij}$ and let $\Delta$ be the discriminant of $c(t)$. So $\Delta$ and $f_{ij}$ are polynomial functions on the space of $n \times n$ matrices. Now $f_{ij}$ vanishes on any matrix $a$ with distinct eigenvalues: $f_{ij}(a) = 0$ whenever $\Delta(a) \neq 0$. So $\Delta f_{ij}$ is identically zero for all $i,j$; since $\Delta$ is not the zero polynomial and the polynomial ring is an integral domain, $f_{ij} = 0$ for all $i,j$ as required.
– Konstantin Ardakov, Sep 24 '11 at 11:38

The Van der Waerden conjecture for permanents was stated in 1926 and remained open for over 50 years. It was considered "one of the famous open problems in combinatorial theory" (in van Lint's words). It turned out to be an easy consequence of the Alexandrov-Fenchel inequality from the late 1930s. See this article for the history and basically the whole proof.

This article homepages.cwi.nl/~lex/files/perma5.pdf from a recent issue of the American Mathematical Monthly describes the recent - and completely elementary - proof of Van der Waerden's theorem due to Leonid Gurvits.
– Jon Yard, May 17 '10 at 17:15

Though the idea behind it all is childishly simple, yet the method of analytic geometry is so powerful that the very ordinary boys of seventeen can use it to prove results which would have baffled the greatest of Greek geometers--Euclid, Archimedes, and Apollonius.

The associativity of the group law on an elliptic curve can be proved by a tedious and unenlightening calculation, but it can be derived pretty quickly once you have developed some curve theory (Riemann-Roch, etc.).
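Granting the machinery, the quick argument fits in one sentence: by Riemann-Roch, the map $P \mapsto [P] - [O]$ is a bijection from the curve $E$ to the divisor class group $\operatorname{Pic}^0(E)$, and this bijection carries the chord-and-tangent operation to addition of divisor classes, so associativity is inherited from the (obviously associative) group $\operatorname{Pic}^0(E)$.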

The theory of the Weierstrass p-function isn't entirely unenlightening. I assume you're referring to the chord-and-tangent proof, which is awful, but it's not so bad over C when elliptic functions are available.
– Qiaochu Yuan, May 17 '10 at 3:59


@Timothy: Riemann-Roch seems like a lot just to prove associativity. You only need to note the $P+(Q+R)=(P+Q)+R$ holds separately for the obvious cases $P=O$ and $Q=O$, then apply the Rigidity Theorem. This has a simple proof (Theorem 7.13 of J.S. Milne's notes on Alg. Geom. ver 5.10. jmilne.org/math/CourseNotes/ag.html).
– George Lowther, Jan 25 '11 at 2:20

Gauss himself later found another 5 proofs; in particular, his proof based on Gauss sums (reproduced in Ireland and Rosen) is very short and uses all the cyclotomy that you can possibly need.
– Victor Protsak, May 18 '10 at 2:26

I've lately found myself admiring the proof of the fundamental theorem of algebra using linear algebra, due to H. Derksen, American Mathematical Monthly, 110 (7) (2003), 620–623.

He proves directly that linear operators on finite dimensional complex vector spaces admit eigenvectors, and deduces the fundamental theorem from this. I like the argument because it is completely elementary: all it uses is that odd-degree polynomials over the reals have a real root, and that complex numbers have complex square roots (in particular, it avoids the machinery of complex or real analysis, and can even be presented without any reference to determinants). Moreover, the proof gives the result that $R(\sqrt{-1})$ is algebraically closed whenever $R$ is a real closed field, which before I had only seen proved using Galois theory or analogous, relatively sophisticated techniques.

Derksen's proof is a nice induction where first odd dimensions are taken care of, then dimensions of the form 4k+2, then of the form 8k+4, etc.

Well, this is not really a quick proof. The proofs by complex variables are much faster (granting enough build-up in complex variables). To be fair, a few years ago I made Derksen's proof the goal of my undergraduate linear algebra course for math majors, but it took me two days to go through the argument carefully and I decided it might have been hard for them to appreciate when the argument goes on that long.
– KConrad, May 16 '10 at 23:48


" Well, this is not really a quick proof. The proofs by complex variables are much faster (granting enough build-up in complex variables) " Oh, sure, "granting enough build-up" is the key here. This proof is perhaps the fastest I know from the ground up. But we rarely start at the ground anymore.
– Andres Caicedo, May 17 '10 at 0:09


What about the topological proof? Building up to the fundamental group of the circle only takes about a page!
– Steven Gubkin, May 17 '10 at 1:08


What I meant by "this (Derksen's proof) is not really a quick proof" is that even if you try to explain it to a mathematician I still think it will take a bit of lead-in to get into the argument and you don't walk away thinking "oh, that was very natural", which was the attitude which the original question was about. By the way, are you saying Derksen's proof is the fastest you know which starts from scratch? There are proofs by multivariable calculus which take less time. A proof with double integrals is at math.uconn.edu/~kconrad/blurbs/fundthmalg/…
– KConrad, May 17 '10 at 4:54


Ha! @KConrad: I've been looking at the notes at your site. Very nice!
– Andres Caicedo, May 17 '10 at 15:12

The original article containing the proof of the Radon-Nikodým theorem runs to about 50 pages. John von Neumann proved it in three lines, using a little trick and the Riesz representation theorem (the one about Hilbert space functionals).
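Granting the Riesz representation theorem, the trick can be sketched in a few lines (for finite measures $\mu \ll \nu$, the essential case; this is a standard paraphrase). Put $\lambda = \mu + \nu$. The functional $f \mapsto \int f\,d\mu$ is bounded on $L^2(\lambda)$ (by Cauchy-Schwarz, since $\lambda$ is finite), so Riesz gives $g \in L^2(\lambda)$ with

$$\int f\,d\mu = \int fg\,d\lambda, \qquad \text{equivalently} \qquad \int f(1-g)\,d\mu = \int fg\,d\nu.$$

One checks $0 \le g \le 1$ $\lambda$-a.e. and that $\nu(\{g=1\})=0$, hence $\mu(\{g=1\})=0$; taking $f = \chi_E(1+g+\cdots+g^{n-1})$ and letting $n \to \infty$ gives $\mu(E) = \int_E \frac{g}{1-g}\,d\nu$, so $d\mu/d\nu = g/(1-g)$.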

I have two entries, although there is a wealth of elementary geometric examples similar to #1 and several alternative proofs of #2.

(1) Pascal's theorem: If H is a hexagon whose vertices lie on a conic section Q, then the points $A,B,C$ where the pairs of opposite sides intersect are collinear.

I think that the first proof used Menelaus's criterion of collinearity and required a figure, as well as keeping track of various points and lines in order to use Menelaus's theorem. A beautiful short proof based on Bezout's theorem is in vol 1 of Shafarevich's "Algebraic geometry":

If the sides of H are given by the vanishing of linear forms $l_1,l_2,l_3$ and $m_1,m_2,m_3$ in homogeneous projective coordinates, where $l_i$ is the opposite of $m_i$, then $l_1 l_2 l_3 - \lambda m_1 m_2 m_3$ vanishes at the vertices of $H$ and one more arbitrarily chosen point on Q, for a suitable $\lambda$; since $6+1>2\cdot 3$, by Bezout, the cubic is reducible, so it consists of Q and another component, which is a line passing through $A,B,C$.
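For readers who like to see it happen, here is a quick numerical sanity check of the statement (a sketch in Python; the vertex angles and tolerance are my arbitrary choices). In homogeneous coordinates both the join of two points and the meet of two lines are given by cross products.

```python
import numpy as np

# six arbitrary points on the unit circle, in homogeneous coordinates
angles = [0.1, 0.9, 2.0, 2.9, 4.0, 5.3]
P = [np.array([np.cos(t), np.sin(t), 1.0]) for t in angles]

def join(a, b):   # line through two points
    return np.cross(a, b)

def meet(l, m):   # intersection point of two lines
    return np.cross(l, m)

# pairs of opposite sides of the hexagon P0 P1 P2 P3 P4 P5
A = meet(join(P[0], P[1]), join(P[3], P[4]))
B = meet(join(P[1], P[2]), join(P[4], P[5]))
C = meet(join(P[2], P[3]), join(P[5], P[0]))

# three points are collinear iff the det of their homogeneous coords vanishes
A, B, C = (v / np.linalg.norm(v) for v in (A, B, C))
det = np.linalg.det(np.array([A, B, C]))
print(abs(det) < 1e-10)   # True
```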

(2) Isoperimetric inequality: If a simple closed curve in the plane has length $L$ and bounds a region of area $A$, then $L^2-4\pi A\geq 0$ (with equality only in the case of a circle).

The first proof of the isoperimetric property of the circle was attempted by Jacob Steiner using the "four rod" method (related to "Steiner's symmetrization"), but it proceeded under the assumption that the minimum is attained and so was incomplete. Weierstrass gave the first rigorous proof based on variational calculus and it was painstaking. Adolf Hurwitz found an essentially one-line proof (after all the notation has been set up) that is reproduced in "Einfuhrung in die Differentialgeometrie" by Wilhelm Blaschke (p.33 of 1950 edition):
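(The quoted page is not reproduced above, but in modern Fourier notation Hurwitz's argument runs as follows.) Parametrize the curve with constant speed as $z(t)=\sum_{n\in\mathbb{Z}} c_n e^{int}$ for $0\le t\le 2\pi$, so that $|z'(t)| = L/2\pi$. Then Parseval gives $L^2 = 2\pi\int_0^{2\pi}|z'|^2\,dt = 4\pi^2\sum_n n^2|c_n|^2$ and $A = \frac{1}{2}\oint(x\,dy-y\,dx) = \pi\sum_n n|c_n|^2$, whence

$$L^2 - 4\pi A = 4\pi^2\sum_n (n^2-n)|c_n|^2 \ge 0,$$

with equality iff $c_n=0$ for all $n \neq 0, 1$, i.e. iff the curve is a circle.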

The Menelaos proof isn't particularly complicated, if done right. Geometry books like to obscure it by giving the points arbitrary names, but if you follow the path of "rewriting the problem in terms of triangle geometry, and then applying Menelaos" (a very good general tactic, since triangles are probably the mathematical object we know most about), it becomes straightforward and quick. Pascal, rewritten in terms of triangles: Let ABC be a triangle, A' and A'' two points on the line BC, B' and B'' two points on the line CA, and C' and C'' two points on the line AB. [...]
– darij grinberg, May 18 '10 at 10:18

The theorem that the left-hand trefoil knot is not isotopic to the
right-hand trefoil knot was originally proved (by Max Dehn in 1914),
by a rather grueling analysis of the automorphisms of the trefoil
knot group. The theorem became much easier with the advent of the
Jones polynomial in the 1980s.

Surely it's much easier to just note that the signature is not zero and that it is alternating. The Jones polynomial looks to me like massive overkill.
– Daniel Moskovich, May 17 '10 at 3:23


Daniel, thanks for this information, which I have since found in Lickorish's An Introduction to Knot Theory, p.86. I still think it is debatable, however, whether the signature approach is easier than the Jones polynomial.
– John Stillwell, May 17 '10 at 7:08


The proof became much easier even before the Jones polynomial. Seifert's classification of Seifert fibred 3-manifolds does the job. See for example Hatcher's 3-manifold notes (online).
– Ryan Budney, Jun 18 '10 at 6:12

It was about 20 years ago that I learned this, so I might not remember things correctly. However, I think Stokes's theorem was originally considered non-trivial, whereas using differential forms it can be proved by a one-line argument.
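For concreteness, the modern statement absorbs Green's, Gauss's and the classical Stokes theorems all at once:

$$\int_M d\omega = \int_{\partial M} \omega$$

for a compactly supported smooth $(n-1)$-form $\omega$ on an oriented $n$-manifold with boundary $M$; with the machinery of differential forms in place, the proof reduces via a partition of unity to the fundamental theorem of calculus on a half-space.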

We are told by Maxwell in his A Treatise on Electricity and Magnetism (1873, p. 27), "This theorem was given by Professor Stokes, Smith’s Prize Examination, 1854, question 8." However, this does not mean that Stokes's theorem was considered easy -- those old Cambridge exams were possibly the most difficult of all time.
– John Stillwell, May 18 '10 at 5:07

I think that this one is only a half victory. When I went through the cohomological machinery used to prove Bezout's theorem, it wasn't as quick as it first seemed. At the same time, I think that there is a resultant-ish proof based on the Hilbert-Poincare series of the projective varieties involved, which is comparable in difficulty to the homological proof. Note that if you want Bezout's theorem in positive characteristic, you have to either develop etale cohomology or do something more direct.
– Greg Kuperberg, May 18 '10 at 3:26

Here is another example from functional analysis. There are several basic results, such as the principle of uniform boundedness and the open mapping theorem, that follow easily from the Baire category theorem. However, I recall from reading Halmos's autobiography I Want to Be a Mathematician that the original proofs of these results were rather complicated and the theorems were considered to be significant achievements.

I think the Baire category theorem is responsible for quick proofs of lots of hard theorems, such as the existence of continuous nowhere differentiable functions. I recently worked out a way to use it to prove the existence of bump functions, too. I think the Hahn-Banach theorem was also considered difficult in its time, and lots of its easy consequences started life as nontrivial theorems. I guess there are a wealth of examples in functional analysis, perhaps since analysis has been around for so long. For that reason I was expecting more examples in number theory and algebraic geometry.
– Paul Siegel, May 20 '10 at 14:00


@Paul: can you explain (or link to) your argument about Baire and bump functions? It would be nice to see.
– Andrea Ferretti, May 21 '10 at 12:42

I was told by my (graduate school) teacher of functional analysis that originally the complex case of the Hahn-Banach theorem was considered a major open problem. It was eventually shown to be such a simple consequence of the real case, that now, no one knows who came up with the trick.

Don't know for sure if this example qualifies, but it certainly is a hard problem which becomes trivial from the right point of view. (I learned this from Martin Gardner, proper credits might be researched if necessary).

Problem: three circles in the plane, no two with the same radius, pairwise disjoint. For each pair of circles, there are four straight lines tangent to both; take the two which leave both circles on the same side; they intersect at a point. Repeat the construction for each pair of circles. We get three points: prove that they are collinear.

You may want to think a little about the problem; it can be solved by either plane or analytic geometry, with some effort. Not too difficult, but not a one-liner.

Now consider the following solution: add a dimension. You have three spheres, and if you section them through their centers with a plane you get the original three circles. Consider the cone determined by each pair of spheres; its section is the pair of tangent lines seen above, and the tips of the cones are the three points in the problem. Now take two planes touching the three spheres from above and from below....
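Since the problem is fully concrete, here is a numerical spot check (the three circles below are my arbitrary choice of pairwise disjoint circles with distinct radii).

```python
import numpy as np

circles = [(np.array([0.0, 0.0]), 1.0),
           (np.array([6.0, 1.0]), 2.0),
           (np.array([2.0, 5.0]), 3.5)]

def external_center(c1, r1, c2, r2):
    # the intersection of the two external tangents divides the segment of
    # centers externally in the ratio of the radii
    return (r2 * c1 - r1 * c2) / (r2 - r1)

pts = [external_center(*circles[i], *circles[j])
       for i, j in ((0, 1), (1, 2), (2, 0))]

# the three points are collinear iff the triangle they span has zero area
M = np.array([[p[0], p[1], 1.0] for p in pts])
area2 = np.linalg.det(M)       # twice the signed area
print(abs(area2) < 1e-9)       # True
```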

There is another one-line solution to this. Each of the three points is the center of a homothety which maps one of the three circles to another one. The composition of the three homotheties (in the right order) is the identity, since it fixes a circle and is not a 180° rotation. So the centers of the homotheties are collinear.
– darij grinberg, Jan 24 '11 at 23:25


I think it was from one of Peter Winkler's books that I first became aware that the 3D proof doesn't quite work as nicely as it seems to at first glance. In particular, it doesn't apply in all cases. Perhaps the easiest example to see is the case of two giant spheres and a tiny sphere. Then there is no plane touching all three spheres from above or from below.
– Timothy Chow, Aug 11 '11 at 14:39

I believe Schur's Lemma was originally considered difficult (after all it did get named). However it is now a one-line proof in an undergraduate course.

I suspect Schur was interested in finite dimensional representations of finite dimensional algebras over the complex numbers. Then the lemma is that the endomorphism ring of an irreducible representation is the complex numbers. Don't ask me why this was considered difficult. The definition of an abstract algebra was not published until after Molien and Wedderburn's results so I can see the statement would have been convoluted.
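For comparison, the one-sentence modern proof: if $\varphi: V \to W$ is a nonzero map of irreducible representations, then $\ker\varphi$ and $\operatorname{im}\varphi$ are subrepresentations, forcing $\ker\varphi = 0$ and $\operatorname{im}\varphi = W$, so $\varphi$ is an isomorphism; and if $V = W$ is finite dimensional over $\mathbb{C}$, any eigenvalue $\lambda$ of $\varphi$ makes $\varphi - \lambda\cdot\mathrm{id}$ a non-invertible intertwiner, hence zero, so $\operatorname{End}(V) = \mathbb{C}$.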

As far as I know, Schur's proof was the one that we still use. "After all it did get named" is not a good argument: it's an extremely important result, even though the proof is very easy, hence nomenclature "lemma". A better example within the same realm would be Hurwitz's proof of complete reducibility of representations of GL_n using averaging over the maximal compact subgroup: that had previously been known only in special cases, via the (complicated) Cayley $\Omega$ process.
– Victor Protsak, May 18 '10 at 2:30


I once simultaneously audited a physics course taught by J. Van Vleck (who later won a Nobel prize) and took a math course from George Mackey. Both of them proved Schur's Lemma, and only a few days apart. Van Vleck's proof was done entirely with complicated matrix manipulations and took about fifteen minutes. Mackey gave the easy one sentence proof without even writing anything on the board !
– Dick Palais, Jan 25 '11 at 6:58

How about de Branges' proof of the Bieberbach conjecture? My understanding is that his original proof ran to 100+ pages, but others soon found a way of bringing it down to considerably less than that - maybe not a quick proof, but a relatively quick proof.

How does that fit the requirements? What is the sophisticated result with short proof from which it follows? de Branges' original "proof" was notorious for its complexity and gaps, so yes, anything else would be an advance...
– Victor Protsak, May 17 '10 at 23:06

The incompressibility method based on Kolmogorov complexity is described in "Kolmogorov Incompressibility Method in Formal Proofs: A Critical Survey" (V. Megalooikonomou, 1997) as often being more elegant, intuitive, simpler and shorter than counting arguments, or the probabilistic method, in areas such as lower bounds, average case complexity, random graphs or pumping lemmas in formal language theory.

Short-time existence for the Ricci flow was initially proved by Hamilton by a long and involved argument using the Nash-Moser implicit function theorem. Then Dennis DeTurck found a nice trick which showed that the Ricci flow is equivalent to a standard parabolic problem.

Most of the problems tackled in introductory calculus courses (tangent lines to and areas under basic curves, volumes and areas of solids of revolution, etc.) had to be solved on a case-by-case basis, with some pretty complicated and ingenious proofs; now any undergraduate can solve them in a few lines by rote methodology.
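For instance, the rote methodology in a computer algebra system (a toy illustration; Archimedes needed a separate ingenious exhaustion argument for each such quadrature):

```python
import sympy as sp

x = sp.symbols('x')
f = x**2

# area under the parabola on [0, 1], and the volume of the solid obtained
# by revolving it about the x-axis (disk method: V = π ∫ f(x)² dx)
area = sp.integrate(f, (x, 0, 1))
volume = sp.pi * sp.integrate(f**2, (x, 0, 1))

print(area, volume)   # 1/3 pi/5
```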