Well-Ordering Principle:

Every non-empty set S of non-negative integers contains a least element; that is, there is some integer a in S such that $a \leq b$ for all b belonging to S.

Because this principle plays a role in many proofs related to the foundations of mathematics, let us use it to show that the set of positive integers has what is known as the Archimedean property.

Archimedean property:

If a and b are any positive integers, then there exists a positive integer n such that $na \geq b$.

Proof:

By contradiction:

Assume that the statement of the theorem is not true, so that for some a and b we have $na < b$ for every positive integer n. Then the set $S = \{ b - na : n \text{ a positive integer} \}$ consists entirely of positive integers. By the Well-Ordering Principle, S will possess a least element, say $b - ma$. Notice that $b - (m+1)a$ also lies in S, because S contains all integers of this form. Further, we also have $b - (m+1)a = (b - ma) - a < b - ma$, contrary to the choice of $b - ma$ as the smallest integer in S. This contradiction arose out of our original assumption that the Archimedean property did not hold; hence the property is proved. QED.
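To make the statement concrete, here is a minimal Python sketch that searches for the least positive integer n with $na \geq b$. The function name `least_multiplier` is hypothetical and chosen only for this illustration. The loop is guaranteed to stop precisely because of the Archimedean property; the code illustrates the statement and is not a substitute for the proof.

```python
def least_multiplier(a: int, b: int) -> int:
    """Return the least positive integer n with n * a >= b (a and b positive)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    n = 1
    # The Archimedean property guarantees that this loop terminates.
    while n * a < b:
        n += 1
    return n


# Example: a = 3, b = 20. Here 6 * 3 = 18 < 20 <= 21 = 7 * 3.
print(least_multiplier(3, 20))
```

For positive integers the answer coincides with the ceiling of b/a, which gives an independent way to check the search.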

First Principle of Finite Induction:

Let S be a set of positive integers with the following properties:

a) the integer 1 belongs to S.

b) Whenever the integer k is in S, the next integer $k+1$ is also in S.

Then, S is the set of all positive integers.

Second Principle of Finite Induction:

Let S be a set of positive integers with the following properties:

a) the integer 1 belongs to S.

b) If k is a positive integer such that $1, 2, \ldots, k$ belong to S, then $k+1$ must also be in S.

Then, S is the set of all positive integers.

So, in a lighter vein, we assume that the set of positive integers is given, just as Kronecker had observed: “God created the natural numbers, all the rest is man-made.”

(The following is reproduced from the book “The Way I Remember It” by Walter Rudin. The purpose is just to share the insights of a formidable analyst with the student community.)

When I arrived at MIT in 1950, Banach algebras were one of the hot topics. Gelfand’s 1941 paper “Normierte Ringe” had apparently only reached the USA in the late forties, and circulated on hard-to-read smudged purple ditto copies. As one application of the general theory presented there, it contained a stunningly short proof of Wiener’s lemma: the Fourier series of the reciprocal of a nowhere vanishing function with absolutely convergent Fourier series also converges absolutely. Not only was the proof extremely short, it was one of those that are hard to forget. All one needs to remember is that the absolutely convergent Fourier series form a Banach algebra, and that every multiplicative linear functional on this algebra is evaluation at some point of the unit circle.

This may have led some to believe that Banach algebras would now solve all our problems. Of course, they could not, but they did provide the right framework for many questions in analysis (as does most of functional analysis) and conversely, abstract questions about Banach algebras often gave rise to interesting problems in “hard analysis”. (Hard analysis is used here as Hardy and Littlewood used it. For example, you do hard analysis when, in order to estimate some integral, you break it into three pieces and apply different inequalities to each.)

One type of Banach algebras that was soon studied in detail were the so-called function algebras, also known as uniform algebras.

To see what these are, let $C(X)$ be the set of all complex-valued continuous functions on a compact Hausdorff space X. A function algebra on X is a subset A of $C(X)$ such that

(i) If f and g are in A, so are $f + g$, $fg$, and $cf$ for every complex number c (this says that A is an algebra).

(ii) A contains the constant functions.

(iii) A separates points on X (that is, if $p \neq q$, both in X, then $f(p) \neq f(q)$ for some f in A), and

(iv) A is closed, relative to the sup-norm topology of $C(X)$, that is, the topology in which convergence means uniform convergence.

A is said to be self-adjoint if the complex conjugate of every f in A is also in A. The most familiar example of a non-self-adjoint function algebra is the disc algebra $A(U)$, which consists of all f in $C(\bar{U})$ that are holomorphic in U. (Here, and later, U is the open unit disc in C, the complex plane, and $\bar{U}$ is its closure.) I already had an encounter with $A(U)$, a propos maximum modulus algebras.

One type of question that was asked over and over again was: Suppose that a function algebra on X satisfies … and …and is it C(X)? (In fact, 20 years later a whole book, entitled “Characterizations of C(X) among its Subalgebras” was published by R. B. Burckel.) The Stone-Weierstrass Theorem gives the classical answer. Yes, if A is self-adjoint.

There are problems even when X is a compact interval I on the real line. For instance, suppose A is a function algebra on I, and to every maximal ideal M of A corresponds a point p in I such that M is the set of all f in A having $f(p) = 0$. (In other words, the only maximal ideals of A are the obvious ones.) Is $A = C(I)$? This is still unknown, in 1995.

If $f_1, \ldots, f_n$ are in $C(I)$, and the n-tuple $(f_1, \ldots, f_n)$ separates points on I, let $A(f_1, \ldots, f_n)$ be the smallest closed subalgebra of $C(I)$ that contains $f_1, \ldots, f_n$ and 1.

When $f_1$ is 1-1 on I, it follows from an old theorem of Walsh (Math. Annalen 96, 1926, 437-450) that $A(f_1) = C(I)$.

Stone-Weierstrass implies that $A(f_1, \ldots, f_n) = C(I)$ if each $f_i$ is real-valued.

In the other direction, John Wermer showed in Annals of Math. 62, 1955, 267-270, that $A(f_1, f_2, f_3)$ can be a proper subset of $C(I)$!

Here is how he did this:

Let E be an arc in C, of positive two-dimensional measure, and let $A_E$ be the algebra of all continuous functions on the Riemann sphere S (the one-point compactification of C) which are holomorphic in the complement of E. He showed that $g(E) = g(S)$ for every g in $A_E$, that $A_E$ contains a triple $(g_1, g_2, g_3)$ that separates points on S, and that the restriction of $A_E$ to E is closed in $C(E)$. Pick a homeomorphism $\varphi$ of I onto E and define $f_i = g_i \circ \varphi$. Then $A(f_1, f_2, f_3) \neq C(I)$, for if h is in $A(f_1, f_2, f_3)$ then $h = g \circ \varphi$ for some g in $A_E$, so that $h(I) = g(E) = g(S)$ is the closure of an open subset of C (except when h is constant).

In order to prove the same with two functions instead of three, I replaced John’s arc E with a Cantor set K, also of positive two-dimensional measure. (I use the term “Cantor set” for any totally disconnected compact metric space with no isolated points; these are all homeomorphic to each other.) A small extra twist, applied to John’s argument, with $A_K$ in place of $A_E$, proved that $A(f_1, f_2)$ can also be smaller than $C(I)$.

I also used $A_K$ to show that $C(K)$ contains maximal closed point-separating subalgebras that are not maximal ideals, and that the same is true for $C(X)$ whenever X contains a Cantor set. These ideas were pushed further by Hoffman and Singer in Acta Math. 103, 1960, 217-241.

In the same paper, I showed that $A(f_1, \ldots, f_n) = C(I)$ when $n - 1$ of the n given functions are real-valued.

Since Wermer’s paper was being published in the Annals, and mine strengthened his theorem and contained other interesting (at least to me) results, I sent mine there too. It was rejected, almost by return mail, by an anonymous editor, for not being sufficiently interesting. I have had a few other papers rejected over the years, but for better reasons. This one was published in Proc. AMS 7, 1956, 825-830, and is one of six whose Russian translations were made into a book, “Some Questions in Approximation Theory”; the others were three by Bishop and two by Wermer. Good company.

Later, Gabriel Stolzenberg (Acta Math. 115, 1966, 185-198) and Herbert Alexander (Amer. J. Math., 93, 1971, 65-74) went much more deeply into these problems. One of the highlights in Alexander’s paper is:

$A(f_1, \ldots, f_n) = C(I)$ if $f_1, \ldots, f_{n-1}$ are of bounded variation.

A propos the Annals (published by Princeton University) here is a little Princeton anecdote. During a week that I spent there, in the mid-eighties, the Institute threw a cocktail party. (What I enjoyed best at that affair was being attacked by Armand Borel for having said, in print, that sheaves had vanished into the background.) Next morning I overheard the following conversation in Fine Hall:

Prof. A: That was a nice party yesterday, wasn’t it?

Prof. B: Yes, and wasn’t it nice that they invited the whole department.

Prof. A: Well, only the full professors.

Prof. B: Of course.

The above-mentioned facts about Cantor sets led me to look at the opposite extreme, the so-called scattered spaces. A compact Hausdorff space Q is said to be scattered if Q contains no perfect set; every non-empty compact set F in Q thus contains a point that is not a limit point of F. The principal result proved in Proc. AMS 8, 1957, 39-42 is:

THEOREM: Every closed subalgebra of $C(Q)$ is self-adjoint.

In fact, the scattered spaces are the only ones for which this is true, but I did not state this in that paper.

In 1956, I found a very explicit description of all closed ideals in the disc algebra $A(U)$ (defined at the beginning of this chapter). The description involves inner functions. These are the bounded holomorphic functions in U whose radial limits have absolute value 1 at almost every point of the unit circle $T$. They play a very important role in the study of holomorphic functions in U (see, for instance, Garnett’s book, Bounded Analytic Functions) and their analogues will be mentioned again, on Riemann surfaces, in polydiscs, and in balls in $\mathbb{C}^n$.

Recall that a point $\zeta$ on $T$ is called a singular point of a holomorphic function f in U if f has no analytic continuation to any neighbourhood of $\zeta$. The ideals in question are described in the following:

THEOREM: Let E be a compact subset of $T$, of Lebesgue measure 0, let u be an inner function all of whose singular points lie in E, and let $J(E, u)$ be the set of all f in $A(U)$ such that

(i) the quotient f/u is bounded in U, and

(ii) $f(\zeta) = 0$ at every $\zeta$ in E.

Then, $J(E, u)$ is a closed ideal of $A(U)$, and every closed ideal of $A(U)$ is obtained in this way.

One of several corollaries is that every closed ideal of A(U) is principal, that is, is generated by a single function.

I presented this at the December 1956 AMS meeting in Rochester, and was immediately told by several people that Beurling had proved the same thing, in a course he had given at Harvard, but had not published it. I was also told that Beurling might be quite upset at this, and having Beurling upset at you was not a good thing. Having used his famous paper about the shift operator on a Hilbert space as my guide, I was not surprised that he too had proved this, but I saw no reason to withdraw my already submitted paper. It appeared in Canadian J. Math. 9, 1957, 426-434. The result is now known as the Beurling-Rudin theorem. I met him several times later, and he never made a fuss over this.

In the preceding year Lennart Carleson and I, neither of us knowing what the other was doing, proved what is now known as the Rudin-Carleson interpolation theorem. His paper is in Math. Z. 66, 1957, 447-451, mine in Proc. AMS 7, 1956, 808-811.

THEOREM. If E is a compact subset of $T$, of Lebesgue measure 0, then every f in C(E) extends to a function F in A(U).

(It is easy to see that this fails if $m(E) > 0$, where m denotes Lebesgue measure on $T$. To say that F is an extension of f means simply that $F(\zeta) = f(\zeta)$ at every $\zeta$ in E.)

Our proofs have some ingredients in common, but they are different, and we each proved more than is stated above. Surprisingly, Carleson, the master of classical hard analysis, used a soft approach, namely duality in Banach spaces, and concluded that F could be so chosen that $\|F\|_U = \|f\|_E$. (The norms are sup-norms over the sets appearing as subscripts.) In the same paper he used his Banach space argument to prove another interpolation theorem, involving Fourier-Stieltjes transforms.

On the other hand, I did not have functional analysis in mind at all; I did not think of the norms or of Banach spaces. I proved, by a bare-hands construction combined with the Riemann mapping theorem, that if $\Omega$ is a closed Jordan domain containing $f(E)$ then F can be chosen so that $F(\bar{U})$ also lies in $\Omega$. If $\Omega$ is a disc, centered at 0, this gives $\|F\|_U = \|f\|_E$, so F is a norm-preserving extension.

What our proofs had in common is that we both used part of the construction that was used in the original proof of the F. and M. Riesz theorem (which says that if a measure $\mu$ on $T$ gives $\int f \, d\mu = 0$ for every f in $A(U)$ then $\mu$ is absolutely continuous with respect to Lebesgue measure). Carleson showed, in fact, that F. and M. Riesz can be derived quite easily from the interpolation theorem. I tried to prove the implication in the other direction. But that had to wait for Errett Bishop. In Proc. AMS 13, 1962, 140-143, he established this implication in a very general setting which had nothing to do with holomorphic functions or even with algebras, and which, combined with a refinement due to Glicksberg (Trans. AMS 105, 1962, 415-435), makes the interpolation theorem even more precise:

THEOREM: One can choose F in $A(U)$ so that $F(\zeta) = f(\zeta)$ at every $\zeta$ in E, and $|F(z)| < \|f\|_E$ at every z in $\bar{U} \setminus E$.

This is usually called peak-interpolation.

Several-variable analogues of this and related results may be found in Chap. 6 of my Function Theory in Polydiscs and in Chap. 10 of my Function Theory in the Unit Ball of $\mathbb{C}^n$.

The last item in this chapter concerns Riemann surfaces. Some definitions are needed.

A finite Riemann surface is a connected open proper subset R of some compact Riemann surface X, such that the boundary $\partial R$ of R in X is also the boundary of its closure $\bar{R}$ and is the union of finitely many disjoint simple closed analytic curves $\Gamma_1, \ldots, \Gamma_k$. Shrinking each $\Gamma_i$ to a point gives a compact orientable manifold whose genus g is defined to be the genus of R. The numbers g and k determine the topology of R, but not, of course, its conformal structure.

$A(R)$ denotes the algebra of all continuous functions on $\bar{R}$ that are holomorphic in R. If f is in $A(R)$ and $|f(p)| = 1$ at every point p in $\partial R$ then, just as in U, f is called inner. A set $S \subset A(R)$ is unramified if every point of $\bar{R}$ has a neighbourhood in which at least one member of S is one-to-one.

I became interested in these algebras when Lee Stout (Math. Z., 92, 1966, 366-379; also 95, 1967, 403-404) showed that every $A(R)$ contains an unramified triple of inner functions that separates points on $\bar{R}$. He deduced from the resulting embedding of R in $U^3$ that $A(R)$ is generated by these 3 functions. Whether every $A(R)$ is generated by some pair of its members is still unknown, but the main result of my paper in Trans. AMS 150, 1969, 423-434 shows that pairs of inner functions won’t always do:

THEOREM: If $A(R)$ contains a point-separating unramified pair f, g of inner functions, then there exist relatively prime integers s and t such that f is s-to-1 and g is t-to-1 on every $\Gamma_i$, and

(*) $(ks - 1)(kt - 1) = 2g + k - 1$.

For example, when $g = 2$ and $k = 4$, then (*) holds for no integers s and t. When $g = 23$ and $k = 4$, then $s = t = 2$ is the only pair that satisfies (*) but it is not relatively prime. Even when $R = U$ the theorem gives some information. In that case, $g = 0$ and $k = 1$, so (*) becomes $(s - 1)(t - 1) = 0$, which means:

If a pair of finite Blaschke products separates points on $\bar{U}$ and their derivatives have no common zero in U, then at least one of them is one-to-one (that is, a Möbius transformation).
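The arithmetic behind such examples is easy to check by machine. The sketch below assumes that condition (*) reads $(ks - 1)(kt - 1) = 2g + k - 1$, with g the genus and k the number of boundary curves; that reading is an assumption about a formula that did not survive in this copy. The function names and the search bound are illustrative choices, not part of the original paper.

```python
from math import gcd


def star_solutions(g: int, k: int, bound: int = 200):
    """Pairs (s, t) with 1 <= s <= t <= bound satisfying the assumed form of (*):
    (k*s - 1) * (k*t - 1) == 2*g + k - 1."""
    target = 2 * g + k - 1
    return [
        (s, t)
        for s in range(1, bound + 1)
        for t in range(s, bound + 1)
        if (k * s - 1) * (k * t - 1) == target
    ]


def coprime_solutions(g: int, k: int, bound: int = 200):
    """Solutions of (*) in which s and t are relatively prime, as the theorem requires."""
    return [(s, t) for (s, t) in star_solutions(g, k, bound) if gcd(s, t) == 1]


print(star_solutions(2, 4))      # genus 2, four boundary curves
print(star_solutions(23, 4))     # genus 23, four boundary curves
print(coprime_solutions(23, 4))
print(coprime_solutions(1, 1))   # genus 1, one boundary curve
```

Under this reading, the case of the unit disc (g = 0, k = 1) reduces to $(s-1)(t-1) = 0$, so every solution has s = 1 or t = 1.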

There are two cases in which (*) is not only necessary but also sufficient. This happens when $g = 0$ and when $g = k = 1$.

But there are examples in which the topological condition (*) is satisfied even though the conformal structure of R prevents the existence of a separating unramified pair of inner functions.

This paper is quite different from anything else that I have ever done. As far as I know, no one has ever referred to it, but I had fun working on it.

As I mentioned earlier, my thesis (Trans. AMS 68, 1950, 278-363) deals with uniqueness questions for series of spherical harmonics, also known as Laplace series. In the more familiar setting of trigonometric series, the first theorem of the kind that I was looking for was proved by Georg Cantor in 1870, based on earlier work of Riemann (1854, published in 1867). Using the notations

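$A_0 = \frac{1}{2} a_0$, $A_n(x) = a_n \cos nx + b_n \sin nx$ for $n \geq 1$,

$s_p(x) = A_0 + A_1(x) + \cdots + A_p(x)$, where $a_n$ and $b_n$ are real numbers, Cantor’s theorem says:

If $\lim_{p \to \infty} s_p(x) = 0$ at every real x, then $a_n = b_n = 0$ for every n.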

Therefore, two distinct trigonometric series cannot converge to the same sum. This is what is meant by uniqueness.

My aim was to prove this for spherical harmonics and (as had been done for trigonometric series) to whittle away at the hypothesis. Instead of assuming convergence at every point of the sphere, what sort of summability will do? Does one really need convergence (or summability) at every point? If not, what sort of sets can be omitted? Must anything else be assumed at these omitted points? What sort of side conditions, if any, are relevant?

I came up with reasonable answers to these questions, but basically the whole point seemed to be the justification of the interchange of some limit processes. This left me with an uneasy feeling that there ought to be more to Analysis than that. I wanted to do something with more “structure”. I could not have explained just what I meant by this, but I found it later when I became aware of the close relation between Fourier analysis and group theory, and also in an occasional encounter with number theory and with geometric aspects of several complex variables.

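Why was it all an exercise in interchange of limits? Because the “obvious” proof of Cantor’s theorem goes like this: for $p > n$,

$\pi a_n = \int_{-\pi}^{\pi} s_p(x) \cos nx \, dx = \lim_{p \to \infty} \int_{-\pi}^{\pi} s_p(x) \cos nx \, dx$, which, in turn, equals

$\int_{-\pi}^{\pi} \left( \lim_{p \to \infty} s_p(x) \right) \cos nx \, dx = 0$,

and similarly for $b_n$. Note that $\lim \int = \int \lim$ was used.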

In Riemann’s above-mentioned paper, he derives the conclusion of Cantor’s theorem under an additional hypothesis, namely, $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. He associates to $\sum A_n(x)$ the twice integrated series

$F(x) = -\sum_{n=1}^{\infty} n^{-2} A_n(x)$

and then finds it necessary to prove, in some detail, that this series converges and that its sum F is continuous! (Weierstrass had not yet invented uniform convergence.) This is astonishingly different from most of his other publications, such as his paper on hypergeometric functions in which mind-boggling relations and transformations are merely stated, with only a few hints, or his painfully brief paper on the zeta-function.

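Cantor later showed that Riemann’s additional hypothesis is redundant, by proving that (*) $A_n(x) \to 0$ at every x implies $a_n \to 0$ and $b_n \to 0$. He included the statement: This cannot be proved, as is commonly believed, by term-by-term integration.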

Apparently, it took a while before this was generally understood. Ten years later, in Math. Annalen 16, 1880, 113-114, he patiently explains the difference between pointwise convergence and uniform convergence, in order to refute a “simpler proof” published by Appell. But then, referring to his second (still quite complicated) proof, the one in Math. Annalen 4, 1871, 139-143, he sticks his neck out and writes: “In my opinion, no further simplification can be achieved, given the nature of this subject.”

That was a bit reckless. 25 years later, Lebesgue’s dominated convergence theorem became part of every analyst’s tool chest, and since then (*) can be proved in a few lines:

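Rewrite $A_n(x)$ in the form $A_n(x) = c_n \cos(nx + \alpha_n)$, where $c_n^2 = a_n^2 + b_n^2$. Put

$\gamma_n = \min\{1, |c_n|\}$, $B_n(x) = \gamma_n \cos(nx + \alpha_n)$.

Then $B_n^2(x) \leq 1$ and $B_n^2(x) \to 0$ at every x, so that the D. C. Th., combined with

$\int_{-\pi}^{\pi} B_n^2(x) \, dx = \pi \gamma_n^2$,

shows that $\gamma_n \to 0$. Therefore $|c_n| = \gamma_n$ for all large n, and $c_n \to 0$. Done.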
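The integral identity that drives this kind of argument is that $\int_{-\pi}^{\pi} \cos^2(nx + \alpha) \, dx = \pi$ for every integer $n \geq 1$ and every phase $\alpha$. The following is a small numerical sanity check in Python, a sketch only: the function name `integral_cos_squared`, the midpoint rule, and the step count are all choices made for this illustration.

```python
import math


def integral_cos_squared(n: int, alpha: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of cos^2(n*x + alpha) over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += math.cos(n * x + alpha) ** 2
    return total * h


# The value should be close to pi regardless of n and alpha.
for n in (1, 2, 5, 40):
    print(n, integral_cos_squared(n, alpha=0.7), math.pi)
```

Because the integral does not depend on n, the integrals of $B_n^2$ can tend to 0 only if the amplitudes $\gamma_n$ do, which is the step where dominated convergence is used.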

The point of all this is that my attitude was probably wrong. Interchanging limit processes occupied some of the best mathematicians for most of the 19th century. Thomas Hawkins’ book “Lebesgue’s Theory” gives an excellent description of the difficulties that they had to overcome. Perhaps, we should not be too surprised that even a hundred years later many students are baffled by uniform convergence, uniform continuity etc., and that some never get it at all.

In Trans. AMS 70, 1951, 387-403, I applied the techniques of my thesis to another problem of this type, with Hermite functions in place of spherical harmonics.

(Note: The above article has been taken from Walter Rudin’s book, “The Way I Remember It”. I hope it helps advanced graduate students in Analysis.)

When we come to multiplication, it is most convenient to confine ourselves to positive numbers (among which we may include zero) in the first instance, and to go back for a moment to the sections of positive rational numbers only which we considered in articles 4-7. We may then follow practically the same road as in the case of addition, taking (c) to be (ab) and (C) to be (AB). The argument is the same, except when we are proving that all rational numbers with at most one exception must belong to (c) or (C). This depends, as in the case of addition, on showing that we can choose a, A, b, and B so that C-c is as small as we please. Here we use the identity

$AB - ab = (A - a)B + a(B - b)$.
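The identity used for products, $AB - ab = (A - a)B + a(B - b)$, can be checked with exact rational arithmetic. The sketch below is illustrative only: the function name `product_gap` and the particular bounds for $\sqrt{2}$ and $\sqrt{3}$ are assumptions chosen for the example.

```python
from fractions import Fraction


def product_gap(a, A, b, B):
    """Return AB - ab, computed through the identity (A - a)B + a(B - b)."""
    return (A - a) * B + a * (B - b)


# Rational bounds a < sqrt(2) < A and b < sqrt(3) < B, chosen for illustration.
a, A = Fraction(141, 100), Fraction(142, 100)
b, B = Fraction(173, 100), Fraction(174, 100)

gap = product_gap(a, A, b, B)
assert gap == A * B - a * b   # the identity holds exactly for rationals
print(gap, float(gap))
```

Both terms on the right-hand side carry a factor that can be made small ($A - a$ or $B - b$), while the other factor stays bounded, which is why C - c can be made as small as we please.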

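Finally, we include negative numbers within the scope of our definition by agreeing that, if $\alpha$ and $\beta$ are positive, then

$(-\alpha)\beta = -\alpha\beta$, $\alpha(-\beta) = -\alpha\beta$, $(-\alpha)(-\beta) = \alpha\beta$.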

(iv) Division.

In order to define division, we begin by defining the reciprocal $1/\alpha$ of a number $\alpha$ (other than zero). Confining ourselves in the first instance to positive numbers and sections of positive rational numbers, we define the reciprocal of a positive number $\alpha$ by means of the lower class $(1/A)$ and the upper class $(1/a)$. We then define the reciprocal of a negative number $-\alpha$ by the equation $1/(-\alpha) = -(1/\alpha)$. Finally, we define $\alpha/\beta$ by the equation

$\alpha/\beta = \alpha \times (1/\beta)$.

We are then in a position to apply to all real numbers, rational or irrational, the whole of the ideas and methods of elementary algebra. Naturally, we do not propose to carry out this task in detail. It will be more profitable and more interesting to turn our attention to some special, but particularly important, classes of irrational numbers.

Below are the views of the master expositor of mathematics, Paul Halmos:

Some graduate students nowadays object to being made to learn to read two languages as a Ph.D. requirement. “Why should we learn about flowers and families and genitives and past participles? All we want is to read last month’s Paris seminar report.” Some go further: “Who needs German? For me Fortran (C/C++) is much more relevant.”

Horrors! I am upset and I predict that the result of such anti-linguistic, anti-cultural, anti-intellectual attitudes will lead to a deterioration of international scientific information exchange, and to a lot of bad writing. Every little bit I ever learned about any language was later of help to me as a writer. That is true of the Danish and Portuguese and Russian and Romanian that I learned for specific mathematical reasons, but it is also true of the hint or two of Greek and of Sanskrit that I managed to be exposed to. I have always rued that I was never taught Greek; every ounce of it would have paid off with a pound of linguistic insight. In the course of the years I managed to pick up quite a few Greek root words; my source of them was my shelf of English dictionaries, especially the American Heritage and the second edition of Webster. I feel that I need to look up the etymologies of words before I can use them precisely, and I know (a small matter, but here is where it belongs) that the reason I have no trouble spelling in English is that even a nodding familiarity with other languages makes me aware of where most of the difficult words come from.

To give the devil his due, I admit that substituting FORTRAN for German is only 90% bad, not 100. What it loses in the understanding of culture and mastering the art of communication, it gains in meticulous attention to detail and moving closer to mastering the science of communication. A knowledge of the theory and practice of formal languages might be a help for writing with precision, especially to students whose talents are not mathematical but it is of no help at all for writing with clarity. The distinction is sometimes ignored or even argued away, but that is a sad error — there is all the difference in the world between an exposition that cannot be misunderstood and one that is in fact understood.

(From: I want to be a mathematician: An Automathography: Paul R. Halmos).

We now proceed to define the meaning of the elementary algebraic operations such as addition, as applied to real numbers in general.

(i) Addition. In order to define the sum of two numbers $\alpha$ and $\beta$, we consider the following two classes: (i) the class (c) formed by all sums $c = a + b$, (ii) the class (C) formed by all sums $C = A + B$. Clearly, $c < C$ in all cases.

Again, there cannot be more than one rational number which does not belong either to (c) or to (C). For suppose there were two, say r and s, and let s be the greater. Then, both r and s must be greater than every c and less than every C; and so $C - c$ cannot be less than $s - r$. But,

$C - c = (A - a) + (B - b)$;

and we can choose a, b, A, B so that both $A - a$ and $B - b$ are as small as we like; and this plainly contradicts our hypothesis.

If every rational number belongs to (c) or to (C), the classes (c), (C) form a section of the rational numbers, that is to say, a number $\gamma$. If there is one which does not, we add it to (C). We have now a section or real number $\gamma$, which must clearly be rational, since it corresponds to the least member of (C). In any case we call $\gamma$ the sum of $\alpha$ and $\beta$, and write

$\gamma = \alpha + \beta$.

If both $\alpha$ and $\beta$ are rational, they are the least members of the upper classes (A) and (B). In this case it is clear that $\alpha + \beta$ is the least member of (C), so that our definition agrees with our previous ideas of addition.
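As an illustration of the construction, the sum $\sqrt{2} + \sqrt{3}$ can be trapped between rational sums $c = a + b$ and $C = A + B$ whose difference shrinks as the approximations improve. This is a sketch under stated assumptions: the helper names `sqrt_bounds` and `sum_bounds`, and the use of decimal approximations, are choices made here for the example.

```python
from fractions import Fraction
from math import isqrt


def sqrt_bounds(n: int, digits: int):
    """Rationals (a, A) with a^2 < n < A^2 and A - a = 10**-digits.

    Assumes n is a positive integer that is not a perfect square."""
    scale = 10 ** digits
    m = isqrt(n * scale * scale)   # floor of sqrt(n) * scale
    return Fraction(m, scale), Fraction(m + 1, scale)


def sum_bounds(digits: int):
    """Lower and upper rational sums c = a + b and C = A + B for sqrt(2) + sqrt(3)."""
    a, A = sqrt_bounds(2, digits)
    b, B = sqrt_bounds(3, digits)
    return a + b, A + B


for digits in (1, 2, 4, 8):
    c, C = sum_bounds(digits)
    # C - c = (A - a) + (B - b) = 2 / 10**digits
    print(digits, float(c), float(C), C - c)
```

The printed difference C - c is exactly $(A - a) + (B - b)$, so it can be made smaller than any prescribed positive rational.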

1) If r and s are rational numbers, then $r + s$, $r - s$, $rs$, and $r/s$ are rational numbers, unless in the last case $s = 0$ (when $r/s$ is of course meaningless).

Proof:

Part i): Given r and s are rational numbers. Let $r = a/b$, $s = c/d$, where a, b, c and d are integers, and b and d are not zero; where a and b do not have any common factors, where c and d do not have any common factors, and b and d are positive integers.

Then, $r + s = \frac{ad + bc}{bd}$, which is clearly rational as both the numerator and the denominator are integers (closure under addition and multiplication) and $bd \neq 0$.

Part ii) Similar to part (i).

Part iii) By closure in multiplication.

Part iv) By definition of division in fractions, and closure in multiplication.
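The closure properties above can be observed with Python's standard `fractions.Fraction` type, which stores a rational number as a reduced integer numerator and a positive denominator. The helper names `rational_ops` and `add_by_formula` are hypothetical and exist only for this sketch.

```python
from fractions import Fraction


def rational_ops(r: Fraction, s: Fraction) -> dict:
    """Sum, difference, product and (when s != 0) quotient of two rationals."""
    results = {"sum": r + s, "difference": r - s, "product": r * s}
    if s != 0:
        results["quotient"] = r / s   # r/s is meaningless when s = 0
    return results


def add_by_formula(a: int, b: int, c: int, d: int) -> Fraction:
    """Compute a/b + c/d through the formula (ad + bc) / (bd) from Part i)."""
    return Fraction(a * d + b * c, b * d)


r, s = Fraction(1, 2), Fraction(1, 3)
for name, value in rational_ops(r, s).items():
    print(name, value)

# The formula from Part i) agrees with the built-in addition.
assert add_by_formula(1, 2, 1, 3) == r + s
```

Every result is again a `Fraction`, that is, a quotient of two integers with non-zero denominator.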

2) If $\lambda$, m, n are positive rational numbers, and $m > n$, then prove that $\lambda(m^2 - n^2)$, $2\lambda mn$, $\lambda(m^2 + n^2)$ are positive rational numbers. Hence, show how to determine any number of right-angled triangles the lengths of all of whose sides are rational.

Proof:

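This follows from problem 1, where we proved that sums, differences and products of rational numbers are rational; positivity holds because $\lambda$, m, n are positive and $m > n$.

Also, Pythagoras’ theorem holds in the following manner:

$[\lambda(m^2 - n^2)]^2 + (2\lambda mn)^2 = [\lambda(m^2 + n^2)]^2$.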
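The identity $[\lambda(m^2 - n^2)]^2 + (2\lambda mn)^2 = [\lambda(m^2 + n^2)]^2$ gives a recipe for producing right-angled triangles with rational sides. Here is a minimal Python sketch using exact rational arithmetic; the function name `rational_triangle` is hypothetical.

```python
from fractions import Fraction


def rational_triangle(lam, m, n):
    """Sides (x, y, z) of a right-angled triangle with rational sides.

    lam, m, n are positive rationals with m > n; x and y are the legs,
    z is the hypotenuse."""
    lam, m, n = Fraction(lam), Fraction(m), Fraction(n)
    if not (lam > 0 and m > n > 0):
        raise ValueError("need lam > 0 and m > n > 0")
    x = lam * (m * m - n * n)
    y = 2 * lam * m * n
    z = lam * (m * m + n * n)
    return x, y, z


for args in [(1, 2, 1), (1, 3, 2), (Fraction(1, 2), Fraction(3, 2), 1)]:
    x, y, z = rational_triangle(*args)
    assert x * x + y * y == z * z   # Pythagoras holds exactly
    print(x, y, z)
```

Varying $\lambda$, m and n over the positive rationals with $m > n$ yields as many such triangles as desired.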

3) Any terminated decimal represents a rational number whose denominator contains no factors other than 2 or 5. Conversely, any such rational number can be expressed, and in one way only, as a terminated decimal.

Proof Part 1:

A terminated decimal with k digits after the decimal point equals $N/10^k$ for some integer N. Since $10^k = 2^k 5^k$, the denominator of this fraction, even after reduction to lowest terms, contains no prime factors other than 2 or 5.

Proof Part 2:

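Conversely, let the denominator be $2^a 5^b$ and put $k = \max(a, b)$. Multiplying numerator and denominator by $2^{k-a} 5^{k-b}$ turns the denominator into $10^k$, which gives a terminated decimal. The expression is unique (apart from trailing zeros) since the process of division produces a unique quotient.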
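The criterion in problem 3 translates directly into a test on the reduced denominator. The following Python sketch strips all factors of 2 and 5 from the denominator and checks whether anything is left; the function names are hypothetical and chosen for this illustration.

```python
from fractions import Fraction


def has_terminating_decimal(q: Fraction) -> bool:
    """True when the reduced denominator of q has no prime factors other than 2 and 5."""
    d = q.denominator   # Fraction keeps q in lowest terms with d > 0
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1


def decimal_digits_needed(q: Fraction):
    """Smallest k such that q * 10**k is an integer, or None if no such k exists."""
    if not has_terminating_decimal(q):
        return None
    k = 0
    while (q * 10 ** k).denominator != 1:
        k += 1
    return k


for q in (Fraction(3, 8), Fraction(7, 20), Fraction(1, 6), Fraction(5, 1)):
    print(q, has_terminating_decimal(q), decimal_digits_needed(q))
```

For example, 3/8 has denominator $2^3$ and needs three decimal places, while 1/6 contains the factor 3 and never terminates.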

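4) The positive rational numbers may be arranged in the form of a simple series as follows:

$\frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{2}{2}, \frac{1}{3}, \frac{4}{1}, \frac{3}{2}, \frac{2}{3}, \frac{1}{4}, \ldots$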
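One standard arrangement, assumed here because the series itself may not be visible in this copy, lists the fractions p/q by increasing value of p + q and, within each group, by increasing q. The Python sketch below generates that series and computes the position of p/q in it; the names `diagonal_series` and `position` are hypothetical. Note that unreduced fractions such as 2/2 appear in the list, so each rational number occurs more than once.

```python
from itertools import islice


def diagonal_series():
    """Yield pairs (p, q) standing for p/q, grouped by p + q = 2, 3, 4, ..."""
    total = 2
    while True:
        for q in range(1, total):
            yield (total - q, q)
        total += 1


def position(p: int, q: int) -> int:
    """1-based position of p/q in the series above."""
    return (p + q - 1) * (p + q - 2) // 2 + q


first_terms = list(islice(diagonal_series(), 10))
print(first_terms)

# The closed formula agrees with the order in which terms are generated.
for index, (p, q) in enumerate(islice(diagonal_series(), 200), start=1):
    assert position(p, q) == index
```

Since every pair (p, q) of positive integers is reached after finitely many steps, every positive rational number appears somewhere in the series.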

The following is a recollection of John Nash’s seminal contribution to Geometry. It includes some descriptions of his interactions with other mathematicians. I have picked it up from his famous biography “A Beautiful Mind” by Sylvia Nasar.

There are two kinds of mathematical contributions: work that is important to the history of mathematics and work that is simply a triumph of the human spirit. — Paul J. Cohen, 1996.

In the spring of 1953, Paul Halmos, a mathematician at the University of Chicago, received the following letter from his old friend Warren Ambrose, a colleague of Nash’s:

There’s no significant news from here, as always. Martin is appointing John Nash to an Assistant Professorship (not the Nash at Illinois, the one out of Princeton by Steenrod) and I am pretty annoyed at that. Nash is a childish bright guy who wants to be “basically original,” which I suppose is fine for those who have some basic originality in them. He also makes a damned fool of himself in various ways contrary to this philosophy. He recently heard of the unsolved problem about imbedding a Riemannian manifold isometrically in Euclidean space, felt that this was his sort of thing, provided the problem were sufficiently worthwhile to justify his efforts; so he proceeded to write to everyone in the math society to check on that, was told that it probably was, and proceeded to announce that he had solved it, modulo details, and told Mackey he would like to talk about it at the Harvard colloquium. Meanwhile, he went to Levinson to inquire about a differential equation that intervened and Levinson says it is a system of partial differential equations and if he could only get to the essentially simpler analog of a single ordinary differential equation it would be a damned good paper, and Nash had only the vaguest notions about the whole thing. So it is generally conceded he is getting nowhere and making an even bigger ass of himself than he has previously been supposed by those with less insight than myself. But we have got him and saved ourselves the possibility of having gotten a real mathematician. He’s a bright guy but conceited as Hell, childish as Wiener, hasty as X, obstreperous as Y, for arbitrary X and Y.

Ambrose had every reason to be both skeptical and annoyed.

Ambrose was a moody, intense, somewhat frustrated mathematician in his late thirties, full, as his letter indicates, of dark humour. He was a radical and nonconformist. He married three times. He gave a lecture on “Why I am an atheist.” He once tried to defend some left-wing demonstrators against police in Argentina, and got himself beaten up and jailed for his efforts. He was also a jazz fanatic, a personal friend of Charlie Parker, and a fine trumpet player. Handsome, solidly built, with a boxer’s broken nose (the consequence of an accident in an elevator), he was one of the most popular members of the department. He and Nash clashed from the start.

Ambrose’s manner was calculated to give an impression of stupidity. “I am a simple man, I can’t understand this,” Robert Aumann recalled. “Ambrose came to class one day with one shoelace tied and the other untied. ‘Did you know your right shoelace is untied?’ we asked. ‘Oh, my God,’ he said, ‘I tied the left one and thought that the other must be tied by considerations of symmetry.’”

The older faculty in the department mostly ignored Nash’s putdowns and jibes. Ambrose did not. Soon a tit-for-tat rivalry was under way. Ambrose was famous, among other things, for detail. His blackboard notes were so dense that rather than attempt the impossible task of copying them, one of his assistants used to photograph them. Nash, who disliked laborious, step-by-step expositions, found much to mock. When Ambrose wrote what Nash considered an ugly argument on the blackboard during a seminar, Nash would mutter, “Hack, Hack” from the back of the room.

Nash made Ambrose the target of several pranks. “Seminar on the REAL mathematics!” read a sign that Nash posted one day. “The seminar will meet weekly Thursdays at 2PM in the Common Room.” Thursday at 2PM was the hour that Ambrose taught his graduate course in analysis. On another occasion, after Ambrose delivered a lecture at the Harvard mathematics colloquium, Nash arranged to have a large bouquet of roses delivered to the podium as if Ambrose were a ballerina taking her bows.

Ambrose needled back. He wrote “F*** Myself” on the To Do list that Nash kept hanging over his desk on a clipboard. It was he who nicknamed Nash “Gnash” for constantly making belittling remarks about other mathematicians. And, during a discussion in the common room, after one of Nash’s diatribes about hacks and drones, Ambrose said disgustedly, “If you are so good, why don’t you solve the embedding problem for manifolds?” — a notoriously difficult problem that had been around since it was posed by Riemann.

So Nash did.

Two years later at the University of Chicago, Nash began a lecture describing his first really big theorem by saying, “I did this because of a bet.” Nash’s opening statement spoke volumes about who he was. He was a mathematician who viewed mathematics not as a grand scheme, but as a collection of challenging problems. In the taxonomy of mathematicians, there are problem solvers and theoreticians, and by temperament, Nash belonged to the first group. He was not a game theorist, analyst, algebraist, geometer, topologist, or mathematical physicist. But he zeroed in on areas in these fields where essentially nobody had achieved anything. The thing was to find an interesting question that he could say something about.

Before taking on Ambrose’s challenge, Nash wanted to be certain that solving the problem would cover him with glory. He not only quizzed various experts on the problem’s importance, but according to Felix Browder, another Moore Instructor, claimed to have proved the result long before he actually had. When a mathematician at Harvard confronted Nash, recalled Browder: “Nash explained that he wanted to find out whether it was worth working on.”

“The discussion of manifolds was everywhere,” said Joseph Kohn in 1995, gesturing to the air around him. “The precise question that Ambrose asked Nash in the common room one day was the following: Is it possible to embed any Riemannian manifold in a Euclidean space?”

It’s a “deep philosophical question” concerning the foundations of geometry that virtually every mathematician working in the field of differential geometry for the past century, from Riemann and Hilbert to Élie-Joseph Cartan and Hermann Weyl, had asked himself. The question, first posed explicitly by Ludwig Schläfli in the 1870s, had evolved naturally from a progression of other questions that had been posed and partly answered beginning in the mid-nineteenth century. First mathematicians studied ordinary curves, then surfaces, and finally, thanks to Riemann, a sickly German genius and one of the great figures of nineteenth-century mathematics, geometric objects in higher dimensions. Riemann discovered examples of manifolds inside Euclidean spaces. But in the early 1950s interest shifted to manifolds, partly because of the large role that distorted space and time relationships had in Einstein’s theory of relativity.

Nash’s own description of the embedding problem in his 1995 Nobel autobiography hints at the reason he wished to make sure that solving the problem would be worth the effort: “This problem, although classical, was not much talked about as an outstanding problem. It was not like, for example, the four-colour conjecture.”

Embedding involves portraying a geometric object as (or, a bit more precisely, making it a subset of) some space in some dimension. Take the surface of a balloon. You can’t put it on a blackboard, which is a two-dimensional space. But you can make it a subset of spaces of three or more dimensions. Now take a slightly more complicated object, say a Klein bottle. A Klein bottle looks like a tin can whose lid and bottom have been removed and whose top has been stretched around and reconnected through the side to the bottom. If you think about it, it’s obvious that if you try that in three-dimensional space, the thing intersects itself. That’s bad from a mathematical point of view because the neighbourhood in the immediate vicinity of the intersection looks weird and irregular, and attempts to calculate various attributes like distance or rates of change in that part of the object tend to blow up. But put the same Klein bottle into a space of four dimensions and the thing no longer intersects itself. Like a ball embedded in three-space, a Klein bottle in four-space becomes a perfectly well-behaved manifold.

Nash’s theorem stated that any kind of surface that embodied a special notion of smoothness can actually be embedded in Euclidean space. He showed that you could fold the manifold like a silk handkerchief without distorting it. Nobody would have expected Nash’s theorem to be true. In fact, everyone would have expected it to be false. “It showed incredible originality,” said Mikhail Gromov, the geometer whose book Partial Differential Relations builds on Nash’s work. He went on:

“Many of us have the power to develop existing ideas. We follow paths prepared by others. But most of us could never produce anything comparable to what Nash produced. It’s like lightning striking. Psychologically the barrier he broke is absolutely fantastic. He has completely changed the perspective of partial differential equations. There has been some tendency in recent decades to move from harmony to chaos. Nash says chaos is just round the corner.”

John Conway, the Princeton mathematician who discovered surreal numbers and invented the game of Life, called Nash’s result “one of the most important pieces of mathematical analysis in this century.”

It was also, one must add, a deliberate jab at then-fashionable approaches to Riemannian manifolds, just as Nash’s approach to the theory of games was a direct challenge to von Neumann’s. Ambrose, for example, was himself involved in a highly abstract and conceptual description of such manifolds at the time. As Jurgen Moser, a young German mathematician who came to know Nash well in the mid-1950s, put it, “Nash didn’t like that style of mathematics at all. He was out to show that this, to his mind, exotic approach was completely unnecessary since any such manifold was simply a submanifold of a high dimensional Euclidean space.”

Nash’s more important achievement may have been the powerful technique he invented to obtain his result. In order to prove his theorem, Nash had to confront a seemingly insurmountable obstacle: solving a certain set of partial differential equations that were impossible to solve with existing methods.

That obstacle cropped up in many mathematical and physical problems. It was the difficulty that Levinson, according to Ambrose’s letter, pointed out to Nash, and it is a difficulty that crops up in many, many problems, in particular nonlinear problems. Typically, in solving an equation, the thing that is given is some function, and one finds estimates of derivatives of a solution in terms of derivatives of the given function. Nash’s solution was remarkable in that the a priori estimates lost derivatives. Nobody knew how to deal with such equations. Nash invented a novel iterative method (a procedure for making a series of educated guesses) for finding roots of equations, and combined it with a technique for smoothing to counteract the loss of derivatives.

Newman described Nash as a “very poetic, different kind of thinker.” In this instance, Nash used differential calculus, not geometric pictures or algebraic manipulations, methods that were classical outgrowths of nineteenth-century calculus. The technique is now referred to as the Nash-Moser theorem, although there is no dispute that Nash was its originator. Jurgen Moser was to show how Nash’s technique could be modified and applied to celestial mechanics (the movement of planets), especially for establishing the stability of periodic orbits.

Nash solved the problem in two steps. He discovered that one could embed a Riemannian manifold in a three-dimensional space if one ignored smoothness. One had, so to speak, to crumple it up. It was a remarkable result, a strange and interesting result, but a mathematical curiosity, or so it seemed. Mathematicians were interested in embedding without wrinkles, embedding in which the smoothness of the manifold could be preserved.

In his autobiographical essay, Nash wrote:

“So, as it happened, as soon as I heard in conversation at MIT about the question of embeddability being open I began to study it. The first break led to a curious result about the embeddability being realizable in surprisingly low-dimensional ambient spaces provided that one would accept that the embedding would have only limited smoothness. And later, with “heavy analysis”, the problem was solved in terms of embedding with a more proper degree of smoothness.”

Nash presented his initial “curious” result at a seminar in Princeton, most likely in the spring of 1953, at around the same time that Ambrose wrote his scathing letter to Halmos. Emil Artin was in the audience. He made no secret of his doubts.

“Well, that’s all well and good, but what about the embedding theorem?” said Artin. “You’ll never get it.”

“I’ll get it next week,” Nash shot back.

One night, possibly en route to this very talk, Nash was hurtling down the Merritt Parkway. Poldy Flatto was riding with him as far as the Bronx. Flatto, like all the other graduate students, knew that Nash was working on the embedding problem. Most likely to get Nash’s goat and have the pleasure of watching his reaction, he mentioned that Jacob Schwartz, a brilliant young mathematician at Yale whom Nash knew slightly, was also working on the problem.

Nash became quite agitated. He gripped the steering wheel and almost shouted at Flatto, asking whether he had meant to say that Schwartz had solved the problem. “I didn’t say that,” Flatto corrected. “I said I heard he was working on it.”

“Working on it?” Nash replied, his whole body now the picture of relaxation. “Well, then there’s nothing to worry about. He doesn’t have the insights I have.”

Schwartz was indeed working on the same problem. Later, after Nash had produced his solution, Schwartz wrote a book on the subject of implicit-function theorems. He recalled in 1996:

“I got half the idea independently, but I couldn’t get the other half. It’s easy to see an approximate statement to the effect that not every surface can be exactly embedded, but that you can come arbitrarily close. I got that idea and I was able to produce the proof of the easy half in a day. But then I realized that there was a technical problem. I worked on it for a month and couldn’t see any way to make headway. I ran into an absolute stone wall. I didn’t know what to do. Nash worked on that problem for two years with a sort of ferocious, fantastic tenacity until he broke through it.”

Week after week, Nash would turn up in Levinson’s office, much as he had in Spencer’s, at Princeton. He would describe to Levinson what he had done and Levinson would show him why it didn’t work. Isadore Singer, a fellow Moore Instructor, recalled:

“He’d show the solutions to Levinson. The first few times he was dead wrong. But he didn’t give up. As he saw the problem get harder and harder, he applied himself more and more and more. He was motivated just to show everybody how good he was, sure, but on the other hand he didn’t give up even when the problem turned out to be much harder than expected. He put more and more of himself into it.”

There is no way of knowing what enables one man to crack a problem while another man, also brilliant, fails. Some geniuses have been sprinters who have solved problems quickly. Nash was a long-distance runner. If Nash defied von Neumann in his approach to the theory of games, he now took on the received wisdom of nearly a century. He went into a classical domain where everybody understood what was possible and what was not possible. “It took enormous courage to attack these problems,” said Paul Cohen, a mathematician at Stanford University and a Fields medalist. His tolerance for solitude, great confidence in his own intuition, and indifference to criticism (all detectable at a young age but now prominent and impermeable features of his personality) served him well. He was a hard worker by habit. He worked mostly at night in the MIT office, from ten in the evening until 3:00 AM, and on weekends as well, with, as one observer said, “no references, but his own mind and his supreme self-confidence.” Schwartz called it “the ability to continue punching the wall until the stone breaks.”

The most eloquent description of Nash’s single-minded attack on the problem comes from Moser:

“The difficulty that Levinson had pointed out, to anyone in his right mind, would have stopped them cold and caused them to abandon the problem. But Nash was different. If he had a hunch, conventional criticism didn’t stop him. He had no background knowledge. It was totally uncanny. Nobody could understand how somebody like that could do it. He was the only person I ever saw with that kind of power, just brute mental power.”

The editors of the Annals of Mathematics hardly knew what to make of Nash’s manuscript when it landed on their desks at the end of October 1954. It hardly had the look of a mathematics paper. It was as thick as a book, printed by hand rather than typed, and chaotic. It made use of concepts and terminology more familiar to engineers than to mathematicians. So they sent it to a mathematician at Brown University, Herbert Federer, an Austrian-born refugee from Nazism and a pioneer in surface area theory, who, although only thirty-four, already had a reputation for high standards, superb taste, and an unusual willingness to tackle difficult manuscripts.

Mathematics is often described, quite rightly, as the most solitary of endeavours. But when a serious mathematician announces that he has found the solution to an important problem, at least one other serious mathematician, and sometimes several, as a matter of longstanding tradition that goes back hundreds of years, will set aside his own work for weeks and months at a time, as one former collaborator of Federer put it, “to make a go of it, and to straighten everything out.” Nash’s manuscript presented Federer with a sensationally complicated puzzle and he attacked the task with relish.

The collaboration between the author and referee took months. A large correspondence, many telephone conversations, and numerous drafts ensued. Nash did not submit the revised version of the paper until nearly the end of the following summer. His acknowledgement to Federer was, by Nash’s standards, effusive: “I am profoundly indebted to H. Federer, to whom may be traced most of the improvement over the first chaotic formulation of this work.”

Armand Borel, who was a visiting professor at Chicago when Nash gave a lecture on his embedding theorem, remembers the audience’s shocked reaction. “Nobody believed his proof at first,” he recalled in 1995. “People were very skeptical. It looked like a beguiling idea. But when there’s no technique you are skeptical. You dream about a vision. Usually you are missing something. People did not challenge him publicly, but they talked privately.” (Characteristically, Nash’s report to his parents merely said, “talks went well.”)

Gian-Carlo Rota, professor of mathematics and philosophy at MIT, confirmed Borel’s account. “One of the great experts on the subject told me that if one of his graduate students had proposed such an outlandish idea he’d have thrown him out of his office.”

The result was so unexpected and Nash’s methods so novel that even the experts had tremendous difficulty understanding what he had done. Nash used to have drafts lying around the MIT common room. A former MIT graduate student recalls a long and confused discussion between Ambrose, Singer and Masatake Kuranishi (a mathematician at Columbia University who later applied Nash’s result), in which each one tried to explain Nash’s result to the others without much success.

Jack Schwartz recalled:

“Nash’s solution was not just novel, but very mysterious, a mysterious set of weird inequalities that all came together. In my explication of it I sort of looked at what happened and could generalize and give an abstract form and realize it was applicable to situations other than the specific one he treated. But I didn’t quite get to the bottom of it either.”

Later, Heinz Hopf, professor of mathematics in Zurich and a past president of the International Mathematical Union, “a great man with a small build, friendly, radiating a warm glow, who knew everything about differential geometry,” gave a talk on Nash’s embedding theorem in New York. Usually, Hopf’s lectures were models of crystalline clarity. Moser, who was in the audience, recalled: “So we thought NOW we will understand what Nash did. He was naturally skeptical. He would have been an important validator of Nash’s work. But, as the lecture went on, my God, Hopf was befuddled himself. He couldn’t convey a complete picture. He was completely overwhelmed.”

Several years later, Jurgen Moser tried to get Nash to explain how he had overcome the difficulties that Levinson had originally pointed out: “I did not learn so much from him. When he talked, he was vague, hand waving. ‘You have to control this. You have to watch out for that.’ You couldn’t follow him. But his written paper was complete and correct.” Federer not only edited Nash’s paper to make it more accessible, but also was the first to convince the mathematical community that Nash’s theorem was indeed correct.

Martin’s surprise proposal, in the early part of 1953, to offer Nash a permanent faculty position set off a storm of controversy among the eighteen-member mathematics faculty. Levinson and Wiener were among Nash’s strongest supporters. But, others like Warren Ambrose and George Whitehead, the distinguished topologist, were opposed. Moore Instructorships weren’t meant to lead to tenure-track positions. More to the point, Nash had plenty of enemies and few friends in his first year and a half. His disdainful manner towards his colleagues and his poor record as a teacher rubbed many the wrong way.

Mostly, however, Nash’s opponents were of the opinion that he hadn’t proved he could produce. Whitehead recalled, “He talked big. Some of us were not sure he could live up to his claims.” Ambrose, not surprisingly, felt similarly. Even Nash’s champions could not have been completely certain. Flatto remembered one occasion on which Nash came to Levinson’s office to ask Levinson whether he’d read a draft of his embedding paper. Levinson said, “To tell you the truth I don’t have enough background in this area to pass judgement.”

When Nash finally succeeded, Ambrose did what a fine mathematician and sterling human being would do. His applause was as loud as or louder than anyone else’s. The bantering became friendlier and among other things, Ambrose took to telling his musical friends that Nash’s whistling was the purest, most beautiful tone he had ever heard.