Amplification, arbitrage, and the tensor power trick

It occurred to me recently that the mathematical blog medium may be a good venue not just for expository “short stories” on mathematical concepts or results, but also for more technical discussions of individual mathematical “tricks”, which would otherwise not be significant enough to warrant a publication-length (and publication-quality) article. So I thought today that I would discuss the amplification trick in harmonic analysis and combinatorics (and in particular, in the study of estimates); this trick takes an established estimate involving an arbitrary object (such as a function f), and obtains a stronger (or amplified) estimate by transforming the object in a well-chosen manner (often involving some new parameters) into a new object, applying the estimate to that new object, and seeing what that estimate says about the original object (after optimising the parameters or taking a limit). The amplification trick works particularly well for estimates which enjoy some sort of symmetry on one side of the estimate that is not represented on the other side; indeed, it can be viewed as a way to “arbitrage” differing amounts of symmetry between the left- and right-hand sides of an estimate. It can also be used in the contrapositive, amplifying a weak counterexample to an estimate into a strong counterexample. This trick also sheds some light as to why dimensional analysis works; an estimate which is not dimensionally consistent can often be amplified into a stronger estimate which is dimensionally consistent; in many cases, this new estimate is so strong that it cannot in fact be true, and thus dimensionally inconsistent inequalities tend to be either false or inefficient, which is why we rarely see them. (More generally, any inequality on which a group acts on either the left or right-hand side can often be “decomposed” into the “isotypic components” of the group action, either by the amplification trick or by other related tools, such as Fourier analysis.)

The amplification trick is a deceptively simple one, but it can become particularly powerful when one is arbitraging an unintuitive symmetry, such as symmetry under tensor powers. Indeed, the “tensor power trick”, which can eliminate constants and even logarithms in an almost magical manner, can lead to some interesting proofs of sharp inequalities, which are difficult to establish by more direct means.

The most familiar example of the amplification trick in action is probably the textbook proof of the Cauchy-Schwarz inequality

$|\langle v, w \rangle| \leq \|v\| \|w\|$ (1)

for vectors v, w in a complex Hilbert space. To prove this inequality, one might start by exploiting the obvious inequality

$\|v - w\|^2 \geq 0$ (2)

but after expanding everything out, one only gets the weaker inequality

$\mathrm{Re} \langle v, w \rangle \leq \frac{1}{2} \|v\|^2 + \frac{1}{2} \|w\|^2$. (3)

Now (3) is weaker than (1) for two reasons; the left-hand side is smaller, and the right-hand side is larger (thanks to the arithmetic mean-geometric mean inequality). However, we can amplify (3) by arbitraging some symmetry imbalances. Firstly, observe that the phase rotation symmetry $v \mapsto e^{i\theta} v$ preserves the RHS of (3) but not the LHS. We exploit this by replacing v by $e^{i\theta} v$ in (3) for some phase $\theta$ to be chosen later, to obtain

$\mathrm{Re}\, e^{i\theta} \langle v, w \rangle \leq \frac{1}{2} \|v\|^2 + \frac{1}{2} \|w\|^2$.

Now we are free to choose $\theta$ at will (as long as it is real, of course), so it is natural to choose $\theta$ to optimise the inequality, which in this case means to make the left-hand side as large as possible. This is achieved by choosing $e^{i\theta}$ to cancel the phase of $\langle v, w \rangle$, and we obtain

$|\langle v, w \rangle| \leq \frac{1}{2} \|v\|^2 + \frac{1}{2} \|w\|^2$. (4)

This is closer to (1); we have fixed the left-hand side, but the right-hand side is still too weak. But we can amplify further, by exploiting an imbalance in a different symmetry, namely the homogenisation symmetry $(v, w) \mapsto (\lambda v, \frac{1}{\lambda} w)$ for a scalar $\lambda > 0$, which preserves the left-hand side but not the right. Inserting this transform into (4) we conclude that

$|\langle v, w \rangle| \leq \frac{\lambda^2}{2} \|v\|^2 + \frac{1}{2\lambda^2} \|w\|^2$

where $\lambda > 0$ is at our disposal to choose. We can optimise in $\lambda$ by minimising the right-hand side, and indeed one easily sees that the minimum (or infimum, if one of v and w vanishes) is $\|v\| \|w\|$ (which is achieved at $\lambda = \sqrt{\|w\|/\|v\|}$ when v, w are non-zero, or in an asymptotic limit $\lambda \to 0$ or $\lambda \to \infty$ in the degenerate cases), and so we have amplified our way to the Cauchy-Schwarz inequality (1). [See also this discussion by Tim Gowers on the Cauchy-Schwarz inequality.]
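The two amplification steps can be checked numerically. The following is a minimal sketch, assuming finite-dimensional complex vectors stored as Python lists; the helper names `inner`, `norm` and `amplify`, and the sample vectors, are illustrative choices and not part of the argument above. It picks the phase that cancels the phase of the inner product, then the scaling parameter that minimises the right-hand side, and compares the unamplified and amplified bounds.

```python
import cmath
import math


def inner(v, w):
    """Inner product <v, w>, linear in v and conjugate-linear in w."""
    return sum(a * b.conjugate() for a, b in zip(v, w))


def norm(v):
    """Hilbert-space norm induced by inner()."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))


def amplify(v, w):
    """Return (lhs, weak_rhs, amplified_rhs) for non-zero vectors v, w."""
    ip = inner(v, w)
    # Phase step: rotate v by e^{i*theta}, with theta cancelling the phase of <v, w>.
    theta = -cmath.phase(ip)
    lhs = (cmath.exp(1j * theta) * ip).real  # Re <e^{i theta} v, w> = |<v, w>|
    # Unamplified right-hand side: (|v|^2 + |w|^2) / 2.
    weak_rhs = 0.5 * norm(v) ** 2 + 0.5 * norm(w) ** 2
    # Homogeneity step: rescale (v, w) -> (lam * v, w / lam) with the minimising lam.
    lam = math.sqrt(norm(w) / norm(v))
    amplified_rhs = 0.5 * lam ** 2 * norm(v) ** 2 + 0.5 * norm(w) ** 2 / lam ** 2
    return lhs, weak_rhs, amplified_rhs


# Arbitrary sample vectors (assumed non-zero).
v = [1 + 2j, 3 - 1j, 0.5j]
w = [2j, -1 + 1j, 4]
lhs, weak_rhs, amplified_rhs = amplify(v, w)
print(lhs, amplified_rhs, weak_rhs)
```

The printed values should be in increasing order, with the middle one agreeing with the product of the two norms up to rounding.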

— Amplification via phase, homogeneity, or dilation symmetry —

Many similar examples of amplification are used routinely to prove the basic inequalities in harmonic analysis. For instance to deduce the complex-valued triangle inequality

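$\left| \int_X f\, d\mu \right| \leq \int_X |f|\, d\mu$ (5)

(where $(X,\mu)$ is a measure space and f is absolutely integrable) from its real-valued counterpart, we first apply the latter inequality to $\mathrm{Re}(f)$ to obtain

$\left| \mathrm{Re} \int_X f\, d\mu \right| \leq \int_X |\mathrm{Re}(f)|\, d\mu$.

To make the right-hand side phase-rotation-invariant, we crudely bound $|\mathrm{Re}(f)|$ by $|f|$, obtaining

$\left| \mathrm{Re} \int_X f\, d\mu \right| \leq \int_X |f|\, d\mu$

and then one can arbitrage the imbalance in phase rotation symmetry to obtain (5). For another well-known example, to prove Hölder’s inequality

$\int_X f g\, d\mu \leq \|f\|_{L^p(X,\mu)} \|g\|_{L^q(X,\mu)}$ (6)

for non-negative measurable f, g and dual exponents $1 \leq p, q \leq \infty$, one can begin with the elementary (weighted) arithmetic mean-geometric mean inequality

$ab \leq \frac{1}{p} a^p + \frac{1}{q} b^q$ (7)

for non-negative a,b (which follows from the convexity of the function $\theta \mapsto (a^p)^{\theta} (b^q)^{1-\theta}$, which in turn follows from the convexity of the exponential function) to obtain the inequality

$\int_X f g\, d\mu \leq \frac{1}{p} \|f\|_{L^p(X,\mu)}^p + \frac{1}{q} \|g\|_{L^q(X,\mu)}^q$.

This inequality is weaker than (6) (because of (7)); but if one amplifies by arbitraging the imbalance in the homogenisation symmetry $(f, g) \mapsto (\lambda f, \frac{1}{\lambda} g)$ one obtains (6). As a third example, the Sobolev embedding inequality

$\|f\|_{L^q(\mathbb{R}^d)} \leq C_{p,q,d} \left( \|f\|_{L^p(\mathbb{R}^d)} + \|\nabla f\|_{L^p(\mathbb{R}^d)} \right)$ (8)

which is valid for $1 < p < q < \infty$ and $\frac{1}{q} > \frac{1}{p} - \frac{1}{d}$ (and also valid in some endpoint cases) and all test functions (say) f on $\mathbb{R}^d$, can be amplified to obtain the Gagliardo-Nirenberg inequality

$\|f\|_{L^q(\mathbb{R}^d)} \leq C_{p,q,d} \|f\|_{L^p(\mathbb{R}^d)}^{1-\theta} \|\nabla f\|_{L^p(\mathbb{R}^d)}^{\theta}$ (9)

where $0 < \theta < 1$ is the number such that $\frac{1}{q} = \frac{1}{p} - \frac{\theta}{d}$, by arbitraging the action of the dilation group $f(x) \mapsto f(\lambda x)$. (In this case, the dilation action does not leave either the LHS or RHS of (8) invariant, but it affects the LHS in a well controlled manner, which can be normalised out by dividing by a suitable power of $\lambda$.) The same trick, incidentally, reveals why the Sobolev embedding inequality fails when $q < p$ or when $\frac{1}{q} < \frac{1}{p} - \frac{1}{d}$, because in these cases it leads to an absurd version of the Gagliardo-Nirenberg inequality. Observe also that the Gagliardo-Nirenberg inequality (9) is dimensionally consistent; the dilation action affects both sides of the inequality in the same way. (The weight of the representation of the dilation action on an expression is the same thing as the exponent of the length unit that one assigns to the dimension of that expression.) More generally, arbitraging a dilation symmetry allows a dimensionally consistent inequality to emerge from a dimensionally inconsistent (or dimensionally inefficient) one.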
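The dilation bookkeeping behind this amplification can be written out explicitly. What follows is a sketch, assuming the Sobolev inequality has the form $\|f\|_{L^q} \leq C(\|f\|_{L^p} + \|\nabla f\|_{L^p})$ on $\mathbb{R}^d$; the notation $f_\lambda(x) := f(\lambda x)$ and the explicit constant $2C$ are introduced here only for illustration. Apply the inequality to $f_\lambda$, divide out the power of $\lambda$ on the left, and then balance the two terms on the right.

```latex
\begin{align*}
\|f_\lambda\|_{L^q} &= \lambda^{-d/q} \|f\|_{L^q}, \qquad
\|f_\lambda\|_{L^p} = \lambda^{-d/p} \|f\|_{L^p}, \qquad
\|\nabla f_\lambda\|_{L^p} = \lambda^{1-d/p} \|\nabla f\|_{L^p}, \\
\|f\|_{L^q} &\leq C \lambda^{-\theta} \left( \|f\|_{L^p} + \lambda \|\nabla f\|_{L^p} \right)
  \quad \text{for all } \lambda > 0, \qquad
  \theta := d \left( \frac{1}{p} - \frac{1}{q} \right), \\
\lambda := \frac{\|f\|_{L^p}}{\|\nabla f\|_{L^p}}
  \quad &\Longrightarrow \quad
\|f\|_{L^q} \leq 2C \|f\|_{L^p}^{1-\theta} \|\nabla f\|_{L^p}^{\theta}.
\end{align*}
```

When $\theta$ falls outside $[0,1]$, sending $\lambda$ to zero or to infinity in the middle line drives the right-hand side to zero, which is the absurd conclusion that rules out those exponent ranges.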

— Amplification using linearity —

Another powerful source of amplification is linearity (the principle of superposition). A simple example of this is depolarisation. Suppose one has a symmetric bilinear form $B: X \times X \to \mathbb{R}$ from a normed vector space X to the real numbers, and one has already proven the polarised inequality

$|B(f,f)| \leq A \|f\|_X^2$

for all f in X. One can amplify this by replacing f with f+cg for arbitrary f, g in X and a real parameter c, obtaining

$|B(f,f) + 2c B(f,g) + c^2 B(g,g)| \leq A (\|f\|_X + |c| \|g\|_X)^2$.

Optimising this in c (e.g. taking $c := \|f\|_X / \|g\|_X$) and using the triangle inequality, one eventually obtains the amplified (depolarised) inequality

$|B(f,g)| \leq C A \|f\|_X \|g\|_X$

for some absolute constant C.

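For a slightly more sophisticated example, suppose for instance that one has a linear operator $T: L^p(X) \to L^p(Y)$ for some $0 < p < \infty$ and some measure spaces X,Y, and that one has established a scalar estimate of the form

$\|Tf\|_{L^p(Y)} \leq A \|f\|_{L^p(X)}$ (10)

for arbitrary scalar functions f. Then by replacing f by a signed sum $\sum_{n=1}^N \epsilon_n f_n$, where $f_1, \ldots, f_N$ are arbitrary functions in $L^p(X)$ and $\epsilon_n \in \{-1, +1\}$ are signs, and using linearity, we obtain

$\left\| \sum_{n=1}^N \epsilon_n T f_n \right\|_{L^p(Y)} \leq A \left\| \sum_{n=1}^N \epsilon_n f_n \right\|_{L^p(X)}$.

If we raise this to the $p^{th}$ power, take the $\epsilon_n$ to be random (Bernoulli) signs (in order to avoid unexpectedly large cancellations in the series), and then take expectations of both sides, we obtain (using Khintchine’s inequality on each side)

$\left\| \left( \sum_{n=1}^N |T f_n|^2 \right)^{1/2} \right\|_{L^p(Y)} \leq C_p A \left\| \left( \sum_{n=1}^N |f_n|^2 \right)^{1/2} \right\|_{L^p(X)}$

for some constant $C_p$ depending only on p (in particular, it is independent of N). We can then use the monotone convergence theorem to amplify the finite sum to an infinite sum, thus

$\left\| \left( \sum_{n=1}^\infty |T f_n|^2 \right)^{1/2} \right\|_{L^p(Y)} \leq C_p A \left\| \left( \sum_{n=1}^\infty |f_n|^2 \right)^{1/2} \right\|_{L^p(X)}$.

Comparing this to (10) we see that we have amplified a scalar inequality (in which the unknown function f takes values in the real or complex numbers) to a vector-valued inequality (in which we have a sequence $(f_n)_{n=1}^\infty$ taking values in the Hilbert space $l^2(\mathbb{N})$). [This particular amplification was first observed by Marcinkiewicz and Zygmund.]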
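The step that turns random signs into a square function rests on Khintchine's inequality, which says that the p-th moment of a random signed sum is comparable to the p-th power of the l^2 norm of its coefficients. Here is a small sketch that computes the random-sign average exactly by enumerating all sign patterns; the function names and the sample coefficients are hypothetical choices made for illustration.

```python
import itertools


def sign_average(coeffs, p):
    """Exact value of E|sum_n eps_n a_n|^p over all 2^N sign patterns."""
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(coeffs)):
        total += abs(sum(e * a for e, a in zip(signs, coeffs))) ** p
    return total / 2 ** len(coeffs)


def square_function(coeffs, p):
    """(sum_n |a_n|^2)^(p/2), the quantity the random-sign average is compared with."""
    return sum(abs(a) ** 2 for a in coeffs) ** (p / 2)


# Arbitrary sample coefficients.
coeffs = [3.0, -1.0, 2.0, 0.5, 1.5, -2.5]
for p in (1, 2, 4):
    ratio = sign_average(coeffs, p) / square_function(coeffs, p)
    print(p, ratio)
```

For p = 2 the ratio is exactly 1 (the cross terms average out), and for other p it stays within constants depending only on p, which is what lets the constant in the vector-valued estimate be independent of N.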

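If the estimate one is studying involves “localised” operators or norms, then one can use linearity to amplify a global estimate into a more localised one. For instance, let us return to the Sobolev inequality (8). We can establish a partition of unity $1 = \sum_{n \in \mathbb{Z}^d} \psi(x - n)$ for some bump function $\psi$, then we see that

$\|f\|_{L^q(\mathbb{R}^d)} \leq C_{d,q,\psi} \left( \sum_{n \in \mathbb{Z}^d} \|\psi(\cdot - n) f\|_{L^q(\mathbb{R}^d)}^q \right)^{1/q}$.

Applying the Sobolev inequality (8) to each localised function $\psi(\cdot - n) f$ and then summing up, one obtains the localised Sobolev inequality

$\|f\|_{L^q(\mathbb{R}^d)} \leq C'_{p,q,d} \left( \sum_{n \in \mathbb{Z}^d} \left( \|f\|_{L^p(Q_n)} + \|\nabla f\|_{L^p(Q_n)} \right)^q \right)^{1/q}$

where $Q_n$ is the cube of sidelength 1 centred at n. This estimate is a little stronger than (8), because the $l^q$ summation norm is smaller than the $l^p$ summation norm.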

— Amplification via translation invariance —

If $T$ is a translation-invariant operator on $\mathbb{R}^d$ which is not identically zero, one can automatically rule out a large variety of estimates concerning T due to their incompatibility with translation invariance (they would amplify themselves into an absurd estimate). For instance, it will not be possible to establish any weighted estimate involving power weights such as $(1+|x|)^\alpha$ in which there is a higher exponent on the left. More precisely if $\alpha > \beta$ are real numbers and $0 < p, q < \infty$, then it is not possible for any estimate of the form

$\|(1+|x|)^\alpha Tf\|_{L^q(\mathbb{R}^d)} \leq C_{p,q,\alpha,\beta,d} \|(1+|x|)^\beta f\|_{L^p(\mathbb{R}^d)}$

to be true. Indeed, if such an estimate were true, then by using the translation invariance we can amplify the above estimate to

$\|(1+|x-x_0|)^\alpha Tf\|_{L^q(\mathbb{R}^d)} \leq C_{p,q,\alpha,\beta,d} \|(1+|x-x_0|)^\beta f\|_{L^p(\mathbb{R}^d)}$

for any $x_0 \in \mathbb{R}^d$. But if one fixes f and lets $x_0$ go to infinity, we see that the right-hand side grows like $|x_0|^\beta$ while the left-hand side grows like $|x_0|^\alpha$ (unless Tf vanishes entirely), leading to a contradiction.

[There is a Fourier dual to the above principle, familiar to experts in the analysis of PDEs, which asserts that a function space norm with a low number of derivatives (i.e. a low-regularity norm) cannot control a norm with a high number of derivatives. Here, the underlying symmetry that drives this principle is modulation invariance rather than translation invariance.]

One can obtain particularly powerful amplifications by combining translation-invariance with linearity, because one can now consider not just translates of a single function f, but also consider superpositions of such functions. For instance, we have the principle (which I believe was first articulated by Littlewood) that a non-trivial translation-invariant linear operator T can only map $L^p(\mathbb{R}^d)$ to $L^q(\mathbb{R}^d)$ when $q \geq p$. (Littlewood summarised this principle as “the higher exponents are always on the left”.) To see this, suppose that we had an estimate of the form

$\|Tf\|_{L^q(\mathbb{R}^d)} \leq C_{p,q,d} \|f\|_{L^p(\mathbb{R}^d)}$. (11)

We can amplify this estimate by replacing $f(x)$ by $\sum_{j=1}^N f(x - x_j)$, where N is some integer and $x_1, \ldots, x_N$ are widely separated points. If these points are sufficiently far apart, then the RHS of (11) is comparable to $C_{p,q,d} N^{1/p} \|f\|_{L^p(\mathbb{R}^d)}$, whereas the LHS is comparable to $N^{1/q} \|Tf\|_{L^q(\mathbb{R}^d)}$ (note how this uses both the translation-invariance and linearity of T). Thus in the limit we obtain

$N^{1/q} \|Tf\|_{L^q(\mathbb{R}^d)} \leq C_{p,q,d} N^{1/p} \|f\|_{L^p(\mathbb{R}^d)}$.

Letting N go to infinity, this forces $q \geq p$ unless Tf vanishes for every f.
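The scaling in Littlewood's principle can be seen exactly in a discrete analogue. The sketch below works on the integers rather than on Euclidean space (an assumption made so that everything is a finite computation), takes T to be convolution with a short kernel, and superposes N widely separated translates of a fixed sequence; the names `convolve`, `lp_norm`, `separated_copies` and the sample data are illustrative only.

```python
def convolve(f, kernel):
    """Discrete convolution on Z; a simple translation-invariant linear operator."""
    out = [0.0] * (len(f) + len(kernel) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(kernel):
            out[i + j] += a * b
    return out


def lp_norm(f, p):
    """l^p norm of a finitely supported sequence."""
    return sum(abs(a) ** p for a in f) ** (1.0 / p)


def separated_copies(f, count, gap):
    """Sum of `count` translates of f, shifted by multiples of `gap` (gap >= len(f))."""
    out = [0.0] * (gap * (count - 1) + len(f))
    for n in range(count):
        for i, a in enumerate(f):
            out[n * gap + i] += a
    return out


f = [1.0, -2.0, 0.5]
kernel = [0.25, 0.5, 0.25]
p, q = 2.0, 1.5                  # q < p: the direction the principle forbids
gap = len(f) + len(kernel)       # wide enough that the translates of Tf do not overlap

base = lp_norm(convolve(f, kernel), q) / lp_norm(f, p)
for count in (1, 10, 100):
    F = separated_copies(f, count, gap)
    ratio = lp_norm(convolve(F, kernel), q) / lp_norm(F, p)
    # The ratio |TF|_q / |F|_p grows by exactly count^(1/q - 1/p).
    print(count, ratio / base, count ** (1 / q - 1 / p))
```

Because the translates have disjoint supports, the two printed columns agree, and when q is less than p the ratio is unbounded in N, so no uniform constant can exist.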

The combination of translation invariance and linearity is so strong that it can amplify even a very qualitative estimate into a quantitative one. A good example of this is Stein’s maximal principle. Suppose we have some maximal operator $Mf := \sup_n |T_n f|$ on some compact group G with normalised Haar measure dm, where the $T_n$ are a sequence of translation-invariant operators which are uniformly bounded on some $L^p(G)$ space for some $1 \leq p \leq 2$. Suppose we are given the very weak information that $Mf$ is finite almost everywhere for every $f \in L^p(G)$. (This is for instance the case if we know that $T_n f$ converge pointwise almost everywhere.) Miraculously, this qualitative hypothesis can be amplified into a much stronger quantitative one, namely that $M$ is of weak type $(p,p)$:

$m(\{ x \in G: Mf(x) \geq \lambda \}) \leq \frac{C}{\lambda^p} \|f\|_{L^p(G)}^p$. (12)

To see this, suppose for contradiction that (12) failed for any C; by homogeneity, it would also fail even when restricted to the case $\lambda = 1$. What this means is that for any $\delta > 0$, there exists $f \in L^p(G)$ such that

$\|f\|_{L^p(G)}^p < \delta\, m(E)$ (13)

where $E$ is the set where $Mf \geq 1$.

At present, $E$ could be a very small subset of G, although we know that it has positive measure. But we can amplify this set to be very large by the following trick: pick an integer N comparable to $1/m(E)$, select N random shifts $g_1, \ldots, g_N \in G$ and random signs $\epsilon_1, \ldots, \epsilon_N \in \{-1, +1\}$ and replace $f$ by the randomised sum $F := \sum_{n=1}^N \epsilon_n f(g_n^{-1} \cdot)$. The maximal function $MF$ of this sum will tend to be large (greater than or comparable to 1) on most of the union $\bigcup_{n=1}^N g_n \cdot E$; this can be made precise using Khintchine’s inequality. On the other hand, another application of Khintchine’s inequality using (13) shows that $F$ has an $L^p$ norm of $O(\delta^{1/p})$ on the average. Thus we have constructed functions f of arbitrarily small $L^p$ norm whose maximal function Mf is bounded away from zero on a set of measure bounded away from zero. From this and some minor additional tricks it is not difficult to then construct a function f in $L^p$ whose maximal function is infinite on a set of positive measure, leading to the desired contradiction.

— The tensor power trick —

We now turn to a particularly cute source of amplification, namely the tensor power operation $f \mapsto f^{\otimes M}$ which takes a complex-valued function $f: X \to \mathbb{C}$ on some set X and replaces it with a tensor power $f^{\otimes M}: X^M \to \mathbb{C}$ defined by

$f^{\otimes M}(x_1, \ldots, x_M) := f(x_1) \ldots f(x_M)$.

If one has an estimate for which only one of the sides behaves nicely under tensor powers, then there can be some opportunity for arbitrage. For instance, suppose we wanted to prove the Hausdorff-Young inequality

$\|\hat f\|_{l^{p'}(\hat G)} \leq \|f\|_{L^p(G)}$ (14)

on arbitrary finite additive groups G and all $1 \leq p \leq 2$, where $p' = p/(p-1)$ is the dual exponent of p, $\hat G$ is the Pontryagin dual of G (i.e. the group of characters on G), we give G normalised counting measure, and $\hat f(\chi) := \frac{1}{|G|} \sum_{x \in G} f(x) \overline{\chi(x)}$ is the Fourier transform on G. If we had the Riesz-Thorin interpolation theorem, we could quickly deduce (14) from the trivial inequality

$\|\hat f\|_{l^\infty(\hat G)} \leq \|f\|_{L^1(G)}$ (15)

and the Plancherel identity

$\|\hat f\|_{l^2(\hat G)} = \|f\|_{L^2(G)}$; (16)

indeed, this is one of the textbook applications of that theorem. But suppose for some reason one did not wish to use the Riesz-Thorin theorem (perhaps in a desire to avoid “non-elementary” methods, such as complex analysis), and instead wished to use the more elementary Marcinkiewicz interpolation theorem. Then, at first glance, it appears that one can only conclude the weaker estimate

$\|\hat f\|_{l^{p'}(\hat G)} \leq C_p \|f\|_{L^p(G)}$

for some constant $C_p$. However, we can exploit the fact that the Fourier transform commutes with tensor powers. Indeed, by applying the above inequality with f replaced by $f^{\otimes M}$ (and G replaced by $G^M$) we see that

$\|\hat f\|_{l^{p'}(\hat G)}^M \leq C_p \|f\|_{L^p(G)}^M$

for every $M \geq 1$; taking $M^{th}$ roots and then letting M go to infinity we obtain (14); the tensor power trick has “magically” deleted the constant $C_p$ from the inequality. More generally, one can use the tensor power trick to deduce the Riesz-Thorin interpolation theorem from the Marcinkiewicz interpolation theorem (the key point being that the operator norm of a tensor power $T^{\otimes M}$ of a linear operator T is just the $M^{th}$ power of the operator norm of the original operator T). This gives a proof of the Riesz-Thorin theorem that does not require complex analysis.
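Both ingredients of this argument can be checked by direct computation on a small group: the norms on each side are multiplicative under tensor powers, and the M-th root of a fixed constant tends to 1. The sketch below assumes the cyclic group Z/5Z with normalised counting measure on the group and plain counting measure on the dual; the helper names, the sample function, the exponent p = 1.5 and the constant 10 are hypothetical choices for illustration.

```python
import cmath


def fourier(f):
    """Fourier transform on Z/nZ with normalised counting measure on the group."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * x * k / n) for x in range(n)) / n
            for k in range(n)]


def fourier2(F):
    """Fourier transform on (Z/nZ)^2 of an n-by-n array F[x][y]."""
    n = len(F)
    return [[sum(F[x][y] * cmath.exp(-2j * cmath.pi * (x * k + y * l) / n)
                 for x in range(n) for y in range(n)) / n ** 2
             for l in range(n)] for k in range(n)]


def group_norm(values, p):
    """L^p norm with normalised counting measure (an average instead of a sum)."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


def dual_norm(values, p):
    """l^p norm with plain counting measure, used on the dual group."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


f = [1.0, 2.0 - 1.0j, 0.0, -0.5j, 3.0]
p = 1.5
p_dual = p / (p - 1)

tensor = [[a * b for b in f] for a in f]          # the tensor square of f on (Z/5Z)^2
flat = [v for row in tensor for v in row]
flat_hat = [v for row in fourier2(tensor) for v in row]

# Hausdorff-Young for f itself: the first number should not exceed the second.
print(dual_norm(fourier(f), p_dual), group_norm(f, p))
# Multiplicativity: the norm of the transform of the tensor square is the square of the norm.
print(dual_norm(flat_hat, p_dual), dual_norm(fourier(f), p_dual) ** 2)

# A hypothetical constant C = 10 is harmless after taking M-th roots.
for M in (1, 10, 100, 1000):
    print(M, 10.0 ** (1.0 / M))
```

The last loop is the whole trick in miniature: any constant that does not grow with M is erased in the limit.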

Actually, the tensor power trick does not just make constants disappear; it can also get rid of logarithms. Because of this, we can make the above argument even more elementary by using a very crude form of the Marcinkiewicz interpolation argument. Indeed, suppose that f is a quasi-step function, or more precisely that it is supported on some set E in G and takes values between A and 2A for some $A > 0$. Then from (15) and (16) we see that $\|\hat f\|_{l^\infty(\hat G)} = O(A |E|/|G|)$ and $\|\hat f\|_{l^2(\hat G)} = O(A (|E|/|G|)^{1/2})$, and hence $\|\hat f\|_{l^{p'}(\hat G)} = O(A (|E|/|G|)^{1/p})$. Now if f is not a quasi-step function, one can decompose it into such functions by the “wedding cake” decomposition (dividing the range of |f| into dyadic intervals from $\|f\|_{L^\infty(G)}$ to $\|f\|_{L^\infty(G)}/|G|^{100}$; the portion of |f| which is less than $\|f\|_{L^\infty(G)}/|G|^{100}$ can be easily dealt with by crude methods). From the triangle inequality we then conclude the weak Hausdorff-Young inequality

$\|\hat f\|_{l^{p'}(\hat G)} \leq C_p (1 + \log |G|) \|f\|_{L^p(G)}$.

If one runs the tensor power trick again, one can eliminate both the constant factor $C_p$ and the logarithmic factor $1 + \log |G|$ and recover (14) (basically because $M^{1/M}$ converges to 1 as M goes to infinity). [More generally, the tensor power trick can convert restricted or weak-type estimates into strong-type estimates.]

The deletion of the constant $C_p$ may seem minor, but there are some things one can do with a sharp estimate that one cannot with a non-sharp one. For instance, by differentiating (14) at p=2 (where equality holds) one can obtain the entropy uncertainty principle

$\frac{1}{|G|} \sum_{x \in G} |f(x)|^2 \log \frac{1}{|f(x)|^2} + \sum_{\xi \in \hat G} |\hat f(\xi)|^2 \log \frac{1}{|\hat f(\xi)|^2} \geq 0$

whenever we have the normalisation $\|f\|_{L^2(G)} = 1$. (More generally, estimates involving Shannon entropy tend to be rather amenable to the tensor power trick.)

[I should remark that in Euclidean space, the constant in Hausdorff-Young can be improved to below 1, but this requires some particularly Euclidean devices, such as the use of Gaussians, although this is not too dissimilar as there are certainly many connections between Gaussians and tensor products (cf. the central limit theorem). All of the above discussion also has an analogue for Young’s inequality.]

The tensor power trick also allows one to disprove certain estimates. Observe that if two functions f, g on a finite additive group G are such that $|f(x)| \leq g(x)$ for all x (i.e. g majorises f), then from Plancherel’s identity we have

$\|\hat f\|_{l^2(\hat G)} \leq \|\hat g\|_{l^2(\hat G)}$

and more generally (by using the fact that the Fourier transform intertwines convolution and multiplication) that

$\|\hat f\|_{l^p(\hat G)} \leq \|\hat g\|_{l^p(\hat G)}$

for all even integers $p = 2, 4, 6, \ldots$. Hardy and Littlewood conjectured that a similar bound held for all $2 \leq p < \infty$, thus

$\|\hat f\|_{l^p(\hat G)} \leq C_p \|\hat g\|_{l^p(\hat G)}$.

But if such a bound held, then by the tensor power trick one could delete the constant $C_p$. But then a direct computation (for instance, inspecting what happens when f is infinitesimally close to g) shows that this amplified estimate fails, and so the Hardy-Littlewood majorant conjecture is false. (With a little more work, one can then transfer this failure from finite abelian groups G to other groups, such as the unit circle $\mathbb{R}/\mathbb{Z}$ or cyclic groups $\mathbb{Z}/N\mathbb{Z}$, which do not obviously admit tensor product structure; this was first done by Bachelis, and with stronger quantitative estimates by Mockenhaupt-Schlag and by Green-Ruzsa.)

The tensor product trick is also widely used in additive combinatorics (I myself learned this trick from a survey paper of Ruzsa). Here, one deals with sets A rather than functions f, but the idea is still the same: replace A by the Cartesian power $A^M$, see what estimate one gets, and let $M \to \infty$. There are many instances of this trick in the literature, but I’ll just describe one representative one, due to Ruzsa. An important inequality of Plünnecke asserts, among other things, that for finite non-empty sets A, B of an additive group G, and any positive integer k, the iterated sumset $kB = B + \ldots + B$ obeys the bound

$|kB| \leq \frac{|A+B|^k}{|A|^{k-1}}$. (17)

(This inequality, incidentally, is itself proven using a version of the tensor power trick, in conjunction with Hall’s marriage theorem, but never mind that here.) This inequality can be amplified to the more general inequality

$|B_1 + \ldots + B_k| \leq \frac{|A+B_1| \ldots |A+B_k|}{|A|^{k-1}}$

via the tensor power trick as follows. Applying (17) with $B := B_1 \cup \ldots \cup B_k$, we obtain

$|B_1 + \ldots + B_k| \leq \frac{(|A+B_1| + \ldots + |A+B_k|)^k}{|A|^{k-1}}$.

The right-hand side looks a bit too big, but this is the same problem we encountered with the Cauchy-Schwarz or Hölder inequalities, and we can resolve it in a similar way (i.e. by arbitraging homogeneity). If we replace G with the larger group $G \times \mathbb{Z}^k$ and replace each set $B_i$ with the larger set $B_i \times \{ e_i, 2e_i, \ldots, N_i e_i \}$, where $e_1, \ldots, e_k$ is the standard basis for $\mathbb{Z}^k$ and $N_i$ are arbitrary positive integers (and replacing A with $A \times \{0\}$), we obtain

$N_1 \ldots N_k |B_1 + \ldots + B_k| \leq \frac{(N_1 |A+B_1| + \ldots + N_k |A+B_k|)^k}{|A|^{k-1}}$.

Optimising this in $N_1, \ldots, N_k$ (basically, by making the $N_i |A+B_i|$ close to constant; this is a general rule in optimisation, namely that to optimise X+Y it makes sense to make X and Y comparable in magnitude) we obtain the amplified estimate

$|B_1 + \ldots + B_k| \leq C_k \frac{|A+B_1| \ldots |A+B_k|}{|A|^{k-1}}$

for some constant $C_k$; but then if one replaces $A, B_1, \ldots, B_k$ with their Cartesian powers $A^M, B_1^M, \ldots, B_k^M$, takes $M^{th}$ roots, and then sends M to infinity, we can delete the constant $C_k$ and recover the inequality.
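The reason Cartesian powers are so effective here is that cardinalities of sumsets are exactly multiplicative under them, so any inequality between such cardinalities is raised to the M-th power while a stray constant is not. A small sketch, assuming sets of integers represented as sets of 1-tuples so that Cartesian powers become sets of longer tuples; the function names and the sample sets are illustrative.

```python
from itertools import product


def sumset(A, B):
    """Sumset A + B for sets of integer tuples (componentwise addition)."""
    return {tuple(a_i + b_i for a_i, b_i in zip(a, b)) for a in A for b in B}


def cartesian_power(A, M):
    """A^M as a set of concatenated tuples, viewed inside the M-fold product group."""
    return {sum(parts, ()) for parts in product(A, repeat=M)}


# Arbitrary sample sets inside the integers.
A = {(0,), (1,), (3,)}
B = {(0,), (2,), (7,)}

for M in (1, 2, 3):
    size = len(sumset(cartesian_power(A, M), cartesian_power(B, M)))
    # The sumset of the powers is the power of the sumset, so the sizes match.
    print(M, size, len(sumset(A, B)) ** M)
```

Applying an estimate with a constant to the M-th powers and taking M-th roots therefore reproduces the same estimate for the original sets with the constant replaced by its M-th root.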

[Update, Sep 5: Optimal value for $\lambda$ in the proof of Cauchy-Schwarz fixed. (Thanks to furia_kucha for the correction.)]

I must respectfully disagree with you on the “publication-quality” issue. With some cosmetic changes this can be an excellent expository paper. And if you think (as I do) that this does not look like your typical expository paper — it only shows that it’s important to publish it.

There are dozens of professors out there who will show this paper to hundreds of students (for their great benefit, I am sure), if it is ever published in a more traditional venue.

The blog entry is useful as is; why should it be “published” in a “traditional” venue? To give Prof. Tao additional headaches? To make it _less_ accessible (one might have to print it, or go to the library, or worry about some journal’s draconian copyright policies)? To make it expensive? To waste natural resources?

Some others have asked me (by email) regarding conversion of some of the blog posts here to a print form, for instance somehow creating a book version of this blog at some point, roughly in analogy with the conversion of lecture notes into book form. The main advantage I see for this is that it would create a permanent, stable, and citable reference for these articles; with a blog, there is always the risk that the web page goes defunct, and (more subtly) that the page changes over time, with no way to access older versions (unlike wiki-based pages such as Wikipedia or Scholarpedia). By the same token, though, the dynamic nature of the blog format, in particular the cycle of incorporating suggestions from the comments, could not be easily duplicated in that format. (For instance, I do not know what copyright issues would arise if one attempted to insert some of the blog comments here into a book.)

As the issue does not seem to need any urgent resolution, I have not come to a decision about this, but perhaps will revisit it in a few months when there are more articles here that might be convertible to a publishable format. Though if anyone else has some suggestions on this issue, I would be happy to hear them.

With respect to alternate modes of publication, I definitely think that a PDF version (possibly with the comments included, if feasible) could be a great idea. Although screen reading is doable, and the print-preview trick leads to readable/printable versions of the contents, I think having the posts accessible together in such a format could be very useful.

If done as an e-book, it wouldn’t necessarily have the disadvantage of being quickly outdated if produced purely electronically, in a way that could easily be kept up to date.

————————
More mathematically, here are some other examples of applications of the kinds of tricks described in the post:

(1) Deligne’s proof(s) of the RH over finite fields uses some kind of tensor power trick: roughly speaking, one needs to show that the eigenvalues of some operator are of modulus at most , and the “trivial bound” is . Moreover, it is shown that for , , one can get a bound …

(2) Recently, Bernstein and Reznikov (“Analytic continuation of representations and estimates of automorphic forms”, Ann. of Math. 150, 1999) proved a very nice result concerning bounds for the coefficients of the spectral expansion of the square of a cusp form (for a subgroup of $SL_2(\mathbb{R})$), where the goal is to obtain a precise rate of exponential decay. One of their ideas was to use the fact that a certain function is both bounded by a Sobolev norm and invariant under a group action. The Sobolev norm is not invariant, thus the estimate can be “amplified” to estimate the function by the smallest invariant norm, which is then proved to be much smaller than the “trivial” bound.

It’s also worth mentioning that the terminology “amplification method” has been used in analytic number theory in the last few years to refer to an idea (originally introduced by Iwaniec in “The spectral growth of automorphic L-functions”, J. reine angew. Math. 428 (1992), 139–159) that leads to a number of important estimates for (typically) special values of automorphic L-functions, especially in the context of the so-called subconvexity problem for estimating L-functions.

(For a survey of this method, one can see P. Michel’s Park City lectures in “Automorphic Forms and Applications”, edited by P. Sarnak and F. Shahidi, IAS/PCMS vol. 12.)

I agree with AB— with a few minor changes (like replacing links with brief statements of the cited named theorems), this would make for a wonderful Monthly paper. I’d like to see you add the Chauvenet Prize to your list of honors! (Has the Chauvenet Prize been discontinued? If so, it should be resurrected.)

This is exactly the kind of paper I found really inspiring as an undergraduate. (My analysis teachers at Cornell were Tony Knapp, Robert Strichartz and Eugene Dynkin, and this blog entry reminded me of how much I loved analysis when I was an undergraduate. Thanks!)

More tricks, please! This is great stuff, it really is!

I too would love to see the best of your blog posts collected someday in printed form. Unlike most other blog posts, I think only light editing would be required to enable them to form a fascinating and useful book.

I’d like to challenge readers who liked this entry as much as I did to add brief descriptions of their own favorite applications of “amplification”, after they’ve had some time to play around with applications! (Perhaps the best of these could be eventually added as additional examples if this entry appears in printed form.)

Regarding dimensional analysis: I think of this as a special case of Lie’s method of attacking differential equations by analyzing and exploiting their symmetries. Indeed, all the methods taught in undergraduate ODE courses can be regarded as special cases of his approach (which apply to PDEs, although certainly no magic bullet). See for example the readable UG textbook by Cantwell, Introduction to Symmetry Analysis, or various books by Olver or Ibragimov.

In view of an elementary connection between “entropy” and “asymmetry” known from group actions, and more sophisticated speculations from topos theory and tiling, I’ve long been intrigued by the idea that many mathematical arguments can be reformulated in a manner somewhat analogous to manipulating Rubik’s cube into a desired state, or moving around “tiles” on a table.

BTW, in a recent article by Brian Hayes, “Sorting out the Genome”, American Scientist 95 (2007), note that he is discussing what I have called the restoration problem for various sets of generators for the symmetric group (unsigned perms) and for its wreath product with C_2 (signed perms), and he illustrates the kind of “regularization” which can result from wreathing.

On the copyright issue:
All original ideas appearing on this blog belong to the owner or renter
of the server on which this blog is running.
Scientific copyright customarily lasts 28 years. Theoretically, what
this means, is that after copyright of an article is ceded to a journal,
before those 28 years passed, it would be a plagiarism for anybody to:
1. write another related paper with only minor changes,
2. copy the content of the article into a book.
Practically, nobody complains that these fine rules for academic
conduct seem to be violated more and more often, due e.g., to peer
pressure, seen everywhere.
In particular, I don’t understand why arxiv is not deleting preprints
after they are published. I suppose physicists do not care about
rigour, as e.g., it is customary in their community to rename whole
theories.
These bits and pieces I found on the net, and if they are incorrect or
there is new legislation on these matters, I would like to learn about it.

It is easy to see that in my post I expressed no opinion on what
Terry should do with his blog. However, what is nice about Terry’s blog,
is that science can be openly discussed here, at least on the issues he
puts forward. However, I look towards times in which e.g., there would
be a blog connected to every published paper reviewed on the AMS
website, with free access to people with math PhD degrees.
On copyright: it would be interesting to read an opinion from someone who
experienced her/his work violated in one of the ways I described above.
And actually, it seems that it is the conduct of arxiv that runs directly counter to the Open Access movement. The following SHERPA website: http://www.sherpa.ac.uk/romeo.php
gives a complete description of publishers’ various restrictions on archiving
articles in public domain. E.g., the American Phytopathological Society
is a so-called yellow publisher, i.e.,
“author can archive pre-print,
subject to restrictions below, author can archive post-print
only publisher PDF may be used,
published source must be acknowledged
cover page must be included (provided by publisher).”
To be more precise, I think it would be a good idea for arxiv to delete all articles which don’t respect these fine rules. Unless spending time working
hard on ever more papers and books is more important.

Thanks for the examples! I can’t believe I had forgotten about Deligne’s proof of RH for finite fields, which is one of the more spectacular applications of the tensor power trick.

The amplification method for subconvexity bounds can be viewed as a kind of special case of the amplification trick; here the symmetry one is exploiting is the ability to manipulate the underlying measure or weight in the summation, which is an important symmetry that I didn’t mention in the main post. For instance, if one takes the log-convexity of norms,

for and that Shuanglin mentioned, and one uses the symmetry for arbitrary weight functions w, one can amplify the log-convexity inequality to Holder’s inequality.

Dear Chris: There are some arguments in analysis (and probably in other areas of mathematics too) which definitely have the feel of a Rubik’s cube-like puzzle, in which there are a relatively small number of “moves” available, and to prove a given result one needs to execute these moves in a well-chosen order. For example, in the analysis of nonlinear PDEs, there are a large variety of multilinear estimates that one needs to prove in order to control any given PDE, but any such estimate can often be proven by the judicious application of a remarkably small number of inequalities (e.g. Holder’s inequality, Sobolev embedding, triangle inequality, the Leibnitz rule, and Bernstein’s inequality). The extreme version of this state of affairs is of course that of a calculus (such as the differential calculus), in which a small set of formal rules allow one to perform any computation of a certain type.

Dear all: I appreciate the lively discussion here, but please try to keep all comments polite and constructive. Thank you!

I too thought this was a great post, and it set me thinking about the issues raised by some of the comments above. Your initial remark, that it is not conventionally publishable (because it contains no new results) is obviously correct. Here, by “conventionally publishable” I mean “publishable as a research article”. And yet, what I look for in written mathematics is to be taught something. Usually this is an irritatingly indirect process: an author just displays the results and proofs and one has to think very hard before being able to work out which parts of the argument were genuine breakthroughs and which were “tricks” or “semi-standard moves”. If the area is not one’s own, this can be more or less impossible. Your article lays bare a way of thinking, one that I did not know (in particular, I was not aware of that very nice way of thinking about the Cauchy-Schwarz inequality, though I suppose parts of it I had appreciated without thinking about it too consciously) and in that way directly teaches me something.

I find myself sympathizing with those who would like it to be published in print form, and also with those who object to that on the grounds that the blog medium should be thought of as a perfectly valid medium in itself. I would suggest that the main reason people would like a print version is the convenience of having a number of such “tricks” articles in the same place. But there is nothing in principle against having that convenience online: what is needed is a single (virtual) place where articles of this kind are collected, indexed in many different ways (by author, by subject, and so on). I myself would be very happy to contribute articles of such a kind, as I feel strongly that our present mathematical culture needs changing: we should be more explicit about our thought-processes and thereby more welcoming to new mathematicians. I think that keeping such articles as an online resource would emphasize very nicely that this was intended as a challenge to the more traditional way of doing things.

I’m left with a question — forgive me if it’s very ignorant. In order to do something along the lines I suggest, would it be necessary to start up a new site, or is there some really convenient way that one can produce a “super-structure” to do the job of indexing the blog posts, keeping the posts where they are? (One could do it crudely by simply setting up a web page, but that relies on a single person to keep it updated, which is not satisfactory.)

Dear David,
I think that such new ideas can rarely be successful. E.g., I find the new
Scholarpedia quite cynical – the so-called curators can make decisions
on what is allowed to be changed, and at the same time take credit
for changes, as an article improves. Besides the editor-in-chief clearly
has a vested interest in promoting computational neuroscience through
this outlet. As if indiscriminate government funding was not enough.
At the same time Wikipedia has a growing positive reputation. Maybe
they would allow blogs to be connected to articles there?

Terry, one thought I would like to toss out is that while wikis are great for collaboration, I claim that “private wikis” (possibly world readable but editable only by the author) are useful as authorship tools. Here we encounter a technical issue: it is not straightforward to convert a document created under, say, MediaWiki into a conventional LaTeX-formatted paper. But it can be done. (Jacques Distler is working on a wiki which he says can export a LaTeX file with one mouse click, but his wiki software might have drawbacks mitigating this advantage.)

Tim G, I have also been musing about the potential of a collaborative website of mathematical exposition. I’d like to see a wiki which combines the best elements of making mathematical writing more enjoyable by providing a fun authorship environment, while encouraging signed expository articles under the control of the author (an expert both in expository writing and in the subject matter), and providing convenient linking of related articles and more or less public discussion forums. Sites like http://eqworld.ipmnet.ru/index.htm and http://mathworld.wolfram.com/ are inspiring; I think with a suitable wiki and institutional sponsorship one could do even better. Unfortunately there may be nontrivial technical obstacles. Maybe we should find a suitable forum in which to extend discussion of “wikis we’d like to see”?

Chris, speaking about physicsforums.com: the New Scientist has nicer
graphics.
To switch from playing with software discussions to Prof. Gowers’
constructive suggestion that we should be more welcoming to new mathematicians, let me share an experience from my workplace.
We run the programs separately; both maths and physics take about
100 new undergrads. While physics gets about 10 excellent students
each year (e.g., who during the 1st semester by themselves learn about Euler’s proof of the Basel problem), maths gets at most average ones.
I doubt this is a country-specific problem. Nobody I spoke to can give a sensible reason for this, and it is not the job situation, as mathematicians
are respected by industry employers here. Maybe our programs are too oriented towards finance and other superficial fads to attract the brightest?

[…] as a result of a very interesting post of Terence Tao. Both the post and the discussion can be found here . The post outlines a rather general idea, or trick, that can be used in many mathematical […]

Chris, I agree with you that maybe this is not the best place to have a very general discussion about how to organize expository material on the web, so I’ve followed your suggestion and attempted to get a discussion going. See the link just above this comment.

A random, if not irrelevant question: I find it interesting that you used intuition/terminology from economics. Do you find economics interesting? I was under the impression that many (pure) mathematicians dislike economics, since for instance, it lacks the elegance and beauty of theorem-proof. Did you learn economics as a byproduct of your mathematical activities or was this analogy with “arbitrage” for instance just a coincidental word from economics?

I find it helpful to draw analogies from any source that can usefully provide them, whether it be physics or economics or Tomb Raider. :-) Indeed, once one has developed the ability to distinguish between rigorous and non-rigorous thinking, there’s really no further reason why the “mathematics” compartment of one’s brain needs to be walled off from the rest of one’s experiences and cognitive abilities (which of course includes various non-rigorous modes of thinking, such as reasoning by analogy).

On the other hand, one should also be able to recognise when an analogy has reached its natural limits. For instance, it seems unlikely that one could set up an efficient market in estimates that would automatically arbitrage away symmetry imbalances, though I must admit that such a market would be very handy in my own research…

My collaborators, Erwin Lutwak and Gaoyong Zhang, were struck by the remark by Emmanuel Kowalski about Bernstein and Reznikov’s technique of “amplifying” the Sobolev inequality. We have been doing exactly this with both sharp isoperimetric and Sobolev inequalities on Euclidean space. Except we’ve always called it “optimization”.

In our PLMS paper on “L_p John ellipsoids”, we show how to amplify the isoperimetric inequality (as well as an L_p version of it) for convex bodies over the group SL(n). You get an affine isoperimetric inequality, where the extremal bodies are necessarily not just balls but also ellipsoids. One cool thing about this is that with the “amplified norm”, you can get sharp *reverse* isoperimetric inequalities, where the extremal body is a simplex.

In our IMRN paper on “Optimal Sobolev inequalities”, we show how to amplify a Sobolev norm over all Banach norms (I guess this is over a rather large infinite dimensional group) and obtain sharp affine-invariant Sobolev inequalities.

Another theme in our work, which I believe is also pervasive in harmonic analysis, is averaging. Basically, we know two ways to turn Euclidean-invariant inequalities into affine-invariant inequalities: optimization (i.e., amplification) and averaging.

Maybe authors could include some additional tags in their ‘tricks explained’ posts that would help a computer to categorise them, then it should be quite trivial to write a short program that would serve as a meta engine. I could try some quick hack along these lines if you tell me which ways of accessing it you would like to see.

Adam: OK I am a physicist and have different standards but the idea of the arxiv removing published papers sounds ridiculous to me: Why would I want to give up the possibility of easy access to papers from my desk (for other than fitness reasons of more frequent visits of the library)? I would think the trend would be rather in the opposite direction: An easily accessible (and that would most likely mean central) repository of all papers and journals only adding the quality control, possibly only virtually. You could imagine that a ‘journal’ only digitally signs a version of a file as “according to standards of journal XYZ” without any paper version etc. Let’s face it, this is the only real value that the journals provide these days (and for which libraries pay enormous sums).

Robert: I am not taking any sides in the Open Access debate.
However, it is hard to deny the publishers the right to make
profit, as it is equivalent to criticizing capitalism. Especially as
what probably enabled those price increases was the needless establishment of many new academic journals. The only problem
with arxiv is that it doesn’t respect the publishers’ requirements,
which are not that stringent. I think other solutions could be
discussed. E.g., why doesn’t the government buy out the journals,
instead of spending public money for the purpose of subsidizing
big business (e.g., math education schemes, equivalent to every
student buying a graphical calculator).
To me, the only virtual quality control which seems to have worked so
far is anonymous contribution, e.g., Wikipedia, or the Linux project.
In fact, there are still many people who put more value in constructive
work than in money, politics, coteries, prizes, and other nonsense.
Let me give you examples of values journals provide.
Suppose you are asked to referee a paper, which is a blatant copyright
infringement of one of your works. Then you can at least put up a fight.
Also, sometimes referees ask questions which are conducive to
improving a paper.
I would think that in mathematics it is not too hard to notice good work.
However, what goes on in physics is another matter, and I don’t
know what solutions are needed for a science area which might just
be exhausted.

Of course, publishers can (and have the right to) ask ridiculous things. But if they do, we should refuse to cooperate with them (by submitting papers, refereeing or serving on editorial boards). It should be in our interest as scientists to keep our fields healthy. If important results effectively disappear because they are only available in some journal that my (or many other) department’s library cannot afford to subscribe to, this cannot be in my interest. As in other situations where those who decide suffer the consequences at most indirectly (scientists decide where to submit, but they don’t have to worry about libraries’ budgets or funding money going into publishing houses’ pockets), it’s a bit harder to raise awareness.

I would not go so far as to say that governments should buy out publishers (which are legitimate businesses), but if their service for money is bad, we as users should use better alternatives (or set them up, which has been proven to be not too hard).

Government agencies could buy out not publishers, but journals.
Note how well e.g., journals of the AMS are run. They don’t insist
on ceding copyrights, rogue editors are made to resign, etc.
I believe before they were bought out by publishers, many journals
were managed by non-profit organizations. Because many new ones
were set up, publishers seized the opportunity to make profit.
The problem is that there are no known better alternatives.
If we disseminate our work in electronic repositories, the value
of such papers would be measured only in the amount of citations.
However, this doesn’t work for mathematics, as often original
results take years to be digested by the community. The AMS
webpage has just posted an article about the so-called Impact
Factor, which deals with this issue.
As for journals disappearing from libraries, welcome to the world.
In my country, only the library of the Academy of Sciences in the
capital has a fine collection. Because of this, many talented people
working in other cities have long had to resign themselves to
uncompetitive backwater projects. However, they are not sharing
the physicists’ whining about making revolutionary changes.
Maybe it is because they are not thinking only about “what is in
my interest.”

The ethics and future of mathematical publishing are interesting and important topics, but are a little distant from the topic of the original post. I have already received one request to ask that the discussion be moved to a more appropriate forum. Perhaps one of the readers here would like to initiate a new post on their own blog (or point to an existing post on this topic), and then advertise it here. For instance, the following recent post covers similar ground:

Dear Terry,
Apologies for getting off the main topic.
I have a question, which is somewhat related to amplification and
the work of your colleague, Perez-Marco. What are your thoughts
on the power of complex analysis tools in real analysis problems?
Are there any classical and well researched results for which real
analysis techniques were not enough? Examples of what I am
thinking about are the Selberg sieve, the Ablowitz-Ramani-Segur
conjecture, or some other problems.

Complex methods tend to be particularly powerful in problems in one real variable, or (naturally enough) in the complex plane or other Riemann or conformal surfaces. For instance, complex analysis methods are the only methods known that give good control on the asymptotic behaviour of general solutions to completely integrable equations in one dimension; a corresponding real-variable theory would be highly desirable, but is still in its embryonic stages.

The type of complex analysis that comes up in analytic number theory (analytic continuation of zeta and L-functions, shifting contours, locating zeroes and poles, etc.) can be partially replaced by Fourier analysis (using changes of variables to convert the Mellin transform to the Fourier-Laplace transform, for instance), but not completely. In particular, the Weierstrass factorisation of a meromorphic function based on its zeroes and poles is very hard to duplicate in a natural fashion using just the Fourier transform. (More generally, real-variable methods are hard-pressed to come up with a usable substitute of the obvious fact from complex analysis that the product of two meromorphic functions is again meromorphic.) But Fourier methods can be used for instance to replicate the Selberg sieve (I have done this in some of my papers with Ben Green and Tamar Ziegler) and to prove the prime number theorem (in one of my unpublished notes on my web page), though the proofs are not as slick as the complex variable ones.

In higher dimensions, complex methods are generally not as powerful as their real-variable counterparts in solving problems in many real dimensions. Indeed, a reverse phenomenon occurs: several results in several complex variables (notably those involving the $\overline{\partial}$-problem) require techniques from real-variable harmonic analysis and PDE in their proof.

[…] of applications, for instance to sum-product estimates. It is also an excellent example of the amplification trick in action; here the main source of amplification is the freedom to pass to subobjects, which is a […]

re tensor trick: this also occurs in the proof that not only does the degree of an irreducible rep of a finite group divide the order of G but also divides order of G / order of center of G. Proved by considering large tensor powers of the rep.
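
A sketch of how this argument can be made to run, for readers who want it spelled out. The notation is an assumption of this sketch, not the commenter's: $V$ is an irreducible complex representation of $G$ of degree $d$, $Z$ is the center of $G$, and $\chi$ is the character by which $Z$ acts on $V$ (Schur's lemma). The weaker fact that the degree of an irreducible representation divides the order of the group is taken as known. The tensor power $V^{\otimes m}$ is an irreducible representation of $G^m$ of degree $d^m$, and the central subgroup

$\displaystyle H = \{ (z_1,\dots,z_m) \in Z^m : z_1 \cdots z_m = 1 \}, \qquad |H| = |Z|^{m-1},$

acts trivially on it, since $(z_1,\dots,z_m)$ acts by the scalar $\chi(z_1)\cdots\chi(z_m) = \chi(z_1 \cdots z_m) = 1$. So $V^{\otimes m}$ is an irreducible representation of $G^m/H$, and the weak divisibility statement gives

$\displaystyle d^m \ \Big|\ \frac{|G|^m}{|Z|^{m-1}}, \qquad \hbox{i.e.} \qquad |Z| \Big( \frac{|G|}{d\,|Z|} \Big)^m \in {\bf Z} \hbox{ for every } m \geq 1.$

A rational number $q$ for which $|Z| q^m$ is an integer for all $m$ must itself be an integer (otherwise a prime in its denominator would divide $|Z|$ to arbitrarily high powers), so $d$ divides $|G|/|Z|$.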

I loved this post and would like to offer another easy example — it’s the proof that Landau gave for the maximum modulus principle for analytic functions (it’s mentioned in a footnote in Polya and Szego).

What you do is write down the Cauchy integral formula for f(z), then bound |f(z)| by the maximum on the boundary and the length of the boundary. This gives you a “weak” inequality which you amplify by considering f(z)^n. Now you get to make the constants disappear just like they do from the tensorization trick.
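
Spelled out, a sketch of the computation might look as follows. The notation is an assumption of this sketch: $\gamma$ is the boundary curve, $L$ its length, $M$ the maximum of $|f|$ on $\gamma$, and $\delta$ the distance from $z$ to $\gamma$. The Cauchy integral formula gives the weak inequality

$\displaystyle |f(z)| = \Big| \frac{1}{2\pi i} \oint_\gamma \frac{f(w)}{w-z}\, dw \Big| \leq \frac{L}{2\pi \delta} M.$

Applying the same bound to $f^n$, whose maximum on $\gamma$ is $M^n$, and then taking $n$-th roots, one obtains

$\displaystyle |f(z)| \leq \Big( \frac{L}{2\pi\delta} \Big)^{1/n} M \to M \quad \hbox{as } n \to \infty.$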

For another financial metaphor, you can also call this process “leveraging the generality” of an inequality. T

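There is another way to look at the Cauchy-Schwarz inequality in the finite-dimensional real case, i.e.

$\displaystyle \sum_{i=1}^n v_i w_i \leq \Big( \sum_{i=1}^n v_i^2 \Big)^{1/2} \Big( \sum_{i=1}^n w_i^2 \Big)^{1/2}$

for real numbers $v_1,\dots,v_n,w_1,\dots,w_n$. Let $v$ and $w$ be the corresponding vectors. We assume that all $v_i, w_i$ are positive. (All other cases can be deduced from this.)

You said that the minimum of the function
$f(\lambda) = \frac{\lambda^2}{2} \|v\|^2 + \frac{1}{2\lambda^2} \|w\|^2$ is $\|v\| \|w\|$. On the other hand, $v_i w_i$ is the minimum of the function $f_i(\lambda) = \frac{\lambda^2}{2} v_i^2 + \frac{1}{2\lambda^2} w_i^2$.
Note that $f = \sum_{i=1}^n f_i$. The Cauchy-Schwarz inequality now follows from the fact that the sum of the minima is less than or equal to the minimum of the sum.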

Dear Terence,
I think it is a wonderful idea to group tricks together and give them a collective colourful name like “arbitrage”.
“Amplification” reminds me of the little trick of proving the “freshman’s dream” equality (a+b)^q = a^q + b^q in a ring of characteristic p, first for q=p and then for q a power of p. But maybe this should just be called induction…
Best, A.S.

[…] version below is an updated and correct one. Also, Andreas pointed out an excellent post on “amplification, arbitrage and tensor power trick” (by Terry Tao) in which the “tricks” discussed are indeed very useful and […]

[…] of squares in Timothy Gowers’s first proof. Also the use of tensor products may remind you of Terry Tao’s post about tensor products that includes a proof of Cauchy-Schwarz, but this proof doesn’t actually use the tensor […]

Dear Terry,
I wanted to express my appreciation for this post. I am an undergraduate student in mathematics, currently enrolled in an analysis course, and although I lack the background to fully comprehend the topics discussed here, I have been able to extract a lot of useful tools that have helped me in my analysis problem sets. I look forward to reading more such expository papers. I would like to ask a slightly unrelated question. I have developed a very keen interest in analysis, and I would like to gain some exposure (even if it is a very rudimentary exposure) to some of the more cutting edge areas in analysis, and I was hoping you could offer some advice regarding where I could/should begin. For instance, if you could suggest some problems I could look into or texts I could read, it would really help a lot. I should also mention that I am also interested in mathematical finance, and so if you have any advice concerning areas in analysis that are related to this, I would really appreciate it. Thank you so much once again. Take care!

[…] rather than norm convergence, known as Stein’s maximal principle (discussed for instance in this previous blog post of mine). For instance, it reduces Carleson’s theorem on the pointwise convergence of Fourier […]

[…] of two quadratic residues is again a quadratic residue. One way to use this sort of structure to amplify bad behaviour in a single short interval into bad behaviour across many short intervals. Because of […]

[…] in the proof of 2 is a tensor power trick along the same lines as those described by Terence Tao in his blog. This will give the following result. Lemma 1 Let be positive constants with as goes to […]

Sorry, by “constant” I meant “constant in i” rather than “absolute constant”. The point is that the arithmetic mean-geometric mean inequality is close to equality when the are comparable, no matter how large the are. In this particular application, we want to reverse the AM-GM inequality with , so the optimisation proceeds by setting the to be comparable, e.g. setting for some large parameter L.

[…] to amplify a hybrid inequality into a dimensionally pure one by optimising over all rescalings; see this previous blog post for a discussion of this trick (which, among other things, amplifies the inhomogeneous Sobolev […]

Another great example of amplification is the proof of Chernoff’s bound from Markov’s inequality in probability theory, which stunned me when I first came across it. Markov’s inequality states that $P(X \ge a) \le E(X)/a$ for non-negative random variables $X$. Using that $P(X \ge a) = P(\exp(\theta X) \ge \exp(\theta a))$ for any $\theta > 0$, we find that $P(X \ge a) \le E(\exp(\theta X)) \exp(-\theta a)$, in which the parameter $\theta > 0$ can be chosen freely.
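
Optimising over the free parameter $\theta$ is what produces the exponentially small Chernoff-type bound. As a rough numerical illustration (a sketch only: the binomial example, the grid of $\theta$ values and the function names are choices made here, not part of the comment above), one can compare the plain Markov bound, the bound minimised over $\theta$, and the exact tail for a sum of $n$ independent Bernoulli($p$) variables, for which $E(\exp(\theta X)) = (1-p+pe^\theta)^n$:

```python
import math

def binomial_tail(n, p, a):
    """Exact P(X >= a) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(math.ceil(a), n + 1))

def markov_bound(n, p, a):
    """Markov's inequality applied directly to X: P(X >= a) <= E(X)/a."""
    return min(1.0, n * p / a)

def amplified_bound(n, p, a, thetas):
    """Markov applied to exp(theta*X), minimised over a grid of theta > 0.

    Uses E(exp(theta*X)) = (1 - p + p*e^theta)^n for a binomial X.
    """
    return min((1 - p + p * math.exp(t)) ** n * math.exp(-t * a)
               for t in thetas)

# Illustrative parameters (hypothetical, chosen only for this sketch).
n, p, a = 100, 0.5, 75
thetas = [k / 100 for k in range(1, 301)]  # theta = 0.01, 0.02, ..., 3.00

exact = binomial_tail(n, p, a)
markov = markov_bound(n, p, a)
amplified = amplified_bound(n, p, a, thetas)

print("exact tail      :", exact)
print("Markov bound    :", markov)
print("amplified bound :", amplified)
```

With these parameters the direct Markov bound is $2/3$, while the bound obtained after optimising $\theta$ is many orders of magnitude smaller and much closer to the true tail.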

Here are two examples of amplification, where the amplification trick gets rid of constants.
1. The maximum principle for holomorphic functions in a disk $D$ (which could be another domain). Let $z \in D$ and $f$ be a holomorphic function. Then Cauchy’s integral formula gives you an inequality $|f(z)| \leq C \|f\|_\infty$, with respect to the sup-norm over the boundary. Apply the inequality to $f^n$, then take the $n$-th root. The constant $C$ is changed into $C^{1/n}$, which tends to $1$ as $n$ tends to infinity.
2. In matrix analysis, let $\|\cdot\|$ be an algebra norm over $M_n({\bf R})$, that is $\|AB\| \leq \|A\| \|B\|$. Let $\rho(A)$ denote the spectral radius of a matrix $A$ (which involves the modulus of complex eigenvalues too). Then $\rho(A) \leq \|A\|$. This would be obvious if the norm was subordinated to a complex norm (take an eigenvector associated with an eigenvalue of maximal modulus, bla-bla). If not, take any such subordinated norm $N$; we thus have $\rho(A) \leq N(A)$. By equivalence of norms, we obtain $\rho(A) \leq C \|A\|$. Apply this to $A^m$, use $\rho(A^m) = \rho(A)^m$ and $\|A^m\| \leq \|A\|^m$, then take the $m$-th root and let $m$ tend to infinity.
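
For the second example, the amplification step can be written as a single chain (a sketch, with $\rho$ the spectral radius, $\|\cdot\|$ the algebra norm, and $C$ the constant coming from the equivalence of norms; the symbols are chosen here for illustration):

$\displaystyle \rho(A)^m = \rho(A^m) \leq C \|A^m\| \leq C \|A\|^m \quad \Longrightarrow \quad \rho(A) \leq C^{1/m} \|A\| \to \|A\| \quad \hbox{as } m \to \infty.$

The symmetry being exploited is that both $\rho$ and the bound $\|A\|^m$ behave multiplicatively under $A \mapsto A^m$, while the constant $C$ does not.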

[…] case of the Kadison-Singer paving conjecture, but we may use a number of standard maneuvers to amplify this to the full paving conjecture, loosely following the papers of Popa and of Akemann-Weaver. […]

[…] that the additional term could be deleted by an amplification trick similar to those discussed in this previous post). See this survey of Montgomery for these results and on the (somewhat complicated) evolution of […]

In order to show that , one assumes without loss of generality . By expanding

$\displaystyle \|x-t\alpha y\|^2\geq 0$

where and , one can not only conclude that the Cauchy inequality is true but also that the equality is true only when are linearly dependent.

The choice of $t$ and $\alpha$ seems rather mysterious; on the other hand, the way the corresponding parameters are chosen in this excellent post is extraordinarily natural. However, maybe I miss something in the post; I don’t see how to conclude the “equality” part from your argument.

Hi, Prof. Tao. I think there is a typo after the step of using ‘Khintchine’s inequality to compute the expectations’ on eq. (10). I think It should become.
In the article, it missed raising the RHS to the power of .
Please correct me if I am wrong.
