I don't know, but after reading the paper at the link given by the OP, it is clear Yitang Zhang would be a perfect answer to the question "major mathematical advances past age fifty", were it not closed. mathoverflow.net/questions/25630/…
–
JoëlMay 20 '13 at 3:41

46

@Gerhard Paseman: I think this is a perfectly legitimate question on this forum.
–
Bombyx moriMay 20 '13 at 3:49

36

@Gerhard Paseman: I would rather NOT ask you about system design, but I would definitely ask this question in this forum, because it is more than perfectly fitted here :)
–
Nilotpal SinhaMay 20 '13 at 7:35

17

@Nilotpal Sinha: what is the point of piling on Gerhard Paseman's comment in this way? The general(?) disagreement with it was amply expressed before. And this (type of) question is most definitely not "more than perfectly fitted here" or "perfectly legitimate" (@user32240). It is likely acceptable as an exception. But on the general principle Gerhard Paseman is perfectly right.
–
quidMay 20 '13 at 10:54

4 Answers
4

My understanding of this, which is essentially cobbled together from the various news accounts, is as follows:

Let $\pi(x;q,a)$ denote the number of primes less than $x$ congruent to $a\bmod q$, and $\pi(x)$ the number of primes less than $x$. Denote by EH($\theta$) the assertion that for every $A>0$ the following inequality holds:
$$\sum_{q\le x^{\theta}}\max_{(a,q)=1}\left|\pi(x;q,a)-\frac{\pi(x)}{\phi(q)}\right|\ll \frac{x}{(\log x)^{A}}.\qquad(*)$$

In the mid-2000s Goldston, Pintz and Yildirim proved that if the Elliott–Halberstam conjecture holds for some level of distribution $\theta>1/2$ then one has infinitely many bounded prime gaps (where the size of the gap is a function of $\theta$; for $\theta>.971$ they get a gap of size 16). Since Bombieri–Vinogradov tells us that EH($\theta$) holds for $\theta <1/2$, in some sense the Goldston–Pintz–Yildirim argument just barely misses giving bounded gaps.
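As a toy illustration (my own, not from any of the papers), one can compute the Bombieri–Vinogradov-type error sum over moduli $q\le Q$ for a small $x$; at these heights the numerics prove nothing about the level of distribution, but they fix the objects involved:

```python
# Toy computation of the error sum
#   E(x, Q) = sum_{q <= Q} max_{(a,q)=1} |pi(x;q,a) - pi(x)/phi(q)|,
# the quantity that EH(theta) bounds for Q = x^theta.
from math import gcd, isqrt

def primes_up_to(x):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [n for n in range(2, x + 1) if sieve[n]]

def phi(q):
    """Euler's totient by trial division."""
    result, n, p = q, q, 2
    while p * p <= n:
        if n % p == 0:
            result -= result // p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result -= result // n
    return result

def error_sum(x, Q):
    ps = primes_up_to(x)
    pi_x = len(ps)
    total = 0.0
    for q in range(2, Q + 1):
        counts = {}
        for p in ps:
            if gcd(p, q) == 1:
                counts[p % q] = counts.get(p % q, 0) + 1
        expected = pi_x / phi(q)
        total += max(abs(counts.get(a, 0) - expected)
                     for a in range(1, q + 1) if gcd(a, q) == 1)
    return total
```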

On the other hand, in the 1980s Fouvry and Iwaniec were able to push the level of distribution in the Bombieri–Vinogradov theorem above $1/2$ at the expense of (1) removing the absolute values, (2) removing the maximum over residue classes and (3) weighting the summand by a "well-factorable" function. This was subsequently improved in a series of papers by Bombieri, Friedlander and Iwaniec. For the Goldston–Pintz–Yildirim argument the first two restrictions didn't pose a significant hurdle; however, the inclusion of the well-factorable weight appeared to prevent one from using these results with the Goldston–Pintz–Yildirim machinery.

This is where my knowledge becomes extremely spotty, but my understanding is that Zhang replaces the Goldston–Pintz–Yildirim sieve with a slightly less efficient one. While less efficient, this modification allows him some additional flexibility. He then is able to divide up the quantity that is estimated using (*) into several parts. Some of these are handled using well-established techniques, however in the most interesting range of parameters he is able to reduce the problem to something that involves a well-factorable function (or something similar) which he ultimately handles with arguments similar to those of Bombieri, Friedlander and Iwaniec.

Zhang's argument reportedly gives a gap size that is around 70 million. My suspicion is that this gap will quickly decrease. In their theorem, Bombieri, Friedlander and Iwaniec give a level of distribution around $4/7$, whereas (according to these notes) Zhang seems to be working with a level of distribution of the form $1/2+ \delta$ for $\delta$ on the order of $1/1000$, so there is likely room for optimization. As a reference point, if one had the level of distribution $55/100$ ($< 4/7$) in the unmodified Elliott–Halberstam conjecture (without the well-factorable weight), the Goldston–Pintz–Yildirim argument gives infinitely many gaps of size less than 2956.
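For context, the concrete gap sizes here (16, 2956, 70 million) are diameters of admissible $k$-tuples: tuples $(h_1,\dots,h_k)$ whose elements avoid at least one residue class modulo every prime. A minimal admissibility checker (illustrative only; since $k$ elements cannot cover all $p$ classes when $p>k$, only primes $p\le k$ need checking):

```python
def is_admissible(tuple_h):
    """True if {h_i} misses a residue class mod p for every prime p <= len(tuple_h)."""
    k = len(tuple_h)
    # primes up to k by trial division (k is tiny here)
    primes = [p for p in range(2, k + 1)
              if all(p % q for q in range(2, p))]
    for p in primes:
        residues = {h % p for h in tuple_h}
        if len(residues) == p:   # all residue classes mod p are covered
            return False
    return True
```

For example, $(0,2,4)$ is not admissible (its elements cover all classes mod 3), while $(0,4,6)$ is.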

The parity problem, however, is widely known to prevent approaches of this form (at least on their own) from getting a gap smaller than $6$. Of course, there are more technical obstructions as well, and even on the full Elliott–Halberstam conjecture one can only get a gap of 16 at present.

Note: the first result of "Bombieri-Friedlander-Iwaniec style", going beyond the Riemann Hypothesis but not as far as the exponent $4/7$, is due to Fouvry-Iwaniec ("Primes in arithmetic progressions", Acta Arith. 42 (1983), 197-218; the exponent there is $9/17$). All these results depend crucially on the spectral theory of automorphic forms to estimate sums of Kloosterman sums, and especially on the results of Deshouillers and Iwaniec in this direction.
–
Denis Chaperon de LauzièresMay 20 '13 at 7:11

1

@Denis, thanks for the correction. I've updated the text to include a reference to the Fouvry-Iwaniec result. Certainly one of the most fascinating aspects of Zhang's work is the (likely) application of deep estimates from the theory of Kloosterman sums and automorphic forms via the Bombieri-Friedlander-Fouvry-Iwaniec-type arguments. This, of course, should be compared to the Goldston-Pintz-Yildirim estimates that rely on Bombieri-Vinogradov, which in turn is derived from more classical estimates (such as Siegel-Walfisz, Vaughan's identity and the large sieve).
–
Mark LewkoMay 20 '13 at 8:13

4

It's interesting to speculate on how much the 70000000 will be reduced. In this regard, the table on page 12 of Goldston-Pintz-Yildirim is relevant. If one had level of distribution 4/7 with no strings attached, it seems one might get gaps of size 500 or so. To get gaps under 100 without some completely new idea one would have to go out to level of distribution nearly 2/3, i.e. double the improvement of BFI. I'd wager that getting down to 10000 or so is going to prove pretty difficult.
–
Ben GreenMay 20 '13 at 9:39

3

@Ben, I have no idea which approach Zhang follows; I haven't seen a copy of the paper. I see you're right that I wasn't looking at the table associated to the more refined argument. I'm curious to what extent the lengths of admissible tuples of a given size have been nailed down (note that the table on page 9 of the GPY paper seems to suggest the smallest tuple of size 421 isn't known). It might be possible to reduce the bound by a non-negligible factor just by computing a smaller admissible tuple of whatever size Zhang needs.
–
Mark LewkoMay 20 '13 at 9:55

What new approach did Yitang Zhang try & what did the experts miss in the first place?

Yes, it is a good question as to why (say) FI did not hit upon such a result, as the two major glue components, dispersion à la BFI and beating the square-root barrier akin to Friedlander/Iwaniec, are due to them. As Zhang puts it (last paragraph of Section 2, page 7), he saves a $\sqrt r$ factor in a Kloosterman-type bound, and can take $r$ to be a small power (well, I think he should take it larger than he does, after review).

Most everything else in the paper is rather "standard": the idea of restricting to integers/moduli which have no small/large factors is common, the dispersion method is in BFI, and maybe in Section 10 Zhang has to work a little to get his conditions on congruence classes to roll through. But the Type III (trilinear) estimate in Sections 13 and 14 is the heart of why he wins. Indeed, he handles each $d$ separately, rather than on average (as dispersion or the large sieve would). Conceptually, he again brings in a Weyl shift to copy a sum many times, then turns to Fourier techniques. The extra flexibility of factoring $d=qr$ is not lost, for he uses the Chinese Remainder Theorem to recombine, and in fact Ramanujan sums pop out too. See the bottom of page 52, at 14.13 and below, to the top line of page 53, where the Birch-Bombieri bound is applied. In $J(m_1,m_2)$, the part from $r$ is a Ramanujan sum, so the double $r$-sum is bounded in essence by $r$, not $r^{3/2}$. Again the preceding Fourier analysis has technique, but the idea is already bookish.

So I repeat: the main advance was not to apply Deligne to something like $Z(k;m_1,m_2)$ (below 14.12 on page 52) directly on the modulus $d$, but first to peel off a factor, small perhaps but useful, as $d=qr$, leaving $q$ to geometry and $r$ to Ramanujan after spinning out the $N_3'$ sum over this modulus. Well, that is my word; I don't claim I understand all of it, not even philosophically why this line should give the win.

PS. It could prove useful to have a rewrite of Sections 13 and 14, independent of the rest of the work, for he chooses parameters there for his purpose, which vitiates the generality. They are essentially independent of the whole.

Or again, look at $P_2$ near the bottom of page 48. This should be of size $d_1^{3/2}K^2$ if you apply direct reasoning, but Zhang first splits $n$ into residue classes modulo $r$. It is still mysterious to me why this avails, factoring $V$ into $W\cdot C$ in 14.7, the latter a Ramanujan sum.

Zhang wants to estimate $|\Delta(\gamma,d,c)|$ where $\gamma=\alpha\star\chi_{N_3}\star\chi_{N_2}\star\chi_{N_1}$ is a triple convolution with $N_1\ge N_2\ge N_3$ of decent size, say $N_3\ge x^{1/4-6\omega}$. The characteristic functions $\chi_N$ are for an interval say $N$ to $N+N/(\log N)^B$. Here we have the standard discrepancy $\Delta$ defined as
$$\Delta(\gamma,d,c)=\sum_{n\sim x\atop n\equiv c (d)}\gamma(n)
-{1\over\phi(d)}\sum_{n\sim x\atop (n,d)=1}\gamma(n).$$
Most importantly, $d=qr$ with $r$ convenient. The estimate $|\Delta(\gamma,d,c)|\ll x^{1-\kappa}/d$ is desired for some positive $\kappa$.
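As a sanity check on the normalization of $\Delta$, here is a toy numerical version with $\gamma$ simply the characteristic function of the primes in $(x,2x]$ rather than Zhang's triple convolution (my simplification, not from the paper):

```python
# Toy version of Delta(gamma, d, c) with gamma the indicator of primes in (x, 2x]:
#   Delta = sum_{x < n <= 2x, n = c (d)} gamma(n)
#           - (1/phi(d)) * sum_{x < n <= 2x, (n,d)=1} gamma(n)
from math import gcd, isqrt

def is_prime(n):
    return n > 1 and all(n % p for p in range(2, isqrt(n) + 1))

def delta(x, d, c):
    phi_d = sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)
    in_class = sum(1 for n in range(x + 1, 2 * x + 1)
                   if n % d == c and is_prime(n))
    coprime = sum(1 for n in range(x + 1, 2 * x + 1)
                  if gcd(n, d) == 1 and is_prime(n))
    return in_class - coprime / phi_d
```

Note that summing $\Delta$ over the residue classes coprime to $d$ gives exactly zero; the content of a level-of-distribution statement is that each individual $\Delta$ is small.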

Zhang replaces $\chi_{N_1}$ by a smooth approximant. This is standard; there are various versions of it in the field, the idea being that if a function goes from 0 to 1 over an interval of length $Y$, you can control its derivatives by powers of $Y$. Then one executes the inner sum in $\Delta$ over this variable, and replaces the sum over the smoothed function by its Fourier transform. This allows the main terms in $\Delta$ to cancel against the frequency-0 contribution, leaving one to deal with the higher frequencies. See the middle of page 45 and following. Copying,
$$\sum_{n\equiv c (d)}\gamma^\star(n)=\sum_{(m,d)=1}\alpha(m)
\sum_{n_3\sim N_3\atop (n_3,d)=1}\sum_{n_2\sim N_2\atop (n_2,d)=1}\sum_{mn_3n_2n_1\equiv c (d)}f(n_1),$$
and the inner sum is
$${1\over d}\sum_{|h|\le H} \hat f(h/d)e_d(-ch\overline{mn_3n_2})+O()$$
where $H=d/N_1$ essentially. The sum over $m$ handles itself; the inner part will yield the cancellation.

So Zhang's goal is to obtain the estimate
$$\sum_{1\le h\le H}\sum_{n_3\sim N_3\atop (n_3,d)=1}\sum_{n_2\sim N_2\atop(n_2,d)=1}\hat f(h/d)e_d(-ch\overline{mn_2n_3})\ll x^{1-\kappa}/M.$$
In fact, to use a Möbius inversion device that I omit below, we need this not just for $d$ near $\sqrt x$ (beyond the Bombieri-Vinogradov range), but also for divisors of $d$, so a wider range. When $d$ is small enough, or to say it another way $N_2$ large enough, a one-variable estimate suffices. Actually the trickiest case is when $N_1\sim N_2\sim N_3\sim x^{5/16-9\omega/2}$ (maybe this is not the exact $\omega$ multiplier, but $5/16$ is right) and $d\sim x^{5/12-9\omega}$, where neither a one-variable method nor the ensuing argument alone will do. But when $d\le N_1$ the estimate for the $H$-sum is empty, and when $d^{3/2}N_3\ll x^{1-\kappa}/M$ a one-variable bound on $n_2$ suffices. I digress. In fact, as noted per (2.4) in Friedlander/Iwaniec, a two-variable bound from Deligne can sometimes be applied (the inner double sum then bounded by $\sqrt{d^2}=d$). Zhang does not use this extra input; I find it gives a suitable bound when $d^2\ll x^{1-\kappa}/M$.

Back to the main story. After an application of Möbius to insert a coprimality condition on the frequency variable $h$, Zhang then uses the idea of the Weyl shift. It is key that he shifts by multiples of $r$, the convenient factor of $d$. The idea of the Weyl shift is to copy a sum many times, each copy partially shifted by much less than its length. In the above, Zhang replaces $n_2$ by $n_2+hkr$ for $k$ up to some bound $K$, and then computes that the difference between the sum and the $hkr$-shifted sum is small (provided $K$ is small enough, of course). Then one wants to bound the average over the shifts
$$N(d,k)=\sum_{1\le h\le H\atop (h,d)=1}\sum_{n_3\sim N_3\atop (n_3,d)=1}\sum_{n_2\sim N_2\atop(n_2+hkr,d)=1}\hat f(h/d)e_d(-ch\overline{m(n_2+hkr)n_3}).$$
I simplified the formula at the top of page 47 a bit, not including the Möbius step; the above is not literally correct, but gives a self-contained idea of how it goes.
To state it again: we want to bound $N(d,0)$; we know that $N(d,k)$ is close to $N(d,0)$ for small $k$; and we will establish a bound on average for ${1\over K}\sum_{k\sim K} N(d,k)$.
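The shifting-and-averaging idea itself can be seen in a toy computation: shifting a length-$N$ sum by $k\le K\ll N$ changes it only by $O(K)$ boundary terms, so bounding the average over shifts bounds the original. A sketch with an arbitrary bounded phase (nothing here is specific to Zhang's sums; in his setting the shifts are by multiples of $r$):

```python
# Weyl shift toy model: copy a sum of length N many times, each copy shifted
# by k <= K << N. Each shift loses at most 2k terms of modulus <= 1, so
# |S - (average of shifted copies)| <= 2K, and it suffices to bound the average,
# where extra cancellation over k becomes available.
import cmath, math

def e(x):
    return cmath.exp(2j * math.pi * x)

N, K = 10000, 50
alpha = math.sqrt(2)                   # arbitrary irrational phase
g = lambda n: e(alpha * n * n)         # any bounded sequence works here

S = sum(g(n) for n in range(1, N + 1))
shifted_avg = sum(g(n + k) for k in range(1, K + 1)
                  for n in range(1, N + 1)) / K
```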

From Cauchy, and substituting $l\equiv \bar hn_2$ modulo $d$, one is left to estimate ($P_2$ of bottom page 48)
$$\sum_{l (d)}|\sum_{k\sim K\atop (l+kr,d)=1}\sum_{n\sim N_3\atop (n,d)=1}
e_d(b\overline{(l+kr)n})|^2.$$
Staring at this: if you expand, the $l$-sum is over $d$ and the $n$-sums are essentially incomplete sums modulo the same, so one expects $d^{3/2}K^2$. But the shift by a multiple of $r$ will allow us to win. The idea is that $N_3$ exceeds $r$ by enough to allow splitting $n=rn'+s$ over residue classes modulo $r$, and for this to be efficient. Now normally this should not gain; see the bottom of page 49 with 14.6: Zhang wants to estimate
$$\sum_{k_1\sim K}\sum_{k_2\sim K}\sum_{s_1\le r\atop (s_1,r)=1}\sum_{s_2\le r\atop (s_2,r)=1}\sum_{n_1\sim N_3/r}\sum_{n_2\sim N_3/r}\sum_{l (d)} e_d(b\overline{l(n_1r+s_1)}-b\overline{(l+kr)(n_2r+s_2)}).$$
Again one does not expect to win, for although the $e_d$ will factor over $d=qr$ into $e_q()e_r()$, there will still be three sums over a variable modulo $r$, and $r^{3/2}$ would appear. But since the Weyl shift was by a multiple of $r$, upon unwinding the CRT the triple sum modulo $r$ is really a double sum of a Ramanujan sum. That is why I typed out the innards of $e_d$ above.
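The collapse to a Ramanujan sum is classical: a complete sum $\sum_{(a,r)=1}e_r(an)$ equals $c_r(n)=\mu(r/g)\,\phi(r)/\phi(r/g)$ with $g=\gcd(n,r)$ (Hölder's identity), so it has size $O(g)$ rather than the square-root size a generic complete exponential sum would have. A quick numerical check of the identity:

```python
# Ramanujan sum c_r(n) two ways: as the exponential sum over (a,r)=1, and
# via Hoelder's closed form mu(r/g) * phi(r) / phi(r/g), g = gcd(n, r).
import cmath, math
from math import gcd

def e(x):
    return cmath.exp(2j * math.pi * x)

def phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def mu(n):
    """Moebius function by trial division."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0            # square factor
            result = -result
        else:
            p += 1
    return -result if n > 1 else result

def ramanujan_direct(r, n):
    return sum(e(a * n / r) for a in range(1, r + 1) if gcd(a, r) == 1)

def ramanujan_closed(r, n):
    g = gcd(n, r)
    return mu(r // g) * phi(r) // phi(r // g)   # exact integer division
```

In particular $|c_r(n)|\le\gcd(n,r)$, which is the "bounded Ramanujan piece" that beats the $r^{3/2}$ expectation.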

Working through the technicalities with Fourier transforms, this gives the result as needed. The key is that $r$ can be taken as a small power of $x$, and we win by $\sqrt r$, or really the fourth root after Cauchy, but this is enough.

I have to say that, now that I have studied the paper in some detail, this seems to me to be an extraordinarily accurate (if perhaps difficult to parse on a first reading) answer to the original question.
–
Ben GreenJun 9 '13 at 17:20

5

I might say, Zhang could call his paper "A new bound for an incomplete Kloosterman sum to composite moduli, and bounded prime gaps". Kowalski noted (and was kinda surprised) the "higher" Kloostermania such as Kuznetsov trace formula and sums of Kloosterman sums did not appear (as in BFI). To state the "philosophy" of Zhang's new element in one (long) sentence: When you Weyl shift a suitable incomplete Kloosterman sum to a composite modulus by multiples of a (small) factor of said modulus, you beat Deligne since the shifts modulo this factor induce an essentially bounded Ramanujan piece.
–
v08ltuJun 9 '13 at 20:55

1

To be complete, the other "new" elements by Zhang were reducing to smoothed moduli (already in Motohashi/Pintz, and certainly standard in the field), and the handling (Section 10) of more than one residue class per modulus, via a multiplicative system. Again, as Kowalski said, it was not immediately clear to experts how Zhang would resolve the second issue, as higher Kloostermania is almost forced to consider but a solitary fixed residue. But that is unneeded here (and indeed, is one aspect that keeps the paper readable to a wider audience -- assuming you import Deligne's bounds, of course).
–
v08ltuJun 9 '13 at 21:04

1

And by the way, I think he should have called his paper "On a new bound for an incomplete Kloosterman sum to composite moduli, with applications".
–
Ben GreenJun 9 '13 at 21:29

1

I may try and read BFI. If you do it just in some critical range, rather than in the extreme generality they do with unspecified parameters $C, D, H, R, N,\dots$, my guess is it will boil down to the same basic ingredients in a different order, once one has applied a suitable decomposition into bilinear forms: Cauchy, Weyl shift, completion of sums/Fourier expansion. What else is there?
–
Ben GreenJun 10 '13 at 7:54

To summarize very briefly, he is building on work by Goldston, Pintz, and Yıldırım. The article describes his innovation as "... to use not the GPY sieve but a modified version of it, in which the sieve filters not by every number, but only by numbers that have no large prime factors."
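For what "no large prime factors" means concretely: a modulus is $y$-smooth if every prime factor is at most $y$. A trial-division test (illustrative of the notion only, not of the sieve modification itself):

```python
def is_smooth(n, y):
    """True if every prime factor of n is <= y (i.e. n is y-smooth)."""
    p = 2
    while p * p <= n:
        while n % p == 0:       # strip each prime factor p <= sqrt(n)
            n //= p
        p += 1
    return n <= y               # what remains is 1 or a single large prime

# e.g. the 5-smooth numbers below 100
smooth_moduli = [n for n in range(1, 100) if is_smooth(n, 5)]
```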

A lot of smart people were working very hard on this problem, and surely the overview article makes it sound a lot easier than it actually was! But the article does make one feel one understands a little of the overview or "philosophy" of the proof.

I'll donate any points I get from this observation (which I stole from someone else anyway) to a good cause. An additional point to be made, now that the preprint is available, is that the Bombieri-Fouvry-Friedlander-Iwaniec type results on level of distribution depend on an estimate for sums of Kloosterman sums that requires (via a lemma of Birch and Bombieri) Deligne's work on the Weil Conjectures. It seems to me (though I'm certainly not an expert) that these estimates are not accessible by the more elementary methods such as Dwork/Stepanov.
–
Ben GreenMay 21 '13 at 16:19

9

@User32240, I'm sure Zhang is well aware of this, as well as many other paths towards refining the bound. Clearly his priority was not to squeeze the best possible results out of his methods, but rather to present the argument in a transparent manner. My comment was only intended to emphasize this point.
–
Mark LewkoMay 22 '13 at 8:39

38

Slightly off-topic, but I find it disappointing that such a landmark piece of human knowledge seems to be available only behind a paywall.
–
Pablo ShmerkinMay 22 '13 at 12:34