Another thing to discuss is an invitation to Polymath8 to write a feature article (up to 8000 words or 15 pages) for the Newsletter of the European Mathematical Society on our experiences with this project. It is perhaps premature to actually start writing this article before the main research paper is finalised, but we can at least plan how to write such an article. One suggestion, proposed by Emmanuel, is to have individual participants each contribute a brief account of their interaction with the project, which we would compile together with some additional text summarising the project as a whole (and maybe some speculation on lessons we can apply to future polymath projects). Certainly I plan to have a separate blog post collecting feedback on this project once the main writing is done.


I’m done with the short Section 4.3. About the only change worth noting is that I replaced the preprint reference for the second Bessel identity with a reference to Watson’s book (according to Watson, the formula is due to Lommel, and goes back to 1879, and a special case is already in Fourier…)

Regarding Figure 3 in Section 3, I think it would be better to use H rather than H(k) in the label for the vertical axis, since we have defined H(k) as the minimal diameter of an admissible k-tuple and the plot is using upper bounds H on H(k). I would change the captions for both figures 2 and 3 to read “…best known upper bound H for H(k)…”.

I’ve done some (small) changes to Sections 3.1 and 3.8. The changes in Section 3.1 might be a mere reflection of my own confusion, so feel free to undo them if you want. I was a bit confused about the statement that if , then the expected number of missing residue classes is . E.g. if (which is also ) then this is not true. But conversely I agree that the minimal factor one has to include for the conclusion to become true is , so that’s probably what was meant.

This was one of the points on which I also didn’t feel I understood the underlying argument, so my write-up was certainly shaky. I won’t touch Section 3 soon, so you can change it as needed, as far as I am concerned.

An ‘epsilon’: In the abstract “the q-van der Corput A-process of Heath-Brown and Ringrose” should probably be “the q-van der Corput A-process of Heath-Brown and Graham-Ringrose” as in the rest of the paper.

5. Section 3.4, first paragraph (currently page 30). The last sentence is confusing, as “Hensley and Richards” are mentioned both directly before the parenthetical, and directly afterwards.

6. In the same paragraph, it says this “leads to the asymptotic bound…” but then later better upper bounds are obtained. I recommend changing “asymptotic” to “upper”.

7. Section 3.5. Change “sieve the the residue” by removing one of the the’s.

8. Section 3.6. Change “is to check is whether…” by removing the second “is”

9. Next paragraph, change “which has all one…” to “which has all but one…”

10. Paragraph before Section 3.6.1: change “sketch three of them” to “sketch four of them”

11. Section 3.6.1, change “residue class $a_p$ of $a$ modulo $p$…” to “residue class $a_p$ in $\Z/p\Z$…”. [It may be confusing to the reader that the notation is being changed here. In sections 1-2, this residue class would be written as “a_p (p)”.]

12. Last paragraph of 3.6.1. An inequality is spilling off the line. Also, instead of $a_1 \cup \cdots \cup_a n$ it should be $a_1 \cup \cdots \cup a_n$. [The $a$ shouldn’t be in the subscript.]

13. Section 3.7. A space is missing in “ofnarrow”

14. Section 3.8. The second paragraph is a parenthetical, with a parenthetical inside. I recommend removing one of the levels of parens.

15. Four paragraphs later, change “it is better to use directly…” to “it is better to directly use…”

16. Later in the same paragraph, it says “where the optimal value is roughly…” It may be helpful to write “where the optimal value of $Q$ is roughly…”

17. Next paragraph, one sentence ends “so that $M(d)\leq k-1$ if and only if $H(k)\geq d$.” I recommend making this phrase its own sentence.

Great. I see that xfxie changed the label on the vertical axis in Figure 3 per my suggestion above (thanks!), and I just changed the captions to match. I won’t have time to work on it today, so feel free to make further edits if anything comes up. I’ll plan to work on narrow.tex on Friday and over the weekend.

where is a cutoff function. This generalises the GPY sieve (which is basically the case when ), but the increased flexibility of the multi-dimensional cutoff function f seems to give better results in the asymptotic regime when is very large. (It may be that Maynard has done something very similar at the other end of the spectrum when is very small, assuming EH – I haven’t gotten any more details on that yet though.) However it is a bit difficult to compute explicitly with these multidimensional cutoffs, so I’m not sure that these methods can help us for, say, , since 632-dimensional integrals are not exactly easy to work with. But this may be another thing to consider for a potential “Polymath 8b” sequel to this current project. (Certainly there is no reason why the current project should be regarded as the definitive last word on prime gaps for all eternity!)

I was just thinking that the statement as written sounds too definitive (referring to the strongest possible results, rather than for example the strongest known results). Indeed, it seems to be already contradicted by the discussion of possible new sieves in Section 11 paragraph iii. Apologies if I’m misunderstanding.

I’ve been in touch with James Maynard. He has indeed (provisionally) obtained B[12] on EH, basically by performing a computer optimisation on the multidimensional cutoff functions discussed above, and also is provisionally reporting a result B[H] for a small value of H (perhaps as small as H=700) using Bombieri-Vinogradov, and also intervals of that size with three primes (rather than two) assuming EH. But this is all numerical work and is subject to confirmation. The details are not yet written up and probably won’t be for a few more weeks yet. So the Polymath8 results may soon be overtaken by events; but the results in our paper have independent interest beyond the primary goal of getting B[H] for H as small as possible. I’ve invited Maynard to contribute to a future “Polymath8b” project in which H is optimised further, but he requested some time to think about it.

Gergely: given how diligently Terry seems to read everyone’s posts here, I’d be awfully surprised if his failure to reply to Emmanuel does not imply that he meant (essentially, anyway) what he wrote. But regardless, it’ll be extremely exciting to see which elements of polymath8 further improve k0 over what Maynard has (provisionally, of course) already achieved. Will we see a two-digit k0 this year? It’s beginning to be hard to bet against that. As one of the non-specialists who have been in awe of what you experts have been doing here, I feel that one of the most illuminating aspects has been the ebb and flow of perceived plausibility of progress in the various available avenues.

A related problem is to find a “good structure” for the generalized f in terms of several one-dim. functions for which the optimization may lead to a system of ODE (rather than PDE for the most general f.)

I am continuing in section 4. Here’s a point that puzzles me: page 52 of the current newgap.pdf, we write (between (4.37) and (4.38)): “so by repeating the analysis in Section 4.2 to deal with error terms, we conclude…” How does this work?

Compared with Section 4.2, there is a factor F(d)-\tilde{F}(d), and if I understood right, the point in Section 4.2 was to get a negligible error by having a difference of two sums with the same asymptotic behavior. Here we do not have an asymptotic formula for the difference F(d)-\tilde{F}(d), only estimates, so I don’t see how to cancel the main terms to get a negligible contribution from the error term for F(d). Or did I misunderstand something in the treatment of Section 4.2??

It seems to me that Section 4.2 can be done asymptotically (by doing the Möbius inversion asymptotically instead of isolating the contribution of the divisor 1 and saying that the remainder is O(d/varphi(d)-1), as in the last line of page 48), using a variant of Lemma 4.2 that has a coprimality condition inserted (i.e., that is uniform with respect to such changes of the multiplicative function of Lemma 4.2). This might even give a slightly nicer-looking argument.

I’ve added a lemma 4.10 which encapsulates the use of the Mertens formula for the sums over primes which occur twice in this section and two times more in the next. The location might not be the best: it’s analogous to Lemma 4.2, and maybe would be better around there?

Apart from that, there are only cosmetic changes and some more details for certain computations.

It appears that my version of section 4 may soon be out of date. I’ve only gotten through the first five pages so far, so maybe I’ll wait a day or two and download the new version. Anyway, here are my suggestions (hopefully the numbering still matches):

1. First paragraph of section 4, it says “All but the first one can be used…” I recommend changing “one” to “of these sieves” [to clarify you are talking about the sieves, and not the previous list of Theorems].

2. The second offset equation in the proof of Lemma 4.1, reads . This is the negative of the correct lower bound, and the \alpha,\beta summands should be switched.

3. In the next paragraph, it says that is set equal to an elementary Selberg sieve. It might sound better to say something like is a weight/counting function in the sieve (rather than equal to the [whole] sieve).

4. In equation (4.7), the summation is over . In responses to my suggestions for Section 1, it was stated that the notation is used only for finite intervals, but it might be good to rethink that since this notation is used often in this section.

5. In that same equation, in the subscript, we also have the restriction . I cannot seem to find the definition of P(n) anywhere in the paper. I looked pretty thoroughly, but may have missed it. [If I recall correctly, this is the place where the admissible tuples come in, and P(n) does NOT equal the product of primes up to n, which is how P(z) is usually defined in sieves.]

6. In the proof of Lemma 4.2, the second sentence reads “By rescaling we may assume that is supported on .” I can’t seem to find any place that this is ever used, so you may want to delete this sentence.

7. Later in the proof, in the second offset equation after (4.12), in one of the summations, the subscript is . The ending “;d” should be removed.

8. Throughout Lemma 4.2, and its proof, there seems to be a weird extra space between and the left parenthesis, when the left parenthesis is larger. This spacing isn’t especially awful, except in the offset equation I mentioned in point 7, where it almost looks like you have .

[I also noticed that the infinity signs in the paper are not symmetric–the left side is smaller than the right side, which is somewhat odd.]

9. Later in the proof of Lemma 4.2, it says “and hence by summation by parts…” followed by an offset equation. I’m sad to admit that when I was reading this part it took me nearly half an hour to figure out why this equality holds. Part of the problem for me was that I didn’t believe that a summation by parts would result in an integral involving g. Finally, when I decided to just try the actual integration by parts, I realized where the g came from– we are doing a chain rule on , and the g’ part gets absorbed into the error terms.

I don’t know that anything needs to be done here. If I had just believed the statement that I should do summation by parts, it would have worked out cleanly and quickly. But maybe a statement about simplifying or absorbing terms into the errors would be helpful.

10. In my version of the paper, bottom of page 42, offset equation directly before (4.13). In the subscript it has — where in all previous summations the interval is . It might be helpful to state explicitly that we let at this point.

11. Sentence below (4.13): it says “by another application of Lemma 4.2 the previous expression may be upper bounded…”. It isn’t clear at first whether the “previous expression” is (4.13), or the offset equation directly before that [I believe you meant the latter].

Later in the same sentence, it has . The extra parentheses can be removed. Also, is there any specific reason for the flip from to the reciprocal?

Later in the same sentence, it says “which is negligible for (4.2) by (4.5),(4.1).” It may be better to say “…by (4.1), (4.5), and (4.6).”

12. In the subscript of the summation in (4.15), I recommend changing to . [Add the parens.]

13. In the second offset equation after (4.17), the RHS is . Should it be |a_d|^2 rather than |a_d|?

Indeed, this section is undergoing changes as I continue reading through, so a number of corrections have already been made or are not directly relevant (e.g., the polynomial P(n) that was missing is now defined…)

I have just finished re-reading Lemma 4.11 (properties of dense divisibility). Apart from a few typos, I just expanded the shorter proofs — since I was checking the details anyway. It may be possible to make it more concise again (obviously, checking the details for oneself helps a lot in clarifying the concept). I also used references to the d.d. property of smooth numbers (which is part of the lemma) instead of speaking of “greedy algorithm”.

I finished with section 4.5. Nothing special to mention, except that I moved the final remark of that section (on quantitative results) to the end of Section 4.1, since it seems to fit better at the end of the general discussion of the basic ideas of the GPY method. (I uniformized the notation of the parameters with the changes I had made for typographical reasons in subtheorems.tex)

I read the final optimizing section of Section 4. The changes I made are cosmetic, since I don’t know anything about this type of optimization methods (for the same reason, this should be checked again to make sure I didn’t inadvertently introduce a problem). I’d suggest adding a standard reference at the beginning, that might be useful to orient readers like me.

I downloaded the scripts K0Finder.zip mentioned in the footnote, but I haven’t succeeded in making them work on my computer. From just looking at the shell files, it seems that the DEPSO instance does not contain the required problem file (?) (at least running OptSolver.sh leads to an error of this type). Also I didn’t see where/how the Maple script is created.

Just tested the K0Finder.zip package on another machine. It seems I introduced some errors while trying to make the package clearer. Sorry for that. The package (and README) is now updated.

Until now, it has only been tested on Mac and Ubuntu Linux. Please let me know if there are any problems making it work on your machine.

BTW: There are two examples in the directory “test”. The components for the Maple script will not be deleted if you comment out the lines in exec/clear.sh. The original files are located in the directory “inst” (for instances of c_varpi,c_delta, i) and “opt” (for optimization models).

On a related note, I am trying to reproduce the values in Table 4 (for k_0=632) using Pari/GP. Everything works fine, except that I get a much smaller value of kappa_3. Here are some intermediate values that I got, which are not in Table 4, so that we can figure out where the computation goes wrong:

It’s true that this notation is currently being used in two sections (Section 5 and Section 11.1); but I am thinking in fact that we may end up deferring Section 11 (which is rapidly becoming obsolete) to a second paper (perhaps renaming the current paper as “A new bound for gaps between primes, I”).

Hi, in Table 5 we include a number of upper bounds on obtained by various methods. The first row “k primes past k” refers to the tuple obtained by taking the first primes greater than . The second row “Zhang sieve” refers to the tuple corresponding to the minimal for which is admissible. Is there a reason why we call this the Zhang sieve?

This follows the wiki write-up, but I think it would make more sense to call it the sieve of Eratosthenes, per the description on p. 28 (in the paragraph beginning “More practically”). I believe one can likely prove that this sieve will always yield tuples of the form (cf. footnote 10), but in any case it certainly applies to all the tuples listed in Table 5.
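For readers following along without the paper at hand, here is a minimal sketch of the two constructions being discussed. It assumes (my reading, not a quotation from the paper) that “k primes past k” means the first k primes greater than k, and that the “Zhang sieve” row means the block of k consecutive primes p_{m+1}, …, p_{m+k} with m minimal such that the block is admissible. The function names are illustrative only and do not correspond to any Polymath8 code.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, f in enumerate(flags) if f]


def is_admissible(h):
    """A k-tuple is admissible if it misses a residue class mod every prime.

    Only primes p <= k need checking: k elements cannot cover p > k classes.
    """
    k = len(h)
    for p in primes_up_to(k):
        if len({x % p for x in h}) == p:
            return False
    return True


def primes_past_k(k):
    """First k primes greater than k (assumed meaning of 'k primes past k')."""
    limit = 64
    while True:
        ps = [p for p in primes_up_to(limit) if p > k]
        if len(ps) >= k:
            return ps[:k]
        limit *= 2


def eratosthenes_tuple(k):
    """Block p_{m+1}, ..., p_{m+k} of consecutive primes, m minimal with the
    block admissible (assumed meaning of the 'Zhang sieve' row)."""
    limit = 64
    while True:
        ps = primes_up_to(limit)
        # Only complete blocks are tested, so the first hit has minimal m.
        for m in range(len(ps) - k + 1):
            block = ps[m : m + k]
            if is_admissible(block):
                return block
        limit *= 2


def diameter(h):
    return max(h) - min(h)


if __name__ == "__main__":
    for k in (5, 10, 50):
        a, b = primes_past_k(k), eratosthenes_tuple(k)
        print(k, diameter(a), diameter(b))
```

The first construction is always admissible, since no prime p ≤ k divides any element and so the class 0 mod p is missed; the second starts the block earlier whenever admissibility allows it.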

Yes, the Eratosthenes sieve would make more sense as terminology (and is broadly consistent with the usage in Gordon-Rodemich, http://www.ccrwest.org/gordon/ants.pdf ). I think the name Zhang sieve originated very early in the Polymath project, before we knew of the relevant literature, and needed a name to distinguish the sieve arising from blocks of primes from Hensley-Richards type sieves.

I didn’t notice this when reading Section 3; I think “Zhang’s sieve” is only used in the table, however, so it’s probably best to replace it by something else, and add a few words somewhere at the beginning to indicate this terminology.

P. 64, l. -2: “” -> “” (this should be changed in quite a few places).
P. 65, l. -1: “” -> “” (this should be changed in quite a few places).

Furthermore, the aesthetics of the tables can be improved a bit. I will gladly give my (second) suggestion on how to type them properly, when the proof-reading is all over and no further changes to the tables will be needed.

I implemented a number of corrections, largely to Section 10 in the main folder, that were supplied to me by Paul Nelson, who also noted that he had gone carefully through this section of the paper. Now that all the sections of the paper have been at least double-checked (though not by the same sets of authors), I feel comfortable with removing the question mark on our main result of H=4,670 from the wiki.

Incidentally, when I compiled newgap in the main folder, my local copy of LaTeX (which is MiKTeX) failed to find the images h-plot, h-plot-rel, plot-pi, and plotk0-2, but this is presumably a local issue since I can see those files in the directory.

I’d like to record here a small refinement to the asymptotics in Section 3.2 that Wouter pointed out (and which I just verified). In the progress report it was noted that taking the first k primes greater than k gives the bound

.

But if you carefully compute using the bounds in (3.1) that follow from a strong form of the Prime Number Theorem, you can actually show that taking the first k primes greater than k yields the tighter bound

,

which is the same thing you get for the tuple for any (even ). There is a small asymptotic improvement from taking , but it is hidden inside the term.

1. At the end of footnote 14 (given on page 39): the references can be grouped together if you want.

2. Second offset equation in the proof of Lemma 4.1: the two summands inside the curly braces are in reverse order. It should be .

3. Lemma 4.2.

3a. In the statement of the lemma, the second line of the first offset equation has the condition “ and .” To make this match the corresponding restrictions on the next page for , and to make the three cases be mutually disjoint, I recommend changing to j>1.

3b. In the next paragraph, when it says “this formula holds uniformly for any…” it might be helpful to the reader to add “in ” after “uniformly.”

3c. The next sentence starts “In particular,…” This currently reads as though the “in particular” statement follows from the sentence directly beforehand about uniformness. But I think the “in particular” statement just follows directly from (4.11). Maybe some sentences could be moved around to make this clearer.

3d. In the statement of the lemma, is assumed to be *fixed*. However, this is never explicitly assumed in the proof. If it is a necessary restriction it should probably be assumed in the first sentence of the proof, and then when that assumption is used it may be helpful to the reader to point it out.

Further, the sentence about the equation holding “uniformly for uniformly integrable families” leaves open the possibility that the family being used is *NOT* fixed (with respect to x). Indeed, I believe that this is forced, since the translates are not fixed, since depends on .

Maybe a word of explanation on these points would be helpful?

3e. In the first offset equation in the proof of the lemma, I recommend changing to . [Add the integration index.]

3f. Next paragraph, change “We use prove the…” by removing “use”

3g. In the offset equation directly following the phrase “and in turn the inner sum transforms into…”, on the RHS, the parentheses are smaller than on the LHS.

3h. Two offset equations later, on the RHS there is a formula using curly braces, which are smaller than some of the parentheses used inside. I recommend using \left\{ … \right\}.

3i. In that same offset equation, in the little-o term, in the summation subscript, I recommend changing just to . [The restriction is unnecessary, and the change then matches what is done on the next line.]

3j. The end of the next sentence, which reads “… and so .” doesn’t seem to be necessary (but I may be missing something).

4. Fourth line of the proof of Proposition 4.3: change “are pairwise non-congruence modulo” by replacing “congruence” with “congruent”.

5. Last page of the proof of Proposition 4.3, starting at “Following Selberg…” through the end of the proof: I personally found the original argument, using a sum over the variable , followed by Mobius inversion, to be easier to follow. I’m not an expert on these things, so take that for what it’s worth. [By the way, I did find many of the other changes very helpful!]

6. Two offset equations after 4.18: I had a great deal of trouble trying to understand the claimed equality. Here is what I think needs to be done, but again maybe I’m missing something.

First, on the RHS on the inner sum, I believe that the third line of the subscript should be .

Second, even with this change I don’t believe that this is a true equality. For instance, if is prime, and is a power of , then this should contribute a single term to the summation on the LHS (from ) that is not accounted for on the RHS. This is easily fixed by adding to the RHS. Perhaps a short word on this issue should be mentioned?

7. In equation (4.19), in the second line of the subscript of the inner sum, I believe you want .

8. A few lines later, change “arihmetic” to “arithmetic”.

I hope these are helpful to you. The paper is looking very good from what I’ve seen.

I don’t want to interfere with Emmanuel’s editing, so I won’t edit in these corrections directly, but just wanted to comment on a few of them:

Regarding point 3b, you are correct that for our application, g is not fixed, but instead lies in a fixed equicontinuous family of functions, namely the translates of a fixed function. I think we can address this by strengthening the “Moreover” claim of the lemma to something like “Moreover, this formula holds uniformly when g is not necessarily fixed, but instead varies within a fixed equicontinuous family of functions”. One would also need to slightly modify the first line of the proof of this lemma accordingly.

Regarding point 3j, I think this estimate is needed to control the terms of a few lines previously. (But perhaps this could be stated explicitly.)

Regarding point 6: you’re right that should be , but I don’t see the problem with any missing terms here. The identity has nothing to do with the nature of , it is simply asserting that if n is an integer, then holds if and only if there is an such that , and furthermore this a is unique. Again, maybe one could state this more explicitly since it seems to be causing confusion.

On point 6, consider the polynomial . If we evaluate at , then we have . This is nearly the same as , except we are missing a factor . So, it appears that can divide and not divide , in the case that has a common factor with . [My earlier comment about the nature of was only meant to point out that this trivial exception can actually contribute to the summation, for arbitrarily large .] In particular, if is prime, and is (say) a power of , then we have , but we don’t have that for any .

Ah, I see now, thanks. Well caught! In any case we only need a lower bound here, so we can replace the = sign here by a sign (and note that it is almost an equality), after modifying some of the later displays on this page accordingly.

[Edit: actually, a lower bound here is not enough, due to the presence of the Mobius function elsewhere in the calculation. So the O(1) fix (or more precisely, ) is probably the simplest.]

We apply this equation to estimate . As g is supported on [0,1], d ranges up to , but then we can extend the range of integration on g up to infinity, again because of the support.

But actually, now that I think about it, we could alternatively remove the restriction from the first display (and have the integration range from 0 to infinity); but we also need to add a hypothesis that the support of is bounded uniformly in x (in addition to equicontinuity).

Do you know if the 700 bound is coming from a triple of primes inside an admissible k-tuple (and if so, what value of k he is using?). Even without knowing any details of his proof, if we know the parameters we might be able to sharpen his result by enumerating all the admissible k-tuples (at this point I have provably complete lists for all k’s with H(k) up to about 2000).
I note that there is no k for which H(k) is exactly 700 or 1400, so I’d be curious to know where the 700 is coming from.
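To make the quantity H(k) concrete, here is a toy brute-force sketch that computes the minimal diameter of an admissible k-tuple for very small k. This is only an illustration of the definition, not the method used to produce the complete lists mentioned above (exhaustive search of this naive kind is hopeless anywhere near diameter 700); the function names are made up for the example.

```python
from itertools import combinations


def small_primes(n):
    """All primes <= n by trial division (n is tiny here)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def is_admissible(h):
    """True if h misses at least one residue class modulo every prime.

    Primes larger than len(h) can never be fully covered, so they are skipped.
    """
    for p in small_primes(len(h)):
        if len({x % p for x in h}) == p:
            return False
    return True


def minimal_diameter(k):
    """Return (H(k), witness): least diameter of an admissible k-tuple.

    Admissibility is translation invariant, so tuples are normalised to
    start at 0 and end at the candidate diameter d.
    """
    if k == 1:
        return 0, (0,)
    d = k - 1
    while True:
        for middle in combinations(range(1, d), k - 2):
            h = (0, *middle, d)
            if is_admissible(h):
                return d, h
        d += 1


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, *minimal_diameter(k))
```

For k = 2, …, 6 this gives the diameters 2, 6, 8, 12, 16, matching the familiar twin, triplet, quadruplet, quintuplet and sextuplet patterns.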

Sharpening his result (even trivially) might give him an incentive to join forces with us :).

János Pintz (who is in Oberwolfach at the moment) told me that Maynard can take and this leads to . I understand you have a better for this . Valentin Blomer (who is also in Oberwolfach at the moment) told me that there is hope that any weak Bombieri-Vinogradov leads to bounded gaps between primes. All this is very exciting indeed. It is always a good sign when a proof simplifies and the result becomes stronger simultaneously.

The link you give is to a *test* version of the admissible tuples database (as indicated in the bold note at the top of the page, it was intentionally populated with sub-optimal tuples so that it would be easy for people to submit improvements). The correct link for the admissible tuples database is:

Yes, I believe by using the sieve with a multidimensional cutoff, one can deduce bounded gaps from any ; similarly, for any fixed , one should be able to get bounded gaps with any number of primes in it (not just 2). I’ve talked to Maynard a bit about this and have made some supporting calculations, but as Maynard has priority, it is probably not appropriate to discuss them in detail until after he has a preprint (or if he decides to join our efforts). [In any case, we have a paper to finish!]

From what I understand (without having seen Maynard’s work), the limit comes from the multi-dimensionality of the analysis (perhaps optimizations of a 5-dimensional sieve?). To get , one would need to work with , and in that case the dimension of the sieve would just be 2. In that case, using “more primes” would be of no use.

Dear Terry, I agree with you. I only shared information that I learned (through Pintz and Blomer) from public sources (namely a lecture in Oberwolfach). I hope that PolyMath8 will continue, and the sequel will be equally exciting. If Emmanuel and others have finished updating the paper, I will try to go through it quickly. It does not make sense to me to do this while the paper is constantly changing, and I also got busy with another project.

I was busy the last two days (I will give a lecture tomorrow concerning conductors of linear transforms of trace functions) but I will continue reading the paper carefully, probably finishing Section 5 by Saturday.

I then hope that the next sections will go a bit faster, but in any case Sections 1 to 4 can be re-read now I believe (one should probably update the proof of Lemma 4.2 at some point, since it is rather condensed at the moment). I will move the ek-subfolder version of Section 4 to the main folder tomorrow.

I got a cold on Thursday so I only finished going through heathbrown.tex today (and I didn’t yet review the corrections to gpy.tex suggested above — I will do it tomorrow).

Someone pointed out that the definition of discrepancy in Section 2 was not quite the usual one (used by BFFI and then by Zhang), because the condition (n,q)=1 was omitted in the “main term”. I think this was a typo (since other sections use the standard definition), and I corrected it. I’ll make sure as I continue reading that we use the notation consistently.

Apart from that, I only corrected typos and changed wording in minor ways (the proof of Lemma 5.1 is maybe very slightly simpler), and added a second reference for Siegel-Walfisz for the Möbius function, since I think that the paper of Siebert is probably impossible to find.

I’ve made some minor edits to the abstract and introduction, to emphasise more the distribution theorem on primes in arithmetic progressions we have that improves upon Zhang’s theorem (and also does not necessarily rely on Deligne’s work), given that this is the portion of our work which is not directly superseded by Maynard’s newest results. (I also made a small mention of Maynard’s work in the intro in a footnote. We should probably revisit the question of how to refer to his work, and/or to any future paper by us, when the paper is closer to being done. James tells me that he may have a preprint on his work ready within a week or two, so in particular we would be able to directly refer to his paper by the time we’re done with the proofreading.)

I have also been thinking about the best way to present the paper once Maynard’s work is public and fully checked. Indeed an essential contribution of Polymath8 is, in my mind, the new results on the distribution of primes in arithmetic progressions to large moduli (i.e., beyond the Riemann Hypothesis). I think that the new variants of the exponent of distribution (with densely divisible moduli) that we define, where one can go significantly beyond the 1/2 bound, are certainly going to be useful in other applications. I also think that (building on Zhang’s work) we bring a potentially critical new insight to this question: the importance and potential of relying on Deligne’s work in its strongest forms. In some sense, this replaces the spectral theory of automorphic forms, and happens to bring extra flexibility.

I have finished editing Section 3. The only significant changes I made were to section 3.1 where I fixed some errors in the fast admissibility test and tried to improve the clarity of the description.

I started reviewing exponential.tex and I am almost done (only the last corollary remains).

My version has relatively big changes in some technical aspects of the presentation (no important corrections).

(1) Instead of formal quotients, I use simply the extension of e_q to the projective line over Z/qZ. I think this fits better with the Deligne material (where this extension for finite fields is natural). In practice it makes almost no difference of course, but this means we can’t write e_q(0/0)…

Lemma 6.5 is changed a bit. Unless I misunderstood the intent of the formal quotients, there was a minor mistake as stated: if f=1/(qX²), then (f,q)=1, but (f’,q), as defined, is large since no cancellation between numerator and denominator seems to be intended in the formal quotient of polynomials. Indeed, the proof used an assumption that the denominator of f is monic, which did not appear in the statement.

I’ve adapted the lemma to my setting, and added a second part which was used implicitly: if (q,f)=1 and deg(P)<deg(Q), then (q,f')=1 (adding the assumption that the degree of Q doesn't change by reduction modulo primes dividing q; this is automatic if Q is monic, which is the most likely case in applications, but it can be convenient to have this slight generalization).

(2) Because the results of this section are very general and I think they might be used by other people as a kind of nice summary and reference, I made it independent of the asymptotic convention of having a single variable x tending to infinity (and moduli q of polynomial size with respect to x, etc). This requires a bit more notation and care with q^{eps}'s and similar small quantities, but it seems worth it. (Similarly, the Deligne section is/will be independent of the asymptotic conventions.)

Ah, this is a nice way to deal with the division-by-zero issue! Some minor comments on this section:

* should the title of the section be “One-dimensional exponential sums” rather than “One-dimensional exponential sum”? Also, we should probably give a reason why we are suspending the asymptotic notation (namely, to produce bounds which are more portable for future applications).

* It was pointed out to me by Andrew Granville that the assertion after (6.4) that (6.3), (6.4) are basically the only 1D complete exponential sum bounds we need is not quite true; due to the van der Corput process, we also need to control sums of the form .

* We could put absolute values on the LHS of (6.7)-(6.9) for emphasis (although it is, strictly speaking, unnecessary). Similarly for the display in the proof of Lemma 6.8.

* in the display after (6.11) defining the inner product, one could use := instead of =.

* The hypothesis that d_1,d_2 be coprime in Lemma 6.8 should probably be dropped.

* Remark 6.10: “prohibitive restriction” should be “a prohibitive restriction”. Also, I am not sure what uniformity of estimates means in this context – what parameter is one considering uniformity with respect to?

* Remark 6.13 is still written with the asymptotic notation of the paper in force (i.e. using \lessapprox instead of \ll q^\eps). Similarly for the final corollary (but I guess you are still working on that).

Thanks for the remarks and corrections!
Remark 6.10 is indeed unclear (I re-read the original version and I realize that I probably did not capture the meaning while trying to expand the remark). What I mean by “uniformity” is that the estimates are fully explicit and often give at least some decay of incomplete sums, independently of any special features of the rational function f (in the case of e_q(f(n))) with the exception of the numerator and denominator (so it is uniform with respect to the summand in wide classes). I have in mind that (as indeed with estimates for primes in arithmetic progressions) it is not so much the extent of the gain compared with the trivial bound that may be relevant for applications, but rather the fact that a relatively weak decay is valid uniformly.

I will finish Remark 6.13 and the corollary and make the changes (though maybe only tomorrow morning…)

Andrew Granville has released a first draft of his “Current Events” article for the Bulletin of the AMS on “Bounded gaps between primes” at http://www.dms.umontreal.ca/~andrew/CurrentEventsArticle.pdf . Among other things, he traces out the “minimal” route to bounded gaps from our Polymath draft paper (avoiding Deligne, dense divisibility, or really narrow admissible tuples). (Of course, with the upcoming Maynard preprint, there will likely be a much shorter route to just bounded gaps between primes, but this route also establishes a distribution theorem for primes in arithmetic progressions that goes beyond Bombieri-Vinogradov and should be useful in other contexts.)

Anyway, I am sure he would be interested in any feedback on the article.

I read through the article very quickly, just to get a feeling for the techniques and ideas. I thought the statement on Zhang was very nice. I also liked how each of the arguments was worked out explicitly. The first few sections were an easy read for non-experts.

There were two things I ran into that were a bit confusing to me–probably typo related. On page 30, one of the corollaries has “” which I couldn’t see used anywhere. Also, on page 34, there is a place which says “…with . Evidently, …” This sudden reversal of inequalities was confusing.

Finally, there was one question I had for the experts. One aspect of this work I’ve been working the hardest to figure out is the sieve portion of the argument. In Granville’s paper, he takes the weights slightly differently than is done in the polymath paper. Am I correct in gathering that his choice, while sufficient, is not as optimized as the polymath choice? How much effect do these differences have?

I made a mistake in Footnote 10, suggesting that the proposed strengthening of Conjecture 5.3 in [BFI] is wrong, when in fact it is still open. Friedlander had told me this a few weeks ago, but I neglected to edit it right away, and forgot!

There was a small editing conflict in the first part of the introduction. I tried to make our edits compatible, but feel free to undo (I promise I won’t touch Section 1 anymore). Essentially I just moved the reference to Bombieri’s proof of the Weil conjectures to the point where you mentioned Stepanov: as far as I understand it, Bombieri’s proof is a natural reinterpretation/generalization of Stepanov’s proof, which was restricted to hyperelliptic curves? I also changed the precise reference (we referred to the wrong article, I think).

I also have a consistency question: we sometimes attribute the prime tuples conjecture to Hardy-Littlewood, and sometimes to Dickson-Hardy-Littlewood. Maybe we should uniformize this? Which one is more standard?

I believe the statement “for all positive integers k, DHL[k,k] holds” is precisely the intersection of Dickson’s conjecture (1904) and the first Hardy-Littlewood conjecture (1928), both of which are more general (Dickson treats arbitrary linear forms, Hardy-Littlewood gives a precise asymptotic, not just an infinite set).

I found that the ek copy of biblio.tex was not synchronised with the main copy, so I synchronised it by hand; also I added a reference to Granville’s new survey at the end of Sec 2 (and commented out the speculation as to whether one can proceed just with Bombieri-Vinogradov, as the question seems to have been answered by Maynard).

A few more typos:
1. Table 1 (p8), is now according to the new notation in Def 2.14
2. p74, immediately after proof of Lemma 5.3, “We will can prove”: “can” can probably be changed to “now”.
3. Bottom of p74, definition of is probably missing “1+”.
4. p81, 4th display & 2 lines above it, there are 2 uses of . Probably better to use to be consistent with later and with other sections.
5. p83, 1st paragraph of sec 6.3, “We with a” is missing a verb like “begin”.
6. p 170, Ref 22, missing \”a in Birkhauser.
7. p 172, Ref 75, Yildirim may be changed to Y\i ld\i r\i m.

I think that footnote 5 on page 13 isn’t correct. Bombieri and Davenport’s proof does not give (1.1) on EH, in fact assuming EH does not help in their proof at all. The problem is that the major arc approximation used there becomes worse for Farey arcs with denominator larger than N^(1/2) because the arcs become so short. Even if one replaces Bombieri and Davenport’s arc widths 1/Q^2 with 1/qQ this does not help. One can generalize their proof as I did in http://www.math.sjsu.edu/~goldston/article17.pdf and obtain (1.1) on EH for both Lambda and the sieve weight Lambda_Q (stated on the second page of that paper,) but I’ve never believed that assuming EH for an object like Lambda_Q is worth anything. I tried to get (1.1) from EH for Lambda and for Mu, but never succeeded.

Thanks for the correction! Actually what you write is pretty much what I had read in your paper when writing my Bourbaki report on your work with Pintz and Yildirim (http://www.math.ethz.ch/~kowalski/goldston-pintz-yildirim.pdf, see Remarque 3.1, which mentions your paper and that “a certain form of EH would imply…”), but when I wrote the footnote in the Polymath8 paper I just vaguely remembered the conclusion (and had forgotten the rest…)

I think it’s simpler here to delete the footnote (in the earlier account, it was useful as a way of showing a faint historical trail, and emphasizing the link between the original motivation of Bombieri for the B-V theorem and gaps between primes.)

Here’s a question concerning the reduction to exponential sums (with references to the current newgap.pdf in the main folder): it seems to me that when completing the sum, we are doing as if (in (7.22) and later) the modulus of the sum that we complete is q_1q_2r instead of [q_1,q_2]r (see the value of H in (7.23)). Of course this is irrelevant in terms of the length of the dual sum over h, which can be increased, but it affects the factor 1/H=M/q (roughly) from Lemma 6.9 (ii), which seems to be bigger than what we write by a factor q_0.

If I understood this right, this means we need a target (7.27) smaller by a factor 1/q_0, i.e., a RHS x^{-\eps}Q^2NR/q_0^2.

From a quick read-through of the remainder of Section 7 and of Section 10, this is actually something we prove, because the exponents of q_0 in all conditions to check are always strong enough (and of course it doesn’t affect the generic case q_0=1). But before making the changes to (7.27), I’d like to make sure that I am not missing something obvious here…
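The source of the extra factor can be written out in one line. This is a sketch assuming that q_0 denotes the gcd (q_1,q_2), as the later remarks on "gcd q_0 > 1" suggest, and taking 1/H ≈ M/q for a sum completed to modulus q:

```latex
% Assumption: q_0 = (q_1, q_2), and 1/H is roughly M/q for modulus q.
\[
  [q_1,q_2] = \frac{q_1 q_2}{(q_1,q_2)} = \frac{q_1 q_2}{q_0}
  \quad\Longrightarrow\quad
  \frac{M}{[q_1,q_2]\,r} = q_0 \cdot \frac{M}{q_1 q_2\, r}.
\]
```

So completing to the true modulus [q_1,q_2]r rather than q_1q_2r makes the factor 1/H larger by exactly q_0, and the two coincide in the generic case q_0=1.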

Oops, you are right, this extra factor of is indeed missing, but we have enough powers of to spare to deal with this. (I checked Andrew Granville’s version of the argument and it looks to be OK because he does the completion of sums a little differently.) I’ll make sure to look out for the powers of q_0 when you’re done with this section (and with Section 10).

I am getting confused now with the next step (the Type II estimate) — on the last line of Page 109 of the main folder version, the powers of q_0 are tight (q_0^{-2} on both sides), so if we need a better saving in q_0, it doesn’t work outright (precisely, we need q_0^{-4} after squaring, but we only gain one q_0 from the smaller size of H).

This doesn’t seem to be problematic for Type I estimates on the other hand.

The problem comes from the absence of gain in q_0 in the second term of Proposition 7.8, which however seems wasteful. (I haven’t checked yet in Zhang’s treatment or Andrew’s to compare…)

Oh dear, this is a non-trivial issue. For non-zero , I think it can be repaired by exploiting more fully the constraint in the phase (7.24), which after fixing constrains to a single residue class modulo . This gives an additional factor of saving in (7.28) which seems to give enough room to fit in the loss of you pointed out.

The case remains a problem. Of course, by restricting to this case, one saves a factor of , but the way things are set up right now, is tiny (as small as ) and so this does not directly help us. Zhang sets to be larger than this in the Type II case (something more like , if I remember correctly), which would solve this problem but worsen the Type II numerology a little bit, which would be annoying.

The other thing we could do is try to eliminate the case earlier, before completion of sums. This is actually what Zhang does, but the price one pays for this is that the residue class that one initially works with can no longer be completely arbitrary, but has to obey a “controlled multiplicity” condition, basically for any , one cannot have more than moduli for which .

I think the issue may also affect Andrew’s version of the argument; he correctly tracked the powers of (which he calls ) all the way through to the third display of page 49, which I have not yet checked carefully. (Also there is a tiny typo in that paper: in the penultimate display of page 47, should be rather than .) I have to run for now, but will return to this issue later.

OK, I think I have a fix which is not too bad. It involves treating the case separately, and using the additional constraint on in the case. The precise changes are as follows

1. After the paragraph containing (7.20), declare that we will treat the contribution first. Here we do not split off an X term, and simply aim to bound

in magnitude by .

We write and the divisor bound to crudely bound this quantity by

By the Chinese remainder theorem, the l summation is (one has to note that , which can be deduced from (7.2), (7.12), (7.13) with plenty of room to spare), so we now have

which sums to , which is acceptable since .

For the rest of section 7 we insert the condition .

2. Now that k is nonzero, we can insert an additional factor of on the RHS of (7.27), thanks to Lemma 1.6. We’ll need this factor later.

3. In Section 7.3, we save the constraint from (7.24) and combine it with the weight. Observe that as is coprime to , this constraint restricts to at most residue classes modulo . If we use (1.5) instead of (1.4), we can now insert a factor of in (7.28) and the preceding display after inserting the weight (and removing the equality symbol in the display before (7.28)). This gains us an extra factor of in the rest of the argument which should compensate for the loss of that you noted previously.

4. Some minor modifications may need to be made to the Type I arguments (either we save the factor of as in the Type II estimates, or we simply throw this improvement away and take the loss of ).

I’ll check and incorporate this version of the argument tomorrow. At first sight it certainly seems reasonable (and philosophically it would be strange if the numerology with gcd q_0 >1 does not conform to that with q_0=1…)

page 97, after the discussion of the off-diagonal contribution: “there can be cancellation between these non-negative terms” should probably be “there cannot be any cancellation between these non-negative terms”. Also, in footnote 18, I was not quite clear as to what “repeating the computation” meant… I guess you are trying to say that we should not be too fixated on the operator norm per se, but rather on the computational techniques used to estimate such norms?

around (7.4): ironically, with our new fix, the case k=0 that we considered “for simplicity” becomes the case for which we do NOT apply the method indicated! But perhaps this is OK as long as we admit that we are “lying” when giving this oversimplified presentation.

In (7.29), a factor of is missing on the RHS. The definition of H here is now very slightly different from that in the previous section, but they are equivalent up to constants; actually, for consistency, we may wish to take H equal to before Remark 7.7, rather than .

In footnote 18, I meant that, in some sense, we repeat three or four (or more…) times the same technique, more or less explicitly, instead of having stated an abstract lemma and applying it. (It is not a very important remark so it might be removed.)

I’ll take care of the other corrections. Actually, it seems the exponential sum estimate is now even better than strictly needed in terms of q_0 when keeping track of the support in n for k non-zero (restricting to few congruence classes modulo q_0) but I haven’t propagated this improvement to the main statement of Theorem 7.8 (of the ek version).

I am done reviewing Section 7. The only other remark I had is that I think the middle-condition (which is always the same) in the type I cases seems to involve sigma because one needs a lower-bound for N, namely it should be

8\varpi + 3\delta< 1/2-\sigma

instead of

20\varpi+6\delta<1.

I might have misunderstood, but in any case this condition is always weaker than the first condition, so it doesn't change anything.

The gain of q_0 was also needed for the second Type I estimate, but not (as far as I could see) for the first.

[Thanks for editing, Terry. The <‘s and >’s made quite a mess. The original was quite a bit longer. I cannot fix 5, so I will try to avoid <‘s:]

5. p 79, 4 lines under previous, another , should be *?
6. p 84, statement and proof of Lemma 6.5, (1) and (2) can be changed to (i) and (ii) for consistency.
7. p 84, line after (6.6), b should be (b,q)?
8. p 85, line after 1st display of proof of Prop 6.6, Z/qZ should be Z/pZ?
9. p 86, 3rd line of proof of Lemma 6.8, should be .
10. pp 89-90, statement of Prop 6.10, (i) and (ii) need to be in roman font to be consistent with other places. (Same for several places in Sec. 7)
11. p 90, line -3 of Remark 6.11, “Lemmas 6.9 and 6.10” should be “Lemma 6.9 and Proposition 6.10”.

I have now put the deligne.tex section in the ek subfolder. The changes there are few:

(1) I added references to some more surveys (a slightly older one of mine, and an upcoming one by Fouvry, Philippe and myself that will appear in the Colloquio di Giorgi series, and which will be ready soon)

(2) some exponents for conductors were a bit off, I think (this is immaterial for the applications of course)

(3) I clarified a bit subsection 8.3 which might have been a bit unclear; one or two references might be useful there, which I will take care of.

(4) I wrote the section on incomplete sums as in exponential.tex, independently of the standard asymptotic convention

(5) I gave some details in Corollary 8.25 on how to sum with the coprimality condition, since this type of restriction is not directly handled by Section 6.
