The elementary Selberg sieve and bounded prime gaps

This post is a continuation of the previous post on sieve theory, which is an ongoing part of the Polymath8 project to improve the various parameters in Zhang’s proof that bounded gaps between primes occur infinitely often. Given that the comments on that page are getting quite lengthy, this is also a good opportunity to “roll over” that thread.

We will continue the notation from the previous post, including the concept of an admissible tuple, the use of an asymptotic parameter going to infinity, and a quantity depending on that goes to infinity sufficiently slowly with , and (the -trick).

The objective of this portion of the Polymath8 project is to make as efficient as possible the connection between two types of results, which we call and . Let us first state , which has an integer parameter :

Conjecture 1 () Let be a fixed admissible -tuple. Then there are infinitely many translates of which contain at least two primes.

Zhang was the first to prove a result of this type with . Since then the value of has been lowered substantially; at this time of writing, the current record is .

There are two basic ways known currently to attain this conjecture. The first is to use the Elliott-Halberstam conjecture for some :

Theorem 3 (EH implies DHL) Let be a real number, and let be an integer obeying the inequality

where is the first positive zero of the Bessel function . Then implies .

Note that the right-hand side of (2) is larger than , but tends asymptotically to as . We give an alternate proof of Theorem 3 below the fold.

Implications of the form Theorem 3 were modified by Motohashi and Pintz, which in our notation replaces by an easier conjecture for some and , at the cost of degrading the sufficient condition (2) slightly. In our notation, this conjecture takes the following form for each choice of parameters :

Conjecture 4 () Let be a fixed -tuple (not necessarily admissible) for some fixed , and let be a primitive residue class. Then

for any fixed , where , are the square-free integers whose prime factors lie in , and is the quantity

In particular, since is conjecturally true for all , we conjecture to be true for all and .

Proof: Define

then the hypothesis (applied to and and then subtracting) tells us that

for any fixed . From the Chinese remainder theorem and the Siegel-Walfisz theorem we have

for any coprime to (and in particular for ). Since , where is the number of prime divisors of , we can thus bound the left-hand side of (3) by

The contribution of the second term is by standard estimates (see Proposition 8 below). Using the very crude bound

and standard estimates we also have

and the claim now follows from the Cauchy-Schwarz inequality.

In practice, the conjecture is easier to prove than due to the restriction of the residue classes to , and also the restriction of the modulus to -smooth numbers. Zhang proved for any . More recently, our Polymath8 group has analysed Zhang’s argument (using in part a corrected version of the analysis of a recent preprint of Pintz) to obtain whenever are such that

The work of Motohashi and Pintz, and later Zhang, implicitly describe arguments that allow one to deduce from provided that is sufficiently large depending on . The best implication of this sort that we have been able to verify thus far is the following result, established in the previous post:

This complicated version of is roughly of size . It is unlikely to be optimal; the work of Motohashi-Pintz and Pintz suggests that it can essentially be improved to , but currently we are unable to verify this claim. One of the aims of this post is to encourage further discussion as to how to improve the term in results such as Theorem 6.

We remark that as (5) is an open condition, it is unaffected by infinitesimal modifications to , and so we do not ascribe much importance to such modifications (e.g. replacing by for some arbitrarily small ).

The known deductions of from claims such as or rely on the following elementary observation of Goldston, Pintz, and Yildirim (essentially a weighted pigeonhole principle), which we have placed in “-tricked form”:

Lemma 7 (Criterion for DHL) Let . Suppose that for each fixed admissible -tuple and each congruence class such that is coprime to for all , one can find a non-negative weight function , fixed quantities , a quantity , and a fixed positive power of such that one has the upper bound

the lower bound

for all , and the key inequality

holds. Then holds. Here is defined to equal when is prime and otherwise.

By (8), this expression is positive for all sufficiently large . On the other hand, (9) can only be positive if at least one summand is positive, which only can happen when contains at least two primes for some with . Letting we obtain as claimed.

In practice, the quantity (referred to as the sieve level) is a power of such as or , and reflects the strength of the distribution hypothesis or that is available; the quantity will also be a key parameter in the definition of the sieve weight . The factor reflects the order of magnitude of the expected density of in the residue class ; it could be absorbed into the sieve weight by dividing that weight by , but it is convenient to not enforce such a normalisation so as not to clutter up the formulae. In practice, will some combination of and .

Once one has decided to rely on Lemma 7, the next main task is to select a good weight for which the ratio is as small as possible (and for which the sieve level is as large as possible. To ensure non-negativity, we use the Selberg sieve

where takes the form

for some weights vanishing for that are to be chosen, where is an interval and is the polynomial . If the distribution hypothesis is , one takes and ; if the distribution hypothesis is instead , one takes and .

is used for some additional parameter to be optimised over. More generally, one can take

for some suitable (in particular, sufficiently smooth) cutoff function . We will refer to this choice of sieve weights as the “analytic Selberg sieve”; this is the choice used in the analysis in the previous post.

However, there is a slight variant choice of sieve weights that one can use, which I will call the “elementary Selberg sieve”, and it takes the form

for a sufficiently smooth function , where

for is a -variant of the Euler totient function, and

for is a -variant of the function . (The derivative on the cutoff is convenient for computations, as will be made clearer later in this post.) This choice of weights may seem somewhat arbitrary, but it arises naturally when considering how to optimise the quadratic form

(which arises naturally in the estimation of in (6)) subject to a fixed value of (which morally is associated to the estimation of in (7)); this is discussed in any sieve theory text as part of the general theory of the Selberg sieve, e.g. Friedlander-Iwaniec.

The use of the elementary Selberg sieve for the bounded prime gaps problem was studied by Motohashi and Pintz. Their arguments give an alternate derivation of from for sufficiently large, although unfortunately we were not able to confirm some of their calculations regarding the precise dependence of on , and in particular we have not yet been able to improve upon the specific criterion in Theorem 6 using the elementary sieve. However it is quite plausible that such improvements could become available with additional arguments.

Below the fold we describe how the elementary Selberg sieve can be used to reprove Theorem 3, and discuss how they could potentially be used to improve upon Theorem 6. (But the elementary Selberg sieve and the analytic Selberg sieve are in any event closely related; see the appendix of this paper of mine with Ben Green for some further discussion.) For the purposes of polymath8, either developing the elementary Selberg sieve or continuing the analysis of the analytic Selberg sieve from the previous post would be a relevant topic of conversation in the comments to this post.

— 1. Sums of multiplicative functions —

In this section we review a standard estimate on a sum of multiplicative functions. We fix an interval . For any positive integer , we say that a multiplicative function has dimension if one has the asymptotic

for all ; in particular (since as ) we see that is non-negative on for large enough. Thus for instance

defined in the introduction have dimension . Dimension interacts well with multiplication; the product of a -dimensional multiplicative function and a -dimensional multiplicative function is clearly a -multiplicative function.

We have the following basic asymptotic in the untruncated case :

Lemma 8 (Untruncated asymptotic) Let Let be a fixed positive integer, and let be a multiplicative function of dimension . Then for any fixed compactly supported, Riemann-integrable function , and any that goes to infinity as , one has

Proof: By approximating from above and below by smooth compactly supported functions we see that we may assume without loss of generality that is smooth and compactly supported. But then the claim follows from Proposition 10 of the previous post.

We remark that Proposition 10 of the previous post also gives asymptotics for a number of other sums of multiplicative functions, but one (small) advantage of the elementary Selberg sieve is that these (slightly) more complicated asymptotics are not needed. The generalisation in Lemma 8 from smooth to Riemann integrable implies in particular that

and conversely Lemma 8 can be easily deduced from (12) by another approximation argument (using piecewise constant functions instead of smooth functions). We also make the trivial remark that if is non-negative and is any subset of , then we have the upper bound

for any non-negative Riemann integrable .

Actually, (12) can be derived by purely elementary means (without the need to explicitly work with asymptotics of zeta functions as was done in the previous post) by an induction on the dimension as follows. In the dimension zero case we have the Euler product

Now suppose that and that the claim has been proven inductively for . We again may decompose (14), but now has dimension instead of dimension zero. Arguing as before, we can write the left-hand side of (12) as

The contribution of the and error terms are acceptable by induction hypothesis, and the main term is also acceptable from induction hypothesis and summation by parts, giving the claim.

— 2. Untruncated implication —

We first reprove Theorem 3. The key calculations for and are as follows:

Lemma 9 (Untruncated sieve bounds) Assume holds for some and some . Let be a smooth function that is supported on , let be a fixed admissible -tuple for some fixed , let be such that is coprime to for all , and let be the elementary Selberg sieve with weights (11) associated to the function , the sieve level and the untruncated interval . Then (6), (7) hold with

for non-zero can be made arbitrarily close to (the extremiser is not quite smooth at if one extends by zero for , but this can be easily dealt with by a standard regularisation argument), and Theorem 6 then follows from Lemma 7 (using the open nature of (2) to replace by for some small ).

It remains to prove Lemma 9. We begin with the proof of (6) (which will in fact be an asymptotic and not just an upper bound).

where the implied constant in the notation can depend on . The square of this expression is then

The left-hand side of (20) is now expressed as the sum of the main term

and the error terms

for and

The main term has already been estimated as (24). From Proposition 8 we have

for , and so all of the error terms end up being , and (7) follows. This concludes the proof of Theorem 3.

— 3. Applying truncation —

Now we experiment with truncating the above argument to to obtain results of the shape of Theorem 6. Unfortunately thus far the results do not give very good explicit dependencies of on , but this may perhaps improve with further effort.

Assume holds for some and some . Let be a smooth function that is supported on , let be a fixed admissible -tuple for some fixed , let be such that is coprime to for all , and let be the elementary Selberg sieve with weights (11) associated to the function , the sieve level and the truncated interval . As before, we set

and seek the best values for for which we can establish the upper bound (6) and the lower bound (7). Arguing as in the previous section (using (13) to control error terms) we can reduce (6) to

If we crudely replace the truncated interval by the untruncated interval and apply Proposition 8 (or (13)) we may reuse the previous value

for here, but it is possible that we could do better than this.

Now we turn to (7). Arguing as in the previous section, we reduce to showing that

We can, if desired, discard the constraint here by arguing as in the previous section, leaving us with

Because we now seek a lower bound, we cannot simply pass to the untruncated interval (e.g. using (13)), and must proceed more carefully. A simple way to proceed (as was done by Motohashi and Pintz) is to just discard all less than , only retaining those in the region between and . The reason for doing this is that the parameter is then forced to be at most if one wants the summand to be non-zero, and so for the summation at least one can replace by without incurring any error. As in the previous section we then have

with given by (27), and some smooth supported on which is non-negative and non-increasing on . Then implies .

For large enough depending on the hypotheses in Theorem 10 can be verified (e.g. by setting for a reasonably large ) but the dependence is poor due to the localisation of the integral in the denominator to the narrow interval . But perhaps there is a way to not have such a strict localisation in these arguments.

98 comments

Incidentally, I heard back from Pintz, who has confirmed the
issues raised about his most recent preprint on bounded gaps between primes, but believes that one can salvage the estimates with somewhat worse explicit numerical constants.

I give up this extra because I don’t assume as much on the test function – only that it is non-negative and non-increasing, which is enough to get the inequality

for any . With a monomial choice for such as one gets the improved bound

(in fact this is an equality). But of course with the most recent sieves, is not a monomial but is instead a Bessel function. Perhaps there is still something to be squeezed out here but considering how is relatively small compared to in practice, this is unlikely to yield dramatic improvements in the final value of .

7. Somehow the proof of (12) in the dimension zero case () has disappeared, although it is needed later.

8. In the display below (10), should be . Also, I think, was not defined in this post.

9. In the display below Lemma 9 (formerly Lemma 10), the factorials should be omitted. Alternately, in the next display, the denominator should be multiplied by .

10. In (17) the exponent is missing from the end.

11. In the third display below (19), should be . In the next display, should not be squared.

12. In the display before (20), it would be better to use a different letter for , as is reserved already.

13. I don’t quite see how to get (24) from (23). In the intermediate steps we are dealing with terms that depend on . So I only see (24) as a lower bound (which is sufficient), by fixing to be large and then considering a large but fixed number of ‘s. The more ‘s we consider, the closer we get to the constant in (24). Here I also used positivity from (22).

14. In (25) and in the next display, should be on the left hand side. More importantly, in (25), the factor should be , and the summation should be restricted to coprime with . So I think the term should be detached at this point, and the terms should be bounded from above immediately, by relaxing the coprimality condition. That is, I think, the two displays following (25) are not quite right: for should really be .

15. In the last display of Section 2, a is missing on the right hand side.

Thanks for the explanation regarding #13 (and the whole blog entry). Further small comments: the second display after (25) now doesn’t parse, should be at several places, and, in Theorem 10, should be .

2. The statement of conjecture 4 does not include its dependence on and “.

3. In theorem 6, can one change the bounds 1/4 and 1/2 on and (which seems to be only convenient choices for the analysis), and in such a case does the (modified) analysis leads to a different constraint (with coefficients different than 207 and 43) on and ?

4. It seems that the above constraint (with 207 and 43) is no longer necessary in theorem 10, but in this case is it important that the integral in the denominator should be over sub-interval of [0, 1]? (in such a case, this impose a new restriction on and ).

[Corrected, thanks. As Gergely noted above, the condition is a bit arbitrary but certainly far more permissive than anything one could reasonably hope for at this juncture. The condition is necessary though; the conjecture is almost certain to fail at the endpoint (given that Elliott-Halberstam is known to fail at the corresponding endpoint – T.]

Correction to my third remark above: $omega$ and $delta$ do appear in the statement of conjecture 4! – but without any restrictions (even positiveness), also it is not clear what is w (the endpoint of the interval I).

Actually, I intend Conjecture 4 to be a family of conjectures , one for each choice of , rather than a single conjecture, so I don’t want to quantify the within the conjecture. As stated after Proposition 5, it is conjectured that is true for all and .

The conventions on w are introduced in the second paragraph of the blog post and are assumed to be in force throughout in order not to have to state "Assume is a sufficiently slowly growing function of " in every single assertion in the post.

It would be contra-productive to consider or , hence the conditions and are natural. In fact we want to consider up to level , hence up to level , with prime divisors up to level , hence the stronger condition is also natural to the whole setup. This gives automatically that integration in Theorem 10 is above a subinterval of . At any rate, decreasing only makes easier, so if a monotonic condition (like the one you quote from the online reading seminar) is necessary for small values of , then it is also necessary for larger values of . Currently we can only hope for when or so.

A few related questions (some of which have been partly raised here and there) : Is there any closed formula or at least tight approximations regarding the connection between the gap and the parameter , including the case of small values of (in the pace things are going on here, this is starting to be important!)? In this connection, can one give a rigorous explanation and good error estimates to the claimed asymptotic expression

To get an upper bound , one can already use the construction of Zhang, namely take the tuple to be consecutive primes where is the first prime larger than , as this is automatically an admissible tuple (this is a nice exercise to check) and has diameter from the prime number theorem. In the opposite direction, the Brun-Titchmarsh inequality (or more precisely, the proof of this inequality) gives the lower bound ; in fact we have the more precise inequality

for any , where the constant c can be taken as small as , thanks to the work of Montgomery-Vaughan (actually they get the slightly larger value of , the value is an unpublished work of Selberg).

Thanks for the answer and for creating the Wiki page. The information in that page is indeed useful. Two minor remarks on this page: First, it seems that the upper bound on is somewhat missing from the “Upper bounds” section. Second, there are very minor typos [ two cases of missing “)” ] in the introduction.

Another question, perhaps somewhat innocent: Is there any proof that for each natural number $k$ there exists a – admissible set? For the bounded gap project it is of course not needed to look at too large numbers (less than 40000 now), but still, the question is theoretical.

Additional parameters shall come from a new or modified argument. Such arguments are welcome, this is one of the purposes of this blog (aka polymath8 project). On the other hand, the pair is far from optimized yet, and it also unclear how small we can deduce from a given pair . Finally, for a given it is not clear how to find an admissible set of smallest diameter. In short, there are already a handful of arguments and algorithms to be optimized (and optimization is already underway!), which is the principal goal of the polymath8 project (in my opinion).

Thanks for both answers. It seems from the discussion below that some ideas have already been suggested. At least some of them may indeed help in one way or another in the optimization. However, let me add regarding these suggestions that my intuition is that more fundamental parameters (or functions) can be introduced, beyond some somewhat arbitrary constants which implicit or explicit in certain expressions. They may also help in the theoretical analysis (not just for optimization). But first a more concrete example should be suggested, and I will try to find such a one.

Is this the right place to post dumb questions about the polymath project? If yes, I’m wondering why everyone is hunting for the smallest possible tuples. Could it be that there are some larger tuples than the smallest ones found so far, but which have smaller diameters? Maybe there are systematic ways to look for them.

If one removes one or more elements from an admissible tuple, then one still gets an admissible tuple. So if one finds a large tuple with small diameter, then by removing elements one can get a smaller tuple with smaller diameter. So one may as well work with the smallest size of tuple that one can get away with (right now we know that is a safe choice to use, but I expect this number to decrease in the next week or so as our understanding of various aspects of Zhang’s argument improve).

The two frontiers of optimizing Zhang (a la Pintz/Green) for parameters are: the diagonal contribution from Weyl shifting in S13&14, and the non”singular” contribution in Section 11. For the former, as suggested by FI in their paper (see remarks to Theorem 4), an improvement perhaps could come along the lines of using Holder rather than Cauchy. What I think they mean is: from our argument you won’t beat the main from Deligne, but the secondary terms (involving interval lengths moreso than the modulus) might be improved. This is Ok for us, as Zhang saves his in the main term, and so we mainly need concern about the secondary.

The latter frontier in S11, directly applies Lemma 11 and the main of Deligne would again be difficult to beat. I don’t prognosticate so much movement there.

If nothing else, the value of should decrease somewhat via finer sieve analysis, which permits a slight decrement.

I suppose we should check there is no way of excluding the contribution from those which factor as with almost exactly for all by some other means. I don’t see why one would expect there to be. But if we’re not in this case we should get a small gain (you should be able to arrange the factorisation more critically).

5. Since the integral defining $\kappa$ in (27) should be over a sub-interval of (0, 1], we need to impose the (additional) appropriate restriction:

(note that this restriction does NOT follows from the other restrictions in theorem 10!)

[Corrected, thanks -T.]

6. Concerning the exact computation of $\kappa$ by (27), it seems that the simplest method to compute the integral in (27) is to use the binomial formula to expand the denominator of the integrand into powers of t, from which the indefinite integral is obvious, but this method is very ill conditioned numerically (if the lower limit of the integral is very close to 1) because of sign alternations by expanding (1-t) to a high power – resulting by numerical near cancellation of several (relatively) large consecutive terms.

Fortunately, there is a similar method which is much more numerically stable: the idea is to change the integration variable into and the resulting integrand gives the simpler (and numerically stable) result

Note that all the terms in the sum are positive with simple coefficients (instead of the binomial coefficients with alternating signs in the numerically unstable method).

To elaborate my previous comment, maybe there is an incredibly tedious way to make tiny gains, as follows.

If not all the prime factors of are pretty much exactly , then we can hopefully make gains by arranging factorisations suitably (I haven’t checked this carefully).

Suppose then that all the prime factors of are almost exactly .

If I calculated correctly, in the extremal case at the Type I/III boundary, we have two splittings , one in the Type I case and one in the Type III. In the Type I case the critical value of (so I calculate) is with . With our factorisation of into smooths, we were only able to guarantee with . But unless divides exactly, we can do slightly better. And presumably for the published optimal values of and , this is not the case? I haven’t checked.

Similarly one would hope to do slightly better at the Type III boundary, where we instead want with, I think, .

You can always slip one way or the other by a smidge w/o changing the best integral . I chose for simplicity, but it is really not the maximum with the Bessel zeros, considering them as a smooth function in from the approximation. This could yield us some (tiny) leeway as you indicate. Whether or not it would allow to be decremented (as an integer) is a different issue.

I plan to expand this post a bit later by fleshing out the number theoretic side to this combinatorial argument (i.e. Section 6 of Zhang). Instead of one enormous post on Zhang’s theorem, I now plan to have two subsequent posts, one on the Type I/II arguments and one on the Type III arguments, which will split the polymath8 project into five (!) independent parts:

Currently we have blog posts for parts 3,4,5, while the rather overburdened reading seminar blog post is still holding the discussion for parts 1 and 2, but I hope to send out posts to cover those areas soon…

Since this gives a simple sufficient condition for (assuming of course the positivity of these parameters!),
Another direction of optimization is to find the explicit expressions for the coefficients 207, 43 and 1/4 in the constraint as functions of free (i.e. “convenient” arbitrary) parameters used by Zhang and Pintz (if I remember correctly the number 10 was used as a parameter and also 1\4).
I think that by knowing the exact dependence of 207, 43, 1/4 on these “free” parameters (along with their necessary restrictions), we can “play” with them as ADDITIONAL “control” parameters to further minimize

I think this is what v08ltu claimed to have done on 7 June, 2013 at 11:03 pm (see https://terrytao.wordpress.com/2013/06/03/the-prime-tuples-conjecture-sieve-theory-and-the-work-of-goldston-pintz-yildirim-motohashi-pintz-and-zhang/#comment-233364). Of course v08ltu worked with the analytic Selberg sieve (Theorem 6 above), while one can also try the elementary Selberg sieve (Theorem 10 above). From Terry’s comments I anticipate that Theorem 6 gives better results here. On the other hand, improving on either theorem might lead to a better . Also, we still need confirmation from the online reading seminar team that the constraints you mention are sufficient for to be admissible. There are many details to be checked, although my feeling is that the constraints are ok, and probably we’ll even see larger pairs in the near future.

Thanks!
(BTW, I meant to identify ALL the convenient arbitrary parameters used throughout the analysis and their necessary constraints in order to replace the 207, 43, 1/4 by these “control” parameters to optimize .

I am not sure what you mean by “convenient arbitrary parameters used throughout the analysis”. At this point we have Theorems 6 and 10 above to obtain an admissible , and they utilize the parameters and . Of course these are not arbitrary parameters, they are the ones that parametrize a natural generalization of Zhang’s main result (Theorem 2 in his paper), called here. More parameters would require an even more flexible generalization of Zhang’s result, and (most likely) an even more complicated treatment to achieve it. Already the present proof (in Zhang’s paper) is rather involved!

Perhaps I misunderstood you. At any rate, the coefficients 207, 43, 1/4 do not come from any control parameters. They come from analyzing Zhang’s proof step by step to see what is needed to achieve instead of Zhang’s original . Each step works for a certain set of pairs , and the final condition describes the intersection of all these sets. Zhang’s theorem says that the pair is in the intersection, and the online reading seminar tries to see what other pairs are included.

My questions: Is it true that one can slightly change the above parameters without destroying Zhang analysis? If so, my second question is about the
coefficients 207, 43, 1/4 – are they invariant to slight perturbations in the above “arbitrary” parameters, or they depend in some way on the “arbitrary” parameters ?(if the latter possibility is true then one may try to optimize k_0 using the above “arbitrary” parameters)

1. The 1/4 here is just a convenient (additive) rescaling, in that its square is 1/2, which was the previous barrier. This allows one to measure, in terms of , by how much one has beaten sqrt.
2. It seems as though the use of the better sieves and/or weighting functions has made this parameter evaporate. In the original GY work, it was around if I recall correctly (to optimize some binomial coefficient quotient), but I don’t think it matters now.
3&9&10. These have been rewritten in terms of by Green and Pintz. It turns out that is critical, while is not.
4&5. These come about the decomposition in Section 6. Any number at least 9 (I think) will work. This changes nothing in the argument to make it larger, as you will still have the same type of sums to investigate, only possibly with more convolvedness. As Tao pointed out in a comment somewhere, if you wanted to get bounds on how large had to be in order to guarantee a tuple, then they would be more relevant (and you would likely minimize the “10”).
6. The 19 in Lemma is 2*10-1 from 4&5. The 1/4 and 2/3 are just lower/upper bounds, as he also invokes this lemma with in an interval whose size in x-powers is always somewhere in . These put very weak bounds on .The 1/4 and 2/3 come nowhere into that lemma really (you can make them 1/100 and 99/100), as you get exponential decay any extra constants in the error term do not matter.
7. I have no idea why Zhang put that condition in Lemma 11. It looks funny to me (I even underlined it). He always invokes it with or something, so it is not a problem.
8. The 292 is just from I think.

Regarding #1, I also don’t see how this display follows from the previous one, i.e. how the second display below (20) implies the next one. It seems to utilize , but this might fail even when and . Grouping the terms according to and might help.

Zhang’s main objective was to get a finite number for k_0. Along the paper he did some simplifying assumptions (some parameters equal) and some “rough” estimates along the way so he would not be bogged down with sometimes complicated “details”. That is perfectly normal in such a complicated piece of mathematics.

As I can infer from the discussions here most of these inefficiencies have now been dealt with. All that while preserving his core arguments. There are no more “natural” parameters to wiggle around. If we want more parameters we must come up with new ideas in the main argument.

As ugly as the numbers 207 and 43 look like they are not “parameters” in any way they are numbers that appear by adding and multiplying what are most probably already tight estimates of very complicated terms.

That does not mean the estimates cannot be improved it just means that the tactics needed now are different.

Here is a suggestion how to decrease in Theorem 6, namely how to replace by and more. In the previous post, (21) results from (20) by extending the summation to , and then doing a partial Möbius inversion by summing over divisors in of and . It is simpler to do a partial Möbius inversion by summing over divisors in of . Note that iff . Then we get a version of (21) and (22) where , and we save the complications right below (21). This also simplifies the argument after (22), e.g. we don’t need Cauchy, and the value of can be decreased which is good.

[…] on sieve theory, bridging the gap begin Zhang’s Theorem 2 and the parameter k_0 (see also the follow-up post). 3) Efforts to find narrow admissible sets of a given cardinality k_0 — the width of the […]

Here is another attempt to decrease in Theorem 6. By the comments below (21) in the previous post, we can restrict to in (21) and later. Then, for a given , the number of representations is instead of . More importantly, we can now more efficiently bound

by the inequality
between the geometric and arithmetic means. All this shows that Theorem 6 is valid with the smaller value

.

Note that the exponent was originally , and the factor was originally .

Oh, I think that works! So now we have whenever we can find obeying the constraints

and

where is the quantity

Hopefully this will decrease the value of somewhat from its current value of 26,024 (and maybe close to the value of 25,111 that v08ltu computed using Pintz's overly optimistic (and now retracted) value of , though probably we won't get quite that far).

p.s. It’s nice that the process of correcting an error in the original argument led to a substantive improvement in the bounds of the argument; usually it works in the opposite direction. :)

That’s good news, thanks! Using SAGE I am getting a much smaller value of , which is probably in error. Let . Can you or Terry tell me which has the largest contribution to ,
and what are the numerical values and separately for that particular ? This will help me find the error in my code.

The gain of in the exponent means that (as dominates) the effect is in the limit , which is square of the previous (there is also no in front as with Pintz either). Small sized numerics are similarly superior. Any theoretical reason why this wins another power back? Does the better weighting function help the analysis, or is the analysis just more robust now?

I think the original analysis was a bit wasteful: the product of two numbers in was upper bounded by the larger of the two. Hence if both numbers were small (which is typically the case), only the effect of one of them was exploited. The new analysis is more efficient/natural in the sense that the effect of both numbers is used.

Regarding further improvement of (and also a secondary quantity – see below), I see two ways forward, but they have the drawback of making the formulae for more complicated

(1) Starting from Gergely’s formula

,

extract out the n=1 term (which experience has shown dominates all the other terms in practice) and compute it more carefully, in particular avoiding the use of the arithmetic mean-geometric mean inequality and perhaps also avoiding the use of the decreasing nature of the Bessel function. In other words, just keep the funny weighted Bessel integral as it is. This can be computed numerically and perhaps the numerical value is significantly better than the upper bound we currently have.

(2) This is an approach briefly mentioned in the previous post: instead of working purely on the “alpha” side of things, we can also work on the “beta” side of things and not throw away some positive terms. Namely, one should be able to improve the condition

to

where is a quantity somewhat similar in nature to . Again, we might only need to save the n=1 term given that it is likely to dominate all the others. It may be that the n=1 term for can be used to cancel a large portion of the n=1 term for which would lead to quite a noticeable improvement I think (of course one can’t hope to do better than eliminating kappa entirely though, or at least I don’t think this should be the case, otherwise we could make improvements to Theorem 3 as well as Theorem 6 which seems unlikely.)

For (1), is it correct that we would essentially be interested in where the -integral goes up to the point where one of the arguments exceeds 1? I pieced this together from (30) and following from the previous post, with elements from (22) and beyond interlaced.

I am slightly confused. I am looking at the proof of Lemma 7, and it seems that one can easily modify the proof to show DHL[, 3] for exactly the same value of simply by replacing in the place of in (9). Likewise for any fixed value of in DHL[, m]. Where does this break down?

@CraigH: to get one would need to replace not by , but by the significantly larger quantity . Unfortunately it does not appear that the GPY method lets one do this, even with the best distribution hypothesis (i.e. the Elliott-Halberstam conjecture) and the best choice of sieve weights.

[For some reason, comments are not coming in in order in this post; I am unable to locate the source of this error.]

So it seems the good way to proceed in the previous post is as follows. In (21) the presence of automatically gives that and . By the -trick we can also impose and , so that . Now we have the third display below (21) with the extra condition . We get rid of this condition by applying the -trick again. Finally we obtain the improved , with in place of , by applying the arithmetic-geometric mean inequality and noting that . I hope this makes sense.

It is worthwhile to remark that the factorization is valid when . It suffices to verify this when each variable is a power of the same prime. Then the condition forces that or , and the formula is clear.

OK, I was able to salvage the first two values of . The point is that we cannot actually reduce to the case when no two of have no large prime common factor. However, we can still reduce to the case when have no large prime common factor for any where is very similar to . However by being more careful with the terms I can make those terms at least get back to where they were before the error was found; since the n=1 terms were so dominant this may end up not costing us anything in the k_0 error at all. I’ll write this up properly as a blog post now (it’s nearly time to roll over the thread anyway, especially given that this particular post seems to have something wrong with the comment ordering).

The second inequality is still available, which is good news as it captures the bulk of the gain for kappa. The coprimality is unfortunately not available from the W-trick because the sum is weighted by rather than and so the non-coprime terms have a comparable contribution to the coprime terms. However, after working things out and using some ideas from the most recent preprint of Pintz (avoiding the parts of that preprint which were in error, of course) to substitute for Buchstab iteration, I was able to get a criterion which is actually better than the one we already have, basically the previous error had the shape and I can now improve this to . See the newest post for details (but the argument certainly needs to be checked!).

[…] the claim then follows by dividing by and summing using standard estimates (see Lemma 8 of this previous post). But this claim follows from the Bombieri-Vinogradov theorem (Theorem 4), after restricting to […]

[…] would be prime. It remains to estimate the numerator and denominator of . To this end, we introduce Selberg’s functions which, with judicious weights, approximate our previous . Fix with , and put , […]

For commenters

To enter in LaTeX in comments, use $latex <Your LaTeX code>$ (without the < and > signs, of course; in fact, these signs should be avoided as they can cause formatting errors). See the about page for details and for other commenting policy.