In this post we will record a new truncation of the elementary Selberg sieve discussed in this previous post (and also analysed in the context of bounded prime gaps by Graham-Goldston-Pintz-Yildirim and Motohashi-Pintz) that was recently worked out by Janos Pintz, who has kindly given permission to share this new idea with the Polymath8 project. This new sieve decouples the $\delta$ parameter that was present in our previous analysis of Zhang’s argument into two parameters: a quantity $\delta$ that used to measure smoothness in the modulus, but now measures a weaker notion of “dense divisibility” which is what is really needed in the Elliott-Halberstam type estimates, and a second quantity $\delta'$ which still measures smoothness but is allowed to be substantially larger than $\delta$. Through this decoupling, it appears that the $\kappa$ type losses in the sieve theoretic part of the argument can be almost completely eliminated (they basically decay exponentially in $\delta'$ and have only mild dependence on $\delta$, whereas the Elliott-Halberstam analysis is sensitive only to $\delta$, allowing one to set $\delta$ far smaller than previously by keeping $\delta'$ large). This should lead to noticeable gains in the $k_0$ quantity in our analysis.

To describe this new truncation we need to review some notation. As in all previous posts (in particular, the first post in this series), we have an asymptotic parameter $x$ going off to infinity, and all quantities here are implicitly understood to be allowed to depend on $x$ (or to range in a set that depends on $x$) unless they are explicitly declared to be fixed. We use the usual asymptotic notation $O()$, $o()$, $\ll$ relative to this parameter $x$. To be able to ignore local factors (such as the singular series $\mathfrak{G}$), we also use the “$W$-trick” (as discussed in the first post in this series): we introduce a parameter $w$ that grows very slowly with $x$, and set $W := \prod_{p < w} p$.

For any fixed natural number $k_0$, define an admissible $k_0$-tuple to be a fixed tuple $\mathcal{H}$ of $k_0$ distinct integers which avoids at least one residue class modulo $p$ for each prime $p$. Our objective is to obtain the following conjecture $DHL[k_0,2]$ for as small a value of the parameter $k_0$ as possible:

Conjecture 1 ($DHL[k_0,2]$) Let $\mathcal{H}$ be a fixed admissible $k_0$-tuple. Then there exist infinitely many translates $n + \mathcal{H}$ of $\mathcal{H}$ that contain at least two primes.
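The admissibility condition is easy to test by machine. The following is a minimal sketch, not taken from the Polymath8 code base; the function name `is_admissible` is hypothetical. It uses the observation that a tuple of $k$ integers can occupy at most $k$ residue classes, so only primes $p \leq k$ need to be checked.

```python
def is_admissible(tup):
    """Return True if the integers in `tup` are distinct and, for every
    prime p, miss at least one residue class modulo p."""
    k = len(tup)
    if len(set(tup)) != k:
        return False
    # k integers occupy at most k classes, so primes p > k are automatic.
    for p in range(2, k + 1):
        is_prime = all(p % d for d in range(2, int(p ** 0.5) + 1))
        if is_prime and len({h % p for h in tup}) == p:
            return False  # every residue class mod p is hit
    return True


print(is_admissible((0, 2)))             # True: the twin prime pattern
print(is_admissible((0, 2, 4)))          # False: covers all classes mod 3
print(is_admissible((0, 4, 6, 10, 12)))  # True
```

The second example shows why admissibility is needed: among $n, n+2, n+4$ one term is always divisible by $3$, so such a translate cannot contain many primes once $n$ is large.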

The twin prime conjecture asserts that holds for as small as , but currently we are only able to establish this result for (see this comment). However, with the new truncated sieve of Pintz described in this post, we expect to be able to lower this threshold somewhat.

In previous posts, we deduced from a technical variant of the Elliott-Halberstam conjecture for certain choices of parameters , . We will use the following formulation of :

Conjecture 2 () Let be a fixed -tuple (not necessarily admissible) for some fixed , and let be a primitive residue class. Then

for any fixed , where , are the square-free integers whose prime factors lie in , and is the quantity

and is the set of congruence classes

and is the polynomial

The conjecture is currently known to hold whenever (see this comment and this confirmation). Actually, we can prove a stronger result than in this regime in a couple of ways. Firstly, the congruence classes can be replaced by a more general system of congruence classes obeying a certain controlled multiplicity axiom; see this post. Secondly, and more importantly for this post, the requirement that the modulus lies in can be relaxed; see below.

To connect the two conjectures, the previously best known implication was the following (see Theorem 2 from this post):

Theorem 3 Let , and be such that we have the inequality

where is the first positive zero of the Bessel function , and are the quantities

and

Then implies .

Actually there have been some slight improvements to the quantities ; see the comments to this previous post. However, the main error remains roughly of the order , which limits one from taking too small.

To improve beyond this, the first basic observation is that the smoothness condition , which implies that all prime divisors of are less than , can be relaxed in the proof of . Indeed, if one inspects the proof of this proposition (described in these three previous posts), one sees that the key property of needed is not so much the smoothness, but a weaker condition which we will call (for lack of a better term) dense divisibility:

Definition 4 Let $y \geq 1$. A positive integer $q$ is said to be $y$-densely divisible if for every $1 \leq R \leq q$, one can find a factor of $q$ in the interval $[y^{-1} R, R]$. We let $\mathcal{D}_y$ denote the set of positive integers that are $y$-densely divisible.

Certainly every integer which is $y$-smooth (i.e. has all prime factors at most $y$) is also $y$-densely divisible, as can be seen from the greedy algorithm; but the property of being $y$-densely divisible is strictly weaker than $y$-smoothness, which is a fact we shall exploit shortly.
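To get a feel for the gap between smoothness and dense divisibility, here is a small brute-force sketch. It assumes the reading of Definition 4 in which $R$ ranges over all real numbers between $1$ and the integer being tested; under that reading the condition is equivalent to every pair of consecutive divisors $d < d'$ satisfying $d' \leq y d$. The helper names are hypothetical.

```python
from math import isqrt


def divisors(n):
    """Sorted list of the positive divisors of n."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))


def largest_prime_factor(n):
    """Largest prime factor of n (returns 1 for n = 1)."""
    best, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            best, n = p, n // p
        p += 1
    return n if n > 1 else best


def is_densely_divisible(n, y):
    """Assumed criterion: consecutive divisors of n never jump by more than
    a factor of y, so every real R in [1, n] has a divisor in [R/y, R]."""
    ds = divisors(n)
    return all(b <= y * a for a, b in zip(ds, ds[1:]))


# 272 = 2^4 * 17 is 2-densely divisible without being 2-smooth.
print(is_densely_divisible(272, 2), largest_prime_factor(272))  # True 17
# 34 = 2 * 17 is not: there is no divisor between 2 and 17.
print(is_densely_divisible(34, 2))                              # False
```

In the first example the powers of $2$ fill the scales below the large prime $17$, which is exactly the kind of integer that the smoothness condition excludes but dense divisibility admits.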

We now define to be the same statement as , but with the condition replaced by the weaker condition . The arguments in previous posts then also establish whenever .

The main result of this post is then the following implication, essentially due to Pintz:

Theorem 5 Let , , , and be such that

where

and

and

Then implies .

This theorem has rather messy constants, but we can isolate some special cases which are a bit easier to compute with. Setting , we see that vanishes (and the argument below will show that we only need rather than ), and we obtain the following slight improvement of Theorem 3:

Theorem 6 Let , and be such that we have the inequality

where

Then implies .

This is a little better than Theorem 3, because the error has size about , which compares favorably with the error in Theorem 3 which is about . This should already give a “cheap” improvement to our current threshold , though it will fall short of what one would get if one fully optimised over all parameters in the above theorem.

Returning to the full strength of Theorem 5, let us obtain a crude upper bound for that is a little simpler to understand. Extending the summation to infinity and using the Taylor series for the exponential, we have

We can crudely bound

and then optimise in to obtain

Because of the factor in the integrand for and , we expect the ratio to be of the order of , although one will need some theoretical or numerical estimates on Bessel functions to make this heuristic more precise. Setting to be something like , we get a good bound here as long as , which at current values of is a mild condition.

Pintz’s argument uses the elementary Selberg sieve, discussed in this previous post, but with a more efficient estimation of the quantity , in particular avoiding the truncation to moduli between and which was the main source of inefficiency in that previous post. The basic idea is to “linearise” the effect of the truncation of the sieve, so that this contribution can be dealt with by the union bound (basically, bounding the contribution of each large prime one at a time). This mostly avoids the more complicated combinatorial analysis that arose in the analytic Selberg sieve, as seen in this previous post.

— 1. Review of previous material —

In this section we collect some results from previous posts which we will need.

We first record an asymptotic for multiplicative functions. For any natural number , define a -dimensional multiplicative function to be a multiplicative function which obeys the asymptotic

Lemma 7 Let . Let be a fixed positive integer, and let be a multiplicative function of dimension . Then for any fixed compactly supported, Riemann-integrable function , and any for some fixed , one has

Lemma 8 (Criterion for DHL) Let . Suppose that for each fixed admissible -tuple and each congruence class such that is coprime to for all , one can find a non-negative weight function , fixed quantities , a quantity , and a fixed positive power of such that one has the upper bound

the lower bound

for all , and the key inequality

holds. Then holds. Here is defined to equal when is prime and otherwise.

— 2. Pintz’s argument —

We can now prove Theorem 5. Fix to obey the hypotheses of this theorem. Let be a congruence class with coprime to for all (this class exists by the admissibility of ). We apply Lemma 8 with

the elementary Selberg sieve defined by

where

are the multiplicative functions

and

the sieve level is given by the formula

is a fixed smooth function supported on , and is a certain subset of to be chosen shortly. (The constraints and are redundant given that , but we retain them for emphasis.) We will assume that this function is non-negative and non-increasing on . In this previous post, we considered this sieve with equal to either all of , or the subset consisting of smooth numbers. For now, we will discuss the estimates as far as we can without having to explicitly specify .

We first consider the asymptotic (5). By arguing exactly as in Section 2 (or Section 3) of this previous post, we can write the left-hand side of (5), up to errors of , as

The summand here is non-negative, so we may crudely replace by all of and apply Lemma 7 to obtain (5) with

Now we turn to the more difficult asymptotic (6). The left-hand side expands as

Now let be a small fixed constant to be chosen later, and suppose the following claim holds:

Claim 1 (Dense divisibility of moduli) Whenever and are non-zero, then either or else .

Then from the Bombieri-Vinogradov theorem (for the moduli) or the hypothesis (for the larger moduli, noting that ) and standard arguments (cf. Proposition 5 of this post) we have

for any fixed and any multiplicative function of a fixed dimension , where ranges only over those integers of the form with non-zero. From this we easily see (arguing as in Section 2 of this previous post) that the contribution of the error term is , and we are left with establishing the lower bound

up to errors of (henceforth referred to as negligible errors).

As in Section 2 of the previous post, we can write the left-hand side as

where is the -dimensional multiplicative function

So we would like to select small enough that Claim 1 holds, but large enough that we can lower bound (9) by up to negligible errors.

Pintz’s idea is to choose to be the set of all elements of with the property that

Let us first verify Claim 1 with this definition. Suppose that is non-zero; then from (8) and the support of there are multiples , of , respectively with (in particular ) and . Since , we have

and thus

Similarly for and . Multiplying, we obtain

On the other hand,

and so on dividing we have

We conclude that either is less than , or else

The latter conclusion implies that is -densely divisible. Indeed, for any we multiply together all the prime divisors of between and one at a time, stopping just before the product reaches or exceeds , or when one runs out of such prime divisors. In the former case the product is at least as large as , and in the latter case one has reached . Next, one multiplies this product by the prime divisors of less than until one reaches or exceeds ; this is possible thanks to (11) in the former case and is automatic in the latter case, and gives a divisor between and as required.

(the minus sign being to compensate for the non-positive nature of ) then we have

and thus

Note that this inequality replaces the quadratic expression with a linear expression in the truncation error , which will be more tractable for computing the effect of that error. We may thus lower bound (9) by the difference of

82 comments

Has anyone worked out the “cheap” gain k0 mentioned after Theorem 6? (or better, a fully optimized k0, but even a cheap gain will give those working on optimizing H(k0) something to play with — we’re not used to working with the same k0 for more than 2 days at a time :)).

If I use the above bound and assume the -quotient really is , I think that I get that and with to satisfy the previous constraint, one can get . I should write a better program to optimize this though.

From my other comment, I now think the G-quotient is going to be closer to , where . I can try to get more precise asymptotics for the term, but perhaps it will be simpler just to compute the integral numerically (it will serve as a good sanity check to compare the numerics against the asymptotics though, and the asymptotics can give a rough idea of what region of parameter space to search in).

I am really not sure where I came up with the above numbers, there is no obstacle to making closer to 1/348. With the sech/tanh estimate (ignoring o(1)) I can take and get with , and is not of import again (make the delta-difference smaller if it is). I am not sure I trust all the numerics though (instability), but this is approaching the limit of previous analysis.

Cross-checking these computations with Maple. With and we have . Setting (say) we have

.

We need to not exceed

.

I get

which looks good so far. Now for . Using the crude bound

the exponential factor here is , which looks scary. On the other hand, the G-ratio here is predicted to be roughly

with (I made a slight mistake earlier in not having the square root and factor of 2 here but it fortunately turns out not to matter), and this is completely negligible. As an independent confirmation of the smallness of the G-ratio, the Bessel function at takes the value , and at takes the value , so it looks abundantly clear that the G-ratio is indeed ridiculously small in this regime.

Oops, sorry about that. Well, one can hope to squeeze something out of the full version of Theorem 5 (or at least the first display after Theorem 5) in which there is also the A parameter to optimise over. One can probably pick up some other ways to make the estimate for more efficient, but at the cost of simplicity. Pintz’s own notes have a slightly different way of estimating , I can try to have a look at that instead.

There were two corrections I had to make; one (which worked against our favour) was the replacing of with its square root, but the other (which works for us) is the doubling of the exponential from to . I had thought that the two effects roughly cancelled each other out, but I guess not quite…

Currently, I can only get with and . You can note the incredible impact that suddenly has if you change “6” to “5” in the numerator of the subtrahend of . So saving a few more in could occur from its more benevolent estimation.

(by replacing with infinity and replacing with ). This extra exponential factor makes it harder to optimise in A, but it should give a significant savings, hopefully enough to push back to 5453 and maybe even a bit further.

I’m recording here some asymptotics of Bessel functions to help understand the ratio

(*)

in the regime where is fixed and is large. Here we have made the change of variable .

Write . By 9.3.23 of Abramowitz and Stegun (see e.g. this link) we have the asymptotic

for bounded , where is the Airy function. This shows that the zero has the asymptotic

where is the first negative zero of the Airy function. It also suggests that

.

Meanwhile, 9.3.7 of Abramowitz and Stegun (Debye asymptotic) tells us that

for any (this is also consistent with the exponential decay of the Airy function to the right), so if we set (EDIT: this should be ), the ratio (*) should take the form

(EDIT: this should be ) which is exponentially decaying in as claimed in the post. (Unfortunately, we cannot directly use these asymptotics to get explicit values of because we need control on the error terms, which standard sources (e.g. Abramowitz-Stegun) do not supply.)
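As a sanity check on the Airy-type asymptotic for the first zero, one can compare it with a direct numerical computation. The sketch below makes several assumptions that are not from the thread: it restricts to integer order $\nu = n$, evaluates $J_n$ through Bessel's integral $\frac{1}{\pi}\int_0^\pi \cos(nt - x \sin t)\,dt$ with the trapezoid rule, and uses only the leading term $n - a_1 (n/2)^{1/3}$ of the zero asymptotic, where $a_1 \approx -2.338107$ is the first negative Airy zero. The function names are hypothetical, and no rigorous error control is claimed.

```python
import math

AIRY_ZERO = -2.338107410459767  # first negative zero of Ai (standard value)


def bessel_j(n, x, steps=400):
    """J_n(x) for integer n >= 0 via Bessel's integral and the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # endpoint values at t=0, t=pi
    for i in range(1, steps):
        t = i * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def first_positive_zero(n, step=0.05):
    """First positive zero of J_n: scan upward from x = n (the first zero
    exceeds the order), then refine the sign change by bisection."""
    lo = float(n)
    while bessel_j(n, lo + step) > 0:
        lo += step
    hi = lo + step
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j(n, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def leading_asymptotic(n):
    """Leading-order approximation n - a_1 * (n/2)^(1/3) to the first zero."""
    return n - AIRY_ZERO * (n / 2.0) ** (1.0 / 3.0)


for n in (10, 50):
    print(n, first_positive_zero(n), leading_asymptotic(n))
```

The relative discrepancy shrinks as the order grows, consistent with the lower-order corrections being of relative size $O(n^{-4/3})$; this kind of check only tests plausibility and does not replace the explicit error bounds asked for above.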

Please note that the upper bound for the integration in the numerator is defined in theorem 5 to be (not its square), and in the presentation of Pintz argument it is defined (for a general pair) to be , so it is not clear which is the correct one.

In my last message I did not notice that the integrals are after change of variable, but I think that in the numerator should be replaced by its square root. And more importantly, the integral (at least for even ) can be precisely evaluated as a finite linear combination of squares of Bessel functions by finding the indefinite integral using the recursive formula (11.3.33) and the formula (11.3.34) in Abramowitz and Stegun.
(If there is a corresponding formula for , then the recursive formula can be used to compute the integral for odd as well.)

A minor technical point: there is one tiny way in which -densely divisible numbers are worse than -smooth numbers: the behaviour is not quite hereditary with respect to factors. If and is -smooth, then is automatically -smooth as well. On the other hand, if is merely -densely divisible, then is only -densely divisible, there is a loss of (basically because could suck up all the useful small primes that are giving the dense divisibility property). This makes the Type III analysis a little more complicated as there is a part in which one forces the frequency parameter to be coprime to the modulus by Mobius inversion, dividing both and by the common factor . However even with the worsening of by , one can check that the net power of remains favorable and the coprime case still dominates, so this is not actually an issue. (But something to look out for when we do the final writing up of the arguments.)
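Here is a quick numerical illustration of this non-hereditary behaviour. It is a sketch under assumptions: I read the loss as saying that a factor $n/q$ of a $y$-densely divisible $n$ is only guaranteed to be $yq$-densely divisible, and I test dense divisibility through the ratio of consecutive divisors (which corresponds to letting $R$ range over real numbers). The helper names are hypothetical.

```python
from math import isqrt


def divisors(n):
    """Sorted list of the positive divisors of n."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))


def is_densely_divisible(n, y):
    """Assumed criterion: consecutive divisors of n differ by a factor <= y."""
    ds = divisors(n)
    return all(b <= y * a for a, b in zip(ds, ds[1:]))


# n = 2^4 * 17; dividing out q = 16 removes all the useful small primes.
n, q, y = 272, 16, 2
print(is_densely_divisible(n, y))           # True
print(is_densely_divisible(n // q, y))      # False: 17 has divisors 1, 17 only
print(is_densely_divisible(n // q, y * q))  # True

# Brute-force search for counterexamples to the assumed y*q bound.
violations = [
    (m, d)
    for m in range(1, 400)
    if is_densely_divisible(m, 2)
    for d in divisors(m)
    if not is_densely_divisible(m // d, 2 * d)
]
print(violations)  # []
```

The empty list is what one expects: given $R \leq n/q$, a divisor $d$ of $n$ in $[R/y, R]$ yields the divisor $d/\gcd(d,q)$ of $n/q$, which lies in $[R/(yq), R]$.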

I verified that this works with the Maple code below (using crude upper and lower bounds on the Bessel integral after verifying that the Bessel function was decreasing from to , and was on the other side of the maximum at ). By using the bound in the display after Theorem 6 rather than perform any further estimations, there is actually a little bit of margin: is allowed to be as large as but is only of size , with dominating, , and . It seems likely that we could squeeze by one or two more by changing the free parameters a bit, but it is very tight and I didn’t try too hard to do so.

Actually it is known that prime gaps can be arbitrarily large. For instance, if n is a natural number, then are all composite, creating a prime gap of length at least . I think 1476 may be the smallest H for which the location of the first prime gap of length H is known, though.
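For concreteness, here is a tiny sketch of the classical construction, assuming the elided run is the standard one $n!+2, \dots, n!+n$ (the formula itself did not survive in the comment above). Each $n!+j$ with $2 \leq j \leq n$ is divisible by $j$, so the code simply records that witness; the function name is hypothetical.

```python
from math import factorial


def factorial_gap_witnesses(n):
    """Pairs (m, j) with m = n! + j for 2 <= j <= n; j divides m."""
    f = factorial(n)
    return [(f + j, j) for j in range(2, n + 1)]


for n in range(2, 9):
    # j is a proper divisor of n! + j, so every listed m is composite.
    assert all(m % j == 0 and 1 < j < m for m, j in factorial_gap_witnesses(n))

print(factorial_gap_witnesses(5))  # [(122, 2), (123, 3), (124, 4), (125, 5)]
```

This gives $n-1$ consecutive composites near $n!$, which is far from the first occurrence of a gap of that size; that is why the tabulated first occurrences are a separate question.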

My feeling though is that we are going to run out of improvements before we reach that level; looks like a more realistic target at this stage.

Thanks! Indeed the term “maximal prime gap” used in the OEIS entry is confusing, sorry to just quote it. It means “maximal gap for a given size integer”. So 1476 is (according to that table) the *largest* H for which the location of the first prime gap of length H is known.
Anyway, even H = 10^5 is quite exciting.
David Metzler

Please note that if is even, then by formulas (11.3.33) and (11.3.34) in Abramowitz & Stegun both integrals can be expressed (and thereby precisely computed) as simple finite linear combinations of squares of Bessel functions! (For odd k, one needs an analogous formula to (11.3.34) in which replaces .)

Eytan: in extending the formula to all complex , one needs to be careful, e.g. one needs to check if the integral converges (hopefully locally uniformly, the difficulty coming from ). Already the extension of is tricky to all complex . Certainly everything is fine for .

I agree that the integral can’t be entire in , since for any fixed , (9.1.10) in A&S implies that the integral diverges as tends (from the right) to – which must therefore be a singular point for any analytic continuation.

Eytan: Right. Probably there is no problem for , as the integral converges absolutely and locally uniformly there. I stuck to just to be safe, because the definition beyond that range is less straightforward (according to Watson’s book). BTW I am also quite pleased and surprised that so many transparent features of Bessel functions play a role in this project. I come from automorphic forms where this phenomenon is widespread, but (I think) it is new for sieve theory where this project belongs.

There is another interesting “sub-multiplicative” property of (shared also by many entire functions of order ) which is very useful for improving the current bounds on the 's. (I'm preparing the details.)

Thanks for the correction and improved argument! I edited the main post accordingly. The net effect is to improve the quantity in Theorem 5 from

to

which should hopefully squeeze a bit more out of starting from the current constraint since it delays the time when the error starts exploding and so may be taken a bit larger than previously. Presumably some perturbation of v08ltu's choice of parameters

Let and , then the paragraph below (11) seems to work only when (e.g. for the first step gives nothing). For we can do the following. We have , hence there is a divisor of in the interval . Multiplying this divisor by yields a divisor of in the interval .

for (say) v08ltu’s most recent values: the latter is about times the G-ratio but the former seems to be something like , peaking at about 4200. Somehow the problem is that is so small that the double factorial in the denominator has a hard time counteracting the factors of in the numerator. (The current estimate also blows up as , but the blowup seems to be milder somehow.)

1. Let , be non-vanishing on (with the possible vanishing at .) Consider the following inequality for:

(1)

Where for each fixed is (obviously) the best (uniform) multiplier .

Clearly, a sufficient condition for to hold for each fixed , is that the function should be non-increasing (as a function of ) on .
That is, its logarithm (as a function of ) should be non-increasing on . Or equivalently, the logarithmic derivative of should be non-positive on .

It follows that the above sufficient condition is implied by

(2) is concave on .

To prove the implication, observe that for a fixed , the derivative of is where is the logarithmic derivative of .
Now (2) implies that (as a non-increasing function on ) satisfies whenever , which proves the sufficiency of (2) for (1) with .

This motivates the following

[For some reason, LaTeX is not working on the line below, I was unable to fix it – T.]

Definition: A function g \in C^1[0,1] will be called nice (on [0,1]) if it is non-vanishing on [0, 1) and satisfies

(3) and is concave on

Consequently, any nice function g on [0, 1] satisfies

(4) , whenever

(4') on .

Where (4') follows from the concavity of on .

It is easy to verify the following properties

(i) A finite product of nice functions is nice.
(ii) If is a sequence of nice and analytic functions on , converging locally uniformly on to a function g, then g is (obviously) analytic and nice on .

Examples: If and , then the functions

are nice, where

Theorem: Let be an entire function of order , having its zeros in and satisfying . Then is a nice function on and, consequently, it satisfies properties (4)-(4').

Proof: Denote by the genus and order of g respectively. Since we have two possibilities:
(i) : implying that the canonical product representation of g is given by where denote the zeros of . Since (by the first example) each canonical factor (i.e. ) is nice and the canonical product converges locally uniformly to on , it follows (from the above properties) that g is nice on .
(ii) : implying that the canonical product representation of g is given similarly by with the same proof.

2. Applications:

Let for, then the product representation (9.5.10) in A&S shows that f is an entire function of genus having its zeros in .
By the above theorem, is nice on , and hence properties (4) and (4') imply

(5) whenever the 's are non-negative and .
Remark: the last limitation on the sum is unnecessary if is modified to on .

(5') for each
(with the above modification of for , (5') holds obviously for each ).

Remark: it is easy to verify that
which for large is asymptotic to .

Using (5), (5') it is easy to verify that for each

Where

Since is asymptotic (for large ) to , it seems that the last exponential bound is inferior to the current bound which determines the upper bound on . A similar situation exists for the new bound

relative to the current bound which determines the current upper bound on .

The situation seems better for the new bound on :

Using properties (5)-(5') of , it follows that

It is easy to verify that the last exponential bound implies the current bound with with the parameter in the integrand, but without the constant multiplier .

This is a nice observation! I’m quite surprised to see that the log-concavity holds also for , though at least in a sufficiently small neighborhood of a zero, the dominating logarithmic singularity is concave, the problem is that the logarithmic derivative is positive sufficiently close to the right of each zero, this is exactly the reason why the interval of interest should not have zeros of .
Another well known example is known to be entire, log-concave and monotonically decreasing to the right of its maximizer (which is in ) – in this case the zeros are to the left of the “interesting interval”!

In your Maple worksheet, you integrate, deltat to theta, but the formula listed by Harcos is .

[Well spotted! But I think Gergely’s formula has a typo, the integral here (which is counting primes up to ) should go up to rather than . In any case the exponential decay of the integrand renders the upper limit of the integral more or less irrelevant. -T.]

[…] The difference between this conjecture and the stronger conjecture is that the modulus is constrained to be -densely divisible rather than -smooth (note that is no longer constrained to lie in ). This relaxation of the smoothness condition improves the Goldston-Pintz-Yildirim type sieving needed to deduce from ; see this previous post. […]

recording most of the techniques we have for converting results of the form or to . Nothing new on that page, but it at least collects all the different implications that we have found in a single location (though they are all basically obsolete now that we have Pintz’s sieve).

I don’t think we need this constraint. If we replace by , then the function does not change, hence the condition of Theorem 5 remains the same, while its proof works verbatim, i.e. the original stated conclusion holds.
