254A, Notes 6: Large values of Dirichlet polynomials, zero density estimates, and primes in short intervals

In the previous set of notes, we studied upper bounds on sums such as for that were valid for all in a given range, such as ; this led in turn to upper bounds on the Riemann zeta function for in the same range, and for various choices of . While some improvement over the trivial bound of was obtained by these methods, we did not get close to the conjectural bound of that one expects from pseudorandomness heuristics (assuming that is not too large compared with , e.g. ).

However, it turns out that one can get much better bounds if one settles for estimating sums such as , or more generally finite Dirichlet series (also known as Dirichlet polynomials) such as , for most values of in a given range such as . Equivalently, we will be able to get some control on the large values of such Dirichlet polynomials, in the sense that we can control the set of for which exceeds a certain threshold, even if we cannot show that this set is empty. These large value theorems are often closely tied with estimates for mean values such as of a Dirichlet series; these latter estimates are thus known as mean value theorems for Dirichlet series. Our approach to these theorems will follow the same sort of methods used in Notes 3, in particular relying on the generalised Bessel inequality from those notes.
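To make the notion of "large values" concrete, here is a minimal numerical sketch. It assumes the Dirichlet polynomial has the standard form D(t) = sum over n up to N of a_n n^(-it); the function names, the coefficient choice a_n = 1, the sample grid and the threshold are all illustrative assumptions, not taken from the notes.

```python
import cmath
import math


def dirichlet_poly(coeffs, t):
    """Evaluate D(t) = sum_{n=1}^{N} a_n * n^(-i t), where coeffs[n-1] = a_n."""
    return sum(a * cmath.exp(-1j * t * math.log(n))
               for n, a in enumerate(coeffs, start=1))


def large_value_points(coeffs, ts, threshold):
    """Return the sample points t at which |D(t)| exceeds the threshold."""
    return [t for t in ts if abs(dirichlet_poly(coeffs, t)) > threshold]


if __name__ == "__main__":
    N = 200
    coeffs = [1.0] * N                          # a_n = 1 (illustrative choice)
    ts = [N + 0.25 * k for k in range(4000)]    # sample grid (an assumption)
    big = large_value_points(coeffs, ts, 3 * math.sqrt(N))
    print(f"{len(big)} of {len(ts)} sample points exceed 3*sqrt(N)")
```

The large value theorems below bound the size of sets like the one returned by `large_value_points`, rather than showing that such sets are empty.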

Our main application of the large value theorems for Dirichlet polynomials will be to control the number of zeroes of the Riemann zeta function (or the Dirichlet L-functions ) in various rectangles of the form for various and . These rectangles will be larger than the zero-free regions for which we can exclude zeroes completely, but we will often be able to limit the number of zeroes in such rectangles to be quite small. For instance, we will be able to show the following weak form of the Riemann hypothesis: as , a proportion of zeroes of the Riemann zeta function in the critical strip with will have real part . Related to this, the number of zeroes with and can be shown to be bounded by as for any .

In the next set of notes we will use refined versions of these theorems to establish Linnik’s theorem on the least prime in an arithmetic progression.

Our presentation here is broadly based on Chapters 9 and 10 in Iwaniec and Kowalski, who give a number of more sophisticated large value theorems than the ones discussed here.

Theorem 1 ( estimate on large values) Let and , and let be a sequence of complex numbers. Let be a -separated set of real numbers in an interval of length at most , thus for all . Let be real numbers. Then

This estimate is closely analogous to the analytic large sieve inequality (Proposition 6 from Notes 3). The factor is needed for technical reasons, but should be ignored at a first reading since it is comparable to one on the range . The bound (1) can be compared against the Cauchy-Schwarz bound

and against the pseudorandomness heuristic that should be roughly of size for “typical” .

Proof: We first observe that to prove the theorem, it suffices to do so in the case . Indeed, if , one can simply increase until it reaches , which does not significantly affect the factor ; conversely, if , one can partition into subsets, each of diameter at most , and the claim then follows from the triangle inequality.

Without loss of generality we may assume the are in increasing order, so in particular for any .

Next, we apply the generalised Bessel inequality (Proposition 2 of Notes 3), using the weight defined as , where is a smooth function supported on which equals one on . This lets us bound the left-hand side of (1) by

for some coefficients with .

By choice of , we have

so it will suffice to show that

We now focus on the expression

When , we can bound this by , which gives an acceptable contribution; now we consider the case . One could estimate (2) in this case using Proposition 6 of Notes 5 and summation by parts to get a bound of here, but one can do better (saving a logarithmic factor) by exploiting the smooth nature of . Namely, by using the Poisson summation formula (Theorem 34 of Supplement 2), we can rewrite (2) as

which we rescale as

Since , one can check that the derivative of the phase is on the support of when , and when is non-zero, while the second derivative is . Meanwhile, the function is compactly supported and has the first two derivatives bounded by . From this and two integrations by parts we obtain the bounds

when and

when , thus on summing

We thus need to show that

But if one bounds we obtain the claim.

There is an integral variant of this large values estimate, although we will not use it much here:

for any and . (Hint: Reduce to the case when and is supported on the range . Averaging Theorem 1 gives an upper bound of roughly the right order of magnitude; to get the asymptotic, apply Plancherel’s theorem to an expression of the form where is the indicator function convolved by some rapidly decreasing function whose Fourier transform is supported on (say) , and use the previous upper bound to control the error.)
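The integral variant can be explored numerically. The sketch below assumes it takes the shape of the classical mean value theorem for Dirichlet polynomials, namely that the integral over [0, T] of |sum a_n n^(-it)|^2 equals (T + O(N)) times the sum of |a_n|^2; the exact statement is not reproduced above, so this form, the function names and the parameters are assumptions made for illustration.

```python
import cmath
import math


def mean_square(coeffs, T, step=0.02):
    """Midpoint Riemann sum for (1/T) * integral_0^T |sum_n a_n n^(-it)|^2 dt."""
    logs = [math.log(n) for n in range(1, len(coeffs) + 1)]
    steps = int(round(T / step))
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * step
        value = sum(a * cmath.exp(-1j * t * lg) for a, lg in zip(coeffs, logs))
        total += abs(value) ** 2
    return total * step / T


def l2_norm_squared(coeffs):
    """Return sum_n |a_n|^2, the expected mean square when T dominates N."""
    return sum(abs(a) ** 2 for a in coeffs)
```

For instance, with ten coefficients equal to 1 and T = 500, the ratio `mean_square(coeffs, 500.0) / l2_norm_squared(coeffs)` should be close to 1, since T is much larger than N in that case.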

We will use the large values estimate in the following way:

Corollary 3 Let , and let be a sequence of complex numbers obeying a bound of the form for all . Let . Then, after deleting at most unit intervals from , we have

for all and all in the remaining portion of .

One can view this corollary as improving upon the trivial bound

coming from mean value theorems on multiplicative functions (see Proposition 21 of Notes 1), if one is allowed to delete some unit intervals from the range of , with the bound improving as one deletes more and more intervals.

Proof: Let be the set of all for which

for at least one choice of . By the greedy algorithm, we can cover by the union of intervals of the form , where are -separated points in . By hypothesis, we can find such that

for all , and hence

On the other hand, from mean value theorems on multiplicative functions (see Proposition 21 of Notes 1) we have

The above estimate turns out to be rather inefficient if is very small compared with , because the factor becomes large compared with and so it is not even clear that one improves upon the trivial bound (4). However, by multiplying Dirichlet polynomials together one can get a good bound in this regime:

Corollary 4 Let , and let be a sequence of complex numbers obeying a bound of the form for all . Let , and let be a natural number. Then, after deleting at most unit intervals from , we have

for all and all in the remaining portion of .

Proof: We raise the original Dirichlet polynomial to the power to obtain

where is the Dirichlet convolution

From the bounds on we have . Subdividing into dyadic intervals and applying Corollary 3 (with replaced by ) and the triangle inequality, we conclude that upon deleting unit intervals from , we have

for all and all outside these intervals. Applying (6) and taking roots, we obtain the claim.
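The identity behind this proof, that the k-th power of a Dirichlet polynomial is the Dirichlet polynomial of the k-fold Dirichlet convolution of its coefficients, can be checked numerically. This is a minimal sketch; the function names and the sample coefficients are hypothetical.

```python
import cmath
import math


def dirichlet_convolve(a, b):
    """Dirichlet convolution c_n = sum_{d*m = n} a_d * b_m (lists hold a_1, a_2, ...)."""
    c = [0] * (len(a) * len(b))
    for d, ad in enumerate(a, start=1):
        for m, bm in enumerate(b, start=1):
            c[d * m - 1] += ad * bm
    return c


def dirichlet_power(a, k):
    """k-fold Dirichlet convolution of a with itself (k = 0 gives the identity)."""
    result = [1]
    for _ in range(k):
        result = dirichlet_convolve(result, a)
    return result


def evaluate(coeffs, s):
    """Evaluate sum_n a_n * n^(-s) for a complex number s."""
    return sum(c * cmath.exp(-s * math.log(n))
               for n, c in enumerate(coeffs, start=1))
```

Note that a polynomial of length N raised to the power k has length N^k, which is why the proof subdivides the longer range into dyadic pieces before applying Corollary 3.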

Exercise 5 Show that if and , then after deleting at most unit intervals from , one has for all in the remaining portion of . Conclude the fourth moment bound

as . The higher moments for have been intensively studied, and are conjectured by Conrey and Gonek (using the random matrix model, see Section 4 of Supplement 4) to be asymptotic to for certain explicit constants , but this remains unproven. It can be shown that the Lindelof hypothesis is equivalent to the assertion that for all and . In a recent paper of Harper, it was shown assuming the Riemann hypothesis that for all and .

It is conjectured by Montgomery that Corollary 4 also holds for real , after replacing the factor by for any fixed ; see this paper of Bourgain for a counterexample showing that such a factor is necessary. (Amusingly, this counterexample relies on the existence of Besicovitch sets of measure zero!) If one had this, then one could optimise (5) in the range for any fixed by choosing so that , arriving (morally, at least) at a bound of the form

thus beating the trivial bound by about . This bound would have many consequences, most notably the density hypothesis discussed below. Unfortunately, Montgomery’s conjecture remains open. Nevertheless, one can obtain a weaker version of this bound by choosing natural numbers so that is close to , rather than exactly equal to :

Corollary 7 Let be such that for some , and let be a sequence of complex numbers obeying a bound of the form for all . Let . Then, after deleting at most unit intervals from , we have

for all and all in the remaining portion of .

Note that the bound (8) is not too much worse than (7) in the important regimes when is close to and when is close to .

Proof: We can choose a natural number such that , so that . Applying Corollary 4 with replaced by , we conclude (after deleting at most unit intervals from ) that

since , one obtains (8) with in place of . If instead one uses Corollary 4 with rather than , one obtains (after deleting a similar number of unit intervals) that

Exercise 8 If one has the additional hypothesis , show that one can replace the factor in (8) by . Thus we can approach the conjectured estimate (7) in the regime where is very small compared with .

In the regime when is small, one can obtain better bounds by exploiting further estimates on exponential sums such as (2). We will give just one example (due to Montgomery) of such an improvement, referring the reader to Chapter 9 of Iwaniec-Kowalski or Chapter 7 of Montgomery for further examples.

Proposition 9 Let , and let be a sequence of complex numbers obeying a bound of the form for all . Let . Then, after deleting at most unit intervals from , we have

for all and all in the remaining portion of .

Note that this bound can in fact be superior to (7) if is a little bit less than and is less than .

Proof: Let be the set of all for which

for at least one choice of , where is a large constant to be chosen later. By the greedy algorithm as before, we can cover by the union of intervals of the form , where are -separated points in ; we arrange the in increasing order, so that . By hypothesis, we can find such that

for all , and hence

As before, we have

and so by the generalised Bessel inequality (with ) we may bound the left-hand side of (10) by

for some with . The diagonal contribution is . For the off-diagonal contributions, we see from Propositions 6, 9 of Notes 5 (dealing with the cases and respectively) and summation by parts that

The second term on the right-hand side may be absorbed into the left-hand side if is large enough, and we conclude that

which gives , and the claim follows.
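The greedy covering step used in the proofs of Corollary 3 and Proposition 9 can be sketched as follows. The separation parameter `sep` (taken to be 1 here) and the function name are assumptions, since the exact interval lengths are not reproduced above.

```python
def greedy_separated(points, sep=1.0):
    """Greedily pick sep-separated points so that every input point lies
    within distance sep of some chosen point."""
    chosen = []
    for t in sorted(points):
        # Keep t only if it is at least sep beyond the last chosen point.
        if not chosen or t - chosen[-1] >= sep:
            chosen.append(t)
    return chosen
```

Every discarded point lies within `sep` to the right of the most recently chosen point, so the intervals of radius `sep` around the chosen points cover the whole input set, while the chosen points themselves are `sep`-separated and hence eligible for Theorem 1.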

— 2. Zero density estimates —

We now use the large value theorems for Dirichlet polynomials to control zeroes of the Riemann zeta function with large real part . The key connection is that if is a zero of , then this will force some Dirichlet polynomial to be unusually large at that zero, which can be ruled out by the theorems of the previous section (after excluding some unit intervals from the range of ). Dirichlet polynomials with this sort of property are sometimes known as zero-detecting polynomials, for this reason.

Naively, one might expect, in view of the identities and for , that Dirichlet polynomials such as or might serve as zero-detecting polynomials. Unfortunately, in the regime , it is difficult to control the tail behaviour of these series. A more efficient choice comes from the following observation. Suppose that for some and , where is large. From Exercise 33 of Supplement 3, we have

for any , where is a smooth function equaling on and zero on , and is a sufficiently large multiple of . Thus

To eliminate the terms coming from small , we multiply both sides by the Dirichlet polynomial for some fixed . This series can be very crudely bounded by , so that (after adjusting ) we have

In particular, we have

where is the sequence

In particular, for large we have

and so serves as a zero-detecting polynomial.
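To see why multiplying by a truncated Möbius sum removes the terms coming from small indices, one can compute the coefficients a_n = sum of mu(d) over divisors d of n with d at most M. This sketch uses a sharp cutoff M instead of the smooth cutoff in the text, and the names `mobius_table` and `truncated_coefficients` are hypothetical.

```python
def mobius_table(limit):
    """Return a list mu with mu[n] the Mobius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one sign flip per prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # not squarefree
    return mu


def truncated_coefficients(M, N):
    """Return a with a[n] = sum_{d | n, d <= M} mu(d) for 1 <= n <= N (a[0] unused)."""
    mu = mobius_table(M)
    a = [0] * (N + 1)
    for d in range(1, M + 1):
        if mu[d]:
            for n in range(d, N + 1, d):
                a[n] += mu[d]
    return a
```

By Möbius inversion, a[1] equals 1 and a[n] vanishes for n between 2 and M, while each |a[n]| is at most the number of divisors of n.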

From Möbius inversion, we see that is supported on the range , and is bounded by . We can thus decompose the Dirichlet polynomial as the sum of expressions of the form for . Applying Corollary 7 (or Corollary 3 to deal with the cases when ), we see that for any , and after deleting at most unit intervals from , we have

for all choices of and for all and all in the remaining portion of . Adding a large multiple of to (which does not significantly affect the error ), we can improve this to

(say). Summing this in , we see that

if

and is sufficiently large depending on . Comparing this with (11), we conclude

Proposition 10 Let and , and let be sufficiently large depending on . Then, after deleting at most unit intervals from , there are no zeroes of of the form with and in the remaining portion of .

For any and , let denote the number of zeroes of the Riemann zeta function (counting multiplicity) with and . Recall from Proposition 16 of Notes 2 that there are at most zeroes in any rectangle of the form with . Combining this with the previous proposition, we conclude

Corollary 11 (Zero-density estimate) Let and . Then one has

Proof: By dyadic decomposition (and reflection symmetry) we may replace the constraint in the definition of by . The claim then follows from the previous proposition with and a suitably small choice of .

Exercise 12 (Weak Riemann hypothesis) For any and , show that

for all . Conclude in particular that for any , only of the zeroes of the Riemann zeta function in the rectangle have real part greater than or less than .
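For the "proportion of zeroes" part of this exercise it helps to have the total count of zeroes in mind. The sketch below evaluates the main term of the Riemann-von Mangoldt formula, N(T) = (T/2π) log(T/2πe) + 7/8 + O(log T); the function name is hypothetical, and the formula is recalled here as a standard fact, not quoted from these notes.

```python
import math


def zero_count_main_term(T):
    """Main term (T / 2 pi) * log(T / (2 pi e)) + 7/8 for the number of
    zeta zeroes with imaginary part in (0, T]."""
    x = T / (2 * math.pi)
    return x * (math.log(x) - 1) + 7 / 8
```

At T = 100 this gives about 29.0, matching the 29 zeroes with imaginary part between 0 and 100.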

Remark 13 The sharpest result known in the direction of the weak Riemann hypothesis is by Selberg, who showed that for any and . Thus, “most” zeroes with lie within of the critical line. Improving upon this result seems to be closely related to making progress on the pair correlation conjecture (see Section 4 of Supplement 4).

Corollary 11 is far from the sharpest bound on known, but it will suffice for our applications; see Chapter 10 of Iwaniec and Kowalski for various stronger bounds on , formed by using more advanced large values estimates as well as more complicated zero detecting polynomials. The density hypothesis asserts that

for any , , and , thus replacing the exponent in Corollary 11 with . This hypothesis is known to hold for (a result of Huxley, in fact his estimates are even stronger than what the density hypothesis predicts), but is open in general; it is known that the exponent in Corollary 11 can be replaced by (by the work of Ingham and of Huxley), but at the critical value no further improvement is currently known. Of course, the Riemann hypothesis is equivalent to the assertion that for all , which is far stronger than the density hypothesis; in Exercise 15 below we will see that even the Lindelof hypothesis is sufficient to imply the density hypothesis. However, the density hypothesis can be a reasonable substitute for the Riemann hypothesis in some settings (e.g. in establishing prime number theorems in short intervals, as discussed below), and looks more amenable to a purely analytic attack (e.g. through resolution of the exponent pair conjecture, discussed briefly in Notes 5) than the Riemann hypothesis.

In Exercise 32 of Notes 2, it was observed that if there were no zeroes of the Riemann zeta function with real part larger than , then one had the upper bound as for fixed . The following exercise gives a sort of converse to this claim, “modulo the density hypothesis”:

Exercise 15 (Riemann zero usually implies large value of ) Let and , and let . Show that after deleting at most unit intervals from , the following holds: whenever is a zero of the Riemann zeta function with and in the remaining portion of , we have

for some with and . (Hint: use Exercise 8 to dispose of the portion of the zero-detecting polynomial with . Then divide out by and conclude that a sum such as is large for some . Then use Lemma 34 from Supplement 3.) Conclude in particular that the Lindelof hypothesis implies the density hypothesis.

Remark 16 One can refine the above arguments to show that the density hypothesis is equivalent to the assertion that for any , one has the Lindelof-type bounds for all and all in the set with at most unit intervals removed, as . An application of Jensen’s formula shows that this claim is equivalent in turn to the assertion that for any and , the set contains zeroes of for all with at most involved. Informally, what this means is that the density hypothesis fails not just through the existence of a sufficient number of “exceptional” zeroes with , but in fact through a sufficient number of clumps of exceptional zeroes – such zeroes within a unit distance of each other.

— 3. Primes in short intervals —

As an application of the zero density estimates we obtain a prime number theorem in short intervals:

Theorem 17 (Prime number theorem in short intervals) For any fixed , we have

whenever and . In particular, there exists a prime between and whenever is sufficiently large depending on .

Theorems of this type were first obtained by Hoheisel in 1930. If the exponent could be lowered to be below , this would establish Legendre’s conjecture that there always exists a prime between consecutive squares , at least when is sufficiently large. Unfortunately, we do not know how to do this, even under the assumption of the Riemann hypothesis; the best unconditional result is by Baker, Harman, and Pintz, who established the existence of a prime between and for every sufficiently large . Among other things, this establishes the existence of a prime between adjacent cubes if is large enough.
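A short-interval prime count is easy to test empirically. The sketch below counts primes in (x, x + y] with y = x^theta and compares the count with y / log x. The exponent `theta` is a free illustrative parameter (the theorem's exponent is not reproduced above), and the code counts primes directly, not a weighted sum, so it is an assumption-laden illustration and not the statement of Theorem 17.

```python
import math


def primes_in_interval(x, y):
    """Count primes p with x < p <= x + y using a sieve of Eratosthenes."""
    hi = x + y
    sieve = bytearray([1]) * (hi + 1)
    sieve[0:2] = b"\x00\x00"            # 0 and 1 are not prime
    for p in range(2, math.isqrt(hi) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, hi + 1, p)))
    return sum(sieve[x + 1:hi + 1])


def short_interval_ratio(x, theta):
    """Ratio of the prime count in (x, x + x^theta] to the prediction y / log x."""
    y = int(x ** theta)
    return primes_in_interval(x, y) / (y / math.log(x))
```

For example, `short_interval_ratio(10**6, 0.75)` should come out close to 1; numerical agreement at a single x is of course no substitute for the zero density argument in the proof.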

Proof: From the truncated explicit formula (Theorem 21 from Notes 2) we have

From the Vinogradov-Korobov zero-free region (Exercise 5 of Notes 5), we see that the integrand vanishes when for any , if is sufficiently large depending on . (Indeed, the Vinogradov-Korobov region gives more vanishing than this, but this is all that we shall need.) Using this and Corollary 11, we may bound the left-hand side of (14) by

and the claim follows by choosing large enough depending on .

Exercise 18 Assuming the density hypothesis, show that the exponent in Theorem 17 can be replaced by ; thus the density hypothesis, though weaker than the Riemann or Lindelof hypotheses, is still strong enough to get a near-miss to the Legendre conjecture.

Exercise 19 Using the Littlewood zero-free region (see Exercise 4 of Notes 5 and subsequent remarks) in place of the Vinogradov-Korobov zero-free region, obtain Theorem 17 with replaced by some absolute constant . (This is close to the original argument of Hoheisel.)

As we have seen several times in previous notes, one can obtain better results on prime number theorems in short intervals if one works on average in , rather than demanding a result which is true for all . For instance, we have

Theorem 20 (Prime number theorem on the average) For any fixed , we have

whenever and . In particular, there is a prime between and for almost all in , in the sense that the set of exceptions has measure .

But this can be done by a repetition of the arguments used to establish (14); indeed, the left-hand side is

and by using the Vinogradov-Korobov zero-free region and Corollary 11 as before, we obtain the claim.

Exercise 21

(i) Assuming without proof that the exponent in Corollary 11 can be lowered from to , show that the exponent in Theorem 20 may be lowered from to ; this implies in particular that Legendre’s conjecture is true for “almost all” .

(ii) Assuming the density hypothesis, show that the exponent in Theorem 20 can be lowered all the way to zero.

— 4. L-function variants —

We have already seen in previous notes that many results about the Riemann zeta function extend to L-functions , with the variable playing a role closely analogous to that of the imaginary ordinate . Certainly, for instance, one can prove zero-density estimates for a single L-function for a single Dirichlet character of modulus by much the same methods as given previously, after the usual modification of replacing logarithmic factors such as with instead.

However, one can also prove “grand density theorems” in which one counts zeroes not just of a single L-function , but of a whole family of L-functions , in which ranges in some given family (e.g. all Dirichlet characters of a given modulus ). Such theorems are particularly useful when trying to control primes in an arithmetic progression , since Fourier expansion then requires one to consider all characters of modulus simultaneously. (It is also of interest to obtain grand density theorems for all Dirichlet characters of modulus up to some threshold , but we will not need to discuss this variant here.) In order to get good results in this regard, one needs a version of the large values theorems in which one averages over characters as well as imaginary ordinates . Here is a typical such result:

Theorem 22 ( estimate on large values with characters) Let and , let be a natural number, and let be a sequence of complex numbers. Let be real numbers in an interval of length , and let be Dirichlet characters of modulus (possibly non-primitive or principal). Assume the following separation condition on the pairs for :

Then for any , we have

Note that this generalises Theorem 1, which is the case of Theorem 22.

Proof: We mimic the proof of Theorem 1. By increasing , we may reduce without loss of generality to the case . As before, we apply the generalised Bessel inequality with the same weight used in Theorem 1, to reduce to showing that

Bounding and using symmetry, it thus suffices to show that

for each .

We now focus on the expression

As before, this expression is bounded by in the diagonal case , which gives an acceptable contribution, so we turn to the case . We split into two subcases, depending on whether or not.

First suppose that . We then split the sum in (17) into sums, each arising from a primitive residue class , on which . Applying a change of variables and then using Poisson summation as in the proof of Theorem 1, one can show that each individual sum is , so the total sum (17) is . Summing over all the with , the are -separated by hypothesis, and so we obtain an acceptable contribution in this case.

Now suppose that . Again, we can split the sum in (17) into sums, each of which is equal to times

for a primitive residue class . Repeating the analysis of Theorem 1, but now noting that is not required to be bounded below, we can write this latter sum as

Summing in , and noting that has mean zero, we see that the main term here cancels out, and we can bound (17) by . The total number of possible can be bounded by , and so the contribution of this case is acceptable (with a bit of room to spare).
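The cancellation of the main term rests on the fact that a non-principal character has mean zero over a full set of residues. A minimal sketch, using the quadratic character modulo an odd prime as an illustrative example (this particular character and the function names are assumptions, not taken from the proof):

```python
def quadratic_character(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)      # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def character_sum(p):
    """Sum of the quadratic character over a full set of residues mod p."""
    return sum(quadratic_character(a, p) for a in range(p))
```

For every odd prime p the sum returned by `character_sum(p)` is zero, whereas the principal character would sum to p - 1; this is the distinction between the two subcases in the proof.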

Remark 23 If one is willing to lose a factor of , the above proof can be simplified slightly by performing only one integration by parts rather than two in the integrals arising from Poisson summation.

We can now obtain a large values theorem in which the role of the interval is now replaced by the set of pairs , with a Dirichlet character of modulus , and an element of . Inside this set, we consider unit intervals of the form for some Dirichlet character and .

Exercise 24 Let , let be a natural number, and let be a sequence of complex numbers obeying a bound of the form for all . Let .

(i) If is a natural number, show that after deleting at most unit intervals from , one has

for all and all in the remaining portion of .

(ii) If for some , show that after deleting at most unit intervals from , we have

for all and all in the remaining portion of .

For any , , and natural number , let denote the combined number of zeroes of all of the L-functions with of modulus , and , counting multiplicity of course. We can then repeat the proof of Corollary 11 to give

Exercise 25 (Grand zero-density estimate) Let , , and let be a natural number. Show that

Exercise 26 For any and , let denote the combined number of zeroes of all of the L-functions with primitive and of conductor at most , and , counting multiplicity of course. Show that for any , one has

(Hint: one will have to develop an analogue of Theorem 22 and Exercise 24 in which one works with primitive characters of conductor at most , rather than all characters of a fixed modulus , and with all references to replaced by .)

As with the density theorems for the zeta function, the exponents here may be improved somewhat, with the conjecturally being reducible to for any ; see Chapter 10 of Iwaniec and Kowalski. (Of course, the generalised Riemann hypothesis asserts that in fact vanishes whenever .)

Given that the density estimates for the Riemann zeta function yield prime number theorems in short intervals, one expects the grand density estimates to yield prime number theorems in sparse (and short) arithmetic progressions. However, there are two technical issues with this. The first is that the analogue of the Vinogradov-Korobov zero-free region is not necessarily wide enough (in some ranges of parameters) for the argument to work. The second is that one may encounter an exceptional (Landau-Siegel) zero. These difficulties can be overcome, leading to Linnik’s theorem (as well as the quantitative refinement of this theorem by Gallagher); this will be the focus of the next set of notes.

If one proves similar results about cancellations in short intervals for Mobius instead of primes, do you think we can use that information to detect primes in short intervals? I have seen some recent work where they get cancellation in most intervals of size x^delta for Mobius. They use the algebraic structure of Mobius: they build a bilinear structure by factoring out primes and then use standard estimates on Dirichlet polynomials.

Unfortunately one needs error terms of or better in the Mobius function estimates before they become sensitive enough to detect primes (which have density , by the prime number theorem). The arguments of Matomaki and Radziwill (which I assume is what you are referring to) have worse error terms than this, indeed their first step is to throw away the contribution of the primes (as well as some other numbers without enough small prime factors) and to get the cancellation purely from small prime divisors. I think the way one should view their results (and similar results on bounded multiplicative functions) is that questions about the Mobius function and relatives (e.g. Liouville function) are in fact somewhat easier than those about prime numbers (or von Mangoldt function), at least if one is content with error terms that are worse than times the main term.

I’ve thought a bit about this issue in the past and I’ve never come to a clear understanding. It’s clear to me why the standard approach requires an error term to convert from estimates involving Mobius to primes, but it isn’t clear to me how to reconcile this with the fact that is sufficient to imply the PNT. Do you have any insight into this?

Landau’s observation that implies the prime number theorem relies on a multiplicative identity concerning the Mobius function, namely , which allows one to upgrade the estimate to the stronger estimate , and now the error term is stronger than the trivial bound by a little bit more than a factor of a logarithm and so this bound is (in principle, at least) strong enough to discern primes. (It still takes a bit of extra manipulation to see this though, see e.g. Theorem 58 of Notes 1.) Basically, because of the aforementioned identity, there already is an estimate involving the Mobius function in the interval with an extremely good error term (in this case, the identity is exact and the error term is zero). No identity or estimate with such a good error term currently exists for the Mobius function in short intervals if is very small (e.g. less than ).

(The basic problem is that whilst has some limited closure properties with respect to multiplication, which allows one to exploit the multiplicative structure of the Mobius or von Mangoldt functions on this interval by direct elementary manipulations, the interval has no multiplicative closure properties whatsoever, and we only know how to exploit the multiplicative structure via complex analytic/Fourier analytic methods, i.e. through the zeta function. If one averages in x, one recovers a little bit of multiplicative closure that can be used, this is for instance how the method of Katai or Bourgain-Sarnak-Ziegler on establishing a little bit of cancellation in Mobius sums works. But if one restricts to the primes then one again loses all multiplicative closure properties.)

