The twin prime conjecture is, of course, $H_1 = 2$, and the prime $k$-tuples conjecture of Hardy and Littlewood asserts that $H_2 = 6$, $H_3 = 8$ and so on. Goldston-Pintz-Yildirim showed that under the Elliott-Halberstam conjecture, $H_1 < \infty$, a result that Yitang Zhang has famously established unconditionally.
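
For readers who want to see where the numbers $2, 6, 8$ come from: assuming $H_m$ denotes $\liminf_n (p_{n+m}-p_n)$, the $k$-tuples conjecture predicts that $H_m$ equals the smallest possible diameter of an admissible $(m+1)$-tuple, i.e. a tuple that misses at least one residue class modulo every prime. Here is a minimal brute-force sketch of that computation (the function names `is_admissible` and `min_diameter` are my own, purely for illustration):

```python
from itertools import combinations


def is_admissible(hs):
    """True if hs misses at least one residue class modulo every prime.

    Only primes p <= len(hs) need checking: a tuple with k elements
    cannot cover all p residue classes when p > k.
    """
    k = len(hs)
    for p in range(2, k + 1):
        if all(p % q for q in range(2, p)):  # p is prime
            if len({h % p for h in hs}) == p:
                return False
    return True


def min_diameter(k):
    """Smallest h_k - h_1 over admissible k-tuples (k >= 2), by brute force."""
    d = k - 1
    while True:
        # Fix the endpoints 0 and d, then try every choice of interior points.
        for rest in combinations(range(1, d), k - 2):
            if is_admissible((0, *rest, d)):
                return d
        d += 1


print([min_diameter(m + 1) for m in (1, 2, 3)])  # prints [2, 6, 8]
```

The search is exponential in $k$, so this is only meant for the first few values.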

It is my understanding that the methods of GPY were severely limited when dealing with $m>1$. One of their non-trivial results was that $g_n = p_{n+1} - p_n$ satisfies $\liminf_{n\to\infty} \frac{g_n}{\log p_n} = 0$, in other words that $g_n$ is smaller than any fixed multiple of $\log p_n$ infinitely often, which follows immediately from Zhang's result.

However, it is my understanding that even under the full Elliott-Halberstam conjecture, GPY were unable to prove even their weaker result for $m\geq 2$. Maynard and Tao (see http://arxiv.org/abs/1311.4600) have shown that in fact $H_m < \infty$ for every $m$. Apparently, this work only requires the classical Bombieri-Vinogradov theorem, not even the variant of it that Zhang's proof uses.

So my question is this: what shortcoming did the GPY approach have that did not allow it to extend to $m\geq 2$, and how has Maynard's approach solved this problem?

PS: I apologize if this question is somehow inappropriate for MathOverflow. I am an undergraduate trying to read up on these recent results by myself as much as I can, and I figured this would probably be the best place to ask my question.

2 Answers

The major difference is the choice of the Selberg sieve weights. I strongly recommend reading Maynard's paper. It is well written, and the core ideas are nicely explained.

In what follows, I'll give a brief explanation of what Selberg sieve weights are, why they appear, and what was chosen differently in Maynard's paper compared to that of Goldston, Pintz and Yildirim.

Let $\chi_{\mathcal{P}}(n)$ denote the indicator function for the primes. Given an admissible $k$-tuple $\mathcal{H}=\{h_1<h_2<\cdots<h_k\}$, the goal is to choose a non-negative function $a(n)$ so that

$$\sum_{x<n\leq 2x}\left(\sum_{i=1}^k\chi_{\mathcal{P}}(n+h_i)\right)a(n)\geq \rho\sum_{x<n\leq 2x}a(n)\qquad (1)$$

for some constant $\rho>1$. If this is the case, then by the non-negativity of $a(n)$, we have that for some $n$ between $x$ and $2x$ $$\sum_{i=1}^k\chi_{\mathcal{P}}(n+h_i)\geq \lceil\rho \rceil$$ (otherwise the inner sum would be at most $\lceil\rho\rceil-1<\rho$ for every $n$, contradicting $(1)$ as long as $a(n)$ is not identically zero), and so there are at least $\lceil\rho \rceil$ primes in an interval of length at most $h_k-h_1$. This would imply Zhang's theorem on bounded gaps between primes, and if we can take $\rho$ arbitrarily large, it would yield the Maynard-Tao theorem. However, choosing a function $a(n)$ for which equation $(1)$ holds is very difficult. Of course we would like to take $$a(n)=\begin{cases}
1 & \text{when each of the }n+h_{i}\text{ is prime}\\
0 & \text{otherwise}
\end{cases} ,$$
as this maximizes the ratio of the two sides, but then we cannot evaluate $\sum_{x<n\leq 2x} a(n)$ as that is equivalent to understanding the original problem. Goldston, Pintz and Yildirim use what are known as Selberg sieve weights. They chose $a(n)$ so that it is $1$ when each of the $n+h_i$ is prime, and non-negative elsewhere, in the following way:

Let $\lambda_1=1$ and $\lambda_d=0$ when $d$ is large, say $d>R$ where $R<x$ will be chosen to depend on $x$. Then set $$a(n)=\left(\sum_{d|(n+h_1)\cdots(n+h_k)} \lambda_d\right)^2.$$ By choosing $a(n)$ to be the square of the sum, it is automatically non-negative. Next, note that if each of the $n+h_i$ is prime, then since they are all greater than $x$, which is greater than $R$, every divisor $d>1$ of the product exceeds $R$, so the sum over $\lambda_d$ consists only of $\lambda_1$. This means that $a(n)=1$ when each of the $n+h_i$ is prime, and hence $a(n)$ satisfies both of the desired properties. Of course, we do not want $a(n)$ to be large when none, or even very few, of the $n+h_i$ are prime, so we are left with the difficult optimization problem of choosing $\lambda_d$. The optimization in the classical application of the Selberg sieve to bound from above the number of prime tuples involves diagonalizing a quadratic form, leading to

$$\lambda_d=\mu(d)\left(\frac{\log(R/d)}{\log R}\right)^k \quad\text{for } d\leq R.$$

This choice is not so surprising since the related function $\sum_{d|n}\mu(d) \left(\log(n/d)\right)^k$ is supported on the set of integers with at most $k$ distinct prime factors. Goldston, Pintz and Yildirim significantly modified this, and used a higher dimensional sieve to obtain their results.
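
To make the one-dimensional construction concrete, here is a small numerical sketch. It assumes the classical choice $\lambda_d=\mu(d)\left(\log(R/d)/\log R\right)^k$ for $d\leq R$ and $\lambda_d=0$ otherwise; the function names and the toy values of $R$ and $\mathcal{H}$ are my own and are far too small to say anything about the actual sieve estimates. It only illustrates the two properties used above: $a(n)\geq 0$ always, and $a(n)=1$ when every $n+h_i$ is a prime larger than $R$.

```python
from itertools import combinations
from math import log, prod


def prime_factors(m):
    """Set of distinct prime factors of m >= 2, by trial division."""
    ps, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            ps.add(p)
            m //= p
        p += 1
    if m > 1:
        ps.add(m)
    return ps


def selberg_weight(n, hs, R):
    """a(n) = (sum of lambda_d over d | (n+h_1)...(n+h_k))^2.

    Assumed weights: lambda_d = mu(d) * (log(R/d) / log R)^k for d <= R,
    and lambda_d = 0 for d > R. Only squarefree d contribute since mu(d) = 0
    otherwise, so we enumerate subsets of the distinct prime factors.
    """
    k = len(hs)
    primes = sorted(set().union(*(prime_factors(n + h) for h in hs)))
    total = 0.0
    for r in range(len(primes) + 1):
        for combo in combinations(primes, r):
            d = prod(combo)  # empty product is 1, giving the d = 1 term
            if d <= R:
                total += (-1) ** r * (log(R / d) / log(R)) ** k
    return total ** 2


# 11, 13, 17 are all primes larger than R = 10, so only d = 1 survives.
print(selberg_weight(11, (0, 2, 6), 10))  # prints 1.0
```

For other $n$ the value is some non-negative number that one hopes is small on average; controlling that average is exactly the optimization problem described above.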

The critical difference in the approach of Maynard and Tao is choosing weights of the form

$$a(n)=\left(\sum_{d_{i}|(n+h_{i})\ \forall i}\lambda_{d_{1},\dots,d_{k}}\right)^{2}.$$ This gives extra flexibility since each $\lambda_{d_1,\dots,d_k}$ is allowed to depend on the divisors individually. Using this additional flexibility, Maynard and Tao independently established equation $(1)$ for any $\rho>1$, with $k$ depending on $\rho$.
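
Here is a rough sketch of what the multidimensional weights look like in code. It assumes a simplified shape $\lambda_{d_1,\dots,d_k}=\prod_i\mu(d_i)\,F\left(\frac{\log d_1}{\log R},\dots,\frac{\log d_k}{\log R}\right)$ with $\lambda=0$ when $d_1\cdots d_k>R$. This is only in the spirit of the Maynard-Tao weights, not the exact definition in either paper, and `multidim_weight`, `F_product` and `F_sum` are hypothetical names chosen for illustration.

```python
from itertools import combinations, product
from math import log, prod


def squarefree_divisors(m):
    """List of (d, mu(d)) over all squarefree divisors d of m >= 1."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return [(prod(c), (-1) ** r)
            for r in range(len(primes) + 1)
            for c in combinations(primes, r)]


def multidim_weight(n, hs, R, F):
    """a(n) = (sum over d_i | n+h_i of lambda_{d_1,...,d_k})^2.

    Assumed (simplified) weights: lambda = prod(mu(d_i)) * F(t_1, ..., t_k)
    with t_i = log(d_i) / log(R), and lambda = 0 when d_1 * ... * d_k > R.
    """
    total = 0.0
    for choice in product(*(squarefree_divisors(n + h) for h in hs)):
        ds = [d for d, _ in choice]
        if prod(ds) > R:
            continue
        sign = prod(mu for _, mu in choice)
        total += sign * F([log(d) / log(R) for d in ds])
    return total ** 2


def F_product(ts):
    """Treats each coordinate separately: a genuinely k-dimensional choice."""
    return prod(1.0 - t for t in ts)


def F_sum(ts):
    """Depends only on t_1 + ... + t_k, i.e. only on the product d_1...d_k."""
    return (1.0 - sum(ts)) ** len(ts)


print(multidim_weight(11, (0, 2, 6), 10, F_product))  # prints 1.0
```

A function like `F_sum` sees only the product $d_1\cdots d_k$, which is essentially the one-dimensional situation; the point of the new weights is that $F$ may be any suitable function of the individual coordinates, and optimizing over that larger family is where the extra strength comes from.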

In particular, the key is to show that this sum is positive for all large $N$, which means that some terms of the sum must make a positive contribution. The weight $\Lambda_R (n; \mathcal{H}_k, l)^2$ is non-negative, so for some $n$ the factor $\displaystyle \left(\sum_{i=1}^k \theta(n + h_i) - \log 3N \right)$ must be positive, which implies that for infinitely many $n$ the tuple $n+h_1, \cdots, n+h_k$ must contain at least two primes.

Note that here the counting function involves $\theta(n) = \log n$ for $n$ a prime, and zero otherwise. This is based on the premise that this function is easier to work with analytically than just the characteristic function on the primes.

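However, Maynard was able to circumvent this technical difficulty and work with this counting function instead:

$$\sum_{n=N+1}^{2N} \left(\sum_{i=1}^k \chi_{\mathbb{P}}(n+h_i) - \rho\right) w_n.$$

Here $\chi_{\mathbb{P}}$ is the characteristic function of the primes, $\rho$ is a positive real number, and the $w_n$ are non-negative weights.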

Edit: As pointed out below, the main difference between this and the above counting function isn't the use of the characteristic function $\chi_{\mathbb{P}}(n)$ over $\theta(n)$ (because looking at the range of $n$ being considered, one can see immediately that $\log N < \theta(n+h_i) < \log(3N)$ whenever $n+h_i$ is prime and $N$ is large). The difference, then, was that Maynard was able to obtain a positivity result for (comparing to the original counting function)
$$\displaystyle \sum_{n=N+1}^{2N} \left(\sum_{i=1}^k \theta(n+h_i) - \rho \log 3N\right) w_n,$$
where $\rho$ is now allowed to be any positive number. The details of how he is able to obtain these results (or at least a sketch of the main ideas) are given in the other answer, which admittedly is much deeper than this one.
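
For what it's worth, the claim that $\theta$ versus $\chi_{\mathbb{P}}$ makes no real difference can be made explicit. This is only a sketch, assuming $w_n\geq 0$, $0\leq h_1<\cdots<h_k$ and $N>h_k$, so that every $m=n+h_i$ with $N<n\leq 2N$ satisfies $N<m<3N$. For such $m$ we have $\theta(m)=\chi_{\mathbb{P}}(m)\log m\leq \chi_{\mathbb{P}}(m)\log 3N$, and therefore
$$\sum_{n=N+1}^{2N} \left(\sum_{i=1}^k \theta(n+h_i) - \rho \log 3N\right) w_n \leq \log 3N \sum_{n=N+1}^{2N} \left(\sum_{i=1}^k \chi_{\mathbb{P}}(n+h_i) - \rho\right) w_n.$$
So positivity of the $\theta$-sum already implies positivity of the $\chi_{\mathbb{P}}$-sum with the same $\rho$ and the same weights, and any real gain has to come from the choice of the weights $w_n$.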

By the way, all of this is nicely explained in Maynard's preprint.
– Andres Caicedo, Nov 23 '13 at 18:47

Summing $\theta(n)-\rho\log(3N)$ versus $\chi_P(n)-\rho$ doesn't affect the argument. The two counting functions mentioned above are essentially the same, except for the weights. The major difference lies in the choice of $\Lambda_R(n;\mathcal{H}_k,l)^2$ and $w_n$.
– Eric Naslund, Nov 23 '13 at 21:07

I feel I should articulate less gently what Eric already said (for those skimming quickly): The above "answer" is completely wrong. The main difference lies in the choice of weights as explained in Eric's post.
– blabler, Nov 23 '13 at 23:48

Thanks to both of you (blunt or not) for pointing out my error. Eric's answer is definitely much better and was helpful to me as well.
– Stanley Yao Xiao, Nov 24 '13 at 0:11