Estimating constellations among primes - I. Uniformity

A few years ago we identified a recursion that works directly with the gaps among the generators in each stage of Eratosthenes' sieve. This recursion provides explicit enumerations of sequences of gaps among the generators, which are known as constellations.

In this paper, we use those enumerations to estimate the numbers of these constellations that occur as constellations among prime numbers, and we compare these estimates with computational results. We include in our estimates the constellations corresponding to three and four consecutive primes in arithmetic progression.

For these initial estimates, we assume that the copies of a given constellation tend toward a uniform distribution in the cycle of gaps as the recursion progresses. Our simple estimates, based on the recursion of gaps and the assumption of uniformity, appear to have the correct asymptotic behavior, and they exhibit a systematic error correlated with the length of the constellation.

Estimates of occurrences of constellations among primes

Let p and q be consecutive prime numbers, with p < q. Let N_s(p)
denote the number of occurrences of the constellation s in the cycle of gaps
G(p#). Then the simplest estimate of the number of occurrences of s in
the interval [q, q^2] assumes a uniform distribution:

E_s([q, q^2]) = (q^2 - q) N_s(p) / p#
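
As an illustration only, the estimate can be checked by brute force for very small p. The sketch below is not the recursion itself: it builds the cycle of gaps directly from the integers coprime to p#, counts a constellation in that cycle, and evaluates the uniform estimate. All function names are hypothetical, and we assume an occurrence is counted among primes when every prime of the constellation lies in [q, q^2].

```python
from math import gcd, isqrt, prod


def primes_upto(n):
    """Return all primes <= n using a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(n) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, f in enumerate(flags) if f]


def cycle_of_gaps(p):
    """Brute-force G(p#): gaps between consecutive generators in [1, p# + 1].

    Only practical for small p, since p# grows very quickly.
    """
    primorial = prod(primes_upto(p))
    gens = [g for g in range(1, primorial + 2) if gcd(g, primorial) == 1]
    return primorial, [b - a for a, b in zip(gens, gens[1:])]


def count_in_cycle(gaps, s):
    """N_s(p): occurrences of the gap sequence s, reading the cycle cyclically."""
    n, j = len(gaps), len(s)
    return sum(
        all(gaps[(i + k) % n] == s[k] for k in range(j)) for i in range(n)
    )


def uniform_estimate(s, p, q):
    """E_s([q, q^2]) = (q^2 - q) * N_s(p) / p#."""
    primorial, gaps = cycle_of_gaps(p)
    return (q * q - q) * count_in_cycle(gaps, s) / primorial


def count_among_primes(s, lo, hi):
    """Occurrences of s among consecutive primes lying in [lo, hi] (assumed convention)."""
    ps = [x for x in primes_upto(hi) if x >= lo]
    gaps = [b - a for a, b in zip(ps, ps[1:])]
    j = len(s)
    return sum(gaps[i : i + j] == list(s) for i in range(len(gaps) - j + 1))


if __name__ == "__main__":
    p, q, s = 7, 11, [2]  # twin primes, estimated from G(7#) over [11, 121]
    print("estimate:", uniform_estimate(s, p, q))
    print("actual:  ", count_among_primes(s, q, q * q))
```

The brute-force construction is chosen for transparency rather than speed; for computations at the scale reported here, the counts N_s(p) would come from the recursion on the cycle of gaps.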

We have compared our estimates to the actual counts for several constellations.
For reference, our tabulations can be found here.

For this simple estimate, we observe that there appears to be a systematic error
that depends on the number j of gaps in the constellation s. The plot below
shows the percentage error between the estimates and actual counts, plotted against
the logarithm of q^2.

For this plot, we ran computations up to q=1000003.
For each constellation, the estimates grow nicely, but there is an initial phase in which the
actual counts are more erratic. We see in each graph that this initial phase damps out and
the counts and estimates continue to track each other smoothly.

For the constellations we have tried so far, the simple estimate E exhibits a
logarithmic error in q, whose coefficient depends on the number j of gaps in the
constellation. We do not yet have an explanation for this discrepancy.
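
One way to quantify such a coefficient is an ordinary least-squares fit of the percentage error against ln q. The sketch below is only an illustration: the function names are hypothetical, the sign convention for the percentage error (estimate minus actual, relative to actual) is our assumption, and no data from the tabulations is reproduced here.

```python
from math import log


def percent_error(estimate, actual):
    """Percentage error of an estimate relative to the actual count.

    Assumed convention: positive when the estimate overshoots.
    """
    return 100.0 * (estimate - actual) / actual


def fit_against_log(qs, errors):
    """Least-squares fit errors ~ slope * ln(q) + intercept.

    Returns (slope, intercept). Requires at least two distinct values of q.
    """
    xs = [log(q) for q in qs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(errors) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, errors))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Fitting each constellation separately and grouping the slopes by j would show how the coefficient varies with the number of gaps.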

To give a sense of scale, we have placed along the axis some of the values p#.
We note that the computations up to 1E+12 can all be carried out within the cycle of
gaps G(37#). That is, for our computations up to q=1000003,
the concatenation step in the recursion
only has an effect through p=37.
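
The role of p=37 can be checked with a few lines of arithmetic: 37# is the first primorial that exceeds q^2 for q=1000003. The sketch below uses a hypothetical helper name and plain trial division, which is adequate because only the first dozen primes are needed.

```python
def first_primorial_covering(bound):
    """Return (p, p#) for the smallest prime p whose primorial p# is >= bound."""
    primorial = 1
    candidate = 1
    while True:
        candidate += 1
        # Trial division is sufficient for the handful of small primes needed.
        if all(candidate % d for d in range(2, int(candidate**0.5) + 1)):
            primorial *= candidate
            if primorial >= bound:
                return candidate, primorial


if __name__ == "__main__":
    q = 1000003
    p, primorial = first_primorial_covering(q * q)
    print(p, primorial)
```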

Log adjusted estimate

Based on our computations, we make the log adjusted estimate Ê_s
for the number of occurrences of a constellation s of j gaps: