February 14, 2011

An update on the status of the Polymath4 paper on finding primes. I’ve received a referee report from Mathematics of Computation on the submission, which can be found here. The referee liked the result but wanted a fair number of expository changes before he or she was willing to recommend acceptance, so the editor has asked for a revision. I will be happy to make the relevant changes, but if there are any other changes that other participants would like to make, now would be a good time to suggest them. (The most recent version of the paper can be found at the Subversion repository or at this link; see also the arXiv version.)

One change requested is to add a list of participants to the project. In analogy with what we did for Polymath1, I therefore started a “signup sheet” on the wiki for people to self-report their participation, contact information, and grant information for the project. There is the usual problem of trying to decide who is a “main participant” of the project, and who is a “contributor” (though I think I can safely add Ernie, Harald, and myself as participants); as with Polymath1, I will leave it to each of you to self-report what level of participation you feel is appropriate.

The paper is focused on what I think is our best partial result, namely that the prime counting polynomial has a circuit complexity of for some absolute constant whenever and . As a corollary, we can compute the parity of the number of primes in the interval in time .

I’d be interested in hearing the other participants’ opinions about where to go next. Ernie has suggested that experimenting with variants of the algorithm could make a good REU project, in which case we might try to wrap up the project with the partial result and pass the torch on.

The current goal is to find a deterministic way to locate a prime in an interval $[z,2z]$ in time that breaks the “square root barrier” of $\sqrt{z}$ (or more precisely, $z^{1/2+o(1)}$). Currently, we have two ways to reach that barrier:

Assuming the Riemann hypothesis, the largest prime gap in $[z,2z]$ is of size $z^{1/2+o(1)}$. So one can simply test consecutive numbers for primality until one gets a hit (using, say, the AKS algorithm, any number of size z can be tested for primality in time $z^{o(1)}$).

The second method is due to Odlyzko, and does not require the Riemann hypothesis. There is a contour integration formula that allows one to write the prime counting function up to error in terms of an integral involving the Riemann zeta function over an interval of length , for any . The latter integral can be computed to the required accuracy in time about . With this and a binary search it is not difficult to locate an interval of width that is guaranteed to contain a prime in time . Optimising by choosing and using a sieve (or by testing the elements for primality one by one), one can then locate that prime in time .
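The first of the two methods above (test consecutive numbers until one is prime) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and trial division is used as a simple stand-in for a polynomial-time test such as AKS, so the per-candidate cost here is not representative.

```python
def is_prime(n: int) -> bool:
    """Trial division; a stand-in for a polynomial-time test such as AKS."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime_at_least(z: int) -> tuple:
    """Return (p, steps): the first prime p >= z and the number of candidates tested."""
    steps = 0
    candidate = z
    while True:
        steps += 1
        if is_prime(candidate):
            return candidate, steps
        candidate += 1


p, steps = next_prime_at_least(10**6)
print(p, steps)
```

The number of candidates tested is bounded by the largest prime gap near z, which is where the Riemann hypothesis enters the running-time estimate.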

Currently we have one promising approach to break the square root barrier, based on the polynomial method, but while individual components of this approach fall underneath the square root barrier, we have not yet been able to get the whole thing below (or even matching) the square root. I will sketch the approach (as far as I understand it) below; right now we need some shortcuts (e.g. FFT, fast matrix multiplication, that sort of thing) that can cut the run time further.

Currently we are up against the “square root barrier”: the fastest time we know of to find a k-digit prime is about $\sqrt{10^k}$ (up to $\exp(o(k))$ factors), even in the presence of a factoring oracle (though, thanks to a method of Odlyzko, we no longer need the Riemann hypothesis). We also have a “generic prime” razor that has eliminated (or severely limited) a number of potential approaches.

One promising approach, though, proceeds by transforming the “finding primes” problem into a “counting primes” problem. If we can compute the prime counting function $\pi(x)$ in substantially less than $\sqrt{x}$ time, then we have beaten the square root barrier.
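To illustrate why fast prime counting would suffice, here is a minimal sketch of the reduction from finding to counting: a binary search that uses only calls to a prime counting function. The names `make_prime_counter` and `find_prime_by_counting` are hypothetical, and the counter is backed by an ordinary sieve purely so that the example runs; the point is the small number of counting calls, not the cost of each call.

```python
from bisect import bisect_right
from math import isqrt


def make_prime_counter(limit):
    """Return a function pi(x), valid for 0 <= x <= limit, backed by a sieve."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = 0
    if limit >= 1:
        sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            count = len(range(p * p, limit + 1, p))
            sieve[p * p :: p] = bytearray(count)
    primes = [i for i, flag in enumerate(sieve) if flag]

    def pi(x):
        return bisect_right(primes, x)

    return pi


def find_prime_by_counting(z, pi):
    """Return (p, calls): the smallest prime p in (z, 2z] and the number of pi calls."""
    base = pi(z)
    calls = 1
    lo, hi = z, 2 * z  # invariant: pi(lo) == base < pi(hi), by Bertrand's postulate
    while hi - lo > 1:
        mid = (lo + hi) // 2
        calls += 1
        if pi(mid) > base:
            hi = mid
        else:
            lo = mid
    return hi, calls


pi = make_prime_counter(2 * 10**6)
print(find_prime_by_counting(10**6, pi))
```

The search makes about $\log_2 z$ calls to the counting function, so any speed-up in computing the count transfers directly to the search.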

Currently we have a way to compute the parity (least significant bit) of $\pi(x)$ in time $x^{1/2+o(1)}$, and there is hope to improve this (especially given the progress on the toy problem of counting square-frees less than x). There are some variants that also look promising, for instance to work in polynomial extensions of finite fields (in the spirit of the AKS algorithm) and to look at residues of $\pi(x)$ in other moduli, e.g. $\pi(x) \bmod 3$, though currently we can’t break the $x^{1/2}$ barrier for that particular problem.
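For orientation on the square-free toy problem, the classical identity $Q(x) = \sum_{d \leq \sqrt{x}} \mu(d) \lfloor x/d^2 \rfloor$ already counts the square-free numbers up to x in roughly $\sqrt{x}$ steps. The sketch below implements only this baseline, not any of the faster methods discussed in the project; the function names are hypothetical.

```python
from math import isqrt


def mobius_up_to(limit):
    """Return a list mu with mu[d] the Moebius function of d, for 0 <= d <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_composite = [False] * (limit + 1)
    primes = []
    for i in range(2, limit + 1):
        if not is_composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > limit:
                break
            is_composite[i * p] = True
            if i % p == 0:
                mu[i * p] = 0  # p*p divides i*p
                break
            mu[i * p] = -mu[i]
    return mu


def count_squarefree(x):
    """Number of square-free integers n with 1 <= n <= x."""
    r = isqrt(x)
    mu = mobius_up_to(r)
    return sum(mu[d] * (x // (d * d)) for d in range(1, r + 1))


print(count_squarefree(100))
```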

August 13, 2009

This is a continuation of Research Thread II of the “Finding primes” polymath project, which is now full. It seems that we are facing particular difficulty breaching the square root barrier, in particular the following problems remain open:

Can we deterministically find a prime of size at least n in $o(\sqrt{n})$ time (assuming hypotheses such as RH)? Assume one has access to a factoring oracle.

Can we deterministically find a prime of size at least n in $O(n^{1/2+o(1)})$ time unconditionally (in particular, without RH)? Assume one has access to a factoring oracle.

We are still in the process of weighing several competing strategies to solve these and related problems. Some of these have been effectively eliminated, but we have a number of still viable strategies, which I will attempt to list below. (The list may be incomplete, and of course totally new strategies may emerge also. Please feel free to elaborate or extend this list in the comments.)

Strategy A: Find a short interval [x,x+y] such that $\pi(x+y) - \pi(x) > 0$, where $\pi(x)$ is the number of primes less than x, by using information about the zeroes of the Riemann zeta function.

Comment: it may help to assume a Siegel zero (or, at the other extreme, to assume RH).

Strategy B: Assume that an interval [n,n+a] consists entirely of u-smooth numbers (i.e. no prime factors greater than u) and somehow arrive at a contradiction. (To break the square root barrier, we need $a = o(\sqrt{n})$, and to stop the factoring oracle from being ridiculously overpowered, n should be subexponential size in u.)

Comment: in this scenario, we will have n/p close to an integer for many primes between and u, and n/p far from an integer for all primes larger than u.
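Strategy B can be read algorithmically: scan [n,n+a] and stop at the first number that is not u-smooth, since its largest prime factor is then a prime greater than u. The following is a small sketch of that reading. The function names are hypothetical, and trial division stands in for the factoring oracle.

```python
def largest_prime_factor(m):
    """Largest prime factor of m >= 2 (trial division; stand-in for a factoring oracle)."""
    largest = 1
    d = 2
    while d * d <= m:
        while m % d == 0:
            largest = d
            m //= d
        d += 1
    if m > 1:
        largest = m  # the remaining cofactor is prime
    return largest


def find_large_prime_factor(n, a, u):
    """Return (m, p) for the first m in [n, n+a] with a prime factor p > u, else None."""
    for m in range(n, n + a + 1):
        p = largest_prime_factor(m)
        if p > u:
            return m, p
    return None


print(find_large_prime_factor(1000, 10, 100))
```

If the function returns None, the whole interval is u-smooth, which is exactly the scenario the strategy hopes to rule out.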

Strategy C: Solve the following toy problem: given n and u, what is the distance to the closest integer to n which contains a factor comparable to u (e.g. in [u,2u])? [Ideally, we want a prime factor here, but even the problem of getting an integer factor is not fully understood yet.] Beating $\sqrt{u}$ here is analogous to breaking the square root barrier in the primes problem.

Comments:

The trivial bound is u/2 – just move to the nearest multiple of u to n. This bound can be attained for really large n, e.g. . But it seems we can do better for small n.

For , one trivially does not have to move at all.

For , one has an upper bound of , by noting that having a factor comparable to u is equivalent to having a factor comparable to n/u.

For , one has an upper bound of , by taking to be the first square larger than n, to be the closest square to , and noting that has a factor comparable to u and is within of n. (This paper improves this bound to conditional on a strong exponential sum estimate.)

For n=poly(u), it may be possible to take a dynamical systems approach, writing n base u and incrementing or decrementing u and hope for some equidistribution. Some sort of “smart” modification of u may also be effective.
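The difference-of-squares comment above can be made concrete. In the sketch below the symbols x and y are my own labels (an assumption, since the original notation is not visible here): x² is the first square larger than n, y² is the closest square to x² − n, and the nearby integer is (x − y)(x + y). Both factors are close to the square root of n, so this is only relevant when u is of that size.

```python
from math import isqrt


def near_factorisation(n):
    """Return (m, a, b) with m = a * b close to n and a, b both near sqrt(n)."""
    x = isqrt(n) + 1           # x*x is the first square strictly larger than n
    r = x * x - n              # 0 < r <= 2*isqrt(n) + 1
    y = isqrt(r)
    if (y + 1) ** 2 - r < r - y * y:
        y += 1                 # choose the square closest to r
    m = (x - y) * (x + y)      # equals x*x - y*y
    return m, x - y, x + y


n = 10**12 + 12345
m, a, b = near_factorisation(n)
print(m - n, a, b)
```

Since |m − n| = |r − y²| is at most the integer square root of r, and r is at most about twice the square root of n, the distance moved is of order n^{1/4}, i.e. about the square root of u when u is comparable to the square root of n.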

Strategy D. Find special sequences of integers that are known to have special types of prime factors, or are known to have unusually high densities of primes.

Comment. There are only a handful of explicitly computable sparse sequences that are known unconditionally to capture infinitely many primes.

Strategy E. Find efficient deterministic algorithms for finding various types of “pseudoprimes”, i.e. numbers which obey some of the properties of being prime, e.g. $2^{n-1} \equiv 1 \pmod n$. (For this discussion, we will consider primes as a special case of pseudoprimes.)

Comment. For the specific problem of solving $2^{n-1} \equiv 1 \pmod n$ there is an elementary observation that if n obeys this property, then $2^n - 1$ does also, which solves this particular problem; but this does not indicate how to, for instance, have $2^{n-1} \equiv 1 \pmod n$ and $3^{n-1} \equiv 1 \pmod n$ obeyed simultaneously.
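The elementary observation in the comment is easy to check numerically, assuming the condition in question is the base-2 Fermat condition $2^{n-1} \equiv 1 \pmod n$. The reason is that n then divides $2^n - 2 = m - 1$ with $m = 2^n - 1$, so $2^n - 1$ divides $2^{m-1} - 1$. The helper names below are hypothetical.

```python
def is_base2_fermat(n):
    """True if 2^(n-1) is congruent to 1 modulo n (primes > 2 and base-2 pseudoprimes)."""
    return n > 2 and pow(2, n - 1, n) == 1


def lift(n):
    """Map n to 2^n - 1, which inherits the base-2 Fermat property from n."""
    return 2**n - 1


for n in (3, 5, 7, 11, 341, 561):
    m = lift(n)
    print(n, is_base2_fermat(n), m.bit_length(), is_base2_fermat(m))
```

Iterating `lift` produces ever larger numbers with the property very cheaply, but as noted above it gives no control over a second base such as 3.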

As always, oversight of this research thread is conducted at the discussion thread, and any references and detailed computations should be placed at the wiki.

August 9, 2009

This thread marks the formal launch of “Finding primes” as the massively collaborative research project Polymath4, and now supersedes the proposal thread for this project as the official “research” thread for this project, which has now become rather lengthy. (Simultaneously with this research thread, we also have the discussion thread to oversee the research thread and to provide a forum for casual participants, and also the wiki page to store all the settled knowledge and accumulated insights gained from the project to date.) See also this list of general polymath rules.

The basic problem we are studying here can be stated in a number of equivalent forms:

Problem 1. (Finding primes) Find a deterministic algorithm which, when given an integer k, is guaranteed to locate a prime of at least k digits in length in as quick a time as possible (ideally, in time polynomial in k, i.e. after $O(k^{O(1)})$ steps).

Problem 2. (Finding primes, alternate version) Find a deterministic algorithm which, after running for k steps, is guaranteed to locate as large a prime as possible (ideally, with a polynomial number of digits, i.e. at least $k^c$ digits for some $c > 0$.)

To make the problem easier, we will assume the existence of a primality oracle, which can test whether any given number is prime in O(1) time, as well as a factoring oracle, which will provide all the factors of a given number in O(1) time. (Note that the latter supersedes the former.) The primality oracle can be provided essentially for free, due to polynomial-time deterministic primality algorithms such as the AKS primality test; the factoring oracle is somewhat more expensive (there are deterministic factoring algorithms, such as the quadratic sieve, which are suspected to be subexponential in running time, but no polynomial-time algorithm is known), but seems to simplify the problem substantially.

The problem comes in at least three forms: a strong form, a weak form, and a very weak form.

Strong form: Deterministically find a prime of at least k digits in poly(k) time.

Weak form: Deterministically find a prime of at least k digits in $\exp(o(k))$ time, or equivalently find a prime larger than $k^C$ in time O(k) for any fixed constant C.

Very weak form: Deterministically find a prime of at least k digits in significantly less than $\sqrt{10^k}$ time, or equivalently find a prime significantly larger than $k^2$ in time O(k).

The problem in all of these forms remains open, even assuming a factoring oracle and strong number-theoretic hypotheses such as GRH. One of the main difficulties is that we are seeking a deterministic guarantee that the algorithm works in all cases, which is very different from a heuristic argument that the algorithm “should” work in “most” cases. (Note that there are already several efficient probabilistic or heuristic prime generation algorithms in the literature, e.g. this one, which already suffice for all practical purposes; the question here is purely theoretical.) In other words, rather than working in some sort of “average-case” environment where probabilistic heuristics are expected to be valid, one should instead imagine a “Murphy’s law” or “worst-case” scenario in which the primes are situated in a “maximally unfriendly” manner. The trick is to ensure that the algorithm remains efficient and successful even in the worst-case scenario.

Below the fold, we will give some partial results, and some promising avenues of attack to explore. Anyone is welcome to comment on these strategies, and to propose new ones. (If you want to participate in a more “casual” manner, you can ask questions on the discussion thread for this project.)

Also, if anything from the previous thread that you feel is relevant has been missed in the text below, please feel free to recall it in the comments to this thread.

July 28, 2009

The proposal “deterministic way to find primes” is not officially a polymath yet, but is beginning to acquire the features of one, as we have already had quite a few interesting ideas. So perhaps it is time to open up the discussion thread a little earlier than anticipated. There are a number of purposes to such a discussion thread, including but not restricted to:

To summarise the progress made so far, in a manner accessible to “casual” participants of the project.

To have “meta-discussions” about the direction of the project, and what can be done to make it run more smoothly. (Thus one can view this thread as a sort of “oversight panel” for the research thread.)

To ask questions about the tools and ideas used in the project (e.g. to clarify some point in analytic number theory or computational complexity of relevance to the project). Don’t be shy; “dumb” questions can in fact be very valuable in regaining some perspective.

(Given that this is still a proposal) To evaluate the suitability of this proposal for an actual polymath, and decide what preparations might be useful before actually launching it.

To start the ball rolling, let me collect some of the observations accumulated as of July 28:

A number of potentially relevant conjectures in complexity theory and number theory have been identified: P=NP, P=BPP, P=promise-BPP, existence of PRG, existence of one-way functions, whether DTIME(2^n) has subexponential circuits, GRH, the Hardy-Littlewood prime tuples conjecture, the ABC conjecture, Cramer’s conjecture, discrete log in P, factoring in P.

The problem is solved if one has P=NP, existence of PRG, or Cramer’s conjecture, so we may assume that these statements all fail. The problem is probably also solved on P=promise-BPP, which is a bit stronger than P=BPP, but weaker than existence of PRG; we currently do not have a solution just assuming P=BPP, due to a difficulty getting enough of a gap in the success probabilities.

Existence of PRG is assured if DTIME(2^n) does not have subexponential circuits (Impagliazzo-Wigderson), or if one has one-way functions (is there a precise statement to this effect?)

Discrete log being hard (or easy) may end up being a useless hypothesis, since one needs to find large primes before discrete logarithms even make sense.

If the problem has a negative answer, it implies (roughly speaking) that all large constructible numbers are composite. Assuming factoring is in P, it implies the stronger fact that all large constructible numbers are smooth. This seems unlikely (especially if one assumes ABC).

Besides adding various conjectures in complexity theory or number theory, we have found some other ways to make the problem easier:

The trivial deterministic algorithm for finding k-bit primes takes exponentially long in k in the worst case. Our goal is polynomial in k. What about a partial result, such as exp(o(k))?

An essentially equivalent variant: in time polynomial in k, we can find a prime with at least log k digits. Our goal is k. Can we find a prime with slightly more than log k digits?

The trivial probabilistic algorithm takes O(k^2) random bits; looks like we can cut this down to O(k). Our goal is O(log k) (as one can iterate through these bits in polynomial time). Can we do o(k)?

Rather than find primes, what about finding almost primes? Note that if factoring is in P, the two problems are basically equivalent. There may also be other number theoretically interesting sets of numbers one could try here instead of primes.

At the scale log n, primes are assumed to resemble a Poisson process of intensity 1/log n (this can be formalised using a suitably uniform version of the prime tuples conjecture). Cramer’s conjecture can be viewed as one extreme case of this principle. Is there some way to use this conjectured Poisson structure without requiring the full strength of Cramer’s conjecture? (I believe there is also some work of Granville and Soundararajan tweaking Cramer’s prediction slightly, though only by a multiplicative constant if I recall correctly.)

July 27, 2009

Problem. Find a deterministic algorithm which, when given an integer k, is guaranteed to find a prime of at least k digits in length, in time polynomial in k. You may assume as many standard conjectures in number theory (e.g. the generalised Riemann hypothesis) as necessary, but avoid powerful conjectures in complexity theory (e.g. P=BPP) if possible.

The point here is that we have no explicit formulae which (even at a conjectural level) can quickly generate large prime numbers. On the other hand, given any specific large number n, we can test it for primality in a deterministic manner in a time polynomial in the number of digits (by the AKS primality test). This leads to a probabilistic algorithm to quickly find k-digit primes: simply select k-digit numbers at random, and test each one in turn for primality. From the prime number theorem, one is highly likely to eventually hit on a prime after about O(k) guesses, leading to a polynomial time algorithm. However, there appears to be no obvious way to derandomise this algorithm.
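The probabilistic algorithm described above is short enough to write out. This is a sketch only: trial division replaces the AKS test for simplicity, the function names are hypothetical, and a seeded generator is used just to make the run reproducible.

```python
import random


def is_prime(n: int) -> bool:
    """Trial division; a stand-in for a polynomial-time test such as AKS."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def random_k_digit_prime(k, rng):
    """Return (p, guesses): a random k-digit prime and the number of samples drawn."""
    lo, hi = 10 ** (k - 1), 10**k - 1
    guesses = 0
    while True:
        guesses += 1
        candidate = rng.randint(lo, hi)
        if is_prime(candidate):
            return candidate, guesses


rng = random.Random(0)
print(random_k_digit_prime(6, rng))
```

The difficulty discussed in this post is precisely that the call to `rng.randint` has no known deterministic replacement with a comparable guarantee.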

Now, given a sufficiently strong pseudo-random number generator – one which was computationally indistinguishable from a genuinely random number generator – one could derandomise this algorithm (or indeed, any algorithm) by substituting the random number generator with the pseudo-random one. So, given sufficiently strong conjectures in complexity theory (I don’t think P=BPP is quite sufficient, but there are stronger hypotheses than this which would work), one could solve the problem.

Cramer conjectured that the largest gap between primes in [N,2N] is of size $O(\log^2 N)$. Assuming this conjecture, then the claim is easy: start at, say, $10^k$, and increment by 1 until one finds a prime, which will happen after $O(k^2)$ steps. But the only real justification for Cramer’s conjecture is that the primes behave “randomly”. Could there be another route to solving this problem which uses a more central conjecture in number theory, such as GRH? (Note that GRH only seems to give an upper bound of $O(\sqrt{N})$ or so on the largest prime gap.)
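As a small numerical illustration (not evidence for the conjecture), one can compare the largest prime gap in [N,2N] with $\log^2 N$ for modest N. The function names are hypothetical and the sieve is the ordinary sieve of Eratosthenes.

```python
from math import isqrt, log


def primes_in(lo, hi):
    """All primes p with lo <= p <= hi, via a sieve up to hi."""
    sieve = bytearray([1]) * (hi + 1)
    sieve[0] = 0
    if hi >= 1:
        sieve[1] = 0
    for p in range(2, isqrt(hi) + 1):
        if sieve[p]:
            count = len(range(p * p, hi + 1, p))
            sieve[p * p :: p] = bytearray(count)
    return [i for i in range(lo, hi + 1) if sieve[i]]


def largest_gap(N):
    """Return (gap, p): the largest gap between consecutive primes in [N, 2N]."""
    ps = primes_in(N, 2 * N)
    return max((q - p, p) for p, q in zip(ps, ps[1:]))


for N in (10**3, 10**4, 10**5):
    gap, p = largest_gap(N)
    print(N, gap, p, round(log(N) ** 2, 1))
```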

My guess is that it will be very unlikely that a polymath will be able to solve this problem unconditionally, but it might be reasonable to hope that it could map out a plausible strategy which would need to rely on a number of not too unreasonable or artificial number-theoretic claims (and perhaps some mild complexity-theory claims as well).

Note: this is only a proposal for a polymath, and is not yet a fully fledged polymath project. Thus, comments should be focused on such issues as the feasibility of the problem and its suitability for the next polymath experiment, rather than actually trying to solve the problem right now. [Update, Jul 28: It looks like this caution has become obsolete; the project is now moving forward, though it is not yet designated an official polymath project. However, because we have not yet fully assembled all the components and participants of the project, it is premature to start flooding this thread with a huge number of ideas and comments yet. If you have an immediate and solidly grounded thought which would be of clear interest to other participants, you are welcome to share it here; but please refrain from working too hard on the problem or filling this thread with overly speculative or diverting posts for now, until we have gotten the rest of the project in place.]