The other day, while discussing math (and primes specifically), the following question came to mind, and I figured I'd ask it here to see what people's opinions on it might be.

Main Question: Suppose that tomorrow someone proves that some function always generates (concrete) primes for any input. How should this affect lists such as the Largest Known Primes?

Let me give a little more detail to demonstrate why I feel this question is not entirely trivial or fanciful.

Firstly, the requirement that the function concretely generate primes is meant to avoid 'stupid' examples such as NextPrime(n), which, given a 'largest known prime' P, yields a larger prime NextPrime(P). Note, however, that the definition of NextPrime does not actually state explicitly what this prime is; any implementation of it (in Maple or Mathematica, for example) simply loops through the integers greater than the input, testing each for primality in some fashion.

On the other hand, one candidate for such a function might be the Catalan sequence defined by:

$C(0) = 2$, $C(n+1) = 2^{C(n)}-1$

Although $C(5) = 2^{170141183460469231731687303715884105727}-1$ is far too large to test by current methods (with roughly $10^{30}$ times as many digits as the current largest known prime), and although the current consensus is that $C(5)$ is likely composite, it does not seem entirely out of the realm of possibility that someone might eventually find some very clever way of showing that $C(5)$ is prime, or even that $C(n)$ is always prime, or perhaps the same for some other concretely defined sequence.
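As an aside, the known terms are easy to verify by machine: each $C(n)$ for $n \geq 1$ is a Mersenne number whose exponent is the previous term, so the Lucas-Lehmer test applies. A minimal sketch in Python (the function names are mine, chosen for illustration):

```python
def catalan_mersenne(n):
    """C(0) = 2, C(n+1) = 2**C(n) - 1."""
    c = 2
    for _ in range(n):
        c = (1 << c) - 1
    return c

def lucas_lehmer(p):
    """Lucas-Lehmer test: for an odd prime p, M_p = 2**p - 1 is prime
    iff s_{p-2} == 0 (mod M_p), where s_0 = 4 and s_{k+1} = s_k**2 - 2."""
    if p == 2:
        return True  # M_2 = 3 is prime; the test below needs p odd
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# C(n+1) = M_{C(n)}, so primality of C(1)..C(4) follows from
# Lucas-Lehmer applied to the exponents C(0)..C(3) = 2, 3, 7, 127.
for n in range(4):
    print(n + 1, lucas_lehmer(catalan_mersenne(n)))  # all True
# Testing C(5) would mean lucas_lehmer(catalan_mersenne(4)): about
# C(4) modular squarings on a C(4)-bit modulus -- hopelessly out of reach.
```

The point of the sketch is only that the obstruction to $C(5)$ is the *size* of the exponent, not the absence of an algorithm.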

The point is this: once one knows that every element of a sequence is prime, does this entirely negate things like the list of largest known primes? Or does the fact that $C(n)$ for $n\geq 5$ has too many digits ever to be calculated in full (only the first few or last few digits being computable) mean that, even if such numbers were somehow proven prime, they would not technically be 'known'?

Note also that in the realm of finite simple groups the analogous question is already hard to decide, since infinite families of such groups are known, but concrete descriptions (such as generators and relations, or character tables) are not always available, or even computable within reasonable time constraints. Likewise one could pose analogous questions in other branches (largest-volume manifolds satisfying certain constraints, etc.).

Anyhow, it seems like a reasonable question for serious mathematicians to consider, so I just want to hear what others' opinions are on the subject (and if anyone can think of a better title, feel free to suggest one).

In the course of the proof of the unsolvability of Hilbert's 10th problem, a multivariate polynomial was constructed such that all the positive values it takes at integer arguments are prime, which is not quite what you ask about but pretty close. Anyway, such polynomials are known explicitly, but as far as I can tell (disclaimer: I'm not a specialist!), they have not had any practical impact.
– Thierry Zell Apr 25 '11 at 14:18


As I recall, although such polynomials are known explicitly, it is notoriously hard to find inputs which make them positive, and even then the resulting primes so far are small, making the polynomial essentially useless for discovering large primes. On the other hand, if one discovered ways of generating sets of inputs that gave really large primes, that would give an example of such a function.
– ARupinski Apr 25 '11 at 14:21


What does it mean to "negate things like the list of largest known primes"?
– Mariano Suárez-Alvarez♦ Apr 25 '11 at 14:31

You haven't yet given a coherent definition of "concrete". The function which maps a positive integer $n$ to the $n$th prime number $p_n$ is about as concrete as any function I know (e.g. even I can write a computer program which performs this function). In the absence of this, I don't see a math question here, or a math-philosophy question either.
– Pete L. Clark Apr 25 '11 at 20:13

6 Answers

First of all, I don't think the idea that "knowing a prime requires knowing its decimal expansion" accords well with mathematical practice. Unless I'm mistaken, the largest known primes are all Mersenne primes, and (for good reason!) are almost always written in the form $p = 2^k - 1$, not by their decimal expansions. Granted, the currently known Mersennes are small enough that one could calculate their decimal expansions in Maple or Mathematica, if for some reason one wanted to. But even if that weren't the case (say, if $k$ had 10,000 digits), I'd still be perfectly happy to describe $p = 2^k - 1$ as a "known prime," provided someone knew both $k$ and a proof that $p$ was prime.
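For a sense of how cheap the expansion is relative to the primality proof, here is a tiny Python sketch using $2^{127}-1$ (Lucas's 1876 record prime) rather than a multi-million-digit modern record; the idea scales, since big-integer conversion is polynomial in the digit count:

```python
# Expanding a Mersenne prime from its exponent alone is cheap:
# Python's arbitrary-precision integers do it directly.
p = (1 << 127) - 1      # M_127 = 2**127 - 1, proved prime by Lucas in 1876
digits = str(p)
print(len(digits))              # 39
print(digits[:5], digits[-5:])  # 17014 05727
```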

On the other hand, similar to what you suggested with your "NextPrime" function, what about

p := the $10^{10^{10000}}$th prime number?

Certainly p exists, and one can even write a program to output it. But is p therefore "known"? Saying so seems to stretch the meaning of the word "known" beyond recognition.

Trying to arrive at some principled criterion that separates the two examples above, here's the best that I came up with:

An n-digit prime number p is "known" if there's a known algorithm to output the digits of p that runs in poly(n) time (together with a proof that the algorithm does indeed output a prime number and halt in poly(n) steps).

(Strictly speaking, the above definition covers "known-ness" for infinite families of primes, rather than individual primes -- since once you fix p, you can always output it in O(1) time. But this is a standard caveat.)

As far as I can see, the above definition correctly captures the intuition that a prime p is "known" if we know a closed-form formula for p (which can be evaluated in polynomial time), but not if we merely know a non-constructive definition of p (for which it takes exponential time to determine which p we're talking about).

A very interesting test case for my definition is

p := the first prime larger than $10^{10^{10000}}$.

According to my definition, the above prime is currently "unknown", but will become "known" if someone proves the conjecture that the spacing between two consecutive n-digit primes never exceeds q(n) for some fixed polynomial q.

If you accept my definition, then a "function that always generates primes" almost certainly would trivialize largest-prime contests, since presumably it would give a deterministic way to generate $n$-digit primes in $n^{O(1)}$ time, for $n$ as large as you like (which is not something that we currently have).

Now, maybe there are cases where my definition fails to match up with "intuitive knowability" -- if so, I look forward to seeing counterexamples!

What about $f(x) :=$ the smallest prime greater than $\lg x$? Sieving the numbers between $\lg x$ and $2\lg x$ takes polynomial time, but I wouldn't say that we 'know' $f(2^{10^{10^{10000}}})$.
– Charles Apr 26 '11 at 18:51

Scott, if we accept your definition, then we have a "function that generates primes" both under standard, ultra-difficult number-theory conjectures (Cramér's conjecture) and under strong derandomization conjectures from complexity theory. I do not see why it trivializes largest-prime contests.
– Gil Kalai Apr 26 '11 at 18:54

In any case, the type of formula the OP asked about seems stronger than just "computable by a polynomial-time algorithm," although complexity theory may be applied to give a definition. In the hypothetical example in the question, you get an $n$-digit prime with $\log^*(n)$ arithmetic operations (exponentiation included); this looks like rather low complexity...
– Gil Kalai Apr 26 '11 at 18:58


@Charles: I don't know if he just fixed this or not, but as it is currently written, the running time is a function of the number of digits of $p$, not of the digits of the input to the generator, so your example is incorrect.
– Mikola Apr 26 '11 at 20:23


Certainly complexity-theoretic definitions of that kind are nice and quite useful. (BTW, as far as I am concerned, to know something with probability very close to 1 is "really" to know.) The opinion that showing a certain human activity represents an effort in P means it is "trivialized" goes a bit too far, since it is quite plausible that all human activities represent efforts in P.
– Gil Kalai Apr 27 '11 at 6:11

I don't think in principle a sharp line can be drawn between "known" primes such as $2^{43112609}-1$ (the current record), and on the other hand "unknown" but well-defined ones like the first prime larger than a given number.

As has already been pointed out, the current record prime is still small enough to count as known according to the naive idea that we know its decimal representation. It has a few million digits, and finding them requires little extra computer time compared to testing primality in the first place.

If this situation changes due to new methods for primality testing of huge numbers (say whose digits cannot even be stored in all the world's computer memory), I guess we will fall back on an intuitive notion of when a number is known.

It will be difficult to come up with a sharp definition of known-ness in terms of computational complexity, since one can find an $n$ digit prime in polynomial time (albeit not provably deterministically) and naturally the record primes will be in the range where the degree (or even the constant) of the polynomial is crucial.
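The non-deterministic route alluded to here is easy to sketch: sample random $n$-digit odd numbers and test each with Miller-Rabin. Note the caveats: Miller-Rabin only certifies "probable primes," and the round count below is an arbitrary choice of mine, not a canonical recipe.

```python
import random

def is_probable_prime(n, rounds=40):
    """Miller-Rabin: True if n is prime, except with probability
    at most 4**(-rounds) for composite n."""
    if n < 2:
        return False
    for q in (2, 3, 5, 7, 11, 13):
        if n % q == 0:
            return n == q
    # Write n - 1 = d * 2**r with d odd
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses compositeness
    return True

def random_ndigit_prime(n):
    """Expected-polynomial-time randomized search for an n-digit
    (probable) prime: by the prime number theorem, roughly one in
    every ~2.3*n odd candidates succeeds."""
    while True:
        c = random.randrange(10 ** (n - 1), 10 ** n) | 1
        if is_probable_prime(c):
            return c

print(random_ndigit_prime(50))
```

It is exactly the gap between this expected-polynomial randomized search and a *provably deterministic* one that the answer points to.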

Speculating about future development of number theory and primality testing, I think the consensus is that if $C(5)$ is proved prime tomorrow, it will go to the top of the list, whereas if someone establishes that $C(n)$ is prime for every $n$, the conclusion will be that infinitely many primes are known explicitly.

Another possibility is improved primality testing of general numbers. If arbitrary numbers could be tested for primality as efficiently as Mersenne numbers, then possibly the new world record primes would consist of fancy patterns of digits, spiced with secret encodings of geek humor.

The situation is in principle different for the record twin primes. The current record is
$65516468355\cdot 2^{333333}\pm 1$, and as far as I understand, it is not rigorously known that there are any larger twin primes at all. If someone proved with some sort of sieve that an interval of larger numbers must contain a twin prime pair, then I guess the list of record twin primes would split into "explicitly" and "theoretically" known pairs.

I think you are misunderstanding the purpose of the "largest prime" list. There are areas of mathematics in which we genuinely do not know how to generate or catalogue the objects of certain types. Prime numbers do not fall into this category.

Rather, these lists basically exist as a benchmark for number-theory computer algorithms. For example, if you develop a new supercomputer, you can prove its prowess by testing primality of some bigger numbers not on the list.

Having contributed to the Largest Prime Lists in the past, I agree that much of the interest in them derives from testing faster algorithms and architectures. In light of this, if new ways of generating large primes are discovered, I am curious to what extent lists such as the Largest Known Primes might need to be suitably modified; i.e., would one need to distinguish whether a prime was 'discovered' in order to test the speed of a computer, as the output of some deterministic function, or for some other reason?
– ARupinski Apr 27 '11 at 0:05

I think that the discovery of a prime-generating algorithm would simply split the study up into a "list of largest known primes not generated by algorithm X" and "primes generated by algorithm X", much like the classification of finite simple groups.

Following this to its logical conclusion, even if we provably knew every algorithm generating $n$-digit primes in poly($n$)-time, we'd still be left with the question of which primes weren't generated by one of these functions, giving a concept of 'sporadic' prime, and it would then be these that were of interest.

I've always understood "the largest known primes" to colloquially refer to primes which (i) have decimal expansions which can be written down in a short amount of time and (ii) have primality proofs which have been double/triple checked.

I imagine that if the sequence you suggest were shown to always be prime, people would stop talking about the largest known primes. They might continue to search for ways to find lots of primes of the same large size, and improve the bound on that size.

Edit: The first version of this answer proposed a criterion based on Kolmogorov complexity, but as Scott Aaronson commented, this is not very restrictive, since primes all have rather low complexity: since it is not hard to design an (inefficient) primality test, the $n$th prime $p_n$ has complexity at most about $\log n \approx \log\log p_n$. It appears that what I had in mind was more or less what he wrote in his answer.

However, it seems to me that the issue of complexity is part of what makes the question interesting: extremely large numbers should be expected to have extremely large complexity, and it is the disparity between the low complexity of some short description like C(5) and the high complexity of the digit string of C(5) itself which makes the description seem not to be concrete. More generally, one may doubt the knowability of extremely large numbers on the grounds that they cannot be written explicitly, as stated in the question.

So contrary to the claim that C(5) is not a concrete description of a number, I think that due to its low complexity it is much more concrete than its size would suggest, and its base-2 digits are easily listed; most numbers of its magnitude are wholly abstract and their digits in any base will never be known. In fact, I am not entirely sure that a prime-enumerating algorithm which computes C(5) in time polynomial in C(4) (its number of binary digits), as Scott suggests, is actually especially concrete, unless it takes an input significantly smaller than C(5) (note that C(5) is about the C(5)'th prime).

That is, a computation may be efficient without being concrete. In the spirit of Alastair Litterick's answer, I'd like to suggest that

An algorithm for listing (some infinite family of) primes which is efficient in the sense of Scott's answer is also "concrete" if the length of its output is superpolynomial in the length of its input.

More generally, I suppose it would make sense to quantify just how much larger the output is, for the purposes of probing very distant primes.

Since the question declares itself to be philosophical, I also have some philosophical thoughts on its meaning. These don't address the question in the same way as above.

I think that the word "known" should be taken with a grain of salt even in the case of the current record largest prime, even if it were expressed in base-10 digits. Looking for record-size primes is an activity outside of mathematics, just like astronomers' search for extrasolar planets is outside of geology, though if we ever went to one, we could study it geologically. With a proof in hand that there are infinitely many primes, finding a particular one is useful only if we require specific numbers for some task, like cryptography, whose execution is not even a matter of computer science once it is implemented. This is not to say that the construction of a prime search is not a matter of both mathematics and computer science: for example, testing Mersenne primes is a strategy drawn from mathematics, since it is not known that there are infinitely many of those, and doing the testing efficiently is computer science. However, successful execution of the search is neither.

In contrast, knowing a prime, or anything, requires being able to answer questions about it; better, the questions should not have known general answers. For example, "is the last digit 3?" is a fine question, but that just asks for the residue modulo 10, and Dirichlet's theorem already describes the answer to that question statistically. One might be curious about the Chebyshev bias (which residue class has the most primes up to a certain size) but that can't be settled one way or another by looking at individual examples. On the other hand, even 2 is not fully understood as a prime, since we can't say modulo which primes it is a primitive root (implicitly, for example "the ones which are 5 mod 7"). This, like Mersenne primes, is another list not known to be infinite.

Aside from conjectures about individual primes that can be tested on specific numbers, there are statistical conjectures, similar in nature to Dirichlet's theorem, which are not settled and also can't be settled by a sparse prime search. For example, one might want to know whether a particular prime $p_n$ begins a maximal prime gap (larger than any preceding gap), for which the only possible computation is of an exhaustive list of primes up to and including $p_{n + 1}$.
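That computation is simple but inherently exhaustive. A short Python sketch (the function name is mine) finds the record-gap primes below a bound by sieving every prime up to it; no sparse list of large primes could substitute for the full enumeration:

```python
def maximal_prime_gaps(limit):
    """Return (p, g) pairs where prime p begins a maximal (record)
    prime gap of size g among primes below `limit`.  Deciding this
    requires enumerating *every* prime up to the gap's endpoint."""
    # Sieve of Eratosthenes
    sieve = bytearray([1]) * limit
    sieve[:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit, i)))
    # Walk the primes in order, recording each new record gap
    records, prev, best = [], 2, 0
    for n in range(3, limit):
        if sieve[n]:
            if n - prev > best:
                best = n - prev
                records.append((prev, best))
            prev = n
    return records

print(maximal_prime_gaps(200))
# [(2, 1), (3, 2), (7, 4), (23, 6), (89, 8), (113, 14)]
```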

Suppose, though, that we had an algorithm to generate a list of all primes, in order, giving all n-digit primes in provably polynomial time. We could still not verify the Riemann hypothesis in the form that
$$\lvert \pi(x) - \operatorname{Li}(x) \rvert = O(x^{1/2} \log x)$$
unless we had a much deeper understanding of the behavior of that algorithm. Not knowing this, it would be unreasonable to say that the prime numbers are "known" as a set. And I don't think it's too high a standard to say that if all primes are "known", then the prime numbers are known as a whole.

In short, I don't think that any mere list of primes, finite or infinite, can constitute knowledge of its elements.

Ryan, your definition doesn't seem to do what you want it to. In particular, the $10^{10^{10000}}$th prime is certainly "known" according to your definition, since the description that I just gave AND the decimal expansion of the number BOTH have minuscule Kolmogorov complexities! One can easily construct a small Turing machine that outputs the decimal expansion of the $10^{10^{10000}}$th prime. The fact that that Turing machine will take an astronomical amount of time is irrelevant for Kolmogorov-complexity purposes! I focused on polynomial time precisely in order to rule such things out.
– Scott Aaronson Apr 26 '11 at 17:50


You're right, I confused the kinds of complexity. After some thought, it seems that what I have in mind comes out much the same as what you said: a description (Turing machine) is cheating if it takes much longer than the size of the number it computes.
– Ryan Reich Apr 26 '11 at 18:26