The Next Prime

March 26, 2010

In two previousexercises, we had to iterate through the prime numbers. In one case, we generated a large number of primes using the Sieve of Eratosthenes, but without knowing in advance how large the sieve needed to be, and in the other case we iterated through the odd integers, checking the primality of each. Both solutions were less than attractive. In consideration of the old rule that if you do something twice you ought to build it into an abstraction, we will today write a function that, given a positive integer n, returns the smallest prime number greater than n.

Our method is to pre-compute a large number of primes and store them on disk. If n is within the bounds of the pre-computed list, it is easy to find the next prime. But if n is too large, we revert to checking individual candidates for primality. For our example we will pre-compute the primes to a million, but depending on your aspirations and your memory budget, you could adjust that number as desired.

To save memory space, we will store the pre-computed primes in a compressed data structure. Every prime number can be expressed as 30k±1, 30k±7, 30k±11, or 30k±13 for some k. That means we can use eight bits per thirty numbers to store all the primes; a million primes can be compressed to 33,334 bytes, plus a small program to load the compressed primes from disk and to manipulate the compressed data structure.

Your task is to write a function that builds the compressed data structure described above, a second function that loads it from disk to memory, and a third function that uses the compressed data structure to calculate the next prime. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

A more serious effort is this factorization of the sixty-ninth repunit R69 = (10^69 – 1) / 9. The small factors 3, 37 and 277 are quickly found by trial division. Then Pollard’s p-1 method with bounds B1 = 30000 and B2 = 600000 find the fifteen digit factor 203864078068831. Applying the p-1 method again with the same bounds finds the twenty-three digit factor 11111111111111111111111, another repunit. The remaining twenty-eight digit factor 1595352086329224644348978893 is prime.

If you are interested in the factorization of repunits, see Makoto Kamada’s web page.

A clojure solution, includes the Miller Rabin primality test, modified to use known deterministic tests when possible. To access the table, we use the fact that the expression x & -x masks out all bits in x except for the lower bit, so this is a constant time operation to get the smallest bit. We then index into a table mod 11 to convert the lowbit (which is a power of two) to the appropriate offset (1, 7, 11, 13, 17, 19, 23, or 29) I use an array of masks (based on the input mod 30,) to mask out the bits we will not consider prior to calculating x & -x. For primes > 1,000,000, I pretty much follow the reference solution.