How do $n_1$ and $n_2$ compare? I think the answers below are assuming that $n_1$ is significantly larger than $n_2-n_1$.
– David Speyer, Jul 25 '10 at 18:35

@David: Without this assumption he could just generate all primes less than $n_1$ using Atkin's sieve (cr.yp.to/primegen.html) if $n_2$ is around $10^{15}$, or simply forget about it if $n_2$ is much bigger.
– j.p., Jul 25 '10 at 20:48


What do you consider large and what small? E.g., is $n_1$ around 1000 bits long? And does "large" for $n_1$ mean the same as "large" for $n_2-n_1$? (I wouldn't try to find all primes in an interval of length $2^{1000}$, but finding primes of size $2^{1000}$ is no problem.)
– Someone, Jul 26 '10 at 8:44


Is there any reason why you don't want to generate the primes $<n_1$ in cases [1] and [2]? Otherwise you can simply use the program from cr.yp.to/primegen.html. And in your cases [3] ($n_1=2^{100}$) and [4] ($n_1 = 2^{1000}$), with $n_2 = 2n_1$ in both, the intervals are much too long: you'll get around $2^{100}/\ln(2^{100}) \approx 1.8\cdot10^{28}$ resp. $2^{1000}/\ln(2^{1000}) \approx 1.5\cdot10^{298}$ primes.
– Someone, Jul 26 '10 at 10:20


Do not use the Gries algorithm! It's actually slower than the Sieve of Eratosthenes. See Pritchard's "Improved incremental prime number sieves".
– Charles, Jul 26 '10 at 20:22

9 Answers

The fastest approach should be to first sieve the numbers by marking those divisible by small primes (this needs only one long division per small prime), then to apply a Fermat test to base 2 (efficient, since multiplication by 2 is a left shift) to all unmarked numbers. Finally, apply a certain number of Miller-Rabin tests to all candidates passing the Fermat test, to reduce your error probability to a level you can tolerate (e.g., $2^{-100}$).
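A minimal sketch of this pipeline in Python (the function names are illustrative; the Fermat pre-test is the `pow(2, n - 1, n) == 1` check, and the built-in three-argument `pow` stands in for the shift-based exponentiation the answer describes):

```python
import random

def is_probable_prime(n, rounds=30):
    """Miller-Rabin; error probability at most 4**-rounds for composite n."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def primes_in_interval(n1, n2, small_primes):
    """Mark multiples of the small primes, Fermat-test (base 2) the survivors,
    then confirm them with Miller-Rabin."""
    width = n2 - n1 + 1
    candidate = [True] * width
    for p in small_primes:
        for i in range((-n1) % p, width, p):  # first multiple of p in [n1, n2]
            if n1 + i != p:                   # keep p itself if it is in range
                candidate[i] = False
    sp = set(small_primes)
    result = []
    for i in range(width):
        n = n1 + i
        if not candidate[i]:
            continue
        if n in sp:
            result.append(n)
        elif n > 2 and pow(2, n - 1, n) == 1 and is_probable_prime(n):
            result.append(n)
    return result
```

Raising `rounds` to 50 pushes the error bound $4^{-\text{rounds}}$ down to the $2^{-100}$ level mentioned above.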

Don't use a Fermat test; use a Miller-Rabin test for the initial scan. It's just as fast and slightly better. If you want an error level as low as $2^{-100}$, you should use one of the Frobenius tests like Grantham's (but not for the initial test, of course!)
– Charles, Jul 27 '10 at 6:37

@Charles: A Fermat test with base $2$ takes approx. 25% less time than an M-R test, as one can replace the square-and-multiply algorithm for the exponentiation by a square-8-times-then-multiply algorithm if the numbers have more than 256 bits. Per bit of the exponent one therefore needs only 9/8 operations instead of 3/2 on average.
– j.p., Jul 27 '10 at 10:34

@Charles, part II: To get an error level of $2^{-100}$, one should use the results of Damgård, Landrock, and Pomerance (math.dartmouth.edu/~carlp/PDF/paper88.pdf) instead of the weaker bound $4^{-k}$ for $k$ tests (DLP show that 9 tests for bit length 512 and 2 tests for bit length 768 are sufficient). I do not know of a similar analysis of Grantham's Frobenius test, but would be interested in one.
– j.p., Jul 27 '10 at 10:46

This was originally a comment on TonyK's answer, but it was too large to submit as a comment so I'm putting it here. I thought the numbers might be of value to some.

Let's say that it takes $b^{\lg 6}/10^{10}$ seconds to test a $b$-bit number with the M-R test and $b^{\lg 24}/10^{13}$ seconds to prove primality with ECPP. Testing a range up to $n_2=2^b$ with M-R + ECPP would take about
$$\left(n_2-n_1\right)\left(\frac{b^{\lg 6}}{10^{10}}+\frac{b^{\lg 24}}{10^{13}\ln n_2}\right)=\left(n_2-n_1\right)\left(\frac{b^{\lg 6}}{10^{10}}+\frac{b^{\lg 12}}{10^{13}\ln2}\right)$$
seconds. For $n_2-n_1=2^{20}$ and $n_2$ large, this is about $1.5(\lg n_2)^{\lg12}/10^7$ seconds.

On the other hand, suppose sieving $n_1$ to $n_2$ takes $\sqrt{n_2}/10^6+(n_2-n_1)/10^{8.5}$ seconds. For $n_2-n_1=2^{20}$ and $n_2$ large, this is about $\sqrt{n_2}/10^6$ seconds.

Equating the two suggests that, for intervals about a million wide, testing each member is superior to sieving beyond about $n_2>8\cdot10^8$. Only about 12 kB of memory are needed to store all the primes up to the square root of that limit, so the fourth power trick doesn't seem viable here at all -- by the time you run out of primary memory you shouldn't be sieving at all.

@Charles: Maybe your example is correct and valid when the interval $[n_1,n_2]$ is small compared with $n_2$, but when the interval is relatively wide (for example when $n_1=n_2/2$), the same demonstration can be used to prove the contrary.
– Mohammed Marey, Jul 27 '10 at 18:57

I used the width you used. For any fixed interval width it can be shown that beyond some fixed point, standard sieving (Eratosthenes/Atkin) is less efficient than testing each member in the range. (If you implement the algorithms yourself, it's obvious.) But of course I intend for the argument to go both ways: when the interval is wide compared to its height, sieving is better -- often vastly better.
– Charles, Jul 27 '10 at 19:36

@Charles: Yes, you are right; the efficiency of the algorithm used depends on the values of $n_1$ and $n_2$.
– Mohammed Marey, Jul 28 '10 at 21:16

If $n_1$ and $n_2$ are not too big, you can use the sieve of Eratosthenes (see Wikipedia for this).
If they are big enough, you can use a sieve to cross out the numbers which have small prime divisors, and after that check every number that wasn't crossed out with a primality test (see http://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test).

Unless you have very specialized needs, you can probably apply standard sieving techniques. Here are a couple of references. First, hot off the press, is Kjell Wooding's Calgary thesis [1], which includes a good overview of both software and hardware sieving techniques (e.g. FPGA-based sieves, Calgary's scalable sieve, etc.). Its bibliography should prove useful as an entry point into the literature. Also, here [2] is a charming classic paper by D.H. Lehmer, one of the pioneers of computational number theory; it gives a nice, succinct introduction to sieving in general. Such methods have a long, venerable history that stimulated much development in both number theory and computer science. For example, google "Lehmer sieve" and you'll discover many ingenious machines devised to carry out number-theoretic computations long before the dawn of the modern digital computer, e.g. via bicycle chains, photoelectric devices, etc.

You can get away with storing $n_2^{1/4}$ bits for the sieving data. Generate the primes $\le n_2^{1/4}$, and use them to generate the primes in successive intervals of length $n_2^{1/4}$ dynamically, up to $n_2^{1/2}$. The primes thus generated can be used to generate the primes in the interval $[n_1,n_2]$.
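A sketch of this two-level scheme (Python for concreteness; all function names are made up for illustration, and `segmented_primes` regenerates the mid-range primes in blocks of length about $n_2^{1/4}$):

```python
import math

def sieve_upto(limit):
    """Plain Eratosthenes; only used for the tiny base set up to n2**(1/4)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i in range(limit + 1) if flags[i]]

def segmented_primes(lo, hi, base_primes):
    """Primes in [lo, hi], given base primes covering everything up to sqrt(hi)."""
    flags = bytearray([1]) * (hi - lo + 1)
    for p in base_primes:
        start = max(p * p, ((lo + p - 1) // p) * p)  # first multiple >= lo, keeping p
        for m in range(start, hi + 1, p):
            flags[m - lo] = 0
    return [lo + i for i, f in enumerate(flags) if f and lo + i > 1]

def primes_via_quarter_sieve(n1, n2):
    """Keep only the primes up to n2**(1/4) permanently; regenerate the primes
    up to sqrt(n2) in segments, then use them all to sieve [n1, n2]."""
    quarter = int(n2 ** 0.25) + 1
    base = sieve_upto(quarter)
    root = math.isqrt(n2)
    mid, lo = [], quarter + 1
    while lo <= root:
        hi = min(lo + quarter - 1, root)
        mid.extend(segmented_primes(lo, hi, base))
        lo = hi + 1
    return segmented_primes(n1, n2, base + mid)
```

A genuinely memory-lean version would also process $[n_1,n_2]$ in segments and regenerate `mid` on the fly for each one instead of holding it in a list.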

I'm assuming you're asking whether we can generate the primes in $[n_1, n_2]$ without generating the primes before that range (using a sieve). The answer is no: there is no alternative to testing every number between $n_1$ and $n_2$ for primality using something like Miller-Rabin.

Note that this will be significantly slower than sieving unless the numbers are extremely large and the range [n1, n2] is comparatively small.

NB. From what I understand, Miller-Rabin is still the preferred method for something like this in practice even though it's probabilistic; AKS is much, much slower, and Bernstein's probabilistic version of AKS is still slower than M-R.
– Steve Huntsman, Jul 25 '10 at 19:04

Of course M-R doesn't prove primality, so in some cases it's inappropriate unless followed by a proof. For example, it would be a bad choice if you were enumerating the pseudoprimes like, e.g., Richard Pinch... chalcedon.demon.co.uk/rgep/carpsp.html
– Charles, Jul 25 '10 at 19:08

Yes, Rabin-Miller is used in practice in cryptography, usually with 100 rounds to give a failure probability of at most $4^{-100}$, which is close enough to zero for practical purposes. However, if this worries you too much, you can of course use something like AKS, which is deterministic.
– BlueRaja, Jul 25 '10 at 23:39


That is what I am asking about: to generate all primes in $[n_1,n_2]$ without generating the primes less than $n_1$.
– Mohammed Marey, Jul 26 '10 at 8:23

Let $S=\sqrt{n_2}$. If $S$ is bigger than your computer's memory in bits, then you're going to have trouble generating all the primes in that range; if you insist, break the range into pieces that fit into memory, mark off the numbers with small divisors, and test the remaining numbers for primality. You want to do as much sieving as possible, so fill maybe half the memory with primes and use those for sieving.

If $S$ fits in memory and $n_2-n_1$ is not much smaller than $S$, use a sieve (sieve of Atkin or Eratosthenes). You may need a segmented version of the sieve, depending on how large $n_2-n_1$ is.

If $n_2-n_1$ is significantly smaller than $S$, make a bit array for (segments of?) the range, mark off small divisors, and test the remaining members for primality.

This means we should test each number in $[n_1, n_2]$ for primality, which is not nice if $[n_1, n_2]$ contains, for example, $2^{10}$ or $2^{50}$ numbers! Or we should generate all primes $p < n_2$, which is not nice if $n_1$ is a large number, for example $2^{50}$!
– Mohammed Marey, Jul 26 '10 at 10:16


@Mohammed: The average distance between primes is ~$\ln(n)$, so even if you are only looking for the primes in $[2^{50}, 2^{51})$ ($2^{50}$ numbers), you are still looking at many terabytes of space to store all those primes.
– BlueRaja, Jul 26 '10 at 16:29


Practically speaking, if you want to generate all primes in a large interval, you need to be able to store the primes up to the square root of the upper end of the interval in main memory. To sieve $[2^{50}, 2^{51}]$ you would need to store the 2,857,269 primes up to $\sqrt{2^{51}}$, which is pretty easy (11 MB using 32-bit integers). With 8 GB of main memory you could sieve an interval with an upper bound of $2^{68}$.
– Charles, Jul 26 '10 at 17:00
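The arithmetic in the comment above is easy to check (a rough Python sketch; the byte sieve below is only meant for counting, not for production sieving):

```python
import math

def count_primes_upto(n):
    """pi(n) via a plain byte sieve -- fine for n around 5e7."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(flags)

limit = math.isqrt(2**51)  # 47,453,132
# count_primes_upto(limit) gives the number of primes to store;
# at 4 bytes per prime this is on the order of 11 MB.
```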

@Charles: Is there a clear reference explaining your practical comment in detail?
– Mohammed Marey, Jul 27 '10 at 11:46


@Mo: You'll have to if you're using the sieve of Eratosthenes or Atkin. But if you're looking at $[2^{1000},2^{1000}+2^{20}]$ then you won't want to store the roughly $10^{148}$ primes up to the square root. Instead you should sieve out the multiples of the first thousand primes (removing ~94% of the range), M-R the remaining numbers (removing ~98% of the candidates), and prove the primality of the remainder -- or test them to sufficient confidence, as desired.
– Charles, Jul 27 '10 at 16:39

The other answers have covered the right strategies, but here are a few additional comments.

On an idealized computer with an unbounded amount of memory, the cost of sieving $[n_1,n_2]$ for the prime $p$ is $C_1(n_2-n_1)/p$ for some constant $C_1$, the average cost of a memory read & write operation. Therefore, the cost of sieving $[n_1,n_2]$ for all primes up to some bound $B$ is
$$C_1 (n_2 - n_1) \sum_{p\leq B} \frac{1}{p} = (n_2 - n_1) (C_1 \log \log B + O(1)),$$
by the 2nd Mertens Theorem. The number of integers in $[n_1,n_2]$ that remain after sieving is approximately
$$(n_2-n_1)\prod_{p \leq B} \left(1-\frac{1}{p}\right) = (n_2 - n_1) \frac{e^{-\gamma}}{\log B},$$
by the 3rd Mertens Theorem. Therefore, the time for testing the remaining numbers for primality is about
$$C_2(n_2-n_1) \frac{e^{-\gamma}}{\log B},$$
where $C_2$ is the average cost of a single primality test. The optimal sieving bound $B$ is then approximately $\exp(e^{-\gamma} C_2/C_1)$.
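This optimization is easy to check numerically (placeholder constants: $C_1 = 1$ and $C_2 = 20$ are invented just to exercise the formula; in practice they would be measured):

```python
import math

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def cost(B, C1=1.0, C2=20.0):
    """Per-number cost: C1*log(log(B)) for sieving up to B, plus
    C2*e^(-gamma)/log(B) for primality-testing the survivors."""
    return C1 * math.log(math.log(B)) + C2 * math.exp(-GAMMA) / math.log(B)

B_opt = math.exp(math.exp(-GAMMA) * 20.0 / 1.0)  # the analytic optimum
```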

The "constants" $C_1$ and $C_2$ are best determined empirically. In reality, $C_1$ depends heavily on the length of the interval $[n_1,n_2]$, while $C_2$ depends mostly on the size of $n_2$. The constant $C_2$ should vary smoothly with $n_2$, but there will be sharp increases in $C_1$ when the cache size is exceeded, and a still larger increase if disk swaps become necessary.

Here is a trick that will help you fit longer intervals into the cache, especially if you have multiple processors available. The cost of sieving is highest for the small primes. To save some of this time, break the interval into arithmetic progressions $a + Mx$, where $\gcd(a,M) = 1$ and $M = 2\cdot 3 \cdots p_i$ is the product of the first $i$ primes. Each such progression can be sieved independently, possibly on different processors. Essentially, we're breaking the interval into $\phi(M) = (2-1)(3-1)\cdots(p_i-1)$ different progressions, each of which requires only $(n_2-n_1)/M$ bits to store in memory, while the total sieving and testing time remains about the same. I have some C code that implements this strategy with $M = 2\cdot 3 \cdots 29$ if you're interested.
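A toy version of this decomposition (in Python rather than the C mentioned above; `M = 30` for brevity, all names are illustrative, and `pow(M, -1, p)` needs Python 3.8+):

```python
import math

def primes_by_residue_classes(n1, n2, M=30):
    """Sieve [n1, n2] one residue class a mod M at a time (gcd(a, M) = 1).
    Each class needs only about (n2 - n1)/M flags, and the classes are
    independent, so they could be handled by different processors."""
    root = math.isqrt(n2)
    base = [p for p in range(2, root + 1)            # sieving primes, by
            if all(p % q for q in range(2, math.isqrt(p) + 1))]  # trial division
    # the primes dividing M lie in no coprime residue class, so add them directly
    found = [p for p in base if M % p == 0 and n1 <= p <= n2]
    for a in range(M):
        if math.gcd(a, M) != 1:
            continue
        v0 = n1 + ((a - n1) % M)                     # first value >= n1 in the class
        count = len(range(v0, n2 + 1, M))
        flags = [True] * count
        for p in base:
            if M % p == 0:
                continue          # no multiples of 2, 3, 5 appear in these classes
            i0 = (-v0 * pow(M, -1, p)) % p           # first index with p | (v0 + M*i)
            for i in range(i0, count, p):
                if v0 + M * i != p:                  # keep p itself
                    flags[i] = False
        found.extend(v0 + M * i for i in range(count) if flags[i] and v0 + M * i > 1)
    return sorted(found)
```

Each inner `flags` array is what would live in cache (or on its own processor); the modular inverse of $M$ mod $p$ converts "step by $p$ in the integers" into "step by $p$ in the class's index space".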

The fact that $\phi(30) = 8$ can also be exploited nicely to fit longer intervals into the cache.
– j.p., Jul 27 '10 at 19:18

Good analysis. It's worth noting, though, that $C_2$ isn't really constant, in that it depends on the amount of sieving done! Suppose you use a Miller-Rabin test followed by ECPP if needed. The expected cost is cost(M-R) + cost(ECPP)/1000, since about 999/1000 are expected to fail the Miller-Rabin test. If 99% of the composites are sieved out, then the expected cost rises to cost(M-R) + cost(ECPP)/10.
– Charles, Jul 27 '10 at 20:09