If I had access to potentially unlimited CPUs and wanted to quickly check 100 million digit numbers for primality using a map-reduce architecture, how many CPUs would be necessary? Each of the mapped compute instances would perform efficient checks against the number in question with an assigned range of divisors (e.g. Instance 1: checks divisors 1-1000, Instance 2: checks divisors 1001-2000, ... etc.).

Definitions:

quickly means checking a 100 million digit number within 30-60 minutes.
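
For concreteness, here is a minimal sketch of one mapped instance as described in the question (Python; `check_range` is a name invented for illustration):

```python
def check_range(n, lo, hi):
    """Mapper: return a divisor of n found in [lo, hi], else None."""
    for d in range(max(lo, 2), hi + 1):  # skip 1, which divides everything
        if n % d == 0:                   # n % d on a 10^8-digit n is itself costly
            return d
    return None

# Reduce step: n is composite iff any mapper returned a divisor, e.g.
# check_range(n, 1, 1000), check_range(n, 1001, 2000), ...
```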


The question implicitly assumes that the primality testing is by trial division, which is actually a bad primality test, even with massive parallelization. Nevertheless, if the question is roughly how many trial divisions are needed, it's about the square root of the number. OK, so a $10^8$ decimal digit number needs about $10^{\frac{1}{2}\cdot 10^8}$ trial divisions. No, no typo. With, let's say, $10^{10}$ CPUs (is this how many substantial computers exist in the world?), each one would need to do about $10^{\frac{1}{2}\cdot 10^8 - 10}$ trial divisions. That $-10$ in the exponent is not helping much...
–
paul garrett Jul 22 '11 at 20:17
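
The exponent arithmetic in this comment is easy to verify directly; a quick check (Python, working with exponents only, since the numbers themselves could never be written down):

```python
digits = 10**8                # decimal digits of the candidate n
trials_exp = digits // 2      # sqrt(n) ~ 10^(digits/2): that many trial divisions
cpus_exp = 10                 # 10^10 CPUs, as assumed in the comment
print(trials_exp - cpus_exp)  # 49999990 -- still ~10^49999990 divisions per CPU
```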

1 Answer

This question is a bit unclear; still, I will try to give some sort of answer.

Some initial remarks:

First, at the moment no one has succeeded in proving primality for a 100 million (decimal) digit number. The current record is (I believe) close to 13 million digits (in binary this would still not be 100 million).

Second, one does not do these tests by trial division, as your description seems to suggest.

Having said this, as a thought experiment a very rough and overly optimistic calculation:

Suppose you have a 100 million digit number. Then you will have to test whether it is divisible by numbers of size up to its square root, that is, 50 million digit numbers.
So, you test about $10^{50{,}000{,}000}$ numbers.

Suppose you do your division with only one processor instruction. I am not overly knowledgeable on processor speeds, but, according to the list here, let's assume you can do 50 GIPS, that is, $5\times 10^{10}$ instructions per second.

Now, in an hour you will do, let's be generous, $2\times 10^{14}$ instructions, by our assumption divisions. To get through all $10^{50{,}000{,}000}$ trial divisions within an hour you would thus need on the order of $10^{49{,}999{,}986}$ CPUs, which is beyond absurd.
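
A quick numeric check of this estimate (Python; the 50 GIPS figure is the assumption from above):

```python
ips = 5 * 10**10          # 50 GIPS, one trial division per instruction
per_hour = ips * 3600     # divisions per CPU-hour
print(f"{per_hour:.1e}")  # 1.8e+14, generously ~2e14

trials_exp = 50_000_000   # ~10^50,000,000 candidate divisors in total
print(trials_exp - 14)    # 49999986 -- about 10^49999986 CPU-hours needed
```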

As said, you cannot do this like this. Also note that using the approach you sketch you would effectively find a factor and thus a factorization. For factoring, the current records are way smaller than the ones for primes I mentioned above. It is a major challenge to factor numbers with (low) hundreds of digits. Note that current RSA keys are of size a thousand or two (maybe four) thousand bits, so (higher) hundreds to about a thousand decimal digits only.
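
For the bit-to-digit conversion used here, a one-liner suffices (a $b$-bit number has about $b\log_{10}2$ decimal digits):

```python
from math import ceil, log10

for bits in (1024, 2048, 4096):
    print(bits, "bits ->", ceil(bits * log10(2)), "decimal digits")
# 1024 -> 309, 2048 -> 617, 4096 -> 1234
```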

P.S. Towards the end of my writing, I saw paul garrett's comment, which is similar. Perhaps the details are useful. And, to those who mind, sorry for answering the off-topic question.

I could store all of the known contiguous prime numbers and check against those and then perform trial division of odd numbers up to the square root. If I don't do trial division, how else will I divide and conquer in map-reduce?
–
Adam F Jul 22 '11 at 21:40
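
A minimal sketch of the strategy proposed in this comment (Python; assumes `stored_primes` is a contiguous ascending list starting 2, 3, 5, ...):

```python
from math import isqrt

def is_prime_trial(n, stored_primes):
    """Check the stored primes first, then odd candidates up to sqrt(n)."""
    if n < 2:
        return False
    limit = isqrt(n)
    for p in stored_primes:
        if p > limit:
            return True             # no divisor up to sqrt(n): prime
        if n % p == 0:
            return n == p
    d = stored_primes[-1] + 2       # next odd candidate after the list
    while d <= limit:
        if n % d == 0:
            return False
        d += 2
    return True
```

As quid's comment below points out, though, storing enough primes to matter at this scale is hopeless.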

As a first step, you could do Euclid's gcd algorithm with products of some of the primes. Even if you could do this quickly with, say, products of a million primes each time, that only reduces the exponent above by one or two orders of magnitude. There is still the problem of generating the products for use in such an endeavour. Better to read up on AKS. Gerhard "Ask Me About System Design" Paseman, 2011.07.22
–
Gerhard Paseman Jul 22 '11 at 23:33
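
A sketch of the batched-gcd idea from this comment (Python; a real million-prime product would itself be a huge precomputed integer, which is exactly the difficulty the comment mentions):

```python
from math import gcd, prod

def factor_via_batched_gcd(n, prime_batches):
    """One gcd of n against a product of many primes replaces
    many individual trial divisions."""
    for batch in prime_batches:    # e.g. a million primes per batch
        g = gcd(n, prod(batch))    # in practice, precompute prod(batch)
        if g > 1:
            return g               # g collects n's factors from this batch
    return None                    # no factor among the batched primes
```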

Adam F, regarding 'storing primes' to do something for 100 million digit numbers: note that up to size $x$ there are about $x/\log x$ primes. Thus up to $10^{50{,}000{,}000}$ there are something like (generously taking 4 as an approximation for the natural log of 10) $10^{50{,}000{,}000}/(2\times 10^{8}) = 5\times 10^{49{,}999{,}991}$ primes. This is an insanely large number relative to, say, a terabyte ($10^{12}$ bytes); a numeric check follows after these comments. I do not know about the specifics of MapReduce, but a couple more general remarks/questions:
–
quid Jul 23 '11 at 1:58

a. Do you want to find a prime of this size, or test one specific number (if the latter, I would be curious why)? For the former: there is an EFF monetary award of \$150,000 for the first 100 million digit prime. So, besides scientific interest, there would be a simple economic interest in finding such a prime, if it were feasible (with a cost of less than that amount). b. The largest known primes are all Mersenne primes and as such of a particular form (a Lucas-Lehmer sketch follows below). It also makes a difference whether you want to test a generic number or one of a particular form.
–
quid Jul 23 '11 at 1:58
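
A quick numeric check of the prime-count estimate above, worked purely in logarithms (Python; using the accurate $\ln 10 \approx 2.3$ rather than the generous 4):

```python
from math import log, log10

digits = 50_000_000          # x = 10^50,000,000
ln_x = digits * log(10)      # ln x ~ 1.15e8
print(digits - log10(ln_x))  # ~49999991.9, i.e. ~10^49999992 primes below x
```

And, for point b, a standard sketch of the Lucas-Lehmer test, which is what makes Mersenne numbers $2^p - 1$ so much easier to test than generic numbers:

```python
def lucas_lehmer(p):
    """2^p - 1 is prime iff s_{p-2} == 0 (p an odd prime).
    Just p - 2 modular squarings -- no trial division at all."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

print([p for p in (3, 5, 7, 11, 13) if lucas_lehmer(p)])  # [3, 5, 7, 13]
```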