Supercomputers And Prime Numbers: Verifying The Largest Known Prime

A few days ago I wrote about the discovery of the largest known
prime number, the Mersenne prime M(74,207,281), a 22-million-digit number. Upon hearing this exciting news, many people gave a collective shrug; the "meh"s were almost audible. I tried to point out that large primes are useful for encryption algorithms, and caught some flak for that, too. Fair enough. Numbers this large aren't useful for encryption; indeed, we only need primes with a couple hundred digits to keep things secure (until quantum computers become a reality, in which case we might need to reexamine things).

So, here's a real reason why finding big primes like this is important: it drives improvements in supercomputing. The Great Internet Mersenne Prime Search (GIMPS) software that volunteers install on their personal computers is not foolproof. There is a small chance that it returns a false positive, so potential new primes must be verified, and this requires more computing power than the average person has in their living room.

One of the groups that verified that M(74,207,281) is indeed prime is an Ohio-based company called Squirrels. It's a software development firm, mostly working on mobile apps, but it also runs a rather unusual and powerful computer cluster. The system consists of dozens of high-end consumer graphics cards, employed for numerical computation instead of gaming. Graphics cards can handle thousands of calculations in parallel, so the system is fast. It generates a lot of heat, but Squirrels uses a liquid cooling system and recycles the extra heat to warm the office.

Before getting to the how of the verification, let's talk about why. What's in it for a software company like Squirrels to use resources like this to check a calculation that, let's be honest, doesn't have a real application? I asked Sidney Keith, who works for Squirrels, and he told me "this problem is a driving problem [for us] - the system and technology we improve to make sure this verification is perfect will also help ensure the results of tomorrow's analysis of a possible cancer cure is just as correct."

Most of us assume that computers always spit out correct answers. For the problems we feed into our laptops this is generally true, but the algorithms underlying the software do have to be sound or they may fail to give reliable results. In big calculations, a small error at one stage can propagate through and lead to huge problems later on. So companies need to know that their hardware and software are reliable when it counts; warming up on a calculation that is interesting but not exactly essential is a good way to do this.

Verifying that M(74,207,281) is prime required around 34 quadrillion steps. Squirrels ran their graphics cards in parallel; every million steps or so, three cards compared their results. In the case of a mismatch, the calculations were repeated until all three cards agreed. For extra assurance, they ran the same calculation on each of those cards with a slightly different shift value, ensuring that different bits of the memory and processor wiring were used for each test. Imagine computing 456 × 456 on one card, 45,600 × 45,600 on another, and 4,560,000 × 4,560,000 on the third, then dividing the results by 1, 10,000, and 100,000,000 at the end to make sure they all match. This much redundancy trades speed for accuracy, but it is worth it to be sure the calculation is correct.
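The shift trick is easy to sketch in a few lines of Python. This is a toy illustration, not Squirrels' actual code: here the "shift" is a decimal scaling, the function names are my own, and real runs would shift binary representations across GPU memory.

```python
def square_with_shift(x, shift):
    """Square x after scaling it by 10**shift, then undo the scaling.

    Scaling the input pushes the digits through different parts of
    memory and the arithmetic circuitry, so a hardware fault is
    unlikely to corrupt every shifted run in exactly the same way.
    """
    scaled = x * 10**shift
    return (scaled * scaled) // 10**(2 * shift)

def redundant_square(x, shifts=(0, 2, 4)):
    """Run the same squaring with several shifts and demand agreement."""
    results = [square_with_shift(x, s) for s in shifts]
    if len(set(results)) != 1:
        raise ArithmeticError("results disagree -- rerun the calculation")
    return results[0]

print(redundant_square(456))  # all three shifted runs agree on 207936
```

Any disagreement among the shifted runs signals a hardware or software fault rather than a property of the number being tested, which is exactly why the redundancy is worth the slowdown.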

And, they did it twice, on two different clusters: one built on NVIDIA GPUs finished in 2.3 days, and one built on AMD GPUs finished in 3.5 days. The original calculation using the GIMPS software on a typical PC took 31 days.

The main problem is that multiplying large numbers is computationally expensive. Given a pair of n-digit numbers, computing their product naively via the algorithm you learned in elementary school requires about n² operations. If n is large this is impractical. But there are faster algorithms that require only roughly n·log(n) operations. These are based on the Fast Fourier Transform (FFT). Fourier analysis is a technique for converting signals from the time domain into the frequency domain (and vice versa). FFT-based algorithms for integer multiplication take as input numbers written in binary (as a computer would store them), treat them as polynomials in the variable B (the base, 2 in this case), compute the FFT of these polynomials, multiply the transforms term by term, and then compute the inverse FFT. If you want to multiply two million-digit numbers together, say, the traditional method would take around a trillion operations, while an FFT-based method takes only around 1,000,000·log₂(1,000,000) ≈ 20,000,000. That's much faster.
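Here is a bare-bones sketch of the idea in Python, using a textbook recursive FFT on base-10 digits rather than the binary, heavily optimized transforms real implementations use. It is meant only to make the transform-multiply-invert pipeline concrete; production code takes far more care with rounding error.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def fft_multiply(x, y):
    """Multiply nonnegative integers via an FFT on their base-10 digits."""
    a = [complex(int(d)) for d in str(x)][::-1]  # least-significant first
    b = [complex(int(d)) for d in str(y)][::-1]
    n = 1
    while n < len(a) + len(b):                   # pad to a power of 2
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    prod = fft([fa[i] * fb[i] for i in range(n)], invert=True)
    digits = [round(c.real / n) for c in prod]   # undo the FFT's scaling
    return sum(d * 10**i for i, d in enumerate(digits))  # carries via big ints

print(fft_multiply(456, 456))  # 207936
```

Note that the term-by-term product of the transforms corresponds to the convolution of the digit sequences, which is exactly the digit pattern of the product before carrying.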

FFT-based multiplication algorithms like these are what is implemented on the graphics cards in the Squirrels cluster. Primality testing on numbers such as M(74,207,281) is a demanding way to verify the implementations of these algorithms; if problems are found they can be fixed, yielding improved precision in the calculations.
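For Mersenne numbers specifically, the test being run is the Lucas-Lehmer test, and nearly all of its cost is the giant squaring in each step, which is where FFT multiplication earns its keep. A minimal Python version (fine for small exponents; GIMPS and the GPU verifications perform each squaring with FFT arithmetic):

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test: for an odd prime p, M(p) = 2**p - 1 is prime
    exactly when the sequence s -> s*s - 2 (mod M(p)), started at 4,
    reaches 0 after p - 2 steps."""
    if p == 2:
        return True          # M(2) = 3 is prime
    m = (1 << p) - 1         # the Mersenne number 2**p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m  # one giant squaring per step
    return s == 0

print(lucas_lehmer(7))   # True:  M(7)  = 127 is prime
print(lucas_lehmer(11))  # False: M(11) = 2047 = 23 * 89
```

For M(74,207,281) each of those squarings involves a number tens of millions of digits long, which is why the naive n² method is hopeless and the FFT approach is essential.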

So, yes, finding a new large prime is not especially important mathematically. But the work that goes into the discovery has significant impact on the machinery and software used and may contribute to more accurate calculations on problems that really matter.

As a mathematician, though, I still just think it's cool to find big primes.

I am a professor of mathematics at the University of Florida with research interests in various areas in topology, including topological data analysis. I am also interested in undergraduate education, particularly at the intersection of the sciences and humanities.