
13 May 2013

Binary to decimal conversion: how hard can it be?

Abstract

How hard can it be? It’s easy if the numbers are just a few hundred bits long. After that it gets a little tricky if you care about how long it takes. Here I calculate the biggest known prime number (trivial), which is 57,885,161 bits long, then render it to decimal (the tricky bit). I use Scheme and Python but mostly C++. My code is pitifully slow compared with the MPIR/GMP libraries.

Scheme

A while ago I heard on the news that a new biggest prime number had been found: 2^57,885,161 - 1. So I thought I'd try to generate it.

(define (prime n)
  (- (expt 2 n) 1))

I found I had to be a little bit careful how I used this. I first tried

(prime 57885161)

and got no response for about an hour, so I killed it. (I'm running this on DrRacket version 5.2.1 on a 2008 vintage iMac.) But the following form, which binds the result to a name rather than printing it, executed in under one second

(define p (prime 57885161))

C++

And so I thought, how do I do this in C++? There is no built-in support in C++ for big numbers like there is in Scheme and Python. Boost offers a kind of wrapper around GMP - I haven't tried it. But I thought it's such a simple expression, how hard could it be to calculate it using only what the language itself offers?

My first idea was to represent a number as a vector of char where each char would represent one binary-coded decimal (BCD) digit. I'd need two simple functions, one to perform a multiply by two and one to perform a subtract one. Then I would start with 2 in my vector and call my multiply-by-two function 57,885,160 times and my subtract one function once. This has the advantage that the result will already be in decimal and the disadvantage that it would take hours, if not days, to run. Imagine, the multiply by two function would have to multiply and carry up to seventeen million digits each time it was called. And in total that function will be called fifty seven million times. Bonkers.

I needed a way to represent an arbitrary length binary number. One obvious way to do that is with a vector of some scalar value so that when you need to store more bits than will fit into one element of the vector you just push another onto the end and let the bits overflow into that.

A number of the form 2^n - 1 may be represented by an n-bit number where every bit is 1. For example, 2^4 - 1 = 15 decimal, or 1111 binary. So using our num_vec_t data structure a C++ function to calculate 2^n - 1 is quite simple
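A minimal sketch of such a function, assuming num_vec_t is a std::vector of 32-bit fragments stored least-significant first (the names and the fragment width are my assumptions, not the original listing):

```cpp
#include <cstdint>
#include <vector>

// a multi-precision unsigned binary number: 32-bit fragments,
// least-significant fragment first, no leading zero fragments
typedef std::vector<uint32_t> num_vec_t;

// return 2^n - 1, i.e. an n-bit number with every bit set
num_vec_t make_prime(unsigned n)
{
    num_vec_t result(n / 32, 0xFFFFFFFFu);  // the whole 32-bit fragments
    if (n % 32)                             // the partial top fragment, if any
        result.push_back((uint32_t(1) << (n % 32)) - 1);
    return result;
}
```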

(Yes, 'make_prime' is a misnomer because the function only returns a prime number for certain values of n, such as 57885161. But let's stick with this name.)

With this function I can generate the 57,885,161-bit biggest known prime number in about five milliseconds. My next problem is converting that big binary number to decimal.

From my schoolboy maths I know that if I divide a number by 10, the remainder will be the least-significant decimal digit of that number. For example, the decimal number 123 divided by 10 is 12 remainder 3. I can repeatedly extract the next least-significant decimal digit from my big binary number by dividing it by 10 over and over again until it reaches 0. The to_string() function in this program uses this idea
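A sketch of the idea (again assuming num_vec_t is a std::vector of 32-bit fragments, least-significant first; this is my reconstruction, not the original code):

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<uint32_t> num_vec_t; // 32-bit fragments, least-significant first

// convert num to a decimal string by repeatedly dividing it by 10;
// each division yields one decimal digit as its remainder
std::string to_string(num_vec_t num)
{
    std::string result;
    do {
        // divide num by 10 in place, most-significant fragment first,
        // carrying the running remainder down into each fragment
        uint32_t remainder = 0;
        for (auto i = num.rbegin(); i != num.rend(); ++i) {
            const uint64_t r = (uint64_t(remainder) << 32) | *i;
            *i = uint32_t(r / 10);
            remainder = uint32_t(r % 10);
        }
        result += char('0' + remainder);
        if (!num.empty() && num.back() == 0)
            num.pop_back(); // the number shrinks as the digits are extracted
    } while (!num.empty());
    std::reverse(result.begin(), result.end()); // digits emerged least-significant first
    return result;
}
```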

It takes to_string() a day to convert the 57 million-bit number to decimal. Heh. Python's three hours isn't looking so bad now and Scheme's five minutes is looking miraculous!

For every one of the 17 million decimal digits in the number, to_string() will on average have to run through the inner for loop about a million times. (To begin with the number is 57 million bits long, which means the vector contains nearly 2 million 32-bit words. By the end of the algorithm the vector is empty. Hence the average of one million inner-loop iterations per decimal digit.)

Just to underline this: that inner loop will execute 17,000,000,000,000 times before we have an answer. If my computer could execute one iteration of the inner loop in a nanosecond, which it can't, it would still take 17,000 seconds, or nearly five hours, to complete. As we've seen, in reality it takes 26 hours. (The good news is that this means the processor in my machine can execute those three statements in the inner for loop in about 5 nanoseconds, which I think is pretty good.)

There is a simple way to improve the algorithm. Instead of collecting one decimal digit per outer-while-loop we could collect nine. Like this
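Something along these lines (a sketch, not the original listing): divide by 1,000,000,000 instead of 10, and zero-fill each nine-digit group except the most significant one.

```cpp
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<uint32_t> num_vec_t; // 32-bit fragments, least-significant first

// convert num to decimal, collecting nine digits per pass
// by dividing by 1,000,000,000 instead of 10
std::string to_string(num_vec_t num)
{
    std::string result;
    do {
        uint32_t remainder = 0;
        for (auto i = num.rbegin(); i != num.rend(); ++i) {
            const uint64_t r = (uint64_t(remainder) << 32) | *i;
            *i = uint32_t(r / 1000000000);
            remainder = uint32_t(r % 1000000000);
        }
        while (!num.empty() && num.back() == 0)
            num.pop_back();
        std::string group = std::to_string(remainder);
        if (!num.empty()) // zero-fill all but the most-significant group
            group.insert(0, 9 - group.size(), '0');
        result.insert(0, group); // groups emerge least-significant first
    } while (!num.empty());
    return result;
}
```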

This algorithm is only slightly more complicated than the previous version but it's almost ten times quicker. I can convert a one million-bit number to decimal in about five seconds. But for our 57 million-bit number the inner loop still executes about 2,000,000,000,000 times, and in practice it still took over four hours.

How could our original Scheme program do this in under five minutes? Racket uses a better algorithm.

A quick look at the Racket source shows that they use the GMP library. Presumably, when I call (number->string p) the mpn_get_str() function in gmp.c is eventually invoked. It seems to require some thousands of lines of C code to do the binary to decimal conversion when you include the other GMP code used in the process. Somehow I feel disappointed that there isn't a simple algorithm that won't take me days to understand.

A better algorithm (for really big numbers)

It turns out it doesn't take days to understand the general idea: We've seen that converting one 57 million-bit number to decimal using the previous algorithm takes hours. But we also know that with the same algorithm we could convert 57 one million-bit numbers to decimal in 57 x 5 seconds, or about five minutes. So, can we split the one big number into several smaller numbers, convert them to decimal, then stitch all the results back into one? Yes we can.

Imagine it was going to take days to convert the binary representation of 123,456,789 to decimal, but only seconds to convert 1,234 and 56,789 to decimal. We could split our number into parts by dividing it by 100,000: the quotient is 1,234 and the remainder is 56,789, and concatenating the two decimal results gives 123456789. The catch is that a remainder with fewer than five digits loses its leading zeros: 100,000,042 divided by 100,000 is 1,000 remainder 42, and naively concatenating "1000" and "42" gives 100042, which is wrong.

But this is easily overcome by requiring results to be zero-filled fixed-width: after dividing by 10^n we will require the quotient and remainder to be at least n digits, filled with leading zeros if necessary. Then we'll remove any leading zeros from the final result.

If you're wondering why the quotient needs to be fixed width, it's because we will be applying this algorithm repeatedly, so the quotient might be the high part of the remainder from an earlier division. We break the number into two, then we break each of these numbers into two, and so on until we have numbers small enough to be converted to decimal with our O(n^2) algorithm from prime1.cpp.

Here is the complete C++ program that uses this algorithm to print the biggest known prime
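The heart of the algorithm can be sketched on ordinary 64-bit integers; the real program runs the same recursion over num_vec_t, with big-number divide in place of the / and % below. The function names here are mine, not the original's.

```cpp
#include <string>

// 10 raised to the power e (e small enough not to overflow)
unsigned long long pow10u(int e)
{
    unsigned long long p = 1;
    while (e--)
        p *= 10;
    return p;
}

// render n as exactly 'digits' decimal digits, zero-filled,
// by splitting n in two and recursing on each half
std::string to_string_rec(unsigned long long n, int digits)
{
    if (digits <= 4) { // small enough: convert directly
        const std::string s = std::to_string(n);
        return std::string(digits - s.size(), '0') + s;
    }
    const int low_digits = digits / 2;
    const unsigned long long p = pow10u(low_digits);
    // the quotient holds the high digits, the remainder the low digits;
    // both halves are fixed width so no leading zeros are lost
    return to_string_rec(n / p, digits - low_digits)
         + to_string_rec(n % p, low_digits);
}

std::string to_string(unsigned long long n)
{
    std::string s = to_string_rec(n, 20); // 2^64 - 1 has 20 digits
    const std::string::size_type i = s.find_first_not_of('0');
    return i == std::string::npos ? "0" : s.substr(i); // strip leading zeros
}
```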

Well, that's an improvement, but not as dramatic as I hoped for. I think I'm using the same algorithm as MPIR, but it still takes hours when I hoped for minutes. I'm going to try using the MPIR library functions for multiply and divide instead of my Knuth-derived ones. This turns out to be quite easy to do because my representation of a multi-precision binary number as a vector of "fragments" maps quite nicely onto MPIR's contiguous "limbs". I need only be careful to match the size of my fragments to MPIR's limbs: my Windows MPIR library uses 32-bit limbs and my OS X MPIR library uses 64-bit limbs. Bearing this in mind, I modify mul() and div() in prime2.cpp to call the MPIR functions

Wow! That is the dramatic improvement I was looking for. Switching from the multiply and divide algorithms given in The Art of Computer Programming, Volume 2, section 4.3.1 to those used by the MPIR library made the code run 30 times faster.

What if I use the MPIR built-in binary-to-decimal function, mpn_get_str(), directly?

First, much respect to the mathematicians and programmers who developed and implemented the algorithms in GMP/MPIR.

I would like to have coded something that got a lot closer to the speed of the GMP/MPIR library code, but that would have taken a huge effort and I have other ideas bubbling up. It's time to move on.

How hard is binary to decimal conversion? It depends. For a few hundred bits the simple algorithm in prime1.cpp works well enough. Beyond that either use MPIR/GMP or be prepared to invest a lot of effort. (I hope someone will tell me about a simple, fast algorithm to do it.)