Integer Logarithms

May 7, 2010

The integer logarithm of a number n to a base b is the number of times b can be multiplied by itself without exceeding n.

The implementation of the ilog function in the Standard Prelude recently changed. The old version looked like this:

(define (ilog b n)
(if (zero? n) -1
(+ (ilog b (quotient n b)) 1)))

That takes time O(n) as it repeatedly divides by b. A better version takes time O(log n) as it brackets the logarithm between two possible values then uses bisection to calculate the solution.

Your task is to write a program that calculates the integer logarithm of a number to a given base in time O(log n). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Initially, I bracketed the answer by squaring the base (base, base^2, base^4, …) until it was greater than ‘n’. Then did a binary search between the last two powers. I tried optimize the routine by reusing previously calculated powers of the base. In the end I came up with the routine below, which is about twice as fast as the binary search.

Repeatedly square the base until it is larger than ‘n’, saving the calculated bases and exponents on a stack.
Then, while the stack is not empty, pop a base and exponent off the stack. If the base is <= 'n', divide 'n' by the base and add the exponent to the answer. When the stack is empty, return the answer.

Clever idea, but I’m not sure about your timings. I found that the cost of handling the base/exponent stack outweighed the algorithmic improvement. It would take a very large n to actually see an improvement.

I went back and checked my timings, and I was correct about the timings for the inputs I had tried (123456789, and 987654321). However, I tried different input and my routine was slower. So I ran a test using lots of randomly selected numbers and bases. For most cases my algorithm was slower.

I failed to see why the original algorithm is of complexity O(n). It will take at most log(n) times to execute (quotient n b), right? Even if we assume that (quotient n b) is not of O(1), should we say the algorithm is of pseudo O(log(n))?

On the other hand, the loop alone in the solution that bisects the bracket range will take log(n) times in worst case, no?

You are correct that the original algorithm is O(log n). However, the revised algorithm is O(log log n) to bracket the range (it starts from b^hi = b and squares b^hi at each step) and O(log log n) to perform the bisection, for a total time of O(log log n). Here is a timing test showing the difference:

If there is a max int size (e.g. C/C++ etc.) there’s no need for binary search… Just check if the number is > wordsize/2, then wordsize/4, then wordsize/8, etc, for base 2 log, and increment a counter for the number of bits you check. (e.g. if a number is 2^16 (65536), add 16 to the log, divide the number by 65536, then check for > 256, etc.) and other logs can be scaled using the identity, log_b(x) = log2(x) / log2(b). (Though if all floating point is disallowed, you need to know what bases to use in advance to have suitable ratios ready.) E.g. (in gforth, 64 bit integers):

[…] by the code for calculating integer logarithms in the Scheme Standard Prelude. It is described in a Programming Praxis blog entry. They use the doubling technique to figure out which powers of 2 bound the logarithm, and then use […]

[…] by the code for calculating integer logarithms in the Scheme Standard Prelude. It is described in a Programming Praxis blog entry. They use the doubling technique to figure out which powers of 2 bound the logarithm, and then use […]