Computing Integer Roots

January 10, 2016

1. The algorithm

Today I’m going to talk about the generalization of
the integer square root algorithm to
higher roots. That is, given \(n\) and \(p\), computing
\(\mathrm{iroot}(n, p) = \lfloor \sqrt[p]{n} \rfloor\), or the
greatest integer whose \(p\)th power is less than or equal to
\(n\). The generalized algorithm is straightforward, and it’s
easy to generalize the proof of correctness, but the run-time bound is
a bit trickier, since it has a dependence on \(p\).
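
The pseudocode of \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\)
isn't reproduced here, but the proofs below refer to its shape: start
with a guess known to be at least \(\sqrt[p]{n}\), repeatedly apply
the integer Newton map \(f(x) = \lfloor ((p-1)x + \lfloor n/x^{p-1}
\rfloor)/p \rfloor\), and stop as soon as the iterates stop
decreasing. Here is a minimal Python sketch along those lines; the
small-input handling and the power-of-two initial guess are my
choices, not necessarily the exact steps of the pseudocode:

    def iroot(n, p):
        """Sketch of Newton-Iroot: floor(n ** (1/p)) for n >= 0, p >= 1."""
        if n == 0:
            return 0
        if p >= n.bit_length():
            return 1  # here 2**p > n, so the root must be 1
        # Initial guess: a power of two that is at least n**(1/p), so the
        # invariant x_i >= s = iroot(n, p) holds from the start.
        x = 1 << -(-n.bit_length() // p)
        while True:
            # One integer Newton step:
            # f(x) = floor(((p - 1)*x + floor(n / x**(p - 1))) / p)
            y = ((p - 1) * x + n // x ** (p - 1)) // p
            if y >= x:
                return x  # the iterates stopped decreasing, so x = s
            x = y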

2. Proof of correctness

First, partial correctness: if
\(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) terminates, it
returns \(s = \mathrm{iroot}(n, p)\).

Proof. Assume it terminates. If it
terminates in step \(1\) or \(2\), then we are done. Otherwise, it can
only terminate in step \(4.2\) where it returns \(x_i\) such that
\(x_{i+1} = f(x_i) \ge x_i\). This implies \(g(x_i) = ((p-1)x_i +
n/x_i^{p-1}) / p \ge x_i\). Rearranging yields \(n \ge x_i^p\) and
combining with our invariant we get \(\sqrt[p]{n} \ge x_i \ge s\). But
\(s + 1 \gt \sqrt[p]{n}\), so that forces \(x_i\) to be \(s\), and
thus \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) returns \(s\)
if it terminates. ∎

Next, we show that \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\)
always terminates.

Proof. Assume it doesn’t terminate. Then, since the loop continues
only while \(x_{i+1} \lt x_i\), we have a strictly decreasing infinite sequence of
integers \(\{ x_0, x_1, \ldots \}\). But this sequence is bounded below
by \(s\), so it cannot decrease indefinitely. This is a contradiction,
so \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) must
terminate. ∎
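
As a quick sanity check of both results together, we can test the
sketch above against the defining inequality of \(\mathrm{iroot}\) on
random inputs:

    import random

    # Spot-check: the returned s must satisfy s**p <= n < (s + 1)**p.
    for _ in range(1000):
        n = random.randrange(1, 10 ** 30)
        p = random.randrange(1, 40)
        s = iroot(n, p)
        assert s ** p <= n < (s + 1) ** p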

Note that, like \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\),
the check in step \(4.2\) cannot be weakened to \(x_{i+1} = x_i\), as
doing so would cause the algorithm to oscillate. In fact, as \(p\)
grows, so does the number of values of \(n\) that exhibit this behavior,
and so does the number of possible oscillations. For example, \(n =
972\) with \(p = 3\) would yield the sequence \(\{ 16, 11, 10, 9, 10,
9, \ldots \}\), and \(n = 80\) with \(p = 4\) would yield the sequence
\(\{ 4, 3, 2, 4, 3, 2, \ldots \}\).
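
To watch these oscillations happen, we can iterate the integer Newton
map directly (a small sketch reusing the map from the code above):

    def newton_iterates(n, p, x0, count):
        """Apply the integer Newton map 'count' times, recording each value."""
        xs = [x0]
        for _ in range(count):
            x = xs[-1]
            xs.append(((p - 1) * x + n // x ** (p - 1)) // p)
        return xs

    print(newton_iterates(972, 3, 16, 7))  # [16, 11, 10, 9, 10, 9, 10, 9]
    print(newton_iterates(80, 4, 4, 6))    # [4, 3, 2, 4, 3, 2, 4]

A check of \(x_{i+1} = x_i\) never fires on either sequence, so the
weakened algorithm would loop forever.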

3. Run-time

We will show that \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\)
takes \(\Theta(p) + O(\lg \lg n)\) loop iterations. Then we will
analyze a single loop iteration and the arithmetic operations used to
get a total run-time bound.
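
The intermediate lemma proved next isn't reproduced here, but its
quantities can be reconstructed from the formulas the proof uses
(this reading is my assumption): measure the relative error of
\(x_i\) by \(\epsilon_i = x_i^p/n - 1\). Then the exact Newton map
\(g\) sends \(\epsilon_i\) to \(f(\epsilon_i)\) (with the floors,
\(\epsilon_{i+1} \le f(\epsilon_i)\)), where \[ f(\epsilon) =
\frac{\left(1 + \left(1 - \frac{1}{p}\right)
\epsilon\right)^p}{(1 + \epsilon)^{p-1}} - 1, \qquad f(0) = f'(0) =
0, \qquad f''(0) = 1 - \frac{1}{p} = q \lt 1. \] This is consistent
with both the \(\frac{1}{2} q \epsilon^2\) term below and the value
of \(f(1)\) used later.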

Proof. Taylor-expand \(f(\epsilon)\)
around \(0\) with the Lagrange form of the remainder to get
\[ f(\epsilon) = f(0) + f'(0) \epsilon +
\frac{f''(0)}{2} \epsilon^2 + \frac{f'''(\xi)}{6} \epsilon^3 \] for
some \(\xi\) such that \(0 \lt \xi \lt \epsilon\). Plugging in
values, we see that \(f(\epsilon) = \frac{1}{2} q \epsilon^2 +
\frac{1}{6} f'''(\xi) \epsilon^3\) with the last term being negative,
so \(f(\epsilon) \lt \frac{1}{2} q \epsilon^2 \lt \frac{1}{2}
\epsilon^2\). ∎

But this is only a useful upper bound when \(\epsilon_i \le 1\). In
the square root case this was okay, since \(\epsilon_1 \le 1\), but
that is not true for larger values of \(p\). In fact, in general, the
\(\epsilon_i\) start off shrinking linearly:

But \(f(1) = (2 - 1/p)^p/2^{p-1} - 1 = 2 \left(1 -
\frac{1}{2p}\right)^p - 1\), which is always increasing as a function
of \(p\), as you can see by calculating its derivative. Therefore, its
minimum is at \(p = 2\), which is \(1/8\), and so \(f(\epsilon) \gt
f(1) \epsilon \ge \epsilon/8\). ∎

Therefore, the total number of loop iterations is \(\Theta(p) +
O(\lg \lg n)\). ∎
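
The two phases are easy to see numerically. Here is a small sketch,
again assuming the error measure \(\epsilon_i = x_i^p/n - 1\) from
above; for an input that makes the initial guess maximally bad, the
errors first shrink by a roughly constant factor per step, then
collapse quadratically once \(\epsilon_i \le 1\):

    from fractions import Fraction

    def error_sequence(n, p):
        """Record eps_i = x_i**p / n - 1 along the Newton iteration."""
        x = 1 << -(-n.bit_length() // p)
        eps = [float(Fraction(x ** p, n) - 1)]
        while True:
            y = ((p - 1) * x + n // x ** (p - 1)) // p
            if y >= x:
                return eps
            x = y
            eps.append(float(Fraction(x ** p, n) - 1))

    # n = 2**200 with p = 20 gives eps_0 of about 2**20.
    print(error_sequence(2 ** 200, 20))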

Note that \(p \le \lg n\), so we could just say that
\(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) performs
\(O(\lg n)\) loop iterations. But that obscures rather than
simplifies. Note that the proof above is very similar to the proof of
the weaker run-time bound of \(\mathrm{N{\small EWTON}\text{-}I{\small
SQRT}'}\), where the initial guess varies. In this case, the error in
our initial guess is magnified, since we raise it to the \((p-1)\)th
power, and that magnification manifests as the \(\Theta(p)\) term.

Furthermore, unlike the square root case, the number of arithmetic
operations in a loop iteration isn’t constant. In particular,
the sub-step to compute \(x_i^{p-1}\) takes a number of arithmetic
operations dependent on \(p - 1\). Using repeated squaring, this
computation takes \(\Theta(\lg p)\) squarings and
\(O(\lg p)\) additional multiplications.
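
For concreteness, a generic square-and-multiply sketch (not specific
to this algorithm):

    def power(x, e):
        """Compute x**e by repeated squaring."""
        result = 1
        while e > 0:
            if e & 1:
                result *= x  # one multiplication per set bit of e
            x *= x           # one squaring per bit of e
            e >>= 1
        return result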

If the cost of an arithmetic operation is constant, e.g.,
we’re working with fixed-size integers, then the run-time bound
is the iteration count above multiplied by \(\Theta(\lg p)\).

Otherwise, if the cost of an arithmetic operation depends on the
length of its arguments, then we only have to multiply by a constant
factor to get the run-time bound in terms of arithmetic
operations. If the cost of multiplying two numbers \(\le x\) is \(M(x)
= O(\lg^k x)\), then the cost of computing \(x^p\) is \(O((p \lg
x)^k)\). But \(x\) is \(\Theta(n^{1/p})\), so the cost of computing
\(x^p\) is \(O(\lg^k n)\), which is on the order of the cost of
multiplying two numbers \(\le n\). Furthermore, note that we divide
the result into \(n\), so we can stop as soon as the computation of
\(x_i^{p-1}\) exceeds \(n\), since then
\(\lfloor n/x_i^{p-1} \rfloor = 0\). So in that case, we can treat a loop
iteration as if it were performing a constant number of arithmetic
operations on numbers of order \(n\), and so, like in the square root
case, we pick up a factor of \(D(n)\), where \(D(n)\) is the run-time
of dividing \(n\) by some number \(\le n\).
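
A sketch of that early exit, with the cap check added to the
square-and-multiply loop above (the None return signaling
\(\lfloor n/x^{p-1} \rfloor = 0\) is my convention):

    def power_capped(x, e, cap):
        """Compute x**e by repeated squaring, returning None as soon as
        the result is guaranteed to exceed cap (here, cap = n)."""
        result = 1
        while True:
            if e & 1:
                result *= x
                if result > cap:
                    return None
            e >>= 1
            if e == 0:
                return result
            x *= x
            # A set bit of e remains, so x (or a later square of it)
            # will be multiplied in; once x > cap, the result must
            # exceed cap too.
            if x > cap:
                return None

In the Newton step, a None from power_capped(x, p - 1, n) means the
quotient \(\lfloor n/x^{p-1} \rfloor\) is \(0\), so every number the
step touches stays of order \(n\).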