
The primary algorithm used below to calculate the integer square root of a number n is a modified form of Newton's method for approximating roots (other correct algorithms may of course be used as well).

Declare initial approximation

Calculate next approximation

Continue calculation until desired accuracy is achieved

Compensate for error

An error may occur when the square of the final approximation is greater than n. In that case, subtract one from the approximation until its square is less than or equal to n.
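The four steps above can be sketched in Python as follows. This is only an illustration, not the original listing: the function name isqrt_newton, the starting value x = n, and the stopping rule (stop as soon as the approximation no longer decreases) are assumptions made for this example.

```python
def isqrt_newton(n):
    """Integer square root of n using an integer form of Newton's method."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n                    # 0 and 1 are their own integer square roots

    x = n                           # declare initial approximation (assumed: x0 = n)
    while True:
        y = (x + n // x) // 2       # calculate next approximation
        if y >= x:                  # continue until the value stops decreasing
            break
        x = y

    # Compensate for error: with the stopping rule above this loop should not
    # normally run; it is kept to mirror the compensation step in the text.
    while x * x > n:
        x -= 1
    return x


print(isqrt_newton(17))
```

Starting from n is simple but slow for large inputs, since each early iteration only roughly halves the approximation.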

For an explanation of how the previous and following algorithms work, see Integer Square Roots by Jack W. Crenshaw (http://www.embedded.com/98/9802fe2.htm), specifically Figure 2 of that article.

Possibly the fastest integer square root algorithm in C so far is below:

However, this is completely processor-dependent and possibly compiler-dependent as well. Another very fast algorithm, contributed by Tristan Muntsinger (Tristan.Muntsinger@gmail.com), is below. It is best to test each one across the expected range of inputs to find the quickest one for your specific application.

The above does not work on signed integers larger than the initial value of "place", because "root += place * 2" will result in a negative value. That is, if the input is signed it must be less than 1/2 the maximum value of that type of integer.
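For readers who want to experiment with the "place"/"root" technique, here is a minimal Python sketch of a bit-by-bit integer square root of that kind. It is not the contributed C code; the function name isqrt_bitwise and the width parameter are hypothetical, and width is assumed to be even. Python integers never overflow, so the signed-integer problem described above cannot occur here; the comment marks the step where it would occur in C.

```python
def isqrt_bitwise(num, width=32):
    """Bit-by-bit integer square root for 0 <= num < 2**width (width even)."""
    if num < 0 or num >= 1 << width:
        raise ValueError("num out of range for the given width")

    place = 1 << (width - 2)        # highest power of four that fits in width bits
    while place > num:              # skip powers of four larger than the input
        place >>= 2

    remainder = num
    root = 0
    while place:
        if remainder >= root + place:
            remainder -= root + place
            # In C with a signed 32-bit type, this addition can reach 2**31
            # when num >= 2**30, which is the overflow described in the text.
            root += place * 2
        root >>= 1
        place >>= 2
    return root


print(isqrt_bitwise(15), isqrt_bitwise(16))
```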

Note that it is almost certainly faster to convert to floating point, use the built-in floating-point sqrt, and then convert back to int than to use any of the above functions. Beware, however, of loss of precision with large numbers. Using the GNU C libraries (libc6 v.2.12.1), the following is roughly 6.8x faster than the above function and returns the correct result for all numbers less than (1 << 24):

Using the same libraries, integers up to (1 << 30) work if the double-precision sqrt is used instead, but this is only about 4x faster than the above algorithm. These cutoffs and timings are likely to vary depending on your CPU and OS, so be sure to test the output for all integers in the range that your program might use. But even with error-correcting code added, the built-in sqrt is significantly faster than any customized integer square root:
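The idea of adding error-correcting code around a floating-point sqrt can be illustrated with a short Python sketch. This is not the listing the text refers to; the name isqrt_float is hypothetical, and Python's math.sqrt works in double precision, so the sketch is limited to inputs that can be converted to a double.

```python
import math


def isqrt_float(n):
    """Integer square root via floating-point sqrt plus integer correction."""
    if n < 0:
        raise ValueError("n must be non-negative")
    r = int(math.sqrt(n))           # fast estimate, possibly off for large n
    while r * r > n:                # estimate too high: step down
        r -= 1
    while (r + 1) * (r + 1) <= n:   # estimate too low: step up
        r += 1
    return r


print(isqrt_float(99), isqrt_float(100))
```

The two loops only use exact integer arithmetic, so the final result does not depend on how the floating-point estimate was rounded.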

The int version simply casts the result of StrictMath.sqrt to an int, giving us full hardware speed.

The long version uses a trick by Programmer Olathe to get exact results from StrictMath.sqrt even though doubles have only 53 bits of precision (52 of them stored explicitly): after truncation to an integer, the result of StrictMath.sqrt is guaranteed to be either the exact result or the exact result plus one. This gets us very near full hardware speed.

A long has 64 bits, but casting to double keeps only the most significant 53 (52 stored plus one implicit), so up to 11 low-order bits can be lost; the argument below conservatively allows for 12.

If the input uses more than 52 bits, the distance between adjacent perfect squares is at least $(2^{26})^2 - (2^{26} - 1)^2 = 2^{27} - 1$. So, even if you lose the maximum of 12 bits, the distance between perfect squares is much greater than a 12-bit number, and so there cannot be more than one perfect square per "zone" of longs that cast to the same double.

This leaves two possibilities.

There are no perfect squares in the zone, in which case the integer part of sqrt will correspond to the next lowest perfect square, which is great.

There is only one perfect square in the zone, in which case the integer part of sqrt will correspond to that perfect square, and that's either great or, if our input is less than that perfect square, we simply need to subtract one.
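The two cases above lead to a single conditional correction. The following Python sketch mirrors that reasoning for values in the range of a signed 64-bit long; it is not the Java code, the name isqrt_long is hypothetical, and it assumes that math.sqrt is correctly rounded, as it is on typical IEEE 754 platforms.

```python
import math


def isqrt_long(n):
    """Integer square root for 0 <= n < 2**63 using one double-precision sqrt."""
    if n < 0 or n >= 1 << 63:
        raise ValueError("n must fit in a non-negative signed 64-bit integer")
    r = int(math.sqrt(n))   # n is rounded to the nearest double before sqrt
    if r * r > n:           # the perfect square in the zone lies above n
        r -= 1
    return r


print(isqrt_long(2**62), isqrt_long(2**62 - 1))
```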

The BigInteger version uses the long version to quickly get the square root of the most significant 64 bits, and then it fills in the remaining bits of the result from most to least significant, checking whether putting a 1 in that place would exceed the input and using 0 instead if that's the case. It is designed to minimize BigInteger object creation, creating fewer than two intermediate BigIntegers per output bit.
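The bit-filling strategy can be sketched in Python, whose integers are already arbitrary precision. This is an illustration under assumptions, not a port of the Java code: the names isqrt_small and isqrt_big are hypothetical, the sketch takes up to 62 leading bits rather than 64 so that the floating-point helper stays within a safe range, and it makes no attempt to minimize object creation.

```python
import math


def isqrt_small(n):
    """Exact integer square root for n below 2**62, via float sqrt plus correction."""
    r = int(math.sqrt(n))
    while r * r > n:
        r -= 1
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_big(n):
    """Integer square root of an arbitrarily large non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    bits = n.bit_length()
    if bits <= 62:
        return isqrt_small(n)

    shift = bits - 62
    shift += shift & 1                  # keep the shift even so it halves cleanly
    half = shift // 2

    # The square root of the leading bits gives the leading bits of the result.
    root = isqrt_small(n >> shift) << half

    # Fill in the remaining bits from most to least significant.
    for bit in range(half - 1, -1, -1):
        candidate = root | (1 << bit)
        if candidate * candidate <= n:  # a 1 here does not exceed the input
            root = candidate
    return root


print(isqrt_big(10**40))
```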

This function uses int.bit_length() (available since Python 2.7 and 3.1) to get a better first approximation. It's much faster for large arguments. If bit_length isn't available, you can emulate it with bin(): for a positive integer n, len(bin(n)) - 2 gives the same value.
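As an illustration of how bit_length() can supply the first approximation, here is a minimal sketch. It is not necessarily the function described above; the names isqrt_bits and bit_length_via_bin are hypothetical, and the choice of starting value (the smallest power of two known to be at least the square root) is an assumption.

```python
def bit_length_via_bin(n):
    """Fallback for n.bit_length() on a positive integer, using bin()."""
    return len(bin(n)) - 2              # bin(n) looks like '0b1011'


def isqrt_bits(n):
    """Integer square root using Newton's method with a bit_length-based start."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0
    # n < 2**bits, so 2**ceil(bits / 2) is strictly greater than sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:                      # no further decrease: x is the answer
            return x
        x = y


print(isqrt_bits(10**30))
```

Because the starting value is within a factor of two or so of the true root, the number of iterations grows only slowly with the size of n.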