Number Factoids

The web sites numbersaplenty.com and numberworld.info dish up a smorgasbord of facts about every natural number from 1 to 999,999,999,999,999. Type in your favorite positive integer (provided it’s less than 1015) and you’ll get a list of prime factors, a list of divisors, the number’s representation in various bases, its square root, and lots more.

I first stumbled upon these sites (and several others like them) about a year ago. I revisited them recently while putting together the Carnival of Mathematics. I was looking for something cute to say about the new calendar year and was rewarded with the discovery that 2016 is a triangular number: 2016 bowling pins can be arranged in an equilateral triangle with 63 pins per side.

This incident set me to thinking: What does it take to build a web site like these? Clearly, the sites do not have 999,999,999,999,999 HTML files sitting on a disk drive waiting to be served up when a visitor arrives. Everything must be computed on the fly, in response to a query. And it all has to be done in milliseconds. The question that particularly intrigued me was how the programs recognize that a given number has certain properties or is a member of a certain class—a triangular number, a square, a Fibonacci, a factorial, and so on.

I thought the best way to satisfy my curiosity would be to build a toy number site of my own. Here it is:

This one works a little differently from the number sites I’ve found on the web. The computation is done not on my server but on your computer. When you type a number into the input field above, a JavaScript program running in your web browser computes the prime factors of the number and checks off various other properties. (The source code for this program is available on GitHub, and there’s also a standalone version of the Number Factoids calculator.)

Because the computation is being done by your computer, the performance depends on what hardware and software you bring to the task. Especially important is the JavaScript engine in your browser. As a benchmark, you might try entering the number 999,999,999,999,989, which is the largest prime less than 1015. The elapsed time for the computation will be shown at the bottom of the panel. On my laptop, current versions of Chrome, Firefox, Safari, and Opera give running times in the range of 150 to 200 milliseconds. (But an antique iPad takes almost 8 seconds.)

Most of that time is spent in factoring the integer (or attempting to factor it in the case of a prime). Factoring is reputed to be a hard problem, and so you might suppose it would make this whole project infeasible. But the factoring computation bogs down only with really big numbers—and a quadrillion just isn’t that big anymore. Even a crude trial-division algorithm can do the job. In the worst case we need to try dividing by all the odd numbers less than \(\sqrt{10^{15}}\). That means the inner loop runs about 16 million times—a mere blink of the eye.

Once we have the list of prime factors for a number N, other properties come along almost for free. Primality: We can tell whether or not N is prime just by looking at the length of the factor list. Square-freeness: N is square-free if no prime appears more than once in the list. Smoothness: N is said to be square-root smooth if the largest prime factor is no greater than \(\sqrt{N}\). (For example, \(12 = 2 \times 2 \times 3\) is square-root smooth, but \(20 = 2 \times 2 \times 5\) is not.)

The factor list could also be used to detect square numbers. N is a perfect square if every prime factor appears in the list an even number of times. But there are lots of other ways to detect squares that don’t require factorization. Indeed, running on a machine that has a built-in square-rooter, the JavaScript code for recognizing perfect squares can be as simple as this:

If you want to test this code in the Number Factoids calculator, you might start with 999,999,961,946,176, which is the largest perfect square less than \(10^{15}\).

Note that the isSquare function is a predicate: The return statement in the last line yields a boolean value, either true or false. The program might well be more useful if it could report not only that 121 is a square but also what it’s the square of. But the Number Factoids program is just a proof of concept, so I have stuck to yes-or-no questions.

Metafactoids: The Factoids calculator tests nine boolean properties. No number can possess all of these properties, but 1 gets seven green checkmarks. Can any other number equal this score? The sequence of numbers that exhibit none of the nine properties begins 20, 44, 52, 68, 76, 88, 92,… Are they all even numbers? (Hover for answer.)

What about detecting triangular numbers? N is triangular if it is the sum of all the integers from 1 through k for some integer k. For example, \(2016 = 1 + 2 + 3 + \dots + 63\). Given k, it’s easy enough to find the kth triangular number, but we want to work in the opposite direction: Given N, we want to find out if there is a corresponding k such that \(1 + 2 + 3 + \cdots + k = N\).

Young Carl Friedrich Gauss knew a shortcut for calculating the sums of consecutive integers: \(1 + 2 + 3 + \cdots + k = N = k\,(k+1)\,/\,2\). We need to invert this formula, solving for the value of k that yields a specified N (if there is one). Rearranging the equation gives \(k^2 + k - 2N = 0\), and then we can crank up the trusty old quadratic formula to get this solution:

\[k = \frac{-1 \pm \sqrt{1 + 8N}}{2}.\]

Thus k is an integer—and N is triangular—if and only if \(8N + 1\) is an odd perfect square. (Let’s ignore the negative root, and note that if \(8N + 1\) is a square at all, it will be an odd one.) Detecting perfect squares is a problem we’ve already solved, so the predicate for detecting triangular numbers takes this simple form:

function isTriangular(N) {
return isSquare(8 * N + 1);
}

Try testing it with 999,999,997,764,120, the largest triangular number less than \(10^{15}\).

Factorials are the multiplicative analogues of triangular numbers: If N is the kth factorial, then \(N = k! = 1 \times 2 \times 3 \times \cdots \times k\). Is there a multiplicative trick that generates factorials in the same way that Gauss’s shortcut generates triangulars? Well, there’s Stirling’s approximation:

\[k! \approx \sqrt{2 \pi k} \left( \frac{k}{e} \right)^k.\]

We might try to invert this formula to get a function of \(k!\) whose value is \(k\), but I don’t believe this is a promising avenue to explore. The reason is that Stirling’s formula is only an approximation. It predicts, for example, that 5! is equal to 118.02, whereas the true value is 120. Thus taking the output of the inverse function and rounding to the nearest integer would produce wrong answers. We could add correction terms to get a closer approximation—but surely there’s a better way.

One approach is to work with the gamma (\(\Gamma\)) function, which extends the concept of a factorial from the integers to the real and complex numbers; if \(n\) is an integer, then \(\Gamma(n+1) = n!\), but the \(\Gamma\) function also interpolates between the factorial values. A recent paper by Mitsuru Uchiyama gives an explicit, analytic, inverse of the gamma function, but I understand only fragments of the mathematics, and I don’t know how to implement it algorithmically.

Fifteen years ago David W. Cantrell came up with another inverse of the gamma function, although this one is only approximate. Cantrell’s version is much less intimidating, and it is based on one of my favorite mathematical gadgets, the Lambert W function. A Mathematica implementation of Cantrell’s idea works as advertised—when it is given the kth factorial number as input, it returns a real number very close to \(k+1\). However, the approximation is not good enough to distinguish true factorials from nearby numbers. Besides, JavaScript doesn’t come with a built-in Lambert W function, and I am loath to try writing my own.

On the whole, it seems better to retreat from all this higher mathematics and go back to the definition of the factorial as a product of successive integers. Then we can reliably detect factorials with a simple linear search, expressed in the following JavaScript function:

A factorial is built by repeated multiplication, so this algorithm takes it apart by repeated division. Initially, we set \(q = N\) and \(d = 2\). Then we replace \(q\) by \(q / d\), and \(d\) by \(d + 1\), while keeping track of the remainder \(r = q \bmod d\). If we can continue dividing until \(q\) is equal to 1, and the remainder of every division is 0, then N is a factorial. This is not a closed-form solution; it requires a loop. On the other hand, the largest factorial less than \(10^{15}\) is 17! = 355,687,428,096,000, so the program won’t be going around the loop more than 17 times.

The Fibonacci numbers are dear to the hearts of number nuts everywhere (including me). The sequence is defined by the recursion \(F_0 = 0, F_1 = 1, F_k = F_{k-1} + F_{k-2}\). How best to recognize these numbers? There is a remarkable closed-form formula, named for the French mathematician J. P. M. Binet:

I call it remarkable because, unlike Stirling’s approximation for factorials, this is an exact formula; if you give it an integer k and an exact value of \(\sqrt{5}\)), it returns the kth Fibonacci number as an integer.

One afternoon last week I engaged in a strenuous wrestling match with Binet’s formula, trying to turn it inside out and thereby create a function of N that returns k if and only if \(N\) is the kth Fibonacci number. With some help from Mathematica I got as far as the following expression, which gives the right answer some of the time:

Plugging in a few values of N yields the following table of values for \(k(N)\):

N

k(N)

1

2.000000000000001

2

3.209573979673092

3

4.0000000000000036

4

4.578618254581733

5

5.03325648737724

6

5.407157747499656

7

5.724476891770392

8

6.000000000000018

9

6.243411773788614

10

6.4613916654135615

11

6.658737112471047

12

6.8390081849422675

13

7.00491857188792

14

7.158583717787527

15

7.3016843734535035

16

7.435577905992959

17

7.561376165404197

18

7.680001269357004

19

7.792226410280063

20

7.8987062604216005

21

7.999999999999939

In each of the green rows, the function correctly recognizes a Fibonacci number \(F_k\), returning the value of k as an integer. (Or almost an integer; the value would be exact if we could calculate exact square roots and logarithms.) Specifically, 1 is the second Fibonacci number (though also the first), 3 is the fourth, 8 is the sixth, and 21 is the eighth Fibonacci number. So far so good. But there’s something weird going on with the other Fibonacci numbers in the table, namely those with odd-numbered indices (red rows). For N = 2, 5, and 13, the inverse Binet function returns numbers that are close to the correct k values (3, 5, 7), but not quite close enough. What’s that about?

If I had persisted in my wrestling match, would I have ultimately prevailed? I’ll never know, because in this era of Google and MathOverflow and StackExchange, a spoiler lurks around every cybercorner. Before I could make any further progress, I stumbled upon pointers to the work of Ira Gessel of Brandeis, who neatly settled the matter of recognizing Fibonacci numbers more than 40 years ago, when he was an undergraduate at Harvard. Gessel showed that N is a Fibonacci number iff either \(5N^2 + 4\) or \(5N^2 - 4\) is a perfect square. Gessel introduced this short and sweet criterion and proved its correctness in a problem published in The Fibonacci Quarterly (1972, Vol. 10, No. 6, pp. 417–419). Phillip James, in a 2009 paper, presents the proof in a way I find somewhat easier to follow.

It is not a coincidence that the expression \(5N^2 + 4\) appears in both Gessel’s formula and in my attempt to construct an inverse Binet function. Furthermore, substituting Gessel’s \(5N^2 - 4\) into the inverse function (with a few other sign adjustments) yields correct results for the odd-indexed Fibonacci numbers. Implementing the Gessel test in JavaScript is a cinch:

So that takes care of the Fibonacci numbers, right? Alas, no. Although Gessel’s criterion is mathematically unassailable, it fails computationally. The problem arises from the squaring of \(N\). If \(N\) is in the neighborhood of \(10^{15}\), then \(N^2\) is near \(10^{30}\), which is roughly \(2^{100}\). JavaScript does all of its arithmetic with 64-bit double-precision floating-point numbers, which allow 53 bits for representing the mantissa, or significand. With values above \(2^{53}\), not all integers can be represented exactly—there are gaps between them. In this range the mapping between \(N\) and \(N^2\) is no longer a bijection (one-to-one in both directions), and the gessel procedure returns many errors.

I had one more hope of coming up with a closed-form Fibonacci recognizer. In the Binet formula, the term \(((1 - \sqrt{5})\,/\,2)^k\) becomes very small in magnitude as k grows large. By neglecting that term we get a simpler formula that still yields a good approximation to Fibonacci numbers:

For any integer k, the value returned by that expression is within 0.5 of the Fibonacci number \(F_k\), and so simple rounding is guaranteed to yield the correct answer. But the inverse function is not so well-behaved. Although it has no \(N^2\) term that would overflow the 64-bit format, it relies on square-root and logarithm operations whose limited precision can still introduce errors.

So how does the Factoids calculator detect Fibonacci numbers? The old-fashioned way. It starts with 0 and 1 and iterates through the sequence of additions, stopping as soon as N is reached or exceeded:

As with the factorials, this is not a closed-form solution, and its computational complexity scales in linear proportion to N rather than being constant regardless of N. There are tricks for speeding it up to \(\log N\); Edsger Dijkstra described one such approach. But optimization hardly seems worth the bother. For N < \(10^{15}\), the while loop cannot be executed more than 72 times.

I’ve included two more sequences in the Factoids calculator, just because I’m especially fond of them. The Catalan numbers (1, 1, 2, 5, 14, 42, 132, 429…) are famously useful for counting all sorts of things—ways of triangulating a polygon, paths through the Manhattan street grid, sequences of properly nested parentheses. The usual definition is in terms of binomial coefficients or factorials:

\[C_k = \frac{1}{k+1} \binom{2k}{k} = \frac{(2k)!}{(k+1)! k!}\]

But there is also a recurrence relation:

\[C_0 = 1,\qquad C_k = \frac{4k-2}{k+1} C_{k-1}\]

The recognizer function in the Factoids Calculator does a bottom-up iteration based on the recurrence relation:

The surprise here is that the elements of the sequence are all integers, beginning 1, 1, 1, 1, 2, 3, 7, 23, 59, 314, 1529, 8209. In writing a recognizer for these numbers I have made no attempt to be clever; I simply generate the sequence from the beginning and check for equality with the given N:

But there’s a problem with this code. Do you see it? As N approaches the \(10^{15}\) barrier, the subexpression S[3] * S[1] + S[2] * S[2] will surely break through that barrier. In fact, the version of the procedure shown above fails for 32,606,721,084,786, the largest Somos-4 number below \(10^{15}\). For the version of the program that’s actually running in the Factoids calculator I have repaired this flaw by rearranging the sequence of operations. (For details see the GitHub repository.)

The factorial, Fibonacci, Catalan, and Somos sequences all exhibit exponential growth, which means they are sprinkled very sparsely along the number line. That’s why a simple linear search algorithm—which just keeps going until it reaches or exceeds the target—can be so effective. For the same reason, it would be easy to precompute all of these numbers up to \(10^{15}\) and have the JavaScript program do a table lookup. I have ruled out this strategy for a simple reason: It’s no fun. It’s not sporting. I want to do real computing, not just consult a table.

Other number series, such as the square and triangular numbers, are more densely distributed. There are more than 30 million square and triangular numbers up to \(10^{15}\); downloading a table of that size would take longer than recomputing quite a few squares and triangulars. And then there are the primes—all 29,844,570,422,669 of them.

What would happen if we broke out of the 64-bit sandbox and offered to supply factoids about larger numbers? A next step might be a Megafactoids calculator that doubles the digit count, accepting integers up to \(10^{30}\). Computations in this system would require multiple-precision arithmetic, capable of handling numbers with at least 128 bits. Some programming languages offer built-in support for numbers of arbitrary size, and libraries can add that capability to other languages, including JavaScript. Although there is a substantial speed penalty for extended precision, most of the algorithms running in the Factoids program would still give correct results in acceptable time. In particular, there would no problem recognizing squares and triangulars, factorials, Fibonaccis, Catalans and Somos-4 numbers.

The one real problem area in a 30-digit factoid calculator is factoring. Trial division would be useless; instead of milliseconds, the worst-case running time would be months or years. However, much stronger factoring algorithms have been devised in the past 40 years. The algorithm that would be most suitable for this purpose is called the elliptic curve method, invented by Hendrik Lenstra in the 1980s. An implementation of this method built into PARI/GP, which in turn is built into Sage, can factor 30-digit numbers in about 20 milliseconds. A JavaScript implementation of the elliptic curve method seems quite doable. Whether it’s worth doing is another question. The world is not exactly clamoring for more and better factoids.

It’s a nice undergraduate exercise to show that recognizing Fibonacci numbers is in P: that is, there is an algorithm that tells if N is a Fibonacci number, whose running time is polynomial in the number of digits of N.

“A nice undergraduate exercise” is a phrase that makes some of us quake in our sandals. But in this case I think I might be able to manage (perhaps with a little help from Moore & Mertens).

First, though, the ground rules. May I assume that arithmetic with integers of size O(N) is a primitive (constant-time) operation? How about approximating irrational numbers to O(N) digits of precision?

A way to recognize Fib numbers in log(N) time is to use matrix multiplication (it’s also a nice way of deriving the closed form formula). If you take the vector

F(n-2)
F(n-1)

and multiply it by

0 1
1 1

you get

F(n - 1)
F(n - 2) + F(n - 1)

aka

F(n-1)
F(n)

So if you start with (1, 1) and repeatedly multiply by that matrix (call it M) you generate the F(n)s.

To me, this is the best “explanation” of the closed form - getting it via generating functions or somesuch always feels like magic. To get it using matrices, you find the matrices S and T such that S T S^-1 = M and T is diagonal. Then M^n = S T^n S^-1. You’ll find that T has 1+sqrt(5)/2 and 1-sqrt(5)/2 on the diagonal and multiplying things out you get Binet’s formula.

Anyway to find if X is F(N) for some N, you can keep squaring M (a cheap operation) to get M, M^2, M^4, M^8, …, Eventually M will get too big, so let’s say M^(2^k) was the last one that wasn’t too big. You can multiply in the previous powers to narrow in on N. You’re essentially filling in the binary digits on N from the most significant to the least. When you multiply by M^(2^j) if it stays under then the jth digit is 1 if it goes over then it’s 0. Eventually you find the F(n) that is <= X.

You can also do something similar directly from Binet's formula by writing some code to handle multiplying expressions of the form a+b.sqrt(5). If you play with that, you'll notice that 2*((1+sqrt(5))/2)^n has F(n) as the coefficient of sqrt(5) (the unit coefficients are the Lucas numbers, similar to the Fibonacci numbers but they start 2, 1, 3, 4, 7, …) . You can do the same repeated squaring and using the various 2^kth powers to figure out the F(n) that's <= X.

Does your program really use the isSquare test you give in the text ? I believe this would yield wrong results, unless you replace “floor” by “round”. (For large numbers the square root is likely to give a number ending in .99999999 or the like, and floor yields the next lower integer.) OTOH, the isFactorial would work using the same spirit : even if Stirling’s formula was by 10% off, don’t forget that the next factorial is away by a factor (of 5, 6, 7…). So, rounding the approximate value and then calculating the factorial (which is immediate) would well confirm whether the number is a factorial or not.

I’ve been waiting for someone to ask this question! I was skeptical too when I first wrote the isSquare function. But in fact it seems to be safe.

JavaScript’s sqrt function is supposed to adhere to the IEEE specification for floating-point arithmetic. That specification requires that the calculated square root of any finite representable number can never be less than the floor of the correct root. (My authority for this statement is The Handbook of Floating Point Arithmetic, by Jean-Michel Muller et al., Birkhäuser, 2010, pp. 265–267.) I have checked that the statement is true for all \(N^2\) with \(N \in \{1, 10^9\}\).

PS: Triangular numbers are very easy to check for (via n=k(k+1)/2 k=floor(sqrt(2n)); see OEIS for concrete code), and for Fibonacci numbers it is the same principle, via the approximation you give for F(k) which allows to get a corresponding k — maybe a little too small ; then you use binary exponentiation to compute exactly F(k), F(k+1) and if needed, the few successors until N is passed. (Exact computation of F(k) as power of the 2×2 matrix given earlier, of special type allowing to compute the squares more efficiently).