We return to ITA
Software’s “word numbers” problem:

If the integers from 1 to 999,999,999 are written as words,
sorted alphabetically, and concatenated, what is the 51 billionth
letter?

In the first part, we came up
with a problem statement. We defined a term ten9 of
type (Seminearring a, Character a) => a; that is, a
value of any type where we know how to add, how to multiply, and what to
do with characters. For instance, ten9 could take
values in [String]:
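Concretely, the specification sorts the words, concatenates them, and indexes into the result. A toy sketch, using a five-word stand-in for ten9 (an assumption for illustration only; the real ten9 comes from the grammar in part 1):

```haskell
import Data.List (sort)

-- Stand-in for ten9: just the first five number words, for illustration.
ten9 :: [String]
ten9 = ["one", "two", "three", "four", "five"]

-- The specification: sort the words, concatenate, and index (1-based).
letterAt :: Int -> Char
letterAt n = concat (sort ten9) !! (n - 1)

main :: IO ()
main = print (letterAt 5)  -- prints 'f': "fivefouronethreetwo" !! 4
```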

As you can see, though, this specification has no chance of
working: the indexing function (!!) takes an
Int as its second argument, which on my computer
cannot represent numbers bigger than 2³¹, only
around 2 billion. The limitation is merely practical:
you are unlikely to be able to work with lists
that are so colossally big, and indeed the computation above would
take far too long to complete.

In this post we will start to solve this problem efficiently,
beginning by counting the total length of all the strings in
ten9. The key idea is to convert a list of
strings to a polynomial: each character gets mapped to the
variable x, a string of length n gets mapped to
xⁿ, and a list of strings gets mapped to a sum.
Thus instead of a list starting

["one","two","three","four","five",...]

we get a polynomial whose first few terms are

x³ + x³ + x⁵ + x⁴ + x⁴ + …

If we evaluate this polynomial at x = 1, we get the total
number of terms (that is, the number of strings), which is not very
interesting. If we instead take the derivative first, we get

3x² + 3x² + 5x⁴ + 4x³ + 4x³ + …

Now evaluation at x = 1 gives

3 + 3 + 5 + 4 + 4 + …

which is the total length of all these strings, as desired.

To implement this efficiently, we use the idea of automatic
differentiation: the value and first derivative of
a function around x = 1 together form a seminearring, supporting
addition and multiplication. (We could include subtraction and
division if we wanted, but we do not need them here.) Conceptually,
it is best to think of this as keeping the first two terms of the
Taylor series expansion, as we will explain more below.
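As a standalone sketch of the idea (a toy, not the definitions used in this post), dual numbers carry a value and a first derivative through addition and multiplication; evaluating our example polynomial at x = 1 then yields the count and the total length at once:

```haskell
-- A value paired with its first derivative: the first two terms of
-- the Taylor expansion around the evaluation point.
data Dual = Dual Integer Integer deriving (Eq, Show)

instance Num Dual where
  Dual f f' + Dual g g' = Dual (f + g) (f' + g')
  -- Leibniz rule: (f g)' = f g' + f' g
  Dual f f' * Dual g g' = Dual (f * g) (f * g' + f' * g)
  fromInteger n = Dual n 0
  negate (Dual f f') = Dual (negate f) (negate f')
  abs    = error "not needed"
  signum = error "not needed"

-- The variable x, evaluated at 1, with derivative 1.
x :: Dual
x = Dual 1 1

main :: IO ()
main = print (x^3 + x^3 + x^5 + x^4 + x^4)  -- Dual 5 19: 5 strings, 19 letters
```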

We now turn to the implementation. As before, you can download
this post as a program. First we import what we did
previously.

We convert every character into the number 1. This corresponds
to the evaluation at x = 1 as explained above.

instance (Num a) => Character (Nat a) where
  char _ = one

We check our work by counting the total number of terminals in
the grammar for ten9.

*WordNumbers2> ten9 :: Count
999999999

Note that we get 10⁹ − 1 since ‘zero’ is not produced
by the grammar.

To compute the total count of all letters, we need a kind of
derivative, as mentioned above. In general, we want to keep track
of the value of a function and its first derivative. If we are
working with several variables (as we will later), the first
derivative becomes a vector of partial derivatives. In general, the
derivatives need to form a module over the scalars (the base
seminearring), with addition (a monoid structure) and
multiplication by scalars.
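A minimal sketch of that structure, assuming the derivative part forms a bimodule over the scalars (the class Bimodule and the operators (.*) and (*.) are names invented here for illustration, not the post’s definitions):

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

-- Sketch: b is a bimodule over the scalars a, with left and right
-- scalar actions.
class Bimodule a b where
  (.*) :: a -> b -> b   -- left scalar action
  (*.) :: b -> a -> b   -- right scalar action

-- A value in the scalars paired with a derivative in the module.
data Deriv a b = Deriv a b deriving (Eq, Show)

addD :: (Num a, Num b) => Deriv a b -> Deriv a b -> Deriv a b
addD (Deriv f f') (Deriv g g') = Deriv (f + g) (f' + g')

-- Leibniz rule: (f g)' = f g' + f' g
mulD :: (Num a, Num b, Bimodule a b) => Deriv a b -> Deriv a b -> Deriv a b
mulD (Deriv f f') (Deriv g g') = Deriv (f * g) ((f .* g') + (f' *. g))

-- Trivial instance for testing: both components are Integers.
instance Bimodule Integer Integer where
  (.*) = (*)
  (*.) = (*)

main :: IO ()
main = print (mulD (Deriv 2 3) (Deriv 5 7))  -- Deriv 10 29
```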

In our case, the derivative is measured in numbers of characters, which
we call Volume.

data V
type Volume = Wrap V (Nat Integer)

Every character has length one:

instance Character Volume where char _ = one

We could change this definition to count, for instance, the
total number of pen strokes used to write all these letters.

Now we can quickly count the total number of characters in the
English words for the numbers from one to 999,999,999.

*WordNumbers2> ten9 :: Deriv Count Volume
Deriv 999999999 70305000000

For a sanity check, we also compute the counts up to a million…

*WordNumbers2> ten6 :: Deriv Count Volume
Deriv 999999 44872000

… which agrees with our previous result, but completes much
more quickly.

It’s worth thinking about how this works. In computing

*WordNumbers2> ten2 :: Deriv Count Volume
Deriv 99 854

for instance, the total length of the strings from
one to nine is needed nine times (once
standalone and once after each of the eight tens prefixes), but we
don’t want to recompute it that many times. Instead we count how
many times the strings will be used and multiply, using Leibniz’s
rule.
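To see the Leibniz rule pay off, here is a standalone arithmetic check of the ten2 result, with a toy stand-in for Deriv Count Volume; the letter totals 36, 70, and 46 are counted by hand from one–nine, ten–nineteen, and the tens prefixes twenty–ninety:

```haskell
-- A toy stand-in for Deriv Count Volume: a word count paired with a
-- total letter count.
data CV = CV Integer Integer deriving (Eq, Show)

instance Num CV where
  CV f f' + CV g g' = CV (f + g) (f' + g')
  -- Leibniz rule: count how often each total is used, then multiply.
  CV f f' * CV g g' = CV (f * g) (f * g' + f' * g)
  fromInteger n = CV n 0
  negate (CV f f') = CV (negate f) (negate f')
  abs    = error "not needed"
  signum = error "not needed"

ones, teens, tens :: CV
ones  = CV 9 36   -- "one" .. "nine": 9 words, 36 letters in all
teens = CV 10 70  -- "ten" .. "nineteen": 10 words, 70 letters
tens  = CV 8 46   -- "twenty" .. "ninety": 8 prefixes, 46 letters

-- 1..99: ones and teens, plus each tens prefix followed by an optional
-- digit (the 1 is the empty suffix: one word, zero letters).
upTo99 :: CV
upTo99 = ones + teens + tens * (1 + ones)

main :: IO ()
main = print upTo99  -- CV 99 854, matching ten2 above
```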

In the next installment we will use this setup to find the 51
billionth letter of this sequence (omitting the sorting), and then
we will move on to see how to sort the sequence while maintaining
efficiency.