Sunday, September 05, 2010

Statement of the Problem
Suppose you have a real valued function on the reals, say f. We can split it into the sum of an even and odd part:

f(x) = fodd(x)+feven(x)

where

fodd(x) = (f(x)-f(-x))/2, feven(x) = (f(x)+f(-x))/2

If fodd has a power series around zero, then all of its terms must have odd powers in x. So fodd(x)/x must have all even powers and it becomes natural to 'compress' down the terms to form f'(x) = fodd(sqrt(x))/sqrt(x). (Take that as a definition of f' for this article.) We can implement this operation as a higher order function:

But there's something not quite right about this. If f is rational, then so is odd f. But the implementation of odd involves square roots. Among other things, square roots introduce inaccuracy. As square roots don't appear in the final result, can we eliminate them from the intermediate steps of the computation too?

A Better Solution
Let's use the results of last week's article to compute odd another way. We want a linear function that maps as follows:

1

->

0

x

->

1

x2

->

0

x3

->

x

x2n

->

0

x2n+1

->

xn

Here's an automaton:

If we start at 0, end at 1, and take exactly n steps, then the product of the factors we collect up along the way is given by the second column of that table. In n is even there is no such path so we collect up 0. As we're working with polynomials over the reals, rather than types, we have x1=1x and so on. We can construct a transition matrix:

0

1

x

0

We now do something like we did last time. Any time we have a function of some variable x, we replace x with the transition matrix. Our functions now take matrix values like

a

b

c

d

Any polynomial of our transition matrix always gives us equal elements along the diagonal. This is true even if we form the inverse of the transition matrix. So we don't need to store d. So now we implement a simple matrix type needing only to store three elements instead of four:

This new version has many advantages: it uses only rational functions, it's more accurate, and it's well defined at zero.

Conclusion
Automatic differentiation is just one of a family of methods that can be used to compute a wide variety of functions of real-valued functions. Essentially we're just working over real-valued matrices instead of real numbers. By using automata we can simplify the process of working out which matrices to use. (Though for the simple example above, you may have been able to guess the matrix without any other help).

(BTW I think there are hidden automata lurking in a few places in mathematics. For example, in Umbral calculus.)