Friday, February 4, 2011

Concrete Mathematics by Knuth et al is a great book but there are a couple of places where people learning by themselves can stumble, fall and get lost.

When I originally worked through the book I couldn't make head or tail of the "repertoire method" of solving recurrences. Eventually I did figure it out and sometime later I wrote up what I understood and posted it on my (then) blog. It seems that a lot of people search for "Concrete Mathematics Repertoire method" on Google and it is my second most popular blog post (The most popular post is this one, fwiw)

So I re-read the repertoire method post today and while it is correct, I can explain it better today so here goes.

(What follows may not make sense to you if you are not working through CM (be warned!). Also be warned that I have no formal training in mathematics, computer science or programming and am entirely self taught. So people with such training can probably explain things better. The following reflects only *my* understanding. That said, onwards!)

By the time you hit the repertoire method section in Chapter 1 of Concrete Mathematics, you have been taught a simple method to find closed forms of recurrences. The essential algorithm is

(a) make a table of the recurrence values R(n) for small values of n.
(b) Eyeball the table to see if you can spot a pattern,
(c) write down the pattern.
(d) Prove (or disprove) the candidate closed form's correctness by Mathematical Induction over (a subset of) the natural numbers.

You used this method, for example to solve the Josephus problem, so you know where to stand (in the circle of your idiot friends trying to commit mass suicide) so that you end up being the survivor), surrender to the Romans and become a historian for the ages).

The repertoire method makes its (first) appearance in the generalization of the Josephus recurrence. The constants 1 and -1 are replaced by alpha beta and gamma to give the more general recurrence

Your job is to find f(n) such that this recurrence is true for any values of alpha, beta, and gamma.

So how do you do this? You use the only technique you know (at this point in the book) of making a table for small values of n and eyeballing it to spot a pattern (I am too lazy to reproduce the table here, go buy the book!). You don't spot a pattern for the f(n) but you do notice that all values of f(n) follow a pattern of

f(n) = A(n)*alpha + B(n)*beta + C(n)*gamma~~~~~~~~~~~~~~[2]

This is a kind of template of what the final closed form will look like, depending of course on the values of A(n), B(n) etc

In other words if we can find functions A(n), B(n) and C(n) such that when given n they generate the right coefficients as in Table 1.12, then we have a closed form for f(n).

Something very important happened here. You broke down a problem into smaller subproblems.

You still don't know the value of f(n) but now you know that if you can find the values of the functions A(n), B(n), and C(n) you have solved f(n).

Now you can find these values with your trusted "spot a pattern and verify with induction" (the only tool you have at this point) OR you can use the (unexplained!) repertoire method.

The first bit of confusion arises because the authors do both. They guess values of all the three functions in n, A(n), B(n) and C(n) from the initial table, say (correctly) that proving that these values by induction is long and tedious and then they go ahead and prove that the guessed value of A(n) is correct by induction! (to be fair they are solving for only A(n) and not all the unknowns simultaneously but though smaller this induction is still tedious. Try it. I did.)

Or, in more detail,

a value is guessed, (that A(n) = 2^m , m coming from rewriting n as 2^m + k as in the original Josephus problem - the book uses the letter l instead of k, I use k to distinguish easily from the number 1), beta and gamma are set to zero (remember the recurrence has to hold for *all* values of alpha, beta and gamma, including the selected 1,0,0) and then the recurrence [1] is rewritten (using [2]) as a recurrence in terms of A(n).

I'll repeat that so it is clear what is happening

Step 1: Guess a value for A(n), by eyeballing.We guess A(n) = 2^m

Step 2: Select alpha, beta and gamma so that the other two functions of n, B(n) and C(n), get eliminated from [2]. You can do this by selecting alpha = 1 and beta = gamma = zero.

Step 3: Rewrite the equation [2] (i.e the observed generalized form) in terms of the selected values of alpha, beta and gamma. You get f(n) = A(n).

Step 4: Use this equation (ie f(n) = A(n)) and your chosen values of alpha, beta and gamma to rewrite the original recurrence (ie equation [1]) to get

Now the problem becomes to prove that your guess (i.e, A(n) = A(2^m + k) = 2^m) satisfies this new recurrence as expressed by [3] and [4] and [5].

The authors state "Sure enough it is true (by induction on m) that A(2^m + k) = 2^m "

The authors don't show the induction but you can work out this (tedious but not difficult) induction. Only basic algebraic manipulation is required)

Be aware that (a) the induction is on m, not n and (b) the predicate to be proven has the form P_m: (A(2^m +k) => [3] AND [4] AND [5]).

First, prove P_zero (as m starts from zero, though n starts from 1 - we need an induction on m, not n!). Then prove that P_m => P_m+1. (Weak Induction is sufficient).

So we prove (yay!!) that A(n) does = 2^m.

You could do the same for B(n) and C(n). Select values for alpha, beta and gamma to create recurrences in terms of B(n) only and then C(n) only, just as we did above for A(n), then use mathematical induction over m to prove your guesses correct.

In the book though, the authors switch to the repertoire method to find B(n) and C(n). This switch is the first confusing bit - A(n) is found using a guess + induction (the old method - eyeball, guess, use induction). But then they switch - and the repertoire method is used to find B(n) and C(n) and the initial guesses as to their values are unused!

Worsening the confusion is the fact that the repertoire method is not identified or explained explicitly at this point. As a student states in a margin note (great idea btw) "Beware: the authors are expecting you to figure out the idea of the repertoire method from seats of pants examples instead of giving a top down presentation"

The seat-of-pants example is actually enough if A(n),B(n) and C(n) are all worked out with the repertoire method.

So let us solve the whole thing with the repertoire method (no induction) and see how it works. Let us throw away the guesses about the values of A(n), B(n), and C(n). We'll assume we couldn't make any guesses for A(n), B(n) and C(n).

ok so we don't know (and we need to find out) the values of A(n), B(n) and C(n) .

The repertoire method (for this recurrence) works like this.
(1) Guess a value for f(n). (ie the guess is for the whole of [2] NOT a component A(n) as we did above!)
(2)See if you can find values for alpha, beta and gamma to validate this guess. Rewrite [1], the original recurrence in terms of your guess for f(n). See if you can find values for alpha, beta and gamma.
(3) Substitute these values (of alpha, beta and gamma), back into [2]. You'll get an equation in terms of the three unknowns A(n), B(n) and C(n).
(4)Repeat steps (1) - (3) till you have three independent equations.
(5)Solve for three linear equations in three unknowns.

Done.

Important: If you make a "wrong" guess you will end up with a useless equation like 0 = 0 or an equation that is not independent of the already derived equations and so on. If this happens, don't worry about it, try another guess till you do get three independent equations.

In detail.

I guess (the authors do too) that f(n) = 1.

Rationale for the guess: f(n) = constant is the simplest possible formulation of f(n) (just for fun you might want to try f(n) = 0)

so we have values for alpha,beta, gamma and f(n) and when we substitute back into [2] we get

A(n) - B(n) - C(n) = 1~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[6]

ok!!! we have the first equation.

Now if we could get just two more independent equations like this we would have three equations in three unknowns (and so solve by Linear Algebra).

The authors use f(n) = n as their second guess and get A(n) + C(n) = n as their second equation.

(This is a good guess too. After f(n) = k, f(n) = n is the next rung up the complexity ladder. But in this specific example we can do better - see below)

and since they already proved that A(n) = 2^m by the "guess and use induction" method they don't need a third equation. They have two equations in two unknowns and they solve to get the solution.

But we don't have any guesses as to the values of A(n), so we can't plug that in.

We guessed at f(n) = 1 and got the equation A(n) - B(n) - C(n) = 1 and we need two more independent equations. We could re use the authors' f(n) = n guess and also cheat a bit and reuse the (non generalized) Josephus recurrence solution that we already proved that to "guess" that f(n) = 2k + 1. Then alpha = 1, beta = -1 and gamma = 1, giving us the third equation A(n) - B(n) + C(n) = 2k + 1. Three equations three variables. Solve. This gives the right answer too.

But that feels like a cheat. What if we hadn't solved the Josephus recurrence before? How would we guess f(n) = 2k + 1? We could go with f(n) = n but we can do better.

Since n = 2^m + k, we guess (our second guess)

f(n) = 2^m

and f(n) = k. (our third guess)

(Rationale for these guesses. Since n is dependent on 2^m and k why not guess with the simpler variables rather than f(n) = n, like the example in the book does? In this case this decision pays off spectacularly, giving the solutions for A(n) and C(n) directly and B(n) trivially )

These give us the equations, (just like we worked out [4] above, try it!)

Hopefully now what the authors say about the recurrence method makes more sense, though the sentence structure at this point in the book is confusing Let us take it apart - my comments in italics.

"First we find settings for parameters for which we know the solution" (the parameters here are alpha beta and gamma, the "solution"s are the various guessed values of f(n) NOT A(n),B(n), C(n). We make guesses for A(n),B(n) etc when we are using the prove by Induction method. When we use Repertoire method, we guess for f(n) and *find* A(n), B(n) etc);

"this gives us a repertoire of special cases that we can solve" (the special cases are the independent equations in the unknowns A(n),B(n),C(n) and solving them gives us the values of A(n) etc).

"Then we obtain the general case" (the solution of recurrence [1], the general value of f(n) )

"by combining the special cases" (In this case we combine the solutions of the equations which are the "special cases").

Hopefully that helped a bit.

The repertoire method pops up all over CM in various contexts, and once you grasp it is easy to identify and use. Enjoy the rest of Concrete Mathematics (which imho is a great, great book every programmer should have on his bookshelf)