Monday, July 25, 2011

A small and useless one this evening about attempts to write CPS (Continuation Passing Style) algorithms in both Scala and Clojure. Purists are going to shoot me, because everybody knows that the JVM is not optimized at all for full tail recursion. So what is the point of CPS in JVM languages after all?
It is a matter of training the object-oriented eye to identify other patterns of programming code that are quite common in functional programming domains. I would like to take the opportunity to report the publication of Dean Wampler's book: Functional Programming for Java Developers. I downloaded the PDF last Saturday and started reading it. Quite easy to read, the book very cleverly unifies the worlds of object-oriented programming in Java and of the functional programming paradigm, allowing the curious Java developer to extend their experience through dense and essential exercises and code samples. This book deserves to become a must.

Digression achieved, let's come back to CPS. I once discovered CPS in Michael Fogus and Chris Houser's book, The Joy of Clojure, and to my real shame forgot about it. After all, you remember, the JVM is not fully optimized, blah, blah, blah.

Then I saw a video of Chris League explaining delimited continuations and monadic programming to the New York Scala Enthusiasts Meetup. His introduction starts with continuations. I won't dig into the content of the presentation - as I am still exploring it - but I wanted to reproduce the beginning of it. The example is nice.

Starting with Clojure, the most obvious example is computing the factorial of a number.
The basic "mundane" approach (an expression borrowed from my favourite Clojure book) leads to the first sample.

Hey, remember the talk about the JVM not being optimized for full tail recursion? Well, here we are. We recursively stack so many frames that we exceed the maximum stack depth.

Clojure offers a workaround: the special form recur, combined with the accumulator technique. Basically, we store the partial, ongoing result as we recursively invoke the function, so we can easily write a second version.
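A sketch of it, again with *':

(defn factorial [n]
  (loop [n n, accumulator 1]
    (if (zero? n)
      accumulator
      (recur (dec n) (*' n accumulator)))))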

Roughly, recur returns control back to the recursion point, acting like a loop or a while form. (dec n) and (* n accumulator) are evaluated before the call. The accumulator parameter, in essence, "accumulates" the result to come. What we are storing is data.

Go ahead, try it, far beyond 10000 if you want. It works fine. I like recur. What we accumulated here is the data in process.

But what if we accumulated the work to be done at the very end? Something like wrapping the work to be done, in order to unwrap it when meeting the breaking condition. We can do that explicitly, taking control back after the execution of each recursive call. In summary, what if we were writing something like the following?
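In Clojure, a sketch of it:

(defn factorial-cps [n continuation]
  (if (zero? n)
    (continuation 1)
    (factorial-cps (dec n)
                   (fn [value] (continuation (* n value))))))

(factorial-cps 5 (fn [n] n)) ;=> 120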

What we are passing there to the recursive invocation is the next operation to apply after the function execution. We transfer a means of controlling the result of this processing, through the application of a continuation. Something similar to a callback.
At some step, we know n, we want to compute the factorial of n-1, and the continuation is what happens after computing factorial(n-1) in order to get factorial(n).
The continuation is:

(fn [value] (* n value))

the value being factorial(n - 1)

We cannot be more functional, more declarative than that.

We chain the recursive invocation, continuation after continuation:

(fn [value] (continuation (* n value)))

What we want to get back is the value of the factorial computation itself, so the very first continuation will be the identity function:

(fn [n] n)

The final execution order of the whole work we have been pushing ahead will be as follows.
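For (factorial-cps 5 (fn [n] n)), a sketch of the unwinding:

;; at n = 0 the innermost continuation fires first,
;; then each wrapping continuation in turn:
(* 1 1)  ;=> 1
(* 2 1)  ;=> 2
(* 3 2)  ;=> 6
(* 4 6)  ;=> 24
(* 5 24) ;=> 120, finally handed to the identity continuation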

Some natural order indeed. The main idea is that you clearly express, in code, your intent of taking back control, in an explicit and declarative manner. This case is not handled as naturally by recur and, in the end, while executing all the pushed functions, your stack will blow.

If your algorithm can guarantee a limited use of the stack, you may end up with a nice declarative expression. This is striking in Scala, and the resulting code patterns might seduce DSL lovers.

Still with me? :) Take your time. No rush.

I reproduced Chris League's experiment. The intent is to compute the factorial of a number, so here we go.
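My reconstruction in Scala, a sketch rather than League's exact code:

def factorial[A](n: BigInt, continuation: BigInt => A): A =
  if (n == 0) continuation(1)
  else factorial(n - 1, (value: BigInt) => continuation(n * value))

factorial(5, (value: BigInt) => value)          // identity continuation: returns 120
factorial(5, (value: BigInt) => println(value)) // traces the result instead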

Note that you can pass the identity function as a continuation if you want to get back the result of your stacked recursion, but you can also pass in whatever function you want in order to manipulate the resulting value. Here we trace the computation by passing println.

Sunday, July 17, 2011

Ouch, two weeks without blogging about the things I am learning, and I have this addiction problem with the blog :). Having committed to writing something, when time's up (nearly two weeks overdue), I start to feel ill at ease, specifically when thinking about all the people encouraging me.

I changed position too soon and missed two nice opportunities, one in England and another in Ireland, two tremendous ones. Well, a contract is a contract, and as a professional (please read Uncle Bob's latest writings about that) I have signed, so I have to go, although J2EE seems less attractive to me than it used to be. The Actor model and functional programming appeal to me more. It's a curse, being stuck in France for the next five months, but next time I will wait for the nice opportunity, six months if necessary, practicing what I like.

I took the time to read and start practicing things like Akka, and kept on reading Scala in Depth by Joshua Suereth. This one rocks, specifically when it comes to implicits and the typing formalism. But the domain is hard, and it is a must-read-many-times book, although Joshua Suereth provides clear explanations and examples. I also began practicing Clojure thanks to this amazing book: The Joy of Clojure. I felt the same as when I started watching the Abelson and Sussman lectures. Some examples are quite hard for the Java eye, and I bumped into a wall last week.

I spent a tremendous amount of time at the end of chapter 7, on a shortest-path search algorithm aiming to help you get out of a small maze. Interesting. I noticed once more how some of us (me included) are so bound to our clients' will to integrate open source - in order not to reinvent the wheel - that we may lose our capability to reason about essential things, like how to find a short path in a graph. By the way, I hate the expression "do not reinvent the wheel"; it is a proof of laziness.

The proposed algorithm is the so-called A* search algorithm, an extension of Dijkstra's algorithm. I was not sure I had understood all of it clearly, so I decided to lose(?) one more day and implement the same problem in Scala, using Dijkstra's approach to the search for shortest paths. Grossly explained, the A* search adds an estimate function to the sorting of the remaining paths to be explored.

I took the time to re-read my Introduction to Algorithms on Dijkstra's algorithm. Indeed, the maze problem is typically a shortest path problem. It is all a question of vocabulary. What are we talking about?

Typically we are evolving in a world - the maze - where we have to progress from one point - a spot - to another. The two points can respectively be the entrance and the exit of the maze. Each step we make is going to cost us some price. Walls, high obstacles, whatever the difficulty, cost us a maximum of "weight", while a step on a flat floor costs us the minimum, let us say 1 on an integer scale. The topology and the shape of the maze can then easily be represented by an array-of-arrays structure (a matrix) like the following one.
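For example, with 5x5 values of mine:

val topology = Array(
  Array( 1,  1,  1,  1,  1),
  Array(99, 99, 99, 99,  1),
  Array( 1,  1,  1,  1,  1),
  Array( 1, 99, 99, 99, 99),
  Array( 1,  1,  1,  1,  1))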

What we have presented here is the structure of a Z path. The walls - of high cost - are represented by the number 99. So a natural path out of the maze, from the upper left corner to the bottom right corner, would be the following.
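A plausible listing, following the open corridors of the matrix above:

List(Spot(0,0), Spot(1,0), Spot(2,0), Spot(3,0), Spot(4,0), // across the top row
     Spot(4,1), Spot(4,2),                                  // down the right side
     Spot(3,2), Spot(2,2), Spot(1,2), Spot(0,2),            // back along the middle row
     Spot(0,3), Spot(0,4),                                  // down the left side
     Spot(1,4), Spot(2,4), Spot(3,4), Spot(4,4))            // across the bottom row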

The astute reader will have noticed that we have kept the standard x, y notation used in computer science: the x values increase from left to right, and the y values increase from top to bottom.

The upper left corner is represented by Spot(0,0), while the bottom right is represented by Spot(4,4).

This working context can be mapped to a graph problem, where the Spot abstractions are the nodes and the matrix (array of arrays) contains the weights of the edges linking the nodes together. The weight (or cost) at a certain position in the matrix expresses how much it costs to get to that position, wherever you come from. The so-called weight function regularly invoked in graph search algorithms reduces itself to a look-up of a value in the matrix.

So what is a spot? Some abstraction that matches position coordinates. As we ramble through the maze, we will have to find the neighbors of a spot using incremental deltas of position (the steps). The test describing the expected behavior is the following.
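A simplified sketch of it, together with the matching implementation:

import org.junit.Test
import org.junit.Assert.assertEquals

class SpotTest {
  @Test
  def spot_IncrementedByADelta_ShouldCreateANewSpot() {
    assertEquals(Spot(3, 2), Spot(2, 2) + (1 -> 0))
  }
}

// a position in the maze; incrementing it by a delta creates a new Spot
case class Spot(x: Int, y: Int) {
  def +(delta: (Int, Int)): Spot = Spot(x + delta._1, y + delta._2)
}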

I chose case classes in order to ease spot comparisons and look-ups in tables, sets, etc. In the spirit of functional programming, a spot incremented by a delta creates a new Spot. Before digging into a (very) small explanation of the algorithm, a little more code. I already knew I would need a World abstraction in which to define my costs, and maybe allowing me to find the neighbors of a spot during my exploration. In order to create it, I defined a couple of tests.
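Simplified, they look like this:

import org.junit.Test
import org.junit.Assert.assertEquals

class WorldTest {

  @Test
  def costAt_WithFourCellWorld_ShouldLookUpTheTopology() {
    val world = new World(Array(Array(1, 1), Array(1, 99)))
    assertEquals(99, world.costAt(Spot(1, 1)))
  }

  @Test
  def rightTopCornerSpot_WithFourCellWorld_ShouldHaveTwoNeighbours() {
    val world = new World(Array.ofDim[Int](2, 2))
    assertEquals(Set(Spot(0, 0), Spot(1, 1)),
                 world.neighborsOf(Spot(1, 0)).toSet)
  }
}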

Basically I set up small worlds, asserting on the cost of positions and finding the neighbors of very specific corner points. Standard unit tests for a matrix exploration.
This is where I discovered the useful Array.ofDim() Scala factory method. A wisely chosen import helps writing self-explanatory tests, like the one in rightTopCornerSpot_WithFourCellWorld_ShouldHaveTwoNeighbours.
The matching implementation of a world is sketched below.
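A simplified version:

class World(topology: Array[Array[Int]]) {

  // the only four allowed moves: left, right, up, down
  private val deltas = List((-1, 0), (1, 0), (0, -1), (0, 1))

  private val height = topology.length
  private val width = topology(0).length

  // cost (weight) of stepping onto a given position, wherever you come from
  def costAt(spot: Spot): Int = topology(spot.y)(spot.x)

  // the horizontal and vertical neighbors of a spot, kept inside the world
  def neighborsOf(spot: Spot): List[Spot] =
    deltas.map(spot + _)
          .filter(s => s.x >= 0 && s.x < width && s.y >= 0 && s.y < height)
}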

A World abstraction accepts a topology definition and exposes two query methods, neighborsOf and costAt, respectively providing the possible Spot neighbors of some input spot, and allowing some client code to get the cost (or weight) at a certain position indexed by a Spot. We assume we can step only to horizontal and vertical positions, so the selector used to identify the neighbors is composed of only four deltas, as the deltas list in the sketch above shows.

The assumption has been made that all the arrays in the main array definition have the same size. The class is immutable. So far so good...

Now we have to find our way in the world abstraction. Graph search relies on a set of lemmas found by smart people like Dijkstra, and these lemmas are nice guides when it comes to the search for short paths. These lemmas can be taken for granted, as one can intuitively "feel" their correctness. I invite you to check some algorithms book to become familiar with them.
I got one of these "Ah-ah" moments when I understood the following (oversimplified here):

Given a weighted directed graph G with a weight function, let p be a shortest path; then any sub-path extracted from p will itself be a shortest path.

So if at some point in our search we have found a shortest path, then we are sure all its sub-paths will be shortest paths too.

Looking for a shortest path will be a progressive move, starting in the vicinity of the starting node. Then, expanding from neighbor to neighbor, we will re-evaluate, when necessary, our estimates of the shortest path measures.

The technique used is called relaxation. Let us start by saying that the shortest paths to all the Spots except the starting one have an infinite cost. If during our progression we can improve these shortest path estimates, we will relax these infinite values. Let us say that d[v] is the shortest path estimate to the spot v from the spot s (the start), and that at some time we can estimate d[u], the shortest path from s to u, a neighbor of v. Given w(u,v), the weight (or cost) of the edge (or step) between u and v, then it is possible to reduce the value of d[v].
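In pseudo code, the relaxation step reads:

if d[v] > d[u] + w(u,v)
then let d[v] = d[u] + w(u,v) (and remember u as the predecessor of v)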

We have the beginning of a solution, and Dijkstra helps us. He proposed a greedy algorithm, always providing a solution. The idea is to enrich a set of already identified shortest paths (S) while progressively emptying a priority queue (Q) of paths to be explored. The paths to be explored are sorted by their (corrected or not) weight (or cost) values. The paths to be explored immediately in Q are the paths with the lowest weight. The correction of the weights is progressively achieved by relaxation. Dijkstra's algorithm is grossly described by the following pseudo code:

Set all path weights to infinite
Path weight at the starting Spot is 0

S is empty
Let Q be all Spots
while Q is not empty
do let u = extract-first-from(Q)
let S = S U {u}
for all neighbors v of u
relax v with the u path estimate and w(u,v)

At the very beginning, all the hypothetical paths in Q are supposed to be infinite, except the starting point. Each neighbor's path cost estimate will be progressively corrected.
I have adapted the starting conditions: the Q of to-do paths will start containing the starting path only (no point in introducing all the paths with infinite costs). The to-do paths in Q will be corrected, and newly corrected paths added if they are not yet in the queue. So at each step we need weighted Path abstractions, matching a path to a Spot, holding a weight estimate and a list of predecessors of the Spot. Here are the tests qualifying the Path abstraction, together with the abstraction itself.
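A simplified sketch:

import org.junit.Test
import org.junit.Assert.assertEquals

// a weighted path: the spot reached, the cost so far, and the spots crossed before
case class Path(spot: Spot, weight: Int, predecessors: List[Spot])

class PathTest {

  // the 99 wall forces a unique shortest path around it
  private def fourCellWorld = new World(Array(Array(1, 99), Array(1, 1)))

  @Test
  def path_InFourCellWorld_ShouldAvoidTheWall() {
    val path = shortestPath(fourCellWorld, Spot(0, 0), Spot(1, 1)).get
    assertEquals(List(Spot(0, 1), Spot(0, 0)), path.predecessors)
  }

  @Test
  def path_InUndirectedCellWorld_ShouldHaveCorrectSize() {
    val flatWorld = new World(Array.ofDim[Int](2, 2))
    val path = shortestPath(flatWorld, Spot(0, 0), Spot(1, 1)).get
    assertEquals(2, path.predecessors.size) // several solutions, all of them two steps long
  }
}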

We start with a small world made of four Spots where the shortest path is unique. All the tests except path_InUndirectedCellWorld_ShouldHaveCorrectSize have clearly identified or forced paths. The latter must find a path of the correct size even if there are multiple solutions. The matching implementation is sketched below.
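A simplified version, where shortestPath wraps the recursive find (the relax method comes just after):

import scala.annotation.tailrec

def shortestPath(world: World, from: Spot, to: Spot): Option[Path] = {

  @tailrec
  def find(todo: List[Path], alreadyDone: Map[Spot, Path]): Option[Path] =
    todo match {
      case Nil => None // the whole world explored, no way to the destination
      case best :: _ if best.spot == to => Some(best) // breaking condition
      case best :: rest =>
        find(relax(world, best, rest, alreadyDone),
             alreadyDone + (best.spot -> best))
    }

  find(List(Path(from, 0, Nil)), Map.empty)
}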

There I start my exploration, creating my list of paths to be improved or settled with the from point. Naturally, at this very moment, I pass an empty Map of already-done paths. The destination to will be used as a break point to terminate the process. In the find method lies the logic driving my exploration choices: did I find the destination? Is my list empty? etc.

This decision scenario is perfectly supported by a Scala match expression, where the breaking condition is handled by the second case. I kept the recursive call in tail position in order to benefit from tail recursion optimization. The recursion process relies on the relax method implementation.
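A simplified sketch, using helper methods detailed below:

def relax(world: World, best: Path,
          todo: List[Path], alreadyDone: Map[Spot, Path]): List[Path] = {
  // a valid neighbor is a neighbor spot not already settled
  val validNeighbors = world.neighborsOf(best.spot).filter(notIn(alreadyDone))
  // candidate paths extending the best path by one step
  val candidates = validNeighbors.map(oneStepFurtherThan(best, world))
  mergeAndSort(todo, candidates)
}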

This is where the algorithm is implemented.
At that moment I am working with a specific Path, located at some Spot and already weighted. I simply look up the valid spot neighbors. A valid neighbor is a neighbor Spot not already stored in the alreadyDone Map.
In order to proceed using a fluent language, I defined a few tool methods.
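In essence, simplified:

// first-class predicate: is a spot still to be settled?
def notIn(alreadyDone: Map[Spot, Path]): Spot => Boolean =
  spot => !alreadyDone.contains(spot)

// first-class factory: extend a path by one step to a neighbor spot
def oneStepFurtherThan(best: Path, world: World): Spot => Path =
  spot => Path(spot, best.weight + world.costAt(spot), best.spot :: best.predecessors)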

The whole purpose of the tool methods is to apply comprehensions or to create reusable first-class functions, so that the algorithm in the relax method reads more fluently. Finally, once the lists of refreshed paths to be handled and new paths to be added are built, the whole resulting list is built up and sorted again.
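A simplified sketch of that last step:

// keep, for each spot, the lightest known estimate, then explore the cheapest first
def mergeAndSort(todo: List[Path], candidates: List[Path]): List[Path] =
  (todo ++ candidates)
    .groupBy(_.spot)
    .values.map(_.minBy(_.weight))
    .toList
    .sortBy(_.weight)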