Monday, August 15, 2011

Sipping some Monte-Carlo in Scala and Clojure

Howdy buddies, a new one from SICP. The lectures are nice and suit the learning curve of all kinds of functional programming languages. Naturally that concerns Clojure, as a Lisp-1, and Scala. I recently watched the lecture on the (very) few benefits of assignment. There, Gerald Jay Sussman presented a simple example of a Monte Carlo simulation dedicated to computing the number Pi.

First, thanks a lot to Debasish Ghosh, who kindly agreed to answer my questions and inspect my code too. May he be assured of all my gratitude. Exchanging ideas with such a skilled person was quite an experience and brightened these troubled times for me. I hope I have carried over all the modifications he suggested; may he forgive me if I did not. I may not yet have found a Master Craftsman in Europe to teach me about actors and functional programming (40 years old may be too... old), but the kind help of this person was a real support.

Back to Pi. An approximate value of Pi can be quickly computed from Cesaro's result that the probability that two integers chosen at random have no factors in common is inversely proportional to the square of Pi. More clearly:

P(gcd(N1,N2) == 1) = 6 / π²

Two integers have no factors in common exactly when their gcd is 1. Quoting the Wikipedia article: "Monte Carlo methods (or Monte Carlo experiments) are a class of computational algorithms that rely on repeated random sampling to compute their results". So running a Monte Carlo experiment to compute Pi boils down to running a gcd computation on random natural numbers until the result converges to the expected value.
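To make the link between the measured ratio and Pi concrete: if p is the observed fraction of coprime pairs, then p ≈ 6 / π², hence π ≈ sqrt(6 / p). A minimal sketch of that conversion follows; the name pi-from-coprime-ratio is hypothetical and appears nowhere in my actual code.

```clojure
;; Hypothetical helper: turn an observed coprime ratio p into an estimate of Pi,
;; using p = 6 / Pi^2, hence Pi = sqrt(6 / p).
(defn pi-from-coprime-ratio [p]
  (Math/sqrt (/ 6.0 p)))

;; Feeding in the exact probability gives back Pi, up to floating-point rounding.
(pi-from-coprime-ratio (/ 6.0 (* Math/PI Math/PI)))
```

The better the Monte Carlo run approximates p, the closer the estimate gets to Pi.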
Therefore, random number generation takes place here, since we run a gcd function on a series of nearly random integers. To keep things simple, let's assume that generating numbers can be described as producing a string of numbers derived from a sequence rooted in a seed value. I picture that as a mathematical sequence:

where at each step one needs to cache the newly generated value so that the following one can be computed, and so on. (OK, my mathematical symbolic vocabulary is limited, but I do not want to copy and paste Wikipedia.)
This is a typical problem solved with the help of assignable variables. Of course a solution exists that does not depend on an external variable, but the result is much more cluttered (have a look there).
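One way to picture the assignment-free variant is to thread each value explicitly into the computation of the next one instead of caching it in a variable. The sketch below is only an illustration under stated assumptions: next-value and random-seq are hypothetical names, and the step is a plain linear congruential generator (with the well-known Numerical Recipes constants), not the generator I ended up using.

```clojure
;; Hypothetical linear congruential step: x(n+1) = (a * x(n) + c) mod 2^32.
;; Constants chosen only for illustration.
(defn next-value [x]
  (mod (+ (* 1664525 x) 1013904223) 4294967296))

;; The whole sequence as a lazy seq: f(seed), f(f(seed)), ...
;; No variable is ever reassigned; each value is derived from the previous one.
(defn random-seq [seed]
  (rest (iterate next-value seed)))

;; The same seed always yields the same numbers.
(take 3 (random-seq 42))
```

The price is that whoever consumes the numbers has to carry the sequence (or the last value) around, which is where the clutter comes from.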

Having installed Leiningen (version 1.6.1 is very nice, with a REPL on top), I now have all the tools I need to challenge my code with tests. In the Clojure package cesaro.test I created a suite of small tests in a core_spec.clj file. The starting content is:

(ns cesaro.test.core-spec
  (:use cesaro.core)
  (:use clojure.test))

Basic. Since the core_spec.clj file lives in the cesaro.test package, my namespace is cesaro.test.core-spec. There I declare my intent to use the tools of the clojure.test namespace to challenge the functions implemented in the cesaro.core namespace (probably the content of a core.clj file in a cesaro package... got it? :))

What I need first is a working gcd function. Here are the very dumb tests used to create it:

The last test appeared later, as I used Euclid's method to compute the gcd and had to face slight problems due to my expectation of ordered parameters :) (told you I was dumb... but still learning).
This leads me to:

The euclide-gcd function expects ordered parameters. I do not like the design, and what I hate most are the cluttered cond/if branches. This is something I intend to avoid in the future. I hate branches as they remind me of goto. Clojure deserves better than that.
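For the record, here is what a Euclid-style gcd without cond/if and without the ordering constraint might look like. This is a sketch, not the euclide-gcd function discussed above: gcd-sketch is a hypothetical name, and it assumes non-negative integer arguments.

```clojure
;; Hypothetical name. Euclid's algorithm expressed as a lazy sequence of pairs
;; [x y] -> [y (rem x y)], stopping at the first pair whose remainder is 0.
;; Assumes non-negative integers; argument order does not matter.
(defn gcd-sketch [a b]
  (ffirst (drop-while (fn [[_ r]] (pos? r))
                      (iterate (fn [[x y]] [y (rem x y)]) [a b]))))

(gcd-sketch 18 12) ;; => 6
(gcd-sketch 12 18) ;; => 6
(gcd-sketch 17 5)  ;; => 1
```

Because iterate is lazy, the pair after [g 0] is never computed, so no division by zero occurs.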
Then I need a random generator: a mechanism that lets me generate a seed and the random numbers derived from it. The tests look like:

And yes, I adopted a Marsaglia algorithm (see the previous Wikipedia reference). Although quite idiomatic, I find the result less elegant than the Scheme solution, where the set! form allows the modification of a variable declared in a let form:

The Scheme code appears to be more concise. This concision is not due to my choosing a Marsaglia algorithm involving two parameters instead of one; it comes from the fact that, by nature, variables bound by a let form in Clojure are immutable. The only solution I found was to use Clojure's idiomatic refs, in order to alter both wrapped values. Any suggestion is welcome.
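To give an idea of the shape this takes with refs, here is a minimal sketch assuming Marsaglia's two-word multiply-with-carry recipe as described on Wikipedia. The names m-z, m-w, step and next-random! are hypothetical, and the seeds are arbitrary illustration values; my actual code may differ.

```clojure
;; Two refs hold the two state words of the generator.
;; Seeds are arbitrary non-zero illustration values.
(def m-z (ref 362436069))
(def m-w (ref 521288629))

;; One multiply-with-carry step on a single state word.
(defn- step [multiplier x]
  (+ (* multiplier (bit-and x 65535)) (bit-shift-right x 16)))

(defn next-random!
  "Advances both refs in one transaction and returns a 32-bit value."
  []
  (dosync
    (let [z (alter m-z (partial step 36969))
          w (alter m-w (partial step 18000))]
      ;; Combine both words and keep the low 32 bits.
      (bit-and (+ (bit-shift-left z 16) w) 0xFFFFFFFF))))

(next-random!)
```

The dosync block is what guarantees that both words are updated together, which is the part that set! gives you for free in single-threaded Scheme.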

Frustrated by the two previous pieces of code, I decided to force myself to write a small Monte Carlo running function with no branching at all. A dumb test is to feed my upcoming function with always-passing and never-passing simulations in order to check its limits:

The montecarlo function accepts two input parameters, the first being the number of attempts and the second the simulation to be applied. The returned result is the ratio of passed simulations to the total number of simulations.

No branches, then? I would be lying if I said this one was a piece of cake for a beginner in functional programming like me. I had to switch to Scala, then come back to Clojure with the following implementation:

Of course the entry point is the montecarlo function. Since I know the number of trials to run (essays in the code), all I need is to iterate over a range of trials rather than recur:

(range essays)

Meanwhile, at each step, I can run a simulation (the purpose of the run-montecarlo function) that updates a vector of statistics, initialized to [0 0]:

from-start [0 0]

The first element is the number of passed trials and the second the number of failed ones. The reduce form (the equivalent of Scala's foldLeft) produces the vector of statistics. With reduce, one can produce whatever one wants, even another list, from the driving list. The purpose of the with-result-updates function is to create a lambda (so a procedure) that returns the vector of deltas to apply depending on whether a simulation has succeeded or not:

[1 0] on success

[0 1] on failure

The procedure instance is created once, and so is the hash map holding the possible results:

{true [1 0] false [0 1]}

We close over a single immutable instance of the hash map, embedded in the frame (execution context) of the created procedure. The same closing principle applies in the montecarlo function, where we close over both the previously described procedure instance and the simulation to be applied:
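Putting the pieces described above together, a branch-free version could be sketched as follows. The function names and the from-start and essays bindings come from the description above, but the bodies are a reconstruction under assumptions (in particular the delta-for parameter name and the use of boolean to normalize the simulation result), not a verbatim listing.

```clojure
(defn with-result-updates
  "Returns a procedure mapping a simulation outcome to its [passed failed] delta."
  []
  (let [deltas {true [1 0] false [0 1]}]   ; created once, closed over
    (fn [succeeded?] (deltas succeeded?))))

(defn run-montecarlo
  "Returns the reducing step: runs one simulation and adds its delta to the stats."
  [delta-for simulation]
  (fn [stats _]
    (vec (map + stats (delta-for (boolean (simulation)))))))

(defn montecarlo
  "Ratio of passed simulations over `essays` runs (essays must be positive)."
  [essays simulation]
  (let [from-start [0 0]
        [passed _failed] (reduce (run-montecarlo (with-result-updates) simulation)
                                 from-start
                                 (range essays))]
    (/ passed essays)))

;; The always/never passing simulations give the two limits.
(montecarlo 100 (constantly true))   ;; => 1
(montecarlo 100 (constantly false))  ;; => 0
```

The map lookup replaces the if: the outcome of the simulation selects the delta, and reduce accumulates it.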

Works nicely. What about Scala? Scala helped me find the no-branch version of the montecarlo method. For the trained Java eye, Scala is a pleasant bridge to cross in order to embrace good functional programming habits that carry over to any other functional programming language. Don't misunderstand me. Clojure and Scheme overwhelm me with thrilling sensations each time I practice them. They also show me I was wandering in the dark before.
So, going back to Scala (did I say I bought the Scala T-shirts and teddy bear?), I first had to write random number generator tests. This was an opportunity to try Specs2:

Here the result is provided as a 3-tuple holding the number of trials, the number of passed trials, then the number of failed trials. I only suffered on the application of the map method after zipping the resulting lists. The compiler seemed to need some help, with explicit typing, to infer the returned type. The astute reader will note that we used the same trick as in Clojure, storing the increment definitions for the two success/failure scenarios in a Map. I have all the tools I need to run a Cesaro test:

Well, living without branches is not easy, but it is worthwhile because it helps in learning how to use and reuse functional programming bricks. But I also learnt that I have to dig into the RNG code to find a smarter way to generate my numbers in Clojure. Maybe a stream-oriented approach would be better...
Nice, got to finish chapter 11 of The Joy of Clojure and read one more chapter of Gul Agha's Actors book.