Saturday, September 19, 2015

My beef with Schema arose while working through a set of programming problems that are published online. I was interested in using those problems to familiarize myself with the idioms of Scheme and Haskell, with which I am less familiar. The exercise in question asks for the sum of all Fibonacci numbers that do not exceed 10000 and are divisible by 2. For the purposes of this post, the Fibonacci sequence is defined by a recurrence relation:
\[
\begin{align}
F(0) &= 0 \\
F(1) &= 1 \\
F(n) &= F(n - 2) + F(n - 1). \\
\end{align}
\]
In an imperative language such as C, this problem can be solved with a program similar to the following one. The logic of this program mimics how a person might solve the problem with a pencil, paper, and a pocket calculator: it keeps a running total of the even Fibonacci numbers as they are computed.

This is the kind of code I could write in my sleep, which is why I am interested in exploring the functional languages. In a purely functional language such as Haskell, there is (loosely speaking) no state. Consequently, the standard looping construct is recursion. The code example below makes use of the convenient fact that the even Fibonacci numbers can be described by a recurrence relation of their own:
\[
\begin{align}
F(0) &= 0 \\
F(3) &= 2 \\
F(n) &= F(n-6) + 4F(n-3). \\
\end{align}
\]
It will be helpful to discuss this function line-by-line.

Base case. The sum of Fibonacci numbers divisible by 2 and not exceeding 0 is 0.

Base case. The sum of Fibonacci numbers divisible by 2 and not exceeding 1 is 0.

Recursive case. For limits greater than 1, uses a helper function to accumulate the sum.

Definition of the helper function that accumulates the sum. The arguments to the helper function are the previous even Fibonacci number $F(n-6)$, the current even Fibonacci number $F(n-3)$, and the sum up to this point.

If the next even Fibonacci number exceeds the limit, then the sum of the sequence is the sum up to this point.

If the next even Fibonacci number does not exceed the limit, then the accumulator function is invoked recursively with the current Fibonacci number, the next Fibonacci number, and the new total.

Recurrence relation for the even Fibonacci numbers, as above.

I can write a very similar algorithm in Scheme. Like in Haskell, the function uses a helper function to accumulate the sum of the sequence recursively.

So far, so good. I've worked out how to code a the solution to the problem recursively in two different functional programming languages. Now, let's get to the fun stuff. The solution to this problem can be expressed much more succinctly by defining a sequence of even Fibonacci numbers and using the takeWhile function to get the entries that do not exceed the limit.

The list of even Fibonacci numbers evenFibs is constructed beginning with the first two even Fibonacci numbers: $F(0)=0$ and $F(3)=2$. Once the first two members of the sequence are present, the rest of the sequence is populated using zipWith. In functional programming, the zip function takes two lists and returns the corresponding pairs in the list. For example, zip([1, 2], ['a', 'b']) returns [(1, 'a'), (2, 'b')]. The function zipWith calls a projection function on each corresponding pair and adds the result of the projection function to the list instead. For example, zipWith (+) [1, 4] [3, 2] returns [4, 6], with the projection function being addition ($+\colon \mathbb{R} \times \mathbb{R} \rightarrow \mathbb{R}$). The list evenFibs is constructed by zipping the list with itself so that each element is paired with its previous element. All you need to do is specify the recurrence relation for even Fibonacci numbers as the projection function to zipWith, and you can generate as many items in the sequence as are needed.

Once the sequence of even Fibonacci numbers has been defined, the takeWhile function can be used to pull numbers from the sequence while they do not exceed the limit. The first parameter to takeWhile, <= limit, is a predicate to check that an entry in the sequence does not exceed the limit. The sum function is then used to compute the sum of the entries extracted by takeWhile.

This approach works for many programming languages that draw on the functional tradition, including ones like C# and Python that are more often used in an imperative style.

Now, let's try writing this in Scheme. Unfortunately, in Scheme sequences of undefined size (like the Haskell lists and C# and Python generators) are a different data type than sequences of defined size. The latter are called lists, and are the most common data structure in Scheme. The former are called streams. I expected I could just define a Fibonacci stream and apply the Scheme take-while function to it. MIT Press has a nice article explaining how to define a Fibonacci stream, so I stole some code from there.

Unfortunately, the many functions in Scheme that operate on lists will not work on streams, so you have to use the stream versions instead (stream-take-while, stream-fold, etc.).

This is quite disappointing. Unlike Haskell, Python, and C#, Scheme needs to define two libraries with identical semantics depending on whether you need to work with a sequence of determinate or indeterminate length. Consider this the first reason that scheme sucks. The second reason is the ubiquitous linked list data structure. Adding an item to the end of a linked list is an $\Theta(n)$ operation. This means that building a list like the Fibonacci sequence from a recurrence relation is a $\Theta(n^2)$ operation. To get around this limitation, you have to construct the sequence backwards, and then reverse it.

Wednesday, February 4, 2015

Survey data is well known to be an imperfect measure of the truth. The many pitfalls of survey data mean that the numbers must be taken with some healthy skepticism.
Evidence-free assertions that black people are three times more dishonest than white people in answering surveys about drug use do not a healthy skepticism make.

@voxdotcom These figures based on self-reported drug use are notoriously unreliable. I thought it was your job to report "context"?