Implementing the Sieve of Eratosthenes with Functional Programming

Programming interviews often include a “Fizzbuzz” test, some simple programming task that is intended to establish that the candidate can, in fact, program something trivial. It isn’t intended as a deep architectural dive or as an algorithms challenge. Under ideal, non-pressure situations it shouldn’t take more than fifteen minutes to complete.

One common exercise is to print out some prime numbers, perhaps a hundred or a thousand. We’ll write a JavaScript program to do that, and we’ll implement the well-known Sieve of Eratosthenes algorithm.

functional iterators

Our approach will rest on functional iterators, stateful functions that you call over and over again to get values. Functional iterators return undefined if and when they have no more values to provide. Functional iterators are handy for two reasons: They decouple the mechanism of iterating over a data structure from what you do with its values, and they are also handy for dealing with data structures that may be so large as to be unwieldy, including infinitely large data sets.

To implement our Sieve of Eratosthenes, we’re going to use a few of the tools provided by allong.es, the JavaScript Function Utility Belt. Let’s get started.

unfolding

All implementations of the Sieve rest on filtering numbers that are divisible by any smaller prime number. To be true to the spirit of the Sieve, you must emulate the behaviour of filtering all of the numbers going forward, not iterating forward and checking each number against a list of primes.

So the first thing we’ll need is a collection of natural numbers. Since we don’t know how many numbers we’ll need, we might as well start with all of the numbers from two upwards. That’s very easy to write as an iterator:

As you can see, we have a stateful function that returns 2 the first time you call it, and thereafter returns the next largest natural number.

NumbersFromTwo is a specific case of a very general pattern: Starting with a seed, generate an iteration where each value generated is calculated by applying a function to the previous value. When you start with a list, collection, or iteration and “reduce” it to a value, computer scientists call that a “fold.” And the reverse operation, its dual, is called an “unfold.”

For a very simple case like this, we don’t really need to be explicit about our unfolding, but for the sake of seeing how things work, let’s use a helper function from the allong.es library and rewrite it as an unfold:

unfold returns an iterator. The first time you call the iterator, it returns the seed. Thereafter, it applies the function to the previous value to determine the next value to return. It may not seem like much of a win for a very short function, but making an unfold explicit like this also makes it easier to reason about the iterator than if you wrote a stateful function by hand.

Now we have a function that iterates over the natural numbers starting with 2. How would we “sieve” it?

accumulate with return

In The Drunken Walk Programming problem, we were introduced to the accumulate function. It works a lot like a fold, reduce, or inject method: It iterates over a collection, accumulating a state along the way. The special sauce in accumulate is that instead of waiting for the end of the collection and returning a single value, accumulate returns an iterator that iterates over the state value as it iterates over the list. For example:

Instead of returning the sum of the array as a single value, this iterator created with accumulate iterates over the running total.

accumulate is a terrific tool, but it does conflate the state with the value returned. Thus, you often find yourself writing a map from the accumulated state to the values you want. In some cases this can be either expensive or unclear, so allong.es provides a function that separates the concerns of accumulating state from generating values.

sieving numbers

Let’s consider what we want to do to “sieve” all the multiples of a number from the natural numbers. Essentially, we want to count as we go along. If our number is “5,” we count 1, 2, 3, 4, “stroke out!” 1, 2, 3, 4, “stroke out!” We could also choose to count down, as in 5, 4, 3, 2, “stroke out!” 5, 4, 3, 2, “stroke out!” We’ll go with counting down,although you’re welcome to rewrite it counting up and see that we get the same results.

Thus, we want to maintain a counter as we iterate through the numbers. But what do we want to return? If the counter is greater than one, we return the number. But if our counter reaches one, we return some special flag indicating we’ve stroked the number our and reset our count to our number.

Obviously, this could be much easier: We can take the modulus of the number directly with a simple map. But that isn’t the literal Sieve of Eratosthenes. And gosh-diddley-honest, we are going to implement this algorithm literally.

the final step

With an unfold, we start with a seed and transform it with every iteration into the next value to be returned. That, in a nutshell, is what we’re going to do with an iteration over the numbers from two: transform it with every iteration by sieving it.

So we expect to see something like this in the middle of our code:

remainingNumbers=SieveMultiplesOf(remainingNumbers,nextPrimeNumber);

However, we don’t actually want to return the sieved numbers with each iteration, we want to return each prime. So we need an unfold that separates state from the return value. Hmmm. accumulateWithReturn is just like accumulate, only it separates state from return value. Could it be that unfoldWithReturn is just like unfold but separates state from return value?

Our algorithm takes advantage of the fact that find from allong.es calls the iterator until a value matches the predicate function. It will thus skip over all of the false values representing multiples that have been crossed out.

The first number not crossed out is always the next prime to return. So, we return the new state consisting of the remaining numbers with multiples of the new prime number crossed out, and the new prime number as our return value.

bonus

Eratosthenes actually advocated an optimization of this algorithm. I left it out initially because it changes SieveMultiplesOf in such a way that it isn’t as easy to verify its behaviour separately because it’s coupled to the way numbers are taken off the front of the iterators:

post script

I actually failed that particular interview test when I took it. I can’t tell you why, it was “Just one of those days,” I guess. But I’ve never forgotten the fact that no matter how simple the test, interviews are high-pressure situations where anyone can “choke.” Well, maybe not anyone. But I certainly can.