We are hiring

Expanding reducers

When playing with a new bit of language, it can be helpful to restrict the problem space to an old, well understood algorithm. For me at least, learning one thing at a time is easier! For this post, It’ll be prime sieves, and I’ll be exploring clojure reducers.

A quick recap, the sieve of eratosthenes is a not-maximally-non-optimal way of finding primes. It’s usually expressed as follows:To find primes below n:
generate a list of n integers greater than 1
while the list is not empty:
take the head of the list and:
add it to the output
remove all numbers evenly divisible by it from the list
In clojure, something like:(defn sieve
([n] (sieve [] (range 2 n)))
([primes xs]
(if-let [prime (first xs)]
(recur (conj primes prime) (remove #(zero? (mod % prime)) xs))
primes)))
(sieve 10)
;= [2 3 5 7]
Which is fine, but I’d like it lazy so I only pay for what I use, and I can use as much as I’m willing to pay for. Let’s look at lazy sequences. Luckily for us, there is an example of exactly this on the lazy-seq documentation, which we slightly modify like so:(defn lazy-sieve [s]
(cons (first s)
(lazy-seq (lazy-sieve (remove #(zero? (mod % (first s))) (rest s))))))
(defn primes []
(lazy-seq (lazy-sieve (iterate inc 2))))
(take 5 (primes))
;= (2 3 5 7)
So now we have a nice generic source of primes that grows only as we take more. But is there another way?

A few months ago Rich Hickey introduced reducers. By turning the concept of ‘reducing’ inside out the new framework allows a parallel reduce (fold) in some circumstances. Which doesn’t apply here. But let’s see if we can build a different form of sieve using the new framework. First a quick overview (cribbing from the original blog post):

Collections are now ‘reducible’, in that they implement a reduce protocol. Filter, map, etc are implemented as functions that can be applied by a reducible to itself to return another reducible, but lazily, and possibly in parallel. So in the example below we have a reducible (a vector), that maps inc to itself to return a reducible that is then wrapped with a filter on even? which returns a further reducible, that reduce then collects with +.(require '[clojure.core.reducers :as r])
We’ll be referring to r here and there – just remember it’s the clojure.core.reducers namespace(reduce + (r/filter even? (r/map inc [1 1 1 2])))
;= 6
These are composable, so we can build ‘recipes’.;;red is a reducer awaiting a collection
(def red (comp (r/filter even?) (r/map inc)))
(reduce + (red [1 1 1 2]))
;= 6into uses reduce internally, so we can use it to build collections instead of reducing:(into [] (r/filter even? (r/map inc [1 1 1 2])))
;= [2 2 2]
So here’s the core of ‘reducer’, which “Given a reducible collection, and a transformation function xf, returns a reducible collection, where any supplied reducing fn will be transformed by xf. xf is a function of reducing fn to reducing fn.”(defn reducer
([coll xf]
(reify
clojure.core.protocols/CollReduce
(coll-reduce [_ f1 init]
(clojure.core.protocols/coll-reduce coll (xf f1) init)))))
And we can then use that to implement mapping as so:(defn mapping [f]
(fn [f1]
(fn [result input]
(f1 result (f input)))))
(defn rmap [f coll]
(reducer coll (mapping f)))
(reduce + 0 (rmap inc [1 2 3 4]))
;= 14
Fine. So what about sieves? One thought is we could build up a list of composed filters, built as new primes are found (see the lazy-seq example above). But there’s no obvious place to do the building, as applying the reducing functions is left to the reducible implementation. Another possibility is to introduce a new type of reducing function, the ‘progressive-filter’, which keeps track of past finds and can filter against them.(defn prog-filter [f]
(let [flt (atom [])]
(fn [f1]
(fn [result input]
(if (not-any? #(f input %) @flt)
(do
(swap! flt conj input)
(f1 result input))
result)))))
(defn progressive-filter [f coll]
(reducer coll (prog-filter f)))
And we then reduce with a filtering function that is a function of the current candidate and one of the list of found primes (see the #(f input %) bit above)(into [] (progressive-filter #(zero? (mod %1 %2)) (range 2 10))
;= [2 3 5 7]
It’s nicely lazy, so we can use iterate to generate integers, and take only a few (r/take, as it’s operating on a reducer):(into [] (r/take 5 (progressive-filter #(zero? (mod %1 %2)) (iterate inc 2))))
;= [2 3 5 7 11]
Or even(def primes (progressive-filter #(zero? (mod %1 %2)) (iterate inc 2)))
(into [] (r/take 5 primes))
;= [2 3 5 7 11]
You get the idea.