Twitter waterflow problem and loeb

The Waterflow Problem

In Fig. 1, we have walls of different heights. Such pictures are represented by an array of integers, where the value at each index is the height of the wall. Fig. 1 is represented with an array as [2,5,1,2,3,4,7,7,6].

Now imagine it rains. How much water will accumulate in puddles between the walls? For example, if it rains on the walls in Fig. 1, the following puddle forms:

Fig. 2

No puddles form at the edges of the wall; water is considered to simply run off the edge.

We count volume in square blocks of 1×1. Thus, we are left with a puddle between column 1 and column 6 (counting from 0), and its volume is 10.

Write a program to return the volume for any array.

My Reaction

I thought, “this looks like a spreadsheet problem,” and closed the page to get on with my work. The last thing I need right now is nerd sniping.

The scanl/scanr solution is, as expected, the fastest algorithm on this page, clocking in at a mean of 128.2953 µs for a random vector of 10000 elements.
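For reference, the scanl/scanr approach that the timings on this page are compared against can be sketched in a few lines. This is a minimal sketch assuming the usual scanl1/scanr1 formulation over plain lists; the name waterScan is hypothetical, and the code that was actually benchmarked may differ in detail.

```haskell
-- Running maximum from the left and from the right. The water level over
-- each column is the smaller of the two, and the volume held by the column
-- is that level minus the column's own height.
waterScan :: [Int] -> Int
waterScan hs = sum (zipWith3 volume (scanl1 max hs) (scanr1 max hs) hs)
  where
    volume l r h = min l r - h
```

For the array from Fig. 1, waterScan [2,5,1,2,3,4,7,7,6] evaluates to 10.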

But I still thought my spreadsheet idea was feasible.

My approach

In a similar way to Philip Nilsson, I can define the problem as it comes intuitively to me. As I saw it in my head, the problem can be broken down into “what is the volume that a given column will hold?” That can be written like this:

$\mathrm{volume}_0 = 0$

$\mathrm{volume}_{|S|-1} = 0$

$\mathrm{volume}_i = \min(\mathrm{left}_i, \mathrm{right}_i) - \mathrm{height}_i$

Where left and right are the peak heights to the left or right:

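$\mathrm{left}_0 = \mathrm{height}_0$

$\mathrm{left}_i = \max(\mathrm{height}_i, \mathrm{left}_{i-1})$

$\mathrm{right}_{|S|-1} = \mathrm{height}_{|S|-1}$

$\mathrm{right}_i = \max(\mathrm{height}_i, \mathrm{right}_{i+1})$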

That’s all.
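As a quick check, here are the definitions applied by hand to the Fig. 1 array, taking each peak to include the column itself. The row labelled min is the smaller of left and right, and volume is min minus height.

```
i        0  1  2  3  4  5  6  7  8
height   2  5  1  2  3  4  7  7  6
left     2  5  5  5  5  5  7  7  7
right    7  7  7  7  7  7  7  7  6
min      2  5  5  5  5  5  7  7  6
volume   0  0  4  3  2  1  0  0  0
```

The volumes sum to 4 + 3 + 2 + 1 = 10, which matches the puddle described for Fig. 2.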

A visual example

An example of i is:

[Figure]

We spread out in both directions to find the “peak” of the columns:

[Figure]

How do we do that? We simply define the volume of a column in terms of its immediate neighbors to the left and to the right:

[Figure: columns labelled A, X, B]

X is defined in terms of A and B. A and B are, in turn, defined in terms of their immediate neighbors, and so on until we reach the ends:

[Figure: columns labelled A, X, Y, B]

The ends of the wall are the only columns with just one side defined in terms of a neighbor, which makes complete sense. Their volume is always 0. It’s impossible to have a puddle on the edge. A’s “right” will be defined in terms of X, and B’s “left” will be defined in terms of Y.

But how does this approach avoid infinite cycles? Easy. Each column in the spreadsheet contains three values:

The peak to the left.

The peak to the right.

My volume.

A and B below depend upon each other, but for different slots. A depends on the value of B’s “right” peak value, and B depends on the value of A’s “left” value:

[Figure: columns labelled A, B]
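The reason this mutual dependency is harmless can be seen in a tiny example. The following is a minimal sketch with made-up heights (3 for A, 5 for B) and a hypothetical name, mutual; it only tracks the two peak slots and leaves out the volume.

```haskell
-- Each column is a pair (left peak, right peak). A reads B's right peak and
-- B reads A's left peak, so the two columns refer to each other, but no
-- single slot ever depends on itself. Lazy tuples let each slot be
-- evaluated on its own.
mutual :: ((Int, Int), (Int, Int))
mutual = (a, b)
  where
    a = (3, max 3 (snd b)) -- A: height 3, its right peak looks at B
    b = (max 5 (fst a), 5) -- B: height 5, its left peak looks at A
```

Here mutual evaluates to ((3,5),(5,5)). Had A’s right peak depended on B’s left peak and vice versa through the same slot, evaluation would loop instead.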

The height of the column’s peak will be the smaller of the two peaks on either side:

[Figure]

And then the volume of the column is simply the height of the peak minus the column’s height:

[Figure]

Enter loeb

I first heard about loeb from Dan Piponi’s From Löb’s Theorem to Spreadsheet Evaluation some years back, and ever since I’ve been wanting to use it for a real problem. It lets you easily define a spreadsheet generator by mapping over a functor containing functions: each function in the container is passed the container itself.
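The function is short enough to show in full. The sketch below is a commonly used variant that routes the result through a shared local binding so that cells are computed once; the three-cell sheet is a made-up example, and the names sheet and cells are hypothetical.

```haskell
-- Every function in the container receives the finished container as its
-- argument. The local binding 'go' makes all cells share one result.
loeb :: Functor f => f (f a -> a) -> f a
loeb fs = go
  where
    go = fmap ($ go) fs

-- A three-cell "spreadsheet": the second and third cells refer to earlier
-- cells by absolute position.
sheet :: [Int]
sheet = loeb [const 2, \cells -> cells !! 0 + 3, \cells -> cells !! 0 * cells !! 1]
```

Here sheet evaluates to [2,5,10]. Laziness is what makes this work: a cell is only computed when another cell, or the caller, asks for it.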

So as described in the elaboration of how I saw the problem in my head, the solution takes the vector of numbers, generates a spreadsheet of triples, each defined in terms of its neighbors (except at the edges), and then simply sums the third value of each triple, the volumes.
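The shape of such a solution can be sketched with nothing but base. This is an illustration under stated assumptions, not the code that was benchmarked: it uses plain lists and tuples where the text describes a vector and lens, indexing with !! makes it quadratic, and the names waterLoeb, Cell, leftOf and rightOf are hypothetical.

```haskell
loeb :: Functor f => f (f a -> a) -> f a
loeb fs = go
  where
    go = fmap ($ go) fs

-- One spreadsheet cell per column: (left peak, right peak, volume).
type Cell = (Int, Int, Int)

waterLoeb :: [Int] -> Int
waterLoeb hs = sum [v | (_, _, v) <- loeb (zipWith cell [0 ..] hs)]
  where
    n = length hs

    -- The cell for column i with height h, given the finished sheet.
    cell :: Int -> Int -> [Cell] -> Cell
    cell i h finished = (l, r, min l r - h)
      where
        -- Edges have no neighbor on one side, so that peak is just h.
        l | i == 0     = h
          | otherwise  = max h (leftOf (finished !! (i - 1)))
        r | i == n - 1 = h
          | otherwise  = max h (rightOf (finished !! (i + 1)))

    leftOf (x, _, _) = x
    rightOf (_, y, _) = y
```

Each left peak only ever looks further left and each right peak only further right, so the lazy evaluation bottoms out at the edges. For the Fig. 1 array, waterLoeb [2,5,1,2,3,4,7,7,6] evaluates to 10.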

It’s not the most efficient algorithm, and it relies on laziness in an almost perverse way, but I like that I was able to express exactly what occurred to me. And loeb is suave. It clocks in at a mean of 3.512758 ms for a vector of 10000 random elements. That’s not too bad next to the scanr/scanl version, which is roughly 27 times faster.

This was also my first use of lens, so that was fun. The cloneLens calls are required because you can’t pass in an arbitrary lens and then use it as both a setter and a getter; the type becomes fixed on one or the other, making it not really a lens anymore. I find that pretty disappointing. But otherwise the lenses made the code simpler.

Update with comonads & pointed lists

Michael Zuser pointed out another cool insight from Comonads and reading from the future (Dan Piponi’s blog is a treasure trove!): while loeb lets you look at the whole container, giving you absolute references, the equivalent corecursive fix on a comonad (wfix below) gives you relative references. Michael demonstrates below using Jeff Wheeler’s pointed list library and Edward Kmett’s comonad library:

I think if I’d heard of this before, this solution would’ve come to mind instead; it seems entirely natural!

Sadly, this is the slowest algorithm on the page. I’m not sure how to optimize it to be better.
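The relative-reference idea can also be illustrated without either library. The following is a minimal base-only sketch, not Michael’s code: the Zipper type and its extract, extend, duplicate, moveLeft and moveRight functions are hand-rolled stand-ins for what a pointed list with a comonad instance provides, wfix is written as extract w (extend wfix w), and waterComonad is a hypothetical name. It shares no work between cells, so it is slow, and it is meant only to show how each cell talks to its neighbors rather than to absolute positions.

```haskell
import Data.List (unfoldr)

-- A non-empty list with a focus: elements to the left (nearest first),
-- the focused element, and elements to the right.
data Zipper a = Zipper [a] a [a]

instance Functor Zipper where
  fmap f (Zipper ls x rs) = Zipper (map f ls) (f x) (map f rs)

extract :: Zipper a -> a
extract (Zipper _ x _) = x

moveLeft, moveRight :: Zipper a -> Maybe (Zipper a)
moveLeft (Zipper (l : ls) x rs) = Just (Zipper ls l (x : rs))
moveLeft _ = Nothing
moveRight (Zipper ls x (r : rs)) = Just (Zipper (x : ls) r rs)
moveRight _ = Nothing

-- Every refocusing of the zipper, itself arranged as a zipper.
duplicate :: Zipper a -> Zipper (Zipper a)
duplicate z = Zipper (walk moveLeft z) z (walk moveRight z)
  where
    walk step = unfoldr (fmap (\z' -> (z', z')) . step)

extend :: (Zipper a -> b) -> Zipper a -> Zipper b
extend f = fmap f . duplicate

-- Each function receives the zipper of results focused on its own position.
wfix :: Zipper (Zipper a -> a) -> a
wfix w = extract w (extend wfix w)

type Cell = (Int, Int, Int) -- (left peak, right peak, volume)

waterComonad :: [Int] -> Int
waterComonad [] = 0
waterComonad (h : hs) = sum [v | (_, _, v) <- toList results]
  where
    results = extend wfix (fmap cell (Zipper [] h hs))
    -- A cell refers to "the column to my left/right", not to an index.
    cell x z = (l, r, min l r - x)
      where
        l = maybe x (\z' -> max x (leftOf (extract z'))) (moveLeft z)
        r = maybe x (\z' -> max x (rightOf (extract z'))) (moveRight z)
    leftOf (a, _, _) = a
    rightOf (_, b, _) = b
    toList (Zipper ls x rs) = reverse ls ++ x : rs
```

For the Fig. 1 array, waterComonad [2,5,1,2,3,4,7,7,6] evaluates to 10. The edge cases fall out of moveLeft and moveRight returning Nothing, so no index comparisons are needed.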

Update on lens

Russell O’Connor gave me some hints for reducing the lens verbiage. First, eta-reducing the locally defined lens l in my code removes the need for the NoMonomorphismRestriction extension, so I’ve removed that. Second, a rank-N type can also be used, but then the type signature is rather large and I’m unable to reduce it presently without reading more of the lens library.