Finding closure on my summer vacation

If you were looking for a heart-warming story of coping with loss, I’m afraid you’ll have to look elsewhere. I’m not talking about that kind of closure but rather the programming kind.

When I was supposed to be enjoying good weather and good beer, I found myself thinking about functional programming instead. Now, that’s a big topic that I’ll let you explore on your own if you like, but one aspect of it that I thought I could apply to my work was closures. So, what’s a closure? An example would help:

def f(x):
    def g(y):
        return x + y
    return g

add2 = f(2)

Now, whenever I call add2 in my program, I can add two to something else, like this:

print("two plus two equals", add2(2))

Let’s walk through what happens when I call function “f”. You’ll see that the first thing f does is define function g. However, g uses (in its return statement) the value ‘x’ that was passed to f. Essentially, that value of x is bound to the function g that is defined. After the definition of g, the value of g (that is, the reference to the function) is passed back to the caller of f. At that point, I could delete f if I wanted to, since what I will really use is the result of f — a custom function.
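To make the walk-through concrete, here is a minimal sketch of the same toy example, assuming the inner function adds its own argument to the captured value. The name add10 is just an extra illustration. It shows that each returned function carries its own copy of x, and that deleting f afterwards does no harm:

```python
def f(x):
    def g(y):
        return x + y  # x comes from the enclosing call to f
    return g

add2 = f(2)
add10 = f(10)

# Each returned function keeps its captured x in a closure cell.
captured = add2.__closure__[0].cell_contents
print(captured)    # 2
print(add10(5))    # 15

# f itself is no longer needed; add2 keeps working after f is deleted.
del f
print(add2(2))     # 4
```

The `__closure__` attribute is only used here to peek inside; ordinary code never needs to touch it.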

OK, this is a toy example. What about something we can actually use? Well, in my latest commits to read.py, I’ve done just that. I added several iterator functions that are built using closures. Open it up and look around line 770. There you should see the definition of a closure “_f” that defines a function “g” that is returned to the caller. In this case, “g” is a generator function (because of the “yield” statement). When “_f” is called in lines 782-789, it is given an argument which is another function in the same module: one that returns a cursor over the corresponding object type (people, families, events, …). However, those functions take an argument, as all normal methods do, of the object instance (usually called ‘self’). So, the function “g” that “_f” defines takes that same argument and passes it on to the cursor function.
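If you don’t have read.py handy, the pattern looks roughly like the sketch below. This is not the actual code from read.py: the class FakeDb, its dictionaries, and the method names iter_person_handles and iter_family_handles are made up for illustration, and the “cursors” are plain iterators over a dict rather than real database cursors.

```python
class FakeDb:
    """Hypothetical stand-in for the database class; not the real read.py code."""

    def __init__(self):
        self.people = {"I0001": "Ann", "I0002": "Bob"}
        self.families = {"F0001": ("I0001", "I0002")}

    def get_person_cursor(self):
        # A real cursor would walk a database table; this one walks a dict.
        return iter(sorted(self.people.items()))

    def get_family_cursor(self):
        return iter(sorted(self.families.items()))

    def _f(cursor_func):
        # cursor_func is captured by g, the function that becomes the method.
        def g(self):
            for handle, _data in cursor_func(self):
                yield handle  # the yield makes g a generator function
        return g

    iter_person_handles = _f(get_person_cursor)
    iter_family_handles = _f(get_family_cursor)
    del _f  # tidy up: the helper is not meant to be a method


db = FakeDb()
person_handles = list(db.iter_person_handles())
family_handles = list(db.iter_family_handles())
print(person_handles)  # ['I0001', 'I0002']
print(family_handles)  # ['F0001']
```

Note that “_f” is called while the class body is still being executed, so it receives the plain function get_person_cursor, and “g” supplies ‘self’ later when the generated method is actually called.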

Starting at line 791, you should see another closure defined and used. I used the same name since the name of the closure function is not important (nor is the name of the function defined inside it). In fact, I could (and maybe should) delete “_f” when I’m done with it, just to be tidy.

So what’s the advantage of this approach? For one thing, it is succinct, and allows for a compact expression that you can see at a glance. In these cases, I’ve replaced 8 methods (plus a possible helper method) with one closure definition and a bunch of function calls. The common code is self-contained and leaks no attributes to the namespace around it (a good practice of functional programming).

I bet you’re wondering what the catch is. Well, there are a couple:

1. You can’t override the closure function or anything inside it. Since this happens at the definition of the class, nothing overridden in a derived class is available for use when the closure is called. That means that I cannot override “get_person_cursor” in a derived class and expect the override to be used when my closure is called.

2. Since you are creating a custom function with each call to the closure, you generate somewhat more objects at run time. In contrast, with the approach used in other methods in this module, I would define separate methods for each object type (like get_person_cursor) and call a helper function from within it (__get_cursor in that case). In CPython, at least, the cost is small: the body of the closed function is compiled once and every function the closure returns shares it, so each call adds only a new function object plus the values it captured. Whether the closed function is short (like these are) or long and involved, the compiled file does not grow with the number of calls.
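The first catch above is easy to reproduce. The sketch below uses made-up classes Base and Derived (nothing here is taken from read.py) and shows that the method built by the closure keeps calling the base-class cursor function even after a derived class overrides it. The helper “_late” is one possible workaround, offered as an assumption rather than a recommendation: it captures the method name instead of the function and looks it up on ‘self’ at call time.

```python
class Base:
    def get_person_cursor(self):
        return iter(["base-1", "base-2"])

    def _f(cursor_func):
        # Captures the function object that exists while Base is being defined.
        def g(self):
            for item in cursor_func(self):
                yield item
        return g

    def _late(name):
        # Hypothetical alternative: capture only the name, resolve it per call.
        def g(self):
            for item in getattr(self, name)():
                yield item
        return g

    iter_people = _f(get_person_cursor)
    iter_people_late = _late("get_person_cursor")
    del _f, _late


class Derived(Base):
    def get_person_cursor(self):
        return iter(["derived-1"])


d = Derived()
early = list(d.iter_people())       # override ignored
late = list(d.iter_people_late())   # override honoured
print(early)  # ['base-1', 'base-2']
print(late)   # ['derived-1']
```

The late-binding version costs an attribute lookup on every call, which may or may not matter for an iterator that is about to walk a whole table.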

Where to next? For the moment, I’d just like to leave it out there and see what happens. Look for breakages, etc. Down the road, this is just one more tool to help us build good code. Within the domain of functional programming, there are lots more.