There has been a great deal of interest in closures lately, driven in great part by the fact that there is talk of adding some form of anonymous functions to Java. Most of the time, people talk about “adding closures” to Java, and that prompts a flurry of questions of the form “what is a closure and why should I care?”

The discussion around closures tends to go on and on about the “closing over” of free variables and only lightly touch on the biggest change to Java: functions as first-class objects with a lightweight syntax for creating them. Making it easy to do something basic like define a new function is more than just a little syntactic sugar: it makes it easy to do new things with functions that were impractical when you needed a lot of boilerplate to make anything work.

Without understanding functional programming, you can’t invent MapReduce, the algorithm that makes Google so massively scalable.

I’m going to try to explain first class functions using Ruby. (It is possible to write code that does exactly the same thing using the current Java feature set; however, the result is so wordy that it obscures the basic idea being presented: call it accidental complexity, or perhaps yellow code.)

Ruby is a good language for demonstrating features that ought to be in Java. Like Java, Ruby uses squiggly brace syntax. Like Java, everything in Ruby is an object—whoops, Java has primitives. Okay, like Java, functions are represented as objects.

(The Java convention is to name things in lowerCamelCase, but we’ll ignore that. If you need to print this essay on a dot-matrix printer you may want to make some changes first.)

In Ruby you write the function as:

add_two_integers = lambda { |a,b| a + b }

Later on, when you want to call your function in Java, you write:

add_two_integers.call(35, 42);

And if you like semicolons, you write the exact same thing in Ruby:

add_two_integers.call(35, 42);

You can do the same thing with multiplication:

multiply_two_integers = lambda { |a,b| a * b }

First Class Functions

In the examples above, functions look a little like methods. The Java version is obviously implemented as a method. But what we did in both cases was assign the resulting function to a variable. In Java, assigning a method to a variable is not particularly easy (it is possible using reflection).

Anything that can be assigned to a variable is a value. If it can also be passed as a parameter or returned from a method (or function), we say it is a first class value. Functions as first class values, or first class functions, are very interesting. For example, what can we do passing a function as a parameter to another function?

Hmmm. Well, I am breaking a cardinal rule of selling something. We’re talking about shiny new toys without identifying a problem to be solved. Let’s talk about my favourite problem: writing the same thing more than once, violating the DRY principle.
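To make the duplication concrete, here is a minimal sketch of two such functions. The names adder and multiplier come from the discussion that follows; the bodies are my assumption about their general shape, not a definitive listing.

```ruby
# Hypothetical sketch: two functions that differ only in the operation
# they apply and in the value they start from.
adder = lambda { |list|
  acc = 0
  list.each { |n| acc = acc + n }
  acc
}

multiplier = lambda { |list|
  acc = 1
  list.each { |n| acc = acc * n }
  acc
}

sum     = adder.call([1, 2, 3, 4, 5])       # 15
product = multiplier.call([1, 2, 3, 4, 5])  # 120
```

Everything except the operator and the starting value is repeated verbatim.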

What do they both do? Pretty much the same thing: they accumulate the result of some binary operation over a list of values. adder accumulates addition, and multiplier accumulates multiplication. You could call this a “Design Pattern.” If you did that, you would use the exact chunk of code everywhere. I would call that retrograde. Didn’t our predecessors invent the subroutine so we could eliminate writing the exact same piece of code over and over again?

Why can’t we do the same thing? Well, we can. A subroutine does the same thing over and over again, but it takes different parameters as it goes. What is different between adder and multiplier? Ah yes, the adding and multiplying. Functions. What we want is a function that takes a function as a parameter.

Well, we said that with first-class functions, functions are values and can be passed as parameters. Let’s try it:
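A minimal sketch of what such a function could look like. The name folder is used in the text below; the parameter order (function, initial value, list) is an assumption made for illustration.

```ruby
add_two_integers      = lambda { |a, b| a + b }
multiply_two_integers = lambda { |a, b| a * b }

# Hypothetical folder: accumulates binary_function over list,
# starting from initial_value.
folder = lambda { |binary_function, initial_value, list|
  acc = initial_value
  list.each { |n| acc = binary_function.call(acc, n) }
  acc
}

sum     = folder.call(add_two_integers, 0, [1, 2, 3, 4, 5])       # 15
product = folder.call(multiply_two_integers, 1, [1, 2, 3, 4, 5])  # 120
```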

This is much better. When functions can take functions as parameters, we can build abstractions like folder and save ourselves a lot of code. Note that this would be a lot harder to read if we had to surround all of our functions with Object boilerplate in Java. That’s one of the key reasons why ‘syntactic sugar’—making it brief—is a big win.

And you know what? Functions are values, not just variables that happen to hold functions. These work just as well:
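For instance, assuming a folder that takes a function, an initial value, and a list (a sketch, not a definitive implementation), the lambdas can be written inline without ever being given names:

```ruby
# Hypothetical folder, written here with inject to keep the sketch short.
folder = lambda { |binary_function, initial_value, list|
  list.inject(initial_value) { |acc, n| binary_function.call(acc, n) }
}

# The function arguments are anonymous values, created on the spot.
sum     = folder.call(lambda { |a, b| a + b }, 0, [1, 2, 3, 4, 5])  # 15
product = folder.call(lambda { |a, b| a * b }, 1, [1, 2, 3, 4, 5])  # 120
```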

There’s just one problem (actually two, but I’m saving one for later): everywhere you use our new folder function, you need to remember that add_two_integers needs a default value of zero, but multiply_two_integers needs a default value of one. That’s bad. Sooner or later you will get this wrong.

What we need is a way to call folder without having to always remember the correct initial value. Should we extend our understanding of a function to include a default initial value for folding? If we’re thinking in Java, maybe our IFromIAndI interface needs a getDefaultFoldValue? I think not. Why should a function know anything about how it’s used? And besides, as we build other abstractions out of functions we’ll need more stuff.

If we aren’t careful, we’ll end up implementing the Visitor pattern on functions, and all of our brevity will go out the window. No, what we want is this: in one place we define that folding addition starts with a default value of zero and in another place we say we want to fold, say, [1, 2, 3, 4, 5] with addition. Then when we want to fold something else with addition, like [2, 4, 6, 8, 10], we shouldn’t have to say anything about zero again.

Adding Curry

What we need is a function that folds addition. Didn’t we say that functions are values that can be returned from functions? How about a function that makes a folding function? We should pass it our initial value and our binary function, and it should return a function that performs the fold without needing an initial value as a parameter:
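Here is a sketch of such a function-making function. The names fold_with_acc, binary_function, and default_value are the ones this essay refers to later; the recursive body and the names fold_addition and fold_multiplication are assumptions for illustration.

```ruby
# Hypothetical folder: takes a default value and a binary function,
# and returns a function that folds a list.
folder = lambda { |default_value, binary_function|
  # fold_with_acc refers to itself and to binary_function,
  # both of which live in the enclosing scope.
  fold_with_acc = lambda { |acc, list|
    if list.empty?
      acc
    else
      fold_with_acc.call(binary_function.call(acc, list.first), list[1..-1])
    end
  }
  lambda { |list| fold_with_acc.call(default_value, list) }
}

add_two_integers      = lambda { |a, b| a + b }
multiply_two_integers = lambda { |a, b| a * b }

# The initial value is stated once, where the folding function is made.
fold_addition       = folder.call(0, add_two_integers)
fold_multiplication = folder.call(1, multiply_two_integers)

first_sum  = fold_addition.call([1, 2, 3, 4, 5])        # 15
second_sum = fold_addition.call([2, 4, 6, 8, 10])       # 30
product    = fold_multiplication.call([1, 2, 3, 4, 5])  # 120
```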

Actually, there’s a far simpler way to avoid having to remember the default value when you want to fold over addition. But let’s just play along so that we don’t have to come up with an entirely new set of examples to demonstrate the value of functions as first-class values.

Functional programmers (as opposed to the rest of us dysfunctional programmers) will recognize this as currying our folder function. Currying is when a function takes more than one parameter and you combine one of the parameters and the function to produce a function that takes fewer parameters.

Here’s a currying function in Ruby:

curry = lambda { |fn,*a| lambda { |*b| fn.call(*(a + b)) }}

(This is an improvement on an earlier version, thanks to Justin's comment.)

So you can use our new function to create an increment function out of our adder and a treble function out of our multiplier:
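A sketch of how that might look, assuming “adder” and “multiplier” here mean the add_two_integers and multiply_two_integers lambdas defined at the start:

```ruby
curry = lambda { |fn, *a| lambda { |*b| fn.call(*(a + b)) } }

add_two_integers      = lambda { |a, b| a + b }
multiply_two_integers = lambda { |a, b| a * b }

# Fix the first argument of each binary function.
increment = curry.call(add_two_integers, 1)
treble    = curry.call(multiply_two_integers, 3)

incremented = increment.call(41)  # 42
trebled     = treble.call(14)     # 42
```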

If you are ever asked, “what good is currying?,” I hope I’ve given you an example you can use to explain why currying matters, and why people do it all the time (possibly without explicitly naming it). Although it doesn’t look like much when looking at trivial examples like functions that multiply by three, it’s much more useful when creating folders and mappers where you want some of the parameters to remain constant.

Composition

Our examples combined functions and non-functions to create new functions. Here’s an example from a recent post, Don’t Overthink FizzBuzz, where I give a method for composing two functions. The idea is that if you have multiple functions that each take one argument, you can combine them using compose. I also have a method that generates functions, carbonation:
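As a rough sketch of the idea (an assumption about the general shape of the code, not the listing from that post), compose and carbonation could be written like this. The counter i and the parameter n are illustrative names.

```ruby
# Sketch: compose any number of one-argument functions.
# The right-most function is applied first.
def compose(*fns)
  lambda { |x| fns.reverse.inject(x) { |acc, fn| fn.call(acc) } }
end

# Sketch: returns a function that replaces every modulus-th element it is
# handed with printable_form and passes every other element through.
# Each generated function keeps its own counter i in a closure.
def carbonation(modulus, printable_form)
  i = 0
  lambda { |n|
    i = (i + 1) % modulus
    i == 0 ? printable_form : n
  }
end

fizzbuzz = compose(
  carbonation(15, 'FizzBuzz'),
  carbonation(5, 'Buzz'),
  carbonation(3, 'Fizz'))

result = (1..100).map { |n| fizzbuzz.call(n) }
puts result.join(' ')
```

Because each generated function counts calls rather than inspecting the value, this sketch only works when every element of the list passes through every function exactly once, in order.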

The simple explanation of how it works is that carbonation generates functions that replace every so many elements of a list with a printable string. Compose composes any two or more methods together. So if you want to print out 100 numbers, but replace every third number with “Fizz,” every fifth with “Buzz,” and all those that are third and fifth with “FizzBuzz,” you generate a function for each replacement, compose them together with compose, and then map the numbers from one to one hundred to the resulting überfunction.

When you look at this today, it seems weird and unreadable by Java standards. I wonder if adding first-class functions with simple syntax to Java will lead the Java community to a place where code like this will not appear out of place?

Just one more thing

So we started by saying that people are getting hung up on what makes a closure a closure, and there has been less emphasis on the benefits of using functions as first-class values. Did you notice that our folder function actually includes a non-trivial closure?

If you look at the fold_with_acc function, it makes use of binary_function, a variable from its enclosing lexical scope. This is not possible with the current version of Java: if you translate this to Java, when you make fold_with_acc an anonymous inner class, you will have to copy binary_function into a final member to use it. It simply won’t compile if you try an idiom-for-idiom translation, even adding explicit types.

And then if you look at the anonymous function it returns, lambda { |list| fold_with_acc.call(default_value, list) }, that anonymous function uses default_value, another variable from the enclosing lexical scope. Once again you will have to fool around with final variables to make this work, or perhaps declare a full-fledged object with constructors.

(If you try writing this simple example out in Java, you quickly find yourself inventing a lot of classes or interfaces. And they have some complicated types, like a function taking an integer and a function taking two integers, returning a function taking a list of integers and returning an integer.

After twenty minutes of that, you understand why the ML and Haskell communities use type inference: If the types are that complicated, it’s incredibly helpful to have the compiler check them for you. Yet if the types are that verbose, it’s incredibly painful to write them out by hand. Even if your IDE were to write them for you, they take up half the code, obscuring the meaning.

You also get why the Ruby on Rails community doesn’t care about type checking: types for CRUD applications are way less complicated than types for first-class functional programs.)

That’s Interesting

Part of the interest in closures is in simplifying the syntax around functions, and part of the interest is in the way that access to enclosing scope would simplify a lot of code. There’s a whole debate around the value of simplification in a world where all serious languages are Turing Equivalent.

I hope you’re convinced, by now, that programming languages with first-class functions let you find more opportunities for abstraction, which means your code is smaller, tighter, more reusable, and more scalable.

For me, simpler is just nicer until something reaches a certain tipping point: when it becomes so simple that the accidental complexity of using it goes away, I will start using it without thinking about it. Tail optimization is like that: as long as recursion is slower than iteration and sometimes breaks, I have to think about it too much. But when I’m not burdened with “except…” and “when performance is not a factor…” it becomes natural.

And then something interesting happens. It changes the way I look at problems, and one day I see a whole new way to do something that I never saw before. Functions as first class values are definitely one of those things that change everything.

Further Reading

If this has whetted your appetite for more, Structure and Interpretation of Computer Programs is the book on higher-order functions and how they can be used as building blocks to create more elaborate abstractions such as object-oriented programming.

The Seasoned Schemer devotes an entire book to the uses of functions. Although the examples are in Scheme, the language is dead simple to learn and the techniques in the book can be applied to Ruby and Java (or at least to a future version of Java where you do not need functors).

The second edition of Programming Ruby is an indispensable guide. Even if you will not be using Ruby immediately, pick it up and discover why so many people are lauding the language's simple, clean design and powerful Lisp-like underpinnings.

“Why would you want to do that? Because that way your code is more flexible and more reusable. Instead of writing ten similar functions, you write a general pattern or framework that can generate the functions you want; then you generate just the functions you need according to the pattern. The program doesn’t need to know in advance which functions are necessary; it can generate them as needed. Instead of writing the complete program yourself, you get the computer to write it for you.”

It’s worth reading even if you have no intention of using Perl: the ideas span languages, just as SICP is worth reading even if you don’t use Scheme at work. And be sure to read Higher-Order JavaScript and Higher-Order Ruby. They translate the ideas of Higher-Order Perl to other languages.

From Functional to Object-Oriented Programming: “OO allows a traceable connection between the conceptual design level and the implementation level. Concepts have names, so you can talk about them, between programmers and architects.”

HOF or OOP? Yes!: “First-class functions are a natural fit with OO, as evidenced by their presence in OO languages that aren’t glorified PDP-11 assemblers with some OO stuff bolted on the side.”

But Y would I want to do a thing like this?: “To truly learn a new tool, you must not just learn the new tool, you must apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.”

I have no idea how you manage to post so often. Stuck in a .Net world, teaching my teammates OOP and the value of unit testing, your posts about dynamic languages, functional programming, and solving hard problems are a welcome break. Thanks.

I've been reading SICP on-line, and watching the 1981 lectures. I'll probably bring some SICP with me when I go to "Essential ASP.Net 2.0" training in a few weeks.

I've actually been thinking about this post for a long while, ever since I reposted my "closures in ruby" post and recognized how lame it was.

Then, when I posted my FizzBuzz solution, I wanted to expand that into an explanation of how it worked. But I realized that would have strayed from the core message of being careful about taking such a test and administering it, so I saved my notes from that post and used them here.

I guess I really only have one or two posts, and what I actually do is expound on different aspects of those same, tired ideas over and over again.

Some of your posts are good. I like most of them, but could you do me a favour and not spread lies such as MapReduce being a functional construct? It's a flow based construct. The functional version is extremely limited and actually flies in the face of functional programming most of the time by requiring the use of I/O, thus breaking the functional paradigm.

MapReduce is an argument against functional programming, not for it. To be fair, the majority seems to not understand this just to promote functional programming.

Some of your posts are good. I like most of them, but could you do me a favour and...

Actually, no. This musician does not take requests. But the nice thing is this: you can write your own blog, and you can add comments to my blog.

I like it best, quite frankly, when people do both: they write a post on their own blog with their own perspective and drop a comment on my blog with a link so that readers can see what they have to say. Everybody wins!

Now about MapReduce: I didn't say anything, so perhaps you want to talk to Joel. I quoted him saying, "Without understanding functional programming, you can’t invent MapReduce."

That is probably true. Let's start with the fact that MapReduce is based on the mapping and folding operations. They came from functional programming.

It's true that the parallelization comes from data flow programming, but Joel's statement is that understanding map and reduce (which is transitive folding) is a necessary precondition to inventing MapReduce.

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.

Note: there are two pieces, a programming model and an implementation. I think the programming model is functional: the phrases "map function" and "reduce function" are highly suggestive, don't you agree?
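To illustrate the programming-model half only, here is a toy, single-process sketch of a word count. The names map_fn and reduce_fn and the sample documents are hypothetical, and nothing here reflects the actual implementation: there is no parallelism or distribution.

```ruby
# Toy illustration of the programming model only.
# map_fn turns one input key/value pair into intermediate key/value pairs.
map_fn = lambda { |_name, text| text.split.map { |word| [word, 1] } }

# reduce_fn merges all intermediate values that share one key.
reduce_fn = lambda { |_word, counts| counts.inject(0) { |sum, c| sum + c } }

documents = { 'a.txt' => 'to be or not to be', 'b.txt' => 'to do' }

# Map phase: every document yields a list of [word, 1] pairs.
intermediate = documents.flat_map { |name, text| map_fn.call(name, text) }

# Group the intermediate pairs by key, then reduce each group.
grouped = intermediate.group_by { |word, _count| word }
word_counts = {}
grouped.each { |word, pairs|
  word_counts[word] = reduce_fn.call(word, pairs.map { |_word, count| count })
}
```

The user supplies only the two functions; the grouping in the middle is the part the framework takes over.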

The only problem I have with this explanation is that it has the same problem most closure explanations have: It doesn't do something fundamentally hard to do without closures, so you end up with people wondering why closures are a better solution than "unrolling" the closure. (Not quite the right idea, but close.)

I wrote Practical Closures to try to fix that gap, but you could easily "steal" the ideas for this or a future post. :) I tried to find a minimal closure example that is also still very hard without closures.

I think this post comes closer than most (which miss this by a mile), but I still think you'll end up with a lot of programmers wondering why they'd want to compose three functions for FizzBuzz when they could write it all out in a relatively simple "if" statement. (I understand that the idea extends, but based on reactions I've seen, including when I try to teach this to my coworkers, I don't think people tend to get it without an example that they basically can't "unroll".)

Not that your colleagues (or mine!) are nasty brutes hiding under bridges, but all of this is moot if you don't have someone identifying an urgent problem they need solved.

I said: when it becomes so simple that the accidental complexity of using it goes away, I will start using it without thinking about it... And then something interesting happens. It changes the way I look at problems, and one day I see a whole new way to do something that I never saw before.

That is not a sentiment that will ever appeal to the "just show me why what I'm doing doesn't work" crowd. They don't have a problem to solve.

If you absolutely must try to "sell" them on what I would call basic knowledge of programming, I advise waiting patiently until you spot an opportunity "in the wild," a chance to refactor some of their code.

A well-timed "Did you ever consider re-architecting the class hierarchy so that we have a hierarchy of Functors operating on a few base object classes rather than this hotch-potch of Visitors and interfaces?" might do a lot more than any example code to spur fresh thinking.

At the end of the day, some people operate with minds that are open by default, some operate with minds that are closed by default.

I'm sure you can understand that I am not dismissive of the latter, even though I choose to write for the former.

Thank you for spurring more thought on this.

Working on improving my own skills and the skills of my colleagues is important, and your question has helped me examine my approach and assumptions.

The folding operation is hard. But for any given (simple, non-composite) fold, you can write a for loop.

It is true that it is a qualitative difference to be able to write a fold, and compose functions together, etc. But you have to get past that first step to get to the point where you understand it. By the time you've written an example that is really advantageous over a simple for loop, the example is too complicated to be followed.

I'm not sure "I can do that with a for loop" really qualifies as "trolling"; in some sense, it's a perfectly valid objection. I think it's acceptable to ask that a "new" feature actually do something useful and not merely be a respelling of an existing feature, and it's incumbent upon the person pushing the "new" feature to explain why it really is better than this other straightforward thing over there.

(Part of the source of this particular example was exactly a real-world solution. We were outputting something in PDF and HTML, and despite the radically different paradigms for creating a PDF (command-based with our library) and HTML (string-concatenation, ultimately), the closures solved a problem nicely that no amount of for-loops or OO design could have. I ended up cutting a discussion of this since it seemed redundant.)

Also, while that example may have come from some experiences at work, the post is intended for the web at large. Your "Don't feed the trolls" stuff is great for selling specific people or groups, but when you're targeting "people in general", dealing with specific objections and advantages is all you have, so far as I can see.

it's incumbent upon the person pushing the "new" feature to explain why it really is better than this other straightforward thing over there.

First, I'm not "pushing" anything. What do I care what other people think? Or how they go about their programming careers? Why should I "push" someone out of their comfort zone?

My weblog is a recitation of things that interest me and that I have found useful.

Functional programming is not new. Here is this thing that is something like fifty years old, is accepted by virtually everyone who takes the trouble to learn how it works as being extremely useful, yet the programming community in the large has trouble accepting its value.

Sure, there is one argument that maybe it isn't really useful.

There's another argument that programming has some cultural issues where the most useful things are not always the most popular.

I have accepted the latter explanation. So what is the point of trying to prove it is useful?

I have no evidence that programmers in general have any inclination to even try new ideas or tools (although they will spend all day explaining why what they are doing is expedient).

In the end, it will come down to the "open vs. closed mind" bifurcation.

Someone with an open mind looks at something and immediately wonders "how can I use this to my advantage?"

Someone with a closed mind looks at something and wonders "what are all of the ways this thing is inferior to what I already know?"

I'm all for finding better ways to share ideas with the former group, but I'm not so confident in my understanding of programming or my writing skill to think I can have any impact whatsoever on the latter type of person.

The purpose of my post was to explain what closures and first-class functions are. In my mind I was imagining a reader who has only the most cursory exposure and wonders what all the fuss is about.

My hope is that such a person might read this and say "Ok, I get what this is, I get what Java programs might look like if this is added to the language."

If I'm really lucky, perhaps such a reader might one day be looking at a problem and think "Hey, this looks familiar, maybe I can write a method for this class that returns a function..."

I would personally consider this essay a win if just one such person emails me one day to say "thanks, here's how I used what I read."

I do appreciate your pushing me to make my writing even better. I think the problem here is that I may have too modest a goal with this blog.

In many ways I have given up on certain types of evangelism.

I find it very depressing to talk to a certain type of person about this programming thing that I find such a source of daily joy and wonder.

But I find it uplifting and inspiring to talk to another type of person, the one who says, "Yes, I had that same experience, and I did use it this one time at work, and that was fantastic." They aren't a 100% academic or a dreamer, but they haven't closed all avenues of learning down.

I understand what you are saying. I suppose "push" is the wrong word; I should have said "explain".

I've mostly given up on overt evangelism as well; I've realized (after many years) that my weblog is more for the overflow of things that I can't necessarily talk about with anybody in "real life". (Right now I'm short on co-workers; not jobless, fortunately, just short on coworkers.)

But hopefully sharing will still help somebody.

Actually, in a series of posts I hope to start here in the next couple of weeks (got some technical groundwork on my blog to lay down first), I'm going to start by explaining my belief that you really can't teach anybody anything about programming. You can only sensitize them to problems that they may encounter in the future, and give them somewhere they can start thinking about the problem. But this alone can be valuable; I sure wish I had more blogs around when I was in school! The few there were helped me a lot.

It's hard to sell closures and first-class functions to programmers who haven't already used them. (Heaven knows, I've tried.) It's not that these programmers are too close-minded to "get" closures (although some are), but rather that the human imagination is too limited to render the (very pleasant) realities of programming with a "new" tool as powerful as closures without filling in the gaps with what it already knows. As a result, closures and first-class functions end up being imagined as syntax-sweetened anonymous inner classes and thus lose many of their real benefits. Why do I need another syntax for anonymous inner classes? the OO programmer wonders – and then moves on.

The best way to demonstrate the benefits of closures and first-class functions is to have programmers use them for a few months. Then try to take these new tools away. When the programmers reach for your throat, you can be confident that they "got" the benefits. As they go for the strangle-hold, you can shout, "See! I told you you would like them!" ;-)

Admittedly, such a demonstration doesn't fit into the blog format very well.

As a new Ruby user and an old-school OOP developer the concept of closures remains one of the hardest ideas for me to grasp with functional programming. I look at the examples you provide and immediately, innately convert them to methods in a class in my head.

Thanks for the explanation though, it gives me much to study (along with Jeremy Bowers' post) and hopefully at some point it'll leap from 'academic exercise' to 'practical tool'.

I completely agree that "simpler is just nicer until something reaches a certain tipping point". I don't often make use of new programming techniques as soon as I learn them because I'm already accustomed to solving problems using different idioms. However, once I stumble on something that works really nicely with my fancy new construct, I start to see uses for it everywhere.

I noticed that you mention your curry function only works on binary functions due to a limitation in the way Ruby parses arguments passed to lambdas. Have you considered using the 'splat' operator? Including a splat before the final parameter to a function will cause it to soak up any number of remaining arguments into an array (which will be empty if there are none). Unless I misunderstand your point, you could use splat to create a curry function that takes functions of arbitrary arity: (I apologize in advance for the contrived example)

"Actually, no. This musician does not take requests." - reginald braithwaite

Wow! I never thought I'd see the day where someone was a troll on his own blog. I was just asking not to spread lies. If lies are your thing, then I'll know what to expect from now on.

BTW, writing a blog entry is utterly pointless unless you're fishing for links. The response belongs here. I never said you were the source of the lie. I said you're spreading it. I also never said anything about parallelism although that's a good point too. It's about functionality. The concept of MapReduce comes from flow based programming, not functional. You don't need functional programming to understand it either.

I'm following your proposal to take the discussion to my own blog... Only that much here: Going back 20 years to SICP is not moving Java forward. Today, closures are great not because you can iterate over lists and add integers, but because they allow better domain-specific modelling (where a DSL might be an example of "can use without thinking", as you say). And maybe closures allow you to whip up a write-only application faster - this may be relevant for RoR-style webapps.

the name refers to 'fizz' while 'fizz' is only an argument. ;-) At that level of abstraction, it is highly ironic. I guess it is on purpose but it does not serve well the readability cause.

So.. it comes down to having a sense of whimsey. Guilty as charged. A colleague just showed me some production code with a method called "hydrate" that turned a Java interface into an anonymous object.

I had to read the code to figure out what it did, the name didn't immediately suggest its purpose.

But you know... anything involving reflection and proxying almost always must be read to be understood, so a name like "doInstantiateAnonymousPOJOFromInterface" would not have been that helpful.

it is not very clear what 'n' and 'i' represent (ubiquity ?)

This is simultaneously a well-understood idiom and an anti-pattern. Some people hate Fortran-like names, others do not.

Do you think it would be more readable if we chose names like "numberForKeepingTrackOfOurPositionInTheList" and "nameOfTheItemInTheList"?

If so, grep is your friend. The code will do the same thing either way.

the function seems to be coupled to compose. I mean without 'compose' I do not believe you would have written carbonation that way.

The function is actually coupled to the premise that you might want to carbonate lists that are not consecutive integers.

For example, roman numerals, or words written out like "three", "four."

Is this a good idea? Of course not. The title of the post is, after all, "Don't Overthink Fizzbuzz." I was deliberately trying to think of a plausible but inadvisable example of overgeneralizing a program.

In this case, we have made it more complex than necessary to accommodate variations involving lists that are not integers and other moduli such as seven or eleven.