Clojure — Treat code as data [Pirates of the JVM]

When we launched the Pirates of the JVM series, we promised we would put the spotlight on each and every programming language in the JVM universe, so here we go. Next stop: Clojure. We talked to Mark Engelberg, Clojure trainer, about the advantages and disadvantages of this language, its core principles and more.

We are sailing across the Functional Ocean and you should be able to see Clojure clearly now. The Functional Ocean has a lot of interesting attractions such as Frege, Erjang, Eta, Lux and Clojure. The latter is a dynamic, general-purpose programming language, which combines the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming. In short, it treats code as data.

Clojure — Background information

JAXenter: What was your motivation for working with Clojure? What does Clojure have to offer that other languages fail to provide?

Mark Engelberg: I was drawn to Clojure for three reasons:

First, Clojure comes with a rich assortment of immutable (also known as persistent) data structures woven into the fabric of the language: lists, lazy lists, vectors (similar to arrays), associative maps (similar to hash tables), sets, sorted maps, sorted sets, and queues. This means that updating data creates a new copy which efficiently shares unchanged information with the source data, rather than destructively changing the original. This makes complex programs vastly easier to write and reason about, simplifies equality, and allows any object to be used safely as the key in a map.
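As a small illustration of these two properties (a sketch only; the names `base`, `extended`, and `lookup` are made up for this example and do not come from the interview):

```clojure
;; "Updating" a map or vector returns a new value; the original is untouched.
(def base {:name "Jim" :langs ["Java"]})
(def extended (update base :langs conj "Clojure"))

base      ;; still {:name "Jim", :langs ["Java"]}
extended  ;; {:name "Jim", :langs ["Java" "Clojure"]}

;; Equality is by value, so collections can safely serve as map keys.
(def lookup {[1 2] :point, #{:a :b} :pair})
(get lookup [1 2])     ;; :point
(get lookup #{:b :a})  ;; :pair (sets compare by content, not order)
```

Because `base` can never change underneath `extended`, the two values can share structure internally without either one being able to observe the other.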

Second, Clojure is a dynamically typed language – one of the fastest, most performant dynamically-typed languages around. I’ve always felt the most productive when programming in dynamic languages. Dynamic languages tend to have less boilerplate, less ceremony, and make it easier to adapt to changing requirements throughout the lifespan of a project. Also, in dynamic languages, it is generally easier to create, exchange, and consume data “over the wire” with other languages and other systems. Clojure offers type hinting and specs (similar to schemas) to address some of the performance and type safety concerns that come with dynamic typing.
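A rough sketch of both features, assuming Clojure 1.9 or later, where spec ships as the `clojure.spec.alpha` namespace. The function `char-count` and the spec names `::name`, `::age`, and `::person` are invented for this example:

```clojure
(require '[clojure.spec.alpha :as s])

;; A type hint lets the compiler emit a direct method call instead of
;; looking the method up by reflection at runtime.
(defn char-count [^String text]
  (.length text))

;; Specs describe the shape of data and can be checked at runtime.
(s/def ::name string?)
(s/def ::age nat-int?)
(s/def ::person (s/keys :req-un [::name ::age]))

(char-count "Clojure")                       ;; 7
(s/valid? ::person {:name "Jim" :age 25})    ;; true
(s/valid? ::person {:name "Jim" :age "25"})  ;; false, :age is not a number
```

Both are opt-in: code without hints or specs still runs, so they can be added only where performance or validation matters.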

Third, Clojure’s emphasis on immutability really shines when it comes to writing concurrent programs. A stateful entity can be modeled as immutable data inside a mutable box. You can look in the box safely at any time (non-blocking reads) and be sure that the data inside is in a consistent state. Updating the stateful entity is just a matter of swapping out the contents of the box. Clojure provides a wide variety of constructs with well-defined concurrency semantics so you can choose the option that most perfectly suits your needs, such as atoms, agents, volatiles, refs, futures, promises, and channels. You choose based on the level of coordination you need, whether you want synchronous or asynchronous behavior, and whether you need a way for consumers to apply backpressure which limits producers. Picking and using the right “box” for your data is far simpler than traditional, manual lock-based approaches to concurrency.
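The "immutable data inside a mutable box" idea can be sketched with an atom, the simplest of these constructs. The names `account` and `counter` are illustrative, and plain Java threads are used here only to keep the example self-contained:

```clojure
;; Immutable data inside a mutable box: an atom holding a map.
(def account (atom {:owner "Jim" :balance 100}))

;; Reading never blocks and always sees a consistent value.
(:balance @account)   ;; 100

;; Updating swaps the old value for the result of a pure function.
(swap! account update :balance + 50)

;; Eight threads increment one counter. swap! retries on contention,
;; so no increment is lost and no explicit lock is needed.
(def counter (atom 0))
(let [workers (vec (repeatedly 8 #(Thread. (fn [] (dotimes [_ 1000]
                                                    (swap! counter inc))))))]
  (run! (fn [^Thread t] (.start t)) workers)
  (run! (fn [^Thread t] (.join t)) workers))

@counter              ;; 8000
(:balance @account)   ;; 150
```

An atom fits uncoordinated, synchronous updates to a single value; refs, agents, or channels would be the choice when several values must change together or when work should happen asynchronously.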

Although it wasn’t part of the initial draw of Clojure for me, I have come to really appreciate Clojure’s exceptional ability to interoperate with Java libraries, and to compile to JavaScript with the ClojureScript compiler so that code can be shared across both client and server.
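For readers who have not seen Java interop from Clojure, a brief sketch using only standard JDK classes (the name `fruits` is made up for this example):

```clojure
(import '(java.util ArrayList Collections))

;; Instance method call on a Java String.
(.toUpperCase "clojure")   ;; "CLOJURE"

;; Static method call.
(Math/max 3 7)             ;; 7

;; Build and mutate a Java collection, then sort it with a static utility.
(def fruits (doto (ArrayList.)
              (.add "pear")
              (.add "apple")
              (.add "fig")))
(Collections/sort fruits)

;; Java collections can be read with Clojure's own sequence functions.
(vec fruits)               ;; ["apple" "fig" "pear"]
```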

JAXenter: Can you describe the core principles of the language?

Mark Engelberg: The most important core principles, as I discussed above, are immutability and dynamic typing. Another core principle is the idea of REPL-driven development, i.e., developing and testing your code interactively. Let’s look at all this in action. Here’s a Clojure function that increments the age of any kind of data that has an age field:

(defn increment-age [data]
  (update data :age inc))

But more importantly, let’s try this function out at the REPL:

=> (increment-age {:name "Jim", :age 25})
{:name "Jim", :age 26}

One key thing to note here is that Clojure eschews encapsulating data behind objects with private fields, getters and setters. There’s little benefit to hiding information when consumers can’t destructively change it. Most data can be represented as a plain associative map of key-value pairs (similar to JSON), and Clojure provides us with a uniform API for retrieving and updating this data.

Note that calling increment-age does not change the map passed to it: the function returns a new map and leaves the original intact. Our function increment-age is, therefore, a pure function which will always return the same output for a given input, making it much easier to test and be sure it is correct. In Java, only some types of data (e.g., numbers and strings) can be manipulated safely and non-destructively like this. In Clojure, all data has this nice property.
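To see this directly, one can bind the map to a name and inspect it after the call. In this sketch `p` holds the original map and `older` is an illustrative name for the result:

```clojure
(defn increment-age [data]
  (update data :age inc))

(def p {:name "Jim", :age 25})
(def older (increment-age p))

older  ;; {:name "Jim", :age 26}
p      ;; {:name "Jim", :age 25}, unchanged by the call

;; The same function works on any map that has an :age key.
(increment-age {:species "cat" :age 3})  ;; {:species "cat", :age 4}
```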

The key to understanding Clojure

JAXenter: What would a typical program with Clojure look like?

Mark Engelberg: The key thing to understand in order to read Clojure programs is that it is in the Lisp family of languages and uses prefix notation, meaning that the function always comes first in a function application, surrounded by parens. Instead of `f(x)` for function application, we say `(f x)`. Instead of `2+3` we say `(+ 2 3)`. Parens are also used to express lists, and this duality between code and data makes it easy to use meta-programming facilities like macros and eval.
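A short sketch of that duality between code and data. The macro `unless` is a hypothetical example written for this illustration and is not part of clojure.core:

```clojure
;; A quoted list is just data...
(def expr '(+ 2 3))
(first expr)   ;; the symbol +
(count expr)   ;; 3

;; ...and eval turns that data back into running code.
(eval expr)                    ;; 5
(eval (cons '* (rest expr)))   ;; 6, same arguments with a different operator

;; A macro is a function from code to code that runs at compile time.
(defmacro unless [condition & body]
  `(if ~condition nil (do ~@body)))

(unless false :ran)                    ;; :ran
(macroexpand-1 '(unless false :ran))   ;; (if false nil (do :ran))
```

Because the program text is already a list, the macro can rearrange it with the same functions used on any other list.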

Programs typically begin with a namespace declaration which lists the functions that are needed from other namespaces. Data surrounded by [] is a vector, and #{} indicates a set. `def` gives something a name within the namespace and `defn` defines a function. `for` is Clojure’s sequence comprehension syntax.
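Those pieces fit together roughly as follows. This is a sketch; the namespace `example.labels` and every name defined in it are hypothetical:

```clojure
(ns example.labels
  (:require [clojure.string :as str])) ; functions needed from other namespaces

(def sizes [1 2 3])          ; [] is a vector
(def colors #{:red :blue})   ; #{} is a set

(defn label
  "Builds a label such as \"RED-2\"."
  [size color]
  (str/upper-case (str (name color) "-" size)))

;; `for` walks every combination of its bindings and collects the results.
(def labels
  (for [size sizes
        color (sort colors)] ; sorted, because sets have no fixed order
    (label size color)))

labels  ;; ("BLUE-1" "RED-1" "BLUE-2" "RED-2" "BLUE-3" "RED-3")
```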

Here’s a fun little program I wrote recently to find all the ways to combine numbers with the basic arithmetic operations in order to reach a certain target number. The expressions are built and printed, of course, using prefix notation.

Sample usage for finding all the ways to reach the target number 24, using the numbers 3, 4, 5, and 5.
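The program itself is not reproduced in this text. As an independent sketch of how such a search could be written (this is not the author's code, and the names `combine`, `remove-indices`, `expressions`, and `solve` are hypothetical), one can repeatedly merge two numbers with each operation until a single value remains, carrying along the prefix expression that produced it:

```clojure
(defn combine
  "All [expr value] pairs obtainable by applying one operation to two pairs."
  [[ea va] [eb vb]]
  (concat
    [[(list '+ ea eb) (+ va vb)]
     [(list '* ea eb) (* va vb)]
     [(list '- ea eb) (- va vb)]
     [(list '- eb ea) (- vb va)]]
    ;; Division is exact (Clojure ratios); skip division by zero.
    (when-not (zero? vb) [[(list '/ ea eb) (/ va vb)]])
    (when-not (zero? va) [[(list '/ eb ea) (/ vb va)]])))

(defn remove-indices
  "Vector v without the elements at positions i and j."
  [v i j]
  (vec (keep-indexed (fn [k x] (when-not (or (= k i) (= k j)) x)) v)))

(defn expressions
  "Lazy seq of [expr value] pairs that use every item exactly once.
  items is a vector of [expr value] pairs."
  [items]
  (if (= 1 (count items))
    items
    (for [i (range (count items))
          j (range (inc i) (count items))
          merged (combine (items i) (items j))
          result (expressions (conj (remove-indices items i j) merged))]
      result)))

(defn solve
  "Distinct prefix expressions over numbers that evaluate to target."
  [target numbers]
  (distinct
    (for [[expr value] (expressions (mapv (fn [n] [n n]) numbers))
          :when (= value target)]
      expr)))

(solve 5 [2 3])          ;; ((+ 2 3))
(solve 24 [3 4 5 5])     ;; a sequence of expressions, each evaluating to 24
```

Since each result is an ordinary list, it can be checked by passing it to `eval`, which ties back to the code-as-data point above. This sketch makes no attempt to remove expressions that differ only by commutativity.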

JAXenter: For what kind of applications/use cases is Clojure well-suited? For which ones is it not?

Mark Engelberg: Clojure is an exceptional tool for building complex information systems, especially those that interact with other languages and other systems. Because of its great interop with Java, it’s also great for extending Java libraries with dynamic-style flexibility and productivity. It is especially popular for web development, data analysis, and big data.

Clojure shares Java’s weakness on number-crunching applications that require native speed. For those sorts of applications, you may be better off using a language that makes it easy to drop down to C or assembly level for numeric-intensive performance.

What’s next for Clojure?

JAXenter: What is the current state of the language?

Mark Engelberg: Clojure has been a stable, mature platform since its initial 1.0 release in 2009. Clojure continues to evolve with an emphasis on stability and backwards compatibility. ClojureScript is a robust variation of Clojure for compiling to JavaScript and building client-side applications. The big focus of the next release, version 1.9, is specs, a way to specify the schemas for data flowing through your functions.

JAXenter: How about your plans?

Mark Engelberg: Clojure continues to be the best fit for my work needs. I plan to continue to use Clojure for a long time to come.

The books I most frequently recommend to beginners are Clojure Programming and Living Clojure. There is also an online site with a fun set of programming challenges which help you learn the capabilities of Clojure’s many built-in functions.

Mark Engelberg has been an active member of the Clojure community ever since Clojure turned 1.0, and is the primary developer of math.combinatorics, math.numeric-tower, data.priority-map, ubergraph, and a co-developer of instaparse. He creates logic puzzles and games, using Clojure as a “secret weapon” to build his own puzzle development tools. His latest work is a line of programming-themed puzzle games for kids, produced by Thinkfun and slated to arrive in toy stores later this year. For 18 years, Mark has taught computer science and functional programming to a mix of new and experienced programmers — young kids, teens, and adults. He helped organize the 2016 Seattle ClojureBridge.