Clojure’s unsung heroics with concurrency

Eric Normand · Updated August 9, 2018

Clojure has a good reputation for concurrency. People write Clojure
programs that work on hundreds of threads, all safely reading and
writing to the same memory. People know about the immutable data
structures and the STM. But there's something going on at a much
deeper level that is really hard to get right in Java. It has to do
with the optimizations the JIT will run on your code.

To understand what I'm referring to, let's look at a series of
optimizations that the JIT can do.
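Here is a minimal sketch of the kind of code this walkthrough is about: one thread spins on a.isDone() while another thread is expected to flip the flag. Only a and isDone() come from the discussion; the class names OriginalLoop and Task, the field done, and the method finish() are hypothetical names chosen for illustration.

```java
// Sketch only: OriginalLoop, Task, done, and finish() are assumed names.
final class OriginalLoop {
    static final class Task {
        private boolean done = false;    // plain field: not volatile, not synchronized

        boolean isDone() { return done; }

        void finish() { done = true; }   // meant to be called from another thread
    }

    // Spin until some other thread calls a.finish().
    static void waitFor(Task a) {
        while (!a.isDone()) {
            // busy-wait
        }
    }
}
```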

Very simple and easy. And it works. But then you run this a lot, and
what happens? It stops working! Sometimes, you get an infinite
loop. When you debug it, it always works. But after running for about
5 minutes, it goes back into the infinite loop.

When something like that happens, it's often the JIT. The JIT will
optimize code that is run frequently. Code you step through in the
debugger typically runs unoptimized. If the bug doesn't happen during
debugging, but does happen after the JIT has had a chance to run, the
JIT could be the culprit.

Let's step through the optimizations the JIT is allowed to do.

The first thing is called inlining. We can inline the call to isDone().
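Written as source code, the effect of inlining might look roughly like this. It is a hand-written approximation of what the JIT does internally, not actual compiler output, and the names InlinedLoop, Task, and done are assumptions.

```java
// Sketch only: a source-level picture of the loop after isDone() is inlined.
final class InlinedLoop {
    static final class Task {
        boolean done = false;            // same plain field as before

        boolean isDone() { return done; }
    }

    static void waitFor(Task a) {
        // The call a.isDone() has been replaced by the method's body,
        // so the loop now reads the field directly.
        while (!a.done) {
            // busy-wait
        }
    }
}
```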

That's great. It avoids a method call. The next thing it can do is
caching. If a value in the heap is accessed more than once, the JIT
is allowed to cache that value on the stack using a local variable.
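Written out by hand, the cached version might look roughly like the sketch below: the field is read once into a local variable, and the loop tests only that copy. Again, this is an illustration of the transformation rather than real JIT output, and HoistedLoop, Task, and done are assumed names.

```java
// Sketch only: a source-level picture of the loop after the field read is cached.
final class HoistedLoop {
    static final class Task {
        boolean done = false;            // plain field, no volatile
    }

    static void waitFor(Task a) {
        boolean done = a.done;           // the field is read exactly once
        while (!done) {
            // The loop tests the local copy; a.done is never read again.
        }
    }
}
```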

That's great! It avoids costly memory fetches. But wait! Something has
changed. Before, we were checking the value of a's field every time
through the loop. We were expecting another thread to change the value
at some point. But now it's only checking once. So the JIT has turned
this into an infinite loop! This was hard for me to believe at first,
but it's true.

Java defaults to the single-threaded case. To avoid this problem, you
have to put in a "memory barrier" to tell the JIT that it can't cache
this value. In this case, the proper keyword to use is volatile. Any
time a mutable field will be read and written by multiple threads
without some other form of synchronization, you should mark it
volatile.
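A minimal sketch of the fix, assuming the flag lives in a field called done (VolatileLoop, Task, finish(), and demo() are hypothetical names too). Marking the field volatile means every pass through the loop has to perform a real read, and a write from another thread is guaranteed to become visible.

```java
// Sketch only: the same wait loop, with the flag declared volatile.
final class VolatileLoop {
    static final class Task {
        private volatile boolean done = false;  // volatile: reads may not be cached

        boolean isDone() { return done; }

        void finish() { done = true; }
    }

    static void waitFor(Task a) {
        while (!a.isDone()) {
            // busy-wait; each iteration re-reads the volatile field
        }
    }

    // Starts a second thread that sets the flag, then waits for it here.
    static boolean demo() {
        Task a = new Task();
        Thread worker = new Thread(a::finish);
        worker.start();
        waitFor(a);                      // returns once the worker's write is visible
        return a.isDone();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A busy-wait is used here only to stay close to the example under discussion; in real code a higher-level tool such as java.util.concurrent.CountDownLatch is usually a better way to wait for another thread.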

Did you know that? I certainly didn't before I did some research. I
wrote Java code for years and I never used volatile. Before you run to
your Java project to make sure you're using volatile correctly,
crying over years of wasted debugging time looking for those
heisenbugs (like I did), let me finish about Clojure.

Clojure simply makes a different tradeoff: assume everything will be
accessed by multiple threads. While most things in Clojure are
immutable (and so can be cached), the things that can change (atoms,
refs, vars, etc.) are done with the correct memory barriers and locks.
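To make that concrete, here is a small Java sketch of an atom-like reference. It is a hypothetical, simplified stand-in and not the real clojure.lang.Atom; the names AtomSketch, deref, swap, and countWithThreads are chosen for illustration. The point is that the memory barriers live inside the reference type (here via java.util.concurrent.atomic.AtomicReference), so calling code never has to remember volatile.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Sketch only: an atom-like holder, assuming a compare-and-set retry design.
final class AtomSketch<T> {
    private final AtomicReference<T> state;

    AtomSketch(T initial) {
        state = new AtomicReference<>(initial);
    }

    // Reads through AtomicReference have volatile semantics.
    T deref() {
        return state.get();
    }

    // Apply f to the current value; retry if another thread changed it first.
    T swap(UnaryOperator<T> f) {
        for (;;) {
            T oldValue = state.get();
            T newValue = f.apply(oldValue);
            if (state.compareAndSet(oldValue, newValue)) {
                return newValue;
            }
        }
    }

    // Usage sketch: several threads increment one shared counter.
    static int countWithThreads(int threads, int incrementsPerThread) {
        AtomSketch<Integer> counter = new AtomSketch<>(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.swap(n -> n + 1);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.deref();
    }
}
```

Because the function passed to swap may run more than once under contention, it should be free of side effects, which is the same rule Clojure documents for swap!.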

I hear people talking about immutable values and STM. I don't hear
so much about this correct use of memory barriers in the core
implementations. What it means is that Clojure is much safer for
threading than Java, without you having to think about it. Yet another
reason Clojure is a Better Java.

The JVM is complicated, but Clojure makes it easier. There's still a
lot to know, though. That's why I made JVM Fundamentals for Clojure.
It's a video course with more than 5 hours of lessons about stuff I
use all the time as a professional Clojure programmer.

There's one last thing I'd like to discuss: the dreaded Clojure
stacktraces. Next time!
