Clojure is a great programming language, but a recurring complaint one keeps
hearing from developers hacking on Clojure code is that debugging can be
unpleasant. First of all, I agree! Debugging Clojure code can be more daunting
on average than, say, debugging Python code. This is mainly due to two reasons:

Clojure's Java legacy. Clojure is compiled to Java bytecode, which has some
terminology and idiosyncracies Clojure programmers aren't always familiar
with. These terms tend to pop up in stack traces and cause confusion (e.g.
IFN).

Clojure - being a Lisp - has a certain code structure which is different
from, say, a more common imperative coding style. Rather than being a
sequence of statements, Clojure programs tend to involve long call chains of
nested expressions. Where only part of an expression fails, it's often
non-trivial to figure out why.

In this post I want to share some notes from my own experience debugging Clojure
programs.

Dealing with Clojure's cryptic exceptions

The first problem with Clojure's runtime exceptions is that we usually don't get
to see the full stack trace by default. Let's say we have this silly,
nonsensical, function in a file called sample.clj:

(defn foo[n](cond (> n40)(+ n20)(> n20)(- (first n)20):else0))

Then to try how it works, we load the file into the REPL and type the following
[1]:

Uh oh. There are two problems here. First, what does this error message mean?
What's ISeq and what's java.lang.Long? Second, it's not clear where it
is actually failing (thanks for that pointer to RT.java though, Clojure!)
Let's address the second problem first. The magic incantation to show the stack
trace of the last exception is calling the pst function:

This is much better because at least some files in this trace are familiar.
core.clj is not ourcore.clj, it's Clojure's core library. But
sample.cljis our file, and we can infer that on line 10 we call
clojure,core/first and something goes wrong. Line 10 happens to be:

(> n20)(- (first n)20)

So now things become more clear. The call (first n) must be bad, and bad in
a way that tries to coerce clojure into creating an ISeq from a Long.
In other words, we're passing a number into a function that expects a sequence,
and this is, indeed, bad. Learning to map from Clojure values and types to the
JVM's expectations will take time and grit - especially if you (like me) don't
have much Java experience. I suggest doing a bit of reading on Clojure/Java
interoperability, and about other Java-isms Clojure inherits; it ain't pretty,
and you may not always want to use it, but being familiar with the terms can go
a long way in deciphering cryptic stack traces.

This exception also tells us it comes from clojure.lang.Numbers.gt from
which we can infer it's the > operator that is complaining. But imagine for
a second that we had two forms with the same operator on that line:

(cond (> c40)(* (+ n20)(+ m30)))

If we got a NullPointerException about an addition, we wouldn't know which
one fails. Luckily, Clojure comes with a very useful module that helps
debugging - tools.trace. In this
particular case, we'd use the trace-forms macro which tells us which
nested form (expression) is failing. We can modify our function to be:

Neat, huh? trace-forms breaks the form it traces to all the nested forms
and reports precisely which one failed - propagating this information upwards
towards the top form [2]. trace-forms is very useful when errors manifest
as exceptions.

Unfortunately, this isn't sufficient for all cases. Our foo wasn't designed
to handle nils, and the bug here is in the place where the nil came
from. This may be quite a bit removed - and not on the same stack trace - from
where foo is invoked. We'll get an exception when foo is called, but the
real challenge is to find where the nil came from. More generally, bugs
that manifest as thrown exceptions are the easier kind of bugs. The more
insidious bugs hide in programs that run just fine end-to-end but compute
slightly incorrect results.

Tracing and logging

This gets us into the more general domain of debugging, where the tricks and
tools programmers use are as varied as the bugs hiding in our programs. When it
comes to debugging, I'm firmly in the printf camp; I rarely prefer debuggers
over printf-based debugging [3], and Clojure is no exception. In fact, due
to the way Clojure programs look (nested forms), I find that debuggers are even
less useful in Clojure than in other languages. On the other hand, Clojure's
macros make it possible to trace / print stuff in a very nice way.

For example, I find that it's useful to be able to turn debugging printouts on
and off frequently. So I have this trusty code in my utilities:

This optional verbosity is perfect when you're in the middle of a furious bug
hunt, adding printfvs in many places in your code. with-verbose can
turn verbose logging on selectively and control the amount of debugging spew
[4].

This example brings us back to the tools.trace library, which provides
another awesome tool that helps trace function calls (the bread and butter of
Clojure programs). Enter trace-vars. After importing it, all we need to do
is invoke it on any functions we want traced; for example:

Note how easy it to see what calls what. Quite often, bugs are uncovered simply
by carefully studying the chain of function calls some input tickles in our
code, and trace-vars is a very low-effort method to enable this kind of
debugging.

Deeper tracing inside cond forms

Tracing function calls is great, but sometimes insufficient. It's not uncommon
to have cond forms in functions, and sometimes it's pretty hard to know
which condition was actually "taken" (this isn't always easy to infer from the
return value of the function). We've seen how to explore where exceptions come
from with trace-forms, but exceptions are just one kind of problem. The more
difficul problem arises when the code throws no exceptions but still produces a
wrong value.

I've mentioned how Clojure's macro superpowers let us write very powerful
debugging tools. What follows is another example.

Consider this toy code:

(cond (> 1020)(+ 1020)(> 2010)(- 2010):else200)

It happens to return 10 since the second condition fires. But suppose it stands
for a much more complicated cond where it's not obvious which condition was
taken and where the return value came from. How do we go about debugging this?

Well, we can always add a printfv into every result expression (possibly
wrapping in a do form) and see what fires. This would work, but it's quite
tiresome, especially for large conds. To do this automatically, we can
write the following macro:

Note the printout before the return value of 10: condv (> 20 10) - it shows
us exactly which condition was taken.

Conclusion

While beginning Clojure programmers may find the debugging experience
challenging, I believe that with some effort and perseverance it's possible to
get used to the unusual environment and even reach new levels of productivity
by developing a set of debugging tools and techniques.

In this endeavor, Clojure's macro capabilities are an extremely powerful ally.
Coupled with a fast edit-rerun cycle in the REPL, such tools can turn Clojure
debugging into a much less painful activity.

The astute reader will notice a slight discrepancy between our code and
the output of trace-form. We don't have an if form, or do we?
Quiz: what does cond expand to? Complex interactions between macros
and functions is yet another reason debugging Clojure code is sometimes
hard...

This method is only recommended when the debugging prinouts are destined
to be eventually eliminated from the code. For more permanent logging
with more verbosity controls, consider using a proper logging library
like tools.logging.