;; A couple of chaps have asked me to write a clojure macro tutorial, to explain
;; my debugging macro:
(defmacrodbg[x] `(let [x# ~x] (println '~x "=" x#) x#))
;; Which I use to print out intermediate values in functions.
;; Here's an example function that we might want to debug:
(defnpythag [ x y ] (* (* x x) (* y y)))
;; And here is a version enhanced to print out its thought process as it runs
(defnpythag [ x y ] (dbg (* (dbg (* x x)) (dbg (* y y)))))
(pythag 4 5)
;; (* x x) = 16
;; (* y y) = 25
;; (* (dbg (* x x)) (dbg (* y y))) = 400
;; 400
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; I'm going to try to imagine that I didn't know how to write dbg, and had to
;; go at it by trial and error, to show why it is as it is.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; The problem that dbg solves:
;; Often the best way to debug code is by adding print statements.
;; Consider our old friend factorial
(defnfactorial [n]
(if (< n 2) n
(* n (factorial (dec n)))))
(factorial 5) ;; 120
;; How would we watch it at work?
;; This modified version prints out the value of every recursive call:
(defnfactorial [n]
(if (< n 2) n
(* n (let [a (factorial (dec n))]
(println"(factorial (dec n))=" a)
a))))
(factorial 5)
;;(factorial (dec n))= 1
;;(factorial (dec n))= 2
;;(factorial (dec n))= 6
;;(factorial (dec n))= 24
;;120
;; So now we can watch the stack unwind. This gives us confidence in the inner
;; workings of the function.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; The problem with this solution is that I've had to do a fair bit of typing to
;; change the function into a version that prints out its intermediate values.
;; First, let's give n a global value so that we can evaluate fragments out of
;; context:
(defn 5)
;; Here's the original function again (re-evaluate the definition)
(defnfactorial [n]
(if (< n 2) n
(* n (factorial (dec n)))))
;; Specifically, what I had to do was change
(factorial (dec n))
;; into
(let [a (factorial (dec n))] (println"(factorial (dec n))=" a) a)
;; Which is an expression which not only evaluates to 24 when n=5 , like the
;; original did, but which prints out (factorial (dec n))= 24 at the same time.
;; Notice that the phrase (factorial (dec n)) has to be repeated in the code.
;; Every time I would like to examine the value returned by an expression as my
;; program runs, I have to make this complicated but mechanical change. Even
;; more annoyingly, I have to tell the compiler the same thing twice.
;; Any time you find that you have to do too much typing and perform mechanical
;; repetitions, you will also find difficulty in reading, and potential for
;; error. It is always to be avoided.
;; As Larry Wall said, the chief virtue of a programmer is laziness.
;; This simple repetitive task should be as easy as changing
(factorial (dec n))
;; to
(dbg (factorial (dec n)))
;; Normally, when one spots a common pattern like this, one makes a function.
;; But a function to do what we want here is problematical, because we need the
;; source code as well as the evaluated value of (factorial (dec n)) We might
;; try something like:
(defndbgf [s x]
(println s "=" x)
x)
;; And use it like this:
(defnfactorial [n]
(if (< n 2) n
(dbgf "(* n factorial(dec n))" (* n (factorial (dec n))))))
;; That's a bit better, but I'm still changing
(* n (factorial (dec n)))
;; into
(dbgf "(* n factorial(dec n))" (* n (factorial (dec n))))
;; Which is less error prone, but still repetitive.
;; The reason that we need to hold dbgf's hand like this, telling it what to
;; print out in two different ways, is that a function's arguments are evaluated
;; before the function is called.
;; If we want to write
(dbg (* n (factorial (dec n))))
;; and have it work as we intend, then we need to take control of when
;; (* n (factorial (dec n))) is evaluated.
;; And this is the problem that macros solve.
;; We need to work out how to write:
(dbg (* n (factorial (dec n))))
;; And get:
(let [a (factorial (dec n))] (println"(factorial (dec n))=" a) a)
;; Instead.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Generating Code
;; Now because lisp code and lisp data are very similar things, we can easily
;; write a function which will generate the code that we want:
;; Let's define:
(defndbg-code [s]
(list 'let ['a s] (list 'println (list 'quote s) "=" 'a) 'a))
;; Which is just a function that takes some code and gives back some code.
;; We can call this function on little pieces of code, to get other little
;; pieces of code
(dbg-code 'x)
;; (let [a x] (println (quote x) "=" a) a)
(dbg-code '(* x x))
;; (let [a (* x x)] (println (quote (* x x)) "=" a) a)
(dbg-code '(* n (factorial (dec n))))
;; (let [a (* n (factorial (dec n)))] (println (quote (* n (factorial (dec n)))) = a) a)
;; Nothing 'macro' has gone on yet! This is just a function, taking advantage of
;; lisp's ability to easily deal with the lists and trees and vectors that make
;; up lisp code.
;; But the function generates exactly the code that we'd like the compiler to
;; substitute in.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Macros
;; Now how shall we turn our code-generating function into a macro?
;; Just change defn to defmacro:
(defmacrodbg-1 [s]
(list 'let ['a s] (list 'println (list 'quote s) "=" 'a) 'a))
;; Now it's a macro!
;; Let's try it out:
(defnfactorial [n]
(if (< n 2) n
(dbg-1 (* n (factorial (dec n))))))
(factorial 5)
;; (* n (factorial (dec n))) = 2
;; (* n (factorial (dec n))) = 6
;; (* n (factorial (dec n))) = 24
;; (* n (factorial (dec n))) = 120
;; 120
;; Bingo!
;; When the compiler sees a macro, which is just a function that returns some
;; code, it runs the function, and substitutes the code that is returned into
;; the program.
;; It is like programming the compiler to be an apprentice programmer who will
;; write out the tedious bits longhand for you.
;; We can even ask the compiler what it sees when it expands dbg-1:
(macroexpand-1 '(dbg-1 x))
;; (let [a x] (println (quote x) "=" a) a)
(macroexpand-1 '(dbg-1 (* x x)))
;; (let [a (* x x)] (println (quote (* x x)) "=" a) a)
(macroexpand-1 '(dbg-1 (println x)))
;; (let [a (println x)] (println (quote (println x)) "=" a) a)
(macroexpand-1 '(dbg-1 (inc x)))
;; (let [a (inc x)] (println (quote (inc x)) "=" a) a)
(macroexpand-1 '(dbg-1 (* n (factorial (dec n)))))
;; (let [a (* n (factorial (dec n)))] (println (quote (* n (factorial (dec n)))) "=" a) a)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; So have we won??
;; We have certainly solved our problem as we originally conceived it, but there
;; are potentially a couple of difficulties with our solution, which I shall
;; cover in a later post.
;; Neither are terribly likely to occur in practice, but it is better to write
;; bug-free code than almost-bug-free code.
;; The chief virtues of a programmer are laziness and paranoia.
;; Million to one chances have a way of coming up nine times out of ten.
;; Also, compare our solution with the actual dbg macro:
(defmacrodbg-1 [s]
(list 'let ['a s] (list 'println (list 'quote s) "=" 'a) 'a))
(defmacrodbg[x] `(let [x# ~x] (println '~x "=" x#) x#))
;; There are obviously similarities, but the complete version looks more like
;; the generated code, and is thus easier to write and to read once you've got
;; the hang of the weird `, #, and ~ things.
;; And the second version also solves the two potential difficulties which I
;; haven't explained yet!
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Things to play with to check you've understood:
;; I. Find the source for clojure.core/when
(clojure.repl/source when)
;; figure out what it does.
;; II. What is wrong with:
(defmacrodbg-oops [s]
(list 'do (list 'println (list 'quote s) "=" s) s))
;; III. Can you improve dbg so you can write (dbg * 3 2) rather than
;; (dbg (* 3 2)), saving a couple of brackets every time you use it?
;; IV. Using the methods above, can you write a for loop. We'd want
(forloop [i 1 10]
(print i)
(print (* i i)))
;; to expand to:
(loop [i 1]
(when (<= i 10)
(print i)
(print (* i i))
(recur (inc i))))
;; V. (hard) If you managed to do IV above, and write forloop, then congratulations! ;; You have
;; understood this post. But I bet your solution suffers from either or both of
;; the subtle problems that I mentioned above. Can you work out what they are?
;; Hint. How many times does the finish condition get evaluated as you go
;; through the loop? How would you make sure that only happened once? What if a
;; name you'd used in the macro clashed with a name used in the code where the
;; macro is?
;; Both problems are, in fact, easy to solve. See part II.

22 comments:

Scattering print statements throughout your code is a Bad Thing; you either have to keep removing them from your code after dev (hurting you later if you need to debug), or end up with no granularity control over the information being emitted and unnecessarily hurt performance.

Nice post you have written. I already know lisp macros (from a long time ago... On Lisp & ANSI Common Lisp) and after asking Which Programming Language Should I Learn Next? in my blog, the answer I gave myself (and most readers gave) was Clojure. Thus every bit of clojure knowledge I get from the web is pretty welcome! Thanks!

Do notice that it isn't finished yet. Clojure's quasiquote does a good job of making gensyms automatic (and almost compulsory!) and the clever read-time resolution of names means that clojure can have lisp-style macros without being a lisp-2. Really the best of both worlds. That will be part 2.

Ruben, checked your post and I notice you're a mathematician. My degree was in maths.

Obviously I think you should learn Clojure, but I'm biased because it's my current favourite.

Others you might want to look at are:

Python. A lovely design which fits your head and doesn't try to force you into one paradigm. It actually feels a lot like clojure. Both of them have nicely integrated modern data structures.

Clojure's more functional, Python's more imperative.Clojure's a lisp, Python's more C/Algol.

On both those counts, I think mathematicians feel more happy with the first style over the second.

Ocaml, (or standard ML). For those who like compile time type checking. A bit more restrictive than lispy-pythony dynamic languages, but wonderful fun once you get the hang of the type system, and blazingly fast.

Haskell. An experiment in what it would be like to have a purely functional language. Great for mathematics. Not so good for webservers.

Purely functional code is hard to write for some iterative problems like numeric integrators (I tried to code a Runge-Kutta-Fehlberg 7-8 integrator in Lisp, and it doesn't quite fit the language... too many progns needed).

For me, functional-ness depends more on the problem that being a mathematician... but indeed, I feel better writing functional code, if I can.

Ruben, indeed! The classic example where functional is horrible and mutable is good is a program with a random number generator. It's more that the "feel better .... if I can" feeling seems to happen more often with mathmos.

I'm surprised that a Runge-Kutta solver would be a case here though. Maybe there's something I'm missing about the algorithm. If you post or link to imperative code I'll see what I can do!

I have only a C example, not in my GoogleCode (not worth it, I just use it for teaching), but it is a standard procedure where you call repeatedly something, and have several straight sums. It can be "functionalized", at the expense of speed. The problem is this ;)

Oh, well Clojure is unlikely to be as fast as C for that sort of thing, but I don't think it would be a functional vs imperative question.

I would think it could be as fast as Java. And I would also think that an OCaml or Haskell version could be as fast as C. I may be dissing the JVM here. I'm told it's pretty good these days.

Can I have a look? I haven't tried optimizing something for speed in Clojure and it would be nice to see how it looked (almost certainly not as nice as the C version or even the Java version, but I'd be interested in finding out).

Just out of interest, why do you need an ODE solver to be fast? I would have thought it would be fast enough in Javascript! Now PDEs are a different question, obviously....

Jumping forward to the present day, there's a number of web frameworks for Haskell, including Happstack, Snap, Yesod, and others you'll find on Hackage.

GHC Haskell has excellent concurrency support; see for example chapters 24, 27, and 28 of Real World Haskell. The next version of GHC will have a new I/O manager [pdf] which should provide better scalability on web-like workloads.

Of course, the "common wisdom" is that Haskell is an "experiment" and is unsuitable for real-world tasks. The facts are irrelevant; if you repeat something often enough it's accepted as true.

Unfortunately it appears to have been eaten by some weird process in blogger. It did get e-mailed to me though, so I'm posting it for him:

"Haskell. An experiment in what it would be like to have a purely functional language. Great for mathematics. Not so good for webservers."

Really? Simon Marlow wrote about a high-performance webserver in Haskell over ten years ago. Here's a more recent version of his paper. See also the classic paper "Tackling the awkward squad".

Jumping forward to the present day, there's a number of web frameworks for Haskell, including Happstack, Snap, Yesod, and others you'll find on Hackage.

GHC Haskell has excellent concurrency support; see for example chapters 24, 27, and 28 of Real World Haskell. The next version of GHC will have a new I/O manager [pdf] which should provide better scalability on web-like workloads.

Of course, the "common wisdom" is that Haskell is an "experiment" and is unsuitable for real-world tasks. The facts are irrelevant; if you repeat something often enough it's accepted as true.

Good luck! If you can get IV to work then you can say that you understand what macros are.

If it helps, I can't get the second half of this damned tutorial I said I'd write to work. What I thought was simple and obvious is turning out to be a bitch to explain clearly. Which probably means that I don't understand it as well as I thought I did.

Could you please clarify this part as well(println '~x "=" x#) ;; what happens when we do '~x given say x = (* 5 6). I assumed ~x wud force it to become '30 giving 30. But as you point out it prints (* 5 6)=30.