Clojure's Approach to Polymorphism: Method Dispatch

The word polymorphism derives from Greek, where poly means many and morph means form. In programming languages, polymorphism is the idea that values of different data types can be treated in "the same" way. Often, this is manifested via class inheritance, where classes in a hierarchy can have methods of the same name. When such a method is called on an object, the correct code is executed at runtime depending on which specific class the object is an instance of. Inheritance is only one of several ways of achieving polymorphic behavior, however.

This article explains how Clojure approaches polymorphism. We'll start by looking at method dispatch -- beginning with the commonly available single dispatch, followed by double and multiple dispatch. Dynamic method dispatch is a fancy way of saying that when a method is called, the name of the method is mapped to a concrete implementation at runtime.

This article is an excerpt from an Early Access Edition of the book "Clojure in Action" (Manning Publications; ISBN: 9781935182597), written by Amit Rathore.

Single and Double Dispatch

Languages such as Java support only single dispatch. What this means is that the type of the receiver (the object on which a method is called) is the only thing that determines which method is executed. To understand this and the next couple of paragraphs, we'll use a commonly cited but well-suited example of a program that needs to process an abstract syntax tree (AST). Let's imagine our AST is implemented in a Java-like OO language and is modeled via the classes shown in Figure 1.

Figure 1. A simplistic hierarchy representing an AST. Each node is a subclass of a generic SyntaxNode and has functions for various tasks a compiler or IDE might perform.

When the IDE needs to format the source code represented by such an AST, it calls format on it. Internally, the AST delegates the format call to each component in a typical composite fashion. This walks the tree and ultimately calls format on each node in the AST.

This is straightforward enough. The way this works is that, even though the AST is made up of different kinds of concrete nodes, the operation format is defined in the generic base class and is called using a reference to that base class. At runtime, the receiver type is determined and the appropriate method is executed. If any of these methods were to accept arguments, the runtime types of those arguments would not influence which method is executed.
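The receiver-based lookup described above can be sketched in Java roughly as follows. Only SyntaxNode, AssignmentNode, and format come from the text; VariableNode, FormatDemo, formatAll, and the returned strings are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Generic base class: format is declared here and called through this type.
abstract class SyntaxNode {
    abstract String format();
}

class AssignmentNode extends SyntaxNode {
    @Override
    String format() { return "x = 1"; }   // illustrative output only
}

// Hypothetical second node type, not named in the article.
class VariableNode extends SyntaxNode {
    @Override
    String format() { return "x"; }       // illustrative output only
}

class FormatDemo {
    // Every call goes through a SyntaxNode reference; the implementation
    // that runs is chosen from the runtime class of the receiver.
    static String formatAll(List<SyntaxNode> nodes) {
        StringBuilder out = new StringBuilder();
        for (SyntaxNode node : nodes) {
            out.append(node.format()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<SyntaxNode> nodes = Arrays.asList(new AssignmentNode(), new VariableNode());
        System.out.print(formatAll(nodes));
    }
}
```

The loop never inspects the concrete class of a node; the choice of implementation is made entirely by the receiver.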

Single dispatch leads to possibly unexpected situations. For instance, consider the following code defined inside some sort of Receiver class.
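A minimal sketch of such code, assuming a small Animal hierarchy with Dog and Cat subclasses. The names Receiver, Animal, and collectSound come from the text; the collectSounds method, the SingleDispatchProblem class, and the returned strings are illustrative assumptions. The offending call is left commented out so that the rest of the sketch compiles.

```java
class Animal {}
class Dog extends Animal {}
class Cat extends Animal {}

class Receiver {
    String collectSound(Dog dog) { return "woof"; }
    String collectSound(Cat cat) { return "meow"; }

    void collectSounds() {
        Animal animal = new Dog();
        // collectSound(animal);
        // ^ Uncommenting this call is a compile-time error: overloads are
        //   chosen from the declared type of the argument (Animal), and
        //   there is no collectSound(Animal).

        Dog dog = new Dog();
        // Compiles, because the declared type of the argument is Dog.
        System.out.println(collectSound(dog));
    }
}

class SingleDispatchProblem {
    public static void main(String[] args) {
        new Receiver().collectSounds();   // prints "woof"
    }
}
```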

The compiler would complain that there is no such method as collectSound that accepts an instance of the Animal class. The reason, to reiterate, is that with Java's single dispatch, only the type of the receiver is used at runtime to determine which method to call. The runtime type of the argument plays no part: the compiler sees only its declared type, Animal, and finds no method defined that accepts one.

The way to satisfy the compiler would be to add another method collectSound(Animal a). This would make the compiler happy, but would not do the desired job of dispatching to the right method. The programmer would have to test for the object type and then dispatch again to appropriate methods (which would need to be renamed to collectDogSound and collectCatSound). It might end up looking like this:
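One way the manual dispatch could be written is sketched here. The method names collectSound, collectDogSound, and collectCatSound follow the text; the Animal, Dog, and Cat classes, the returned strings, and the exception for unknown types are assumptions made for illustration.

```java
class Animal {}
class Dog extends Animal {}
class Cat extends Animal {}

class Receiver {
    String collectDogSound(Dog dog) { return "woof"; }
    String collectCatSound(Cat cat) { return "meow"; }

    // The programmer, not the language, inspects the argument's runtime
    // type and forwards to the matching method.
    String collectSound(Animal animal) {
        if (animal instanceof Dog) {
            return collectDogSound((Dog) animal);
        }
        if (animal instanceof Cat) {
            return collectCatSound((Cat) animal);
        }
        throw new IllegalArgumentException(
            "Unsupported animal: " + animal.getClass().getName());
    }
}

class ManualDispatchDemo {
    public static void main(String[] args) {
        Animal animal = new Cat();
        System.out.println(new Receiver().collectSound(animal));   // prints "meow"
    }
}
```

Every new kind of Animal requires another branch in collectSound, which is exactly the bookkeeping a dispatch mechanism is supposed to remove.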

This is highly unsatisfactory! The language, not the programmer, should do the job of dispatching methods, and this is the limitation of single dispatch. If the language supported double dispatch, the original code would work just fine, and we wouldn't need what we are about to do in the next section. Later on, we'll see how Clojure completely side-steps this recurring issue by giving complete control to the programmer.

The Visitor Pattern (and Simulating Double Dispatch)

In a language that resolved methods through double dispatch, the runtime types of both the receiver and the first argument would be used to find the method to execute. The collectSound call from the previous section would then work as desired.

We can simulate double dispatch in Java and other languages that suffer from single dispatch. In fact, that is exactly what the visitor pattern does. We'll examine that here and later see how it is not needed in Clojure.

Consider the program we mentioned earlier that needed to process an AST. That implementation included operations like checkValidity and generateASM inside each concrete implementation of SyntaxNode. The problem with this approach, of course, is that adding a new operation is quite invasive, as every type of node needs to be touched. The visitor pattern allows such operations to be separated from the data object hierarchy (which, in this case, is the SyntaxNode hierarchy). Let's take a quick look at the modification needed to the SyntaxNode hierarchy, followed by the new visitor classes.

Figure 2. Simulating double dispatch requires modification. The accept method is a somewhat unclear but required method in each node, which will call back the visitor.

Figure 3 shows the new visitor classes.

Figure 3. Each visitor class does one operation and knows how to process each kind of node. Adding new kinds of operation involves adding new visitors.

The infrastructure method accept(NodeVisitor) in each SyntaxNode is required because of the need to simulate a double dispatch. Inside it, the visitor itself is called back using an appropriate visit method. Here's some more Java code showing an example involving AssignmentNode:
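A minimal sketch of that callback, under a few assumptions. SyntaxNode, AssignmentNode, NodeVisitor, accept, and visit are named in the text; VariableNode, FormatVisitor, VisitorDemo, and the strings they produce are hypothetical and only stand in for the classes shown in the figures.

```java
// Visitor interface: one visit overload per concrete node type.
interface NodeVisitor {
    void visit(AssignmentNode node);
    void visit(VariableNode node);    // VariableNode is a hypothetical node type
}

abstract class SyntaxNode {
    // First dispatch: selected by the runtime class of the node.
    abstract void accept(NodeVisitor visitor);
}

class AssignmentNode extends SyntaxNode {
    @Override
    void accept(NodeVisitor visitor) {
        // Second dispatch: 'this' has the declared type AssignmentNode here,
        // so the compiler selects visit(AssignmentNode); the visitor's
        // runtime class then picks the implementation.
        visitor.visit(this);
    }
}

class VariableNode extends SyntaxNode {
    @Override
    void accept(NodeVisitor visitor) {
        visitor.visit(this);
    }
}

// Hypothetical visitor implementing a single operation over all node types.
class FormatVisitor implements NodeVisitor {
    final StringBuilder out = new StringBuilder();

    @Override
    public void visit(AssignmentNode node) { out.append("assignment;"); }

    @Override
    public void visit(VariableNode node) { out.append("variable;"); }
}

class VisitorDemo {
    public static void main(String[] args) {
        SyntaxNode[] nodes = { new AssignmentNode(), new VariableNode() };
        FormatVisitor formatter = new FormatVisitor();
        for (SyntaxNode node : nodes) {
            node.accept(formatter);
        }
        System.out.println(formatter.out);   // prints "assignment;variable;"
    }
}
```

The accept method body is textually identical in every node class, yet it cannot be moved into SyntaxNode: there, this would have the declared type SyntaxNode and no visit overload would match.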

If the language natively supported double dispatch, all of this boilerplate code wouldn't need to exist. Further, the NodeVisitor hierarchy needs to know about all the classes in the SyntaxNode hierarchy, which makes it a more coupled design.

Now that we know about double dispatch, the obvious question is -- what about triple dispatch? What about dispatching on any number of argument types?