Friday, May 1, 2009

Earlier I said that tail recursion allows a program access to a more
powerful computation model. I'm referring to Hewitt's Actor model.
At its core, the Actor model is extremely simple. Computation agents
called ‘Actors’ interact only through message-passing.
The full Actor model allows for multiple asynchronous agents, each
acting independently. On a single processor computer, it can be
much more convenient to use a synchronous, serial Actor model in which
only one Actor at a time is running and control of the processing is
passed along with the messages.

One important feature of the Actor model is that messages are
unidirectional. When an actor sends a message it does not require a
reply as part of the protocol. If the actor needs a reply or wishes
to handle a reply, it sets up a callback. The callback can have a
well-known name that any actor can use in a message, or it can be an
unnamed piece of code (a lambda expression) that shares the context of
the sender.

Since messages are unidirectional, they don't require a
‘return address’. The message may have a return address
as part of a callback, but that's a protocol decision; it isn't
built into the message itself. The recipient of the message might
simply ignore the callback (it is allowed to).
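
As a rough illustration of one-way messages with optional callbacks, here is a minimal Python sketch. Every name in it (adder, logger, record_sum, reply_to) is made up for the example, and plain functions stand in for actors. Python does not eliminate tail calls, so this shows only the shape of the protocol, not the space behavior.

```python
# Hypothetical names throughout; plain functions stand in for actors.
log = []

def adder(a, b, reply_to):
    # A recipient that honours the callback: the result is sent onward
    # to reply_to as a new message rather than returned to the sender.
    reply_to(a + b)

def logger(text, reply_to=None):
    # A recipient that ignores the callback, which the protocol allows.
    log.append(text)

def record_sum(value):
    # A callback with a well-known name that any sender can use.
    log.append(("sum", value))

adder(2, 3, record_sum)                          # named callback

prefix = "total"
adder(4, 5, lambda v: log.append((prefix, v)))   # unnamed callback sharing the sender's context

logger("hello", reply_to=record_sum)             # the callback is never invoked
```

The lambda can see prefix because it shares the context of the sender, and logger is free to drop the callback it was handed.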

The recipient of a message can pretty much do anything it wants. Of
course we expect the program to be written such that something useful
happens, but the recipient of a message is allowed to do things like

compute something

change its state

send a message

create new actors

Or any combination of the above. For example, upon receiving a
message, an actor could compute something based on the message
contents, create a new actor, and resend the message (with the newly
computed value) on to the newly created actor. This kind of arbitrary
delegation is a key feature of the actor model. Once a message is
sent, the sender is completely out of the picture (although it could
put its return address in a callback). The message recipient is
allowed to delegate all or any part of its action to other actors.
Control structure emerges from the patterns of message passing.
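
The compute, create, and resend pattern described above can be sketched in Python as follows. This is only an illustration under assumptions: make_accumulator and its message format (a list of numbers plus a reply_to callback) are invented for the example, and closures stand in for actors.

```python
def make_accumulator(total):
    # Create a new actor (a closure) whose private state is `total`.
    def accumulator(items, reply_to):
        if not items:
            # Nothing left to do: send the answer to the callback.
            reply_to(total)
        else:
            # Compute something from the message contents, create a new
            # actor, and resend the rest of the message to it.
            successor = make_accumulator(total + items[0])
            successor(items[1:], reply_to)
    return accumulator

results = []
make_accumulator(0)([1, 2, 3, 4], results.append)
```

No accumulator ever hears back from its successor. The summing "loop" exists only as the pattern of messages, and the final answer goes straight to the original callback.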

Function calling = message passing (almost)

A procedure is like an actor. Nothing happens until you invoke it
(send it a message). Once invoked, it can compute something, change
its state (if it is a closure or has access to something mutable), and
call other procedures (create new actors and send them messages). But a
procedure will be a limited form of actor if it cannot fully delegate
its computation to another procedure.

In virtually every computer language, a procedure call has an implicit
continuation (informally, the caller). ‘Returning a
value’ is ‘invoking the continuation’. Whether a
procedure can fully delegate to another depends on how the language
manages the implicit continuation. If the language supports proper
tail recursion, then when a procedure tail calls another, the implicit
continuation is passed directly along to the callee. If the language
does not support proper tail recursion, then the implicit continuation
is not passed to the callee. Instead, a new
implicit continuation is allocated that contains (wraps
around) the old one. The wrapper doesn't actually do
anything in the tail call situation; it just takes up space and
possibly retains pointers to otherwise reclaimable storage, but you
can't get rid of it because it is built in to the language.
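
CPython is one example of a language implementation without proper tail recursion, so it can be used to make the wrapper continuations visible. The sketch below is a hedged illustration: frames_in_use and count_down are hypothetical helpers, and counting stack frames is used here as a stand-in for counting implicit continuations.

```python
import inspect

def frames_in_use():
    # Walk the chain of caller frames. Each live frame is an implicit
    # continuation that is still waiting to be "returned to".
    frame = inspect.currentframe()
    count = 0
    while frame is not None:
        count += 1
        frame = frame.f_back
    return count

def count_down(n, report):
    if n == 0:
        report(frames_in_use())
    else:
        # This is a tail call, but CPython still pushes a new frame
        # that wraps the old one and does nothing except return.
        return count_down(n - 1, report)

depths = []
count_down(0, depths.append)
count_down(50, depths.append)
growth = depths[1] - depths[0]   # one extra frame per tail call
```

Under proper tail recursion the two depths would be the same. Here they differ by one frame for every tail call made.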

If you cannot pass the implicit continuation on to the callee, then
you cannot fully delegate the computation. If you cannot fully
delegate, then you cannot fully model message passing with function
calls. On the other hand, if you can fully delegate, then a function
call is exactly the same as passing a message (in our synchronous, serial
actor model).

Tail recursion unifies message passing and function calling.
This is the reason tail recursion is cool. The fact that this allows you
to write a loop as a tail call is sort of a side benefit. (It directly
follows because all control structure can be modeled as message
passing, but as I have mentioned repeatedly, I have nothing against looping
constructs per se.)

With tail recursion, you have message passing embedded in your
language. By embedded I mean that you don't have to directly
manipulate message objects and inject and extract values from them.
That happens automatically as part of the function call syntax.

Without tail recursion, you have to implement message passing as a
separate, ‘bolted on’, feature. You will have explicit
message objects to allocate, initialize, enqueue, poll for,
destructure, and deallocate. The control structure that emerges from
the message passing patterns will interfere with the control structure
in your code (you'll have to structure all your code around the
‘driver loop’).
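
To make the bolted-on version concrete, here is a minimal Python sketch of explicit message objects and a driver loop. It is one possible arrangement rather than a canonical one, and the names Message, mailbox, send, and run are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Message:
    target: object    # the actor (any callable) that should receive the payload
    payload: tuple    # the values carried by the message

mailbox = deque()

def send(target, *payload):
    # Allocate, initialize, and enqueue an explicit message object.
    mailbox.append(Message(target, payload))

def count_down(n, reply_to):
    if n == 0:
        send(reply_to, "done")
    else:
        # Delegate by enqueueing a message, then return to the driver.
        send(count_down, n - 1, reply_to)

def run():
    # The driver loop: poll for a message, destructure it, dispatch it.
    while mailbox:
        message = mailbox.popleft()
        message.target(*message.payload)

results = []
send(count_down, 100_000, results.append)
run()
```

The stack stays shallow however long the chain of messages gets, but only because every actor has to return to run after each send. All of the code ends up structured around that loop.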