Wednesday, September 11, 2013

Actors are overly nondeterministic

Actors are useful as a low-level concurrency primitive, but as has been written elsewhere, they aren't the best high-level programming model for writing concurrent programs.

Let's look at the problems with actors from another angle. Yes, actors are all about side effects, side effects don't compose, etc, etc, but more specifically, the actor model takes the wrong default of starting with an overly nondeterministic programming model, and then attempting (via some very first-order, out-of-band mechanisms that don't compose) to prune away this nondeterminism in order to obtain predictable performance. In particular, due to its excessive nondeterminism, the actor model lacks predictable tools for controlling queueing and memory usage. Where have we heard this sort of argument before? The exact same problems were noted with threads:

Threads... are wildly nondeterministic. The job of the programmer is to prune away that nondeterminism. We have, of course, developed tools to assist in the pruning. Semaphores, monitors, and more modern overlays on threads... offer the programmer ever more effective pruning. But pruning a wild mass of brambles rarely yields a satisfactory hedge.

To offer another analogy, suppose that we were to ask a mechanical engineer to design an internal combustion engine by starting with a pot of iron, hydrocarbon, and oxygen molecules, moving randomly according to thermal forces. The engineer’s job is to constrain these motions until the result is an internal combustion engine. Thermodynamics and chemistry tells us that this is a theoretically valid way to think about the design problem. But is it practical?

We could almost do a search-and-replace to turn the paper into an argument against using actors! Even though an actor (generally) mutates only local state, these side effects are observable to any piece of code capable of messaging the actor, and observability of side effects propagates transitively. So while actors avoid a class of bugs that arise when using preemptive multithreading with mutable state for communication between threads, they don't really change the overall programming model of having unrestricted mutable state, with the possibility of completely different emergent behavior based on nondeterministic interleaving of actor message processing. We have the same general class of errors to worry about and the same problems with reasoning about our code.

Let's look at an example. Suppose we have two actors, P and C, in a producer/consumer relationship. P produces values and sends them to C. In the actor model, at any point during execution P may issue a C ! msg. This means the memory usage of our program is an emergent runtime property, since every C ! msg results in a message queued in the mailbox of C, and any number of messages may pile up depending on the relative rates of processing of P and C. Likewise, the latency of the overall pipeline is very much an emergent property, and depends on the nondeterminism of actor scheduling. Yes, there are workarounds: we can build an ad-hoc protocol specific to P and C, or we can have the actor scheduler implement some global policy for congestion control (like making message send cost a function of the target's current mailbox size), and so on, but approaches like these are band-aids that don't scale. As with locks and semaphores, the deeper issue is a flawed underlying conceptual model.
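To make the memory argument concrete, here is a minimal sketch that uses only the JDK's queues rather than a real actor library. The names MailboxSketch, unboundedBacklog and boundedBacklog are hypothetical, the mailbox is modeled as a plain queue, and the slow consumer is modeled by simply never draining it.

```scala
import java.util.concurrent.{ArrayBlockingQueue, ConcurrentLinkedQueue}

// Hypothetical sketch: a "mailbox" modeled as a plain JDK queue.
// This is not how any particular actor library implements mailboxes.
object MailboxSketch {

  // Fire-and-forget sends into an unbounded mailbox: while the consumer C
  // has not caught up, every send is retained in memory.
  def unboundedBacklog(sends: Int): Int = {
    val mailbox = new ConcurrentLinkedQueue[Int]()
    (1 to sends).foreach(msg => mailbox.add(msg)) // plays the role of C ! msg
    mailbox.size
  }

  // The same sends against a bounded queue: offer returns false once the
  // queue is full, so the producer is forced to decide what happens next.
  // Returns (messages queued, messages rejected).
  def boundedBacklog(sends: Int, capacity: Int): (Int, Int) = {
    val queue = new ArrayBlockingQueue[Int](capacity)
    val rejected = (1 to sends).count(msg => !queue.offer(msg))
    (queue.size, rejected)
  }

  def main(args: Array[String]): Unit = {
    println(unboundedBacklog(1000))   // backlog grows with the number of sends
    println(boundedBacklog(1000, 20)) // backlog is capped at the capacity
  }
}
```

In the first method nothing in the code states how large the backlog may get. In the second the bound is an explicit argument, which is the kind of decision the post argues should be visible in the program.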

There's a kind of general principle here which the actor model violates. Any decision affecting a program's asymptotic runtime or space complexity should be under explicit programmer control, using a clear and composable programming model. Actors violate this principle by making a key decision affecting memory usage (when to sequence a message send, and what sort of nondeterminism to allow) an emergent runtime property, and forcing us to encode these decisions using ad hoc protocols, which aren't really first class or composable.

Purely functional approaches to concurrency are explicit. There are no side effects, we have to explicitly sequence the effects of our program, and if we want nondeterminism, we ask for it explicitly. So, rather than starting with unrestricted nondeterminism, with the possibility of unbounded memory usage depending on relative processing rates and what interleaving we get, we start with a fully deterministic model and add nondeterminism only as needed and where helpful.

Here's an example from scalaz-stream. For background, the scalaz-stream library represents sources, sinks, effectful channels, and pure stream transducers using a single basic type, Process:

Process[F,O]: A process which emits O values, and which may issue requests of type F[R] for any choice of R. Where F forms a Monad, this can be thought of as an effectful source of O values.

Sink[F,O]: A sink is represented as a source of effectful functions, or a Process[F, O => F[Unit]]

Channel[F,I,O]: More generally, a Channel is a source of effectful functions, or a Process[F, I => F[O]].

Wye[I1,I2,O]: A pure stream transducer of two inputs. Emits values of type O, and may request values from the left, from the right, or from both, allowing for nondeterminism.

Process1[I,O]: A pure stream transducer of one input. Emits values of type O, and may request values from the input of type I.

Tee[I1,I2,O]: A pure transducer of two inputs. Emits values of type O, and may request values from the left or from the right.

To feed a source, src: Process[F,String] to a snk: Sink[F,String], we write src.to(snk). This is fully deterministic--we repeatedly read an element from src, send it to snk, and await a response from snk, halting if either src or snk halts. If we want to allow nondeterminism, we can do so, but we do so explicitly, using a more general combinator like connect: src.connect(snk)(wye.boundedQueue(20))

We're still connecting src to snk, but we're now using a first-class object, a Wye to handle the queueing strategy. We happen to be using the very simple queueing strategy, wye.boundedQueue(20), which will run the src and snk concurrently, blocking on snk only if 20 elements have enqueued unanswered at the snk. But the Wye type is an arbitrary stream transducer and we can build up more complex logic for deciding what sort of nondeterminism is allowed (block if more than 20 elements enqueue unanswered OR if four consecutive values are received from src that satisfy some predicate, and so on). The decision about nondeterminism, which in turn affects the memory usage of our program, is encoded as a first class object, and we can compose and abstract over these objects using the usual powerful tools of FP.
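The difference between the deterministic default and an explicitly bounded queue can be sketched in plain Scala. This is only an illustration of the idea under simplifying assumptions: ConnectSketch, to and connectBounded are hypothetical names, a source is modeled as an Iterator, a sink as a function A => Unit, and none of this is scalaz-stream's actual implementation.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical sketch in plain Scala. These are NOT scalaz-stream's
// definitions of `to` or `connect`, only an illustration of the idea.
object ConnectSketch {

  // Deterministic: read one element, hand it to the sink, wait for the
  // sink to return, repeat. At most one element is ever in flight.
  def to[A](src: Iterator[A])(snk: A => Unit): Unit =
    src.foreach(snk)

  // Explicit nondeterminism: src runs on its own thread and may get up to
  // `bound` elements ahead of snk. `put` blocks once the queue is full.
  // Assumes bound >= 1 and that snk does not throw.
  def connectBounded[A](src: Iterator[A], bound: Int)(snk: A => Unit): Unit = {
    val queue = new ArrayBlockingQueue[Option[A]](bound)
    val producer = new Thread(new Runnable {
      def run(): Unit = {
        src.foreach(a => queue.put(Some(a)))
        queue.put(None) // end-of-stream marker
      }
    })
    producer.start()
    var next = queue.take()
    while (next.isDefined) {
      snk(next.get)
      next = queue.take()
    }
    producer.join()
  }
}
```

Both functions deliver the same elements in the same order. What changes is how far the source may run ahead of the sink, and that number appears as an ordinary argument instead of being a property of the scheduler.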

Related problems with Future

So to summarize, actors are overly nondeterministic. Message sends are actions with side effects, rather than first class values we explicitly sequence and whose nondeterminism we explicitly control. Hence we leave an important property of our program, namely the queueing and memory usage, to various out-of-band mechanisms that aren't visible to the typechecker.

The Scala standard library Future type has a similar problem. The standard library Future type represents a running computation, rather than a description of a computation. This unfortunately means that the actual parallelism and hence runtime complexity one obtains for a Future[A] is an emergent runtime property which even depends on arbitrary details of how various functions that produce futures are factored. This is something we talk about in chapter 7 of the book, when we work through a design exercise of a library for describing parallel computations.

In contrast, the scalaz.concurrent.Task type is a purely functional description of a (possibly) parallel computation. Rather than nondeterminism being a runtime property, determined by the order that strict futures are created, Task is completely deterministic, generally instantiated lazily, and allowed points of nondeterminism are supplied explicitly using the functions of the Nondeterminism typeclass. It's a more explicit model, but that's not a problem. With the usual tools of FP, we can abstract over any common patterns we uncover when building Task values.
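The distinction between a running computation and a description of one can be sketched with the standard library alone. LazyTask below is a hypothetical stand-in for the idea of a lazily run description. It is not scalaz.concurrent.Task, and it leaves out everything about parallelism.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical sketch: LazyTask stands in for "a description of a
// computation". It is not scalaz.concurrent.Task.
object FutureVsDescription {

  final class LazyTask[A](thunk: () => A) {
    def run(): A = thunk()
    def map[B](f: A => B): LazyTask[B] = new LazyTask(() => f(thunk()))
  }

  // A standard library Future is submitted for execution when it is
  // created, and its effect happens once however often the value is reused.
  def futureRuns(): Int = {
    val runs = new AtomicInteger(0)
    val f = Future { runs.incrementAndGet() }
    Await.result(f.flatMap(_ => f), 5.seconds) // reuse f twice
    runs.get
  }

  // A lazy description does nothing until it is run, and every run
  // performs the effect again. Returns (runs before, runs after).
  def taskRuns(): (Int, Int) = {
    val runs = new AtomicInteger(0)
    val t = new LazyTask(() => runs.incrementAndGet())
    val before = runs.get
    t.run(); t.run()
    (before, runs.get)
  }
}
```

Because a Future starts when it is constructed, moving its construction from a val defined up front into the body of a flatMap changes whether two computations can overlap. With a description, that choice is made where the description is run or combined.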

22 comments:

Thanks, Paul, for this post which correctly highlights some of the runtime properties of actor programs. The criticism you formulate with respect to exposing the changing behavior of an actor to the rest of the program is a valid one, but it is also essential. The same is true for the issue that in general memory usage is not predictable as far as the model is concerned.

These issues are symptoms of the fact that actors are not overly non-deterministic, they are the essence of concurrency and therefore by nature as non-deterministic as they can be. This is the foundation upon which the model is built, as Hewitt expresses in his paper from 1973. The ability to share mutable information between independent processes is deeply linked to concurrency and this is also the reason why concurrency is always impure.

I would reformulate your conclusion a little bit: using the actor model exclusively (i.e. everything is an actor, the “universe of actors” model in Gul Agha’s thesis) is certainly not the right approach, which is another way of saying that concurrency should not be the default. Actors are very good at modeling independent and distributed processes, though, so they can and should be used wherever these qualities are needed. Independence also includes isolation of failures and thereby enables supervision, a concept which again is fundamentally linked to concurrency (because at least two entities—the client and the supervisor—will be communicating with the target).

In the end it looks like we largely agree about the state of concurrency right now, but of course we choose to implement solutions with a different focus, which I think will create a fertile ground for improvements to grow upon. Keep up the good work!

Hi @Roland. Thanks for your comment. I think there is probably a lot we agree on. Where I think we may differ is that IMO actors should be relegated to a low level concurrency primitive, rather than something that forms part of the public API that any working programmer actually uses. As a concurrency primitive, they are quite useful. Beyond that I think we need something better.

You may not agree, but I place actors at almost the same level as, say, a CountDownLatch. Perhaps a little more high level, but no one would ever tout a CountDownLatch as the solution to high-uptime, concurrent programs, nor would anyone want a public API that exposed CountDownLatches. A CountDownLatch can/should just be an implementation detail for something else that a programmer will actually use. But actors are "marketed" as much more than just a low-level primitive.

My experience so far with more purely functional approaches to concurrency is that FP can get you a very long way. E.g. scalaz-stream can already express lots of the use cases that people might otherwise express with actors. The main gaps I see remaining are in the area of distributed execution and supervision. I don't think the story there has been totally worked out, but there does not seem to be any fundamental barrier. Of course, I could definitely see how a high level, distributed, purely functional stream processing API might use actors as an implementation detail!

I think like ten years ago, it might have seemed a lot more plausible to people that a domain like concurrent and distributed programming might require use of side effects, and that probably turned a lot of people off on deeply investigating FP. But meanwhile, motivated FP folks have continued discovering more and more techniques for organizing large programs without side effects, and now most of these techniques are known in the FP community and there are very few gaps. It's now known, for instance, that there are nice purely functional models for handling I/O, something that people were probably more skeptical of ten years ago. Perhaps the overall effort of FP will run out of steam in some area and we'll settle for a programming model with side effects, but I don't think that's very likely if history is any indication! :)

Emergent behavior from concurrency seems hard to avoid in the end -- we're trying to harness asynchrony after all. And there are ample ways to constrain asynchrony, join patterns on Futures, etc.

I hope there's a way to combine some of the scalaz-stream ideas with the single threaded execution model of actors. I'd sure like to have the composability that typed interfaces provide. Channels and Tees and transducers look very nice. I'll spend some more time looking at scalaz-stream.

@Paul: Of course you are deliberately exaggerating a bit when you compare actors to latches ;-) The Actor Model was made famous—as you surely know—by a rock-solid telecommunications platform released by Ericsson in 1995, built on Erlang and achieving 99,9999999% uptime thanks to actors and supervision. The language itself is purely functional with a built-in actor construct for concurrency and it is renowned for its ability to express large and reliable systems, e.g. the Riak data store.

So, I do contend that Actors are suitable for writing high-uptime concurrent programs, it is just not the only ingredient you need for that: within your actors you want to be as pure as you can make it (to reap the usual benefits), and what is missing today—but actively being worked on—is type-safe composition on top of them.

@Roland - the fact that Ericsson and others have built reliable software using the actor model only shows that patient people (often with lots of financial backing) can build complex artifacts using primitive technology. You know, the Egyptian pyramids were built using manual labor and thousands of slaves, and they were pretty impressive... maybe we should go back to that approach next time we have to build a skyscraper! :-P

Anyway, I'm being flippant here, but you see my point. People produce amazing things all the time using technology that could be vastly better. So what? People are still writing amazing stuff using C and C++ for god's sake! Look at modern video games, for instance. What I want to talk about is how to make things better, not about how we can still get things done using approaches that we *know* have problems.

I'm only exaggerating a little bit when I put actors at the level of a latch. To the extent actors are responsible for the success of the software you mention, they have about as much to do with that success as use of *loops* and *integers* does--they are low level primitives used to build higher level functionality. It's great actors can be used to implement higher level functionality and so forth. (And it's also great that I can use loops and integers for lots of different purposes!) But there's a lot of marketing that goes on which results in actors getting all the credit for these platforms' successes. What is really happening is that people are using actors to build higher-level things, and if we're being honest, I'm not sure actors really deserve the credit.

Just look at Akka itself. It's a huge library, and I'm sure it has a lot of useful stuff, but if you are just talking about the actor model itself, well that is a trivial and tiny primitive. In scalaz, the entire implementation of actors is like 100 LOC. Akka has been successful because of all the *other* stuff you've built on top of the basic actor model.

This blog post has very good and valid observations about the actor model and a few very troubling ones.

The runtime characteristics and lack of composability comments are spot on and a very good description of the issues one may face. However, there seems to be some confusion about the different kinds of state and side effects we have in our software, especially in the comparison with threads.

The issue with threads, which is highlighted in the paper linked, is about observable non-determinism in memory access/manipulation. When you are using threads, the memory is usually shared in between threads, so changes to memory done in one thread are observable in another thread unless you use an exclusion mechanism. This issue is clearly visible with the code shown in the paper (section 4). That code, designed with an actor model, would not show the issues mentioned in the paper. That's what makes the comparison with the paper very distracting.

Of course changes in an actor state can be observed by other components of our system but it is only the state representation the actor is willing to expose. When many actors are interleaving execution, one cannot see the non-deterministic state in the remaining other actors.

Finally, the changes of an actor are not generally globally observable. You say:

"Even though an actor (generally) mutates only local state, these state changes are globally observable to any piece of code capable of sending a message to that actor."

There is a huge "if" in this sentence which is that the actor needs to be able to send a message to the other actor in the first place! So the actor state is only global if the actor is global and, of course, we want to limit this state exposition and its side-effects as much as we can.

About all the rest, as others said, "actors are not overly non-deterministic, they are the essence of concurrency and therefore by nature as non-deterministic as they can be". If you have channels in a concurrent environment, you can't tell the order the messages arrive in the channel. That's non-determinism. Even the messages in the channel are state! And you will have state in the entities pushing and popping messages from the channel as well and it is your responsibility to expose just the state that needs to be observed by the system being designed.

I agree that actors are too low level for many data flow problems. Take a simple example where you want to apply a function on two arguments produced by two different asynchronous senders (i.e. you know that each sender will produce exactly one message each, but which argument is produced first is non-deterministic). This is a typical synchronization point in a data flow graph. AFAIK there is no simple way to do this with actors and instead you have to implement your own solution in the receiving actor.
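For comparison, outside the actor model this kind of synchronization point can be sketched with the Scala standard library's Future and Promise. JoinSketch, join and demo are hypothetical names, and the two senders are modeled as promises that get completed in either order.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical sketch of a two-input synchronization point using
// standard library futures rather than actors.
object JoinSketch {

  // Apply f once both arguments have arrived, whatever the arrival order.
  def join[A, B, C](fa: Future[A], fb: Future[B])(f: (A, B) => C): Future[C] =
    fa.zip(fb).map { case (a, b) => f(a, b) }

  // The two senders are modeled as promises completed in either order.
  def demo(rightFirst: Boolean): Int = {
    val left = Promise[Int]()
    val right = Promise[Int]()
    val result = join(left.future, right.future)(_ - _)
    if (rightFirst) { right.success(3); left.success(10) }
    else { left.success(10); right.success(3) }
    Await.result(result, 5.seconds)
  }
}
```

Which argument arrives first is still nondeterministic, but the result is not: the left value is always the first operand of the subtraction.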

You can easily build totally deterministic, but still partly asynchronous, data flow graphs, but implementing them using actors is not trivial.

All too often I see the producer/consumer example for actors, and it is a terrible nonsensical example.

Light weight actors should be created along the true point of concurrency (representing the external process, file handle, socket, user, etc).

For a web server for example, you create one actor per incoming connection. That actor represents "truth" for that resource it represents. For a bulk file processing system, you create one actor per file.

Even if we go with the (already noted, terrible) producer / consumer example, the problems have absolutely nothing to do with actors. Queues exploded in actor free systems all the time because some idiot just keeps shoving stuff into it. If the queue exists in the form of a "mailbox" on an actor or in a giant enterprise queue system... same problem either way.

Also, for reasons I can't exactly fathom, you imply (but don't explicitly state) that you believe actors can't be synchronous, which, of course, they can. Actor gets message, replies, repeats. By placing an actor in the middle, you can even create a buffer and worker pool.

I find the conclusion that purely functional programming approach solves all your problems grossly erroneous. For example, take a look at how Haskell (hard to argue for a more functional language) implements "sparks" under the hood:

http://community.haskell.org/~simonmar/papers/multicore-ghc.pdf

From what I can tell, nondeterminism and memory consumption abound, and that is because functional programming abdicates all control to the compiler and runtime.

Notice that the mechanism described in the paper violates the principle that "Any decision affecting a program's asymptotic runtime or space complexity should be under explicit programmer control, using a clear and composable programming model."

On the claim that actors don't compose. One of the linked posts supporting the claim states:

"Actors don’t compose. By default actors hard-code the receiver of any messages they send. If I create an actor A that sends a message to actor B, and you want to change the receiver to actor C you are basically out of luck. If you’re lucky I anticipated this in advance and made it configurable, but more likely you have to change the source. Lack of composition makes it difficult to create big systems out of small ones."

That's just stupid, as in "lacking intelligence or common sense." It's absurd.

Don't hard code the actor. Provide a continuation, such as a "customer" that has to be sent with every message. The problem here is one of wrong abstraction. I've seen the tendency to try to somehow compose the actors instead of making composable message protocols. To steal Alan Kay's repeated calls, it's not about the actors themselves but about the actor messages. This is not a lack of composition, it's an equivalent of using a global magic variable in an object oriented system.
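The "customer" idea can be sketched without any actor library. In the hypothetical code below an actor is reduced to a message handler with no mailbox or scheduler, and Square, squarer and demo are made-up names used only to show the shape of such a message protocol.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical sketch: an "actor" reduced to a message handler, with no
// mailbox or scheduler, to show only the shape of the message protocol.
object CustomerSketch {

  // Every request carries its customer: where the reply should be sent.
  final case class Square(n: Int, customer: Int => Unit)

  // The handler never names a fixed receiver. It replies to whatever
  // customer arrived with the message.
  def squarer(msg: Square): Unit = msg.customer(msg.n * msg.n)

  def demo(): (List[Int], List[Int]) = {
    val b = ArrayBuffer.empty[Int]
    val c = ArrayBuffer.empty[Int]
    squarer(Square(3, n => b += n)) // reply routed to B
    squarer(Square(4, n => c += n)) // same handler, reply routed to C
    (b.toList, c.toList)
  }
}
```

Changing the receiver from B to C then means sending a different customer, not editing the handler.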

To address another referenced article: http://james-iry.blogspot.com/2009/04/erlang-style-actors-are-all-about.html

That is a really bad example of having a system that requires transactions, not understanding that, and then building a straw man using actors to demonstrate that actor model fails. I would love to see a "correct" implementation of that specific example (i.e. no Alice and Bob state tracking) in _any_ language.

Mr. James Iry demonstrated that he was at the time (2009), in that particular instance, a poor designer. He did not demonstrate that actor systems are flawed.

"Actors don’t compose. By default actors hard-code the receiver of any messages they send. If I create an actor A that sends a message to actor B, and you want to change the receiver to actor C you are basically out of luck."

Let me try: In the text above replace 'Actors' with

a) 'Classes'
result: The statement is true. If we import and instantiate class B in A we cannot change that to C without rewrite.

b) 'Functions'
result: The statement is true. If I directly call another function B from function A I cannot change that to call function C instead without rewrite.

"If you’re lucky I anticipated this in advance and made it configurable, but more likely you have to change the source. Lack of composition makes it difficult to create big systems out of small ones."

This statement is true. In all cases.

If you're lucky I made my OO based on DI.
If you're lucky I based my FP on HOF.
If you're lucky I made my Actors (or the message protocol) configurable.

> Queues exploded in actor free systems all the time because some idiot just keeps shoving stuff into it. If the queue exists in the form of a "mailbox" on an actor or in a giant enterprise queue system... same problem either way.

In this post, I am not comparing actors to queues, I am comparing purely functional concurrency vs actors using side effects. In the purely functional approach, the default is totally deterministic, with all actions as values that must be explicitly sequenced. Any queueing and other nondeterminism is introduced very explicitly. The nondeterminism is part of the programming model, rather than an emergent runtime property. I think this is the 'correct' default.

I don't know of any actor system where message sends return effects as values that are explicitly sequenced using standard monadic combinators. So I think you are probably referring to something else when you say that actors systems can be synchronous. Maybe you can explain more what you meant.

In any case, even if you are explicitly sequencing message sends, you still have the problem that message sends are 'fire and forget', they return the moral equivalent of 'IO Unit', which is bad for abstraction and further composition.

@Tristan -

> I find the conclusion that purely functional programming approach solves all your problems grossly erroneous. For example, take a look at how Haskell (hard to argue for a more functional language) implements "sparks" under the hood...

The spark API is not really purely functional, and it's certainly not the only way of doing parallelism in Haskell - 'a par b' is arguably starting evaluation of 'a' as a side effect of evaluating 'b'. But you can certainly have a very explicit, purely functional API for doing parallelism in Haskell, see for instance this post by Simon Marlow.

Incidentally, there is absolutely nothing wrong with implementing a pure API using (unobservable) side effects under the hood. Functional programmers do that all the time! The key is whether the side effects of the implementation are observable to outside code.

Regarding your point that you can build composable message protocols atop actors, sure. I can also build a distributed system in assembly language, and I can build higher level systems using preemptive threads and locks. You're making a Turing tarpit style argument. Actors are a low level construct, and to get anything done, you end up trying to build various abstractions around actors. Unfortunately, much like assembly language, actors don't facilitate abstraction very well. Message sends still just return Unit, so the type system does not provide any help whatsoever in ensuring that whatever high level protocol you've worked out gets followed. You end up working in a more first order way, with "design patterns" rather than actual reusable code abstractions. See my comment below to Anonymous.

> To address another referenced article: http://james-iry.blogspot.com/2009/04/erlang-style-actors-are-all-about.html ...That is a really bad example of having a system that requires transactions, not understanding that, and then building a straw man using actors to demonstrate that actor model fails.

The point of that article was not to design a great actor system. It was to show with a simple example that actors can have state which is mutated by message sends, and that these side effects are observable to programs that send messages to the actor. Stated more technically, message sends are not referentially transparent, and the actor model should not therefore be considered purely functional. You should probably comment on his post, not mine, though. :)

> "Actors don’t compose. By default actors hard-code the receiver of any messages they send. If I create an actor A that sends a message to actor B, and you want to change the receiver to actor C you are basically out of luck"

> b) 'Functions' ... If I directly call another function B from function A I cannot change that to call function C instead without rewrite.

Okay, that is an interesting argument. Let me now dismantle it. :) Functions have typed inputs and a typed result. Both the input and the output are the basis for further abstraction, and importantly, we have retained type information for both. So functions don't hardcode two key pieces of information -- where to obtain their inputs, and what to do with their output. Of course, we can generalize functions, make them higher order, etc, and make them even more reusable, but the key is that functions in their most basic form already provide hooks for further composition.

Sending a message to an actor, on the other hand, returns nothing. Or perhaps it returns some sort of untyped Future. In any case, we've tossed out type information, which means we get little help from the typechecker when attempting to build other abstractions atop actors. This in turn leads to a very first order style, because you are discarding one of the key tools (a type system) that makes abstraction actually possible and usable.

To make a pretty direct analogy, imagine if in Java all functions returned void and you were forced to run side effects on the function arguments to 'return' anything. In theory, maybe you could build some more composable conventions for writing programs, but the type system won't help you at all (it won't tell you whether you've invoked all the effects you intended to), and we can all see how silly this would be. Actors are just like that.
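Here is a small sketch of that contrast in plain Scala. The names SendVsFunction, parse, twice and UntypedReceiver are hypothetical, and the receiver is a deliberately simplified stand-in for a fire-and-forget send, not any real actor API.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical sketch contrasting typed functions with a Unit-returning
// send. UntypedReceiver is a stand-in, not a real actor API.
object SendVsFunction {

  // Functions keep type information on both ends, so the compiler checks
  // the composition: parse's output type must match twice's input type.
  val parse: String => Int = _.trim.toInt
  val twice: Int => Int = _ * 2
  val pipeline: String => Int = parse.andThen(twice)

  // A fire-and-forget send, modeled as Any => Unit. The "result" can only
  // come back as a side effect, and nothing in the types says whether it
  // ever will.
  final class UntypedReceiver {
    val replies = ArrayBuffer.empty[Any]
    def send(msg: Any): Unit = msg match {
      case s: String => replies += s.trim.toInt * 2
      case _         => () // unknown messages are silently dropped
    }
  }
}
```

Passing an Int to pipeline is rejected at compile time. Passing an Int to send compiles and is silently ignored at runtime.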

My comment about sparks had to do with the criticism of "Any decision affecting a program's asymptotic runtime or space complexity should be under explicit programmer control, using a clear and composable programming model." and not a type system argument. Regardless of whether side effects are observable or not, the implementation of the functional paradigm violates the principle above outlined as a shortcoming of actors. Sparks illustrate that the functional paradigm is no different. Type systems need have nothing to do with this point.

RE: James Iry.. well of course actors are not purely functional, actors have internal state. He still designed a transactional system and then ignored transactions and showed race conditions.

RE: "actors are a low level construct". I find that assertion difficult to reconcile with my experience of modeling services (as in service oriented architectures) as actors and seeing others do the same. It's difficult to claim that an entire service is a "low level construct."

Even though sparks are implemented in Haskell, they are not really representative of 'the functional paradigm'. So pointing to them as evidence that 'the functional paradigm' has the same problems I ascribe to actors is not really fair. It is possible to have a nice, purely functional API for describing parallel computations in Haskell. See for instance the Simon Marlow post I linked to. We also develop a similar library in Scala in chapter 7 of the book. The implementation, by the way, uses side effects internally, but exposes a purely functional API (unlike the actor model).

> RE: James Iry.. well of course actors are not purely functional, actors have internal state.

Okay, thank you for acknowledging that. That was really the main point of his post, I'd say.

> He still designed a transactional system and then ignored transactions and showed race conditions.

With purely functional programming, there are no race conditions. You sequence the effects in exactly the order you want, and inject nondeterminism explicitly for performance where this doesn't affect correctness. The possibility of race conditions is to me a sign that the programming model was overly nondeterministic by default, and this must now be 'undone'. Of course, the type system provides no help in telling you where and how to undo this excessive nondeterminism. (Whereas in the functional model, the type system will indeed force you to make some decision about the order of sequencing any effects.)

> RE: "actors are a low level construct". I find that assertion difficult to reconcile with my experience

Okay, fair enough. I will just say that unless you've used an alternative, what you are using is going to seem as good as it gets. People programming in C for the first time probably thought C was high level (compared to writing assembly). From my personal experience with purely functional approaches and actors, the functional model feels much higher level and I have better tools for abstraction. I will probably start posting some more examples of different functionality using scalaz-stream (possibly side by side with an actor-based implementation), and I think that might make things more clear. :)

> I will probably start posting some more examples of different functionality using scalaz-stream (possibly side by side with an actor-based implementation), and I think that might make things more clear. :)

That would be great. I've spent a lot of time in actors and for better or worse they are intuitive and I don't have problems creating high level abstractions with them. But as you pointed out, it may well be the case of having internalized the right patterns. So maybe I just got lucky (or unlucky, depending on the point of view). I've also spent time doing functional programming, and "get it", and I'm a big fan of not having state.

On a side-note, there is a taxonomy of actors to choose from parameterized on state and behavior.

People mostly have issues with "serialized actors", the ones that mutate state and change behavior. But as in any good system design, these should be found only/primarily at the boundaries of the system to present an interface to something non-functional (for example, to buffer requests but present the same address for follow-on messages to arrive at). Internally to the system, function and value actors seem to me to result in more understandable systems.

// end side-note :)

I am very much interested in making actor programming easier, if you do decide to post some examples of scalaz-stream with actor implementations (or at least outlines of them) side-by-side that would help me in better understanding where the friction points are and perhaps then they could be addressed with a better programming surface.