In-Depth Async in Akka .NET: Why We Need PipeTo()

Update 21st August 2016: I wrote this article based on outdated Akka .NET documentation that discouraged async/await within actors and suggested using PipeTo() instead. Akka .NET now does support async/await (thanks to the ReceiveAsync() method), and PipeTo() is not a replacement for it. Aaron Stannard (in a comment on this post) and Roger Alsing (on Reddit) from the Akka .NET team were very prompt in correcting various misconceptions, and Aaron Stannard has since updated the Petabridge blog post about PipeTo(). See my follow-up post for the latest best practices.

Tasks and the more recent async/await syntactic sugar have been a blessing for .NET developers aiming to keep their applications responsive despite increasing requirements for I/O and CPU-intensive requests.

Thus it was really odd for me to learn that Akka .NET, an emergent framework for distributed computing, not only does not support async/await directly within actors, but actually discourages its use (going as far as calling it a “code smell”).

In fact, they implemented this PipeTo() workaround that you need to use, sending the result of a task to an actor for processing. You can’t use async/await; you have to resort to the old ContinueWith() way of chaining tasks if you want to do any post-execution logic. If you’ve worked with ContinueWith() in the past, you’ll know it can get ugly really fast.

“The fact I decided to use DbCommand.ExecuteNonQueryAsync() instead of DbCommand.ExecuteNonQuery() shouldn’t force me to break a single message into multiple messages with PipeTo.”

Update 20th August 2016: Thanks to the Reddit user who brought to my attention that there actually is proper async support (though apparently not yet documented anywhere). Use the ReceiveActor’s ReceiveAsync() method.
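As a rough sketch of what that looks like, assuming the Akka NuGet package is referenced: `ReceiveActor` and `ReceiveAsync<T>()` are the Akka .NET names mentioned above, while the actor name `PatientActor` and the one-second delay are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical actor, named for illustration only.
public class PatientActor : ReceiveActor
{
    public PatientActor()
    {
        // ReceiveAsync takes a Task-returning handler, so the actor can
        // hold back the next message until this one has fully completed.
        ReceiveAsync<string>(async s =>
        {
            Console.WriteLine("Start " + s);
            await Task.Delay(1000); // stand-in for real asynchronous work
            Console.WriteLine("End " + s);
        });
    }
}
```

With several messages sent in a row, each “End” line should appear before the next “Start” line, at the cost of the actor doing nothing else while it awaits.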

Why Akka .NET Discourages async/await

To learn why awaiting in an actor is bad, let’s break the rules and do it.
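A minimal sketch of such a rule-breaking actor, assuming the Akka NuGet package is referenced. `BusyActor` is the actor name used later in this article; the handler name `HandleString`, the job names and the length of the delay are illustrative guesses rather than the original listing.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

public class BusyActor : ReceiveActor
{
    public BusyActor()
    {
        // Receive<T> expects a synchronous handler; nothing here can
        // observe when the async void method actually finishes.
        Receive<string>(s => HandleString(s));
    }

    // Breaking the rules on purpose: async void inside an actor.
    private async void HandleString(string s)
    {
        Console.WriteLine("Start " + s);
        await Task.Delay(2000); // the handler returns to Akka .NET here
        Console.WriteLine("End " + s);
    }
}

public static class Program
{
    public static void Main()
    {
        using (var system = ActorSystem.Create("MyActorSystem"))
        {
            var actor = system.ActorOf(Props.Create<BusyActor>(), "BusyActor");

            actor.Tell("Job 1");
            actor.Tell("Job 2");
            actor.Tell("Job 3");

            Console.ReadLine(); // keep the process alive while the jobs run
        }
    }
}
```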

What happened here? All three messages were processed in quick succession, and they have been interleaved. This is very bad, and in fact we were warned about it. Quoting the questions on the official PipeTo() sample:

“Await breaks the “actors process one message at a time” guarantee, and suddenly your actor’s context might be different. Variables such as the Sender of the previous message may be different, or the actor might even be shutting down when the await call returns to the previous context.”

Why Processing Messages Asynchronously Causes Interleaving

We can learn a lot about how actors process messages by investigating the Akka .NET source code. This method in Mailbox.cs seems to be more or less where actors begin to process their messages:

Now, the code above may look confusing, but the point here is not to understand what it’s doing exactly. Take a closer look. The methods in the call stack are mostly void (or otherwise returning simple types). There are no Tasks to be seen anywhere.

What does this mean for us? It means that we’re doing something very bad when we declare our message handler as async void.

Understanding async void

In order to better understand why the approach we took earlier will never work, it’s best to look at a much simpler example:
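A minimal sketch of such an example: a plain console program with no Akka .NET involved. The names `AsyncVoidDemo` and `RunJob`, the job labels and the delays are illustrative choices, not the original listing.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    // async void: the caller gets nothing back that it could await.
    private static async void RunJob(string name)
    {
        Console.WriteLine("Start " + name);
        await Task.Delay(500); // control returns to the caller here
        Console.WriteLine("End " + name);
    }

    public static void Main()
    {
        RunJob("Job 1");
        RunJob("Job 2");
        RunJob("Job 3");

        // All three calls have already returned; wait long enough for
        // their continuations to run before the process exits.
        Thread.Sleep(2000);
    }
}
```

Running it should print all three “Start” lines before any “End” line, because each call to `RunJob` returns to `Main` as soon as it reaches the `await`.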

This is exactly the same interleaving problem we had with Akka .NET, except that this time we have no Akka .NET.

The real cause of this problem is an incorrect use of asynchrony. As you can read in Stephen Cleary’s MSDN Magazine article, “Async/Await – Best Practices in Asynchronous Programming” (March 2013), async void methods can be pretty dangerous to work with. When you call an async void method, you have two main problems: you have no way of awaiting completion of the method, and exceptions can bring the whole application down.

But here, we have also seen a third problem: that the method effectively exits when you await, returning execution control to the caller. In Akka .NET, this means that the next message will begin processing while the current one hasn’t finished yet.

async void methods should be restricted to methods at the beginning of the call chain (such as event handlers and WPF command handlers). You can’t sneak asynchrony into an otherwise synchronous call stack by introducing an async void. If you do async, it has to be all the way.

So it really seems that the problem with having asynchronous actor logic is simply that Akka .NET was never really designed to work with asynchronous methods.

Asynchrony in Akka .NET with PipeTo()

It should be clear by now that doing async/await in actors is not an option. So how do we go about doing our asynchronous work? We do that by using the PipeTo() pattern (because in Akka .NET, everything is called a pattern).

Let’s go back to our original example with the BusyActor. We left off with this code:

Now, we need to refactor this to do the asynchronous operation (in this case Task.Delay()) in a fire-and-forget manner, and send the result as a separate message to an actor. We’re going to need separate messages for this:

Similarly to the official example (which shows how to do an HTTP GET request within an actor), we are firing off an asynchronous request but not awaiting it. This happens in a fire-and-forget manner as far as the actor is concerned. When the asynchronous operation is done, we create a new message and send it to the same actor so that it can log the end of the task.

Note that we have those two TaskContinuationOptions settings. You can read more about them in the official PipeTo() blog post, but the point I want to make here is that you need to remember to include them, and this makes this approach pretty error-prone.

Back in our main program, we need to send a TaskMessage instead of a simple string now:
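Pulling the pieces of this section together, a minimal sketch might look like the following. It assumes the Akka NuGet package is referenced. `TaskMessage` and `BusyActor` are the names used in this article; `TaskDoneMessage`, the `Name` property and the delay are placeholders, and `AttachedToParent` and `ExecuteSynchronously` are assumed to be the two settings referred to above.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

public class TaskMessage
{
    public TaskMessage(string name) { Name = name; }
    public string Name { get; }
}

// Hypothetical name for the message that reports completion.
public class TaskDoneMessage
{
    public TaskDoneMessage(string name) { Name = name; }
    public string Name { get; }
}

public class BusyActor : ReceiveActor
{
    public BusyActor()
    {
        Receive<TaskMessage>(m => HandleTask(m));
        Receive<TaskDoneMessage>(m => Console.WriteLine("End " + m.Name));
    }

    private void HandleTask(TaskMessage m)
    {
        Console.WriteLine("Start " + m.Name);

        // Fire and forget: nothing is awaited, so this handler returns
        // immediately and the result arrives later as a new message.
        Task.Delay(2000)
            .ContinueWith(
                t => new TaskDoneMessage(m.Name),
                TaskContinuationOptions.AttachedToParent
                    | TaskContinuationOptions.ExecuteSynchronously)
            .PipeTo(Self);
    }
}

public static class Program
{
    public static void Main()
    {
        using (var system = ActorSystem.Create("MyActorSystem"))
        {
            var actor = system.ActorOf(Props.Create<BusyActor>(), "BusyActor");

            actor.Tell(new TaskMessage("Job 1"));
            actor.Tell(new TaskMessage("Job 2"));
            actor.Tell(new TaskMessage("Job 3"));

            Console.ReadLine(); // keep the process alive while the jobs run
        }
    }
}
```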

This is bad. Even with PipeTo(), we still have the same interleaving problem as before. If you think about it, it makes sense.

What we are doing is firing off a fire-and-forget task, and the method can return immediately, thus allowing the next message to be processed before the asynchronous operation has completed. This is exactly the same problem we had when using async void.

If you’re firing off an asynchronous operation that doesn’t touch anything else and you just want to take its result, then the suggested PipeTo() approach will work. But if you need a guarantee on message order because your message processing is touching some state (thus an older message might overwrite the results of a newer message), then this is going to be a problem.

Coupling and Cohesion

Another problem with using PipeTo() is that it… complicates things. You can already see how our original example has been bloated into something a lot less easy to work with, just for the sake of not using async/await. But there’s more.

One common pitfall I see when developers begin to understand the importance of decoupled software is that they go to the other extreme: they split up components into extremely granular classes. In doing so, they break the companion principle of coupling: cohesion. While coupling dictates that software components should have minimal dependencies between themselves, cohesion suggests that components with strong direct interrelations should work closely together. Making classes too granular, for instance, is another way to end up with messy software.

At the beginning of this article, I quoted Natan Vivo’s comment about having to break a single message into multiple messages just to perform a database operation. Typically, in ADO.NET, a database operation would look something like this:

1. Open a connection to the database.
2. Execute a command (query, nonquery, etc.) against the database.
3. In case of a query, iterate over the rows and do something with them.

Each of the three operations above can be done asynchronously in sequence. They are meant to be together because they are part of the same cohesive operation. But if you break each of these operations into different messages and different message handlers, you’re going to scatter this otherwise contiguous operation all over the place. And that makes software a lot harder to maintain.
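Kept together, the three steps fit in one short asynchronous method. This is only a sketch using the provider-agnostic `System.Data.Common` types; the class name `CustomerQueries`, the method name `ReadNamesAsync` and the assumption that the first column is a string are all hypothetical.

```csharp
using System.Collections.Generic;
using System.Data.Common;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class CustomerQueries
{
    public static async Task<List<string>> ReadNamesAsync(
        DbConnection connection, string sql)
    {
        var names = new List<string>();

        // Step 1: open the connection.
        await connection.OpenAsync();

        using (DbCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;

            // Step 2: execute the command.
            using (DbDataReader reader = await command.ExecuteReaderAsync())
            {
                // Step 3: iterate over the rows.
                while (await reader.ReadAsync())
                {
                    names.Add(reader.GetString(0)); // assumes column 0 is text
                }
            }
        }

        return names;
    }
}
```

The caller supplies a connection from whichever ADO.NET provider it uses, and the whole operation reads top to bottom in one place.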

1. The idea behind PipeTo is that the results of the task just get delivered through an actor’s mailbox. That’s it. It allows the actor’s mailbox to do its job, which is to enforce serial access to the actor’s internal state.

Your example doesn’t demonstrate that anything is wrong with PipeTo – that’s a consequence of how you’re using the TPL. In a real-world scenario if those tasks had to be completed in an explicit order, you wouldn’t start all three of them simultaneously. You’d pipeline them one after the other.

Any combination of Task scheduling could be done with Task composition – that’s the point of the Promises (TaskCompletionSource) and Futures (Task) pattern to begin with. To be able to compose asynchronous work into directed graphs and sequences. PipeTo just delivers the result of that to the actor’s mailbox.

2. Why we recommend PipeTo over async / await. As I explain in the link I included, we have to suspend the actor’s mailbox if we’re processing an `await` – otherwise the state isolation guarantees of actors are violated. The cost of this is that your actor literally can’t do anything while each `Task` in an `await` chain runs and it will hurt total throughput.

Using PipeTo, on the other hand, allows your actor to continue doing other work while the `Task` runs on another thread somewhere. This gives the actor the ability to parallelize its work, but it also gives you the ability to execute control-flow over the Tasks you’re executing.

For instance, if you need to cancel a task and have a specific control message you want to send to an actor to have it cancel a task it started previously there’s no real way to do it with async / await. The message won’t get received until the await chain is finished being processed.

With PipeTo this is trivial to do: the actor just hangs onto the `Task` or `CancellationTokenSource` and invokes it.
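One possible shape for this, as a sketch that assumes the Akka NuGet package is referenced. The message types `StartJob`, `CancelJob` and `JobFinished` and the actor name `CancellableActor` are invented for illustration, and `Task.Delay()` stands in for the real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical message types, for illustration only.
public class StartJob { }
public class CancelJob { }

public class JobFinished
{
    public JobFinished(bool wasCancelled) { WasCancelled = wasCancelled; }
    public bool WasCancelled { get; }
}

public class CancellableActor : ReceiveActor
{
    // The actor hangs onto the token source of the job it started.
    private CancellationTokenSource _cts;

    public CancellableActor()
    {
        Receive<StartJob>(_ =>
        {
            _cts = new CancellationTokenSource();

            // The handler returns at once, so the mailbox stays open
            // for control messages such as CancelJob.
            Task.Delay(10000, _cts.Token)
                .ContinueWith(t => new JobFinished(t.IsCanceled))
                .PipeTo(Self);
        });

        Receive<CancelJob>(_ =>
        {
            if (_cts != null) _cts.Cancel();
        });

        Receive<JobFinished>(m =>
            Console.WriteLine(m.WasCancelled ? "Job cancelled" : "Job completed"));
    }
}
```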

Calling PipeTo an “anti-pattern” is a huge misunderstanding of what it does and what await actually does.

In Natan’s scenario – just package the three of those calls into a single method with 2 awaits internally and then have the actor pipe the result of that to itself. No need to await directly inside the actor for it.
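A sketch of that suggestion, assuming the Akka NuGet package is referenced. `DatabaseActor`, `RunQuery` and `QueryFinished` are invented names, and the two `Task.Delay()` calls are stand-ins for the real database calls.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical message types, for illustration only.
public class RunQuery
{
    public RunQuery(string sql) { Sql = sql; }
    public string Sql { get; }
}

public class QueryFinished
{
    public QueryFinished(int rowCount) { RowCount = rowCount; }
    public int RowCount { get; }
}

public class DatabaseActor : ReceiveActor
{
    public DatabaseActor()
    {
        Receive<RunQuery>(m =>
        {
            // Start the whole operation and return at once; only its
            // final result comes back, as an ordinary mailbox message.
            RunQueryAsync(m.Sql).PipeTo(Self);
        });

        Receive<QueryFinished>(m =>
            Console.WriteLine("Rows read: " + m.RowCount));

        // By default PipeTo reports a faulted task as Status.Failure.
        Receive<Status.Failure>(f =>
            Console.WriteLine("Query failed: " + f.Cause.Message));
    }

    // Static, so it cannot touch the actor's state from another thread.
    private static async Task<QueryFinished> RunQueryAsync(string sql)
    {
        await Task.Delay(100); // stand-in for opening the connection
        await Task.Delay(100); // stand-in for executing the command
        return new QueryFinished(3); // made-up row count
    }
}
```

The awaits live inside one cohesive method, while the actor itself only ever sees two plain messages: the request and its outcome.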