Eduasync part 9: generated code for multiple awaits

Last time we looked at a complex async method with nested loops and a single await. This post is the exact opposite – the method is going to look simple, but it will contain three await expressions. If you’re glancing down this post and feel put off by the amount of code, don’t worry – once you’ve got the hang of the pattern, it’s really pretty simple.
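The listing for the method is not reproduced in this copy of the post. A minimal sketch of a method matching the description might look like the following; the class and method names (MultipleAwaits, SumThreeValuesAsync) are assumptions made for illustration, not necessarily the ones used in the Eduasync source.

```csharp
using System.Threading.Tasks;

public static class MultipleAwaits
{
    // Hypothetical name: starts all three tasks first, then awaits each in turn.
    public static async Task<int> SumThreeValuesAsync()
    {
        // All three tasks are started before any await is reached.
        Task<int> task1 = Task.Factory.StartNew(() => 1);
        Task<int> task2 = Task.Factory.StartNew(() => 2);
        Task<int> task3 = Task.Factory.StartNew(() => 3);

        // Three await expressions, one per task.
        int value1 = await task1;
        int value2 = await task2;
        int value3 = await task3;

        return value1 + value2 + value3;
    }
}
```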

Nice and straightforward: start three tasks, then sum their results. Before we look at the decompiled code, it’s worth noting that writing it this way allows the three (admittedly trivial) tasks to run in parallel. If we’d written it this way instead:
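(The sequential listing is also missing from this copy. The sketch below is an assumption about its shape, with hypothetical names: each task is only started once the previous await has completed.)

```csharp
using System.Threading.Tasks;

public static class SequentialAwaits
{
    // Hypothetical name: each task is created and awaited before the next one starts.
    public static async Task<int> SumThreeValuesSequentiallyAsync()
    {
        int value1 = await Task.Factory.StartNew(() => 1);
        // The second task is not even created until the first has finished.
        int value2 = await Task.Factory.StartNew(() => 2);
        // Likewise, the third waits for the second.
        int value3 = await Task.Factory.StartNew(() => 3);

        return value1 + value2 + value3;
    }
}
```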

… then we’d have waited for the first task to finish before starting the second one, then waited for the second one to complete before we started the third one. That’s appropriate when there are dependencies between your tasks (i.e. you need the result of the first as an input to the second), and it would still have been asynchronous. But when you can start multiple independent tasks together, that’s generally what you want to do. Don’t forget that this doesn’t just extend to CPU-bound tasks – you might want to launch tasks making multiple web service calls in parallel, before collecting the results.

The comment is somewhat brief here, but the basic idea is that all the code which will only ever execute the first time (with no continuations) can go here. If there’s a loop that contains an await, then a continuation would have to jump back into that loop, so that code couldn’t be contained within this initial block. (A loop which didn’t have any awaits in could though.)

Anyway, this time we don’t have an "if" statement like that – we have a switch. It’s the same idea, but we could be in any of three different states when a continuation is called, depending on which await expression we’re at. The switch statement efficiently branches to the right place for a continuation, and executes the initial code otherwise. The "branch" for state 1 is just to exit the switch statement and continue from there.
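To make the shape of that switch concrete, here is a hand-written sketch of a state machine for a three-await method. It is not the compiler’s actual output: every name in it (SumStateMachineSketch, Start, ResumeAwait2 and so on) is hypothetical, it uses a TaskCompletionSource<int> where the real code uses a builder, and it omits details such as the -1 state. It does keep one awaiter field per await expression and resets each awaiter to its default value once it has been used.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hand-written sketch, NOT real compiler output; all names are hypothetical.
public sealed class SumStateMachineSketch
{
    private int state; // 0 = first call; 1, 2, 3 = paused at that await
    private readonly TaskCompletionSource<int> result = new TaskCompletionSource<int>();
    private Task<int> task1, task2, task3;
    private TaskAwaiter<int> awaiter1, awaiter2, awaiter3;
    private int value1, value2, value3;

    public static Task<int> Start()
    {
        var machine = new SumStateMachineSketch();
        machine.MoveNext();
        return machine.result.Task;
    }

    private void MoveNext()
    {
        try
        {
            switch (state)
            {
                case 1: break;              // just leave the switch and carry on below
                case 2: goto ResumeAwait2;
                case 3: goto ResumeAwait3;
                default:
                    // Code that only ever runs on the first call.
                    task1 = Task.Factory.StartNew(() => 1);
                    task2 = Task.Factory.StartNew(() => 2);
                    task3 = Task.Factory.StartNew(() => 3);
                    awaiter1 = task1.GetAwaiter();
                    if (awaiter1.IsCompleted) break;
                    state = 1;
                    awaiter1.OnCompleted(MoveNext);
                    return;
            }
            value1 = awaiter1.GetResult();
            awaiter1 = default(TaskAwaiter<int>); // don't keep the task alive

            awaiter2 = task2.GetAwaiter();
            if (!awaiter2.IsCompleted)
            {
                state = 2;
                awaiter2.OnCompleted(MoveNext);
                return;
            }
        ResumeAwait2:
            value2 = awaiter2.GetResult();
            awaiter2 = default(TaskAwaiter<int>);

            awaiter3 = task3.GetAwaiter();
            if (!awaiter3.IsCompleted)
            {
                state = 3;
                awaiter3.OnCompleted(MoveNext);
                return;
            }
        ResumeAwait3:
            value3 = awaiter3.GetResult();
            awaiter3 = default(TaskAwaiter<int>);
            result.SetResult(value1 + value2 + value3);
        }
        catch (Exception e)
        {
            // Any failure is reported through the returned task.
            result.SetException(e);
        }
    }
}
```

Calling SumStateMachineSketch.Start() returns a Task<int> that completes with the sum once all three awaits have been passed, whichever of the three states the continuations happen to re-enter through.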

It’s possible that the generated code actually has more levels of indirection than it needs; I don’t know about the details of what’s allowed within an IL switch, but it seems odd to effectively have an "On X goto Y" where Y immediately performs a "goto Z". If the switch statement could branch immediately to the right label, we’d end up with IL which probably wouldn’t have a hope of being decompiled to C#, but which might be slightly more efficient. It’s quite likely that the JIT can sort all of that out, of course.

I tend to actually think about all of this as if the code that’s really in the "default" case for the switch statement appeared after the switch, and the default case just contained a goto statement to jump to it. The effect would be exactly the same, of course – but it means I have a mental model of the method consisting of a "jump to the right place" phase before an "execute the code" phase. Just because I think of it that way doesn’t mean you have to, of course 🙂

Multiple awaiters and await results

There’s room for a bit more optimization in this specific case. We have three awaiters, but they’re all of the same type (TaskAwaiter<int>). Likewise, we have three await results, but they’re all int. (It would be possible to have different awaiter types with the same result type, of course.)

In the CTP (at least without optimization enabled) we end up with an awaiter / awaitResult pair of variables for each await expression. There’s never more than one await "active" at any one time, so the C# compiler could generate one awaiter variable per awaiter type, and one result variable per result type. In the common situation where the result is being directly assigned to a "local" variable of the same type within the method, we don’t really need the result variable at all. On the other hand, it’s only a local variable (unlike the awaiter) and it’s quite possible that the JIT can optimize this instead.

Ultimately it’s entirely reasonable for the C# compiler to be generating suboptimal code at this point in the development cycle. After all, it could be quite easy to introduce bugs due to inappropriate code generation… as we’ll see next time.

Conclusion

Other than the different way of getting to the right place on entry (using a switch instead of an if statement), async methods with multiple await expressions aren’t that hard to follow. Of course when you combine multiple awaits with loops, try/catch, try/finally blocks and any number of other things you might use to complicate the async method, things become tricky – but with the fundamentals covered in this blog series, hopefully you’d be able to cope with the generated code in any reasonable situation. Of course, it’s rare that you’ll need (or want) to look at the generated code in anything like the detail we have here – but now we’ve looked at it, you don’t need to wonder where the magic happens.

The next post will be the last one involving the decompiled code, at least for the moment. I’d like to demonstrate a bug in the CTP – mostly to show you how a small change to the async method can trigger the wrong results. I’m absolutely positive it will be fixed before release – probably for the next CTP or beta – but I think it’s interesting to see the sort of situation which can cause problems.

After that, we’re going to look at exception handling, before we move into a few odd ways of using async – in particular, implementing coroutines and "COMEFROM".

8 thoughts on “Eduasync part 9: generated code for multiple awaits”

Any insight as to why each completion case assigns a newly constructed default instance of TaskAwaiter to the “awaiterX” variable that was just used? As near as I can tell, that new instance is never used and, as you say, the compiler could have just merged all the “awaiterX” variables into one anyway.

@pete.d: It’s not creating an instance of an object – TaskAwaiter is a struct. If actually assigned “default(TAwaiter)” to the awaiter variable, to ensure that it doesn’t keep an object alive pointlessly.

I’ll try to either edit an existing post or include that in another one at some point. Thanks for pointing it out.

Oh for crying out loud. One of these days, the “struct”-ness of that type will stick in my brain.

I assume you mean that “it’s actually assigned”, not “if actually assigned”. And if so, that makes a lot more sense to me (and is more self-explanatory) than using the “new” operator (though of course is equivalent).

@pete.d: Sorry, yes, I meant “it’s”. And yes, default(T) would have been a clearer way of writing it – I just copied what Reflector gave me. I’m about to go on holiday, but will attempt to remember to fix that up when I get home.

What I find odd is that the switch has an if() in the default case. Why not implement it as a case?

switch (state) {
case 1: …
case 2: …
case -1: …
default: return;
}

My other question is about multi threading.
If we await the same task on two threads, would it enter the MoveNext() method on both threads (wreaking havoc), or would it do the right thing? I’m having a bit of a hard time understanding what awaiting a task generated with an async method actually does.

@configurator: I suspect that some of the handling for the -1 case is due to the iterator block code generator. It may go away later… we should never re-enter this code in state -1 anyway. Likewise we shouldn’t be entering MoveNext() in two different threads anyway.

Awaiting a task generated with an async method will call GetAwaiter() on the task, and then either fetch the result or add a continuation, depending on whether the task is completed or not. It won’t *cause* the MoveNext() method to be called.