Introducing F# Asynchronous Workflows

F# 1.9.2.9 includes a pre-release of F# asynchronous workflows. In this blog post we'll take a look at asynchronous workflows and how you might use them in practice. Asynchronous workflows are an application of F#'s computation expression syntax.

Below is a simple example of two asynchronous workflows and how you can run these in parallel:
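A minimal sketch of such a pair of workflows follows. It is written against current F#, where Async.RunSynchronously plays the role that Async.Run plays in 1.9.2.9; the names task1 and task2 and the arithmetic are purely illustrative.

```fsharp
// Two trivial workflows; each yields a value when it is run.
let task1 = async { return 10 + 10 }
let task2 = async { return 20 + 20 }

// Async.Parallel combines them into one workflow that yields an array
// of results, in the same order as the inputs.
let results = Async.RunSynchronously (Async.Parallel [ task1; task2 ])

printfn "%A" results   // prints [|20; 40|]
```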

For the technically minded, the identifier async refers to a builder for a computation expression. You can also dig into the F# implementation for more details on this.

You can try this example in F# Interactive from Visual Studio.

The above example is a bit misleading: asynchronous workflows are not primarily about parallelization of synchronous computations (they can be used for that, but you will probably want PLINQ and Futures). Instead they are for writing concurrent and reactive programs that perform asynchronous I/O where you don't want to block threads. Let's take a look at this in more detail.

Perhaps the most common asynchronous operation we all perform these days is fetching web pages. Normally we use a browser for this, but we can also do it programmatically. A synchronous HTTP GET is implemented in F# as follows:

#light

open System.IO
open System.Net

let SyncHttp(url:string) =
    // Create the web request object
    let req = WebRequest.Create(url)

    // Get the response, synchronously
    let rsp = req.GetResponse()

    // Grab the response stream and a reader. Clean up when we're done
    use stream = rsp.GetResponseStream()
    use reader = new StreamReader(stream)

    // Synchronous read-to-end, returning the result
    reader.ReadToEnd()

You can run this using:

SyncHttp "http://maps.google.com"

SyncHttp "http://maps.live.com"

But what if we want to read multiple web pages in parallel, i.e. asynchronously? Here is how we can do this using asynchronous workflows:

let AsyncHttp(url:string) =
    async { // Create the web request object
            let req = WebRequest.Create(url)

            // Get the response, asynchronously
            let! rsp = req.GetResponseAsync()

            // Grab the response stream and a reader. Clean up when we're done
            use stream = rsp.GetResponseStream()
            use reader = new System.IO.StreamReader(stream)

            // synchronous read-to-end
            return reader.ReadToEnd() }

[ Note: This sample requires some helper code, defined at the end of this blog post, partly because one function called BuildPrimitive didn't make it into the 1.9.2.9 release. ]

Here AsyncHttp has type:

val AsyncHttp : string -> Async<string>

This function accepts a URL and returns an Async task which, when run, will eventually generate a string for the HTML of the page we've requested. We can now get four web pages in parallel as follows:

Async.Run
    (Async.Parallel [ AsyncHttp "http://www.live.com";
                      AsyncHttp "http://www.google.com";
                      AsyncHttp "http://maps.live.com";
                      AsyncHttp "http://maps.google.com"; ])

How does this work? Let's add some print statements to take a closer look:

let AsyncHttp(url:string) =
    async { do printfn "Created web request for %s" url

            // Create the web request object
            let req = WebRequest.Create(url)

            do printfn "Getting response for %s" url

            // Get the response, asynchronously
            let! rsp = req.GetResponseAsync()

            do printfn "Reading response for %s" url

            // Grab the response stream and a reader. Clean up when we're done
            use stream = rsp.GetResponseStream()
            use reader = new System.IO.StreamReader(stream)

            // synchronous read-to-end
            return reader.ReadToEnd() }

When we run this, we now get the following output:

Created web request for http://www.live.com

Created web request for http://www.google.com

Getting response for http://www.live.com

Getting response for http://www.google.com

Created web request for http://maps.live.com

Created web request for http://maps.google.com

Getting response for http://maps.google.com

Getting response for http://maps.live.com

Reading response for http://maps.google.com

Reading response for http://www.google.com

Reading response for http://www.live.com

Reading response for http://maps.live.com

As can be seen from the above, there are multiple web requests in flight simultaneously, and indeed you may see the diagnostics output interleaved. Obviously, multiple threads of execution are being used to handle the requests. However, the key observation is that threads are not blocked during the execution of each asynchronous workflow. This means we can, in principle, have thousands of outstanding web requests: the limit is the number supported by the machine, not the number of threads used to host them.
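The point about outstanding requests can be tried without touching the network. The sketch below is an illustration under stated assumptions: it uses current F#, Async.Sleep stands in for a slow web request, and simulatedRequest and the figure of 1000 are made up for the example.

```fsharp
open System.Diagnostics

// Stand-in for an I/O-bound request: Async.Sleep suspends the workflow
// without holding a thread while it waits.
let simulatedRequest (n: int) =
    async {
        do! Async.Sleep 200
        return n }

let stopwatch = Stopwatch.StartNew()

// 1000 outstanding "requests" at once.
let replies =
    [ for i in 1 .. 1000 -> simulatedRequest i ]
    |> Async.Parallel
    |> Async.RunSynchronously

stopwatch.Stop()
printfn "%d replies in %d ms" replies.Length stopwatch.ElapsedMilliseconds
```

If each wait held a thread for its full duration, a thousand 200 ms waits would need either a thousand threads or a total of around 200 seconds. On a typical machine the run above should finish in a small fraction of that.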

In the current underlying implementation, most of these web requests will be paused in the GetResponseAsync call. The magic of F# workflows is always in the "let!" operator. In this case this should be interpreted as "run the asynchronous computation on the right and wait for its result. If necessary suspend the rest of the workflow as a callback awaiting some system event."
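One way to picture "let!" is as a call to the builder's Bind method, with the rest of the workflow passed in as a function. The sketch below, written against current F#, puts a workflow next to an approximate hand-written translation. The real compiler output involves further builder calls such as Delay, and the names sugared and byHand are illustrative.

```fsharp
// A workflow written with computation expression syntax.
let sugared =
    async {
        let! x = async { return 20 }
        return x + 1 }

// Roughly what it means: the code after let! becomes the continuation
// passed to Bind.
let byHand =
    async.Bind(async { return 20 }, fun x -> async.Return(x + 1))

let a = Async.RunSynchronously sugared
let b = Async.RunSynchronously byHand
printfn "%d %d" a b   // prints 21 21
```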

The remainder of the asynchronous workflow is suspended as an I/O completion item in the .NET thread pool waiting on an event. Thus one advantage of asynchronous workflows is that they let you combine event based systems with portions of thread-based programming.

It is illuminating to augment the diagnostics with a thread id: this can be done by changing printfn to use the following:
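A minimal sketch of such a helper, assuming current F# and .NET (Printf.ksprintf and Thread.CurrentThread.ManagedThreadId); the names tprintfn and withThreadId are hypothetical:

```fsharp
open System.Threading

// Prefix a message with the id of the thread that is executing the call.
let withThreadId (message: string) =
    sprintf "[.NET Thread %d] %s" Thread.CurrentThread.ManagedThreadId message

// Drop-in replacement for printfn: format the message as usual,
// then print it with the thread id in front.
let tprintfn fmt =
    Printf.ksprintf (fun message -> printfn "%s" (withThreadId message)) fmt

tprintfn "Getting response for %s" "http://www.live.com"
```

Replacing printfn with tprintfn in AsyncHttp then tags each diagnostic line with the thread that printed it.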

Note that the execution of the asynchronous workflow to fetch www.live.com "hopped" between different threads. This is characteristic of asynchronous workflows. As each step of the workflow completes the remainder of the workflow is executed as a callback.

Asynchronous workflows are essentially a way of writing simple continuation passing programs in a nice, linear syntax. Importantly, standard control operators such as try/finally, use, while, if/then/else and for can be used inside these workflow specifications. Furthermore, this style of writing agents matches well with functional programming: agents that are state machines can often be defined as recursive functions, with the actual information carried in each state passed as immutable data. Mutable data such as hash tables can also be used locally within a workflow as long as it is not transferred to other agents. Finally, message passing agents are particularly sweet in this style, and we'll look at those in later blog posts.
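As a small illustration of both points, the sketch below (current F#, with made-up names accumulate and run, and Async.Sleep standing in for waiting on an external event) writes a state machine as a recursive function and uses try/finally and if/then/else inside a workflow.

```fsharp
// An agent-like loop written as a recursive function: the state
// (remaining steps and running total) is passed along as immutable arguments.
let rec accumulate remaining total =
    async {
        if remaining = 0 then
            return total
        else
            // Stand-in for waiting on some external event.
            do! Async.Sleep 10
            return! accumulate (remaining - 1) (total + remaining) }

// Ordinary control constructs work inside the workflow.
let run =
    async {
        try
            let! first = accumulate 3 0    // 3 + 2 + 1
            let! second = accumulate 4 0   // 4 + 3 + 2 + 1
            if first < second then
                return first + second
            else
                return 0
        finally
            printfn "workflow finished" }

let total = Async.RunSynchronously run
printfn "%d" total   // prints 16
```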

One important topic in this kind of programming is exceptions. In reality, each asynchronous workflow runs with two continuations: one for success and one for failure. In later blog posts we'll take a look at how errors are handled and propagated by asynchronous workflows, or you can play around with the 1.9.2.9 implementation today.
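For a first taste, the sketch below shows two ways a failure can be observed in current F#: try/with inside a workflow, and Async.Catch, which turns the success and failure outcomes into a Choice value. The functions risky, guarded and describe are hypothetical, and the 1.9.2.9 API may differ in its details.

```fsharp
// A workflow that may fail part-way through.
let risky (x: int) =
    async {
        if x < 0 then failwith "negative input"
        return x * 2 }

// try/with inside a workflow handles failures raised by the steps it awaits.
let guarded x =
    async {
        try
            let! doubled = risky x
            return Some doubled
        with _ ->
            return None }

// Async.Catch surfaces the two outcomes as a Choice value instead.
let describe x =
    match Async.RunSynchronously (Async.Catch (risky x)) with
    | Choice1Of2 value -> sprintf "ok %d" value
    | Choice2Of2 ex -> sprintf "failed: %s" ex.Message

let ok = Async.RunSynchronously (guarded 21)
let bad = Async.RunSynchronously (guarded (-1))
printfn "%A %A" ok bad
printfn "%s / %s" (describe 21) (describe (-1))
```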

In summary, we've seen above that asynchronous workflows are one promising syntactic device you can use to help tame the asynchronous and reactive parts of the asynchronous/reactive/concurrent/parallel programming landscape. They can be seen as a nice, fluid F#-specific surface syntax for common compositional patterns of accessing user-level task scheduling algorithms and libraries. They are also a primary use of the monadic techniques that underpin computation expressions and LINQ. Similar techniques have been used in Haskell (see Koen Claessen's classic 1999 paper), and related work was reported by Peng Li and Steve Zdancewic at PLDI and by Chris Waterson at CUFP this year.

I'll be talking more about asynchronous workflows at TechEd Europe 2007 in Barcelona, and they are also covered in Chapter 13 of Expert F#, which is entering the final stages of production as I write.

Some examples of the underlying techniques that might be used to execute portions of asynchronous workflows now or in the future are the .NET Thread Pool (used in F# 1.9.2.9), Futures and the CCR, all of which incorporate many advanced algorithms essential to good performance and reliability in these areas. As we move ahead with the F# design in this space we will ensure that asynchronous workflows can be used effectively with all of these.

Enjoy!

----------------------

Finally, here is the extra code required for the web sample above. These functions will be included in a future release of F#.
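In current F#, a helper of this kind can be sketched with Async.FromBeginEnd, which wraps a Begin/End method pair as a workflow. This is an assumption about the general shape of the 1.9.2.9 helper rather than a copy of it. The extension name AsyncGetResponseSketch is hypothetical, chosen to avoid clashing with the Task-returning GetResponseAsync that .NET itself now defines on WebRequest.

```fsharp
type System.Net.WebRequest with
    /// Wraps BeginGetResponse/EndGetResponse as an asynchronous workflow.
    member req.AsyncGetResponseSketch() : Async<System.Net.WebResponse> =
        Async.FromBeginEnd(req.BeginGetResponse, req.EndGetResponse)

// Building the workflow does not start the request; nothing touches the
// network until the workflow is run.
let pending =
    System.Net.WebRequest.Create("http://www.live.com").AsyncGetResponseSketch()
```

With this in scope, the AsyncHttp sample would call req.AsyncGetResponseSketch() on the "let!" line. Recent .NET versions mark WebRequest as obsolete, so expect a compiler warning there.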

I was initially suspicious of Microsoft co-opting ML but I have to admit that F# looks like a great language.

I’m curious, however, about the overlap I see in .NET concurrency features. Why are asynchronous workflows preferable to Tasks for IO? Similarly, what makes Tasks more suitable for parallel computation than asynchronous workflows?

But what about a true workflow engine built on top of ‘async’? I mean, this could be really useful – concurrent handling of many business processes. But an important ingredient of business logic is the use of distributed transactions, or at least ‘normal’ database transactions. What would happen to such a transaction in an async function when it switches between threads? Would the transaction context be preserved?

The library is built upon CCR for concurrency and relies on the C# yield return for the sequential illusion, and its workflow semantics are based on the WS-BPEL standard. But the basic idea is quite close to this post.