Wed, 07 Sep 2011

In my last blog post (“Managing Concurrency in Windows Azure with Leases”), I showed how blob leases can be used as a general concurrency control mechanism. In this post, I’ll show how we can use that to build a commonly-requested pattern in Windows Azure: task scheduling.

Let’s set some requirements:

We should be able to add tasks in any order and set arbitrary times for each.

When it’s time, tasks should be processed in parallel. (If lots of tasks are ready to be completed, we should be able to scale out the work by running more instances.)

There should be no single point of failure in the system.

Each task should be processed at least once, and ideally exactly once.

Note the wording of that last requirement. In a distributed system, it’s often impossible to get “exactly once” semantics, so we need to choose between “at least once” and “no more than once.” We’ll choose the former. These semantics mean that our tasks still need to be idempotent, just like any task you use a queue for.

Representing the Schedule

Tackling requirement #1, we'll use a table to store the list of tasks and when they're due. We'll store a simple entity called a ScheduledItem:
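Here's a minimal sketch of the entity, assuming the v1 storage client's TableServiceEntity base class (the constructor shape and the Message property name are illustrative):

```csharp
public class ScheduledItem : TableServiceEntity
{
    public ScheduledItem() { } // parameterless constructor required for deserialization

    public ScheduledItem(string message, DateTime due)
        // Empty partition key keeps all tasks in one partition; the row key is the due
        // time in ticks, zero-padded to 19 digits so lexicographic order matches
        // chronological order, plus a GUID to keep keys unique when two tasks share
        // the same due time.
        : base(string.Empty, due.Ticks.ToString("d19") + "-" + Guid.NewGuid().ToString())
    {
        Message = message;
    }

    public string Message { get; set; }
}
```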

Note that we’re using a timestamp for the row key, meaning that the natural sort order of the table will put the tasks in chronological order, with the next task to complete at the top. As we’ll see, this sort order makes it easy to query for which tasks are due.

[UPDATE 10/06/2011] Thanks to an astute reader, who pointed out that duplicate times would be a problem, I appended a GUID to the row key. Also corrected a typo (“partition key” versus “row key”) in the preceding paragraph.

Managing Concurrency

We could at this point write code that polls the table, looking for work that’s ready to complete, but we have a concurrency problem. If all instances did this in a naïve way, we would end up processing tasks more than once when multiple instances saw the tasks at the same time (violating requirement #4). We could have just a single instance of the role that does the work, but that would violate requirements #2 (parallel processing) and #3 (no single point of failure).

Part of the solution is to use a queue. A queue could distribute the work in parallel to multiple worker instances. However, we’ve just pushed the concurrency problem elsewhere. We still need a way to move items from the task list onto the queue in a way that avoids duplication.

Now, we have a fairly easy concurrency problem to solve with leases. We just need to make sure that only one instance at a time looks at the schedule and moves ready tasks onto the queue. Here’s code that does exactly that:

using (var arl = new AutoRenewLease(leaseBlob))
{
    if (arl.HasLease)
    {
        // Inside here, this instance has exclusive access.
        // (tables is a CloudTableClient; q is the CloudQueue the workers consume from.)
        var ctx = tables.GetDataServiceContext();
        foreach (var item in ctx.CreateQuery<ScheduledItem>("ScheduledItems")
            .Where(s => s.PartitionKey == string.Empty
                && s.RowKey.CompareTo(DateTime.UtcNow.Ticks.ToString("d19")) <= 0)
            .AsTableServiceQuery())
        {
            q.AddMessage(new CloudQueueMessage(item.Message));
            // Potential timing issue here: if we crash between enqueueing the message
            // and removing it from the table, we'll end up processing this item again
            // next time. This is unavoidable. As always, work needs to be idempotent.
            ctx.DeleteObject(item);
            ctx.SaveChangesWithRetries();
        }
    }
} // lease is released here

We can safely run this code on all of our instances, knowing that the lease will protect us from having more than one instance doing this simultaneously.

Once the messages are on a queue, all the instances can work on the available tasks in parallel. (Nothing about this changes.)
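The consumption side can be as simple as the following loop, which every instance runs against the same queue. Here q is the CloudQueue holding the task messages, and ProcessTask is a placeholder for whatever idempotent work the task represents:

```csharp
while (true)
{
    // Hide the message for a minute while we work on it. If this instance crashes,
    // the message reappears and another instance picks it up (at-least-once delivery).
    var msg = q.GetMessage(TimeSpan.FromMinutes(1));
    if (msg != null)
    {
        ProcessTask(msg.AsString); // the idempotent work itself
        q.DeleteMessage(msg);      // only delete after the work has succeeded
    }
    else
    {
        Thread.Sleep(TimeSpan.FromSeconds(1)); // back off when the queue is empty
    }
}
```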

All Together

A web role inserting entities into the ScheduledItems table. Each entity has a string message and a time when it’s ready to be processed. These are maintained in sorted order by time.

A worker role that loops forever. One instance at a time (guarded by a lease) transfers due tasks from the table to a queue, and all instances process work from that queue.
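Putting the two halves together, the worker role's Run method might look like this sketch, where TransferDueTasks wraps the lease-guarded code above and ProcessOneMessage does one iteration of queue work (both names are placeholders, not part of any SDK):

```csharp
public override void Run()
{
    while (true)
    {
        TransferDueTasks();  // a no-op on instances that don't win the lease
        ProcessOneMessage(); // every instance consumes from the queue in parallel
        Thread.Sleep(TimeSpan.FromSeconds(1));
    }
}
```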

This scheme meets all of our original requirements:

Tasks can be added in any order, since they’re always maintained in chronological sort order.

Tasks are executed in parallel, because they’re being distributed by a queue.

There’s no single point of failure, because any worker instance can manage the schedule.

Because tasks aren’t removed from the table until they’re on a queue, every task will be processed at least once. In the case of failures, it’s possible for a message to be processed multiple times (if an instance fails between enqueuing a message and deleting it from the table, or if an instance fails between dequeuing a message and deleting it).

The mechanism presented here could certainly be extended. For example, a task can be rescheduled by performing the insert of the new entity and the delete of the old one as a single batch transaction. Recurring tasks could be supported by rescheduling a task instead of deleting it once its message has been enqueued.
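Because every entity shares the empty partition key, a reschedule can be committed atomically as an entity group transaction. A sketch, assuming oldItem was queried through the same context and ScheduledItem has a (message, time) constructor:

```csharp
var ctx = tables.GetDataServiceContext();
ctx.DeleteObject(oldItem);
ctx.AddObject("ScheduledItems",
    new ScheduledItem(oldItem.Message, DateTime.UtcNow.AddHours(1)));
// Batch sends both operations as one entity group transaction: either the old
// entity is deleted and the new one inserted, or neither change happens.
ctx.SaveChangesWithRetries(SaveChangesOptions.Batch);
```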

Feedback

The snippets of code above are really all you need to implement task scheduling, but I wonder if a reusable library would be useful for this, or if it might make sense to integrate with some existing library. Let me know on Twitter (@smarx) what you think.