If you have been reading some of the posts in my blog, you will know that sometimes I present a problem and a solution in an article which takes you through the thought process from making the first straightforward solution through refining that solution into a hopefully better code design. This is one of those posts.

Let’s talk a bit about separation of concerns. One scenario that I stumble upon fairly frequently when coding is the need for some code that should perform a task at a time based interval. This is nothing strange, and it is not very hard to achieve. But it still does present some challenges.

In order to have a simple example, let’s say that we have a TimeTeller class. It will, at a certain interval, tell us what time it is:
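The listing itself is not reproduced here, so the following is only a minimal sketch of what such a class might look like. The `TimeTold` event and the constructor signature are assumptions made for illustration, not necessarily the original design.

```csharp
using System;
using System.Threading;

// Hypothetical sketch: one class that owns both the timing and the time telling.
public class TimeTeller : IDisposable
{
    private readonly Timer timer;

    // Raised on a thread-pool thread each time the interval elapses (assumed API).
    public event Action<DateTime> TimeTold;

    public TimeTeller(TimeSpan interval)
    {
        // Timer plumbing: first tick after 'interval', then repeating.
        timer = new Timer(OnTimerTick, null, interval, interval);
    }

    private void OnTimerTick(object state)
    {
        // The only line that is actually about telling time.
        TimeTold?.Invoke(DateTime.Now);
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```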

As you can see, most of the code in the class is actually not about telling time, but rather about maintaining the interval using a Timer. In this code, telling time is more of a side effect than anything else. Also, if I wanted to test this, it would involve some sort of waiting mechanism:

This is not overly complicated, but as a test it has one flaw: it can fail for reasons that are not related to our code. If the build server happens to be under great pressure (or just very slow), perhaps the code is not executed as fast as we expect so the wait handle times out. Our code may be perfectly correct, but the test still fails. I am strongly of the opinion that tests should be designed in a way so they should fail only if there is something wrong with the code of the application, not because of anomalies in the surrounding environment.

Now my code base has two problems: the test is not optimal, and TimeTeller violates the Single Responsibility Principle.

In the Single Responsibility Principle, a responsibility is sometimes described as a reason to change. The TimeTeller class in its current design may change for more than one reason: if we want to change how it is triggered, and if we want to change how it tells time.

So, first I want to remove the triggering mechanism from the TimeTeller class. To do this, I design an ITrigger interface:

It’s extremely simple; I can register an Action that will be invoked by the trigger, and I can unregister the Action. Notice how this interface says nothing at all about how the code will be triggered. From this point of view, that is of no importance. That is somebody else’s concern. Now I can refactor the TimeTeller class to use an external trigger instead of maintaining its own triggering mechanism:
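As a rough sketch of the interface and the refactored class described above: the member names `Register` and `Unregister` and the `TimeTold` event are assumptions chosen for illustration.

```csharp
using System;

// Assumed shape: register and unregister a callback, and nothing about timing.
public interface ITrigger
{
    void Register(Action callback);
    void Unregister(Action callback);
}

// Hypothetical refactoring: TimeTeller no longer knows how or when it is invoked.
public class TimeTeller : IDisposable
{
    private readonly ITrigger trigger;

    public event Action<DateTime> TimeTold;

    public TimeTeller(ITrigger trigger)
    {
        this.trigger = trigger ?? throw new ArgumentNullException(nameof(trigger));
        this.trigger.Register(TellTime);
    }

    private void TellTime()
    {
        TimeTold?.Invoke(DateTime.Now);
    }

    public void Dispose()
    {
        // Delegates compare by target and method, so this removes the registration.
        trigger.Unregister(TellTime);
    }
}
```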

Notice how this class now has a much more focused design. There is a lot less code that is about how to do stuff. Let’s say it has a lower “how-to-what” ratio.

Great, but what about the trigger part? The code that I just ripped out of the TimeTeller class needs to go somewhere, right? Right. I need to implement some sort of interval based trigger. Already at this point I realize that I will want to implement a couple of different triggers. I want the interval trigger to use with my TimeTeller class. I will also want to make some sort of direct trigger that I can use to trigger the functionality directly, which might be very useful in unit tests in order to “fake” the interval based triggering. So, I put the most basic stuff in a base class:

This takes care of the bare necessities: it allows me to register and unregister Action callbacks with the trigger, and have them invoked. Note that by using Delegate.Combine and Delegate.Remove, I can register multiple actions with the same trigger, which might be handy. Now, let’s extend the base class to make the interval trigger:
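A sketch of how the base class and the interval trigger could fit together, assuming the `ITrigger` shape used earlier. The names `TriggerBase` and `Invoke`, and the lock around the delegate field, are illustrative assumptions.

```csharp
using System;
using System.Threading;

public interface ITrigger
{
    void Register(Action callback);
    void Unregister(Action callback);
}

// Hypothetical base class: keeps the callbacks and knows how to invoke them.
public abstract class TriggerBase : ITrigger
{
    private readonly object sync = new object();
    private Action callbacks;

    public void Register(Action callback)
    {
        lock (sync)
        {
            // Combine allows several actions to share one trigger.
            callbacks = (Action)Delegate.Combine(callbacks, callback);
        }
    }

    public void Unregister(Action callback)
    {
        lock (sync)
        {
            callbacks = (Action)Delegate.Remove(callbacks, callback);
        }
    }

    // Invokes every registered callback, if there are any.
    protected void Invoke()
    {
        Action snapshot;
        lock (sync) { snapshot = callbacks; }
        snapshot?.Invoke();
    }
}

// Hypothetical interval trigger: the timer code that used to live in TimeTeller.
public class IntervalTrigger : TriggerBase, IDisposable
{
    private readonly Timer timer;

    public IntervalTrigger(TimeSpan interval)
    {
        timer = new Timer(_ => Invoke(), null, interval, interval);
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```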

You can see how the IntervalTrigger implementation is very much like the code we removed from TimeTeller. Now when we use the TimeTeller class, we simply provide an IntervalTrigger that will take care of calling into the TimeTeller at the given interval:

Now that we have separated the concerns how to call code at a given interval and how to tell time, we also get a much nicer setup for testing. As I mentioned previously, I was planning to implement some sort of direct trigger that can be used for instance in tests:

By passing a DirectTrigger to TimeTeller, I can now easily control exactly when it is being called into, and it is done in a synchronous manner. The asynchronous test above can be rewritten into a synchronous one instead:
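To illustrate, here is a self-contained sketch of a direct trigger and a synchronous test built on it. For brevity this `DirectTrigger` implements `ITrigger` directly rather than deriving from a base class. The test method name and the `TimeTold` event are assumptions.

```csharp
using System;

public interface ITrigger
{
    void Register(Action callback);
    void Unregister(Action callback);
}

// Hypothetical direct trigger: callbacks run synchronously when Trig() is called.
public class DirectTrigger : ITrigger
{
    private Action callbacks;

    public void Register(Action callback) => callbacks += callback;
    public void Unregister(Action callback) => callbacks -= callback;
    public void Trig() => callbacks?.Invoke();
}

public class TimeTeller
{
    public event Action<DateTime> TimeTold;

    public TimeTeller(ITrigger trigger)
    {
        trigger.Register(() => TimeTold?.Invoke(DateTime.Now));
    }
}

public static class TimeTellerTests
{
    // Synchronous test: no timers, no wait handles, no timeouts.
    public static void TimeTold_IsRaised_WhenTriggerFires()
    {
        var trigger = new DirectTrigger();
        var teller = new TimeTeller(trigger);
        int calls = 0;
        teller.TimeTold += _ => calls++;

        trigger.Trig();

        if (calls != 1)
        {
            throw new InvalidOperationException("Expected exactly one TimeTold call.");
        }
    }
}
```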

This test is not time sensitive anymore, and will be very tolerant of slow build environments and such.

One nice thing about having moved the whole triggering concern out of the TimeTeller is that we can now allow the code to be triggered in new ways, without the TimeTeller even knowing about it. I used this fact when rewriting the test, for instance. But what if I suddenly got the requirement that the TimeTeller should be able to tell time at a given interval, but that it should also be possible to invoke it directly, at will? As it happens, this is now very easy to achieve. I just make a new ITrigger implementation:
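One way such an implementation could look is a composite that forwards registrations to several inner triggers. The name `CompositeTrigger` is an assumption, and a minimal `DirectTrigger` is included only to keep the sketch self-contained.

```csharp
using System;

public interface ITrigger
{
    void Register(Action callback);
    void Unregister(Action callback);
}

// Minimal direct trigger, included only so the sketch stands on its own.
public class DirectTrigger : ITrigger
{
    private Action callbacks;

    public void Register(Action callback) => callbacks += callback;
    public void Unregister(Action callback) => callbacks -= callback;
    public void Trig() => callbacks?.Invoke();
}

// Hypothetical composite: any of the wrapped triggers can fire the callback.
public class CompositeTrigger : ITrigger
{
    private readonly ITrigger[] triggers;

    public CompositeTrigger(params ITrigger[] triggers)
    {
        this.triggers = triggers ?? throw new ArgumentNullException(nameof(triggers));
    }

    public void Register(Action callback)
    {
        foreach (ITrigger trigger in triggers) trigger.Register(callback);
    }

    public void Unregister(Action callback)
    {
        foreach (ITrigger trigger in triggers) trigger.Unregister(callback);
    }
}
```

With a timer-based trigger set to 5 seconds and a `directTrigger` both passed to the composite, the TimeTeller only ever sees a single ITrigger.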

This will cause TimeTeller to be invoked once every 5 seconds, and whenever any code calls directTrigger.Trig(). In the sample project that I have linked to at the end there is also an AsyncTrigger, a flavor of the DirectTrigger that will invoke the target callbacks on a separate thread, so that the calling code is not blocked.

To sum it up, by separating concerns like this you will hopefully end up with a code design where both types and tests are more focused on their primary task. There will be more what and less how in the code.

How many times have you written the "double-check locking" code when adding stuff to a dictionary? You know, the "does the dictionary contain this key? No, ok, so we take a lock. How about now, the key is still not present? No? OK, good, then let’s add it." drill. Code-wise, it would typically look something like this:
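A typical version of that drill might look like the sketch below. The `LockedCache` wrapper and its member names are hypothetical. Note that reading a plain Dictionary outside the lock while another thread may be writing is not guaranteed to be safe, which is one more reason to prefer a concurrent collection.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper, used only to show the double-check locking pattern.
public class LockedCache<TKey, TValue>
{
    private readonly object sync = new object();
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public void AddIfMissing(TKey key, TValue value)
    {
        if (!dictionary.ContainsKey(key))          // first check, outside the lock
        {
            lock (sync)
            {
                if (!dictionary.ContainsKey(key))  // second check, inside the lock
                {
                    dictionary.Add(key, value);
                }
            }
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (sync)
        {
            return dictionary.TryGetValue(key, out value);
        }
    }
}
```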

It’s not particularly pretty, is it? The signal-to-noise ratio of that code is pretty low, so it’s nice to see that something was done in the base class library to address this not uncommon scenario: the ConcurrentDictionary. Using one of those instead, the above code could be distilled down to this:

dictionary.TryAdd(key, value);

Yes, that’s it. This will check if the key exists in the dictionary and if it doesn’t the value will be inserted and associated with that key. All done in a manner that will guarantee that the value will be inserted only once.

In many cases, you will also want to use the value that is either already present in the dictionary, or the one that you just inserted. What I used to do with the old Dictionary was to simply add a line after all the double-check locking code that would pull out the value from the dictionary:

While you can still do that with the ConcurrentDictionary (first add, then fetch the value), there is a more convenient way:

var valueToUse = dictionary.GetOrAdd(key, value);

This will check in the dictionary if the key is present. If the key was not present, the value that was provided in the second argument will be inserted. Finally the value associated with the key is returned. Again, this is done in a way that ensures that only one attempt to insert the key into the dictionary is made.

So, what’s the problem?

Now, let’s say that the second argument, the default value, is in fact a method call that will spawn some process of some kind, and that we would not want to start that task unless the key does not already exist. Well, luckily the designers of the ConcurrentDictionary thought about this, so they added an overload taking a Func that can provide the default value:

Task task = dictionary.GetOrAdd(key, k => CreateTaskForKey(k));

This means that the method CreateTaskForKey will not be executed unless the Func is called. That should work, right? Well, it might. And it might not. It is certainly not safe. You see, there is no guarantee that the Func will be executed only once for each key. The only guarantee is that each key (with an associated value) will be inserted into the dictionary only once.

It turns out that the designers have aimed for minimal locking. So the process goes roughly like this:

Take a lock, check if the key is present, release the lock

If the key was not present, call the Func producing the value

Take a lock, check if the key is present and, if it’s not, insert the key and value

This means that if two parallel calls - A and B - came in, A could see that the key is not there, and while A is busy producing a value, B checks and also finds that the key is not there. So B also moves along to produce the value. Then A will have the value ready, check and find that the key is still not present and insert the key and value. Then B also gets the value ready, and goes to check the key only to find that now it’s there. So B throws away its value and GetOrAdd will return the value inserted by A.

As you see, the state of the dictionary itself is safe; only one value is inserted for each key, but there are side effects. So I wanted to solve this so that the task would be started only if it was also inserted in the dictionary. I also found this to be an interesting case for some TDD exercise, since it offers a bit of a challenge when it comes to writing a test that would provoke the problem.

In order to test this, my thoughts went something like this:

First, I want to create a test where I can observe the behavior that the value factory method is invoked twice for the same key. The purpose of this test would be to prove the problem, and to prove that we can create the problematic state in code.

Then, I want to create another test that will expect that the value factory is invoked only once for each key. The purpose of this test is to prove the solution.

The first task was to write a test that would guarantee that this race condition occurred. In order to achieve that I would have to have some code that would call GetOrAdd with a value factory that would wait before returning its value, until it was told to continue. That way I can make two concurrent calls and wait until I know that both are inside the value factory, in parallel. When that happens, I have proven the problem. This is what I came up with:
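The original test is not shown here. A minimal sketch of the same idea uses a `Barrier` so that neither factory call can return until both callers are inside it. The class and member names are assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class GetOrAddRaceDemo
{
    // Forces two concurrent GetOrAdd calls for the same key into the factory at once.
    public static (int FactoryCalls, int Count, bool SameValue) Run()
    {
        var dictionary = new ConcurrentDictionary<int, string>();
        int factoryCalls = 0;

        using (var bothInside = new Barrier(2))
        {
            Func<int, string> factory = key =>
            {
                Interlocked.Increment(ref factoryCalls);
                // Block until the other caller has also entered the factory.
                if (!bothInside.SignalAndWait(TimeSpan.FromSeconds(10)))
                {
                    throw new TimeoutException("The second caller never entered the factory.");
                }
                return Guid.NewGuid().ToString();
            };

            // LongRunning asks for dedicated threads so both calls really run in parallel.
            Task<string> a = Task.Factory.StartNew(
                () => dictionary.GetOrAdd(1, factory), TaskCreationOptions.LongRunning);
            Task<string> b = Task.Factory.StartNew(
                () => dictionary.GetOrAdd(1, factory), TaskCreationOptions.LongRunning);
            Task.WaitAll(a, b);

            return (factoryCalls, dictionary.Count, a.Result == b.Result);
        }
    }
}
```

Both callers produce a value, yet both get back the single value that actually made it into the dictionary.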

There, now I had a reliable way of producing the problem. This code clearly points out two things:

The value factory method is invoked twice for the same key.

The dictionary contains only one value afterwards.

That was the first step. Now on to...

The Solution

One prerequisite is that I could of course not alter the behavior of the ConcurrentDictionary as such, but instead needed to focus on how to handle the side effects. I needed to find a way to allow the value factory to execute, but defer the side effects until after the value has been inserted into the dictionary. As already mentioned, only one value for each key is inserted into the dictionary, even if the value factory is called several times. Sounds like a job for Lazy<T>, right?
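To make the difference concrete, here is a small sketch with two cases, A (eager) and B (lazy). The helper names and the `Started` counter are assumptions added so the effect can be observed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LazyTaskDemo
{
    private static int started;

    // Number of times the task-producing code has run so far.
    public static int Started => Volatile.Read(ref started);

    private static Task<int> CreateAndStartTask()
    {
        Interlocked.Increment(ref started);
        return Task.Run(() => 42);
    }

    // Case A: the task is created and started immediately.
    public static Task<int> CaseA()
    {
        Task<int> task = CreateAndStartTask();
        return task;
    }

    // Case B: nothing is created or started until lazyTask.Value is first read.
    public static Lazy<Task<int>> CaseB()
    {
        var lazyTask = new Lazy<Task<int>>(() => CreateAndStartTask());
        return lazyTask;
    }
}
```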

Both cases will produce a task that is started, but in case B, the task is not produced and started until we access the Value property of lazyTask. That's nice, since it means that we can set up everything we need around producing the task, but not actually create it and start it until we say so.

This means that if we create two identical Lazy<Task> objects, but fetch the Value from just one of them, only one of the tasks will be created and started. So, by changing our ConcurrentDictionary<int, Task<int>> into a ConcurrentDictionary<int, Lazy<Task<int>>> we shield ourselves from the side effects caused by the GetOrAdd value factory being invoked several times. The only thing that happens is that we create a bunch of Lazy<Task> objects, but only one of them will find its way into the dictionary.

Since GetOrAdd will return either the already present object, or the one that is inserted as a result of the call, all calls to GetOrAdd for a given key will return the same Lazy<Task> instance.

As often when I want to solve a situation like this, I want to find a solution that will require as few changes as possible in the code. In this case I decided to make an extension method. I started from the first test, and tried to make minimal changes for a new test that would verify the behavior where the task-producing method was invoked only once:
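The extension method could be sketched roughly as follows. The name `GetOrAddTask` is hypothetical. It relies on `Lazy<T>` using its default thread-safety mode, in which the factory runs at most once per instance.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentDictionaryExtensions
{
    // Hypothetical helper: the task factory runs only for the Lazy that wins the insert.
    public static Task<TValue> GetOrAddTask<TKey, TValue>(
        this ConcurrentDictionary<TKey, Lazy<Task<TValue>>> dictionary,
        TKey key,
        Func<TKey, Task<TValue>> taskFactory)
    {
        // Several Lazy wrappers may be created under contention, which is cheap
        // and has no side effects; only one of them ends up in the dictionary.
        Lazy<Task<TValue>> lazy = dictionary.GetOrAdd(
            key,
            k => new Lazy<Task<TValue>>(() => taskFactory(k)));

        // Every caller reads Value from the same stored instance.
        return lazy.Value;
    }
}
```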

Just as the community got used to using using...

I think that the using statement in C# is a really nice invention. It’s an excellent syntactic sugar for handling IDisposables in a correct fashion. I get the impression that it took a while for the community to make it part of their daily routine to use it and now, as we are coming there, there is a shift towards asynchronous programming that may render the using block useless (or at least temporarily so, I hope to see C# 5 handling this in conjunction with the await keyword).

Recently I ran into a situation where the asynchronous approach made the using block less ideal. I still like the simplicity of it, so I started thinking about creating a pattern that would only mean a minor shift away from it; an asynchronous using block. This pattern should ideally be close enough to a regular using block so that you could make some sort of (probably regex-based) search replace approach for moving to the asynchronous version.

This blog post is also written in a way so that it follows the thought process that I went through when making the code.

Executive summary

If you don’t want to wade through the whole thought process of creating the asynchronous using approach along with tests, skip ahead to the solution.

Let’s start from the beginning

As a starting point, let’s say that you have a class that can save its contents to a Stream, and that you have some code doing that, nicely wrapped in a using block:

...and then you figure that "nah, blocking this thread for that save... on that slow server… no thanks", and as it happens there is an asynchronous version of the method returning a Task:

using (Stream stream = File.OpenWrite(@"\\slowserver\share\file.txt"))
{
    // Disclaimer: in real code you would want to get hold of that Task, observing
    // any errors on it and handle them. This is omitted from the code samples for
    // the sake of focusing on the topic at hand.
    someClass.SaveAsync(stream);
}

And now you have a bug instead. You create a Stream, pass it to the asynchronous method and then immediately dispose the stream. When the save method comes around to writing to the Stream, it is quite likely already disposed, and you will have an exception.

I imagined that rewriting the code to work with Task but without using blocks could look like this:
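A sketch of that idea: acquire the stream, start the asynchronous save, and dispose the stream in a continuation. The `saveAsync` delegate stands in for `someClass.SaveAsync`, and the helper name is an assumption.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ManualDisposal
{
    // Hypothetical helper: disposes the stream only after the save task has completed.
    public static Task SaveThenDispose(Func<Stream> openStream, Func<Stream, Task> saveAsync)
    {
        Stream stream = openStream();
        try
        {
            return saveAsync(stream).ContinueWith(task =>
            {
                stream.Dispose();
                // Surface a failed save to whoever observes the returned task.
                task.GetAwaiter().GetResult();
            });
        }
        catch
        {
            // The save could not even be started, so clean up right away.
            stream.Dispose();
            throw;
        }
    }
}
```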

Setting the stage

First I needed to figure out how to test this. The first test I wanted to write was one that used a regular using block, and that waited for the task to finish within the block. The reason was that this is a known and simple scenario where I knew the expected outcome.

This is the bare minimum that I needed for the test: a class that implements IDisposable that contains a property indicating whether it’s disposed or not, and a task using that resource. If the resource is disposed before the task executes, an exception is thrown. Finally I verify that the resource has indeed become disposed.

There, with that in place the first test passed as it should. Time for the next test. Now I wanted to create a scenario where the test would consistently fail if not waiting for the task within the using block; a test that would guarantee that the resource is disposed before the task tries to use it. This could probably be achieved by putting a Thread.Sleep inside the task, before checking the IsDisposed property, but Thread.Sleep is thin ice to walk on. It can give you a reasonable chance that Dispose is invoked while sleeping, but there is no such guarantee. So I decided to use a wait handle instead, so that I can signal things in a deterministic way. I also decided to use an event to get notified when the object was disposed.

Now I was in a position where I could verify that a regular using block would dispose my resource as expected, and also that a regular using block could dispose my resource before the task was finished, if I didn’t wait for the task inside the block. Perfect, now I had the tools to create the asynchronous using functionality.

Making it work

I realized at this point that I would need to write a few different tests to make sure things work as I wanted them to. I also realized that the setup would be quite similar for each test. Since I am not a huge fan of copy/pasting large portions of code that does the same thing, I decided to encapsulate the task creation into a method:

There, now that I had a disposable resource and an asynchronous consumer, I could write a test for an asynchronous approach to using disposable resources.

As mentioned before, there are three steps involved: acquiring the disposable resource, invoking some code that uses the resource and finally disposing the resource. Disposing the resource is done in the exact same way regardless of what resource it is. This is also reflected by the regular using block; you as a developer never supply any code for that part. The other two however must be supplied by the developer.

I imagined making an Async class (for functionality facilitating some async headaches), containing a Using method that takes two parameters; a Func that produces a disposable resource, and a Func taking such a resource as input and returning a task. Since I didn’t want to be forced to unnecessary type casting, I decided to make the method generic.

Do you see the similarity of this code construction, compared to the regular using block? First we have a statement that will produce an IDisposable object of some kind, then we have a code block where we can use that object, and in this case we call the variable holding the reference to that object disposable. We use it for whatever it’s needed for, and then we return a Task (so that we can chain ContinueWith calls on it, check the result, observe exceptions and so on).

By refactoring the code sample from above (that showed using a disposable resource with tasks, but without the help of any syntactic sugar or helper methods), I came to this implementation:
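A minimal sketch of such a method, under the assumptions stated in the text (a generic method taking a resource factory and a function that returns a Task). The parameter names are illustrative.

```csharp
using System;
using System.Threading.Tasks;

public static class Async
{
    // Acquires the resource, runs the task-returning code, and disposes the
    // resource once that task has completed, whether it succeeded or failed.
    public static Task Using<T>(Func<T> disposableAcquisition, Func<T, Task> taskFunc)
        where T : IDisposable
    {
        T disposable = disposableAcquisition();
        try
        {
            return taskFunc(disposable).ContinueWith(task =>
            {
                disposable.Dispose();
                // Rethrow any failure so the returned task reflects it.
                task.GetAwaiter().GetResult();
            }, TaskContinuationOptions.ExecuteSynchronously);
        }
        catch
        {
            // taskFunc threw before producing a task: dispose immediately.
            disposable.Dispose();
            throw;
        }
    }
}
```

A call then reads much like a regular using block, for example `Async.Using(() => File.OpenWrite(path), stream => someClass.SaveAsync(stream))`.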

…and the test passed! There it was, the first incarnation of the async using method.

My next step would be to create a version of the async using block for when I would want to execute any piece of code inside an asynchronous using block, but where the code does not necessarily return a Task object. It could be that you just want to push some work off the UI thread to keep the UI responsive, for instance.

As always, test comes first. By now I realized that the structure of the following tests would be quite similar to the first one, in terms of necessary setups and such, so next step was to move some setup and teardown work to separate methods, moving some noise away from the tests:

Now the disposable class instance was created with all necessary wiring set up automatically before each test was executed, so that code did not need to reside inside each test. I like how that makes the test code more focused on its purpose. So with that in place, I could make a simple test to make sure that I could execute code that does not involve a Task in the same sort of asynchronous manner:

Same approach as before, only this time without a Task. The code will throw an exception if the disposable object is disposed prior to the wait handle being signaled. The wait handle is signaled after the Async.Using method call returns. Finally the test verifies that the disposable resource is properly disposed, and that the resulting Task does not fail.

"Wait a minute..., didn't you just say that this test scenario did not involve any Task!?" Yes I did, and I meant it ;-) The code that executes asynchronously within the block does not use or return any Task, but the construct itself, the Async.Using method, wraps the whole thing in a new Task that is returned so that your code can wait for it, chain other operations to follow on it, observe exceptions and so on. Let's look at the implementation and it will be clearer:

As you can see, it's very similar to the one we created first, but it creates a new Task that invokes the Action<T> delegate that is passed into the method, and returns that Task object.
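The overload described above could be sketched like this. Using `Task.Run` and a try/finally inside the task is an assumption about one reasonable way to do it.

```csharp
using System;
using System.Threading.Tasks;

public static class Async
{
    // Runs the action on a new task and disposes the resource when it finishes.
    public static Task Using<T>(Func<T> disposableAcquisition, Action<T> action)
        where T : IDisposable
    {
        T disposable = disposableAcquisition();
        return Task.Run(() =>
        {
            try
            {
                action(disposable);
            }
            finally
            {
                // Runs whether the action succeeded or threw.
                disposable.Dispose();
            }
        });
    }
}
```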

At this point, the Async class started to fulfill the needs that I had, so I didn't do much more work on it, but there is of course a lot more that could be done with it.

The final Async class

Here is the "full" Async class, should you want to use it. I say "full", since it is in no way complete; there are many overloads that you could wish for, supporting Action and Func delegates with various numbers of type arguments, for instance.

You can also grab a full Visual Studio solution (containing both the class and some tests) from my BitBucket repository.

Is stubbing time overkill? If it is required for your tests to be reliable, I would say no, it’s not overkill. In this text I will discuss why and how to stub time to make your coded tests break free from the constraints of time. Wish it was as easy to do outside of the code world as well…

I use the term stub here. I usually make the distinction between mocks and stubs, in that a stub simply returns a hard-coded response, while the mock will also register the interaction as such, so that the test can verify that the mock was indeed invoked.

An important part of writing tests is to always be aware of what it is that you are actually testing. One side effect of this is that you will need to fake anything that is not currently under test. For instance, if you have code that processes data from some data store, your test should fake the data store and feed hard-coded input to the processing code. This will give your test code at least three advantages:

It does not require extra dependencies on the data layer

It makes the test reliable since you always execute it using the exact same input

The tests will take less time to run

If you run your test on input that can change over time, the test may fail for two different reasons:

there is something wrong with the code that you are testing

your test is making a false assumption on the result based on the expected input

This is bad since you want reliable tests. The test should fail only if the code under test does not behave in an expected manner, not for any other reason.

In many cases this comes rather naturally in your code, since you may introduce various abstractions anyway, typically around your data access layer and similar. However, there is one obvious dependency that many systems have, but not many create an abstraction for: time.

Consider that you have a piece of code that validates whether a certain action is valid to perform, and that the rule is that it is OK to perform this action between noon and 6 PM. How do you create a test that tests this in a reliable manner, no matter at what time of day it is executed?

As it happens, I wrote this at 5PM so the test passes as it should, but about one hour later it would fail. This is of course not a reliable test. It’s reliable in the same way as a broken watch; it will still show the correct time twice a day.

What we want here is an abstraction of time, so that we can represent any time of day, at any time of day. One way to do this is to change the IsValid method signature to accept a DateTime value as input and validate against that. However, this can create a class interface that is cluttered with parameters into methods, and that may be quite cumbersome to consume. A better approach would be to create an abstraction that can be injected into the types that need time related data:
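A sketch of what such an abstraction and its consumer might look like. The type names follow the ones mentioned in the text. The exact boundaries of the rule (noon inclusive, 6 PM exclusive) are an assumption.

```csharp
using System;

// Abstraction over "what time is it?".
public interface IDateTime
{
    DateTime Now { get; }
}

// Default provider: simply forwards to the system clock.
public class SystemDateTime : IDateTime
{
    public DateTime Now => DateTime.Now;
}

public class ActionValidator
{
    private readonly IDateTime dateTime;

    // Falls back to the system clock when no provider is injected.
    public ActionValidator() : this(new SystemDateTime())
    {
    }

    public ActionValidator(IDateTime dateTime)
    {
        this.dateTime = dateTime ?? throw new ArgumentNullException(nameof(dateTime));
    }

    // Assumed rule: valid from 12:00 up to, but not including, 18:00.
    public bool IsValid()
    {
        int hour = dateTime.Now.Hour;
        return hour >= 12 && hour < 18;
    }
}
```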

Now we can inject an IDateTime provider into the ActionValidator, and have it perform its work. If no IDateTime is explicitly injected, the ActionValidator will default to using SystemDateTime. But in order to drive our tests, let's create a fake one:

The fake provider has an extra property, NowValue. This enables us to set a hard-coded DateTime value that will be returned by the Now property. This way we can represent any time of day (and date) at any time, and write tests that repeatedly will test how our code behaves at that exact instant:
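Here is a self-contained sketch of a fake provider and a test helper driven by it. `ActionValidator` is repeated in compact form, and the boundary choice (noon inclusive, 6 PM exclusive) and the helper name `IsValidAt` are assumptions.

```csharp
using System;

public interface IDateTime
{
    DateTime Now { get; }
}

// Fake provider: Now returns whatever the test put into NowValue.
public class FakeDateTime : IDateTime
{
    public DateTime NowValue { get; set; }
    public DateTime Now => NowValue;
}

public class ActionValidator
{
    private readonly IDateTime dateTime;

    public ActionValidator(IDateTime dateTime)
    {
        this.dateTime = dateTime;
    }

    // Assumed rule: valid from 12:00 up to, but not including, 18:00.
    public bool IsValid()
    {
        int hour = dateTime.Now.Hour;
        return hour >= 12 && hour < 18;
    }
}

public static class ActionValidatorTests
{
    // Evaluates the rule at a fixed time of day, regardless of the real clock.
    public static bool IsValidAt(int hour, int minute)
    {
        var clock = new FakeDateTime { NowValue = new DateTime(2020, 1, 1, hour, minute, 0) };
        return new ActionValidator(clock).IsValid();
    }
}
```

The same assertions now give the same answer whether the test runs at 5 PM or at midnight.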

Now, while this is a very simple example, it points at something that I find quite important to think about when writing tests, and designing your code: be aware of what it is that you are actually testing. Only the code that you intend to test should be able to make the test fail. Anything else is an unwanted side effect.