Introduction

As a developer, chances are that you have looked through your code or the output from the profiler and said, "Ahh, this should be cached. That ought to fix it." And before you know it, you have your typical dictionary with a TryGetValue call cluttering up your code.

To me, that is just noise in my code, and cross-cutting concerns such as caching should be one-liners. Preferably they should be zero-liners, but that is a little bit harder to achieve, although it most certainly is possible. We will take a look at this later on.

So this article is going to demonstrate how to create a simple caching mechanism for deterministic methods.

Background

First of all, what makes a good candidate for caching? I would say that it depends on how static our data is. In this case, we are talking about deterministic methods, meaning methods that, given the same input, will always produce the same result.

Common sense tells us that these methods only need to be executed once for any given input. It's like asking the same question twice: we don't really need to, since we already know the answer. Makes sense, doesn't it? The only problem is, methods don't have memory. Yet!

Example

In order to better show how this all comes together, we need an example.
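The original listing is not reproduced here, but a minimal sketch in the spirit of the article's SampleClass might look like this (the trivial `input * 2` body stands in for an expensive operation, as the article notes later):

```csharp
using System;
using System.Collections.Generic;

public class SampleClass
{
    // Hand-rolled cache: results indexed by input.
    private readonly Dictionary<int, int> cache = new Dictionary<int, int>();

    public int Calculate(int input)
    {
        int result;
        if (!cache.TryGetValue(input, out result))
        {
            // Imagine this is a very expensive operation.
            result = input * 2;
            cache.Add(input, result);
        }
        return result;
    }
}
```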

Not very clean, but it takes care of business. That is, until the project manager comes up to you and tells you that you can forget about that raise unless you fix that "specified key already exists in the Dictionary" exception. Being an experienced developer, you know a threading bug when you see one, so maybe you go ahead and add a lock statement. That way, the cache still makes the method perform better while we also provide a decent level of thread safety.
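A sketch of what the lock-wrapped version might look like (the class name here is hypothetical, to keep it distinct from the first sketch):

```csharp
using System;
using System.Collections.Generic;

public class ThreadSafeSampleClass
{
    private readonly Dictionary<int, int> cache = new Dictionary<int, int>();
    private readonly object syncRoot = new object();

    public int Calculate(int input)
    {
        // The lock serializes access so two threads cannot Add the same key.
        lock (syncRoot)
        {
            int result;
            if (!cache.TryGetValue(input, out result))
            {
                result = input * 2; // the "expensive" part
                cache.Add(input, result);
            }
            return result;
        }
    }
}
```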

The only problem now is that most of the code is about performance and very little of it is about the method's actual intention. Noise is what it is.

Adding memory to methods

It would be nice if the Calculate method had a little memory of its own, so that it could just return the correct value whenever we provide input it has already seen. (Still, we have to imagine here that input * 2 is a very expensive operation.)

To be honest, I had never thought of this until I read this blog post by Denis Markelov.

His idea is to decorate a method in such a way that it returns a Func that points directly back to code not very different from our previous example. Actually, here is the whole class that Denis wrote:
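That class is not reproduced here; the sketch below is my own reconstruction of the shape just described, not Denis's original code. Decorated methods are stored in a dictionary keyed by the method delegate itself, and each decorated method closes over its own result cache:

```csharp
using System;
using System.Collections.Generic;

public class CacheProvider
{
    // Decorated methods, indexed by the method delegate itself.
    private readonly Dictionary<Delegate, Delegate> decoratedMethods =
        new Dictionary<Delegate, Delegate>();

    public TResult Invoke<TArg, TResult>(Func<TArg, TResult> method, TArg arg)
    {
        Delegate decorated;
        if (!decoratedMethods.TryGetValue(method, out decorated))
        {
            // Wrap the method in a closure that remembers earlier results.
            var results = new Dictionary<TArg, TResult>();
            Func<TArg, TResult> wrapper = input =>
            {
                TResult result;
                if (!results.TryGetValue(input, out result))
                {
                    result = method(input);
                    results[input] = result;
                }
                return result;
            };
            decoratedMethods.Add(method, wrapper);
            decorated = wrapper;
        }
        return ((Func<TArg, TResult>)decorated)(arg);
    }
}
```

Note that delegate equality compares the target and the method, so two delegates created from the same method group find the same cache entry.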

This is actually sort of a cache itself that keeps a list of decorated methods indexed by the method itself.

You might wonder why this class is not static. That is because we are storing function delegates, and a delegate keeps a strong reference to its target. We could get around this by implementing a WeakFunction or something in that direction. I actually tried that, but things got messy really fast, and I wanted this to be as simple as possible.

If we go back to our SampleClass, we can see this class put to good use.
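The usage listing is not shown here, but the idea is a one-liner along the lines below. The MethodCache in this sketch is a compact stand-in so that the sample compiles; the Calls counter only exists to make the caching observable:

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in: results keyed by (method, argument) pairs.
public class MethodCache
{
    private readonly Dictionary<object, object> results =
        new Dictionary<object, object>();

    public TResult Invoke<TArg, TResult>(Func<TArg, TResult> method, TArg arg)
    {
        var key = Tuple.Create((Delegate)method, (object)arg);
        object value;
        if (!results.TryGetValue(key, out value))
        {
            value = method(arg);
            results[key] = value;
        }
        return (TResult)value;
    }
}

public class SampleClass
{
    public static int Calls; // counts executions so the caching is observable

    private readonly MethodCache methodCache = new MethodCache();

    public int Calculate(int input)
    {
        // The cross-cutting concern reduced to a one-liner;
        // type inference picks up <int, int> from DoCalculate.
        return methodCache.Invoke(DoCalculate, input);
    }

    private int DoCalculate(int input)
    {
        Calls++;
        return input * 2; // pretend this is expensive
    }
}
```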

As we can see, the type inference mechanism infers the generic arguments from our target method and we are left with a pretty clean one-liner. Since the MethodCache is not static, we still need to declare it, but we use the same MethodCache regardless of the number of cached methods in the class.

Multiple arguments

If we go back to the CacheProvider class, we can observe that it stores the return value keyed by the input value. Quite simple, but this also keeps us from having more than one method argument. I like to keep the number of arguments in my methods as low as possible, but I occasionally create one or two methods that break my principle :)

Anyhow, since we are now faced with two input values and only one possible key for the dictionary, we need to somehow make them appear as one. For this, we can use the Tuple class introduced in .NET 4.0. It is pretty straightforward and has served me well on a number of occasions. The CacheProvider's Invoke method now looks like this for the method accepting two arguments:
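The overload itself is not reproduced here; the sketch below is my reconstruction of how the Tuple key could work, not the article's exact code:

```csharp
using System;
using System.Collections.Generic;

public class TwoArgCache
{
    private readonly Dictionary<Delegate, Delegate> methods =
        new Dictionary<Delegate, Delegate>();

    public TResult Invoke<T1, T2, TResult>(
        Func<T1, T2, TResult> method, T1 arg1, T2 arg2)
    {
        Delegate cached;
        if (!methods.TryGetValue(method, out cached))
        {
            // Tuple implements structural equality, so the two
            // arguments combined make a perfectly good dictionary key.
            var results = new Dictionary<Tuple<T1, T2>, TResult>();
            Func<T1, T2, TResult> wrapper = (a, b) =>
            {
                var key = Tuple.Create(a, b);
                TResult result;
                if (!results.TryGetValue(key, out result))
                {
                    result = method(a, b);
                    results[key] = result;
                }
                return result;
            };
            methods.Add(method, wrapper);
            cached = wrapper;
        }
        return ((Func<T1, T2, TResult>)cached)(arg1, arg2);
    }
}
```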

We draw the line here at a maximum of four arguments. Feel free to extend this, although it might be wise to consider whether more than four arguments are really needed for any method.

To use this, you just need to copy the CacheProvider and MethodCache classes into your own project and you are good to go.

You might want to stop reading here, or you could continue as we shift gears into some really cool stuff.

Transparent caching

Previously, I mentioned how much I would like to implement caching with as little effort as possible. We are now down to a one-liner, but would it be possible to cache-enable our methods without writing any code at all? And what if we wanted to add caching to a library for which we don't have the source code? Is this actually possible?

Let us step back for a moment and take a look at what we are actually doing to enable caching for any given method. This is what we have in our SampleClass.

Let's see what we have done here. First, we have created a private property that gives us access to the MethodCache itself. Next, we have extracted the code from the Calculate method into the DoCalculate method. Finally, we have rewritten the Calculate method so that it now invokes the MethodCache.

Note that the first step is only needed once regardless of the number of cached methods.

Sure, we all have those fancy refactoring tools that make these kinds of operations a breeze, but it is still a simple and repetitive task. Maybe we could delegate all this tedious method extraction and rewriting to someone else. And before you go, "Ah, didn't we hire some greenhorn fresh out of school the other day?", I can tell you that is not who I'm talking about.

Mono.Cecil

There are essentially just two tools that let you emit IL (Intermediate Language) instructions into an assembly. The first one is Reflection.Emit and the second one is Mono.Cecil.

So why choose one over the other? The answer to that is really quite simple.

Reflection.Emit can only be used to create new assemblies, while Mono.Cecil has the capability to modify an existing assembly and save it back to disk (or load the modified assembly directly into the app domain).

Since we are dealing with an existing assembly here, we need to use Mono.Cecil to accomplish the task. The process of modifying the IL of an assembly is also known as assembly weaving.

The AssemblyWeaver

The AssemblyWeaver class is a simple class that loads an assembly, weaves the target types, and saves the modified assembly back to disk.

The first parameter is an implementation of the ITypeWeaver interface that does the actual weaving of each type. The second parameter is an implementation of the ITypeSelector interface that is responsible for selecting which types to send to the ITypeWeaver. You might recognize this style of coding as Dependency Injection. There are default implementations of both interfaces available and we will take a look at both of them.

The ITypeSelector

The ITypeSelector interface represents a class that is responsible for selecting the types that should be weaved. The interface itself is quite simple.

The IMethodSelector implementation is responsible for selecting the methods that are eligible for weaving.

There is a default implementation of this interface as well, called the AttributeMethodSelector class, which works in almost the same way as the AttributeTypeSelector class, apart from the fact that we are now looking for the CachedAttribute class.

Note that the AttributeTypeSelector could be changed to look at all types that contain at least one method decorated with the CachedAttribute, but that might be a time-consuming operation, which is actually the reason for the EnableMethodCachingAttribute in the first place.

So what does this TypeWeaver actually do? Well, it should do exactly the same thing as the greenhorn now sitting next to you asking for network credentials.

Create a property that represents the MethodCache instance

Extract the code in the target method into a new private method

Rewrite the target method so that it calls the MethodCache.Invoke method

Now, how hard could it be? Surprisingly, it is not that difficult; however, working with Cecil can be challenging at times. Mono.Cecil is probably one of the most powerful libraries available on the .NET platform, but on the other hand, I will put this kindly and say that its documentation is scarce.

When trying to figure out how to emit code, it is best to implement what you need in plain C# and then investigate the IL using ildasm.exe. Personally, I use Reflector or LinqPad to do this, but ildasm will do just fine.

The TemplateClass is a class written solely so that we have something to reflect over to get the IL. There is not much code here, and that is because standard CLR properties are just wrappers around getter and setter methods, so we need to go to the generated get_ method to get the actual implementation.
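The listing is not included here; a hypothetical TemplateClass of the kind described, with a lazily initialized property whose compiled getter is what we would inspect in ildasm, could look like this (the Dictionary-typed property stands in for the real MethodCache-typed one):

```csharp
using System;
using System.Collections.Generic;

public class TemplateClass
{
    private Dictionary<object, object> methodCache;

    // The compiler turns this getter into a get_MethodCache method;
    // that method's IL is what we would inspect with ildasm and
    // later emit with Cecil.
    private Dictionary<object, object> MethodCache
    {
        get
        {
            if (methodCache == null)
            {
                methodCache = new Dictionary<object, object>();
            }
            return methodCache;
        }
    }
}
```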

That's more like it. This is exactly the code that we need to emit in order to have a property that returns a MethodCache instance. You can see that there is a lot of IL just to express something that seemed quite simple in C#. As a matter of fact, it can be optimized, for instance by removing some of the stloc and ldloc instructions that the compiler seems to throw in without any real purpose.

But be careful: if in doubt, emit exactly what ildasm outputs, or else you run the risk that the evaluation stack becomes imbalanced, and that is definitely not a good thing.

I am not going to elaborate on the details of how to emit this code using Cecil. It's pretty straightforward and you will find all the IL rewriting code inside the TypeWeaver class.

PostWeaving

PostWeaving is a term used for weaving assemblies just after they have been compiled. So we need something that we can execute from an MSBuild file. Let us create a simple console application (PostWeaver.exe) to serve this purpose.
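The PostWeaver source is not shown here; a sketch of what its entry-point logic might look like follows. The hand-off to the AssemblyWeaver is left as a comment because that class is not reproduced in this sketch, and the method name Run is my own (Main would simply return Run(args)):

```csharp
using System;

public static class PostWeaverProgram
{
    public static int Run(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: PostWeaver <assemblyPath>");
            return 1;
        }
        // A real implementation would hand off to the weaver here, e.g.:
        // new AssemblyWeaver().Weave(args[0], typeWeaver, typeSelector);
        Console.WriteLine("Weaving " + args[0]);
        return 0;
    }
}
```

It could then be wired into the build with an MSBuild Exec task, for example `<Exec Command="PostWeaver.exe $(TargetPath)" />` inside an AfterBuild target.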

Comments and Discussions

I'm delighted to have found this article and the comments/remarks, to make this functionality part of a web API I have been building. But due to the syntax complexity (a bit beyond my hobbyist knowledge level), I'm a bit lost as to how I can best incorporate a caching timespan.

The neatest way (I think) would be specifying an attribute on the function which triggers caching and sets the caching timespan. But… that's beyond my expertise.

Anyone know how to do this? I would really like to see this becoming a standard library.

Best regards,
Rémy Samulski

EDIT: Couldn't let go of this problem, so I made this piece of code to implement a caching lifetime/timespan:
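Rémy's code is not included here; the sketch below shows one way a lifetime could be added to cached entries (the class name and the Tuple-based entry layout are my own, not his):

```csharp
using System;
using System.Collections.Generic;

public class TimedCache<TKey, TValue>
{
    // Each entry remembers the value and the time it was created.
    private readonly Dictionary<TKey, Tuple<TValue, DateTime>> entries =
        new Dictionary<TKey, Tuple<TValue, DateTime>>();
    private readonly TimeSpan lifetime;

    public TimedCache(TimeSpan lifetime)
    {
        this.lifetime = lifetime;
    }

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)
    {
        Tuple<TValue, DateTime> entry;
        // Recompute when the key is missing or the entry has expired.
        if (!entries.TryGetValue(key, out entry) ||
            DateTime.UtcNow - entry.Item2 > lifetime)
        {
            entry = Tuple.Create(valueFactory(key), DateTime.UtcNow);
            entries[key] = entry;
        }
        return entry.Item1;
    }
}
```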

I'm very impressed by the simplicity of your example, but I have to inform you that the ConcurrentDictionary version of your method cache has a flaw that prevents it from achieving any calculation time improvements at all:

You are correct: the code in the article picks the wrong GetOrAdd overload. If you look at the actual code, you will notice that the function represents the value factory, so it should work as advertised.
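To make the distinction concrete, here is a small sketch contrasting the two ConcurrentDictionary.GetOrAdd overloads (the counters only exist to make the difference observable):

```csharp
using System;
using System.Collections.Concurrent;

public static class GetOrAddDemo
{
    public static int FlawedCalls;
    public static int CorrectCalls;

    private static int Expensive(int input, ref int counter)
    {
        counter++;
        return input * 2;
    }

    public static int Flawed(ConcurrentDictionary<int, int> cache, int input)
    {
        // Wrong overload: GetOrAdd(key, value). The "expensive" call is
        // evaluated eagerly on every invocation, cached or not.
        return cache.GetOrAdd(input, Expensive(input, ref FlawedCalls));
    }

    public static int Correct(ConcurrentDictionary<int, int> cache, int input)
    {
        // Right overload: GetOrAdd(key, valueFactory). The factory only
        // runs when the key is missing.
        return cache.GetOrAdd(input, key => Expensive(key, ref CorrectCalls));
    }
}
```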

Been (sort of) looking for some real world applications of closures (in C#) that are not necessarily math related, and where it really makes sense to use a closure. This is one of the better examples. You should do an article with more examples (if you've got any cool ones).

I really liked your implementation. I actually have two "helper classes" to create my caches. Both receive the delegate to be called in the constructor (so I don't keep generating new delegates), but one is weak (and also thread-safe) and the other is only thread-safe... in fact, I have never needed non-thread-safe caches, but it is very simple to create one... the classes can be reduced to:
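The classes themselves are not included in the comment as reproduced here; this is a sketch of what the thread-safe variant might look like, given the description that follows (the class and method names are hypothetical):

```csharp
using System;
using System.Collections.Concurrent;

public sealed class ThreadSafeCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, TValue> values =
        new ConcurrentDictionary<TKey, TValue>();
    private readonly Func<TKey, TValue> createValue;

    // The create delegate is handed over once, at construction time.
    public ThreadSafeCache(Func<TKey, TValue> createValue)
    {
        if (createValue == null) throw new ArgumentNullException("createValue");
        this.createValue = createValue;
    }

    public TValue Get(TKey key)
    {
        // Reuses the factory stored at construction time.
        return values.GetOrAdd(key, createValue);
    }
}
```

The weak variant would presumably replace the ConcurrentDictionary with a ConditionalWeakTable, which is where the GetOrCreateValue name mentioned below comes from.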

A constructor that receives the create delegate (it actually always takes one parameter... if more parameters are needed, it is the caller who needs to create the KeyValuePair/Tuple).

A Get method that internally calls GetOrCreateValue.

I could make these two classes into one by using the dictionaries through their interface, but as they are really small, I decided to have two dedicated versions.