Probing a Hidden .NET Runtime Performance Enhancement

Jomo Fisher--Matt Warren once told me that the runtime has a performance optimization for calling methods through an interface. If only a small number of implementations of a particular interface method are in play, the runtime can reduce the overhead of those calls. Coming from the C++ world, where a vtable is a vtable, this seemed a little odd to me. I finally got around to trying it out myself, and he was right. Here's the code so you can try it for yourself:
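A minimal sketch of such a test follows. It is not the original listing. The names ICall, Call1, Call2, Call3, Do and DoManyTimes and the loop count are borrowed from the discussion below. The Count fields, the Time helper and the order of the runs are assumptions made for illustration.

```csharp
using System;
using System.Diagnostics;

// Names borrowed from the discussion; the Count fields are an assumption
// added so that each call does a little observable work.
interface ICall { void Do(); }

sealed class Call1 : ICall { public int Count; public void Do() { Count++; } }
sealed class Call2 : ICall { public int Count; public void Do() { Count++; } }
sealed class Call3 : ICall { public int Count; public void Do() { Count++; } }

static class InterfaceDispatchProbe
{
    // Every ic.Do() below goes through this one interface call site.
    public static void DoManyTimes(ICall ic, int iterations)
    {
        for (int i = 0; i < iterations; ++i)
            ic.Do();
    }

    // Hypothetical helper: wall-clock time for one run of DoManyTimes.
    public static TimeSpan Time(ICall ic, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        DoManyTimes(ic, iterations);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int N = 100000000;

        // Run Call1 first, then introduce further implementations, then
        // run Call1 again to see whether its timing has changed.
        Console.WriteLine("Call1 (first implementation seen): " + Time(new Call1(), N));
        Console.WriteLine("Call2: " + Time(new Call2(), N));
        Console.WriteLine("Call3: " + Time(new Call3(), N));
        Console.WriteLine("Call1 again: " + Time(new Call1(), N));
    }
}
```

Timings will vary with the runtime version and build settings, so a release build run outside the debugger is the fairer comparison.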

Jacob, I don’t believe it’s internal only; it’s not a static thing. Rather, as the second or third implementation is encountered, the runtime falls back to more and more general-purpose strategies. I’m only guessing here.

Desco, your technique is very interesting. Your timings are equivalent to what I see when calling through class methods instead of interface methods. I also notice that it doesn’t make any difference how many implementations of ICall are passed through. I would say this is a good tool for any C# dev’s toolbox.

I dug a little more into your technique. It’s not exactly a free lunch. You need to know the concrete type when calling DoManyTimes<T>. If you store your Call1, Call2, Call3 instances in ICall-typed variables, you lose the perf. For example, the following runs at normal (slow) speed:
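A sketch of the kind of call meant here, not the original listing. It assumes Call1 is a struct implementing ICall and that DoManyTimes<T> carries a where T : ICall constraint. The Calls counter and the TypeArgumentFor helper are hypothetical additions that make the inferred type argument visible.

```csharp
using System;

interface ICall { void Do(); }

// Assumed to be a struct, as in the generics/struct technique under discussion.
struct Call1 : ICall
{
    public void Do() { GenericDispatchSketch.Calls++; }
}

static class GenericDispatchSketch
{
    // Hypothetical counter so the calls have an observable effect.
    public static int Calls;

    public static void DoManyTimes<T>(T ic, int iterations) where T : ICall
    {
        for (int i = 0; i < iterations; ++i)
            ic.Do();
    }

    // Hypothetical helper: reports which T the compiler inferred.
    public static string TypeArgumentFor<T>(T ic) where T : ICall
    {
        return typeof(T).Name;
    }

    public static void Main()
    {
        Call1 concrete = new Call1();
        ICall viaInterface = new Call1();   // boxes the struct

        DoManyTimes(concrete, 1000);        // T is Call1: specialized instantiation
        DoManyTimes(viaInterface, 1000);    // T is ICall: ordinary interface dispatch

        Console.WriteLine(TypeArgumentFor(concrete));      // prints "Call1"
        Console.WriteLine(TypeArgumentFor(viaInterface));  // prints "ICall"
    }
}
```

The type argument is inferred from the static type of the variable, not from the object it holds, which is why the ICall-typed variable ends up on the slow path.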

The generics/struct code performs well because struct-instantiations of generic methods are specialized at runtime. In other words, the CLR will actually create three methods, as though you had typed

    static void DoManyTimesCall1(Call1 ic) {
        for (int i = 0; i < 100000000; ++i)
            ic.Do();
    }

and similarly for DoManyTimesCall2 and DoManyTimesCall3. Possibly the JIT will even inline the call to Do inside the loop.

For ordinary interface dispatch on objects, performance depends on the dynamic pattern of calls, as you have seen. The runtime is tuned to respond well to a sequence of calls all on the same class of object. It would be interesting to test code in which a single interface dispatch site was fed a sequence of objects of differing types (e.g. cycling through classes Call1, Call2, and Call3).
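One way the suggested experiment could be set up is sketched below. It assumes class (not struct) implementations of ICall. The Hits counters, the DoEach and Time helpers and the array-based cycling are hypothetical choices, and no timings are claimed here.

```csharp
using System;
using System.Diagnostics;

interface ICall { void Do(); }

// Static Hits counters are an assumption, added so the calls are observable.
sealed class Call1 : ICall { public static int Hits; public void Do() { Hits++; } }
sealed class Call2 : ICall { public static int Hits; public void Do() { Hits++; } }
sealed class Call3 : ICall { public static int Hits; public void Do() { Hits++; } }

static class MixedDispatchProbe
{
    // A single interface dispatch site, fed whatever the array cycles through.
    public static void DoEach(ICall[] calls, int iterations)
    {
        for (int i = 0; i < iterations; ++i)
            calls[i % calls.Length].Do();
    }

    public static TimeSpan Time(ICall[] calls, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        DoEach(calls, iterations);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int N = 100000000;
        Call1 only = new Call1();

        // Same loop and same indexing cost in both runs; only the mix of
        // receiver types seen by the call site differs.
        ICall[] sameType = { only, only, only };
        ICall[] mixedTypes = { new Call1(), new Call2(), new Call3() };

        Console.WriteLine("one type:    " + Time(sameType, N));
        Console.WriteLine("three types: " + Time(mixedTypes, N));
    }
}
```

Using the same three-element array shape for both runs keeps the indexing overhead identical, so any difference between the two timings comes from the dispatch itself.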