
Monthly Archives: June 2008

Since my last report on generic code sharing I chased down a few bugs we uncovered when trying out IronPython 2.0. That new version uses the Microsoft Dynamic Language Runtime, which extensively utilizes generics. One issue we came across was how to figure out the actual method for a delegate when only the native code pointer (acquired with ldftn) and no target class is given. For example:

public class Gen<T> {
    public void work () { ... }
}

With generic code sharing the methods Gen<string>.work and Gen<object>.work share the same native code, so given only a pointer to that code, it’s not possible to tell the two apart. One way to make them distinguishable would be to let ldftn produce a pointer not to the method directly but to a small piece of trampoline code, with one trampoline for each instantiation of the method. Fortunately it seems we don’t have to bother with that, since the .NET CLR doesn’t either. Instead it gives you the instantiation of the method where all type arguments are object, so we do the same.
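The fallback to the all-object instantiation can be modelled in a few lines of C#. This is only a toy sketch using reflection, not how Mono or the CLR store this information: the `SharedCodeTable` class, its `Register` and `Resolve` methods, and the use of a plain number as a "code address" are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class Gen<T>
{
    public void work() { }
}

// Toy model, not Mono's real data structures: one native code address
// is recorded once, however many instantiations share it.
public static class SharedCodeTable
{
    private static readonly Dictionary<long, KeyValuePair<Type, string>> methodsByAddress =
        new Dictionary<long, KeyValuePair<Type, string>>();

    // Record that the shared code at codeAddress belongs to the named
    // method of an open generic type such as typeof(Gen<>).
    public static void Register(long codeAddress, Type openType, string methodName)
    {
        methodsByAddress[codeAddress] = new KeyValuePair<Type, string>(openType, methodName);
    }

    // Given only the address, the caller's instantiation is unknown, so
    // pick the instantiation whose type arguments are all object.
    public static MethodInfo Resolve(long codeAddress)
    {
        KeyValuePair<Type, string> entry = methodsByAddress[codeAddress];
        Type[] args = new Type[entry.Key.GetGenericArguments().Length];
        for (int i = 0; i < args.Length; i++)
            args[i] = typeof(object);
        return entry.Key.MakeGenericType(args).GetMethod(entry.Value);
    }
}
```

After registering `typeof(Gen<>)` and `"work"` under some address, `Resolve` returns the `work` method of `Gen<object>`, whichever instantiation the pointer originally came from.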

Another thing I did was implement sharing of methods of generic value types. There doesn’t seem to be much code out there which uses generic value types extensively, but it wasn’t a big deal to implement so I went ahead and did it. Since instances of value types don’t contain VTable pointers we need to pass the runtime generic context (RGCTX) explicitly for all methods, like we do for static methods of reference types. One complication arises when the value type implements an interface. When such a value type is cast to the interface type it gets boxed and receives a VTable for the interface methods. Since the caller of those methods doesn’t know it’s dealing with a value type, much less which particular one, it cannot pass the RGCTX, so the methods in the interface VTable need a wrapper which will pass it. This is very similar to the wrapper we use when taking the address of a static method of a reference type (for constructing a delegate, for example).
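The wrapper idea can be sketched in C# with a closure. This is a simplified model under stated assumptions, not the runtime's actual mechanism: `Rgctx`, `DescribeShared` and `MakeInterfaceWrapper` are hypothetical names, and the context is reduced to a list of type arguments.

```csharp
using System;

// Stand-in for the runtime generic context: here just the type arguments.
public sealed class Rgctx
{
    public readonly Type[] TypeArguments;

    public Rgctx(params Type[] typeArguments)
    {
        TypeArguments = typeArguments;
    }
}

public static class SharedValueTypeCode
{
    // Stand-in for a shared method body of a generic value type Cell<T>.
    // It cannot find T through `this`, so the context is an explicit
    // extra argument.
    public static string DescribeShared(object boxedThis, Rgctx rgctx)
    {
        return "Cell<" + rgctx.TypeArguments[0].Name + "> holding " + boxedThis;
    }

    // Stand-in for the wrapper placed in the interface VTable: the
    // interface caller supplies only `this`, the wrapper adds the context.
    public static Func<object, string> MakeInterfaceWrapper(Rgctx rgctx)
    {
        return boxedThis => DescribeShared(boxedThis, rgctx);
    }
}
```

A caller that only holds the wrapper invokes it with the boxed value alone, just as an interface call would, and the shared body still receives the context it needs.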

I’ll end with an updated table of memory statistics for a few test applications. “Nemerle” is the Nemerle compiler compiling itself. “IronPython 2.0” is running pystone. “F# 1.9” is running a simple “Hello world” program on the command line and “F# 2.0” is compiling a simple program.

If we want to share this method between different instantiations, i.e. different type values of T, we need to provide a place for the code to look up the type of Dictionary<S,T>. This place cannot be the runtime generic context, because the data in there only depends on the type arguments of the class, i.e. S, but not on those of generic methods.
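For illustration, a method of the kind discussed here might look like the following sketch. The names `Registry` and `NewTable` are made up; the only point is that S belongs to the class, T belongs to the method, and the body needs a type built from both.

```csharp
using System.Collections.Generic;

// Hypothetical example: S is a type argument of the class,
// T is a type argument of the method.
public class Registry<S>
{
    public Dictionary<S, T> NewTable<T>()
    {
        // The type Dictionary<S,T> depends on both S and T, so shared
        // code has to look it up at run time.
        return new Dictionary<S, T>();
    }
}
```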

Our solution is to introduce a data structure very similar to the runtime generic context, called the method runtime generic context, or MRGCTX. It is associated not with generic classes and their type arguments, like the RGCTX, but with generic methods and their type arguments. We use the same MRGCTX for generic methods of a specific class if the method type arguments are the same. As an example, these methods would all share the same MRGCTX:

The MRGCTX contains, apart from the RGCTX-like slots, two items of data: A pointer to the vtable of the method’s class, and the values of the method’s type arguments. The first one is needed to get to the class’s RGCTX if no this argument is passed, i.e. in static generic methods. The type arguments are needed to instantiate new slots in the MRGCTX – without knowing what the value of T is, for example, we cannot look up the type Dictionary<S,T>.
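As a rough model of that layout, the sketch below keeps the two data items next to a lazily filled slot table. It is an assumption-laden toy, not Mono's implementation: `MethodContext` and `Lookup` are hypothetical names, a `System.Type` stands in for the vtable pointer, and a slot is assumed to be an open generic type that takes the class's type arguments followed by the method's.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a method runtime generic context (MRGCTX).
public sealed class MethodContext
{
    // Stand-in for the pointer to the vtable of the method's class.
    public readonly Type ClassVTable;

    // Values of the method's type arguments.
    public readonly Type[] MethodTypeArguments;

    // RGCTX-like slots, filled on first use.
    private readonly Dictionary<Type, Type> slots = new Dictionary<Type, Type>();

    public MethodContext(Type classVTable, Type[] methodTypeArguments)
    {
        ClassVTable = classVTable;
        MethodTypeArguments = methodTypeArguments;
    }

    // Instantiate a slot for an open generic type such as Dictionary<,>.
    // Both the class's and the method's type arguments are needed.
    public Type Lookup(Type openType)
    {
        if (!slots.TryGetValue(openType, out Type closed))
        {
            Type[] classArgs = ClassVTable.GetGenericArguments();
            Type[] all = new Type[classArgs.Length + MethodTypeArguments.Length];
            classArgs.CopyTo(all, 0);
            MethodTypeArguments.CopyTo(all, classArgs.Length);
            closed = openType.MakeGenericType(all);
            slots[openType] = closed;
        }
        return closed;
    }
}
```

With a class instantiated over string and a method type argument of int, looking up `typeof(Dictionary<,>)` yields `Dictionary<string,int>`. Without the stored method type arguments the lookup could not be completed.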

So how much memory do we save with shared generic methods? In my previous post on sharing generic code I presented a table with the memory savings for my three large test applications. Here it is again, updated with data for sharing generic methods:

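| | No sharing: methods compiled | No sharing: code produced | Sharing: methods compiled | Sharing: code produced | Sharing w/gen methods: methods compiled | Sharing w/gen methods: code produced | Memory for (M)RGCTXs | Savings |
|---|---|---|---|---|---|---|---|---|
| IronPython | 3614 | 719k | 3368 | 691k | 3324 | 687k | 7k | 25k |
| Nemerle | 7210 | 2001k | 6302 | 1943k | 6150 | 1891k | 34k | 76k |
| F# | 15529 | 2193k | 11431 | 2062k | 9823 | 1652k | 154k | 387k |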

Note that this time I’m also counting all the memory used by (M)RGCTXs and the (M)RGCTX templates, which I didn’t do last time.
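The figures in the table are consistent with the savings being the code produced without sharing, minus the code produced when sharing with generic methods, minus the memory for (M)RGCTXs. That formula is inferred from the numbers rather than stated in the post; the helper below (with the made-up name `SavingsCheck`) simply spells out the arithmetic.

```csharp
public static class SavingsCheck
{
    // All sizes are in kilobytes, as in the table.
    public static int Savings(int codeNoSharing, int codeSharingWithGenericMethods, int contextMemory)
    {
        return codeNoSharing - codeSharingWithGenericMethods - contextMemory;
    }
}
```

For IronPython this gives 719 - 687 - 7 = 25, for Nemerle 2001 - 1891 - 34 = 76, and for F# 2193 - 1652 - 154 = 387, matching the last column.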