Does anyone know if it is possible to define the equivalent of a "Java custom class loader" in .NET?

To give a little background:

I am in the process of developing a new programming language that targets the CLR, called "Liberty". One of the features of the language is its ability to define "type constructors", which are methods that are executed by the compiler at compile time and generate types as output. They are sort of a generalization of generics (the language does have normal generics in it), and allow code like this to be written (in "Liberty" syntax):

In this particular example, the type constructor "tuple" provides something similar to anonymous types in VB and C#.

However, unlike anonymous types, "tuples" have names and can be used inside public method signatures.

This means that I need a way for the type that eventually ends up being emitted by the compiler to be shareable across multiple assemblies. For example, I want tuple<x as int> defined in Assembly A to end up being the same type as tuple<x as int> defined in Assembly B.

The problem with this, of course, is that Assembly A and Assembly B are going to be compiled at different times, which means they would both end up emitting their own incompatible versions of the tuple type.

I looked into using some sort of "type erasure" to do this, so that I would have a shared library with a bunch of types like this (this is "Liberty" syntax):

and then just redirect access from the i, j, and k tuple fields to "Field1", "Field2", and "Field3".

However, that is not really a viable option. It would mean that at compile time tuple<x as int> and tuple<y as int> would be different types, while at runtime they would be treated as the same type. That would cause many problems for things like equality and type identity. It is too leaky an abstraction for my tastes.
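To make the identity problem concrete, here is a minimal C# sketch of what erasure would produce. `ErasedTuple<T1>` and `ErasureDemo` are hypothetical names chosen for illustration; they are not part of any real shared library.

```csharp
using System;

// Hypothetical shared-library type standing in for every one-field tuple.
public sealed class ErasedTuple<T1>
{
    public T1 Field1;
}

public static class ErasureDemo
{
    // To the compiler these would be tuple<x as int> and tuple<y as int>,
    // two distinct types. After erasure both become ErasedTuple<int>.
    public static bool RuntimeTypesAreIdentical()
    {
        object tupleX = new ErasedTuple<int> { Field1 = 1 }; // "x" redirected to Field1
        object tupleY = new ErasedTuple<int> { Field1 = 2 }; // "y" redirected to Field1

        // The field names are gone, so the CLR sees a single type.
        return tupleX.GetType() == tupleY.GetType();
    }
}
```

Any runtime type test, cast, or overload resolved through reflection would therefore be unable to tell the two tuples apart, even though the compiler treated them as unrelated.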

Other possible options would be to use "state bag objects". However, using a state bag would defeat the whole purpose of having support for "type constructors" in the language. The idea there is to enable "custom language extensions" to generate new types at compile time that the compiler can do static type checking with.

In Java, this could be done using custom class loaders. Basically the code that uses tuple types could be emitted without actually defining the type on disk. A custom "class loader" could then be defined that would dynamically generate the tuple type at runtime. That would allow static type checking inside the compiler, and would unify the tuple types across compilation boundaries.

Unfortunately, however, the CLR does not provide support for custom class loading. All loading in the CLR is done at the assembly level. It would be possible to define a separate assembly for each "constructed type", but that would very quickly lead to performance problems (having many assemblies with only one type in them would use too many resources).

So, what I want to know is:

Is it possible to simulate something like Java class loaders in .NET, where I can emit a reference to a non-existent type and then dynamically generate that type at runtime, before the code that needs to use it runs?

NOTE:

*I actually already know the answer to the question, which I provide as an answer below. However, it took me about 3 days of research and quite a bit of IL hacking to come up with a solution, so I figured it would be a good idea to document it here in case anyone else runs into the same problem.*

2 Answers

The System.Reflection.Emit namespace defines types that allow assemblies to be generated dynamically. They also allow the generated assemblies to be defined incrementally. In other words, it is possible to add types to the dynamic assembly, execute the generated code, and then later add more types to the assembly.

The System.AppDomain class also defines an AssemblyResolve event that fires whenever the framework fails to load an assembly. By adding a handler for that event, it is possible to define a single "runtime" assembly into which all "constructed" types are placed. The code generated by the compiler that uses a constructed type would refer to a type in the runtime assembly. Because the runtime assembly doesn't actually exist on disk, the AssemblyResolve event would be fired the first time the compiled code tried to access a constructed type. The handler for the event would then generate the dynamic assembly and return it to the CLR.
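A minimal sketch of such a handler is shown here, as an illustration only. The class name `RuntimeAssemblyResolver` and its members are assumptions; only the "$Runtime" assembly name comes from this answer. It uses the static `AssemblyBuilder.DefineDynamicAssembly` overload; on older .NET Framework versions the equivalent call is `AppDomain.CurrentDomain.DefineDynamicAssembly`.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class RuntimeAssemblyResolver
{
    // "$Runtime" is the runtime assembly name used in this answer.
    private const string RuntimeAssemblyName = "$Runtime";

    private static readonly object Gate = new object();
    private static AssemblyBuilder runtimeAssembly;

    // Call once, before any code that touches a constructed type runs.
    public static void Install()
    {
        AppDomain.CurrentDomain.AssemblyResolve += OnAssemblyResolve;
    }

    private static Assembly OnAssemblyResolve(object sender, ResolveEventArgs args)
    {
        // args.Name is the display name of the assembly that failed to load.
        string simpleName = new AssemblyName(args.Name).Name;
        if (simpleName != RuntimeAssemblyName)
        {
            return null; // not ours: let other handlers (or the normal failure) proceed
        }

        lock (Gate)
        {
            // Create the dynamic assembly on first request, then keep returning it.
            if (runtimeAssembly == null)
            {
                runtimeAssembly = AssemblyBuilder.DefineDynamicAssembly(
                    new AssemblyName(RuntimeAssemblyName),
                    AssemblyBuilderAccess.Run);
            }
            return runtimeAssembly;
        }
    }
}
```

Returning `null` for unrelated names matters, because the event fires for every failed assembly load in the AppDomain, not only for the runtime assembly.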

Unfortunately, there are a few tricky points to getting this to work. The first problem is ensuring that the event handler will always be installed before the compiled code is run. With a console application this is easy: the code to hook up the event handler can just be added to the Main method before the other code runs. For class libraries, however, there is no Main method. A DLL may be loaded as part of an application written in another language, so it's not really possible to assume there is always a Main method available in which to hook up the event handler.

The second problem is ensuring that the referenced types all get inserted into the dynamic assembly before any code that references them is used. The System.AppDomain class also defines a TypeResolve event that is raised whenever the CLR is unable to resolve a type in a dynamic assembly. It gives the event handler the opportunity to define the type inside the dynamic assembly before the code that uses it runs. However, that event will not work in this case. The CLR will not fire the event for assemblies that are "statically referenced" by other assemblies, even if the referenced assembly is defined dynamically. This means that we need a way to run code before any other code in the compiled assembly runs, and have it dynamically inject the types it needs into the runtime assembly if they have not already been defined. Otherwise, when the CLR tries to load those types, it will notice that the dynamic assembly does not contain them and will throw a type load exception.

Fortunately, the CLR offers a solution to both problems: module initializers. A module initializer is the equivalent of a "static class constructor", except that it initializes an entire module, not just a single class. Basically, the CLR will:

Run the module constructor before any types inside the module are accessed.

Guarantee that only those types directly accessed by the module constructor will be loaded while it is executing.

Not allow code outside the module to access any of its members until after the constructor has finished.

It does this for all assemblies, including both class libraries and executables, and for EXEs will run the module constructor before executing the Main method.

The class defines a singleton that holds a reference to the dynamic assembly that the constructed types will be created in. It also holds a "hash set" that stores the set of types that have already been dynamically generated, and finally defines a member that can be used to define the type. This example just returns a System.Reflection.Emit.TypeBuilder instance that can then be used to define the class being generated. In a real system, the method would probably take in an AST representation of the class and do the generation itself.
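A class matching that description could look roughly like the sketch here. This is an illustration under assumptions, not a reproduction of the original listing: the member names, the use of type names as hash set keys, and returning `null` for an already-defined type are all choices made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public sealed class Loader
{
    // The single shared instance.
    public static readonly Loader Instance = new Loader();

    private readonly ModuleBuilder module;
    private readonly HashSet<string> definedTypes = new HashSet<string>();
    private readonly object gate = new object();

    private Loader()
    {
        // On older .NET Framework versions, use AppDomain.CurrentDomain.DefineDynamicAssembly.
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("$Runtime"), AssemblyBuilderAccess.Run);
        module = assembly.DefineDynamicModule("$Runtime");
    }

    // The dynamic assembly that all constructed types live in.
    public Assembly RuntimeAssembly
    {
        get { return module.Assembly; }
    }

    // Returns a TypeBuilder for a type that has not been defined yet,
    // or null if a type with this name was already generated.
    public TypeBuilder DefineType(string name)
    {
        lock (gate)
        {
            if (!definedTypes.Add(name))
            {
                return null;
            }
            return module.DefineType(name, TypeAttributes.Public | TypeAttributes.Class);
        }
    }
}
```

A caller would check the result for `null`, add fields or methods through the returned `TypeBuilder`, and then call `CreateType()` to make the type available to the CLR.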

Compiled assemblies that emit the following two references (shown in ILASM syntax):

Here "SharedLib" is the language's predefined runtime library that includes the "Loader" class defined above, and "$Runtime" is the dynamic runtime assembly that the constructed types will be inserted into.

A "module constructor" inside every assembly compiled in the language.

As far as I know, there are no .NET languages that allow module constructors to be defined in source. The C++/CLI compiler is the only compiler I know of that generates them. In IL, they look like this, defined directly in the module and not inside any type definitions:

For me, it's not a problem that I have to write custom IL to get this to work. I'm writing a compiler, so code generation is not an issue.
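For readers who are not writing their own compiler: C# 9 and later can emit a module initializer from source through the `System.Runtime.CompilerServices.ModuleInitializerAttribute` attribute (available from .NET 5). The sketch here only shows the hook; `ModuleStartup` and `HasRun` are hypothetical names, and the body is a placeholder for whatever setup the assembly needs.

```csharp
using System.Runtime.CompilerServices;

internal static class ModuleStartup
{
    // Set by the initializer so callers can observe that it ran.
    public static bool HasRun { get; private set; }

    // Must be static, parameterless, return void, and be accessible
    // from the containing module. Requires C# 9 or later.
    [ModuleInitializer]
    internal static void Initialize()
    {
        // Placeholder: this is where the AssemblyResolve handler would be
        // hooked up and the constructed types registered.
        HasRun = true;
    }
}
```

The compiler places a call to `Initialize` in the module's constructor, so it runs before any other code in the assembly, which is the same guarantee described above.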

In the case of an assembly that used the types tuple<i as int, j as int> and tuple<x as double, y as double, z as double> the module constructor would need to generate types like the following (here in C# syntax):
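As a hypothetical illustration of the shape such types might take (the class names and the use of generic parameters for the field types are assumptions, not necessarily what the compiler emits):

```csharp
// Stands in for tuple<i as int, j as int> when closed as Tuple_i_j<int, int>.
// The field names are part of the type name, so tuples with different
// field names never collapse into the same runtime type.
public sealed class Tuple_i_j<T1, T2>
{
    public T1 i;
    public T2 j;
}

// Stands in for tuple<x as double, y as double, z as double>
// when closed as Tuple_x_y_z<double, double, double>.
public sealed class Tuple_x_y_z<T1, T2, T3>
{
    public T1 x;
    public T2 y;
    public T3 z;
}
```

With this encoding, two assemblies that both use tuple<i as int, j as int> would refer to the same `Tuple_i_j<int, int>` in the runtime assembly, and whichever module constructor runs first would define it.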

Ugh, how is your module constructor definition different from the ordinary class constructor? Is the difference in using privatescope as opposed to private hidebysig?
– Dmitri Nesteruk, Dec 27 '10 at 13:19

Ahh, just figured it out. No difference except the module cctor is not placed in any particular type. Didn't know you could even do that :)
– Dmitri Nesteruk, Dec 27 '10 at 14:55

I think this is the type of thing the DLR is supposed to provide in C# 4.0. Kind of hard to come by information yet, but perhaps we'll learn more at PDC08. Eagerly waiting to see your C# 3 solution though... I'm guessing it uses anonymous types.