Introduction to Creating Dynamic Types with Reflection.Emit

Introduction

Dynamic types can provide the Framework developer with efficient programming abstractions without the performance penalty usually incurred by many abstractions. By coding against interfaces and using the Factory design pattern, you can develop a framework that has the generic benefits of abstractions, but also has the performance benefits of hard coded logic.

Dynamic type factories use a program's underlying metadata to determine the best way to 'build' a new class at runtime. The code for this class gets 'emitted' directly into an assembly in memory, and does not have to be run through the .NET language specific compiler. Once a class has been emitted, it is then 'baked' by the CLR and is ready to be consumed by your application.

This pattern lets you create very specific classes with hard coded logic, but can also be flexible because you can emit as many classes as needed, as long as all the classes implement the same public interface.

With reflection in .NET, there are now thousands of late binding abstractions that you can write in order to create generic multi-use functions or frameworks. These abstractions are incredibly valuable to enterprise developers because they can dramatically cut development time. Why write a small variation of logic ten times, for ten different classes that share a common pattern, when you can write a generic pattern once and have it work for every situation?

The problem with many of these late bound abstractions is that they usually come with a performance penalty. This is where the System.Reflection.Emit namespace (from here on, referred to as just Reflection.Emit) and dynamic types can be of great use. This article is part one of two, in which I'll discuss what dynamic types are, strategies to use when writing and consuming them, and how to create them. I'll cover one possible usage of dynamic types, and give some code. But I'm going to save a full example of implementing dynamic types, with full code, for my second part of this article.

Possible uses for dynamic types

The most common reason for using dynamic types is to solve a performance problem. One common pattern I've encountered many times as a programmer is the Object / Relational Database mapping framework, whose purpose is to provide a generic API for mapping class properties to database tables or stored procedure result sets. Most use some sort of metadata, such as XML, to map which columns from a result set get written to which property of a class. In order to do this, they use reflection to query a class for a desired property, and use reflection again to populate the property with the data from the result set.

This creates a framework that allows you to add new classes quickly and easily, with much less code. But using reflection can be a huge performance killer. Instead, you could create an O/R mapping framework that creates a dynamic type that has hard coded mapping logic specific to that class and the columns that are used to populate it.
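The reflection-driven approach described above can be sketched roughly as follows. This is only an illustration, not code from any particular framework: the Customer class, the ReflectionMapper name, and the dictionary that stands in for the XML mapping metadata are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

// Hypothetical entity used only for illustration.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class ReflectionMapper
{
    // Late-bound mapping: for every row, each property is looked up by name
    // and assigned through PropertyInfo.SetValue. The dictionary (column name
    // to property name) stands in for the XML mapping metadata.
    public static void Populate(DataRow row, object entity,
                                IDictionary<string, string> columnToProperty)
    {
        Type entityType = entity.GetType();
        foreach (KeyValuePair<string, string> pair in columnToProperty)
        {
            PropertyInfo property = entityType.GetProperty(pair.Value);
            property.SetValue(entity, row[pair.Key], null);
        }
    }
}
```

The property lookup and the late-bound SetValue call are repeated for every row, which is the cost a generated type with hard coded assignments avoids.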

What exactly is a dynamic type?

Dynamic types are types, or classes, manually generated and inserted into an AppDomain at runtime, from within the program. The cool thing about dynamic types is that the program can evaluate a set of given metadata and create a type that will be optimized for the situation at hand. To do this, you use the classes supplied by the Reflection.Emit namespace to create a new type, and emit functions directly into it.

The down side of creating dynamic types using Reflection.Emit is that you can't just dump C# code into your dynamic assembly and have the C# compiler compile it to IL. That would just be way too easy. You have to use the classes in Reflection.Emit to define and generate Type, Method, Constructor, and Property Definitions, and then insert or 'emit' IL opcodes directly into these definitions. It's harder than normal coding because you have to use, and generally understand, IL. It isn't all that bad. There are ways to make this learning curve easier, which I'll cover in a bit.

Defining the generic interface

There is one major problem with creating types with Reflection.Emit, and it's a pretty big one. When you're developing an application with dynamic types, you don't have an API to program against. Think about it. The class that you are going to generate at runtime doesn't exist at design time. So, how are you supposed to program against dynamic types? Ahh, the power of interfaces.

So, the first step is to figure out what the public interface for the dynamic type will be. Let's take a look at an example. Earlier, I mentioned an Object / Relational Database mapping framework that could map columns in a database to objects in your application. You could create a mapping function for each class in your app, or you could create a framework that knows what columns get assigned to what property, depending on the class it's loading.

So the interface that I might come up with for this type of framework could look like this:
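A minimal sketch of such an interface, assuming a single method that receives the DataRow and the entity to fill. The IORMapper and PopulateObject names appear elsewhere in this article; the parameter names are assumptions.

```csharp
using System.Data;

// Contract that every generated mapper implements. Consumers program against
// this interface because the concrete mapper type does not exist at design time.
public interface IORMapper
{
    // Copies column values from the row into the properties of the entity.
    void PopulateObject(DataRow row, object entity);
}
```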

This doesn't look all that exciting, but your dynamic type generator could read an XML file, and use its contents to create a new type that takes the input entity object, casts it to the appropriate type, and then assigns the columns from the DataRow to the appropriate properties in the entity object. This new dynamic type could then be used to populate every new object of that type.

Are there other O/R mapping frameworks out there already? Yes, but most utilize reflection or late binding to map and assign which column gets assigned to which property, and as I mentioned, reflection is one of the worst performance killers.

The dynamic type factory

The next thing to do is come up with a design for the class that will generate the dynamic types and return them to the caller. For the dynamic type generator, the factory pattern fits very well. The factory pattern is generally used when the developer wants to hide the details about how to create a new instance of a type. This pattern is commonly used when you have an abstract base class or interface and several classes that inherit from that base class or interface. The consumers of the type should not manually create a new instance of the type themselves, so they must call a factory method which determines which class to create and return it to the consumer. This is a great black box way to hide a type's initialization logic so it is not duplicated all over an application. This fits the needs of our solution perfectly since the caller can't explicitly invoke the constructor of the dynamic type. Also, I want to hide the implementation details of the dynamic type away from the caller. So the public API for the factory class will be something like this:
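A sketch of the public surface only, assuming a static class with a single factory method. The ORMapperFactory and CreateORMapper names are the ones used in this article; the null check and the placeholder body are assumptions, since the generation logic is what the rest of the article builds up.

```csharp
using System;
using System.Data;

public interface IORMapper
{
    void PopulateObject(DataRow row, object entity);
}

public static class ORMapperFactory
{
    // The only member callers see. How the mapper type is generated stays
    // hidden behind this method.
    public static IORMapper CreateORMapper(Type mappedType)
    {
        if (mappedType == null)
        {
            throw new ArgumentNullException("mappedType");
        }

        // Placeholder: type generation is intentionally left out of this sketch.
        throw new NotImplementedException("Type generation is not part of this sketch.");
    }
}
```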

In this O/R mapping framework, the function CreateORMapper would take the name of the passed in type, and then look for a node in a mapping XML file that matched that type name. This XML node would have a collection of inner nodes that tells the factory what column names map to what object properties. As the factory is generating the dynamic type, it would use this XML metadata to generate IL code to cast the input object to the type that this mapper is created for, and then create code to assign a value from a DataRow column to a specific object property. And that's about it.

Once this dynamic type has been generated, it can be used from that point forward for any new object of that type that needs to be populated from a DataRow. This way, the application only has to incur the cost of the type generation once.

The following sequence diagram roughly demonstrates how this works. First, the Consumer class calls the ORMapperFactory and asks it for an IORMapper instance. The Consumer passes in the type, “typeof(Customer)”, which the factory will use to determine which ORMapper it needs to generate and return. The Consumer then calls the newly generated ORMapper instance, passing in a DataRow and an empty instance of the type the mapper was generated for, which in this example is the Customer class. The ORMapper has hard coded logic that assigns data from the DataRow into the correct properties of the Customer class. The Consumer is then free to call the properties of the Customer class and get its values.

Setting up a dynamic type

Before getting down to showing how to actually emit IL into the IORMapper.PopulateObject() method, there are a few housekeeping tasks you need to take care of first. First, you need to set up an assembly to hold the new type. Since Reflection.Emit cannot add a new type to an existing assembly, you have to generate a brand new one in memory. To do this, you use the AssemblyBuilder class.

To create a new AssemblyBuilder instance, you need to start out with an AssemblyName instance. Create a new instance, and assign it the name you want to call your assembly. Then get the AppDomain from the static Thread.GetDomain() method. This AppDomain instance will allow you to create a new dynamic assembly with the DefineDynamicAssembly() method. Just pass in the AssemblyName instance and an enumeration value for AssemblyBuilderAccess. In this case, I don't want to save this assembly to a file, but if I did, I could use AssemblyBuilderAccess.Save or AssemblyBuilderAccess.RunAndSave.

Once the AssemblyBuilder has been created, a ModuleBuilder instance also needs to be created, which will be used later to create a new dynamic type. Use the AssemblyBuilder.DefineDynamicModule() method to create a new instance. You could create as many modules for your dynamic assembly as you like, but for this situation, only one is needed.

Luckily, once an AssemblyBuilder and ModuleBuilder have been created, the same instances can be used over and over to create as many new dynamic types as you need, so they only need to be created once.
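The setup steps above can be sketched as follows. One deliberate difference from the text: this sketch calls the static AssemblyBuilder.DefineDynamicAssembly() method, which takes the same two arguments as the AppDomain instance method and is also available on newer .NET runtimes. The DynamicAssemblySetup class and CreateModule method are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class DynamicAssemblySetup
{
    // Creates an in-memory assembly and a single module to hold generated types.
    public static ModuleBuilder CreateModule(string assemblyName)
    {
        AssemblyName name = new AssemblyName(assemblyName);

        // Run: the assembly can be executed but not saved to disk.
        AssemblyBuilder assembly =
            AssemblyBuilder.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);

        // One module is enough for this scenario.
        return assembly.DefineDynamicModule(assemblyName);
    }
}
```

The returned ModuleBuilder would typically be kept in a static field so that later type definitions reuse it.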

Next, in order to create the actual dynamic type, you have to create a new TypeBuilder instance. The following code will create a new type and assign it to your dynamic assembly:
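A sketch of the DefineType() call, assuming the four-argument overload (name, attributes, parent type, interfaces). The MapperTypeDefinition helper is a hypothetical name, and the "ObjectMapper_" prefix follows the naming used later in this article.

```csharp
using System;
using System.Data;
using System.Reflection;
using System.Reflection.Emit;

public interface IORMapper
{
    void PopulateObject(DataRow row, object entity);
}

public static class MapperTypeDefinition
{
    // Defines (but does not yet bake) a public class that derives from
    // System.Object and declares that it implements IORMapper.
    public static TypeBuilder Define(ModuleBuilder module, Type mappedType)
    {
        return module.DefineType(
            "ObjectMapper_" + mappedType.Name,              // class name
            TypeAttributes.Public | TypeAttributes.Class,   // type characteristics
            typeof(object),                                 // base class
            new Type[] { typeof(IORMapper) });              // implemented interfaces
    }
}
```

The type cannot be baked with CreateType() at this point, because the interface method has not been implemented yet.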

You create an instance of a TypeBuilder class by calling the ModuleBuilder.DefineType() method, passing in the class name as the first argument, and for the second argument, an enumeration value of TypeAttributes that defines all the characteristics of your dynamic type. The third argument is a Type instance of the class that your dynamic type inherits from, in this case, System.Object. And the fourth argument is an array of interfaces that the dynamic type will implement. This is very important in this solution, so I need to pass in the type for the IORMapper.

There is one thing that I'd like to point out here. Have you noticed a pattern in how you create instances of these Reflection.Emit classes? AppDomain is used to create AssemblyBuilder, AssemblyBuilder is used to create ModuleBuilder, and ModuleBuilder is used to create TypeBuilder. This is another example of a factory pattern, which is a common theme throughout the Reflection.Emit namespace. Can you guess how you would create a MethodBuilder, ConstructorBuilder, FieldBuilder, or a PropertyBuilder class? Through the TypeBuilder, of course!
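To illustrate the builder chain, here is a small sketch in which a TypeBuilder hands out a FieldBuilder, a PropertyBuilder, and a MethodBuilder for a read-only property. The BuilderChainDemo, Sample, and Count names are made up for the example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class BuilderChainDemo
{
    public static Type Build()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("BuilderChainDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("BuilderChainDemo");
        TypeBuilder type = module.DefineType(
            "Sample", TypeAttributes.Public | TypeAttributes.Class);

        // Every member builder comes from the TypeBuilder.
        FieldBuilder field = type.DefineField("count", typeof(int), FieldAttributes.Private);
        PropertyBuilder property = type.DefineProperty(
            "Count", PropertyAttributes.None, typeof(int), Type.EmptyTypes);
        MethodBuilder getter = type.DefineMethod(
            "get_Count",
            MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
            typeof(int),
            Type.EmptyTypes);

        // The property's code lives in its accessor method.
        ILGenerator il = getter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // push "this"
        il.Emit(OpCodes.Ldfld, field); // push this.count
        il.Emit(OpCodes.Ret);          // return it

        property.SetGetMethod(getter);
        return type.CreateType();
    }
}
```

No constructor is defined here, so CreateType() supplies a default one.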

But I don't want to learn IL!

OK, now it's time to get down and dirty with the Reflection.Emit classes and create some IL. But what if you really want (or need) to work with dynamic types, but you don't want to spend weeks poring through the IL specification and other documentation to learn IL? Not a problem. Microsoft gives us a tool that comes with Visual Studio .NET that will give you a huge head start: ILDasm.exe.

ILDasm lets you examine the internals of an assembly, most notably the metadata and the IL code that an assembly is made up of. This is where ILDasm becomes a huge help to you when creating dynamic types. Instead of trying to figure out what IL code you need to generate for your dynamic type, you can simply prototype the dynamic type in C#, compile it into an assembly, and then use ILDasm to dump out the IL code. After that, it's a simple matter of figuring out what the IL means, and then trying to recreate it with the classes available in Reflection.Emit. ILDasm was a huge help to me in learning the ins and outs of creating dynamic types.

Now, it's true that you don't strictly need to know or understand IL to create dynamic types. But it will greatly help if you have a general understanding of the basic IL syntax and opcodes, as well as how stack based programming works. I'm not going to even try to cover this, but at the end of this article, I list a few reads that were really instrumental to me in learning IL.

Anatomy of emitting a method using Reflection.Emit

Whether you want to write a method, constructor, or property, you are, in essence, writing a method; a block of code that performs a piece of functionality. And, there are a few items to be aware of when defining one of these types of code constructs with Reflection.Emit.

In C#, if a type's default constructor has no functionality, you are not required to define it in the class. The C# compiler takes care of this for you. The same is true in both IL and Reflection.Emit; it's all handled for you behind the scenes by either ilasm.exe or the TypeBuilder.CreateType() method, respectively. In this example, I don't have any functionality to add to the constructor, but I'm going to define it anyway because it's a good, easy example of a method to go over.

Now that a TypeBuilder instance has been created, the next step is to create an instance of a ConstructorBuilder class. The TypeBuilder.DefineConstructor() method accepts three arguments: a MethodAttributes enumeration, a CallingConventions enumeration, and a Type array that corresponds to the list of input arguments for the constructor. This is shown below:
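A sketch of the constructor definition. The ConstructorDefinition helper is a hypothetical name, and the IL header in the comment is roughly what ILDasm shows for a C# default constructor rather than output captured from this project.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ConstructorDefinition
{
    // Corresponding IL header, approximately:
    //   .method public hidebysig specialname rtspecialname
    //           instance void .ctor() cil managed
    public static ConstructorBuilder DefineDefaultConstructor(TypeBuilder typeBuilder)
    {
        return typeBuilder.DefineConstructor(
            MethodAttributes.Public | MethodAttributes.HideBySig |
            MethodAttributes.SpecialName | MethodAttributes.RTSpecialName,
            CallingConventions.Standard,   // let the runtime pick the convention
            Type.EmptyTypes);              // a default constructor takes no arguments
    }
}
```

This only declares the constructor; its body still has to be emitted before the type can be baked.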

Notice that in my Reflection.Emit code, I don't have a MethodAttributes.Instance value defined, but the IL code has an “instance” attribute assigned to the constructor definition. This is because the MethodAttributes enumeration doesn't have one. Instead, it has a MethodAttributes.Static value. If the Static value is not set, then the ConstructorBuilder implicitly sets the “instance” attribute for you.

The next value I passed into the DefineConstructor() method was CallingConventions.Standard. The MSDN documentation has fairly little information about the different values for the CallingConventions enumeration. But from what I can tell, if you pass in Standard, the CLR will decide what the appropriate calling convention is for you. So, I just always default to this value.

The last value passed into DefineConstructor is a Type array. All function builders take this argument, and it corresponds to the Type of each argument passed into the method being defined. The order of the types in this array must match the order in the method argument list. Since a default constructor does not take any arguments, the predefined Type.EmptyTypes, which is an empty Type array, can be used.

The funny thing about IL is that you don't get very much for free. In C#, even though every class somewhere along the inheritance chain inherits from System.Object, you don't actually have to call the Object's base constructor (though you can, if you want to). But in IL, you must call the base class constructor for your class, and in this example, you use the System.Object default constructor.

In order to do this with Reflection.Emit, you need to start with an ILGenerator instance. This class is the heart of most of the work you'll do with dynamic types. The ILGenerator has an Emit() method that is used to actually pump IL opcodes into your new methods. An ILGenerator instance is created by calling GetILGenerator() on the 'builder' object you are currently using (ConstructorBuilder or MethodBuilder; a PropertyBuilder has no IL of its own, so a property's code goes into the MethodBuilder instances for its get and set accessors). The Emit() method has 17 overloads, so I won't try to go over each. But, each overload takes a value from one of the OpCodes class static fields as its first argument. The second argument corresponds to whatever IL opcode argument is needed, if any.

Another important item to keep in mind when creating instance methods in both IL and Reflection.Emit is that there is a hidden argument passed into every method. This hidden argument is always the first input argument, and it is a reference to the object that the method belongs to. This is how you are able to use the “this” keyword in C#, or the “Me” keyword in VB.NET. This is an important fact because any time you need to call an instance method or field from within a class, you have to use this argument to reference the instance of the calling type. Another interesting tidbit about IL is that arguments are always referred to by their positional index from the list of arguments. Because of this, the hidden argument that I just mentioned is always referred to as argument 0. And any explicitly defined method arguments are referenced starting at index 1. So what that means is that every instance method has at least one argument, even default constructors and property getters. And what about static methods? Static methods don't get this hidden argument, so explicitly defined method arguments should be referenced starting at index 0.
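The argument numbering can be seen side by side in a small sketch that emits the same addition as a static method and as an instance method. The ArgumentIndexDemo, Adder, AddStatic, and AddInstance names are made up for the example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ArgumentIndexDemo
{
    public static Type Build()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("ArgumentIndexDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("ArgumentIndexDemo");
        TypeBuilder type = module.DefineType(
            "Adder", TypeAttributes.Public | TypeAttributes.Class);
        Type[] twoInts = new Type[] { typeof(int), typeof(int) };

        // Static method: no hidden argument, so a and b are arguments 0 and 1.
        MethodBuilder addStatic = type.DefineMethod(
            "AddStatic", MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), twoInts);
        ILGenerator il = addStatic.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        // Instance method: argument 0 is "this", so a and b are arguments 1 and 2.
        MethodBuilder addInstance = type.DefineMethod(
            "AddInstance", MethodAttributes.Public, typeof(int), twoInts);
        il = addInstance.GetILGenerator();
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Ldarg_2);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return type.CreateType();
    }
}
```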

Here lies another big difference between IL and Reflection.Emit. Notice in the IL code above, on line IL_0001, you use the “call” opcode and pass in “instance void [mscorlib]System.Object::.ctor()”. This basically means, call the instance constructor for System.Object. To do this with Reflection.Emit, you need to use Reflection and create a ConstructorInfo instance that corresponds to the constructor for System.Object. Then, you pass this instance in as the second argument to the Emit() method.

So, now that all that is out of the way, let's take a look at what the default constructor does. As I mentioned before, I'm going to assume that you know the basics about IL and stack based programming as I go over this. In order to call the constructor of Object, you first need to add the hidden “this” argument onto the stack. Then use the “call” opcode and the ConstructorInfo instance. This is the equivalent of calling “this.base();”, which is illegal in C# if the class inherits directly from System.Object, but required in IL. Finally, a return opcode is used to tell the thread to leave the constructor, which is required at the end of every method.
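The three steps just described (load "this", call the base constructor, return) can be sketched like this. The ConstructorBody class and EmitBaseCall method are hypothetical names; the ConstructorBuilder is assumed to have been defined with no arguments on a type that derives directly from System.Object.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ConstructorBody
{
    public static void EmitBaseCall(ConstructorBuilder constructor)
    {
        // Reflection handle for System.Object's parameterless constructor.
        ConstructorInfo objectCtor = typeof(object).GetConstructor(Type.EmptyTypes);

        ILGenerator il = constructor.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);          // push the hidden "this" argument
        il.Emit(OpCodes.Call, objectCtor); // call System.Object::.ctor()
        il.Emit(OpCodes.Ret);              // leave the constructor with an empty stack
    }
}
```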

Now, a constructor is just like any other method, except that it must not have a return value. In IL, when the ret opcode is called, the CLR will take whatever value you have at the top of the stack and try to return it. This can be a problem in a constructor if you leave a value in the stack. If the stack is not empty when return is called, you'll get the ever cryptic “Common Language Runtime detected an invalid program” error. What is even worse is that you won't even get this error until you are running your application and actually trying to create an instance of your dynamic type by calling its constructor. (To verify that your IL code is syntactically correct before you run it, see the section “How do I know that my IL is correct?” at the end of the article). In fact, if you have more than one value loaded on the stack by the time the ret opcode is called, you'll get this same error. To test this out, put the following code just before the ret opcode:

il.Emit(OpCodes.Ldc_I4_3);
//il.Emit(OpCodes.Pop);

Run a test program and see what happens. It blows chunks, right? What the first line does is load a constant Int32 value of 3 onto the stack. When ret is called, the CLR sees that you have a 3 on the stack, but that your method has a return type of void. This is illegal, so it throws an exception. Now, uncomment the next statement. What the pop opcode does is remove the top value from the stack. Now, your constructor has nothing in its stack when the ret opcode executes, and it can return successfully.

This basic structure can be followed when creating other constructors, functions, and properties for your dynamic types. I'm not going to cover the main function that actually maps DataRow columns to object properties, since it's fairly repetitive and I've already covered the needed basics. The important thing to remember is that the easiest way to create functions with Reflection.Emit is to prototype them in C# first, and then dump out the IL with ILDasm. After that, you just create an ILGenerator and use the Emit() method to emit IL into your dynamic type in exactly the same structure that ILDasm showed you.

Creating and using an instance of the dynamic type

Now that all the tools are in place to create a dynamic type, I have one last area to cover: how the factory class actually creates a new dynamic type and returns a new instance to the caller, and how the dynamic type can be used. Below is the basic structure of the factory class:
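What follows is a compact sketch of such a factory, not the author's implementation. Several simplifications are assumptions made for the example: column names are taken to equal the public property names instead of being read from a mapping XML file, generated types are cached in a dictionary, no explicit constructor is emitted (CreateType() then supplies a default one), and the Customer class exists only for illustration. Mapped types must be public classes, and column values must not be DBNull.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;
using System.Reflection.Emit;

public interface IORMapper
{
    void PopulateObject(DataRow row, object entity);
}

// Hypothetical entity used only for illustration.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class ORMapperFactory
{
    private static readonly object Sync = new object();
    private static readonly Dictionary<Type, Type> MapperTypes = new Dictionary<Type, Type>();
    private static ModuleBuilder module;

    public static IORMapper CreateORMapper(Type mappedType)
    {
        Type mapperType;
        lock (Sync)
        {
            // Generate the mapper type only the first time it is requested.
            if (!MapperTypes.TryGetValue(mappedType, out mapperType))
            {
                mapperType = BuildMapperType(mappedType);
                MapperTypes[mappedType] = mapperType;
            }
        }
        ConstructorInfo constructor = mapperType.GetConstructor(Type.EmptyTypes);
        return (IORMapper)constructor.Invoke(null);
    }

    private static Type BuildMapperType(Type mappedType)
    {
        if (module == null)
        {
            AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
                new AssemblyName("DynamicObjectMapper"), AssemblyBuilderAccess.Run);
            module = assembly.DefineDynamicModule("DynamicObjectMapper");
        }
        TypeBuilder type = module.DefineType(
            "ObjectMapper_" + mappedType.Name,
            TypeAttributes.Public | TypeAttributes.Class,
            typeof(object),
            new Type[] { typeof(IORMapper) });
        MethodBuilder method = type.DefineMethod(
            "PopulateObject",
            MethodAttributes.Public | MethodAttributes.Virtual | MethodAttributes.HideBySig |
            MethodAttributes.NewSlot | MethodAttributes.Final,
            typeof(void),
            new Type[] { typeof(DataRow), typeof(object) });

        MethodInfo getItem = typeof(DataRow).GetMethod("get_Item", new Type[] { typeof(string) });
        ILGenerator il = method.GetILGenerator();
        foreach (PropertyInfo property in mappedType.GetProperties())
        {
            MethodInfo setter = property.GetSetMethod();
            if (setter == null || property.GetIndexParameters().Length != 0) continue;

            il.Emit(OpCodes.Ldarg_2);                           // entity (argument 0 is "this")
            il.Emit(OpCodes.Castclass, mappedType);             // (Customer)entity
            il.Emit(OpCodes.Ldarg_1);                           // row
            il.Emit(OpCodes.Ldstr, property.Name);              // column name
            il.Emit(OpCodes.Callvirt, getItem);                 // row[columnName]
            il.Emit(OpCodes.Unbox_Any, property.PropertyType);  // unbox or cast the value
            il.Emit(OpCodes.Callvirt, setter);                  // entity.Property = value
        }
        il.Emit(OpCodes.Ret);

        type.DefineMethodOverride(method, typeof(IORMapper).GetMethod("PopulateObject"));
        return type.CreateType();
    }
}
```

A caller would request a mapper with ORMapperFactory.CreateORMapper(typeof(Customer)) and then pass each DataRow and a new Customer to PopulateObject(). For simplicity the cast is emitted once per property; a closer match to a hand-written prototype would cast once and store the result in a local.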

The first thing the factory does is check if the TypeBuilder class has been created yet. If it hasn't, the factory calls the private methods that I have already covered, which create the DynamicAssembly, DynamicModule, TypeBuilder, the constructor, and the PopulateObject() method for the dynamic type. Once these steps are complete, it uses the TypeBuilder.CreateType() method to return a Type instance for the ObjectMapper. From this Type instance, I can then call GetConstructor(), and invoke the constructor to actually create a working instance of the dynamic type.

This is the moment of truth. Creating a new type and calling the constructor on the type will invoke the constructor that was built earlier. If any of the IL was emitted wrong, the CLR will throw an exception. If everything is fine, you'll have a working copy of the ObjectMapper_<TypeName>. After the factory has an instance of the ObjectMapper_<TypeName>, it will cast it down and return an IORMapper.

Using a dynamic type is pretty simple. The main thing to remember is that you have to code against an interface, because at design time, the class doesn't exist. Below is the example code that calls the ObjectMapperFactory class and asks for an IORMapper instance. If it is the first time the factory is called, it will generate the ORMapper and return an instance of it. From that point on, anytime the factory is called, it doesn't have to generate the type, it can just create an instance and return it. The consumer can then call PopulateObject() and pass in the DataRow and an empty instance of the Customer class. This is shown below:
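A consumer-side sketch, written purely against the interface. To keep the block self-contained, a hand-written HandWrittenCustomerMapper stands in for the instance the factory would return; it is also the kind of C# prototype you would write first before emitting the equivalent IL. The Customer, Consumer, and LoadCustomer names are assumptions.

```csharp
using System;
using System.Data;

public interface IORMapper
{
    void PopulateObject(DataRow row, object entity);
}

// Hypothetical entity used only for illustration.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Stand-in for the generated mapper: the hard coded logic a dynamic type
// for Customer would contain.
public class HandWrittenCustomerMapper : IORMapper
{
    public void PopulateObject(DataRow row, object entity)
    {
        Customer customer = (Customer)entity;
        customer.Name = (string)row["Name"];
        customer.Age = (int)row["Age"];
    }
}

public static class Consumer
{
    // The consumer never sees the concrete mapper type, only IORMapper.
    public static Customer LoadCustomer(IORMapper mapper, DataRow row)
    {
        Customer customer = new Customer();
        mapper.PopulateObject(row, customer);
        return customer;
    }
}
```

In the real flow, the mapper argument would come from the factory call described above rather than from new HandWrittenCustomerMapper().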

You could also make the ORMapper act like a factory as well, by just passing in a DataRow and having the PopulateObject method create a new instance of Customer, and then populate it with the data.

How do I know that my IL is correct?

OK, so now I've completed the dynamic type factory, I compile the solution in Visual Studio, and there are no errors. If I run it, it'll work, right? Maybe. The downside of Reflection.Emit is that you can emit just about any combination of IL that you want. But there is no design time compiler checking to see if what you wrote is valid IL. Sometimes, when you “bake” your type with TypeBuilder.CreateType(), it'll throw an error if there is something wrong, but only for certain problems. Sometimes, you won't get an error until you actually try to call a method for the first time. Remember the JIT compiler? The JIT compiler won't try to compile and verify your IL until the first time a method is called. So it's very possible, and actually probable, that you won't find out that your IL is invalid until you are actually running your application, the type has been generated, and you are calling the dynamic type for the first time. But the CLR gives helpful errors, right? Not likely. Usually, I got the ever helpful “Common Language Runtime detected an invalid program” exception.

OK, so how do you tell if your dynamic type contains valid IL? PEVerify.exe to the rescue! PEVerify is a tool that comes with .NET that will inspect an assembly for valid IL code, structure, and metadata. But, in order to use PEVerify, you must save the dynamic assembly to a physical file (remember, up until now, the dynamic assembly only exists in memory). To create an actual file for the dynamic assembly, you'll need to make a few changes to the factory code. First, change the last argument in the AppDomain.DefineDynamicAssembly() method from AssemblyBuilderAccess.Run to AssemblyBuilderAccess.RunAndSave. Second, change AssemblyBuilder.DefineDynamicModule() to pass in the assembly file name as its second argument. And finally, add a new line of code to save off the assembly to file. I put it just after I call TypeBuilder.CreateType(), as shown below:

Now that that is in place, run the application and create the dynamic type. Once it has run, you should have a new DLL named “DynamicObjectMapper.dll” in the Debug folder for your solution (assuming you are using a Debug build). Now, open the .NET command prompt window, and type “PEVerify <path to the assembly>\DynamicObjectMapper.dll” and hit Enter. PEVerify will validate the assembly, and either tell you everything is fine, or tell you what is wrong. The nice thing about PEVerify is that it gives decently detailed information about what is wrong and where to find the problem. Also note that just because PEVerify comes up with an error doesn't mean that the assembly won't run. For example, when I first wrote the factory class, I used the “callvirt” opcode when I called the static String.Equals() method. This caused PEVerify to output an error, but it still ran. This was an easy fix to call the static method with the “call” opcode instead, and the next time I ran PEVerify, it found no errors.

One last thing, if you change your code to output a physical file, be sure to change it back to the way it was before. This is because, once a dynamic assembly has been saved to file, it is locked down so you won't be able to add any new dynamic types. This would be a problem in this situation because your application might need to create multiple object mappers based on several different types. But after the first type has been generated, if you save off the assembly to file, an exception will be thrown.

Future of Reflection.Emit in .NET Framework 2.0

So, what new stuff is there in .NET 2.0? Well, one neat new feature is something called Lightweight Code Generation (LCG). What LCG does is provide a very fast way to emit global static methods into an assembly. But here is the cool part. You don't have to create a dynamic assembly, dynamic module, and dynamic type to emit the method into. You can emit the method straight into the main application assembly! Just create a new DynamicMethod class using one of its six constructor overloads (no factory methods needed). Next, create your ILGenerator and emit your IL opcodes. Then, to invoke the method, you can either call the DynamicMethod.Invoke() method, or use the DynamicMethod.CreateDelegate() method to get a delegate instance that points to the dynamic method, and then invoke the delegate at will.
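A minimal LCG sketch, assuming a trivial method that squares an integer. The LightweightCodeGenDemo class, the UnaryOp delegate, and the method name "Square" are made up for the example; the DynamicMethod is associated with the module that contains the demo class.

```csharp
using System;
using System.Reflection.Emit;

public static class LightweightCodeGenDemo
{
    public delegate int UnaryOp(int value);

    public static UnaryOp CreateSquare()
    {
        // No AssemblyBuilder, ModuleBuilder, or TypeBuilder is needed.
        DynamicMethod method = new DynamicMethod(
            "Square",
            typeof(int),                            // return type
            new Type[] { typeof(int) },             // parameter types
            typeof(LightweightCodeGenDemo).Module); // module the method is associated with

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // static method: the first argument is index 0
        il.Emit(OpCodes.Dup);     // duplicate it on the stack
        il.Emit(OpCodes.Mul);     // value * value
        il.Emit(OpCodes.Ret);

        // Bind the emitted code to a delegate that can be invoked repeatedly.
        return (UnaryOp)method.CreateDelegate(typeof(UnaryOp));
    }
}
```

The delegate can be stored and reused, so the cost of emitting the method is paid once.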

LCG seems very similar to another new feature in .NET 2.0, Anonymous Methods. Anonymous Methods are global static methods that do not belong to any class, and are exposed and invoked as a delegate. I wouldn't be very surprised if, behind the scenes, the DynamicMethod class just creates an Anonymous Method in the assembly, especially since DynamicMethod exposes a method that returns a delegate.

The solution presented in this article could just as easily be implemented with the new DynamicMethod class. Since it's a class with only one public method, it's a perfect candidate for LCG. It would just require a bit of restructuring in the factory method. Instead of having the factory create an instance of IDataRowAdapter and passing it back to the user, you could have a class defined at design time called DataRowAdapter. It has one private variable of type delegate, and the public GetOrdinal() method just calls the Invoke() method on the delegate and returns the value. The factory could just create a new instance of the DataRowAdapter, create the DynamicMethod, get the delegate from the DynamicMethod, and store it in the DataRowAdapter. When the user calls GetOrdinal, the delegate will be invoked and the integer ordinal would be returned.

The other major addition to the Reflection.Emit namespace is full support for Generic Types when creating dynamic types.

Further reading for this topic

For a good introduction to IL, check out the first few chapters of Simon Robinson's book 'Expert .NET 1.1 Programming' (this is an awesome .NET book, in general). Next, I'd suggest Jason Bock's book 'CIL Programming: Under the Hood of .NET'. The whole book is about IL programming, and he has several chapters about creating dynamic types, and a great chapter on debugging IL and dynamic types. And finally, if you want the “black book” of IL, get 'Inside Microsoft .NET IL Assembler' by Serge Lidin. This one is fairly dry, but you can't question the content. Serge Lidin is updating this book for a second edition, which will cover 2.0 content and is scheduled to be released in May.

About the Author

I have been a professional developer since 1996. My experience comes from many different industries; Data Mining Software, Consulting, E-Commerce, Wholesale Operations, Clinical Software, Insurance, Energy.

I started programming in the military, trying to find better ways to analyze database data, eventually automating my entire job. Later, in college, I automated my way out of another job. This gave me the great idea to switch majors to the only thing that seemed natural…Programming!

Yes, I've worked with the CodeDom several times, and you're right, you could do the same thing with CodeDom. In fact, that's the approach that ASP.NET takes in order to combine the code-behind .cs files and the .aspx pages into an assembly.

I've found that once you know (or even have the basic understanding of) IL, Reflection.Emit is just about as easy, and much, much less verbose than trying to create code via the CodeDom.

Also, performance is one of the main reasons people will use Reflection.Emit vs the CodeDom. Since Reflection.Emit dumps IL opcodes directly into an assembly, you get the performance benefit of not having to go through the language compiler like you would with CodeDom.

Also, once an assembly is created with CodeDom, it is static and cannot be changed, meaning you cannot add more types to it. So you can run into a problem if your app creates multiple dynamic types, but doesn't know ahead of time which ones it will create. With the CodeDom approach, the app could end up compiling and loading many assemblies into your process space, one for each dynamic type it determines it needs to create.

With Reflection.Emit, even after you create and bake one type, you can create more types at any time, all within the original assembly you generated.

I am not sure about not being able to add new types in CodeDom. For example, you can create one assembly per type, or you can unload the old assembly (run it in its own app domain) and then recompile it with the new types.