ILRewriting for beginners

Runtime IL-rewriting can be used to add behavior such as logging to applications, or to redirect calls from one API to another. This article and the accompanying source code explain how to substitute a method call at runtime.

Introduction

This article is meant to be a small tutorial on runtime IL-rewriting. I don't pretend that it is a complete one,
since IL-rewriting can be a big and sometimes difficult topic. The target audience is developers who have no prior experience of IL-rewriting but are curious to try it out.

Background

My past articles about the
CLR hosting API,
and a mixed-mode profiler
have primarily been aimed at diagnostics and testing of running applications. This article also has its origin in the testing field.
Once I saw a video by Roy Osherove (a unit testing guru) speaking about the implementation of some mocking frameworks.
Mock frameworks are used in unit testing to isolate a class and remove its dependencies on other objects. Some of them do it by using IL-rewriting to fake and stub objects.

A cool tool to check out is Moles
from Microsoft Research, which allows you to override return values in the System libraries. Let's say that you have a legacy system and a
calendar implementation to test,
and the problem is that it behaves differently depending on the time the test is run. Using IL-rewriting, you will be able to override the
System.DateTime.Now in your tests.
The benefit is that your tests will be deterministic and always give the same result.

Code Rewriting

Modifying the Code of a Running System is an Old Practice with Native Binaries

The author could write self-modifying code.
For license checking one could ship a crippled binary, and if the license check was successful,
some important code could be decrypted, copied from a hidden place in the binary, and written back to the place from which it had been removed.

To Circumvent a License Check or Give Unlimited Lives in a Computer Game

Some people made
permanent changes in the EXE, by changing a conditional jump to an unconditional jmp.
Some EXEs were encrypted on disk, so this approach was not always possible. Instead, people used a small loader and applied the patch to the running process.

To Add Code to a Native Application

You had to find a free memory region (aka a code cave), either on disk or in the memory space of the running process.
To add the code to the program, you copied some instructions from the beginning of function A to your code cave,
and at the beginning of function A you inserted a call to the code cave. At the end of the code cave you added a return (ret), and execution continued from where it was called.

IL-rewriting works more or less in the same way. In fact, it is much easier:

No need for code caves. Free space can be allocated directly in the CLR.

IL assembler is more readable than native assembler code, because it can often also be viewed as C# or VB.NET.

Metadata contains the full type information of classes and types.

Rewriting through the ICorProfiler API

Fortunately, there is a profiler API which makes it possible to interact with the CLR and get notifications from it.
It is an unmanaged API. This is unfortunately necessary, since otherwise the profiler code itself would also be profiled.
Let's look at the interfaces we need to build a profiler.

ICorProfilerCallback Interfaces

Your own profiler must implement this interface in order to get notifications from the CLR.
The first callback we will look at is Initialize, which is called at startup.

The parameter we get is a pointer to an object implementing ICorProfilerInfo, ICorProfilerInfo2, and/or ICorProfilerInfo3.
The object you get depends on the version of the CLR that is running. ICorProfilerInfo and ICorProfilerInfo2 are the most important.
The ICorProfilerInfo3 interface is implemented in .NET 4.0 and adds attach and detach capabilities.
ICorProfilerInfo3 inherits from ICorProfilerInfo2, which inherits from ICorProfilerInfo.
So if you get an ICorProfilerInfo2 object, there is no need to query for the
ICorProfilerInfo object.
The first thing we should do is therefore to query for ICorProfilerInfo2.

Subscribing to Events

You can tell the CLR what events you are interested in getting notifications from by setting an event mask.
The mask is constructed by OR-ing together some enum values and is then passed to SetEventMask on the ICorProfilerInfo object.
This can only be done once, and only inside the Initialize method. Calling it from other functions later will result in an error.
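As a minimal sketch of building such a mask: the two flag values below are copied from the COR_PRF_MONITOR enum in corprof.h as an assumption (a real profiler would include that header rather than redefine them), and FakeProfilerInfo is a hypothetical stand-in for ICorProfilerInfo so that the snippet compiles without the Windows SDK.

```cpp
#include <cstdint>

// Assumed to match corprof.h; include the real header in an actual profiler.
enum : std::uint32_t {
    COR_PRF_MONITOR_MODULE_LOADS    = 0x00000004,
    COR_PRF_MONITOR_JIT_COMPILATION = 0x00000020,
};

// Hypothetical stand-in for ICorProfilerInfo (not the real COM interface).
struct FakeProfilerInfo {
    std::uint32_t eventMask = 0;

    // Mimics the shape of SetEventMask: stores the mask, returns 0 (S_OK).
    int SetEventMask(std::uint32_t mask) {
        eventMask = mask;
        return 0;
    }
};

// Would be called from the profiler's Initialize callback.
int subscribeToEvents(FakeProfilerInfo& info) {
    const std::uint32_t mask =
        COR_PRF_MONITOR_MODULE_LOADS | COR_PRF_MONITOR_JIT_COMPILATION;
    return info.SetEventMask(mask);
}
```

The JIT compilation flag is the one that matters for rewriting, since it enables the JIT callbacks.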

The JITCompilationStarted callback is called for every managed method when its IL code is about to be
JITted into native code.
This is the window of opportunity we have to do some IL-Rewriting.

Steps to Follow

What we get from the JITCompilationStarted callback is a FunctionID.
By using the FunctionID as a parameter to ICorProfilerInfo::GetFunctionInfo we can obtain its ClassID and ModuleID.
A call to ICorProfilerInfo::GetModuleInfo with the ModuleID will return its Module name, and its AssemblyID.

IMetaDataImport Interface

This interface is for doing lookups in the metadata. You can for example iterate over all methods of a class, or find the parent class or interfaces of a class.

IMetaDataEmit

This interface is for emitting/generating new Modules, Assemblies, Classes, Methods, Strings etc. If you are interested in using methods from other assemblies,
you will have to generate a mdMethodRef to that method in the module you will call it from. It is sort of like a forward declaration or external declaration in C.
The loading of an external assembly is automatically taken care of by the CLR. Note that it will be loaded when the method is executed, not when the
mdMethodRef is created.

Internal Structures

The IL-code of a method contains a header, describing the IL-code.

In its Easiest Implementation

This header is just 1 byte: 6 bits for the code length and 2 bits for flags.
This structure is called IMAGE_COR_ILMETHOD_TINY. A tiny method must fulfill the following requirements.

Small method - IL code is max 63 bytes.

No exception handling

No local variables
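To make the 1-byte layout concrete, here is a small sketch that decodes the first header byte by hand. The constant values follow the method header format in ECMA-335 (II.25.4); the names decodeFirstHeaderByte and HeaderKind are made up for this example. In real profiler code you would normally use the SDK structures instead.

```cpp
#include <cstdint>

// The two low bits of the first header byte select the header format.
constexpr std::uint8_t kFormatMask = 0x03;
constexpr std::uint8_t kTinyFormat = 0x02;
constexpr std::uint8_t kFatFormat  = 0x03;

struct HeaderKind {
    bool isTiny;            // 1-byte header
    bool isFat;             // larger header, not parsed in this sketch
    std::uint32_t codeSize; // IL size in bytes, only meaningful when isTiny
};

// Hypothetical helper: classify a method header from its first byte.
HeaderKind decodeFirstHeaderByte(std::uint8_t firstByte) {
    const std::uint8_t format = firstByte & kFormatMask;
    if (format == kTinyFormat) {
        // The upper 6 bits hold the code size, hence the 63-byte limit.
        return {true, false, static_cast<std::uint32_t>(firstByte >> 2)};
    }
    return {false, format == kFatFormat, 0};
}
```

For example, a tiny method with 10 bytes of IL has the header byte (10 << 2) | 0x2 = 0x2A.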

The other structure is IMAGE_COR_ILMETHOD_FAT. It is a more complicated header, containing stack size, type information of local variables, and information about sub sections.
Use of exception handling results in one or more extra sections. If you add prelude code, you will have to update the start and end offsets of the exception handling clauses.
If only prelude code has been added, the offsets are easily adjusted by adding the size of the new IL.
Adding new IL instructions in the middle of a method is more complicated.
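The offset adjustment for prelude code can be sketched as follows. EhClause is a simplified, hypothetical mirror of a fat exception handling clause (six 32-bit fields, as laid out in ECMA-335, II.25.4.6), not the SDK structure. Reading and writing the clauses in the method's extra section is left out.

```cpp
#include <cstdint>
#include <vector>

// Simplified view of a fat exception handling clause.
struct EhClause {
    std::uint32_t flags;
    std::uint32_t tryOffset;
    std::uint32_t tryLength;
    std::uint32_t handlerOffset;
    std::uint32_t handlerLength;
    std::uint32_t classTokenOrFilterOffset; // meaning depends on flags
};

constexpr std::uint32_t kClauseFilter = 0x0001; // COR_ILEXCEPTION_CLAUSE_FILTER

// Shift every clause by the number of IL bytes inserted before the
// original method body. Lengths stay the same; only offsets move.
void shiftClauses(std::vector<EhClause>& clauses, std::uint32_t preludeSize) {
    for (EhClause& clause : clauses) {
        clause.tryOffset += preludeSize;
        clause.handlerOffset += preludeSize;
        if (clause.flags & kClauseFilter) {
            // For filter clauses the last field is an IL offset, not a token.
            clause.classTokenOrFilterOffset += preludeSize;
        }
    }
}
```

The code size stored in the method header has to grow by the same amount, and a tiny method that no longer fits in 63 bytes has to be converted to the fat format.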

Adding Prelude Code

There is already a CodeProject article describing how to add prelude code for managed methods
called Really Easy Logging using IL Rewriting and the .NET Profiling API.
The good part is that it is a simple and working sample. What the author does is dynamically allocate a string and create an
mdMethodRef that points to System.Console.WriteLine.
In the prelude code he puts the string on the stack and calls the mdMethodRef. Unfortunately, there is not much "rewriting" done at all.

My own contribution in the area of IL-rewriting is to show how to replace existing calls with calls to methods in external assemblies.

Replacing Existing Method Calls

Let's start! Below is the IL code from the method FatDateNow in
SampleApp1.exe:

The lines starting with "IL_00XX" are the method body. Everything before that is information coming from the header.

I recommend accessing the header through COR_ILMETHOD_TINY and COR_ILMETHOD_FAT (from
corhlpr.h in the SDK).
These are two structures that contain accessor methods for the particular fields. This way you don't have to worry so much about bit shifting.

If we look to the left at the lines containing method calls, we can see that the opcode for a method call is 0x28.
It takes one operand, a token of type mdMethodRef.

If you look closely, the mdMethodRef has its first byte in
parentheses. The mdMethodRef is an encoded token: the first byte identifies the token type, that is, the metadata table the token points into (0x0A for a member reference, 0x06 for a method definition).
The rest of the number is the row index within that table. This means that if you want to add a call to a method that is already referenced,
it is possible to reuse an existing mdMethodRef. Otherwise you will have to create one. Fortunately, duplicates are also accepted if you don't care to do lookups.

References to methods in other assemblies must be of type mdMethodRef. If you call a method that is defined within the same assembly, you can use its
mdMethodDef.
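In the metadata format, the high byte of a token selects the metadata table and the low three bytes are the row index in that table. The sketch below splits and rebuilds a token by hand. The helper names are made up for this example; the SDK headers provide macros for the same purpose, and the two table constants correspond to mdtMethodDef and mdtMemberRef.

```cpp
#include <cstdint>

constexpr std::uint32_t kMethodDefTable = 0x06000000u; // mdtMethodDef
constexpr std::uint32_t kMemberRefTable = 0x0A000000u; // mdtMemberRef

// High byte: which metadata table the token refers to.
constexpr std::uint32_t tokenType(std::uint32_t token) {
    return token & 0xFF000000u;
}

// Low three bytes: row index (RID) inside that table.
constexpr std::uint32_t tokenRid(std::uint32_t token) {
    return token & 0x00FFFFFFu;
}

constexpr std::uint32_t makeToken(std::uint32_t type, std::uint32_t rid) {
    return type | (rid & 0x00FFFFFFu);
}

// True when a call operand refers to a method defined in the same module.
constexpr bool isMethodDef(std::uint32_t token) {
    return tokenType(token) == kMethodDefTable;
}
```

A disassembly line showing (0A)000011 therefore corresponds to the token 0x0A000011: row 0x11 of the member reference table.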

Looking Up an Existing mdMethodRef

I have added a function that looks up the mdTypeRef (class/struct) and the
mdMethodRef. In my example, I call it to look up the getter accessor for System.DateTime.Now.

FindTypeRef and FindMemberRef are methods that simply iterate over all types and all methods until they find the token we are looking for.
The full source is included in the attachment, since it would take up unnecessary space here.

The code for creating a new mdMethodRef is very specific to the method you are creating it for.
First of all it depends on the types of parameters and the type of the return value.
The types are encoded into a method signature, that identifies the method. It is similar to a function pointer type.
Casting a function pointer to a pointer that doesn't accept the same type of parameters makes no sense.
Secondly, if strong naming is used, one must also supply the public key token of the assembly, for example caf27b24caa5a188.

The method doesn't take parameters, but it does return a value. If the value is of a primitive type like
int or double,
there is no need to specify a complementing type token. In this case the return type is
DateTime, which is a struct.
This is described as ELEMENT_TYPE_VALUETYPE. Since this alone is too little information for the CLR,
the element type must be followed by a compressed token reference (replacing the four zeros).
The function CorSigCompressToken is available in the SDK in corhlpr.h.
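To illustrate what such a signature blob looks like, the sketch below builds the signature of a static, parameterless method returning a value type. compressTypeToken is a hand-written re-implementation of the token compression described in ECMA-335 (II.23.2.8), not the SDK helper, and both function names are made up for this example.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint8_t kCallConvDefault      = 0x00; // IMAGE_CEE_CS_CALLCONV_DEFAULT
constexpr std::uint8_t kElementTypeValueType = 0x11; // ELEMENT_TYPE_VALUETYPE

// Encode a TypeDef/TypeRef/TypeSpec token as (rid << 2 | tag), then store
// that value as a compressed unsigned integer of 1, 2 or 4 bytes.
std::vector<std::uint8_t> compressTypeToken(std::uint32_t token) {
    const std::uint32_t rid = token & 0x00FFFFFFu;
    const std::uint32_t table = token & 0xFF000000u;
    std::uint32_t tag = 0;                  // TypeDef (0x02)
    if (table == 0x01000000u) tag = 1;      // TypeRef
    else if (table == 0x1B000000u) tag = 2; // TypeSpec
    const std::uint32_t value = (rid << 2) | tag;

    std::vector<std::uint8_t> out;
    if (value < 0x80u) {
        out.push_back(static_cast<std::uint8_t>(value));
    } else if (value < 0x4000u) {
        out.push_back(static_cast<std::uint8_t>((value >> 8) | 0x80u));
        out.push_back(static_cast<std::uint8_t>(value & 0xFFu));
    } else {
        out.push_back(static_cast<std::uint8_t>((value >> 24) | 0xC0u));
        out.push_back(static_cast<std::uint8_t>((value >> 16) & 0xFFu));
        out.push_back(static_cast<std::uint8_t>((value >> 8) & 0xFFu));
        out.push_back(static_cast<std::uint8_t>(value & 0xFFu));
    }
    return out;
}

// Signature of "static <valuetype> Method()": calling convention,
// parameter count 0, then the return type.
std::vector<std::uint8_t> staticGetterSignature(std::uint32_t returnTypeToken) {
    std::vector<std::uint8_t> sig = {kCallConvDefault, 0x00, kElementTypeValueType};
    const std::vector<std::uint8_t> type = compressTypeToken(returnTypeToken);
    sig.insert(sig.end(), type.begin(), type.end());
    return sig;
}
```

With a type reference token of 0x01000012, the compressed value is (0x12 << 2) | 1 = 0x49, so the whole signature is the four bytes 00 00 11 49.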

What the code does is test whether the current opcode is a method call (that is, whether the opcode is equal to 0x28).
If so, we test whether its token is the one we are looking for (fromMemberRef) and replace it with the new one (toMemberRef). Otherwise we skip to the next instruction.
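A minimal sketch of such a replacement loop is shown below. It is not the code from the download: the function name replaceCalls and the callback type are assumptions, and the instruction sizing is passed in as a function so that the loop itself stays short.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

constexpr std::uint8_t kOpCall = 0x28; // IL opcode "call"

// Caller-supplied sizing function: total size (opcode + operand) of the
// instruction at 'instruction', or 0 if the opcode is unknown.
using InstructionSizeFn =
    std::function<std::size_t(const std::uint8_t* instruction, std::size_t remaining)>;

// Tokens are stored little-endian in the IL stream.
std::uint32_t readToken(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

void writeToken(std::uint8_t* p, std::uint32_t token) {
    for (int i = 0; i < 4; ++i) {
        p[i] = static_cast<std::uint8_t>(token >> (8 * i));
    }
}

// Walk the method body instruction by instruction and retarget every
// "call fromMemberRef" to "call toMemberRef". Returns the number of patches.
std::size_t replaceCalls(std::uint8_t* il, std::size_t ilSize,
                         std::uint32_t fromMemberRef, std::uint32_t toMemberRef,
                         const InstructionSizeFn& instructionSize) {
    std::size_t replaced = 0;
    std::size_t pos = 0;
    while (pos < ilSize) {
        if (il[pos] == kOpCall && ilSize - pos >= 5 &&
            readToken(il + pos + 1) == fromMemberRef) {
            writeToken(il + pos + 1, toMemberRef);
            ++replaced;
        }
        const std::size_t size = instructionSize(il + pos, ilSize - pos);
        if (size == 0 || size > ilSize - pos) {
            break; // unknown or truncated instruction: stop rather than guess
        }
        pos += size;
    }
    return replaced;
}
```

Stepping by whole instructions matters: a byte with the value 0x28 can also occur inside an operand, so scanning the buffer byte by byte could patch the wrong place. Because both tokens are four bytes, the method size does not change and the header needs no update.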

The implementation of InstructionSize can initially be tricky. Instructions may have parameters and are therefore not of the same size.
In the appendix of the book Expert .NET 2.0 IL Assembler,
I found a list with all the instructions and information of how many parameters they expected.
With this information in hand, I wrote a function with a switch statement and some if statements testing for certain bytecode ranges.
Probably not the nicest implementation you will see, but it served my purpose at the time.
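A reduced sketch of such a function, in the same switch-plus-ranges style, might look like this. It is not the InstructionSize from the download: it covers only part of the single-byte opcodes, and it returns 0 for everything else (including the two-byte opcodes prefixed with 0xFE) so that a caller can stop instead of misreading the stream.

```cpp
#include <cstddef>
#include <cstdint>

// Total size in bytes (opcode + operand) of the instruction at p,
// or 0 when the opcode is not covered by this sketch.
std::size_t instructionSize(const std::uint8_t* p, std::size_t remaining) {
    if (remaining == 0) return 0;
    const std::uint8_t op = p[0];

    if (op <= 0x0D) return 1; // nop, break, ldarg.0-3, ldloc.0-3, stloc.0-3
    if (op <= 0x13) return 2; // ldarg.s .. stloc.s (1-byte index)
    if (op <= 0x1E) return 1; // ldnull, ldc.i4.m1 .. ldc.i4.8

    switch (op) {
        case 0x1F: return 2; // ldc.i4.s
        case 0x20: return 5; // ldc.i4
        case 0x21: return 9; // ldc.i8
        case 0x22: return 5; // ldc.r4
        case 0x23: return 9; // ldc.r8
        case 0x25:           // dup
        case 0x26:           // pop
        case 0x2A:           // ret
            return 1;
        case 0x27:           // jmp
        case 0x28:           // call
        case 0x29:           // calli
        case 0x6F:           // callvirt
        case 0x72:           // ldstr
        case 0x73:           // newobj
            return 5;        // opcode + 4-byte token
        case 0x45: {         // switch: 4-byte count, then count 4-byte targets
            if (remaining < 5) return 0;
            const std::size_t count =
                static_cast<std::size_t>(p[1]) |
                (static_cast<std::size_t>(p[2]) << 8) |
                (static_cast<std::size_t>(p[3]) << 16) |
                (static_cast<std::size_t>(p[4]) << 24);
            return 5 + 4 * count;
        }
        default:
            break;
    }

    if (op >= 0x2B && op <= 0x37) return 2; // short branches (1-byte target)
    if (op >= 0x38 && op <= 0x44) return 5; // long branches (4-byte target)
    if (op >= 0x46 && op <= 0x6E) return 1; // ldind/stind, arithmetic, conv
    if (op >= 0x7B && op <= 0x80) return 5; // ldfld .. stsfld (4-byte token)
    return 0;
}
```

The switch instruction is the only one whose size depends on its operand, which is why it reads the target count from the stream.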

Recently I found out that the information I found in the appendix of the book is also available in the .NET SDK.
On my machine I found the file at the following location: "C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include\opcode.def".

By including this file in your project and writing a homemade macro, it should be possible to convert the file into an array of structs.
Someone who has done this describes it in Thoughts on writing an IL Disassembler.
Unfortunately, he did it for a company and cannot release the source code. At some point in the future, I might do it myself too.

Running the Demo

The demo consists of three executables

SampleApp1.exe - Prints the current time and a fixed time (18.15)

InterceptApp.exe - Takes a filename as a parameter, launches a managed app with profiling

About the Author

Mattias works at Visma, a leading Nordic ERP solution provider. He has good knowledge in C++/.Net development, test tool development, and debugging. His great passion is memory dump analysis. He likes giving talks and courses.

I actually get the text logged and replacement for DateTime.Now works, all is good. The bad thing is that you must do the filtering, like done for ".ctor", for all the application functions that may call say

_log.Info("Some info")

It would be great if we can somehow solve this more elegantly or circumvent the problem. It is okay that no rewriting of IL code in log4net assembly is allowed. But not being able to invoke methods in other assemblies, which also invoke log4net methods, is a pain.

Sorry I didn't respond until now.
I was travelling when I read your first message.
I was just about to have a look.

I suspected that something got messed up IL-rewriting the logger,
which certainly uses the DateTime or some other primitive.
The current IL-rewriter tries to replace all occurrences,
which probably isn't a good strategy.

I realised some time ago, that it is hard to create a generic IL-rewriter.
You IL-Rewrite for a purpose. If you just replace all calls globally,
especially method calls concerning the system libraries,
you kind of lose control of what you are doing.
Static IL-Rewriting is in this case safer.

First of all, a logger probably should be left unmodified,
since a logger probably needs time to move forward, rather than being fixed.
Secondly, the IL-rewriter was an experiment.
I have not tested this code very thoroughly.

When I experimented with some prelude code,
I also used string compares to exclude and include certain assemblies.

One should probably elaborate on some kind of .config file,
where you explicitly specify the assemblies that you want to be modified.
While one is at it, add a mapping section for the methodRefs.

I downloaded log4net source, thinking that it may be doing something 'special' at initialization time, but that turned out not to be true.

In the end, it was just some missing opcodes for storing and loading fields.
Adding the piece below to InstructionSize member in OpCodeParser.cpp fixed the issue - I can now call log4net from any method, without including the method in the list for rewriter to ignore.

Even though I mentioned the opcode.def file in my article, I didn't use it myself.
I took the opcodes from the book, I must have missed that range.
The book also has the limitation that it only covers .NET 2.0 IL code.

One extension that would be welcome would be combining your approach and Eric's: a method prolog that calls some external function but then invokes the old implementation, rather than completely substituting it.

If one implemented a multi-pass profiler,
one could, on the first pass, add a hit counter on all calls and measure the time (min/max/average).
Looking at this data, one most probably can identify long running functions,
or functions that are called too often. This would be very useful indeed.
The speed would also be acceptable I think.

On a second pass/run,
one could tell the profiler to look closer at a limited number of functions (10 or so).
Every time one of those functions is hit, one could log a stack trace,
check memory consumption, etc.

Sometimes it is hard to know exactly where some strings get allocated.
Memory leaks can be tricky to find in a .NET app.
I have experienced some myself working with WPF and dependency properties.
In a program there can be tens of thousands, or even millions, of managed strings.
It would be a nice feature to log a stack trace triggered by pattern matching of a string.

God! Now I got inspired again!
I am currently doing some coding for an arcade game on the .NET Gadgeteer (.NET Micro Framework).
I must put this on my ToDo list. This is cool stuff.

There are lots of great open source projects doing magic with IL-Rewriting.
Cecil is one example, but it is about static rewriting, and the source code is in managed code.
Dynamic rewriting is different; it must be done in unmanaged code.

This project was mostly exploratory development with lots of crashes during the journey.
The CLR was not very forgiving when I got it wrong.