Introduction

You can download the demo and try it out; it demonstrates modifying .NET method code at runtime.

Supports .NET versions from 2.0 to 4.5.

Supports many kinds of methods, including dynamic methods and generic methods.

Supports .NET processes built in release mode.

Supports both x86 & x64.

Modifying the MSIL code of .NET methods at runtime is very cool; it helps to implement hooking, software protection, and other amazing stuff. That's why I wanted it, but there is a big challenge on the road: the MSIL code may already have been compiled to native code by the JIT compiler before we get a chance to modify it. Also, the .NET CLR implementation is not documented and changes with each version, so we need a reliable and stable approach that does not depend on the exact memory layout.

Anyway, after more than a week of research, I finally made it! Here is a simple method in the demo program:

Certainly it returns "Number 1 is less than 2"; let's try to make it return the incorrect result "Number 1 is greater than 2 (O_o)".

Looking at the MSIL code for this method, we can do it by changing the opcode from Bge_S to Blt_S. The jump then follows the opposite logic and the method returns the wrong result, which is what I need.
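To make the opcode swap concrete, here is a minimal C++ sketch that patches a single byte in a raw IL buffer. The opcode values 0x2F (bge.s) and 0x32 (blt.s) come from ECMA-335; the function name flipBgeToBlt and the idea of patching at a known offset are assumptions for illustration, not the demo's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One-byte opcode values as listed in ECMA-335, Partition III.
constexpr std::uint8_t kBgeS = 0x2F;  // bge.s: branch if greater than or equal
constexpr std::uint8_t kBltS = 0x32;  // blt.s: branch if less than

// Hypothetical helper: replaces the bge.s found at `offset` with blt.s.
// Returns false (and leaves the buffer untouched) if the byte at `offset`
// is not bge.s. The one-byte branch operand after the opcode is not changed.
bool flipBgeToBlt(std::vector<std::uint8_t>& il, std::size_t offset) {
    assert(offset < il.size() && "offset must lie inside the IL body");
    if (il[offset] != kBgeS) {
        return false;
    }
    il[offset] = kBltS;
    return true;
}
```

Patching at a known offset is deliberate: scanning the buffer for the byte 0x2F would be unsafe, because the same value can also appear as an operand of another instruction.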

And if you try it in the demo application, it shows the wrong answer, as below.

Here is the code that replaces the IL; I assume there are enough comments between the lines.

Hook .NET Method

According to the explanation in ECMA-335, the jmp opcode is used to transfer control to a destination method. Unlike the call opcode, jmp passes the current arguments on to the destination method, which makes this much simpler.
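As a rough illustration of how small such a forwarding body is, the sketch below encodes a jmp instruction as raw IL bytes: the one-byte opcode 0x27 followed by a 4-byte little-endian metadata token, as specified in ECMA-335. The helper name makeJmpBody is hypothetical, and this is not necessarily how the demo builds its IL.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kJmp = 0x27;  // jmp opcode, ECMA-335 Partition III

// Hypothetical helper: builds the IL bytes for "jmp <methodToken>".
// The token must be a MethodDef (table 0x06) or MemberRef (table 0x0A) token.
std::vector<std::uint8_t> makeJmpBody(std::uint32_t methodToken) {
    const std::uint8_t table = static_cast<std::uint8_t>(methodToken >> 24);
    assert(table == 0x06 || table == 0x0A);

    std::vector<std::uint8_t> il;
    il.push_back(kJmp);
    // IL operands are stored little-endian: lowest byte first.
    for (int shift = 0; shift < 32; shift += 8) {
        il.push_back(static_cast<std::uint8_t>((methodToken >> shift) & 0xFF));
    }
    return il;
}
```

For a token such as 0x06000005 this yields the five bytes 27 05 00 00 06.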

For example, there is a method declared in the sample app.

string TargetMethod(string a, string b)
{
    return a + "," + b;
}

To hook the method above, first declare the destination method with the same parameters and return type.

The InjectionHelper.Initialize method loads the unmanaged injection.dll from the directory of the current assembly, so all the related files need to be there; alternatively, you can modify the code to change the location.

Modify IL code for JIT-compiled methods

Now we come to the hard part: the compileMethod method above won't be called by the CLR for a method that has already been JIT-compiled. To solve this problem, my idea is to restore the data structures in the CLR to their state before JIT compilation. In that case, compileMethod will be called again and we can replace the IL.

The above diagram is a bit out of date, but the primary structure is the same. For each "class" in .NET, there is at least one MethodTable structure in memory. Each MethodTable is related to an EEClass, which stores the runtime type information for Reflection and other uses.

For each "method", there is at least one corresponding MethodDesc data structure in memory containing the information of this method like flags / slot address / entry address / etc.

Before a method is JIT-compiled, the slot points to a JMI thunk (prestub), which triggers JIT compilation; once the IL code is compiled, the slot is rewritten to point to a JMI thunk that jumps directly to the compiled native code.
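The slot mechanism can be pictured with a toy model. The C++ sketch below is only an analogy, not CLR code: ToyMethodDesc, prestub, compiledCode, resetToPrestub and invoke are all hypothetical names. It shows why a changed body is ignored once the slot points at compiled code, and why pointing the slot back at the prestub forces a recompile.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct ToyMethodDesc;
using EntryPoint = std::string (*)(ToyMethodDesc&);
std::string prestub(ToyMethodDesc& md);

// Toy stand-in for a method descriptor: "il" is the source body,
// "native" is the cached result of compiling it, "slot" is the entry point.
struct ToyMethodDesc {
    std::string il;
    std::string native;
    int compileCount;
    EntryPoint slot;
    explicit ToyMethodDesc(std::string body)
        : il(std::move(body)), compileCount(0), slot(&prestub) {}
};

// Entry point used after compilation: runs the cached code directly.
std::string compiledCode(ToyMethodDesc& md) { return md.native; }

// Entry point used before compilation: "compiles", patches the slot, then runs.
std::string prestub(ToyMethodDesc& md) {
    md.native = "native(" + md.il + ")";
    ++md.compileCount;
    md.slot = &compiledCode;
    return md.slot(md);
}

// Puts the descriptor back into its pre-compilation state.
void resetToPrestub(ToyMethodDesc& md) { md.slot = &prestub; }

// Every call goes through the slot, whatever it currently points to.
std::string invoke(ToyMethodDesc& md) {
    assert(md.slot != nullptr);
    return md.slot(md);
}
```

In this model, changing il after the first invoke has no effect until resetToPrestub is called; the next invoke then compiles the new body.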

To restore the data structure, first clear the flags, then change the entry address back to a temporary entry address, and so on. I successfully did that in the debugger by modifying the memory directly. But this is messy: it depends on the layout of the data structures, and the code is unreliable across different versions of .NET.

I was looking for a reliable way, and luckily, I found the MethodDesc::Reset method in the SSCLI source code (vm/method.cpp).

void MethodDesc::Reset()
{
    CONTRACTL
    {
        THROWS;
        GC_NOTRIGGER;
    }
    CONTRACTL_END

    // This method is not thread-safe since we are updating
    // different pieces of data non-atomically.
    // Use this only if you can guarantee thread-safety somehow.
    _ASSERTE(IsEnCMethod() || // The process is frozen by the debugger
             IsDynamicMethod() || // These are used in a very restricted way
             GetLoaderModule()->IsReflection()); // Rental methods

    // Reset any flags relevant to the old code
    ClearFlagsOnUpdate();

    if (HasPrecode())
    {
        GetPrecode()->Reset();
    }
    else
    {
        // We should go here only for the rental methods
        _ASSERTE(GetLoaderModule()->IsReflection());
        InterlockedUpdateFlags2(enum_flag2_HasStableEntryPoint | enum_flag2_HasPrecode, FALSE);
        *GetAddrOfSlotUnchecked() = GetTemporaryEntryPoint();
    }

    _ASSERTE(!HasNativeCode());
}

As you can see above, it does the same thing I was doing by hand. Hence I just need to invoke this method to reset the MethodDesc to its pre-JIT state.

Of course, I can't use the MethodDesc from SSCLI directly; the real MethodDesc is used internally by Microsoft, and its exact implementation and layout are unknown to everyone outside Microsoft.

After endless mountains and rivers that leave doubt whether there is a path out, suddenly one encounters the shade of a willow, bright flowers, and a lovely village.

Fortunately, the address of this internal method exists in the PDB symbol file from the Microsoft Symbol Server, and that solves my problem. The address of the Reset() method in the CLR DLL can be found by parsing the PDB file!

Now only one mandatory parameter is left: the this pointer of the MethodDesc. It is not hard to obtain. In fact, MethodBase.MethodHandle.Value == CORINFO_METHOD_HANDLE == MethodDesc address == this pointer of MethodDesc.

The static variables above store the addresses of the internal methods of the MethodDesc implementation in the CLR DLL. They are initialized when my unmanaged DLL is loaded. The public members just call the internal methods with the this pointer.
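A wrapper of this shape might look roughly like the C++ sketch below. Everything in it is a stand-in: MethodDescProxy, FakeMethodDesc and FakeReset are hypothetical names used so the example can run without the CLR, and the plain function-pointer signature is a simplification (a real member function on x86 uses the thiscall convention, which this portable sketch does not model).

```cpp
#include <cassert>

// Hypothetical stand-in for the opaque CLR structure.
struct FakeMethodDesc {
    bool hasNativeCode;
};

// Hypothetical stand-in for the internal Reset routine found via the PDB.
void FakeReset(void* self) {
    static_cast<FakeMethodDesc*>(self)->hasNativeCode = false;
}

// Proxy that only knows an opaque "this" pointer and a resolved address.
class MethodDescProxy {
public:
    using ResetFn = void (*)(void* self);

    // Filled in once at load time, after the address has been resolved.
    static ResetFn s_reset;

    // methodHandleValue plays the role of MethodBase.MethodHandle.Value.
    explicit MethodDescProxy(void* methodHandleValue) : self_(methodHandleValue) {}

    // Forwards to the resolved routine, passing the opaque pointer as "this".
    void Reset() const {
        assert(s_reset != nullptr && "addresses must be initialized first");
        s_reset(self_);
    }

private:
    void* self_;
};

MethodDescProxy::ResetFn MethodDescProxy::s_reset = nullptr;
```

The proxy never looks inside the structure it points to, which is the property that keeps this approach independent of the memory layout.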

Find internal methods' addresses from the PDB Symbol file

The virtual addresses of the internal methods can be read from the PDB symbol file. These addresses are relative to the start of the module, so the real address of a method is obtained by adding the base address of the DLL.

Method Address = Method Virtual Address + base address of dll.
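In code, the formula is a single addition. The sketch below treats the "virtual address" from the PDB as a 32-bit offset from the module base, which is an assumption about the representation; the name toRuntimeAddress is hypothetical. On Windows, the base would typically be the module handle of the loaded CLR DLL, since an HMODULE is the module's load address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a module-relative address taken from the
// symbol file into an absolute address inside the current process.
std::uintptr_t toRuntimeAddress(std::uintptr_t moduleBase, std::uint32_t relativeAddress) {
    assert(moduleBase != 0 && "the module must be loaded");
    return moduleBase + relativeAddress;
}
```

Because the DLL can be loaded at a different base in every process, the result has to be recomputed per process; only the relative part is stable for a given binary.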

In the previous version, the PDB file was downloaded and parsed locally with Microsoft symcheck.exe.

In the current version, I have made a web service that parses the addresses on the server and returns the virtual addresses to clients. This reduces the initialization time.

Further, after collecting most of the virtual addresses, I stored the virtual addresses for different binaries in the DLL resource. During initialization, injection.dll first looks up the virtual addresses locally and only calls the web service if the virtual addresses for the current binaries are not found. The web service is therefore only a backup for when the virtual addresses cannot be found locally.
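The lookup order described above (embedded table first, web service only on a miss) can be sketched as follows. The types RvaTable and RemoteLookup, the function findAddress, and the string key format are all hypothetical; the actual resource layout and web service protocol of injection.dll are not shown in this article.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical embedded table: one entry per known CLR binary.
using RvaTable = std::map<std::string, std::uint32_t>;

// Hypothetical fallback: returns true and fills `out` when the remote
// service knows the binary.
using RemoteLookup = std::function<bool(const std::string& key, std::uint32_t& out)>;

// Looks in the embedded table first and asks the remote service only on a miss.
bool findAddress(const RvaTable& embedded,
                 const RemoteLookup& remote,
                 const std::string& binaryKey,
                 std::uint32_t& out) {
    assert(!binaryKey.empty());
    const RvaTable::const_iterator it = embedded.find(binaryKey);
    if (it != embedded.end()) {
        out = it->second;
        return true;  // served locally, no network round trip
    }
    if (!remote) {
        return false;  // no fallback configured
    }
    return remote(binaryKey, out);
}
```

Passing the fallback in as a callable keeps the local path free of any network dependency, so the common case works offline.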

Reset the MethodDesc to pre-JITted status

Now everything is ready. The unmanaged DLL exports a method for managed code that accepts the IL code and MethodBase.MethodHandle.Value from the managed side.

Generic method

A generic method is mapped to a MethodDesc in memory. But calling the generic method with different type parameters may cause the CLR to create different instantiations of the method definition. (The instantiations may be shared; the kinds of generic method instantiation are listed below.)

shared generic method instantiations

unshared generic method instantiations

instance methods in shared generic classes

instance methods in unshared generic classes

static methods in shared generic classes

static methods in unshared generic classes

The following line is a simple generic method defined in the demo program:

string GenericMethodToBeReplaced<T, K>(T t, K k)

When GenericMethodToBeReplaced<string, int>("11", 2) is called for the first time, the CLR creates an InstantiatedMethodDesc instance (a subclass of MethodDesc whose flag is marked with mcInstantiated), which is stored in the InstMethodHashTable data structure of the method's corresponding module.

Hence, we need to find and reset all of the InstantiatedMethodDesc instances of the generic method so that the IL code is replaced in every one of them.

In the SSCLI source code (vm/proftoeeinterfaceimpl.cpp), there is a class named LoadedMethodDescIterator that can be used. It accepts 3 parameters and searches for the instantiated methods in the given AppDomain and Module by MethodToken.

So we need to detect the current .NET Framework version and invoke the correct method. The primary problem comes from .NET 4.5, which is an in-place upgrade of .NET 4.0. Hence, in the demo code, this is done by detecting the CLR binary version number.
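Version detection of this kind could be sketched as below. The code parses a four-part file version string and classifies it; parseVersion, classify and ClrFlavor are hypothetical names, and the revision cutoff that separates 4.0 from 4.5 builds is deliberately left as a caller-supplied parameter because the exact value the demo uses is not stated here.

```cpp
#include <array>
#include <cassert>
#include <cstdio>
#include <string>

using Version = std::array<unsigned, 4>;  // major, minor, build, revision

enum class ClrFlavor { Net20, Net40, Net45 };

// Parses "major.minor.build.revision"; rejects anything with extra characters.
bool parseVersion(const std::string& text, Version& out) {
    unsigned a = 0, b = 0, c = 0, d = 0;
    char extra = 0;
    if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4) {
        return false;
    }
    out = {{a, b, c, d}};
    return true;
}

// 4.5 replaces 4.0 in place, so both report major version 4; only the
// revision number tells them apart. net45MinRevision is an assumption
// supplied by the caller, not a value taken from the article.
ClrFlavor classify(const Version& v, unsigned net45MinRevision) {
    assert(v[0] == 2 || v[0] == 4);  // only the CLR lines covered here
    if (v[0] == 2) {
        return ClrFlavor::Net20;
    }
    return v[3] >= net45MinRevision ? ClrFlavor::Net45 : ClrFlavor::Net40;
}
```

The result would then select which variant of the internal routine to call.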

Points of interest

Compilation optimization

I found that if the method is too simple and its IL code is only a few bytes long, the method may be inlined by the JIT compiler. In this case, resetting the MethodDesc does not help, because execution never even reaches the method. More details can be found in CEEInfo::canInline (vm/jitinterface.cpp in SSCLI).

Dynamic method

To update the IL code of a dynamic method we need to be very careful. Filling in incorrect IL code for other kinds of methods only causes an InvalidProgramException, but incorrect IL code in a dynamic method can crash the CLR and the whole process! The IL code for a dynamic method is also different from that of other methods. It is better to generate the IL code from another dynamic method and then copy and update it.

Inject into a running .NET process

To modify a running .NET process without source code, you can first inject your own .NET assembly into the target process using RhInjectLibrary from EasyHook. After the .NET assembly is loaded in the target process, call InjectionHelper.UpdateILCodes to update the target method. More information about EasyHook can be found in its documentation.

I'm wondering what the limits of applicability in production scenarios are.

First of all, CLR 4 boasts some re-JIT support, at least in profiler scenarios. Does that mean it drops the whole chain of JIT inlining, transitively (or just disables inlining)? Is it possible to call that routine from the injection code?

In CLR 2, this would probably fail in inlining scenarios, alas. I'm also not sure whether native pointers to JITted method bodies might be used explicitly: maybe cached in native code, maybe used as a jump target in a method thunk when you make an IntPtr out of a delegate.

If you choose to unload an appdomain with an injected method, would there be a memory leak?

Are there any threading issues if the method being instrumented can be called from another thread at the same time?