Anticipation

Unmanaged callbacks across AppDomains

In one of the projects I’m working on I hit a fairly nasty problem involving (as might be obvious from the title) both AppDomains and unmanaged code calling back into managed code. But first, a little background.

It all started with an unmanaged C++ class library. We’ve been using it for a while from other unmanaged C++ applications, but since we’ve been writing most new applications in C#, the time had come to bring the library (kicking and screaming) into the .NET world. Naturally, the way forward was to write a compatibility layer in C++/CLI, thereby taking advantage of its IJW (“It Just Works”) interop, which makes C++ the best .NET language for interfacing with native code. (.NET 2.0, of course. Don’t even get me started about some of the problems interfacing native and managed code in 1.x.)

One of the key ingredients was an unmanaged callback. Since some of the work of the unmanaged base library was asynchronous, we gave it an object to call back through in order to tell us when the results were ready, meaning that the unmanaged code needed to call into the managed code. This is normally simple enough: while unmanaged classes can’t hold on to managed objects directly, they can do so through a GCHandle (or the equivalent helper class, gcroot). The code went something like this (paraphrased to protect the guilty):
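A minimal sketch of that arrangement, not the original listing. Apart from `gcroot` and the `m_Managed` member, every name here (`IResultCallback`, `NativeWorker`, `CallbackForwarder`, `ManagedReceiver`) is invented for illustration. The native half is plain C++; the part inside `#ifdef __cplusplus_cli` only compiles with `/clr`.

```cpp
#include <string>

// ---- Native library side (plain C++, hypothetical shape) ----
// Interface the native library calls when an asynchronous result is ready.
class IResultCallback {
public:
    virtual ~IResultCallback() {}
    virtual void OnResult(int status, const char* data) = 0;
};

// Stand-in for the native library's worker class.
class NativeWorker {
public:
    explicit NativeWorker(IResultCallback* callback) : m_Callback(callback) {}

    // Called by the library once the asynchronous work has finished.
    void Complete(int status, const std::string& data) {
        if (m_Callback) m_Callback->OnResult(status, data.c_str());
    }

private:
    IResultCallback* m_Callback;
};

#ifdef __cplusplus_cli
// ---- Compatibility layer (C++/CLI, compiled with /clr) ----
#include <vcclr.h>  // gcroot

// Hypothetical managed class that wants to hear about results.
public ref class ManagedReceiver {
public:
    void HandleResult(int status, System::String^ data) {
        System::Console::WriteLine("status {0}: {1}", status, data);
    }
};

// Native class that forwards the native callback to a managed object.
class CallbackForwarder : public IResultCallback {
public:
    explicit CallbackForwarder(ManagedReceiver^ managed) : m_Managed(managed) {}

    virtual void OnResult(int status, const char* data) {
        // Dereference the gcroot (a GCHandle wrapper) to reach the managed object.
        m_Managed->HandleResult(status, gcnew System::String(data));
    }

private:
    gcroot<ManagedReceiver^> m_Managed;  // native classes cannot hold a ^ directly
};
#endif
```

The forwarder is a native class, so the library can hold a plain `IResultCallback*` to it without knowing anything about .NET.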

So, all well and good so far, right? No such luck. At first this was all going smoothly; the code was coming together, and manual testing of the application showed that it was communicating properly with the library and with native applications that were also using the library. Once the proof of concept was in place, it was time to add unit tests to make sure nothing broke later on. This is where the problems began.

The unit tests in question were written using the NUnit framework, which runs the tests in a separate AppDomain. This is useful because it permits the test runner to treat the application/library under test as a plugin — keeping it loaded only while actually running the tests, thereby allowing it to be recompiled and have the tests run again without having to exit the test runner.

The problem is that AppDomains are purely a managed construct. They’re intended to keep sections of managed code mostly isolated from each other (as in the above plugin-like case), and as such, objects usually only exist in one AppDomain at a time. Unmanaged code of course knows nothing about any of this. Consequently, when unmanaged code calls into managed code, the call has to land in some AppDomain, and it appears to land in the first one (the default AppDomain). This is fine for most applications, since normally apps only use one AppDomain, which is why this code was working at first.

But when running the unit tests, the code under test was executing in a second AppDomain, so the callback arrived in a different AppDomain from the one that had created the handle, and it failed. Specifically, when trying to access the m_Managed object (from the gcroot, which you’ll recall is a GCHandle), an exception was thrown saying “Cannot pass a GCHandle across AppDomains”.

The solution is to use delegates. They’re not just function pointers: they also contain an object reference, a few other odds and ends, and (most importantly for us) a reference to the AppDomain that created them. However, there’s a catch. Delegates themselves are managed objects, and so would have to be stored in a gcroot if held in unmanaged code, and we already know that we can’t access anything in a gcroot outside its original AppDomain.

Fortunately there’s a loophole: delegates can be marshalled into unmanaged function pointers (via a thunking layer). There is a downside to this, though. Since the method signatures must match as closely as possible, and since a purely unmanaged function pointer can’t have anything to do with managed objects, the parameters must be restricted to native types. This required a bit of a redesign, but fortunately, since you’re coming in from unmanaged code, all the data you’re dealing with is going to be native anyway. So here’s the redesigned code. It’s possible there’s still something that could be improved in it, but this one does the trick, and maybe it’ll help someone else who has been struggling with this issue.
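As a rough illustration of the delegate-to-function-pointer approach described above, here is a sketch rather than the original redesigned listing. `ResultCallback`, `NativeWorker`, `ResultDelegate` and `ManagedReceiver` are hypothetical names; `Marshal::GetFunctionPointerForDelegate`, `Marshal::PtrToStringAnsi` and `UnmanagedFunctionPointer` are the real .NET 2.0 APIs. It assumes the native library can be changed to accept a plain function pointer, and the managed part again needs `/clr`.

```cpp
#include <string>

// ---- Native library side (plain C++, hypothetical shape) ----
// Callback signature restricted to native types (cdecl by default on MSVC x86).
typedef void (*ResultCallback)(int status, const char* data);

class NativeWorker {
public:
    NativeWorker() : m_Callback(nullptr) {}

    void SetCallback(ResultCallback callback) { m_Callback = callback; }

    // Called by the library once the asynchronous work has finished.
    void Complete(int status, const std::string& data) {
        if (m_Callback) m_Callback(status, data.c_str());
    }

private:
    ResultCallback m_Callback;  // no gcroot, no managed types
};

#ifdef __cplusplus_cli
// ---- Compatibility layer (C++/CLI, compiled with /clr) ----
using namespace System;
using namespace System::Runtime::InteropServices;

// Managed mirror of ResultCallback; Cdecl matches the native typedef above.
[UnmanagedFunctionPointer(CallingConvention::Cdecl)]
delegate void ResultDelegate(int status, IntPtr data);

ref class ManagedReceiver {
public:
    explicit ManagedReceiver(NativeWorker* worker) {
        m_Delegate = gcnew ResultDelegate(this, &ManagedReceiver::HandleResult);
        // The returned thunk switches into the delegate's AppDomain when called.
        IntPtr thunk = Marshal::GetFunctionPointerForDelegate(m_Delegate);
        worker->SetCallback(static_cast<ResultCallback>(thunk.ToPointer()));
    }

private:
    void HandleResult(int status, IntPtr data) {
        String^ text = Marshal::PtrToStringAnsi(data);  // copy native string
        Console::WriteLine("status {0}: {1}", status, text);
    }

    // Holding the delegate in a field keeps it (and its thunk) alive for as
    // long as this object is alive; the GC does not see the native pointer.
    ResultDelegate^ m_Delegate;
};
#endif
```

In this sketch the `ManagedReceiver` has to outlive any pending callbacks, and the native side should clear its pointer (for example with `SetCallback(nullptr)`) before the receiver or its AppDomain goes away, since the thunk is only valid while the delegate is.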