Bringing CLR’s Power to non .NET languages - Part 1

The .NET Framework helps developers boost their productivity by giving them the set of tools and libraries they need to quickly start implementing the core of their software without losing time on details. The idea behind the Common Language Infrastructure (CLI) is to allow programming languages to interoperate by sharing code through libraries.

Most of these languages, when interpreted, are implemented in .NET languages and can easily interact with the underlying platform. Compiled languages generate IL, and the result is a .NET assembly.

In the same way you can dynamically load symbols from a shared library, there are times when you would like to be able to interact with the .NET Framework from non-.NET languages such as C, C++, Java, Perl or PHP.

Through a short series of articles I am going to show you how to achieve this goal and get the power of .NET in all your favorite languages in less than 500 lines of code.

Back to the origin…

As with software components at a higher level, the best way to make programming languages interoperate is to define some kind of interface. If we look at what has been done in the past with native languages, we can consider the following as part of this interface: the processor architecture and its instruction set, the binary format and the way it exports symbols, the type sizes, and the calling conventions.

As long as your binaries respect this same interface, they can freely share symbols. We are going to respect this interface in order to build a bridge library between the native and the .NET worlds. This native library will then be accessible from non-.NET languages.

C++++++…

But don’t worry, I have good news for you: a nice version of C++ exists in this world, and it is C++/CLI! As you probably guessed, this version of C++ targets the Common Language Infrastructure. Besides filling some gaps in C++, such as automatic memory management and generated boilerplate for data member accessors, one of its good points is that it gives you a simple, seamless way to mix native and managed code. Moreover, once compiled, the binary has both native and managed sections and can export native symbols, such as functions that use CLR objects.

As a solution to our problem, I therefore suggest using C++/CLI to build an interoperability DLL whose functions can be loaded dynamically by non-.NET languages and which will allow us to use the .NET Framework.

Bridge the two worlds

Exporting the API

To export our DLL symbols we define a macro CLR_API:

Notice that we use extern "C" to avoid C++ symbol name mangling.
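A minimal sketch of what such a macro might look like. Only the name CLR_API comes from the text above. The helper macro CLR_EXPORT and the sample function CLRInterfaceVersion are hypothetical names used for illustration, and the use of __declspec(dllexport) assumes a Microsoft-compatible compiler on Windows.

```cpp
// CLR_EXPORT is a hypothetical helper macro: it marks a symbol as exported
// from the DLL on Windows and expands to nothing elsewhere.
#if defined(_WIN32)
#  define CLR_EXPORT __declspec(dllexport)
#else
#  define CLR_EXPORT
#endif

// extern "C" disables C++ name mangling, so the exported symbol keeps the
// plain name that other languages will look up at load time.
#define CLR_API extern "C" CLR_EXPORT

// Hypothetical sample export, visible to callers as "CLRInterfaceVersion".
CLR_API int CLRInterfaceVersion() { return 1; }
```

With this in place, every function of the interface can be declared with the CLR_API prefix and found by name from the calling language.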

Mapping primitive types

First let’s make some typedefs for primitive types. We do so to ensure that primitive types keep the same size across architectures. For example, a long is 32 bits on 32-bit platforms and on 64-bit Windows, but 64 bits on most 64-bit Unix-like systems. A long long, however, is 64 bits on all of these platforms, which matches the C# long keyword, an alias for the CLR Int64 type.

Note that t_decimal, t_char, and t_string don’t match their CLR counterparts, but exist only for the native-to-native interoperability interface. Byte strings are used to ensure that languages that don’t support Unicode can still use the interface. The null macro gives C++ a C# flavor; note, however, that the C++/CLI null pointer is not 0 but the special value nullptr.
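The typedefs could be sketched along these lines. Only t_decimal, t_char, t_string and the null macro are named in the text. The other type names, and the choice of double for t_decimal, are assumptions made for illustration.

```cpp
#include <cstdint>

// Fixed-width aliases (hypothetical names) so sizes do not depend on the
// platform's definition of int or long.
typedef std::int8_t   t_sbyte;   // System.SByte
typedef std::uint8_t  t_byte;    // System.Byte
typedef std::int16_t  t_short;   // System.Int16
typedef std::uint16_t t_ushort;  // System.UInt16
typedef std::int32_t  t_int;     // System.Int32
typedef std::uint32_t t_uint;    // System.UInt32
typedef std::int64_t  t_long;    // System.Int64
typedef std::uint64_t t_ulong;   // System.UInt64
typedef float         t_float;   // System.Single
typedef double        t_double;  // System.Double

// These three do not match their CLR counterparts.
typedef double        t_decimal; // assumption: an approximation of System.Decimal
typedef char          t_char;    // one byte, whereas a CLR Char is a UTF-16 code unit
typedef const char*   t_string;  // byte string, not a CLR String

// C#-flavored spelling of the null pointer (assumed definition).
#define null nullptr

// Compile-time checks that the sizes are what the interface promises.
static_assert(sizeof(t_int) == 4, "t_int must be 32 bits");
static_assert(sizeof(t_long) == 8, "t_long must be 64 bits");
```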

Release dynamically allocated memory

C++/CLI will automatically do the marshaling necessary to map our primitive types; however, we need to handle the case of CLR object references. In C++/CLI, CLR object references are accessed through handles. The native interface we export has to hide these CLR object references. The best way to do this is to wrap CLR object handles within plain native C++ objects. Doing so brings a little complexity: while CLR objects are released by the CLR garbage collector, dynamically allocated native objects have to be released manually with the delete operator.

We don’t know when the language using our library will need to release our objects, so one of the best solutions here is to use an object reference tracking method such as the reference counting used by COM objects. The class below defines the C++ methods needed for our reference counting.

We define two functions to use these references in our interface:
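A minimal sketch of COM-style reference counting, written as portable C++. The class name CLRReference comes from the article. The method names AddRef and Release and the two exported function names CLRReferenceAddRef and CLRReferenceRelease are assumptions. The counter is a plain integer, so this sketch is not thread-safe.

```cpp
// Base class for every native object handed out by the interface.
class CLRReference {
public:
    CLRReference() : count_(1) {}        // the creator owns the first reference
    virtual ~CLRReference() {}

    void AddRef() { ++count_; }

    // Drops one reference and destroys the object when none remain.
    // Returns the number of references still held.
    long Release() {
        long remaining = --count_;
        if (remaining == 0) delete this;
        return remaining;
    }

private:
    CLRReference(const CLRReference&) = delete;            // not copyable
    CLRReference& operator=(const CLRReference&) = delete;
    long count_;
};

// C entry points (hypothetical names) so that languages without C++ support
// can manage object lifetimes. A null pointer is accepted and ignored.
extern "C" void CLRReferenceAddRef(CLRReference* ref) {
    if (ref) ref->AddRef();
}

extern "C" long CLRReferenceRelease(CLRReference* ref) {
    return ref ? ref->Release() : 0;
}
```

The calling language takes a reference for as long as it keeps the pointer and releases it when done, typically from its own finalizer.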

Wrapping CLR objects

OK, we now have a solution for references. We still need a way to wrap CLR objects in a native type inheriting from CLRReference. This is done with the following C++ class template:

CLRWrapper<T> wraps the CLR object handle in its private data member value. The gcroot<T> class template provided by C++/CLI lets us use CLR objects as members of native classes: it holds the handle through a GCHandle, which keeps the object alive and keeps the reference valid even if the garbage collector moves the object in memory. As an example of use, CLRWrapper<System::Object>* is a pointer to a wrapper around the CLR Object type, and the GetValue getter method can be used to get the wrapped CLR object value.
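The shape of such a wrapper can be sketched in portable C++. This is an illustration under stated assumptions rather than the article's code: the member is a plain value of the template parameter so that the sketch compiles without /clr, whereas in C++/CLI it would be a gcroot holding the managed handle. The base class is a trimmed stand-in for CLRReference.

```cpp
// Trimmed reference-counting base, standing in for CLRReference.
class CLRReference {
public:
    CLRReference() : count_(1) {}
    virtual ~CLRReference() {}
    void AddRef() { ++count_; }
    long Release() {
        long remaining = --count_;
        if (remaining == 0) delete this;
        return remaining;
    }
private:
    long count_;
};

// Native wrapper around a handle. In C++/CLI, "value" would be declared
// as a gcroot so that a native class can hold a managed reference.
template <typename Handle>
class CLRWrapper : public CLRReference {
public:
    explicit CLRWrapper(const Handle& wrapped) : value(wrapped) {}

    // Getter for the wrapped handle.
    const Handle& GetValue() const { return value; }

private:
    Handle value;
};
```

Because the wrapper derives from the reference-counted base, callers release it exactly like any other object of the interface.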

Boxing/Unboxing primitive types

As you will see in a few paragraphs, to be generic our interface is going to rely on the fact that all CLR objects inherit from the Object class. Boxing and unboxing will therefore be necessary for primitive types, so let’s define converter functions to do so:

Catch exceptions

Exceptions might occur while working within the CLR world, so we need a way to represent and handle them in our interface. This is done with the CLRResult<T> class template below, which inherits from CLRWrapper<T>:

We define two functions in our interface to work with instances of this class:

If the C function CLRResultIsSuccess returns true, no exception occurred and the internal exception wrapper pointer is null. Otherwise, the exception wrapper is available instead.
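The idea of turning exceptions into a result object at the boundary can be sketched in portable C++. This is a simplified stand-in, not the article's class: it omits the CLRWrapper<T> base, stores only the exception message in a hypothetical ExceptionInfo struct, and catches std::exception where the C++/CLI version would catch System::Exception. The helper Capture is also a hypothetical name. In the exported interface, CLRResultIsSuccess would forward to a check like IsSuccess.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a wrapped CLR exception: only the message is kept.
struct ExceptionInfo {
    std::string message;
};

// Simplified result: holds either a value or an exception.
template <typename T>
class CLRResult {
public:
    static CLRResult* Success(const T& value) {
        return new CLRResult(value, nullptr);
    }
    static CLRResult* Failure(const std::string& message) {
        ExceptionInfo* info = new ExceptionInfo();
        info->message = message;
        return new CLRResult(T(), info);
    }
    ~CLRResult() { delete exception_; }

    CLRResult(const CLRResult&) = delete;
    CLRResult& operator=(const CLRResult&) = delete;

    // True when no exception occurred, in which case GetException() is null.
    bool IsSuccess() const { return exception_ == nullptr; }
    const T& GetValue() const { return value_; }
    const ExceptionInfo* GetException() const { return exception_; }

private:
    CLRResult(const T& value, ExceptionInfo* exception)
        : value_(value), exception_(exception) {}
    T value_;
    ExceptionInfo* exception_;
};

// Runs a callable and converts any std::exception into a failed result,
// so that no exception ever crosses the C boundary.
template <typename T, typename F>
CLRResult<T>* Capture(F call) {
    try {
        return CLRResult<T>::Success(call());
    } catch (const std::exception& e) {
        return CLRResult<T>::Failure(e.what());
    }
}
```

A caller checks IsSuccess first, then reads either the value or the exception, and finally releases the result.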

Calling code in

In programming everything is basically code executed thanks to processor instructions, or data located in some memory. In the previous sections I mainly wrote about data with primitive types and objects, it is now time to talk about code.

The main purpose is to be able to call CLR methods on CLR objects from a native interface. There are several ways of doing this; I chose the solution that gave me the best performance. This solution consists of generating a delegate on the fly that captures a call to the target method. Once this delegate is created it can be cached and reused as often as you want, without needing to resolve the method on the type again. Therefore, once the delegate is cached, performance is really good, and the time needed to call the delegate is nearly the same as if you were calling the method directly from .NET! The on-the-fly generation is possible thanks to the classes of the System::Reflection::Emit namespace.

Let’s have a look at the interface we are going to implement:

The first function, CLRMethodGet, resolves the method on a particular type from the specified name and the given parameter types, then generates a delegate of type Method that invokes this particular method. Notice that it returns a CLRResult<T> instance, since an exception might occur if no method matching the prototype is found on the type.

The second function invokes the delegate of type Method, given an instance object and a list of arguments, and returns the result of the call. An exception might have occurred during the call, so the result should be checked.

To handle all possibilities with these two functions the rules are:

For CLRGetMethod the target method name should be:

.ctor for a constructor.

get_<name> for a property getter.

set_<name> for a property setter.

add_<name> for an event adder.

remove_<name> for an event remover.

For CLRInvoke the instance object may be null for static methods.
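The "resolve once, cache, invoke many times" pattern described above can be illustrated in portable C++. This is only an analogy under stated assumptions: std::function stands in for the generated delegate, a string lookup stands in for reflection, and the names MethodCache, ResolveSample and "Sum" are hypothetical.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the generated delegate: takes the arguments, returns a result.
typedef std::function<int(const std::vector<int>&)> Method;

class MethodCache {
public:
    typedef std::function<Method(const std::string&)> Resolver;

    explicit MethodCache(Resolver resolver)
        : resolver_(resolver), resolutions_(0) {}

    // Returns the cached callable, resolving it only on first use.
    // If resolution throws, nothing is cached and the exception propagates.
    const Method& Get(const std::string& name) {
        std::map<std::string, Method>::iterator it = cache_.find(name);
        if (it == cache_.end()) {
            Method resolved = resolver_(name);   // the expensive step
            ++resolutions_;
            it = cache_.insert(std::make_pair(name, resolved)).first;
        }
        return it->second;
    }

    // Number of times the resolver actually ran successfully.
    int Resolutions() const { return resolutions_; }

private:
    Resolver resolver_;
    std::map<std::string, Method> cache_;
    int resolutions_;
};

// Sample resolver: knows a single method and rejects every other name.
inline Method ResolveSample(const std::string& name) {
    if (name == "Sum") {
        return [](const std::vector<int>& args) {
            int total = 0;
            for (int value : args) total += value;
            return total;
        };
    }
    throw std::invalid_argument("no method named " + name);
}
```

After the first lookup, each call costs a map search plus the invocation itself, which mirrors why a cached delegate is cheap to call repeatedly.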

Calling code out

The sections above explain how to call .NET methods from a native interface. What about calling native functions from the .NET world? Let’s say, for instance, that we want a function written in our target language to be an event handler for some .NET object’s events.

For this purpose we declare the following function in the interface:

It creates a delegate with the given type from the specified native function pointer.
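The direction of the call can be sketched in portable C++, with std::function standing in for a delegate and a small Event class standing in for a .NET event. All names here (NativeCallback, MakeDelegate, Event, OnValueChanged) are hypothetical. In C++/CLI, one existing way to turn a function pointer into a delegate is Marshal::GetDelegateForFunctionPointer, though the text does not say which mechanism the article uses.

```cpp
#include <functional>
#include <vector>

// Signature a native handler must have in this sketch.
typedef void (*NativeCallback)(int value);

// Stand-in for a delegate type.
typedef std::function<void(int)> Handler;

// Wraps a native function pointer into a callable "delegate".
// A null pointer yields an empty handler.
inline Handler MakeDelegate(NativeCallback callback) {
    return Handler(callback);
}

// Stand-in for a .NET event: keeps subscribers and calls them in order.
class Event {
public:
    void Add(const Handler& handler) {
        if (handler) handlers_.push_back(handler);  // ignore empty handlers
    }
    void Raise(int value) const {
        for (const Handler& handler : handlers_) handler(value);
    }
private:
    std::vector<Handler> handlers_;
};

// Example native handler, as the target language would provide it.
static int g_last_value = 0;
inline void OnValueChanged(int value) { g_last_value = value; }
```

Subscribing is then a matter of passing MakeDelegate(&OnValueChanged) to Add, after which raising the event runs the native function.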

We are almost done!

As you perhaps noticed, we are close to the end. We are, however, still missing something: the function GetMethod needs wrappers for a Type and an Array of Types as arguments, and CLRInvoke needs a wrapper for an Array of Objects as argument. Where do we get CLR types from? From assemblies, of course. So we need a way to load assemblies, a way to get types from them based on their names, and a way to create and populate CLR arrays.

Done! With about 40 native C functions we are able to interoperate with the .NET Framework! And some of them could have been defined in terms of CLRGetMethod and CLRInvoke within the target language rather than in C++/CLI.

End of part 1

I am going to publish the complete sources as soon as possible on Codeplex.

Although this interface can be used directly from C or C++, my next article will show an example of its use from an exotic non-.NET language: Racket.