.NET - COM Interoperability

This paper provides a technical overview of .NET and COM interoperability.

Summary

This paper provides a technical overview of .NET and COM interoperability. It describes how .NET components can communicate with existing COM components without migrating those COM components to .NET, thus reducing migration cost and protecting existing business systems. This paper also provides an overview of marshalling. The intended audience is a development team that needs COM and .NET applications to interact. (This paper assumes that the reader has a fundamental knowledge of COM and .NET.)

Introduction

From the time Microsoft engineers started working on the ideas behind COM in 1988, COM went through quite an evolution. Once .NET was released, everything was about the CLR. Many businesses have invested heavily in COM development, and they may not be willing to spend more money to rebuild their components in .NET. Rewriting everything would also have a severe impact on productivity.

Fortunately, switching from COM to .NET involves no such radical loss of productivity. The concept of providing a bridge between .NET and COM components is called .NET-COM interoperability. The Microsoft .NET Framework provides systems, tools, and strategies that enable strong integration with past technologies and allow legacy code to be integrated with new .NET components. The bridge works in both directions: from .NET to COM and from COM to .NET.

There are two key concepts that make it much easier to move from COM development to .NET development, without any loss of code base or productivity.

Interaction with COM components from .NET

Interaction with .NET components from COM

Before going further, this paper describes the basic communication fundamentals of COM and .NET components.

Communication between Object and Client

A COM component is a reusable binary object that exposes its functionality to other components. When a client asks for instances of a server object, the server instantiates those objects and hands out references to the client. A COM component thus acts as a binary contract between caller and callee. This binary contract is described in a document known as a type library, which tells a potential client which services a particular server offers. Each COM component exposes a set of interfaces through which communication with other COM components occurs.

The following diagram shows the communication between a client and a COM object.

Fig.1 Communication between client and a COM object

In the above figure the IUnknown and IDispatch are the interfaces and QueryInterface, AddRef, Release, etc., are the methods exposed by those interfaces.
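The interface-based contract described above can be sketched in a few lines. This is only a conceptual model in plain Python, not real COM: the Calculator class, the ICalculator identifier, and the string interface IDs are hypothetical names chosen for illustration.

```python
# Conceptual sketch only: plain Python objects standing in for COM interfaces.
# Calculator, ICALCULATOR and the string IDs are hypothetical names.

IUNKNOWN = "IUnknown"
ICALCULATOR = "ICalculator"  # hypothetical custom interface


class Calculator:
    """A toy 'COM object' that tracks its own reference count."""

    def __init__(self):
        self.ref_count = 0
        self.destroyed = False

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True  # a real object would free itself here
        return self.ref_count

    def query_interface(self, interface_id):
        # Hand out a reference only for interfaces the object supports.
        if interface_id in (IUNKNOWN, ICALCULATOR):
            self.add_ref()
            return self
        return None  # real COM reports E_NOINTERFACE instead

    def add(self, a, b):
        return a + b


def client_demo():
    obj = Calculator()
    calc = obj.query_interface(ICALCULATOR)  # reference count becomes 1
    total = calc.add(2, 3)
    calc.release()                           # reference count drops to 0
    return total, obj.destroyed
```

The client never touches the object except through an interface it asked for, and the object's lifetime ends when the last reference is released.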

.NET objects communicate directly through objects; no COM-style interfaces are required. Consequently, .NET components have no type libraries; instead, they deal with assemblies. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. All information related to the assembly is held in the assembly metadata. Unlike communication between COM components, communication between .NET components is object based.

Calling COM components from .NET Client

COM components generally expose interfaces to communicate with other objects. A .NET client cannot communicate with a COM component directly, because the interfaces exposed by the COM component cannot be read by the .NET application. To communicate with a COM component, the component must be wrapped in such a way that the .NET client application can understand it. This wrapper is known as a Runtime Callable Wrapper (RCW).

The .NET SDK provides the Runtime Callable Wrapper (RCW), which wraps the COM component and exposes it to the .NET client application.

Fig.2 calling a COM component from .NET client

To communicate with a COM component, there must be a Runtime Callable Wrapper (RCW). An RCW can be generated by using VS.NET or with the TlbImp.exe utility. Both approaches read the type library and use the System.Runtime.InteropServices.TypeLibConverter class to generate the RCW. This class reads the type library and converts its descriptions into a wrapper (RCW). After generating the RCW, the .NET client should import its namespace. The client application can then call the RCW object with native calls.

When a client calls a function, the call is transferred to the RCW. The RCW internally calls the native COM function CoCreateInstance, thereby creating the COM object that it wraps. The RCW converts each call to the COM calling convention. Once the object has been created successfully, the .NET client application can access the COM object just like a native object.
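The role of the RCW can be modelled roughly as follows. This is a conceptual sketch in Python rather than the actual runtime mechanism: LegacyComObject, ComError, and the divide method are hypothetical, and the factory argument merely stands in for CoCreateInstance. The HRESULT values S_OK and E_INVALIDARG are the standard Windows codes.

```python
# Conceptual sketch of what an RCW does; not the CLR implementation.
S_OK = 0x00000000
E_INVALIDARG = 0x80070057


class LegacyComObject:
    """Hypothetical unmanaged object: reports status codes, never raises."""

    def __init__(self):
        self.ref_count = 1  # the creator holds the first reference

    def release(self):
        self.ref_count -= 1
        return self.ref_count

    def divide(self, a, b):
        # Returns (HRESULT, result), in the style of an out/retval parameter.
        if b == 0:
            return E_INVALIDARG, None
        return S_OK, a / b


class ComError(Exception):
    """Hypothetical exception type carrying the failing HRESULT."""

    def __init__(self, hresult):
        super().__init__(f"HRESULT 0x{hresult:08X}")
        self.hresult = hresult


class RuntimeCallableWrapper:
    def __init__(self, factory):
        self._factory = factory    # stands in for CoCreateInstance
        self._com_object = None

    def _target(self):
        # Create the wrapped object the first time it is needed.
        if self._com_object is None:
            self._com_object = self._factory()
        return self._com_object

    def divide(self, a, b):
        hresult, result = self._target().divide(a, b)
        if hresult & 0x80000000:   # severity bit set means failure
            raise ComError(hresult)
        return result

    def dispose(self):
        # Give back the single reference the wrapper holds.
        if self._com_object is not None:
            self._com_object.release()
            self._com_object = None
```

From the caller's point of view, `RuntimeCallableWrapper(LegacyComObject).divide(6, 3)` looks like an ordinary method call, and a failing HRESULT surfaces as an exception instead of a status code.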

Calling .NET components from COM Client

When a COM client requests a server, it first looks up the server's registry entry, and then the communication starts. Calling a .NET component from a COM component is not a trivial exercise. .NET objects communicate through objects, but object-based communication is not recognized by COM clients. To communicate with a .NET component from a COM component, the .NET component must be wrapped in such a way that the COM client can identify it. This wrapper is known as a COM Callable Wrapper (CCW). The CCW wraps the .NET component and interacts with the COM clients on its behalf.

The .NET utility RegAsm.exe makes a .NET component available to COM. It reads the metadata of the .NET component and writes the registry entries that COM clients need to locate the class. The CCW itself is created by the runtime when a COM client activates the class.

Internally, when CoCreateInstance is called, the call is directed to the registry entry, and the registry redirects it to the registered server, mscoree.dll. mscoree.dll inspects the requested CLSID, reads the registry to find the .NET class and the assembly that contains it, and creates a CCW on that .NET class.

When a client makes a call to the .NET object, the call first goes to the CCW. The CCW converts all native COM types to their .NET equivalents and also converts the results back from .NET to COM.
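The activation path and the wrapper's job can be sketched as below. This is a simplified Python model under stated assumptions, not how mscoree.dll is implemented: Greeter, the REGISTRY dictionary, the placeholder CLSID string, and the invoke method are all hypothetical. The HRESULT constants are the standard Windows values.

```python
# Conceptual sketch of CCW activation and call forwarding.
S_OK = 0x00000000
E_FAIL = 0x80004005
REGDB_E_CLASSNOTREG = 0x80040154


class Greeter:
    """Hypothetical managed class; signals errors by raising."""

    def greet(self, name):
        if not name:
            raise ValueError("name must not be empty")
        return "Hello, " + name


class ComCallableWrapper:
    def __init__(self, managed_object):
        self._managed = managed_object
        self.ref_count = 1  # COM clients expect reference counting

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self._managed = None  # let the garbage collector reclaim it
        return self.ref_count

    def invoke(self, method_name, *args):
        """Call the managed method; return (HRESULT, result), never raise."""
        try:
            return S_OK, getattr(self._managed, method_name)(*args)
        except Exception:
            return E_FAIL, None


# Toy registry: CLSID -> managed class. The real entries are written by
# RegAsm.exe; the key below is a placeholder, not a real CLSID.
REGISTRY = {"{HYPOTHETICAL-CLSID}": Greeter}


def create_instance(clsid):
    managed_class = REGISTRY.get(clsid)
    if managed_class is None:
        return REGDB_E_CLASSNOTREG, None
    return S_OK, ComCallableWrapper(managed_class())
```

The COM side only ever sees status codes and a reference-counted object, while the managed class keeps using exceptions and garbage collection.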

Programming model comparison of .NET-COM interoperability

The following table compares the .NET and COM based component programming models.

.NET | COM
Object based communication | Interface based communication
Garbage Collector to manage memory | Reference count will be used to manage memory
Type Standard objects | Binary Standard objects
Objects are created by normal new operator | Objects are created by CoCreateInstance
Exceptions will be returned | HRESULT will be returned
Object info resides in assembly files | Object info resides in Type library

Before the applications can communicate, some technical constraints must be addressed. When an object is transmitted to a receiver in a separate machine or process (managed or unmanaged) space, the object may need to be transformed into the receiver's native type to make it usable by the recipient. That is, the object is converted into a form the recipient can read. This process of converting an object between types when sending it across contexts is known as marshaling. The next section of the paper gives an overview of marshalling in .NET.

.NET Marshalling

The .NET runtime automatically generates code to translate calls between managed code and unmanaged code. While transferring calls between the two, .NET also handles the data type conversion. This technique of automatically binding the server data type to the client data type is known as marshalling. Marshaling occurs between the managed heap and the unmanaged heap. For example, Fig.4 shows a call from a .NET client to a COM component. This sample call passes a .NET string from the client. The RCW converts this .NET data type into the COM-compatible data type, which in this case is BSTR. The BSTR is passed to the object and the required calls are made. The results are returned to the RCW, which converts the COM-compatible result back to the native .NET data type.
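To make the string example concrete, the following sketch builds the memory layout of a BSTR by hand: a 4-byte length prefix holding the byte count, the UTF-16 characters, and a two-byte null terminator. The function names to_bstr and from_bstr are hypothetical; the real conversion is performed by the runtime, and a real BSTR pointer points at the first character rather than at the prefix.

```python
import struct


def to_bstr(text):
    """Lay out a string the way a BSTR is stored in memory."""
    data = text.encode("utf-16-le")
    # Length prefix counts bytes of character data, excluding the terminator.
    return struct.pack("<I", len(data)) + data + b"\x00\x00"


def from_bstr(blob):
    """Read the length prefix and decode exactly that many bytes."""
    (byte_length,) = struct.unpack_from("<I", blob, 0)
    return blob[4:4 + byte_length].decode("utf-16-le")
```

For instance, `to_bstr("Hi")` yields a 10-byte block: the prefix 4, the two UTF-16 characters, and the terminator. Passing that block to `from_bstr` returns the original string, which mirrors the round trip the RCW performs for arguments and results.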

Fig.4 Sample diagram for marshalling

Logically the marshalling can be classified into 2 types.

Interop marshalling

COM marshalling

If a call occurs between managed code and unmanaged code within the same apartment, the Interop marshaler handles it. It marshals data between managed code and unmanaged code.

In some scenarios the COM component may be running in a different apartment. In those cases, that is, calls between managed code and unmanaged code in different apartments or processes, both the Interop marshaler and the COM marshaler are involved.

Interop marshaler

When the server object is created in the same apartment as the client, all data marshaling is handled by the Interop marshaler.

Fig.5 Sample diagram for same apartment marshalling

COM marshaler

COM marshaling is involved whenever the calls between managed code and unmanaged code cross apartments. For example, when a .NET client (with the default apartment settings) communicates with a COM component developed in VB6.0, the communication occurs through a proxy and stub, because the two objects run in different apartments. (The default apartment setting of .NET objects is MTA, whereas components developed in VB6.0 are STA.) COM marshaling occurs between the two apartments, and Interop marshaling occurs within the apartment. Fig.6 shows this kind of marshaling.

This kind of cross-apartment communication affects performance. The apartment setting of the managed client can be changed with STAThreadAttribute, MTAThreadAttribute, or the Thread.ApartmentState property. Both sides can run in the same apartment if the managed code’s thread is set to STA. (If the COM component is set as MTA, cross-apartment marshaling will then occur.)

Fig.6 Sample diagram for cross apartment marshalling

In the above scenario, the call between the different apartments is handled by COM marshaling, and the call between managed and unmanaged code is handled by Interop marshaling.
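The rule for which marshaler takes part can be summarized as a small decision function. This is a deliberate simplification for an in-process component: it compares only the apartment type of the calling thread with the threading model of the component, and the function name and the string labels are hypothetical.

```python
def marshalers_for_call(client_apartment, component_threading_model):
    """Return the marshalers involved in a managed-to-unmanaged call.

    Both arguments are 'STA' or 'MTA'. Interop marshaling always takes
    part; COM marshaling joins in only when the call crosses apartments.
    """
    valid = {"STA", "MTA"}
    if client_apartment not in valid or component_threading_model not in valid:
        raise ValueError("apartment must be 'STA' or 'MTA'")
    if client_apartment == component_threading_model:
        return ["interop"]
    return ["interop", "com"]
```

With an MTA managed thread and an STA component (the VB6.0 case above), the function returns both marshalers; switching the managed thread to STA leaves only the Interop marshaler.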

Conclusion

Thus the communication between .NET applications and COM applications occurs through RCW and CCW.

As you have seen, COM applications can implement .NET types to achieve type compatibility or a .NET type can implement COM interfaces to achieve binary compatibility with related coclasses.

Although the managed clients can interact with the unmanaged objects, the managed client expects that the unmanaged object should act exactly the same as managed object.

When developing against an unmanaged component through COM interoperability, managed code developers cannot use some features of .NET, such as parameterized constructors, static methods, and inheritance. Migrating an existing component or writing a managed wrapper makes the component easier to use for managed code developers. In some cases, the developer wants to migrate parts of the application to .NET so that the application can take advantage of the new features that the .NET Framework offers. For example, ASP.NET provides advanced data binding, browser-dependent user interface generation, and improved configuration and deployment. The designer should evaluate when the value of bringing these new features into the application outweighs the cost of code migration.

Background: I have written a C# library and I am trying to use it in VBA (MS-Access).

When I let the C# (VS2005) compiler register the library as a COM library, by clicking on the "Register for COM interop" checkbox (in the Project's Properties.Build sheet) then the library is accessible from MS-Access VBA code.

Issue: this solution works on my machine only.

When I try to use the RegAsm utility, I cannot get the library to run from an MS-Access client. From VBA the Debug.Compile <application> compiles fine, but at runtime I get a file not found error (Run-time error 80070002).

Q: Do you have any clues to what settings I need to use with the RegAsm.exe utility to get the .Net (C#) library to work from an MS-Access VBA client?

I have some problems using interfaces across STA apartments of the same process in .NET. Calls are made, no errors are reported, but they fail to perform their actions. Is there a special technique of marshalling COM interfaces in .NET, in this case?

I have a question. What happens if you have a COM client (i.e. COM is calling a .NET assembly using a CCW) and the .NET component (with the CCW) has references to another .NET component? Does the 'dependency' assembly require the CCW or an RCW? Or can these 2 assemblies communicate in their native tongue (i.e. .NET)?

Here is a problem I am trying to solve at the moment: I have Assembly1 which has a CCW (it is a "Custom Command" that can be called by a COM application). The second Assembly (Assembly2) contains generic methods that can be called by all kinds of different "Custom Commands". This way I get decent code re-use (all the common methods are stored in Assembly2).

I will try to 'diagram' it here:

COM (Client) -> CCW -> .NET Assembly1 -> .NET Assembly2

So if .NET Assembly1 has a reference to .NET Assembly2 can they "talk to each other" using .NET, or do they need a proxy (i.e. RCW or CCW) to communicate also?

What I am asking is... is the following config required?

COM (Client) -> CCW -> .NET Assembly1 -> RCW -> .NET Assembly2

Also, what happens if Assembly2 has other dependencies (for example Oracle.DataAccess.Net)? Do these also need a wrapper?

I currently have a situation where another developer is trying to use my common code (i.e. he's trying to make his own Assembly1 which references the common methods I've made in Assembly2) but he can't seem to instantiate a class in Assembly2 because he's getting a System.IO.FileLoadException when Assembly2 is referring to the Oracle.DataAccess.dll. I've checked the HRESULT in WinError.h and it is "NTE_PROVIDER_DLL_FAIL" (i.e. "DLL Failed to initialize correctly").

Not sure if this is a COM Interop problem, or related to the fact he has 2 Oracle homes (he has Oracle 7 - installed first) and Oracle 9. I'm thinking it could also be a CAS (Code Access Security) problem? (i.e. Assembly1 has CCW so it can't call/reference/instantiate a .NET assembly which refers to another .NET assembly - Oracle.DataAccess.dll)... If he runs the Assembly2 code on his machine which calls a stub.exe (written in .NET) he can use the Oracle.DataAccess.dll no problem - but when it's in .Dll form, he gets the System.IO.FileLoadException error (above).