COM Interop in C# 4.0

Wow, it’s been a while since I last posted! Don’t worry, I’m still alive and kickin’, and we’re still workin’ on cool stuff for y’all to use.

Let’s do a bit of a recap of how far we’ve come. We’ve chatted about dynamic binding in C# and how that all plays in with the DLR, and about named and optional arguments and how they change the way methods are bound. The only other major piece in C# 4.0 is this notion of COM interop. We chatted about how dynamic really is a gateway to interop with different object models and languages (i.e., interacting with dynamic languages, dynamic object models, JavaScript objects, the HTML DOM, etc.), but in C# 4.0, we want to go a bit further and provide you with a few more tools to help make your interop life much easier.

These remaining features that we’ll chat about all have a strong tie to the COM world – that is, the features themselves require that the objects that you’re playing with are COM objects. How do we determine that? Well, you’ll soon find out!

Keep on rockin’ in a… COM… world!

Alright, I admit it – I’m a Neil Young fan. The man wrote some great tunes! Anyway, the point is that Rock ’n’ Roll ain’t goin’ nowhere. And neither is COM. No matter how hard we try to get rid of it, it just won’t die! So we decided this time around that instead of trying to beat ‘em, we might as well join ‘em.

We’ve therefore created several features that are geared towards making your COM programming experience much easier. First let me list them:

1. Passing non-ref arguments to by-ref parameters
2. Indexed properties
3. No PIA

The first two of these are comparatively smaller features from a complexity standpoint, so we’ll tackle those one at a time. We’ll discuss the No PIA feature at length over the next several posts.

Passing non-ref arguments to by-ref parameters

The first feature is really an acknowledgement that the APIs generated for COM interop are quite poor. Those of you who have worked in any amount of detail with the Office PIAs will have quickly realized that, for some reason, just about everything is passed around by ref.
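Here’s a minimal sketch of the kind of call this leads to. The Office PIAs aren’t available inside a snippet, so `IDocuments`, its GUID, and the `FakeDocuments` class are hypothetical stand-ins that only mimic the "everything is `ref object`" shape of a PIA method such as Word’s `Documents.Add`. Treat it as an illustration of the pattern, not the real API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a PIA interface. The name, GUID, and member are
// made up for illustration; only the "ref object everywhere" shape is the point.
[ComImport]
[Guid("7A1F0E52-3B9C-4D6E-9F21-5C8D4E2B1A90")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IDocuments
{
    string Add(ref object template, ref object newTemplate,
               ref object documentType, ref object visible);
}

// Managed fake so the sketch runs without Office installed.
public sealed class FakeDocuments : IDocuments
{
    public string Add(ref object template, ref object newTemplate,
                      ref object documentType, ref object visible)
    {
        return Describe(template) + "," + Describe(newTemplate) + ","
             + Describe(documentType) + "," + Describe(visible);
    }

    private static string Describe(object arg)
    {
        return ReferenceEquals(arg, Type.Missing) ? "missing" : Convert.ToString(arg);
    }
}

public static class OldStyleDemo
{
    public static string Run()
    {
        IDocuments docs = new FakeDocuments();

        // Arguments have to live in variables just so they can be passed by ref.
        object template = "Normal.dotm";
        object missing = Type.Missing;

        return docs.Add(ref template, ref missing, ref missing, ref missing);
    }
}
```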

This is such typical code! I have to struggle with the type system to make it happy, just to add a simple Word document!

C# 4.0 makes this easier. The compiler will now determine that you’re working with a COM object by looking at the attributes of the type of that object, and checking to see if it has the [ComImport] attribute. Once it determines that you are indeed working with a COM object, it gives you the ability to pass arguments to the method, indexer or property (yes, properties can have arguments a la indexed properties! We’ll talk about that later!) without passing them by ref.

This is really just compiler syntactic sugar – the compiler does the work to generate the temporaries for you, and slots them in place of the actual arguments. That means that in the following code, call (1) gets transformed into call (2).
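A small sketch of that rewrite, under the assumption that a hand-declared [ComImport] interface is treated the same way as one coming from a PIA. `ICounter`, its GUID, and `FakeCounter` are hypothetical names invented for this example. Note the consequence of the temporary: whatever the callee writes through the ref parameter in call (1) lands in the hidden temporary and is discarded.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical COM-style interface; the name and GUID are invented.
[ComImport]
[Guid("2C4B8D16-91E3-4F7A-8A55-0D3E6F9B7C21")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface ICounter
{
    void Increment(ref int value);
}

// Managed fake so the sketch runs without a real COM server.
public sealed class FakeCounter : ICounter
{
    public void Increment(ref int value)
    {
        value = value + 1;
    }
}

public static class RefSugarDemo
{
    public static int[] Run()
    {
        // The receiver's static type must be the [ComImport] interface
        // for the compiler to allow omitting ref.
        ICounter counter = new FakeCounter();

        int a = 10;
        counter.Increment(a);          // call (1): no ref; a hidden temporary is passed

        int b = 10;
        int temp = b;                  // call (2): roughly what the compiler
        counter.Increment(ref temp);   // generates for call (1)

        int c = 10;
        counter.Increment(ref c);      // explicit ref still works and updates c

        return new[] { a, b, temp, c };   // 10, 10, 11, 11
    }
}
```

If you need the value the callee wrote back, keep passing the argument with an explicit `ref`, as in the last call.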

The great thing about this too, is that with the introduction of named and optional arguments, and using the fact that the feature generates Type.Missing in place of default values for object on COM types, we can simply remove the arguments altogether!
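Sketched out, that looks something like the following. As before, `IDocumentsWithOptionals`, its GUID, and `FakeOptionalDocuments` are hypothetical stand-ins rather than the real Word PIA; the parameters are marked with `[Optional]`, which is my assumption about how an imported COM signature with optional `ref object` parameters would look when declared by hand.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a PIA interface with optional by-ref parameters.
// The name and GUID are invented for illustration.
[ComImport]
[Guid("5E9D3A7F-60B2-4C18-B4A6-1F2E8C7D9033")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IDocumentsWithOptionals
{
    string Add([Optional] ref object template, [Optional] ref object newTemplate,
               [Optional] ref object documentType, [Optional] ref object visible);
}

// Managed fake that reports which arguments arrived as Type.Missing.
public sealed class FakeOptionalDocuments : IDocumentsWithOptionals
{
    public string Add(ref object template, ref object newTemplate,
                      ref object documentType, ref object visible)
    {
        return Describe(template) + "," + Describe(newTemplate) + ","
             + Describe(documentType) + "," + Describe(visible);
    }

    private static string Describe(object arg)
    {
        return ReferenceEquals(arg, Type.Missing) ? "missing" : Convert.ToString(arg);
    }
}

public static class OptionalArgsDemo
{
    public static string[] Run()
    {
        IDocumentsWithOptionals docs = new FakeOptionalDocuments();

        string none = docs.Add();                // every argument omitted
        string first = docs.Add("Normal.dotm");  // positional, no ref needed
        string named = docs.Add(visible: true);  // named, skipping the rest

        return new[] { none, first, named };
    }
}
```

Each omitted argument reaches the fake as `Type.Missing`, which is why it reports "missing" for those slots.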

Pretty cool stuff huh? Definitely makes programming against the Office APIs much nicer. The added bonus is that the IDE helps you out by letting you know that the parameters are optional, indicating that you can omit them, and indicating what the default value used in place will be.

So why now?

Let’s get into a bit of a philosophy discussion now, about why we’re doing all these different COM interop features now. In the past, we’ve been asked for these features – nay, we even pushed back against some of them.

For example, indexed properties are something that the VB language has supported for quite some time now, but C# had decided that they weren’t the right way to go. Our standpoint was (and still is!) that the programming paradigm ought to be that the property is accessed, and that the value it returns is the thing that should supply the indexing.

So why are we adding all these features in now?

Well, for starters, COM’s pretty entrenched in the application programming world today. Many developers are having to struggle with the COM APIs and Office programming models every day, and it just doesn’t look like that model is going to be replaced, or will go away any time soon.

Next, C# 4.0 is really an interop release. Our focus this time around was to be able to interop with different programming languages and programming models. It seemed only fitting that one of the largest programming models still out there ought to be pretty high up on our list of priorities.

Dynamic binding allows for ease of interop with other object models and dynamic languages. Named and optional arguments ease interop with legacy libraries like COM, which have a lot of optional parameters and large parameter lists. And the DLR was introduced with the sole purpose of providing a common runtime for interop with dynamic languages. All of these point us towards interop, and that, combined with the knowledge that COM interop has been a big pain point for our users, really tipped the scale towards pushing more interop features out this time around.

Agree? Disagree? As always, with philosophy things (and with everything else for that matter), I’d love to get your feedback. Until then, happy coding!

I wish - I REALLY wish - that C# could stick to its design principles and not compromise for the sake of interoperating with languages and platforms that (IMHO) have made poor design choices.

However, I am forced to agree that COM isn't going away (as much as I wish it would), and many people still have to interoperate with it (as much as they probably wish they didn't). The only other thing I might consider is pushing these "features" into a secondary layer of the language that would have to be explicitly enabled (a la "Allow unsafe code"). But I can understand why that would be considered too extreme for this case (although I would be interested in knowing if something like that were even considered).

Ultimately, I don't like it, because I know that these "features" won't be confined to the usages for which they were intended, to the detriment of us all. But the world is imperfect, and we have to live in it.

> Ultimately, I don't like it, because I know that these "features" won't be confined to the usages for which they were intended, to the detriment of us all.

The good thing about all these changes is that they _only apply to COM objects_! I.e., to interfaces and classes with ComImportAttribute. So it isn't a major compromise - you still can't pass values to ref-arguments on POCOs, or declare named indexed properties, etc. It's strictly an interop thing, much like dynamic.

Dynamic may be "strictly an interop thing" by design, but in practice it can be used anywhere on any object, regardless of its origin. In my opinion this is a serious design flaw that will have long lasting negative consequences.

However, it does appear that these changes designed for COM interop will only apply to COM objects, so the negative impact will be limited.

Based on your statement, is it fair to say that C# 4.0 will have intrinsic support for indexed properties? If so, does that mean that Extension Properties would be back on the table?

As I understand it, the primary reason that Extension Properties are vetoed is because C# would have to support indexed properties and, as you've stated, the C# team thinks that is ill advised.

Extension Properties could be immensely useful and should have the same effective run-time capabilities as Extension Methods, since properties are ultimately a facade for get_xxx/set_xxx methods.

Working with the Silverlight BCL has proven to have great need for Extension Properties in order to duct tape types that are missing capabilities commonly found in the Windows-based BCL. For example, System.Threading.Thread.CurrentPrincipal was omitted.

It's a shame that this is required - it makes C# feel so "dirty" - but I can see why it's needed. It may have been possible to create .NET-friendly wrappers for the Office PI(T)A, but versioning would probably be a nightmare, and it wouldn't help with third-party COM libraries.

If you've actually looked at what's in PIA assemblies, you'll see that there's no real code there - just interface declarations (and structs, where they apply). The framework itself does the interop.

What NoPIA means, effectively, is that you no longer need those PIA assemblies with interfaces. They get embedded directly into assemblies that use COM objects, and the CLR ensures that such embedded types in different assemblies, even though technically distinct from the CLR's point of view, are compatible and can be substituted one for another when they actually correspond to the same COM interface/struct/etc.

Simply put, it's a form of structural typing specifically for COM interop artefacts.

> Dynamic may be "strictly an interop thing" by design, but in practice it can be used anywhere on any object, regardless of its origin. In my opinion this is a serious design flaw that will have long lasting negative consequences.

What are the alternatives? Unlike the rest of it, "dynamic" is not just for COM - it's also for interop with IronPython and other dynamic friends. And there's no good way to distinguish a "scripting" object from a non-scripting one - and, from multi-language .NET perspective - there really shouldn't be.

There's one more thing to consider. Let's look at IronPython interop again. Say you use "dynamic" to call a method on an IPy object, and that returns you yet another object. Now you don't know what is returned - it may be another IPy object, but it may also be a C# (or other static language) object that the IPy code obtained from elsewhere - and there's no reason why you should. But you have to have some common way to work with that object, and that means that "dynamic" has to work regardless of what the object is implemented in.

In my opinion, it's alright. A major compromise would be if C# allowed you to define truly dynamic "expando" objects (easily), but that is not the case.

Wow, sorry I haven't been on here to answer anything in a while! Let me start from the top:

Commongenius:

> Ultimately, I don't like it, because I know that these "features" won't be confined to the usages for which they were intended, to the detriment of us all. But the world is imperfect, and we have to live in it.

I entirely agree. Someone else mentioned that it makes C# feel "dirty" - believe me, as the dev who implemented these things, I felt much dirtier. But it was a necessary evil I believe.

Xenan:

I don't think that's actually a problem, as much as it is an artifact of software engineering. You can't just move forward from version to version and simply assume you can wipe out any existing legacy concepts from previous versions. This is the nature of software design, and unfortunately is best solved through the language.

chrimart:

> If so, does that mean that Extension Properties would be back on the table?

Not currently. There were several things that took Extension Properties off the table, one of which was the indexed property problem. However, note that with this release, we're providing support for *consuming* indexed properties, not *authoring* them, so nothing would have changed from the extension properties front.

Richard:

Great catch, I've changed it from emit to omit. Thanks!

Pavel:

Thanks for your help answering questions that folks have! Let me just vouch for your answers and say that you are correct in your understanding of the features. The NoPIA feature (which I'll describe in upcoming posts) embeds only the types and members you use in your code into your compiled assembly, and removes the reference to the PIA. The CLR then does some magic to note that these types are equivalent, and treats them as such at runtime. More on that later :)