Introduction

Microsoft Component Object Model, COM, is a mechanism providing binary interoperability between software components, and as far as I know, it’s the most successful standard for interoperability ever conceived.

COM is a fundamental part of Windows and .Net. .Net developers write software that uses COM all the time, even if they’re not consciously aware of this. Most software written for the .Net platform can easily expose its functionality to unmanaged code using COM by just selecting the ‘Make assembly COM-Visible’ check box, found by clicking the ‘Assembly Information…’ button on the project properties ‘Application’ page.

Over the years Microsoft has added a significant number of APIs to Windows based on COM, and for C++ developers using them is sometimes a bit cumbersome. As a developer you’re obviously expected to check for program errors, and most COM based APIs return an HRESULT conveying information about whether a call succeeded or not. Normally a negative value indicates that an error occurred, but not every API follows this convention. .Net, on the other hand, transparently converts errors into exceptions, improving the whole development experience.

One of the fundamental features of COM is object lifetime management, which is implemented using reference counting, so COM requires C++ developers to explicitly decrement the reference count of an object when they no longer need it.
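To make the bookkeeping concrete, here is a minimal sketch of manual reference counting. It is portable C++ that only mimics the COM convention: IRefCounted and Widget are hypothetical names made up for this illustration, and on Windows the real IUnknown would take their place.

```cpp
#include <cassert>

// Portable stand-in for a COM-style interface (assumption: not the real IUnknown).
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IRefCounted() = default;  // objects are destroyed only through Release()
};

// Hypothetical object; it counts live instances so the effect is observable.
class Widget final : public IRefCounted {
public:
    static int liveObjects;
    Widget() { ++liveObjects; }
    unsigned long AddRef() override { return ++refCount_; }
    unsigned long Release() override {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;  // the last reference is gone
        return remaining;
    }
private:
    ~Widget() { --liveObjects; }
    unsigned long refCount_ = 1;  // the creator holds the first reference
};
int Widget::liveObjects = 0;

// Returns the number of live objects once every reference has been released.
int UseWidget() {
    IRefCounted* widget = new Widget();  // reference count is 1
    widget->AddRef();                    // a second owner appears: count is 2
    widget->Release();                   // the second owner is done: count is 1
    widget->Release();                   // forgetting this call would leak the object
    return Widget::liveObjects;
}
```

Every AddRef() needs a matching Release(), and nothing in the language reminds us when one is missing.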

One fundamental thing to know about COM interfaces is that an interface is just a C++ class consisting of pure virtual functions, which really is just a structure containing a single pointer to a structure containing nothing but function pointers.

Note that the order of the function pointers is the same as the order of the pure virtual function declarations for the C++ representation of the interface. For IClassFactory, those from IUnknown come first, and then come the function pointers representing the two functions added by the IClassFactory interface itself, CreateInstance and LockServer.
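The layout described above can be sketched in plain C-style structures. This is an illustration only: the parameter lists are simplified (void pointers instead of the real IID and IUnknown types), HRESULT is a stand-in typedef, and MyFactory with its functions is a hypothetical implementation invented for the example.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;  // stand-in for the Windows typedef

struct ClassFactoryVtbl;

// The interface as the caller sees it: a single pointer to a function table.
struct ClassFactory {
    const ClassFactoryVtbl* lpVtbl;
};
static_assert(sizeof(ClassFactory) == sizeof(void*), "just one pointer");

struct ClassFactoryVtbl {
    // Inherited from IUnknown, in declaration order.
    HRESULT (*QueryInterface)(ClassFactory* self, const void* iid, void** out);
    unsigned long (*AddRef)(ClassFactory* self);
    unsigned long (*Release)(ClassFactory* self);
    // Added by IClassFactory.
    HRESULT (*CreateInstance)(ClassFactory* self, void* outer, const void* iid, void** out);
    HRESULT (*LockServer)(ClassFactory* self, int lock);
};

// Hypothetical implementation: the interface comes first, private data after it.
struct MyFactory {
    ClassFactory iface;
    unsigned long refCount;
};

static HRESULT MyQueryInterface(ClassFactory*, const void*, void** out) { *out = nullptr; return -1; }
static unsigned long MyAddRef(ClassFactory* self) { return ++reinterpret_cast<MyFactory*>(self)->refCount; }
static unsigned long MyRelease(ClassFactory* self) { return --reinterpret_cast<MyFactory*>(self)->refCount; }
static HRESULT MyCreateInstance(ClassFactory*, void*, const void*, void** out) { *out = nullptr; return -1; }
static HRESULT MyLockServer(ClassFactory*, int) { return 0; }

static const ClassFactoryVtbl kMyFactoryVtbl = {
    MyQueryInterface, MyAddRef, MyRelease, MyCreateInstance, MyLockServer
};

// Calls two functions through the table, the way a C client would.
unsigned long CallThroughTable() {
    MyFactory factory = { { &kMyFactoryVtbl }, 1 };
    ClassFactory* p = &factory.iface;
    p->lpVtbl->AddRef(p);          // count goes from 1 to 2
    return p->lpVtbl->Release(p);  // count goes back to 1
}
```

A C++ compiler generates an equivalent table for a class of pure virtual functions, which is why the two representations are binary compatible.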

IUnknown is deceptively simple, but if you stop and think about it, you’ll realize that it’s quite elegant. AddRef() and Release() provide the functionality required for lifetime management, while QueryInterface() allows us to access all the interfaces implemented for the object.
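As a rough sketch of how these three functions work together, the following portable mock implements one object that exposes two interfaces. The names IUnknownLike, IShape, IPrintable and Square are hypothetical, and an enum stands in for the GUID-based interface identifiers that real COM uses.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;        // stand-in for the Windows typedef
constexpr HRESULT kOk = 0;
constexpr HRESULT kNoInterface = -1; // any negative value signals failure

// Stands in for IID GUIDs; Serializable is not implemented by Square.
enum class InterfaceId { Shape, Printable, Serializable };

struct IUnknownLike {
    virtual HRESULT QueryInterface(InterfaceId id, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;
};
struct IShape : IUnknownLike { virtual int Sides() = 0; };
struct IPrintable : IUnknownLike { virtual const char* Name() = 0; };

// One object, two interfaces, one shared reference count.
class Square final : public IShape, public IPrintable {
public:
    HRESULT QueryInterface(InterfaceId id, void** out) override {
        switch (id) {
            case InterfaceId::Shape:     *out = static_cast<IShape*>(this);     break;
            case InterfaceId::Printable: *out = static_cast<IPrintable*>(this); break;
            default:                     *out = nullptr; return kNoInterface;
        }
        AddRef();  // the caller now owns one more reference
        return kOk;
    }
    unsigned long AddRef() override { return ++refCount_; }
    unsigned long Release() override {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;
        return remaining;
    }
    int Sides() override { return 4; }
    const char* Name() override { return "square"; }
private:
    ~Square() = default;
    unsigned long refCount_ = 1;
};

// Starts with one interface, asks for the other, and checks the result.
int SidesViaQueryInterface() {
    IPrintable* printable = new Square();
    void* raw = nullptr;
    int sides = -1;
    HRESULT hr = printable->QueryInterface(InterfaceId::Shape, &raw);
    if (hr >= 0) {
        IShape* shape = static_cast<IShape*>(raw);
        sides = shape->Sides();
        shape->Release();
    }
    printable->Release();
    return sides;
}
```

Note that the caller has to check the returned HRESULT and release both pointers by hand.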

When we use .Net to access a COM based API things are a little easier, as .Net performs this error checking for us, and in case of a failure .Net converts it into an exception that is thrown before control is returned to our code. That’s actually a pretty nifty service provided by .Net, because when we go native there are suddenly a lot of things going on that have to do with object lifetime and error handling. Not only does .Net convert errors into exceptions, it also manages the lifetime of our references.
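The kind of native code I have in mind looks roughly like the sketch below. It is a portable mock, not a real Windows API: Decoder, Frame, CreateDecoder and GetFrameWidth are hypothetical names, and the classes are simplified to plain reference-counted objects.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;  // stand-in for the Windows typedef
constexpr HRESULT kOk = 0;
constexpr HRESULT kFail = -1;

int g_liveObjects = 0;  // lets the sketch verify that nothing leaks

struct RefCounted {
    RefCounted() { ++g_liveObjects; }
    unsigned long AddRef() { return ++refCount_; }
    unsigned long Release() {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;
        return remaining;
    }
protected:
    virtual ~RefCounted() { --g_liveObjects; }
private:
    unsigned long refCount_ = 1;
};

struct Frame : RefCounted {
    HRESULT GetWidth(int* width) { *width = 640; return kOk; }
};

struct Decoder : RefCounted {
    explicit Decoder(bool hasFrame) : hasFrame_(hasFrame) {}
    HRESULT GetFrame(Frame** frame) {
        if (!hasFrame_) { *frame = nullptr; return kFail; }
        *frame = new Frame();
        return kOk;
    }
private:
    bool hasFrame_;
};

HRESULT CreateDecoder(bool hasFrame, Decoder** decoder) {
    *decoder = new Decoder(hasFrame);
    return kOk;
}

// Every call is followed by a check, and every exit path has to release
// whatever has been acquired so far.
HRESULT GetFrameWidth(bool hasFrame, int* width) {
    Decoder* decoder = nullptr;
    HRESULT hr = CreateDecoder(hasFrame, &decoder);
    if (hr < 0) return hr;

    Frame* frame = nullptr;
    hr = decoder->GetFrame(&frame);
    if (hr < 0) {
        decoder->Release();
        return hr;
    }

    hr = frame->GetWidth(width);
    frame->Release();
    decoder->Release();
    return hr;
}
```

Only one line of GetFrameWidth() does what the function is named for; the rest is error handling and lifetime management.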

To solve the lifetime part of the problem, it’s common to use smart interface pointer classes, like CComPtr<>, which work as far as lifetime is concerned, but we still have to check the HRESULT, because the smart interface pointer does its magic by implementing T* operator -> () const, providing access to the interface.

What I expect from a well-designed C++ API is that it lets me focus on what I want to do with a particular piece of code, not on error handling and lifetime management.

Error handling and lifetime management still have to be performed, but behind the scenes, much like what .Net developers can take for granted when they work with COM.
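A minimal sketch of that idea, under stated assumptions: RawCounter is a made-up stand-in for a COM object, Counter is a hypothetical wrapper class, and CheckHRESULT here is my own simplified version that throws a std::runtime_error for any negative value.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>

using HRESULT = std::int32_t;  // stand-in for the Windows typedef
constexpr HRESULT kOk = 0;
constexpr HRESULT kFail = -1;

// Turns a failure code into an exception, so callers need no checks.
inline void CheckHRESULT(HRESULT hr) {
    if (hr < 0) throw std::runtime_error("COM-style call failed");
}

// Mock "raw" object, standing in for something reached through a COM interface.
int g_liveObjects = 0;
class RawCounter {
public:
    RawCounter() { ++g_liveObjects; }
    unsigned long AddRef() { return ++refCount_; }
    unsigned long Release() {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;
        return remaining;
    }
    HRESULT Add(int amount, int* total) {
        if (amount < 0) return kFail;  // negative amounts are rejected
        total_ += amount;
        *total = total_;
        return kOk;
    }
private:
    ~RawCounter() { --g_liveObjects; }
    unsigned long refCount_ = 1;
    int total_ = 0;
};

// Wrapper: owns one reference and releases it in the destructor.
class Counter {
public:
    Counter() : raw_(new RawCounter()) {}
    Counter(const Counter& other) : raw_(other.raw_) { raw_->AddRef(); }
    Counter& operator=(Counter other) noexcept { std::swap(raw_, other.raw_); return *this; }
    ~Counter() { raw_->Release(); }

    // Returns the result directly and throws on failure.
    int Add(int amount) {
        int total = 0;
        CheckHRESULT(raw_->Add(amount, &total));
        return total;
    }
private:
    RawCounter* raw_;
};

// Client code: no HRESULT checks and no Release() calls.
int AddTwice() {
    Counter counter;
    counter.Add(2);
    return counter.Add(3);
}

// A failing call surfaces as an exception, and the reference is still released.
bool NegativeAddThrows() {
    try { Counter counter; counter.Add(-1); }
    catch (const std::runtime_error&) { return true; }
    return false;
}
```

Compared with a bare smart pointer, the wrapper also takes over the HRESULT check, so the calling code reads like ordinary C++.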

The Façade Pattern

The facade pattern is a software design pattern often used with object-oriented programming. The name refers to an architectural façade – it’s what you’re able to see from the outside. A facade is an object that provides a simplified interface to the developer, hiding the complexity of interacting with a larger piece of code, such as a class library. A facade object can:

Make a software library easier to use, understand and test, since the facade provides convenient methods for common tasks.

Make the library more readable, for similar reasons.

Reduce dependencies of outside code on the inner workings of a library. Since the client code uses the facade, this also adds more flexibility to the development process.

Wrap a poorly designed collection of APIs with a single well-designed API.

As the last implementation of ConvertBitmapSource demonstrates, this can radically simplify the development process while at the same time improving the robustness of our application.

Requirements

Since COM interfaces inherit the IUnknown interface, which allows us to access all the other interfaces implemented by the underlying object, that’s a significant piece of functionality that we want for our façade objects too.

Consider a graphics::RenderTarget, renderTarget, that is assigned to an Unknown object, unknown. Afterwards the Is<T>() function is used to determine whether unknown is something that can be successfully converted to an object of type graphics::RenderTarget, and the conversion itself is done using the As<T>() function.

This means we’re able to perform significant COM related operations in a straightforward manner, without really being concerned with any details related to what actually makes this work.
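The usage described above could be sketched as follows. The names Unknown, Is<T>(), As<T>(), GetInterface(), CheckHRESULT and graphics::RenderTarget come from the text, but everything behind them is an assumption made for this illustration: the mock interfaces, the enum used in place of interface GUIDs, the InterfaceType alias and FakeRenderTarget are all hypothetical, so don't read this as the actual library code.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>

using HRESULT = std::int32_t;                    // stand-in for the Windows typedef
enum class InterfaceId { RenderTarget, Brush };  // stands in for IID GUIDs

inline void CheckHRESULT(HRESULT hr) {
    if (hr < 0) throw std::runtime_error("COM-style call failed");
}

// --- Mock raw interfaces (assumptions, not the real Windows ones) ---
struct IUnknownLike {
    virtual HRESULT QueryInterface(InterfaceId id, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;
};
struct IRenderTargetLike : IUnknownLike {
    static constexpr InterfaceId Id = InterfaceId::RenderTarget;
    virtual int Width() = 0;
};
struct IBrushLike : IUnknownLike {
    static constexpr InterfaceId Id = InterfaceId::Brush;
};

int g_liveObjects = 0;
class FakeRenderTarget final : public IRenderTargetLike {
public:
    explicit FakeRenderTarget(int width) : width_(width) { ++g_liveObjects; }
    HRESULT QueryInterface(InterfaceId id, void** out) override {
        if (id != InterfaceId::RenderTarget) { *out = nullptr; return -1; }
        *out = static_cast<IRenderTargetLike*>(this);
        AddRef();
        return 0;
    }
    unsigned long AddRef() override { return ++refCount_; }
    unsigned long Release() override {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;
        return remaining;
    }
    int Width() override { return width_; }
private:
    ~FakeRenderTarget() { --g_liveObjects; }
    unsigned long refCount_ = 1;
    int width_;
};

// --- Facade classes ---
class Unknown {
public:
    Unknown() = default;
    explicit Unknown(IUnknownLike* p) : ptr_(p) {}  // adopts one reference
    Unknown(const Unknown& other) : ptr_(other.ptr_) { if (ptr_) ptr_->AddRef(); }
    Unknown& operator=(Unknown other) noexcept { std::swap(ptr_, other.ptr_); return *this; }
    ~Unknown() { if (ptr_) ptr_->Release(); }

    template <typename T> bool Is() const {
        void* raw = nullptr;
        if (!ptr_ || ptr_->QueryInterface(T::InterfaceType::Id, &raw) < 0) return false;
        static_cast<typename T::InterfaceType*>(raw)->Release();  // we were only probing
        return true;
    }
    template <typename T> T As() const {
        void* raw = nullptr;
        CheckHRESULT(GetInterface()->QueryInterface(T::InterfaceType::Id, &raw));
        return T(static_cast<typename T::InterfaceType*>(raw));
    }
protected:
    IUnknownLike* GetInterface() const {
        if (!ptr_) throw std::logic_error("no interface pointer assigned");
        return ptr_;
    }
private:
    IUnknownLike* ptr_ = nullptr;
};

namespace graphics {
class RenderTarget : public Unknown {
public:
    using InterfaceType = IRenderTargetLike;
    explicit RenderTarget(IRenderTargetLike* p) : Unknown(p) {}
    int Width() const { return static_cast<IRenderTargetLike*>(GetInterface())->Width(); }
};
class Brush : public Unknown {
public:
    using InterfaceType = IBrushLike;
    explicit Brush(IBrushLike* p) : Unknown(p) {}
};
}  // namespace graphics

int WidthAfterRoundTrip() {
    graphics::RenderTarget renderTarget(new FakeRenderTarget(800));
    Unknown unknown = renderTarget;  // unknown takes a reference of its own
    if (!unknown.Is<graphics::RenderTarget>() || unknown.Is<graphics::Brush>()) return -1;
    return unknown.As<graphics::RenderTarget>().Width();
}
```

In this sketch Is<T>() and As<T>() are thin wrappers around QueryInterface(), with the reference counting kept inside the wrapper classes.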

The GetInterface() function will return a pointer to the IEnumUnknown interface, or throw an exception if no interface pointer is assigned to the object. We then proceed to call the COM interface and check whether an error occurred. CheckHRESULT will throw an exception if hr is negative.

If everything went well, we construct and return an EnumUnknown object.
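The steps above follow a pattern that can be sketched like this. I am assuming here that the wrapped call is a Clone()-style function that hands back a new enumerator; IEnumUnknownLike and FakeEnum are portable mocks invented for the example, and the exception types are my own choice.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>

using HRESULT = std::int32_t;  // stand-in for the Windows typedef

inline void CheckHRESULT(HRESULT hr) {
    if (hr < 0) throw std::runtime_error("COM-style call failed");
}

// Mock of the raw interface, reduced to what this sketch needs.
struct IEnumUnknownLike {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual HRESULT Clone(IEnumUnknownLike** result) = 0;
protected:
    ~IEnumUnknownLike() = default;
};

int g_liveObjects = 0;
class FakeEnum final : public IEnumUnknownLike {
public:
    FakeEnum() { ++g_liveObjects; }
    unsigned long AddRef() override { return ++refCount_; }
    unsigned long Release() override {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;
        return remaining;
    }
    HRESULT Clone(IEnumUnknownLike** result) override {
        *result = new FakeEnum();
        return 0;
    }
private:
    ~FakeEnum() { --g_liveObjects; }
    unsigned long refCount_ = 1;
};

class EnumUnknown {
public:
    EnumUnknown() = default;
    explicit EnumUnknown(IEnumUnknownLike* p) : ptr_(p) {}  // adopts one reference
    EnumUnknown(const EnumUnknown& other) : ptr_(other.ptr_) { if (ptr_) ptr_->AddRef(); }
    EnumUnknown& operator=(EnumUnknown other) noexcept { std::swap(ptr_, other.ptr_); return *this; }
    ~EnumUnknown() { if (ptr_) ptr_->Release(); }

    EnumUnknown Clone() const {
        IEnumUnknownLike* pEnum = GetInterface();  // throws if nothing is assigned
        IEnumUnknownLike* pClone = nullptr;
        HRESULT hr = pEnum->Clone(&pClone);        // call the raw interface
        CheckHRESULT(hr);                          // throws if hr is negative
        return EnumUnknown(pClone);                // wrap the result
    }
private:
    IEnumUnknownLike* GetInterface() const {
        if (!ptr_) throw std::logic_error("no interface pointer assigned");
        return ptr_;
    }
    IEnumUnknownLike* ptr_ = nullptr;
};

// Number of live raw objects while an enumerator and its clone both exist.
int LiveObjectsWhileCloned() {
    EnumUnknown original(new FakeEnum());
    EnumUnknown copy = original.Clone();
    return g_liveObjects;
}

// Cloning an empty wrapper is reported through an exception.
bool CloneOfEmptyThrows() {
    try { EnumUnknown().Clone(); }
    catch (const std::logic_error&) { return true; }
    return false;
}
```

Each wrapper function follows the same three steps, so adding another one is mostly a matter of repeating the pattern.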

Concluding remarks

By implementing facade classes for COM APIs we’re able to improve the COM client development experience in a way that makes our applications more maintainable and robust, and the good thing is that it isn’t all that difficult.

If you’ve never used C++ templates, this also demonstrates that you can get a lot of mileage out of very little code – and that it doesn’t have to be complicated to be useful.

Most new APIs seem to be based on COM, so it's kind of making a comeback.

Not that it ever really went away - internally .Net relies heavily on COM for many things, which is why we always have to be aware of things such as the apartment model of the thread we're executing under ...

COM is used in all scenarios, on Windows or not. Components, refcounts, and interface-based design are present everywhere (Apple's ObjC framework and Mozilla NS are two examples - though Mozilla XPCOM didn't quite nail it...). I find it particularly amusing that everyone hates MS, Windows, and especially COM; I still consider COM one of the greatest implementations of all time.

Coming from the .NET world I always thought that COM was a legacy thing...
Now that I'm moving into lower layers, I'm amazed at the cool things we can do with it... I never used COM in C++ though, because I was afraid of the syntax... not anymore with your work.

However, why don't you use COMPtr to manage the lifetime of your objects?

I could have used COMPtr in place of Unknown, but I imagine that somewhere down the road, I'm going to do something odd to Unknown.

As you know, a single object often exposes multiple interfaces. Imagine that you want to attach a piece of data to the object - from the client side of things - and preserve access to the attached data through conversion from one interface to another.

Aggregation would be one solution, but not all objects support that...

The implementation of std::shared_ptr<FileSaveDialog> FileSaveDialog::Create() and std::shared_ptr<FileOpenDialog> FileOpenDialog::Create() uses shared_ptr because this mechanism is not in place.

Currently, the user of the code can use C++ lambda functions in place of implementing the IFileDialogEvents interface, and the boost::signals2::signal objects that facilitate this are part of the FileDialog class. As it is, those objects will not be preserved through conversion to/from Unknown - and preserving them is something I think is desirable.