The Code Project brings a handful of resources to the community and many, including myself, benefit from it. Since the site has been very beneficial to me, I thought it was about time I contributed something in return to the community by posting an article of my own. I wanted to post for a while but never found (or should I say, took) the time to do so. Well, here we go for my first post; please feel free to provide your feedback and questions which I will try to answer to the best of my knowledge.

The .NET Framework has greatly improved support for multithreaded applications, thereby making it easier for the programmer to create such applications. Even though the .NET Framework provides an enormous number of tools and resources to accomplish a multitude of tasks, there are still many programmers using older libraries such as WTL/ATL/MFC or even the good old Win32 API to accomplish their tasks.

For some tasks I'm part of the latter group of programmers and I sometimes find it difficult as well as time and energy consuming to switch from using the .NET Framework. For this reason, and until C++0x brings multithreading support to a compiler near you, I thought I would re-create in C++ one of the features I use most from the .NET Framework: Asynchronous Design Patterns, in particular, the ability to invoke a function asynchronously (e.g. on a thread pool) using a delegate and then retrieve its result later.

The .NET Framework Asynchronous Design Patterns also include an interface for synchronizing a function call on a specific thread (ISynchronizeInvoke), which I have included in this implementation as well.

The main goal of this implementation is to bring a very basic implementation of some of the .NET Asynchronous Design Patterns to the C++ language. Since I mostly use WTL/ATL when I develop native Win32 applications, I made this implementation blend well with its respective style. By no means does this implementation provide all the answers; it should be considered only a starting point and a learning project for future ideas and enhancements.

Keeping in mind that dependencies are sometimes painful to manage and considering the size of this project, I made the implementation fit inside a single header file, making it very easy to include as part of an existing project. If I wasn't worried about extra dependencies, I could have used the great Boost C++ libraries to benefit from their extended (and portable) support for functors, tuples and threads; however, I decided to start lean in dependencies for this tiny project. Besides, Boost wouldn't have blended well with the WTL/ATL style because of its naming convention and its heavy use of templates following the STL guidelines. The only library dependency enforced by this implementation is ATL (specifically CThreadPool), which can easily be removed by re-implementing the CDelegate class to use a custom thread pool (or any other means to execute the delegates).

You should also note that the .NET Framework benefits from the CLR features, therefore forcing my implementation to simulate some behaviors using different techniques and mechanisms. Additionally, some features have just not been implemented (e.g. support for multicast delegates, variadic arguments, etc.).

More specifically here is what this implementation provides:

Documented source code to an example implementation of a portion of the Asynchronous Design Patterns

Capability to asynchronously call a global or class function and retrieve its result at a later time

Support for callback upon completion of a delegate asynchronous call

A means to detect if an asynchronous call is required and capability to synchronize that call on a specific thread

To better understand this article you should be familiar with asynchronous design patterns. There are many great articles about them right here on The Code Project and here are some that caught my attention:

The core of this implementation relies on IAsyncResult, ISynchronizeInvoke and IDelegate. IDelegate doesn't exist in .NET but I needed an interface to define a delegate, which doesn't exist in C++ (and function pointers are much lower level than delegates). I was guided by the .NET Framework for implementing IAsyncResult (which encapsulates all the resources of an asynchronous call) and created 2 classes, CAsyncResult and CThreadMethodEntry to be used respectively by CDelegate and CSynchronizeInvoke.

Let's continue with a class diagram (an image is worth a thousand words):

NOTE: CMainDlg is part of the demo application, not the implementation.

The diagram should be self explanatory and the design follows some of the semantics from similar .NET classes, but here are some more details:

While CAsyncRequest, IDelegate and IClassDelegate could find use in an application (IDelegate more than the others), they were designed to be used internally by this implementation.

IAsyncResult is used for keeping a reference to an asynchronous call. It is important to keep this reference and pass it to EndInvoke; otherwise the resources of the call are leaked, unless the call was started as a "fire & forget" type of call (further details on that topic below).

It is important to mention that IDelegate and ISynchronizeInvoke are closely related and both have a BeginInvoke and EndInvoke method, however their meaning is very different.

IDelegate::BeginInvoke is used to begin an asynchronous call for the function wrapped by the delegate implementing the IDelegate interface while ISynchronizeInvoke::BeginInvoke is meant to asynchronously call the provided delegate on the thread the object implementing the ISynchronizeInvoke interface is living on (or any thread it decides to call it on for that matter). So the delegate passed to IDelegate::BeginInvoke is a callback, while the delegate passed to ISynchronizeInvoke::BeginInvoke is the delegate to call on the thread selected by the object implementing ISynchronizeInvoke.

IAsyncResult follows the same semantics as its .NET counterpart aside from the fact that it is missing the CompletedSynchronously getter, which didn't have a use in this implementation.

CAsyncResult and CThreadMethodEntry are respectively used to encapsulate the management of resources associated with an asynchronous call from a CDelegate and a CSynchronizeInvoke. CAsyncRequest is simply an abstract class implementing common portions of CAsyncResult and CThreadMethodEntry.

Before moving forward with the interface and implementation, I would like to address the question of whether calling EndInvoke() is required or not. In the .NET implementation, if you don't call EndInvoke() on a delegate asynchronous call, it is stated that you will leak resources (documented on MSDN), but if you don't call EndInvoke() on a synchronized call (as used by WinForms), apparently no leak will occur (not officially documented).

Since the documentation of the interface states that EndInvoke() should always be called, it is up to the implementer to officially document any exceptions to the interface rules. Since no official documentation exists to state that EndInvoke() is optional in the WinForms implementation, one can assume it is required. (This still can be discussed as the WinForms implementation doesn't really follow the interface definition of the pattern.)

To get around this issue, I modified the interface definition to include an extra parameter to state whether the call is a "fire & forget" type or if EndInvoke() will be called to retrieve the return value (and free up allocated resources). To stay close to the .NET implementation, the parameter defaults to "fire & forget" for a synchronized invoke and to not "fire & forget" for a delegate asynchronous call.

Below are the rules around the "fire & forget" scenario in this implementation:

For non-"fire & forget" calls, always call EndInvoke() for both delegate asynchronous and synchronized calls, otherwise resources will be leaked.

For "fire & forget" calls, never call EndInvoke() for delegate asynchronous or synchronized calls as the resources associated with the call have already been disposed.

IDelegate::BeginInvoke takes the following parameters:

ulParam - The delegate parameter; this will be passed to the function wrapped by the delegate class

pCallback - The callback delegate; an optional delegate that wraps a function to be called upon completing the asynchronous call. In order to simplify the use of callbacks (any resources associated with delegates) they are auto deleted upon completion. (Always use the provided delegate macros to create new delegates. See below for details)

pvState - An optional state associated with the asynchronous call; could be anything you want to keep in context

bFireAndForget - Whether this call is a "fire & forget" scenario or not.

ISynchronizeInvoke::BeginInvoke takes the following parameters:

pDelegate - The delegate wrapped function to call; this call will be made on the thread the object implementing ISynchronizeInvoke decides to make the call on, typically the thread the object implementing the interface lives on; a GUI thread for the .NET Framework implementation

ulParam - The delegate parameter, passed to the function wrapped by pDelegate

bFireAndForget - Whether this call is a "fire & forget" scenario or not.

EndInvoke takes a single parameter:

ppAsyncResult - The reference pointer of the asynchronous call to end; this has been implemented as a double pointer since the resulting pointer is set to NULL to signal that an IAsyncResult reference pointer must never be used after calling EndInvoke (as all associated resources have been deallocated).

The return value of EndInvoke is the return value of the function wrapped by the delegate. Clients must use a cast to recover the original type.
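The parameters described above could be gathered into interface declarations roughly like the following. This is only a sketch under stated assumptions: `ULONG_PTR_T` is a portable stand-in for the Win32 `ULONG_PTR` type, the `IsCompleted`, `GetAsyncState`, `Invoke` and `InvokeRequired` members are guesses modeled on the .NET counterparts, and `ReleaseAsyncResult` is a hypothetical helper that only illustrates the double-pointer contract of EndInvoke. The default values of `bFireAndForget` follow the rules stated earlier.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the Win32 ULONG_PTR type (assumption, keeps the sketch portable).
typedef std::uintptr_t ULONG_PTR_T;

// Reference to one asynchronous call (member names are assumptions).
struct IAsyncResult {
    virtual ~IAsyncResult() {}
    virtual bool  IsCompleted() const = 0;    // has the call finished?
    virtual void* GetAsyncState() const = 0;  // the pvState given to BeginInvoke
};

struct IDelegate {
    virtual ~IDelegate() {}
    virtual ULONG_PTR_T Invoke(ULONG_PTR_T ulParam) = 0;
    // Not "fire & forget" by default: the caller is expected to call EndInvoke.
    virtual IAsyncResult* BeginInvoke(ULONG_PTR_T ulParam,
                                      IDelegate* pCallback = nullptr,
                                      void* pvState = nullptr,
                                      bool bFireAndForget = false) = 0;
    virtual ULONG_PTR_T EndInvoke(IAsyncResult** ppAsyncResult) = 0;
};

struct ISynchronizeInvoke {
    virtual ~ISynchronizeInvoke() {}
    virtual bool InvokeRequired() const = 0;
    // "Fire & forget" by default: EndInvoke must then not be called.
    virtual IAsyncResult* BeginInvoke(IDelegate* pDelegate,
                                      ULONG_PTR_T ulParam,
                                      bool bFireAndForget = true) = 0;
    virtual ULONG_PTR_T EndInvoke(IAsyncResult** ppAsyncResult) = 0;
};

// Hypothetical helper showing the double-pointer contract: the call object is
// released and the caller's pointer is cleared so it cannot be reused.
inline void ReleaseAsyncResult(IAsyncResult** ppAsyncResult) {
    assert(ppAsyncResult != nullptr);
    delete *ppAsyncResult;
    *ppAsyncResult = nullptr;
}
```

Passing the address of the pointer is what lets EndInvoke leave the caller holding NULL instead of a dangling reference.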

EndInvoke has the same signature for both IDelegate and ISynchronizeInvoke. To retrieve the result of an asynchronous call one must call EndInvoke. As stated above, EndInvoke must always be called to claim back the resources allocated for the asynchronous call, unless BeginInvoke was called as a "fire & forget" type of call. If EndInvoke is called before the asynchronous call has had time to complete, it will block until the call has completed. It is also worth mentioning that callbacks are never part of an asynchronous call time frame; that is, calling EndInvoke will not block until the callback has completed. This is important because otherwise calling EndInvoke inside a callback would result in a deadlock.

An important note about EndInvoke: Calling EndInvoke with an IAsyncResult other than the one returned by the matching BeginInvoke call will result in an InvalidOperationException in the .NET version. The same rule applies to this implementation: the IAsyncResult passed to EndInvoke must be the one returned by the matching BeginInvoke. This is important as delegates use a CAsyncResult as IAsyncResult while synchronized calls use a CThreadMethodEntry as IAsyncResult.

Delegates

CDelegate

Implementing IDelegate is pretty straightforward, however there are some issues (you know, there always are some issues, oops, I mean challenges). First, C++'s support for variadic arguments is quite limited; I know of only va_list and the use of overloaded templates to achieve the wanted behavior. Second, function pointers are very low level and they offer no flexibility, making it difficult to adapt them in this context. (C++0x may change all of this.) After investigating many options (e.g. one could have used the Boost C++ libraries) I decided I wouldn't break my head on this and opted for a lazy fix to solve both issues: use only one argument and make the delegate signature pre-defined. The side effects are limited to:

casting the argument and result back and forth, and

clients are forced to use the pre-defined signature for all delegates, including callbacks (which are also delegates).

These limitations are enforced by my implementation, not by the design of the pattern.
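To make the cost of the pre-defined signature concrete, here is a minimal sketch of the casting it implies. The names `ULONG_PTR_T`, `DelegateFn`, `RectArgs`, `ComputeArea` and `CallThroughDelegate` are all hypothetical and chosen for illustration; the real signature used by the implementation may differ.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uintptr_t ULONG_PTR_T;              // stand-in for ULONG_PTR (assumption)
typedef ULONG_PTR_T (*DelegateFn)(ULONG_PTR_T);  // hypothetical fixed delegate signature

// Several values are bundled in a struct because only one argument is allowed.
struct RectArgs {
    int width;
    int height;
};

// A function matching the fixed signature: it casts the argument back to its
// real type and casts the result into the generic return type.
ULONG_PTR_T ComputeArea(ULONG_PTR_T ulParam) {
    const RectArgs* pArgs = reinterpret_cast<const RectArgs*>(ulParam);
    return static_cast<ULONG_PTR_T>(pArgs->width * pArgs->height);
}

// Caller side: cast the argument in, cast the result back out.
int CallThroughDelegate(DelegateFn fn, int width, int height) {
    assert(fn != nullptr);
    RectArgs args = { width, height };
    ULONG_PTR_T raw = fn(reinterpret_cast<ULONG_PTR_T>(&args));
    return static_cast<int>(raw);
}
```

In this synchronous sketch the argument struct lives on the caller's stack. For an asynchronous call the pointed-to data would have to stay alive until the call has completed.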

Creating new delegates works the same way as in the .NET Framework: simply pass a function pointer to the constructor. To make a class instance member function delegate, however, you will need CClassDelegate. (See below for more details.)

Since this implementation queues asynchronous delegate calls on a thread pool, the first thing a delegate must do is initialize the thread pool. This is basically done on the first asynchronous call of any delegate, however one could make the initialization function public and call it upon starting the application.

To queue a new asynchronous call, the implementation simply posts the new request on the thread pool:

The machinery behind processing queued requests lives within the ATL Worker Archetype compliant class CDelegateWorker. This is a very simple, self explanatory class where Initialize and Terminate are called once for each thread in the pool and Execute is called every time the thread is cycled. (You can read the header file atlutil.h for implementation details; it is part of ATL.)
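The division of labor between the pool and the worker can be sketched in portable C++. Note the assumptions: `std::thread` and a mutex-protected queue stand in for ATL's CThreadPool, the worker's Initialize/Execute/Terminate signatures are simplified compared to the ATL Worker Archetype, and every name below (`CSketchWorker`, `CSketchPool`, `Request`, `RunPoolDemo`) is hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

typedef std::uintptr_t ULONG_PTR_T;
typedef ULONG_PTR_T (*DelegateFn)(ULONG_PTR_T);

// One queued call: the function, its argument and a slot for the result.
struct Request {
    DelegateFn   fn;
    ULONG_PTR_T  ulParam;
    ULONG_PTR_T* pResult;
};

// Worker with the three entry points described in the text (simplified).
class CSketchWorker {
public:
    bool Initialize() { return true; }  // called once per pool thread
    void Execute(const Request& r) { *r.pResult = r.fn(r.ulParam); }  // once per request
    void Terminate() {}                 // called once per pool thread
};

class CSketchPool {
public:
    explicit CSketchPool(unsigned nThreads) : m_bStop(false) {
        for (unsigned i = 0; i < nThreads; ++i)
            m_threads.emplace_back(&CSketchPool::ThreadProc, this);
    }
    ~CSketchPool() { Shutdown(); }

    // Posts a request; any idle pool thread may pick it up.
    void QueueRequest(const Request& r) {
        assert(r.fn != nullptr && r.pResult != nullptr);
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push_back(r);
        }
        m_cv.notify_one();
    }

    // Lets the threads drain the queue, then joins them.
    void Shutdown() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_bStop = true;
        }
        m_cv.notify_all();
        for (auto& t : m_threads)
            if (t.joinable()) t.join();
    }

private:
    void ThreadProc() {
        CSketchWorker worker;
        if (!worker.Initialize()) return;
        for (;;) {
            Request r;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_bStop || !m_queue.empty(); });
                if (m_queue.empty()) break;  // stop requested and nothing left
                r = m_queue.front();
                m_queue.pop_front();
            }
            worker.Execute(r);  // run the call outside the lock
        }
        worker.Terminate();
    }

    std::mutex               m_mutex;
    std::condition_variable  m_cv;
    std::deque<Request>      m_queue;
    std::vector<std::thread> m_threads;
    bool                     m_bStop;
};

ULONG_PTR_T Square(ULONG_PTR_T x) { return x * x; }

// Queues two calls and returns the sum of their results.
ULONG_PTR_T RunPoolDemo() {
    ULONG_PTR_T a = 0, b = 0;
    CSketchPool pool(2);
    Request ra = { &Square, 3, &a };
    Request rb = { &Square, 4, &b };
    pool.QueueRequest(ra);
    pool.QueueRequest(rb);
    pool.Shutdown();  // joining the threads makes both results visible here
    return a + b;
}
```

Here the results are read only after the pool threads have been joined. The implementation described in the article instead hands the caller an IAsyncResult to wait on.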

By now you may have noticed I haven't mentioned the details of passing a class member function pointer to a new CDelegate, which is another challenge ;-) . This is where CClassDelegate<T> comes into play. To accept a class member function pointer directly, CDelegate would need to be a template, which I didn't like: it is a central class in this design, and the template parameter would have spread to many other classes. Consequently I decided to create a tiny template class that wraps the details of keeping a class member function pointer. Since using that template class directly within CDelegate would bring the template dependency right back, an interface definition, IClassDelegate, was created to decouple CDelegate from it. The interface basically provides a means to invoke the class member function and to delete the CClassDelegate wrapper automatically. All of this means you can do something like this:

without worrying about the new CClassDelegate<T> memory allocation. In fact, you shouldn't even worry about the memory allocation for pDelegate in a "fire & forget" scenario, since it will also be auto-deleted when its time comes. This is all very nice but creating a delegate for a class instance member function makes every declaration very lengthy so I also created some macros to help make things more concise:
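As a minimal sketch of the decoupling described above, the following shows a non-template `CDelegate` that owns a templated wrapper through the `IClassDelegate` interface. Only the class names come from the article; the member names, the fixed signature, the `SKETCH_CLASS_DELEGATE` macro and the `CCounter` class are assumptions made for illustration and are not the article's actual code or macros.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uintptr_t ULONG_PTR_T;
typedef ULONG_PTR_T (*DelegateFn)(ULONG_PTR_T);

// Non-template interface: hides the class type T from CDelegate.
struct IClassDelegate {
    virtual ~IClassDelegate() {}
    virtual ULONG_PTR_T Invoke(ULONG_PTR_T ulParam) = 0;
};

// Tiny template wrapper keeping the object and member function pointers.
template <typename T>
class CClassDelegate : public IClassDelegate {
public:
    typedef ULONG_PTR_T (T::*MemberFn)(ULONG_PTR_T);
    CClassDelegate(T* pObject, MemberFn pfn) : m_pObject(pObject), m_pfn(pfn) {}
    ULONG_PTR_T Invoke(ULONG_PTR_T ulParam) override {
        return (m_pObject->*m_pfn)(ulParam);
    }
private:
    T*       m_pObject;
    MemberFn m_pfn;
};

// CDelegate stays a plain class; it wraps either a free function or a wrapper.
class CDelegate {
public:
    explicit CDelegate(DelegateFn pfn) : m_pfn(pfn), m_pClassDelegate(nullptr) {}
    // Takes ownership of the wrapper and deletes it with the delegate.
    explicit CDelegate(IClassDelegate* pClassDelegate)
        : m_pfn(nullptr), m_pClassDelegate(pClassDelegate) {}
    ~CDelegate() { delete m_pClassDelegate; }

    ULONG_PTR_T Invoke(ULONG_PTR_T ulParam) {
        assert(m_pfn != nullptr || m_pClassDelegate != nullptr);
        return m_pClassDelegate ? m_pClassDelegate->Invoke(ulParam) : m_pfn(ulParam);
    }
private:
    CDelegate(const CDelegate&);             // non-copyable: owns the wrapper
    CDelegate& operator=(const CDelegate&);
    DelegateFn      m_pfn;
    IClassDelegate* m_pClassDelegate;
};

// Hypothetical helper macro that shortens the declaration.
#define SKETCH_CLASS_DELEGATE(T, pObj, Method) \
    new CDelegate(new CClassDelegate<T>((pObj), &T::Method))

// Hypothetical class used to exercise the wrapper.
class CCounter {
public:
    CCounter() : m_total(0) {}
    ULONG_PTR_T Add(ULONG_PTR_T n) { m_total += n; return m_total; }
private:
    ULONG_PTR_T m_total;
};
```

In this sketch the caller still deletes the `CDelegate` itself, which in turn deletes the wrapper. The automatic deletion of the delegate in a "fire & forget" call is not modeled here.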

For implementing ISynchronizeInvoke, I thought it would be a great opportunity to use Windows I/O Completion Ports. Even though I/O completion ports are much more useful than indicated in this implementation, they still provide an efficient and easy way of queuing a request to be processed in a different thread. This makes implementing ISynchronizeInvoke a breeze. Below is the core of queuing/processing the asynchronous calls. (Note the similarity to queuing an IDelegate asynchronous call.)

Processing pending requests is done by calling ProcessPendingInvoke() which is called by IsThreadCallbackMessage from within a loop mechanism, typically a message pump (e.g. from PreTranslateMessage in WTL). By default, the function processes two asynchronous calls in order to minimize the disturbance of performing a function call on the receiving thread, potentially a GUI thread. You can pass a higher number of requests to process if you're not receiving the calls on a GUI thread.
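The queue-then-drain behavior described above can be sketched without any Windows API. In this hedged example a mutex-protected `std::deque` is swapped in for the I/O completion port, only "fire & forget" calls are modeled, and the class name `CSketchSynchronizeInvoke`, its members and `CountHit` are hypothetical. The default of two calls per pass follows the text.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>

typedef std::uintptr_t ULONG_PTR_T;
typedef ULONG_PTR_T (*DelegateFn)(ULONG_PTR_T);

class CSketchSynchronizeInvoke {
public:
    // The thread that creates the object is treated as the owning thread.
    CSketchSynchronizeInvoke() : m_ownerThread(std::this_thread::get_id()) {}

    // True when the caller is on another thread and should queue the call.
    bool InvokeRequired() const {
        return std::this_thread::get_id() != m_ownerThread;
    }

    // Queues a "fire & forget" call; may be called from any thread.
    void BeginInvoke(DelegateFn fn, ULONG_PTR_T ulParam) {
        assert(fn != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(Entry{ fn, ulParam });
    }

    // Runs at most nMaxCalls queued calls on the calling thread (expected to
    // be the owning thread) and returns how many calls were made.
    unsigned ProcessPendingInvoke(unsigned nMaxCalls = 2) {
        unsigned nDone = 0;
        while (nDone < nMaxCalls) {
            Entry entry;
            {
                std::lock_guard<std::mutex> lock(m_mutex);
                if (m_pending.empty()) break;
                entry = m_pending.front();
                m_pending.pop_front();
            }
            entry.fn(entry.ulParam);  // called outside the lock
            ++nDone;
        }
        return nDone;
    }

private:
    struct Entry {
        DelegateFn  fn;
        ULONG_PTR_T ulParam;
    };
    std::thread::id   m_ownerThread;
    std::mutex        m_mutex;
    std::deque<Entry> m_pending;
};

// Increments the int whose address is passed as the delegate parameter.
ULONG_PTR_T CountHit(ULONG_PTR_T ulParam) {
    ++*reinterpret_cast<int*>(ulParam);
    return 0;
}
```

A message pump would call `ProcessPendingInvoke()` once per iteration, so that a burst of queued calls is spread over several iterations instead of stalling the receiving thread.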

As you can see, this is pretty straightforward. Using an I/O completion port is a personal choice and one could use a different mechanism (e.g. pass pMethod with WM_THREADCALLBACK). I also looked at using an APC (queued with QueueUserAPC); however, an APC runs only when the receiving thread enters an alertable wait, which an ordinary message pump does not do, so it is of little use in this context.

CAsyncResult and CThreadMethodEntry

These two classes are the core of an asynchronous call; they contain all the information relating to the call such as the call wait handle, a pointer to the caller, etc. Note that their implementation was modeled to fit with CDelegateWorker's style (Initialize, Execute, Terminate).

Interestingly enough, it is mentioned in the .NET implementation that clients can cast a delegate's BeginInvoke resulting IAsyncResult to AsyncResult for accessing additional resources linked to the asynchronous call. See http://msdn.microsoft.com/en-us/library/system.runtime.remoting.messaging.asyncresult.aspx for more details. In this same manner, you can cast an IAsyncResult to CAsyncResult for delegate calls, which is very useful, especially for retrieving the delegate pointer using GetAsyncDelegate() so you can call EndInvoke() to retrieve the call's return value.

The only real challenge of creating these two classes was implementing their Terminate() member function. Once an asynchronous call has been executed it must be terminated; that is, it must check whether the call has completed, deallocate resources and, optionally, call EndInvoke() in a "fire & forget" scenario. Also, since CAsyncResult is used for a delegate asynchronous call, it is also necessary to support a callback function.

Below are the implementations of CAsyncResult and CThreadMethodEntry's Terminate() member functions respectively:
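As a rough, single-threaded sketch of the termination sequence described above (not the article's actual Terminate() code), the example below marks the call as completed, then runs the optional callback, then frees the call itself in a "fire & forget" scenario. Synchronization and the wait handle are deliberately left out, the callback is a plain function pointer instead of a delegate, and all names (`CSketchAsyncResult`, `OnDone`, `Triple`, `RunCallWithCallback`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uintptr_t ULONG_PTR_T;
typedef ULONG_PTR_T (*DelegateFn)(ULONG_PTR_T);

class CSketchAsyncResult {
public:
    typedef void (*CallbackFn)(CSketchAsyncResult* pResult);

    CSketchAsyncResult(DelegateFn fn, ULONG_PTR_T ulParam,
                       CallbackFn pCallback, bool bFireAndForget)
        : m_fn(fn), m_ulParam(ulParam), m_pCallback(pCallback),
          m_bFireAndForget(bFireAndForget), m_bCompleted(false), m_result(0) {}

    // Runs the wrapped function and keeps its result for EndInvoke.
    void Execute() { m_result = m_fn(m_ulParam); }

    // Must be the last thing the worker does with this object.
    void Terminate() {
        m_bCompleted = true;  // 1. a real version would signal the wait handle
        // Copy what is needed to locals: the callback may free this object.
        CallbackFn pCallback = m_pCallback;
        bool bFireAndForget = m_bFireAndForget;
        if (pCallback) pCallback(this);   // 2. runs after completion is marked
        if (bFireAndForget) delete this;  // 3. nobody will call EndInvoke
    }

    bool IsCompleted() const { return m_bCompleted; }

    // Returns the result, frees the call and clears the caller's pointer.
    static ULONG_PTR_T EndInvoke(CSketchAsyncResult** ppResult) {
        assert(ppResult != nullptr && *ppResult != nullptr);
        assert((*ppResult)->m_bCompleted && !(*ppResult)->m_bFireAndForget);
        ULONG_PTR_T result = (*ppResult)->m_result;
        delete *ppResult;
        *ppResult = nullptr;
        return result;
    }

private:
    DelegateFn  m_fn;
    ULONG_PTR_T m_ulParam;
    CallbackFn  m_pCallback;
    bool        m_bFireAndForget;
    bool        m_bCompleted;
    ULONG_PTR_T m_result;
};

static ULONG_PTR_T g_lastResult = 0;

ULONG_PTR_T Triple(ULONG_PTR_T x) { return 3 * x; }

// Callback that ends the call, as is typical when a callback is provided.
void OnDone(CSketchAsyncResult* pResult) {
    g_lastResult = CSketchAsyncResult::EndInvoke(&pResult);
}

ULONG_PTR_T RunCallWithCallback(ULONG_PTR_T x) {
    CSketchAsyncResult* pCall = new CSketchAsyncResult(&Triple, x, &OnDone, false);
    pCall->Execute();
    pCall->Terminate();  // OnDone frees the call through EndInvoke
    return g_lastResult;
}
```

Marking the call as completed before invoking the callback is what allows the callback to call EndInvoke without blocking. A CThreadMethodEntry equivalent would follow the same outline without the callback step.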

If you have used this pattern with the .NET Framework you should find it very easy to use this implementation with C++. Since the implementation takes care of delegate allocations, you are not required to keep a delegate reference pointer, which makes it even closer to the .NET implementation. Also, because of the added support for the "fire & forget" scenario, it is easy to just call a function in order to avoid blocking the current thread. However, it is very important to keep track of all the asynchronous calls made one way or another, especially during application shutdown, otherwise unexpected behavior may occur.

Asynchronous delegate calls

To call a delegate asynchronously create a CDelegate function wrapper, call BeginInvoke to start the call and EndInvoke to terminate it. Typically you would call EndInvoke from within the callback, if one is provided. You typically would create all delegates using the provided macros. So assuming you have the following member functions defined in class CMainDlg:
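The BeginInvoke/EndInvoke pairing can be illustrated with a deliberately small, portable sketch. It makes several assumptions: each call gets its own `std::thread` instead of a thread pool slot, there is no callback or "fire & forget" support, and `CSketchCall`, the free functions `BeginInvoke` and `EndInvoke`, and `SumTo` are hypothetical stand-ins for the real classes and for the CMainDlg member functions.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

typedef std::uintptr_t ULONG_PTR_T;
typedef ULONG_PTR_T (*DelegateFn)(ULONG_PTR_T);

// Plays the role of the IAsyncResult returned by BeginInvoke.
struct CSketchCall {
    std::thread thread;
    ULONG_PTR_T result;
    CSketchCall() : result(0) {}
};

// Starts fn(ulParam) on another thread and returns a reference to the call.
CSketchCall* BeginInvoke(DelegateFn fn, ULONG_PTR_T ulParam) {
    assert(fn != nullptr);
    CSketchCall* pCall = new CSketchCall();
    pCall->thread = std::thread([pCall, fn, ulParam] {
        pCall->result = fn(ulParam);
    });
    return pCall;
}

// Blocks until the call has completed, frees it and returns its result.
ULONG_PTR_T EndInvoke(CSketchCall** ppCall) {
    assert(ppCall != nullptr && *ppCall != nullptr);
    (*ppCall)->thread.join();  // wait for the asynchronous call to finish
    ULONG_PTR_T result = (*ppCall)->result;
    delete *ppCall;
    *ppCall = nullptr;         // the reference must not be used again
    return result;
}

// Hypothetical long-running function matching the fixed signature.
ULONG_PTR_T SumTo(ULONG_PTR_T n) {
    ULONG_PTR_T sum = 0;
    for (ULONG_PTR_T i = 1; i <= n; ++i) sum += i;
    return sum;
}
```

A caller would write `CSketchCall* pCall = BeginInvoke(&SumTo, 100);`, carry on with other work, and later collect the result with `EndInvoke(&pCall)`, after which `pCall` is NULL.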

As you can see, there are differences between this implementation and the .NET Framework's implementation for this portion of the Microsoft asynchronous design patterns. The goal was not to create an identical implementation but a similar one that can be used in C++.

Many areas of the implementation have not been discussed, such as waiting for an asynchronous call to complete using the wait handle, polling an asynchronous call to determine whether it has finished or not, etc. Even though they are very straightforward to use, these features are worth mentioning. Please refer to the demo application for more examples on how to use this implementation and its features. In most cases, if you have used asynchronous calls in .NET, you will find this implementation's features very easy to use since they mostly mimic the .NET implementation.
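To give an idea of what waiting and polling look like, here is a hedged, portable sketch of a completion event. It uses a mutex and a condition variable as a stand-in for the Win32 wait handle, and the names `CSketchCompletionEvent`, `Set`, `Wait`, `IsCompleted` and `RunWaitDemo` are hypothetical, not the members exposed by the actual implementation.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Stays signalled once set, like a manual-reset event.
class CSketchCompletionEvent {
public:
    CSketchCompletionEvent() : m_bSet(false) {}

    // Called by the worker when the asynchronous call has completed.
    void Set() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_bSet = true;
        }
        m_cv.notify_all();
    }

    // Polling: returns immediately with the current state.
    bool IsCompleted() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_bSet;
    }

    // Waiting: blocks for up to nTimeoutMs; true if the call completed in time.
    bool Wait(unsigned nTimeoutMs) {
        std::unique_lock<std::mutex> lock(m_mutex);
        return m_cv.wait_for(lock, std::chrono::milliseconds(nTimeoutMs),
                             [this] { return m_bSet; });
    }

private:
    mutable std::mutex      m_mutex;
    std::condition_variable m_cv;
    bool                    m_bSet;
};

// Starts a worker, waits for its completion signal and checks its result.
bool RunWaitDemo() {
    CSketchCompletionEvent done;
    int result = 0;
    std::thread worker([&] {
        result = 6 * 7;  // the "asynchronous" work
        done.Set();
    });
    bool bSignalled = done.Wait(5000);
    worker.join();
    return bSignalled && done.IsCompleted() && result == 42;
}
```

Polling suits a GUI thread that must stay responsive, while a timed wait suits a thread that has nothing else to do until the result is ready.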

This implementation could really benefit from having error handling and exception support. Routing exceptions raised in threads is very important and should be handled by calling EndInvoke within try/catch statements as done when using .NET.

Also, this implementation provides only the basics of working asynchronously. Of course, some questions are raised such as "How can I cancel an asynchronous operation?" or "How can I report progress from the asynchronous operation?", etc. Good communication between the calling and receiving threads is very important and ISynchronizeInvoke helps in achieving good results, but its use is still quite low level. Looking forward, one may implement a BackgroundWorker class which could provide a solution to some of the questions on cancelling an asynchronous operation and progress reporting.

About the Author

Daniel has been coding on and off using C and C++ for over 10 years and has recently gained more interest towards .NET and managed languages such as C# and C++/CLI. He is currently working full-time at PricewaterhouseCoopers as a senior support specialist developing in-house software solutions during his spare time. He is also currently studying part time at the University of Quebec in Montreal (UQAM) to obtain his bachelor's degree in software engineering.