Yet Another Generalized Functors Implementation in C++

An article on implementing generalized functors in C++. It considers the requirements for generalized functors and the problems and disadvantages of existing implementations, then suggests several new ideas and solutions together with a complete implementation.

Introduction

Generalized functors are important and powerful design artifacts. An excellent introduction to the generalized functor idiom, its applications and one of the best implementations can be found in [1]. I'll retell the primary ideas, concepts and implementation details below, with a strong emphasis on the latter, as the article title suggests.

So let's discover the generalized functors' goals first. Suppose we have the following C++:

pf2 (a pointer to a static function), paf2 and pbf2 (pointers to member functions) and f (an instance of a class with a corresponding operator()) are collectively called C++ callable entities, i.e. entities to which the function-call operator () can be applied:
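The original listing is not reproduced here, so the following is only a minimal sketch of what such declarations and calls might look like. The classes A, B and F, the function bodies and the helper call_all are assumptions made for illustration; only the names pf2, paf2, pbf2, f and B::f2const come from the text.

```cpp
#include <cassert>

// Records the result of the most recent call so each invocation is observable.
static int g_last = 0;

// A non-member ("static") function.
void f2(int a, int b) { g_last = a + b; }

// Two unrelated classes with member functions of the same signature (hypothetical).
struct A {
    void f2(int a, int b) { g_last = 10 * (a + b); }
};

struct B {
    void f2(int a, int b) { g_last = 100 * (a + b); }
    void f2const(int a, int b) const { g_last = 1000 * (a + b); }
};

// An arbitrary functor: a class with operator().
struct F {
    void operator()(int a, int b) const { g_last = -(a + b); }
};

// Calls every kind of callable entity once and sums the recorded results.
int call_all() {
    void (*pf2)(int, int) = &f2;           // pointer to a non-member function
    void (A::*paf2)(int, int) = &A::f2;    // pointer to a member function of A
    void (B::*pbf2)(int, int) = &B::f2;    // pointer to a member function of B
    F f;                                   // functor instance
    A a;
    B b;

    int sum = 0;
    pf2(1, 2);        sum += g_last;       // plain call syntax
    (a.*paf2)(1, 2);  sum += g_last;       // member pointers need an object
    (b.*pbf2)(1, 2);  sum += g_last;
    f(1, 2);          sum += g_last;       // operator() call
    return sum;
}
```

Note that the four entities have four unrelated types and two different call syntaxes, which is exactly what makes treating them uniformly awkward.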

What if we want to treat all of them in some generic way, e.g. push them in some "container" and make all the above calls at once via a single method call of such a "container"? The first way to implement this is to develop a special "container" class. Another one is to design a universal adapter class capable of storing and making calls to all the callable entities above. Then these adapter class instances can be stored in the std containers as usual. The latter way is more generic because this adapter class can be used in other applications. It is the class that is called generalized functor. The following pseudocode introduces this:
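As a rough illustration of the intended usage, here is a sketch that uses std::function as a stand-in for the article's generalized functor. This is an assumption: the article's own Functor takes an object pointer and a member function pointer directly in its ctor, which std::function does not, so std::bind is used instead. The numbered comments (0) to (8) are a guess at the roles of the listing lines that the text refers to, and A, F, f2 and demo are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

static int g_total = 0;

void f2(int a, int b) { g_total += a + b; }
struct A { void f2(int a, int b) { g_total += 10 * (a + b); } };
struct F { void operator()(int a, int b) const { g_total += 100 * (a + b); } };

// (0) one functor type for every callable taking (int, int) and returning void;
//     std::function is only a stand-in for the article's Functor here
typedef std::function<void(int, int)> Functor2;

int demo() {
    using namespace std::placeholders;
    A a;
    F f;

    Functor2 fun1(&f2);                              // (1) non-member function
    Functor2 fun2(std::bind(&A::f2, &a, _1, _2));    // (2) member function
    Functor2 fun3(f);                                // (3) arbitrary functor

    g_total = 0;
    fun1(1, 2); fun2(1, 2); fun3(1, 2);              // (4) type-safe calls

    // fun1(1);              // (5) would not compile: wrong number of arguments
    // int r = fun1(1, 2);   // (5) would not compile: the result type is void

    Functor2 fun;                                    // (6) default ctor
    fun = fun1;                                      // (7) assignment
    Functor2 fun4(fun2);                             // (8) copy ctor

    // All functors share one type, so a standard container can hold them.
    std::vector<Functor2> all;
    all.push_back(fun);
    all.push_back(fun3);
    all.push_back(fun4);
    for (std::size_t i = 0; i < all.size(); ++i) all[i](1, 2);

    return g_total;
}
```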

The above example is simple but has several important issues. First, it shows an application of generalized functors and the power together with the convenience that is achieved. Particularly, the fact that all fun, fun1-fun4 are of the same type (Functor2) gives a simple way to solve the above problem of "container" for different callable entities. Second, this example implicitly contains the requirements for a possible generalized functor implementation. We'll frequently refer to them later so let's emphasize:

R1.

Lines 1-3 imply a universal support for all kinds of callable entities and demand corresponding ctors.

R2.

Lines 4 and 5 imply the function call of the underlying callable entity instance to be type safe. This means that if the lines (5) are uncommented, they should not be compilable since they try to do a call either with inappropriate number or types of arguments or with inappropriate return type.

R3.

Lines 6-8 imply full support of value semantics, i.e. the need for proper default ctor, operator=, copy ctor and dtor. This is important and very convenient, taking into account C++ forbids assignment and conversions for raw callable entities of different kinds.

There are additional issues touched at line 0. They concern some other important implementation details:

D1.

The above listing shows an example of a functor for callable entities whose function-call operator () takes two ints as parameters and returns void. Other parameter and return types should be supported too, of course, which raises the question of how to parameterize a generalized functor template.

D2.

But what about different numbers of arguments of the underlying function-call operator ()? Of course, this should be supported; the problem is how to implement it as clearly as possible. This is not easy in C++, as will be shown later.

Known implementations

The two known implementations of the generalized functors are provided by Loki [2] and boost [3] libraries (the latter is proposed for C++ standard [4]). Providing full generalized functor semantics and functionality, they are completely different in internal implementation, from the concepts to the resulting complexity. Of course, there are other implementations as well. Going through the mentioned R1, R2, R3, D1 and D2 issues, we'll build our own one.

Let's consider D1 issues now. There are two main ways to parameterize the generalized functor template. The first uses function types, and the second uses a pair of return type and typelist containing the types of all the arguments of the corresponding function-call operator ():
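A minimal sketch of the two parameterization styles, assuming a Loki-like typelist. The names FunctorA, FunctorB, Length and TYPELIST_2, and the three-type limit of CreateTL, are hypothetical and chosen only to keep the example short.

```cpp
#include <cassert>

// A Loki-style typelist: a head type plus a tail list, terminated by NullType.
struct NullType {};
template <class H, class T> struct TypeList { typedef H Head; typedef T Tail; };

// Macro form for two types (hypothetical counterpart of the TYPELIST macro).
#define TYPELIST_2(T1, T2) TypeList<T1, TypeList<T2, NullType> >

// Meta-template form; limited to three types in this sketch.
template <class T1 = NullType, class T2 = NullType, class T3 = NullType>
struct CreateTL {
    typedef TypeList<T1, typename CreateTL<T2, T3>::Type> Type;
};
template <> struct CreateTL<NullType, NullType, NullType> { typedef NullType Type; };

// Counts the types in a typelist.
template <class TL> struct Length;
template <> struct Length<NullType> { enum { value = 0 }; };
template <class H, class T>
struct Length<TypeList<H, T> > { enum { value = 1 + Length<T>::value }; };

// (a) parameterized by a function type
template <class Signature> struct FunctorA;
template <class R, class P1, class P2>
struct FunctorA<R(P1, P2)> { typedef R ResultType; enum { arity = 2 }; };

// (b) parameterized by a return type and a typelist of argument types
template <class R, class TL>
struct FunctorB { typedef R ResultType; typedef TL Parms; enum { arity = Length<TL>::value }; };

typedef FunctorA<void(int, int)>                    Functor2a;   // case (a)
typedef FunctorB<void, TYPELIST_2(int, int)>        Functor2b;   // case (b), macro
typedef FunctorB<void, CreateTL<int, int>::Type>    Functor2c;   // case (b), meta-template
```

In this sketch Functor2b and Functor2c name the same type, since both spellings produce the same typelist.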

Note: In case (b), TYPELIST and CreateTL are a macro [1] and a meta-template respectively, used to generate typelists.

Case (a) (used in boost) looks more elegant, but has several drawbacks. Functions with the same signature can have different modifiers. Cv-qualifiers (const and volatile) are the first important examples. They can be applied only to member functions, thus:

typedef Functor<void (*)(int, int) const> Functor2;

declaration will be rejected by the C++ compiler because the corresponding function type is invalid. So the question arises - how to create a generalized functor instance for the B::f2const function in this case? It is possible of course, but at the cost of unnecessary code complication that spoils the initial elegance. Other examples are the __cdecl, __stdcall, __fastcall (both for non-member and member functions) and implicit __thiscall (for member functions) modifiers. Yes, they are all non-standard extensions provided by compiler vendors, but some of the vendors are too important to ignore. The problem is that function types with different modifiers are different types, and the same can be said about the functors instantiated with them. Summing up, the function type is too low-level an entity to use for parameterizing such a high-level entity as a generalized functor.

Case (b) is free from the above disadvantages, though it looks a bit less elegant. But we can fight for elegance, for example by providing functor creation functions. With their help we can bring the client code to the following nice syntax:

To implement MakeFunctor functions we need mapping between the function type and its return type and the typelist containing types of all its arguments, which can be defined using a common traits idiom:
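A sketch of such a traits template, restricted to two-argument functions for brevity. The name FunTraits and its member typedefs are assumptions; a MakeFunctor function could use them to name the Functor<ResultType, Parms> type it returns.

```cpp
#include <cassert>
#include <type_traits>

struct NullType {};
template <class H, class T> struct TypeList { typedef H Head; typedef T Tail; };

// Primary template is declared only; each supported function kind is a specialization.
template <class F> struct FunTraits;

// Pointer to a non-member function.
template <class R, class P1, class P2>
struct FunTraits<R (*)(P1, P2)> {
    typedef R ResultType;
    typedef TypeList<P1, TypeList<P2, NullType> > Parms;
};

// Pointer to a non-const member function.
template <class C, class R, class P1, class P2>
struct FunTraits<R (C::*)(P1, P2)> {
    typedef R ResultType;
    typedef TypeList<P1, TypeList<P2, NullType> > Parms;
};

// Pointer to a const member function: the cv-qualifier is absorbed here,
// so it never reaches the functor's template arguments.
template <class C, class R, class P1, class P2>
struct FunTraits<R (C::*)(P1, P2) const> {
    typedef R ResultType;
    typedef TypeList<P1, TypeList<P2, NullType> > Parms;
};

void f2(int, int) {}
struct B {
    void f2(int, int) {}
    void f2const(int, int) const {}
};

// All three kinds of function map to the same (return type, typelist) pair.
bool same_parms() {
    typedef FunTraits<decltype(&f2)>::Parms         FreeParms;
    typedef FunTraits<decltype(&B::f2)>::Parms      MemParms;
    typedef FunTraits<decltype(&B::f2const)>::Parms ConstMemParms;
    return std::is_same<FreeParms, MemParms>::value
        && std::is_same<MemParms, ConstMemParms>::value
        && std::is_same<FunTraits<decltype(&B::f2const)>::ResultType, void>::value;
}
```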

Let's move to requirement R1 considerations. You can readily observe that a generalized functor class template is parameterized by the types completely unrelated to the types of the objects that are passed to its ctors. This means that those ctors should be defined as member templates:

Only the ctors know the callable entity type and it gets lost after the ctors exit. But some operations still require the callable entity's type knowledge, e.g. functor copying, assignment and destruction operations. The operator() should also forward the call to the underlying callable entity instance according to its actual type. So template ctors should store the callable entity type information in some "universal form", which will allow treating them later in some polymorphic way. The most straightforward solution is as follows:
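A simplified sketch of this design, fixed at two arguments to stay short. The names FunImplBase, FunctorImpl, MemberFnImpl, call_ and pimpl_ follow the text; clone_, the demo function and the exact signatures are assumptions and not necessarily the article's code.

```cpp
#include <cassert>

template <class R, class P1, class P2>
class Functor {
    // Abstract base: every operation that depends on the callable's real type.
    struct FunImplBase {
        virtual ~FunImplBase() {}
        virtual R call_(P1, P2) = 0;
        virtual FunImplBase* clone_() const = 0;
    };

    // Non-member functions and arbitrary functors.
    template <class F>
    struct FunctorImpl : FunImplBase {
        F fun_;
        explicit FunctorImpl(const F& f) : fun_(f) {}
        R call_(P1 p1, P2 p2) { return fun_(p1, p2); }
        FunImplBase* clone_() const { return new FunctorImpl(*this); }
    };

    // Member functions: object pointer plus pointer-to-member.
    template <class P, class MF>
    struct MemberFnImpl : FunImplBase {
        P obj_;
        MF memfn_;
        MemberFnImpl(const P& obj, MF memfn) : obj_(obj), memfn_(memfn) {}
        R call_(P1 p1, P2 p2) { return ((*obj_).*memfn_)(p1, p2); }
        FunImplBase* clone_() const { return new MemberFnImpl(*this); }
    };

    FunImplBase* pimpl_;

public:
    Functor() : pimpl_(0) {}

    // Member template ctors: the only place where the callable's type is known.
    template <class F>
    Functor(F f) : pimpl_(new FunctorImpl<F>(f)) {}
    template <class P, class MF>
    Functor(const P& obj, MF memfn) : pimpl_(new MemberFnImpl<P, MF>(obj, memfn)) {}

    Functor(const Functor& o) : pimpl_(o.pimpl_ ? o.pimpl_->clone_() : 0) {}
    Functor& operator=(const Functor& o) {
        if (this != &o) {
            FunImplBase* p = o.pimpl_ ? o.pimpl_->clone_() : 0;
            delete pimpl_;
            pimpl_ = p;
        }
        return *this;
    }
    ~Functor() { delete pimpl_; }

    // Forwards to the stored callable; calling an empty functor is undefined here.
    R operator()(P1 p1, P2 p2) { return pimpl_->call_(p1, p2); }
};

static int g_total = 0;
void f2(int a, int b) { g_total += a + b; }
struct A { void f2(int a, int b) { g_total += 10 * (a + b); } };
struct F { void operator()(int a, int b) const { g_total += 100 * (a + b); } };

int demo() {
    typedef Functor<void, int, int> Functor2;
    A a;
    F f;
    Functor2 fun1(&f2);          // non-member function
    Functor2 fun2(&a, &A::f2);   // member function
    Functor2 fun3(f);            // arbitrary functor
    Functor2 fun4(fun2);         // copy ctor
    Functor2 fun;                // default ctor
    fun = fun3;                  // assignment
    g_total = 0;
    fun1(1, 2); fun2(1, 2); fun3(1, 2); fun4(1, 2); fun(1, 2);
    return g_total;
}
```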

In the above example, an abstract base class FunImplBase defines all the operations depending on the different callable entity types. FunctorImpl and MemberFnImpl are its concrete successors implementing those operations for non-member functions together with arbitrary functors and member functions respectively. Generalized functor template itself incorporates a pointer to FunImplBase, initializes it in the corresponding ctors and uses it where it is needed. It should be noted that the above example conceptually resembles Loki's functor template.

Let's address the R2 and D2 issues now. Using typelists you can pack an arbitrary number of different types into one type, which is great for many generic algorithms. But the maximum number of types has to be fixed in the typelist design (and is usually chosen high enough to fit all uses). Function parameters pose a similar problem: C++ has no facility to manipulate the number of function parameters, so you can't treat them in a generic way. In our case it would be nice to take some TypeList<T1,T2,T3,...Tn> type of arbitrary length and generate the corresponding function, e.g. operator()(T1,T2,T3,...Tn). This would completely solve the problem of a type-safe function call. But as we already said, this can't be done in C++ and we have to find roundabout ways. There are two of them. In the first, we declare a Functor class template and then define partial specializations for every possible number of types in the operator():
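A sketch of this first approach for zero to two arguments. To keep the focus on arity, each specialization stores only a plain function pointer, which is a simplifying assumption; the names and the demo function are hypothetical.

```cpp
#include <cassert>

struct NullType {};
template <class H, class T> struct TypeList {};

// Primary template is declared only; every arity gets its own specialization.
template <class R, class TL> class Functor;

// No arguments.
template <class R>
class Functor<R, NullType> {
    R (*fn_)();
public:
    explicit Functor(R (*fn)()) : fn_(fn) {}
    R operator()() const { return fn_(); }
};

// One argument.
template <class R, class P1>
class Functor<R, TypeList<P1, NullType> > {
    R (*fn_)(P1);
public:
    explicit Functor(R (*fn)(P1)) : fn_(fn) {}
    R operator()(P1 p1) const { return fn_(p1); }
};

// Two arguments.
template <class R, class P1, class P2>
class Functor<R, TypeList<P1, TypeList<P2, NullType> > > {
    R (*fn_)(P1, P2);
public:
    explicit Functor(R (*fn)(P1, P2)) : fn_(fn) {}
    R operator()(P1 p1, P2 p2) const { return fn_(p1, p2); }
};

int seven() { return 7; }
int twice(int x) { return 2 * x; }
int add(int a, int b) { return a + b; }

int demo() {
    Functor<int, NullType> f0(&seven);
    Functor<int, TypeList<int, NullType> > f1(&twice);
    Functor<int, TypeList<int, TypeList<int, NullType> > > f2(&add);
    // f2(1);   // would not compile: this specialization has no one-argument operator()
    return f0() + f1(5) + f2(3, 4);
}
```

Even in this reduced form, the storage and ctor code is repeated in every specialization; with full value semantics the repetition would be much larger.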

But in this case each specialization also contains all the code that is independent of the operator() argument count. One can try to factor this code out, for example into base classes (as boost does), but the result does not seem to be good.

In the second way we define a single front-end functor template and simply overload all possible operator():

This solves the problem but introduces some potential bugs. If we try to use the resulting generalized functor template as in the opening code listings and uncomment the commented line 5, the compiler will gaily compile it, guaranteeing a runtime crash! The solution is to take the FunImplBase, FunctorImpl and MemberFnImpl declarations out of the Functor template and define specializations for every possible number of types in the call_ functions. See an example for FunImplBase:
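A sketch of this arrangement for zero to two arguments: a single front-end Functor with overloaded operator(), plus helper classes specialized per arity. MemberFnImpl and copying are left out to keep it short, and TypeAt (which yields NullType for an out-of-range index) is an assumed helper in the spirit of Loki's TypeAtNonStrict.

```cpp
#include <cassert>

struct NullType {};
template <class H, class T> struct TypeList {};

// i-th type of a typelist, or NullType when i is out of range.
template <class TL, unsigned i> struct TypeAt { typedef NullType Result; };
template <class H, class T> struct TypeAt<TypeList<H, T>, 0> { typedef H Result; };
template <class H, class T, unsigned i>
struct TypeAt<TypeList<H, T>, i> { typedef typename TypeAt<T, i - 1>::Result Result; };

// Each FunImplBase specialization declares exactly one call_ function.
template <class R, class TL> struct FunImplBase;
template <class R>
struct FunImplBase<R, NullType> {
    virtual ~FunImplBase() {}
    virtual R call_() = 0;
};
template <class R, class P1>
struct FunImplBase<R, TypeList<P1, NullType> > {
    virtual ~FunImplBase() {}
    virtual R call_(P1) = 0;
};
template <class R, class P1, class P2>
struct FunImplBase<R, TypeList<P1, TypeList<P2, NullType> > > {
    virtual ~FunImplBase() {}
    virtual R call_(P1, P2) = 0;
};

// Concrete implementations, again one specialization per arity.
template <class R, class TL, class F> struct FunctorImpl;
template <class R, class F>
struct FunctorImpl<R, NullType, F> : FunImplBase<R, NullType> {
    F fun_;
    explicit FunctorImpl(F f) : fun_(f) {}
    R call_() { return fun_(); }
};
template <class R, class P1, class F>
struct FunctorImpl<R, TypeList<P1, NullType>, F> : FunImplBase<R, TypeList<P1, NullType> > {
    F fun_;
    explicit FunctorImpl(F f) : fun_(f) {}
    R call_(P1 p1) { return fun_(p1); }
};
template <class R, class P1, class P2, class F>
struct FunctorImpl<R, TypeList<P1, TypeList<P2, NullType> >, F>
    : FunImplBase<R, TypeList<P1, TypeList<P2, NullType> > > {
    F fun_;
    explicit FunctorImpl(F f) : fun_(f) {}
    R call_(P1 p1, P2 p2) { return fun_(p1, p2); }
};

// Single front-end: all operator() overloads are declared, but a body is only
// instantiated when called, and then the matching call_ must exist.
template <class R, class TL>
class Functor {
    typedef typename TypeAt<TL, 0>::Result Parm1;
    typedef typename TypeAt<TL, 1>::Result Parm2;
    FunImplBase<R, TL>* pimpl_;
    Functor(const Functor&);              // copying omitted in this sketch
    Functor& operator=(const Functor&);
public:
    template <class F>
    explicit Functor(F f) : pimpl_(new FunctorImpl<R, TL, F>(f)) {}
    ~Functor() { delete pimpl_; }
    R operator()() { return pimpl_->call_(); }
    R operator()(Parm1 p1) { return pimpl_->call_(p1); }
    R operator()(Parm1 p1, Parm2 p2) { return pimpl_->call_(p1, p2); }
};

int seven() { return 7; }
int add(int a, int b) { return a + b; }

int demo() {
    Functor<int, NullType> f0(&seven);
    Functor<int, TypeList<int, TypeList<int, NullType> > > f2(&add);
    // f2(1);   // would not compile: the two-argument FunImplBase has no call_(int)
    return f0() + f2(3, 4);
}
```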

There are a number of specializations here, but each one contains only one suitable call_ function. So the code for line 5 in the opening code listing is not compilable now. Though the compiler can find a suitable operator() in the Functor template, it fails to find a suitable call_ function in the corresponding FunImplBase specialization. But as you may notice, we've got back to the necessity of partial specializations - not for the Functor template itself but for some helper classes. This is the bad news. The good news, and the more important one, is that the specializations contain as little unnecessary code (independent of the number of arguments of the call_ function) as possible. Loki's implementation follows this way and achieves very good code factoring.

We've considered how generalized functor requirements and implementation detail issues (R1, R2, D1 and D2) can be solved and how they are solved in the existing libraries. We'll leave the requirement R3 issues untouched since they do not introduce any new ideas and are rather easy to implement.

Going forward

In the previous section, we discussed the generalized functor requirements, design problems and possible solutions. We've also arrived at the basic implementation scheme commonly used in the existing libraries. In this section, we'll investigate and analyze its disadvantages and try to remove them.

One of the disadvantages that can be seen in the previous section's examples is the need for heap allocation of FunctorImpl and MemberFnImpl instances. This is wasteful, both in speed and in memory consumption (because all general-purpose allocators allocate memory with some granularity, wasting some space). Keeping this in mind, Loki provides an optimized custom small-object allocator for its functor template. The Boost implementation contains a trick that avoids heap allocation for generalized functors wrapping non-member functions, but falls back to heap allocation in other cases.

How about avoiding heap allocation in all cases? A possible solution is to incorporate a buffer of some fixed (but customizable) size into the functor class template itself and use it as a pool for memory allocation. This idea exploits the fact that for all kinds of callable entities, only a few bytes of memory are needed to store the internal data. Thus for a functor wrapping a static function only the function pointer (4 bytes on 32-bit systems) has to be stored. For a functor wrapping a member function, a pointer to the target object instance and a pointer-to-member for the function to be called are needed. The former takes 4 bytes again, while the size of the latter can vary from 4 to 20 bytes (see an excellent study here [5]). The internal data size for an arbitrary functor can't be predicted beforehand, but we know from experience that functors are usually designed to have only a few data members or none at all.

Functors designed in the above way waste some space in the case of the least memory-consuming functors, those for static functions. But, firstly, heap allocation also wastes some memory due to the allocation granularity. And, secondly, the profit is admirable - all functor operations involved in copy-by-value support (ctors, dtor, operator=) work with a giddy speed. These operations are intensively used in important generalized functor applications based on functor chaining and binding.

For the remaining minority of cases, when sizeof(T) > size, Typeless::init() should switch to the normal allocator. The simplest way to do this is the following:
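A sketch of such a buffer, under stated assumptions: Typeless, init and Select are named in the text, while ByValue, NewAlloc, Storage, get, destroy and buffer_ are hypothetical names. The alignas specifier (C++11) stands in for the alignment calculations mentioned in the note below, and strictly conforming C++17 code would additionally pass the buffer pointer through std::launder.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Compile-time choice between two types.
template <bool flag, class T, class U> struct Select { typedef T Result; };
template <class T, class U> struct Select<false, T, U> { typedef U Result; };

template <std::size_t size>
struct Typeless {
    static_assert(size >= sizeof(void*), "buffer must at least hold a pointer");
    alignas(std::max_align_t) char buffer_[size];

    // Strategy 1: the value itself lives in the buffer.
    template <class T> struct ByValue {
        static T* init(Typeless& t, const T& v) { return new (t.buffer_) T(v); }
        static T* get(Typeless& t) { return reinterpret_cast<T*>(t.buffer_); }
        static void destroy(Typeless& t) { get(t)->~T(); }
    };

    // Strategy 2: the value lives on the heap, the buffer holds only a pointer to it.
    template <class T> struct NewAlloc {
        typedef T* Ptr;
        static T* init(Typeless& t, const T& v) {
            T* p = new T(v);
            new (t.buffer_) Ptr(p);
            return p;
        }
        static T* get(Typeless& t) { return *reinterpret_cast<Ptr*>(t.buffer_); }
        static void destroy(Typeless& t) { delete get(t); }
    };

    // Picks the strategy from the size of T.
    template <class T> struct Storage {
        typedef typename Select<(sizeof(T) <= size), ByValue<T>, NewAlloc<T> >::Result Type;
    };

    template <class T> T* init(const T& v) { return Storage<T>::Type::init(*this, v); }
    template <class T> T* get() { return Storage<T>::Type::get(*this); }
    template <class T> void destroy() { Storage<T>::Type::destroy(*this); }
};

struct Small { int x; };
struct Big { int data[64]; };

bool demo() {
    Typeless<16> val_small, val_big;
    Small s = { 42 };
    Big big = {};
    big.data[63] = 7;

    Small* ps = val_small.init(s);   // fits: constructed inside the buffer
    Big* pb = val_big.init(big);     // too large: allocated with new

    bool ok = static_cast<void*>(ps) == static_cast<void*>(val_small.buffer_)
           && static_cast<void*>(pb) != static_cast<void*>(val_big.buffer_)
           && val_small.get<Small>()->x == 42
           && val_big.get<Big>()->data[63] == 7;

    val_small.destroy<Small>();
    val_big.destroy<Big>();
    return ok;
}
```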

In the above code snippet, val_ is used either to store the value of T entirely, or to store only the pointer to the newly-allocated value. Note: sizeof(T)<=sizeof(Typeless) expression used in the Select meta-template is not enough and should be improved by adding alignment calculations (see [6] or [3]).

Digging in, we discover one more disadvantage. After introducing val_, the pimpl_ pointer becomes redundant! We "almost" always know its value - it is either equal to &val_ or stored inside val_ itself. So, if we could make that choice some other way, we could throw pimpl_ away. Remember that each instance of a class containing virtual functions (either its own or inherited ones) maintains an invisible pointer to the virtual method table, which directs virtual function calls towards the right function of the right class. If we could deal with it directly and move it out of the FunctorImpl and MemberFnImpl instances into the Functor template, we could make all the needed calls to the right FunctorImpl or MemberFnImpl functions and additionally save 4 bytes! Of course, we can't do this with the C++ virtual function mechanism directly. But we can simulate this mechanism. This is a known and rather powerful technique (see [6] or [7] for example). In our case it could be applied in the following way:
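A sketch of a hand-made function table, fixed at two arguments and restricted to in-place storage. FunctorImpl::Call taking the Functor reference follows the text; VTable, MemberFn, buf_, vptr_, the default buffer size and the rest are assumptions. Member functions are handled here by wrapping the object pointer and the pointer-to-member in a small helper functor instead of a separate MemberFnImpl.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <class R, class P1, class P2, std::size_t size = 4 * sizeof(void*)>
class Functor {
    // The hand-made "virtual table": one entry per type-dependent operation.
    struct VTable {
        R    (*call)(Functor&, P1, P2);
        void (*copy)(const Functor&, Functor&);
        void (*destroy)(Functor&);
    };

    // Static functions only; the callable itself lives in the functor's buffer.
    template <class F>
    struct FunctorImpl {
        static F& get(Functor& f) { return *reinterpret_cast<F*>(f.buf_); }
        static R Call(Functor& f, P1 p1, P2 p2) { return get(f)(p1, p2); }
        static void Copy(const Functor& src, Functor& dst) {
            new (dst.buf_) F(*reinterpret_cast<const F*>(src.buf_));
        }
        static void Destroy(Functor& f) { get(f).~F(); }
        static const VTable* table() {
            static const VTable t = { &Call, &Copy, &Destroy };
            return &t;
        }
    };

    // Bundles an object pointer with a pointer-to-member.
    template <class P, class MF>
    struct MemberFn {
        P obj;
        MF fn;
        R operator()(P1 p1, P2 p2) { return ((*obj).*fn)(p1, p2); }
    };

    const VTable* vptr_;                          // replaces pimpl_ and the hidden vptr
    alignas(std::max_align_t) char buf_[size];

    template <class F> void init(const F& f) {
        static_assert(sizeof(F) <= size, "callable does not fit the in-place buffer");
        static_assert(alignof(F) <= alignof(std::max_align_t), "over-aligned callable");
        new (buf_) F(f);
        vptr_ = FunctorImpl<F>::table();
    }

public:
    Functor() : vptr_(0) {}
    template <class F> Functor(F f) { init(f); }
    template <class P, class MF> Functor(P obj, MF fn) {
        MemberFn<P, MF> m = { obj, fn };
        init(m);
    }
    Functor(const Functor& o) : vptr_(o.vptr_) { if (vptr_) vptr_->copy(o, *this); }
    Functor& operator=(const Functor& o) {
        if (this != &o) {
            if (vptr_) vptr_->destroy(*this);
            vptr_ = o.vptr_;
            if (vptr_) vptr_->copy(o, *this);
        }
        return *this;
    }
    ~Functor() { if (vptr_) vptr_->destroy(*this); }

    // Calling an empty functor is undefined in this sketch.
    R operator()(P1 p1, P2 p2) { return vptr_->call(*this, p1, p2); }
};

static int g_total = 0;
void f2(int a, int b) { g_total += a + b; }
struct A { void f2(int a, int b) { g_total += 10 * (a + b); } };
struct F { void operator()(int a, int b) const { g_total += 100 * (a + b); } };

int demo() {
    typedef Functor<void, int, int> Functor2;
    A a;
    F f;
    Functor2 fun1(&f2), fun2(&a, &A::f2), fun3(f);
    Functor2 fun4(fun2);     // copy goes through the table's copy entry
    Functor2 fun;
    fun = fun3;              // so does assignment
    g_total = 0;
    fun1(1, 2); fun2(1, 2); fun3(1, 2); fun4(1, 2); fun(1, 2);
    return g_total;
}
```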

Note: This is not a plain translation of the previous example with virtual functions but an adapted one: e.g. the pointer to FunImplBase is thrown away and replaced by a pointer to a function table, and FunctorImpl::Call etc. accept a Functor instance reference as the first argument (not a FunctorImpl pointer, as they would in a plain translation). The result seems to be acceptable.

Let's switch to the R2 and D2 issues now. Notice that when FunctorImpl::Call and MemberFnImpl::Call are called from operator(), all its arguments always have to be copied. Because this is a call via a function pointer (as is a virtual function call, by the way), the compiler generally cannot inline it or optimize the copies away. So what if we pack the operator() arguments into a tuple and pass it to FunctorImpl::Call or MemberFnImpl::Call? The argument copying overhead will remain the same, but FunctorImpl and MemberFnImpl could become independent of the operator() argument count. It is not shown in the previous listings, but in addition to the Call functions, FunctorImplBase, FunctorImpl and MemberFnImpl should also contain functions for value semantics support. So their independence from the operator() argument count evidently improves the code:
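A sketch of the tuple-packing idea using modern stand-ins, all of which are assumptions: std::tuple replaces the article's typelist-generated tuple, variadic templates (C++11) replace the per-arity code, and std::apply (C++17) does the unpacking. A virtual call_ is used instead of a hand-made function table to keep the example short. It also includes an operator() overload that takes an already packed Parms reference.

```cpp
#include <cassert>
#include <tuple>

template <class R, class... Ps>
class Functor {
public:
    typedef std::tuple<Ps...> Parms;   // all call arguments packed into one object

private:
    // A single call_ signature, whatever the number of arguments.
    struct FunImplBase {
        virtual ~FunImplBase() {}
        virtual R call_(Parms& parms) = 0;
    };

    template <class F>
    struct FunctorImpl : FunImplBase {
        F fun_;
        explicit FunctorImpl(F f) : fun_(f) {}
        // std::apply unpacks the tuple into the callable's argument list.
        R call_(Parms& parms) override { return std::apply(fun_, parms); }
    };

    FunImplBase* pimpl_;

public:
    template <class F>
    explicit Functor(F f) : pimpl_(new FunctorImpl<F>(f)) {}
    Functor(const Functor&) = delete;              // copying omitted in this sketch
    Functor& operator=(const Functor&) = delete;
    ~Functor() { delete pimpl_; }

    // Ordinary call: pack the arguments once, then forward the pack.
    R operator()(Ps... ps) {
        Parms parms(ps...);
        return pimpl_->call_(parms);
    }

    // Call with an already packed argument set, which several functors can share.
    R operator()(Parms& parms) { return pimpl_->call_(parms); }
};

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

int demo() {
    Functor<int, int, int> f(&add);
    Functor<int, int, int> g(&mul);
    Functor<int, int, int>::Parms parms(3, 4);
    // One direct call, then the same packed parameters passed to two functors.
    return f(1, 2) + f(parms) + g(parms);
}
```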

Note: In the above listing, InstantiateH is a kind of Loki's GenScatterHierarchy<> template, used here as a tuple generation engine (see [1] for details).

Packing operator() arguments into the tuple and passing them in internal calls can lead to additional benefits that exceed the bounds of Functor implementation itself. E.g. we can add the following overloaded operator():

and use it to pass a parms reference between two or more functors. This can simplify the implementation and even improve the efficiency of functor chaining and binding support. We could even dream of named parameters in the Functor's operator() calls - a feature not supported by C++ for simple function calls.

Conclusions

We've considered generalized functor requirements and the problems and disadvantages of existing implementations. While trying to solve them we've introduced several ideas which, combined together, lead to an implementation that is rather different from the known ones and seems to be clearer and more efficient.

Source update (14-Dec-2005)

Generalized functors binding is added. Check out FunBind.h in the source files given with the article.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

but whatever I define in the .cpp file, I get an unresolved external link error, an undeclared identifier or missing template arguments. Can you help me? I have found hundreds of implementations like this in .h files, directly in the class definition, but is there any way at all to split this into a .h and a .cpp file?

Hi Ben,

Thank you for your interest in the article. I should note that I'm a Windows developer and so have little experience with gcc. I have tested the code with msvc++ 7.1 and 8 as well as with gcc 3.4.2 (mingw distribution). gcc 3.4.2 does not issue the errors you've mentioned. But yes, .template needs to be added in the reported lines (as you've done) to conform to the standard. The 'this statement has no effect' warning is harmless here; it could easily be removed by replacing

    FunImplBase* pimpl = val_.template init<t>(v); pimpl; // throw away pimpl, we don't need it in this implementation

with, say,

    /*FunImplBase* pimpl = */val_.template init<t>(v); // throw away pimpl, we don't need it in this implementation

I have just tested the test code with gcc 3.4.4 (mingw distribution), and after applying these changes it compiles cleanly with -Wall. Btw, the resulting executable does not issue an assertion failure like the one you've mentioned. I guess I need to fix the test itself (sigh...) and simply remove the floating point comparisons to make it more correct. You could quickly find out whether this is really the reason by replacing (in TestFunctors.h) all 'double' with, say, 'long' (and the corresponding floating point constants with integer ones as well). Thank you again, I will do tests with other compilers when I have time, and then update the code.

I know the feeling. I hope you do update it, as for me "first principles" was the most interesting part. Consider it a lesson - always test all the examples. As for unforgivable, I wouldn't go that far...

> With functor to member function a pointer to the target object instance and a pointer-to-member for a function to be called are needed. The former takes 4 bytes again, while the size of the latter can vary from 4 to 20 bytes (see an excellent study here [5]).

In this case, the maximum size for a functor will be 24 bytes [worst case, for non-static member functions of a class] and the minimum 4 bytes, for static member functions of a class and global functions.

Could you please explain why we opted for a buffer of 16 bytes as the default? Is it a trade-off? Shouldn't it be 4 bytes, since a lot of space will be wasted if I am using only static member functions and global functions?

What I have understood is that in the normal case, where we use static member functions and global functions, we have already allocated a buffer for storing them, since allocations on the heap are done on a granularity basis and hence waste more memory. If we are working with non-static member functions, we will use the normal allocation system, i.e. invoke operator new.

Thank you for the comment. And sorry, I somehow missed the email with your comment. I just happened to visit this article online and found it here... In case you are still interested in my answer, here it is.

Dinesh Ahuja wrote:

Could you please explain why we opted for a buffer of 16 bytes as the default? Is it a trade-off? Shouldn't it be 4 bytes, since a lot of space will be wasted if I am using only static member functions and global functions?

16 bytes is the size that fits pointers to the most commonly used kinds of member functions. So no heap allocation will occur for non-member functions, most arbitrary functors and most member functions. It is just my guess at the golden mean.

Yes, 16 bytes means that we waste a lot of space if we are using generalized functors for non-member functions only. But as I mentioned in the article, if we use the standard new to allocate space for the underlying pointer, then we waste some space anyway (the heap allocator invisibly wastes it for small objects). And the whole idea was not to reduce memory waste. It is to avoid heap allocation in as many cases as possible. So while still having comparable waste in space, we gain in speed. All operations for value semantics support (construction, copying etc.) are done on stack-allocated values, which results in the maximum possible speed for them.

>Internal data size for an arbitrary functor can't be predicted beforehand, but we know from our experience that functors are usually designed to have a few data members or without them at all.

This is a really interesting observation. My belief is that >90% of usage of functors is with simple member functions with no argument binding. But the small functor, simple binding case probably takes you up to 99% of practical cases.

> Boost implementation also contains some tricks that allow to refuse heap allocation for generalized functors for non-member and member functions, but still rolls down to heap allocation with generalized functors in case of arbitrary functors.

Are you sure about this? It is not true of the current version (1.32) of boost::function, although I haven't checked the latest CVS.

I'm not sure that it can be done without *some* reliance on undefined behaviour. Although it is possible to create member function pointers onto a char array using placement new, there are alignment issues on some processors, and even when it works, you are still relying on undefined behaviour. At least in every implementation that I've seen.

> Are you sure about this? It is not true of the current version (1.32) of boost::function, although I haven't checked the latest CVS.

Thank you for the comment. I've rechecked the boost sources and should say that it uses new allocation in the case of member functions too. There is no new allocation only in the case of non-member functions. I'll fix this in the article.

> I'm not sure that it can be done without *some* reliance on undefined behaviour. Although it is possible to create member function pointers onto a char array using placement new, there are alignment issues on some processors, and even when it works, you are still relying on undefined behaviour. At least in every implementation that I've seen.

Surely alignment issues should be considered, and I know there are well-done alignment calculation implementations (they can be found in boost or in Alexandrescu's articles). Sorry, I was too busy/lazy to add them and left this as a TODO. As to undefined behaviour - do you mean the undefined behaviour that may occur after certain pointer-to-function or pointer-to-member-function conversions?

The internal char buffer will conform to C++ alignment constraints (thus it is, I think, not undefined behavior anymore) and any function pointer whose size is less than or equal to the size of the member function pointer of multiple inheritance class will be stored into the internal char buffer instead of using the expensive heap allocation. Only member function pointer of the virtual inheritance class or incomplete class will be stored on heap memory.

Yes, when I wrote about the alignment issue I too thought about the sort of union trickery you mentioned. Note, though, that the proposed generalized functor template could also internally hold a functor object of an arbitrary class. So in the general case we need max_align not only for member function pointers but for any type.

BTW maybe the adapted Alexandrescu's approach would help too (e.g. see notes at the end of GOTW#85).

As a result, the implementation could be made fully standard-conforming and portable in the case of member function pointers, and "nearly 100%" standard-conforming and portable in the case of arbitrary functors.

It's a good article, but... As I understand it, the underlying idea is to have Delphi-like delegates. But all the solutions I saw before, and this one too, are too complex to be the standard way. I guess there must be compiler support for this to become a really common way of doing callbacks in C++.

I have no definite opinion on the need for compiler (language) support for delegates in C++. I'm not much interested in this very problem, actually. Yes, I used implement-a-delegate-like code as an introductory example in the article, but only because this is probably the simplest and easiest to understand application of generalized functors. The underlying idea is much wider. See [1] for very interesting examples of generalized functor applications. Imagine functors used to make deferred, asynchronous or remote calls, or calls rebound to another parameter set. All of this can be done without generalized functors, of course. But generalized functors may introduce many important additional benefits. They raise the level of abstraction you deal with (without compromising efficiency in most cases). They are ready for various generic programming techniques. They are very handy and can easily serve as a glue for _all_ function-call-related stuff. And as a result they help to produce better code, in all the senses programmers usually put into those two words. As to complexity of implementation, it is, oh really, too complex a thing to discuss. Do you think std::map is complex, for example?

> I have no definite opinion on the need of compiler (language) support for delegates in C++. I'm not interested much in this very problem actually. Yes, i used ...

Sorry, it was my posting, I forgot to log in...

My experience is only in Windows programming, mostly COM, where the need for generalized functors is not as great as in the pure C++ world (a couple of times I made them by hand and it did not take long). But for C++ you're absolutely right, it'll be very convenient.