Flexible C++ #12: Imperfect enums, Part 2: Forward Declarations

By Matthew Wilson, May 01, 2005

This second part of the Imperfect Enums series looks at the issue of the forward declaration of enumerations. Why would one ever want to do that, I hear you cry? Certainly it's not a common need. But I have encountered situations where it's required, one of which we'll discuss later.

Notwithstanding any requirements to do so, the language does not allow the forward declaration of enumerations. Why is that so? Ploughing through the newsgroups turns up three reasons for the illegality of the forward declaration of enumerations:

The size of an enumeration depends on the range of values it represents, and not just on the size of (one of) the ambient architecture's type(s).

The use of forward declaration was put into the (C) language to cater for structures that may contain pointers to the same type.

No-one thought about it at the time, and no-one has subsequently deemed it worthwhile.

The usual argument for forward declaration of enumerations is physical decoupling. As the language has matured, and been used for larger and larger projects, this issue has gained greater prominence. (See John Lakos' seminal work [1] for more information on physical coupling than you could shake a stick at; even this classic work, however, fails to offer a cogent and forthright tactic for dealing with enumeration coupling. It discusses using static/const members to replace class implementation constants, and discusses, though does not wholeheartedly recommend, the use of integral types instead of enumerations.)

One common scenario where enumerations are both desirable and undesirable is as a return code. We may choose to define our return code type, RC, as an enumeration.
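A minimal sketch of such a return code type follows. The enumerator names and the open_resource() function are assumptions made for illustration; only the type name RC and the RC_ prefix come from the discussion here.

```cpp
// Hypothetical return codes: only the name RC and the RC_ prefix are
// taken from the text; the enumerators are illustrative.
enum RC
{
    RC_SUCCESS = 0,     // zeroth element given 0 explicitly, for readability
    RC_OUT_OF_MEMORY,   // 1, assigned by the compiler
    RC_INVALID_ARG,     // 2
    RC_NOT_FOUND        // 3
};

// Hypothetical function, to show the type in use
inline RC open_resource(char const* name)
{
    if(0 == name || '\0' == *name)
    {
        return RC_INVALID_ARG;
    }
    return RC_SUCCESS;
    // return 2;  // would not compile in C++: int does not convert to RC
}
```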

(This enumeration is not namespaced (see Part 1 [2]), perhaps because it's intended to be used from both C and C++, and so uses the RC_ prefix for symbol disambiguation.)

The advantage of defining RC as an enumeration is threefold. First, we get type-safety in the assignment to instances of RC. Second, we get uniquely defined return code values by default, so long as no-one gets the bright idea of applying a value to any but the zeroth element. (It's common practice to explicitly give the zeroth element the value 0 to aid readability, even though the compiler would do so automatically.) The third advantage is more prosaic: Integrated development environments are more likely to render you a human-readable symbol rather than an integral value in the debugger.

But there are two disadvantages to using enumerations. First, the order of the return codes thus defined may never be changed. If some order-obsessed maintenance programmer decides to move some around, or prune some now-defunct values, all manner of nasties will occur if two link-units compiled at different times are brought into play together. As discussed in Part 1 [2], the rule is that you should never remove or change the order of extant items. How your development team defines extant in this case may vary, but at a minimum it should include values that have been built into released components.
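To see what goes wrong, consider a sketch in which two builds of the same header are modelled as two namespaces, v1 and v2 (both hypothetical, as are the enumerator names and the receive_from_old_component() function). A value produced by a component built against the old header is interpreted by one built against the new header:

```cpp
// Header as shipped in the released component
namespace v1
{
    enum RC { RC_SUCCESS = 0, RC_OUT_OF_MEMORY, RC_INVALID_ARG, RC_NOT_FOUND };
}

// Same header after a maintainer prunes the "defunct" RC_OUT_OF_MEMORY
namespace v2
{
    enum RC { RC_SUCCESS = 0, RC_INVALID_ARG, RC_NOT_FOUND };
}

// Models a return code crossing a link-unit boundary: only the integral
// value travels, so the receiver interprets it with its own definition.
inline v2::RC receive_from_old_component(v1::RC rc)
{
    return static_cast<v2::RC>(static_cast<int>(rc));
}
```

Here v1::RC_INVALID_ARG (2) arrives as v2::RC_NOT_FOUND (also 2), and v1::RC_OUT_OF_MEMORY (1) arrives as v2::RC_INVALID_ARG.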

The second objection to using enumerations for return codes is that it introduces physical coupling, and a lot of it. Consider a common development scenario, whereby the code of different components in a given product suite share core library functionality, including a set of common return-codes and their manipulating functions. Naturally, as the components evolve, they will require the introduction of new return codes. As long as ordering is not disrupted that's all fine and proper, but it does mean that logically independent changes result in physically dependent rebuild requirements.

The converse option for return codes is, of course, to use an integral type, and to define the values as constants (#defines in C, #defines/constants in C++). The advantages and disadvantages of this approach are the mirror image of the enumerations. First, the coupling, though it does not go away entirely, can be much reduced. This is because it is feasible, and may even be preferable, to allot specific ranges to the subsystems, and split the definitions of the return code values across separate include files. Further, partitioning into sub-systems, and enforcing the extant rule on a group basis can similarly dilute the ordering and pruning restrictions. The disadvantages are that we lose type-safety (unless you use True Typedefs [3,4]), have to manually ensure that values are distinct (unless we auto-generate the return code headers from a database tool), and we're more likely to be looking at uninformative integral values in the debugger watch window.
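A rough sketch of the integral alternative, with ranges allotted per subsystem, might look like this. The header names, the constants, the range sizes, and is_net_rc() are all assumptions made for illustration.

```cpp
// core_rc.h (hypothetical): the common type and the core library's range
typedef int RC;

const RC RC_SUCCESS         = 0;
const RC RC_CORE_BASE       = 1;        // core codes:    1 - 999
const RC RC_OUT_OF_MEMORY   = RC_CORE_BASE + 0;
const RC RC_INVALID_ARG     = RC_CORE_BASE + 1;

// net_rc.h (hypothetical): a subsystem adds codes without touching core_rc.h
const RC RC_NET_BASE        = 1000;     // network codes: 1000 - 1999
const RC RC_NET_TIMEOUT     = RC_NET_BASE + 0;
const RC RC_NET_REFUSED     = RC_NET_BASE + 1;

// Identifies the owning subsystem from the range alone
inline bool is_net_rc(RC rc)
{
    return rc >= RC_NET_BASE && rc < RC_NET_BASE + 1000;
}
```

Adding a network code now forces a rebuild only of code that includes net_rc.h. The price is that a statement such as RC rc = 42; compiles without complaint.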

A Requirement for Forward Declaration

In recent work for a client, involving a disparate product suite, I made use of several enumerations. One was a return code type that was imaginatively called RC. With that one, we opted for enumeration rather than integer, and lived with the coupling because the overall scale of the project was not great in lines of code. (It was pretty large-scale in commercial terms, which is why we tended to err on the side of correctness.)

However, though we could live with the lack of forward declaration for enumerations for the RC type, we had cause to forward declare a different enumeration for another reason--to leverage re-use of a component without compromising on type-safety. Let me explain.

The product suite was a set of networking processes that carried out multiplexing, routing, arbitration and translation, tying together legacy systems operating different communications protocols (e.g. TCP/IP) and middleware (e.g. TibCo EMS). To deal with such cheek-blanching complexity, I designed a foundation message-passing architecture piggy-backed on top of the Adaptive Communications Environment (ACE) [5]. The messages were represented as a reference-counted interface, INotification, which could carry arbitrary data, and each was distinguished by an identifier. It was the type of this identifier, NotificationId, which we required to be an enumeration in order that we could maximise robustness.

As the product suite evolved, the notification mechanism was naturally migrated to a common arena within our source structure, such that each separate component--programs and dynamic libraries--could use it. However, the NotificationId values used by the different system components were disjoint sets, and we did not want to have all the physical coupling, and increase in complexity, involved were we to have all components share a common set of the union of all notifications. The idea of parts of the codebase needing to be "aware" of enumerators that are not part of their respective problem spaces was not attractive. We needed forward declaration of enumerations, such that the INotification interface and the notification infrastructure classes might be defined independently of the actual values of the notification ids, but still have all that type-safety.

Since my client's development team were using Visual C++ (6 and 7.1), we could have taken the cheap tactic, and used forward declaration of enumerations; Visual C++ is a member of the not-too-small group of compilers that support them as a proprietary extension. However, because we were following one of the central messages of my book, Imperfect C++ [4], which is to compile your sources with a variety of compilers in order to catch as many warnings and errors as possible, and because using proprietary extensions should raise the hackles of all good engineers, we wanted a standards-compliant solution.

The Forward Declaration

In Part 1 of this series [2], I advocated the use of a dedicated namespace for all non-member enumerations, in order to remove the possibility--an all too common likelihood in the real world--of symbol name clash between the enumeration values and other constructs (or macros!) in the compilation environment. Well, we can take further leverage from the namespace in order to provide forward declaration of enumerations.

If we're going to forward declare an enumeration legally, we're really going to have to have it masquerade inside something that can be legally forward declared: a class/struct/union. In this case, I chose a struct. Let's look at how the forward declaration is done first. Bearing in mind our lesson from Part 1 [2] to avoid leakage of enumerator names by wrapping in a namespace--the Namespace-Bound Enumeration technique--what we're aiming at is emulating the forward declaration of an enumeration NotificationId within a namespace of the same name.

Because the name NotificationId::NotificationId actually denotes a structure, NotificationId::NotificationId__type, passing our "enum" by value results in passing copies of that structure, but this is not an issue, because it's a very simple structure indeed, and compilers will optimise such things in their sleep. Using the macro form we can now neatly forward declare the NotificationId enumeration (or any other, for that matter), in a fully standards-compliant and portable form.
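In outline, the forward declaration and its macro form might look like the sketch below. FWD_ENUM_DECLARE, post_notification(), and the layout of INotification are assumptions made for illustration; they are not the STLSoft macros or the client's actual interface.

```cpp
// Hypothetical macro (the name is an assumption): forward declares the
// struct that stands in for the enumeration, and gives it its public name.
#define FWD_ENUM_DECLARE(name)          \
    namespace name                      \
    {                                   \
        struct name##__type;            \
        typedef name##__type name;      \
    }

// Expands to:
//   namespace NotificationId
//   {
//       struct NotificationId__type;
//       typedef NotificationId__type NotificationId;
//   }
FWD_ENUM_DECLARE(NotificationId)

// The infrastructure can now be written against the incomplete type.
// Declaring functions that take or return it by value is legal; only
// code that calls or defines them needs the full definition.
struct INotification
{
    virtual ~INotification() {}
    virtual NotificationId::NotificationId id() const = 0;
};

// Declaration only: hypothetical infrastructure entry point
void post_notification(NotificationId::NotificationId id, INotification* payload);
```

Nothing in this sketch mentions an enumerator, so the infrastructure headers carry no dependency on any component's notification values.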

Mechanism

So how does it all work? Well, to client code, the NotificationId::NotificationId needs to act like an enumeration, so it needs to be initialisable from any of the values [NotificationId::first, . . . , NotificationId::third]. This is achieved by giving NotificationId::NotificationId__type a conversion constructor from the actual enumeration NotificationId::NotificationId__enum. We also need to be able to use it in switch statements, which means it needs to have an implicit conversion operator to an integral or enumeration type. Since a switch condition may be of enumeration type, we can implicitly convert to NotificationId::NotificationId__enum itself, which is nice. With both conversions in place, client code can initialise a NotificationId::NotificationId from an enumerator and switch on it, just as with a real enumeration.
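The mechanism just described can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the original listing: the enumerator second, the member m_value, and the describe() function are hypothetical, while the other names are those used in the text.

```cpp
#include <string>

// Forward declaration: all that the infrastructure headers need to see
namespace NotificationId
{
    struct NotificationId__type;
    typedef NotificationId__type NotificationId;
}

// Full definition: seen only by the component that owns these values
namespace NotificationId
{
    enum NotificationId__enum
    {
        first,
        second,     // assumed name; the text names only first and third
        third
    };

    struct NotificationId__type
    {
        // Conversion constructor: initialisable from any enumerator
        NotificationId__type(NotificationId__enum value)
            : m_value(value)
        {}

        // Implicit conversion back to the real enumeration, which is
        // what makes switch statements and comparisons work
        operator NotificationId__enum() const
        {
            return m_value;
        }

    private:
        NotificationId__enum m_value;
    };
}

// Hypothetical client function: the wrapper is used like an enumeration
inline std::string describe(NotificationId::NotificationId id)
{
    switch(id)  // converts via operator NotificationId__enum()
    {
        case NotificationId::first:     return "first";
        case NotificationId::second:    return "second";
        case NotificationId::third:     return "third";
    }
    return "unknown";
}
```

A call such as describe(NotificationId::second) passes an enumerator where the wrapper is expected, so the conversion constructor runs at the call site, and the switch inside converts back again.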

Note that the enum-size objection to forward declaration of enumerations is now no longer an issue, since any user of any of the values of the enumeration will see the real underlying enumeration--NotificationId__enum. The forward declarative aspects are all encapsulated within the struct, and that's 100 percent legal. Sure, the full code's a bit verbose, but if you can bring yourself to use macros, then it's pretty straightforward. Naturally, you needn't forward declare enumerations that often, but now you can when you need to.

Cherry on the Cake

An added bonus with this is that you cannot declare uninitialized instances of the enumeration type, because the structure declares a conversion constructor and no default constructor. This addresses a potential source of bugs in C and C++ that's been there since their inception!

NotificationId::NotificationId nid; // Compile error!

STLSoft's Version

These macros are included in STLSoft (http://stlsoft.org/) from version 1.8.3 onwards, in the form of the STLSOFT_DECLARE_FWD_ENUM(), STLSOFT_DEFINE_FWD_ENUM_BEGIN(), and STLSOFT_DEFINE_FWD_ENUM_END() macros. Just #include , and you're away!

About the Author

Matthew Wilson is a software development consultant for Synesis Software, and creator of the STLSoft libraries. He is the author of Imperfect C++ (Addison-Wesley, 2004), and is currently working on his next two books, one of which is not about C++. Matthew can be contacted via http://imperfectcplusplus.com/.