Advanced Run Time Type Identification in C++ : Property Library

Abstract

Run-Time Type Identification (RTTI) has many forms and implementations in different programming languages. Standard C++ also contains RTTI support, but RTTI should provide much more than a unique identifier of the types available at run time. This article presents a C++ library (Oops) that provides advanced type information. It is scalable and easy to use, but its main advantage is that it introduces properties that give a detailed, hierarchical description of types, classes, and containers. The objects of an application can be investigated by using property iterators. Property iterators traverse the tree hierarchy of classes, structures, and containers, give information about the pointed objects, describe their type, and get and set their value. The Property Interface gives access to objects through their properties without knowing the actual type of the objects at compile time.

The Oops Library opens new dimensions in component-based software development. Designing classes with a Property Interface helps to make better and more logical software components, while the development, testing, and fine tuning of applications become quicker and easier with the help of property streams (persistency) and property editors (application generator). The library contains many interesting programming techniques and has a clear design. Its source code can be downloaded together with plenty of test and example programs that demonstrate its quality.

Introduction

The first part of this article (at http://www.codeguru.com/cpp_mfc/RTTI_Part1.html) describes the requirements of an advanced RTTI system that provides detailed type information at run time and fulfills the requirements of Persistency and Application Generators. The Oops Library of Robot Control Software Ltd. (http://www.rcs.hu) provides all required features and services. This library is discussed and presented in this article. The authors currently do not know of any other C++ library offering comparable services.

The Oops (Object Oriented Property Stream) Library has two parts, the Property Library and the Stream Library.

The Property Library provides the advanced RTTI system and the so-called Property Interface for investigating and accessing the type description and the object hierarchy of applications. Consequently, using the Oops Library in an application takes two steps. First, the RTTI description of every type, structure, and class has to be prepared. Oops provides tools, macros, and example programs to make this step as easy as possible. The second step is using the objects with the help of the Property Interface. The object hierarchy can be accessed and investigated with the iterators provided by the Property Interface.

The Stream Library makes it possible to save and load objects. It uses the Property Interface for iterating through the object hierarchy and getting or setting the values of properties. The third part of the article will discuss the Stream Library in detail, but after reading this article you probably will have some idea how it works.

A simple Application Generator program, called Property Editor, is available as well. It also uses the Property Interface to access the application's data structure and display it for the user. The graphical user interface provides a tree representation of the object hierarchy, where the User can view and modify the objects, add new objects, and delete existing objects.

The Basic Idea

In C++, every structure, class, template, and array or container is a new and different type. We need an automatic method for creating the RTTI description because writing it by hand is almost impossible. The most critical part of the RTTI description is how to convert a variable to another format used in object streams. The basic idea of the Oops Library is to introduce the well-known concept of properties into C++. If we can treat the significant members of objects as properties, and we are able to investigate the name, type, and value of these properties, we can easily write programs for printing, saving, or loading the objects. These programs will be independent of the source of the accessed objects, and we do not need to write conversion functions by hand.

First, we need some base types that we can use as elementary property types. The type description and the conversion functions of Base Types are implemented by hand, but fortunately the number of Base Types is limited, and the Oops Library provides a default type description for all built-in types and many other types such as strings. (The type description is called Type Info Record in Oops.)

Classes and structures should be handled in a much simpler way. The programmer developing the class has to state, somehow, that a given class has properties, and the members that are properties must be marked somehow. It should look something like this:
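A minimal sketch of such a declaration is given here. The macro names (PROP_TABLE_BEGIN, PROP, PROP_TABLE_END), the PropDesc structure, and the GetInt() helper are hypothetical and are not the macros of the Oops Library; only int members are handled, to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical descriptor: the name of a member and its offset in the object.
struct PropDesc {
    const char* name;
    std::size_t offset;
};

// Hypothetical macros (not the Oops macros) that build a static table
// inside the class that declares its properties.
#define PROP_TABLE_BEGIN(Class)                            \
    static const PropDesc* Props(std::size_t& count) {     \
        typedef Class Self;                                \
        static const PropDesc table[] = {
#define PROP(member) { #member, offsetof(Self, member) },
#define PROP_TABLE_END                                     \
        };                                                 \
        count = sizeof(table) / sizeof(table[0]);          \
        return table;                                      \
    }

struct Point {
    int x;
    int y;
    int cache;   // not listed below, so it is not a property

    PROP_TABLE_BEGIN(Point)
        PROP(x)
        PROP(y)
    PROP_TABLE_END
};

// Generic code: reads an int property by name without knowing the class.
inline bool GetInt(const void* obj, const PropDesc* table, std::size_t count,
                   const char* name, int& out) {
    assert(obj != nullptr && table != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(table[i].name, name) == 0) {
            std::memcpy(&out, static_cast<const char*>(obj) + table[i].offset,
                        sizeof(int));
            return true;
        }
    }
    return false;
}
```

The class author decides which members appear in the table, and GetInt() works for any class that exposes such a table.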

This solution requires minimal effort for creating the type description, and it is very flexible because the programmer has full control over which variables are available as properties. The solution provided by the Oops Library is slightly different, but this example represents the main point very well.

Life becomes more difficult if member functions, arrays, containers, enumeration, and compound types are also used as properties. A correct implementation of this idea makes no difference at the user level because in the Oops Library any member, even a standard container, can be used as a property.

Defining the properties is the first step only; to use them, we need a standard interface. The best choice is the well-known iterators. All information of objects can be accessed and investigated with Property Iterators. This means that a library knowing nothing about the application is able to access any property if the appropriate Property Iterator is provided. The third part of this article will present such a library for saving and loading objects using only the Property Iterators.

Parts of Property Library

The implementation of the Property Library contains the following parts:

Interface classes of Type Info records.

Implementation of Type Info classes for built-in and basic types.

The interface of Property Descriptor.

Implementation of Property Descriptor classes for different property types.

The interface of Property Iterator.

Implementation of different Property Iterators.

Common base class of all classes having Property Description.

The implementation cannot be described in full detail here, but the following chapters demonstrate how the most important features are implemented, and describe the critical parts and the programming techniques used to solve the problems.

The Implementation of Property Library

Type Info Record

The Type Info Record is the key element of the Property Library. Every type having an RTTI description must have its own Type Info Class, and every Type Info Class has one and only one instance, called the Type Info Record. The Type Info Record stores all type information of the given type, and the member functions of the Type Info Class provide some basic operations.

All Type Info Classes are descendants of a common base class, and their records are linked to each other to make it possible to iterate through all Type Info Records. The Type Info Class is the only part of the Property Library that knows anything about the actual types. All other parts must use the services of the Type Info Records.

The base class of all Type Info Class instances has the following services.

Constructor

It has one constructor only, getting the name of the type as its first parameter. The virtual destructor is defined, but it does nothing.

Identification

All types need a unique identifier. Oops Type Info Records provide both string and binary identifiers. The type name and the type identifier are returned by the TypeName() and TypeId() member functions. The type name is given in the constructor, whereas the type identifier has to be generated. For example, it can be computed from the relative address of the Type Info Record, but this is application dependent; therefore, other identifiers may be required to ensure that TypeIds are global identifiers.

Creating and Destroying Objects

There is a set of functions called CreateObj() for creating objects, and others called DestroyObj() for deleting objects. Every Type Info Class has to override these member functions and call the appropriate new and delete operators. The CreateObj() functions may call the default constructor or the copy constructor. The second case makes it possible to create objects that are copies of a prototype (reference) object, which can be useful when a large set of similar objects is handled.
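The following sketch illustrates the idea under stated assumptions: the function names follow the article, but the signatures, the simplified TypeInfo base class, and the TypeInfoOf template are hypothetical and are not taken from the Oops sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified Type Info base class.
class TypeInfo {
public:
    explicit TypeInfo(const char* name) : name_(name) {}
    virtual ~TypeInfo() {}
    const char* TypeName() const { return name_; }
    virtual void* CreateObj() const = 0;                       // default constructor
    virtual void* CreateObj(const void* prototype) const = 0;  // copy constructor
    virtual void  DestroyObj(void* obj) const = 0;
private:
    const char* name_;
};

// One Type Info Class per described type; only this class knows T.
template <class T>
class TypeInfoOf : public TypeInfo {
public:
    explicit TypeInfoOf(const char* name) : TypeInfo(name) {}
    void* CreateObj() const override { return new T(); }
    void* CreateObj(const void* prototype) const override {
        assert(prototype != nullptr);
        return new T(*static_cast<const T*>(prototype));
    }
    void DestroyObj(void* obj) const override { delete static_cast<T*>(obj); }
};
```

Code holding only a TypeInfo reference can create, copy, and destroy objects without ever naming T; the caller is responsible for passing pointers of the right type.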

Information of Type

C++ types belong to different groups such as base types (integer, float, string), container types (array, vector, list), compound types (classes and structures), or abstract types. The Type Info Class has some functions that provide information about the classification of the described type:

IsCntnr() returns true if the type is a compound type or a container; that is, it contains a list of properties. The function name means that the type is a container of properties.

IsPropClass() returns true if the type is a class or structure having a Property Interface. A class has a Property Interface if it is a descendant of the common base class, rProp_BaseClass_i, and implements its abstract functions. See details later.

IsAbstract() returns true if the type is an abstract class; that is, it has pure virtual functions. Abstract classes cannot be instantiated; therefore, their CreateObj() and DestroyObj() functions must not be called and cannot create an instance of the class.

IsSTLCntnr() returns the address of a special descriptor for containers, or NULL if the type is not a container type. The type descriptors of standard containers provide some additional services for adding new elements, deleting elements, and iterating through the elements. IsSTLCntnr() actually returns a pointer to a descendant interface class of the Type Info base class.

Getting and Setting Value

The GetValXxx() and SetValXxx() functions convert the value of the described type to and from a common format. Text and binary formats are supported (Xxx stands for Str or Bin). These functions get the address of the object to convert and append the text or binary representation to the given buffer. The caller of the functions must ensure that the pointers point to the correct place. These functions are called from Property Iterators; therefore, the user of the Property Library does not need to use them directly.

The text format ensures a platform-independent and human-readable representation, while the binary format is quick and efficient. The binary conversion generally has nothing to do, but in some cases (e.g. integer types) it performs a simple conversion to ensure a platform-independent format.
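As an illustration of the two formats, the sketch below converts a 32-bit unsigned integer. The function names echo GetValStr()/GetValBin()/SetValBin(), but the signatures, the buffer types, and the choice of big-endian byte order are assumptions of this sketch, not the documented behavior of the Oops Library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends the value in a fixed (big-endian) byte order, so the result does
// not depend on the byte order of the machine. The byte order is an assumption.
inline void GetValBin(std::uint32_t value, std::vector<unsigned char>& buffer) {
    for (int shift = 24; shift >= 0; shift -= 8)
        buffer.push_back(static_cast<unsigned char>((value >> shift) & 0xFFu));
}

// Reads the value back from the four bytes starting at pos.
inline std::uint32_t SetValBin(const std::vector<unsigned char>& buffer,
                               std::size_t pos) {
    assert(pos + 4 <= buffer.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value = (value << 8) | buffer[pos + i];
    return value;
}

// Appends the human-readable text representation to the buffer.
inline void GetValStr(std::uint32_t value, std::string& buffer) {
    buffer += std::to_string(value);
}
```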

There is a special case when a member variable is used to describe the number of elements in an array. The Property Descriptor of the array must know the actual size of the array; therefore, such a Property Descriptor contains a reference to the property storing the size information. In this case, the GetValSize() function is used to get the value of the property.

Void2PropBase()

This function is implemented only for compound types and it is used to get an rProp_BaseClass_i type pointer of the object.

The C++ language supports multiple inheritance and type casting. An up- or downcast operation on a pointer may change the address, because the different base-class sub-objects are placed at different addresses in the object's data block. A void pointer carries no type information, so the Property Library passes the address of the beginning of the object's data block as a void pointer. Such a pointer cannot be turned directly into a pointer to one of the sub-objects: the void pointer has to be cast to its real type first, and only then can it be cast to the requested base class.

The Void2PropBase() virtual function is implemented for all Type Info classes of compound types and it contains this type casting. First, it casts the void pointer to the type of the pointed object, and then it casts the result up to the rProp_BaseClass_i type pointer.
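The two-step cast can be sketched as follows. PropBase stands in for rProp_BaseClass_i, and the Shape and Circle classes and the free function Void2PropBase_Circle() are hypothetical; in the library the cast is a virtual member function of the Type Info Class.

```cpp
#include <cassert>

struct PropBase {                 // stands in for rProp_BaseClass_i
    virtual ~PropBase() {}
    virtual const char* ClassName() const = 0;
};

struct Shape {
    virtual ~Shape() {}
    int id = 0;
};

// PropBase is the second base class, so on common compilers its sub-object
// does not start at the beginning of the Circle object.
struct Circle : Shape, PropBase {
    const char* ClassName() const override { return "Circle"; }
};

// Restores the real type first, then casts up to the property base class.
// Casting the void pointer directly to PropBase* would skip the address
// adjustment and yield an invalid pointer.
inline PropBase* Void2PropBase_Circle(void* obj) {
    assert(obj != nullptr);
    Circle* real = static_cast<Circle*>(obj);
    return static_cast<PropBase*>(real);
}
```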

Getting the Type Info Record

Type Info records are created as static variables and their constructors link them to a list. The First() member function returns the address of the first Type Info record. The Next() member function can be used to iterate through the list of all types.

The GetTypeInfo() functions are used to find the Type Info record from the type name or type identifier. It is recommended to use these functions instead of searching with Next() because they return the address of the record directly without iterating through the list of Type Info records.
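A possible shape of this registration is sketched below. The member names First(), Next(), TypeName(), and GetTypeInfo() follow the article; the std::map index used for the direct lookup and all signatures are assumptions of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

class TypeInfo {
public:
    explicit TypeInfo(const char* name) : name_(name), next_(first_) {
        assert(name != nullptr);
        first_ = this;            // link the record into the global list
        Index()[name] = this;     // register it for lookup by name
    }
    const char* TypeName() const { return name_; }
    static const TypeInfo* First() { return first_; }
    const TypeInfo* Next() const { return next_; }

    // Finds the record through the index instead of walking the list.
    static const TypeInfo* GetTypeInfo(const std::string& name) {
        std::map<std::string, const TypeInfo*>::const_iterator it =
            Index().find(name);
        return it == Index().end() ? nullptr : it->second;
    }

private:
    // Function-local static: constructed on first use, so it is ready even
    // when records in other translation units are constructed earlier.
    static std::map<std::string, const TypeInfo*>& Index() {
        static std::map<std::string, const TypeInfo*> index;
        return index;
    }
    const char* name_;
    const TypeInfo* next_;
    static const TypeInfo* first_;
};

// Constant-initialized before any record constructor runs.
const TypeInfo* TypeInfo::first_ = nullptr;

// Type Info records are static variables; their constructors do the linking.
static TypeInfo g_intInfo("int");
static TypeInfo g_stringInfo("string");
```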

Compare Type Info Records

The type name and the type identifier must be unique; therefore, the equality operators compare the content of the Type Info records. It is also possible to compare the addresses of the Type Info records because every type has only one Type Info record. (In some cases, it may happen that a DLL has its own data segment, and therefore both the program and the DLL build their own Type Info records. In this case, the addresses cannot be compared.)

Containers

Handling containers requires some additional services. Containers are special objects that store a list of elements, such as arrays, lists, maps, vectors, and so forth. Arrays do not require any support from Type Info classes (because simple pointers can be used to iterate through their elements), but containers represent a higher abstraction level, and it is better if the Property Library does not know anything about their internal structure. The Property Library uses the standard interface of containers and handles them similarly to Base Types instead of handling them as compound types.

Standard iterators can be used for iterating through STL containers, but they cannot add new elements to the container. An iterator actually points to an element, and it does not know the container itself. The Property Iterator has to know, somehow, the type of the container and it somehow has to store an iterator pointing to an element of the container.

As it will be discussed later, Property Iterators do not know the type of properties. In Oops, the Type Info records are the only objects having any knowledge of data types. Therefore, some special programming tricks are used and the services of the Type Info Class are extended to access STL containers and use their iterators.

The Type Info Class provides an extended interface for STL container types. (The STL container type means the instantiated template type here, such as vector<int>.) Every STL container type must have its own Type Info Class and static Type Info record. The IsSTLCntnr() function returns the address of the extended Type Info Class (rTD_STLCntnr_i). This class is a descendant of the rProp_TypeInfo_c class and adds some additional functions operating on STL containers and their iterators.

STL containers work in different ways; therefore, several Type Info Classes exist to handle them in the best way. Please consider that random access should be used for vectors (vector, deque), but iterators for lists and associative containers (map). Some containers (such as a vector) invalidate iterators when new items are added; therefore, the operator[] is used to access elements. Others (list, map) do not invalidate the iterator, but do not support random access of elements.

The extended interface of the Type Info Class provides services for accessing STL containers, but most operations work with iterators, and these iterators must be stored in the Property Iterator. An advanced programming technique, the "Type Destroyer is Type Restorer" design pattern, is used to make this possible.

All functions operating on containers use void pointers to get the address of the iterator. The iterator itself is created by the Type Info Class, but stored by the Property Iterator. The appropriate Type Info Class does all operations on the iterator, whereas every Property Iterator can store its own copy of the container iterator as an array of bytes without actually knowing what is stored. This seems complicated, but it ensures the required independence of the Property Iterators and adds only a little overhead to the usage of STL iterators.
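The division of labor can be sketched as follows. The ContainerInfo interface, its function names, and the SumInts() helper are hypothetical; they only illustrate how a type-aware class can operate on an iterator that the caller stores as raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical extended Type Info interface for containers. Everything is
// passed as void pointers, so the caller never sees the iterator type.
class ContainerInfo {
public:
    virtual ~ContainerInfo() {}
    virtual std::size_t IterSize() const = 0;
    virtual void  Begin(void* cntnr, void* iterBuf) const = 0;  // constructs the iterator in iterBuf
    virtual bool  AtEnd(void* cntnr, const void* iterBuf) const = 0;
    virtual void* Current(void* iterBuf) const = 0;             // address of the pointed element
    virtual void  Advance(void* iterBuf) const = 0;
    virtual void  Destroy(void* iterBuf) const = 0;
};

// The only place where the container and iterator types are known.
template <class C>
class ContainerInfoOf : public ContainerInfo {
    typedef typename C::iterator It;
    static It& Iter(void* buf) { return *static_cast<It*>(buf); }
public:
    std::size_t IterSize() const override { return sizeof(It); }
    void Begin(void* c, void* buf) const override {
        new (buf) It(static_cast<C*>(c)->begin());
    }
    bool AtEnd(void* c, const void* buf) const override {
        return *static_cast<const It*>(buf) == static_cast<C*>(c)->end();
    }
    void* Current(void* buf) const override { return &*Iter(buf); }
    void Advance(void* buf) const override { ++Iter(buf); }
    void Destroy(void* buf) const override { Iter(buf).~It(); }
};

// Plays the role of a Property Iterator: it stores the container iterator
// as raw bytes and sums int elements without knowing the container type.
inline int SumInts(const ContainerInfo& info, void* cntnr) {
    alignas(std::max_align_t) unsigned char buf[64];
    assert(info.IterSize() <= sizeof(buf));
    int sum = 0;
    info.Begin(cntnr, buf);
    for (; !info.AtEnd(cntnr, buf); info.Advance(buf))
        sum += *static_cast<int*>(info.Current(buf));
    info.Destroy(buf);
    return sum;
}
```

The fixed 64-byte buffer is a simplification; the caller still has to know that the elements are int, which in the library would come from the Type Info Record of the element type.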

Property Descriptors

Property Descriptors make it possible to handle compound types without manually writing the type description and the conversion functions. Instead, a description of the members is given and the type description is generated automatically.

Types are used at different places in C++ programs for declaring variables, members of compound types (structure or class), function arguments, return values, and elements of arrays or containers. The Type Info Class gives information about the type itself. Is this information enough to access a variable, an instance of the given type? Unfortunately, it is not. If we want to access a variable, we need to know its address as well. The address belongs to the instance; therefore, it cannot be stored in the type descriptor. If we want to access a simple variable, we have its address, but the address of a data member or of an element of a container is not so trivial. We generally have the address of the class or container, and we should somehow compute or get the address of the member or element. This is the task of the Property Descriptors.

Property Descriptors make it possible to access the members of classes and the elements of containers. They store a reference to the Type Info Record and some other information necessary to access the data. Different types require different methods. Member variables can be accessed by their offset, member functions have to be called, and elements of containers have to be accessed through the functions of the container class or by using iterators. Therefore, the Property Descriptor has many forms, depending on the member it is designed for.

Every compound type has its own Property Descriptor Table for describing its properties. This Property Descriptor Table is a static array of rProp_Descriptor_c objects, and it mirrors the class declaration. The array has an item for every property, and the type of the item depends on the type of the member (data member or member function, pointer, array, container, and so forth). Different Property Descriptors belong to base types, compound types, and containers, leading us to define a polymorphic hierarchy of different Property Descriptors.

However, the elements of a static array must have the same type. How can we store polymorphic classes in a static array, and how can we initialize it before the program starts running? The first possibility would be an array of pointers, but it cannot be initialized with a list of values. The better solution is to make rProp_Descriptor_c a wrapper class and the Property Descriptor Table an array of rProp_Descriptor_c. Its constructor creates the actual Property Descriptor, and its member functions only mirror the polymorphic functions of the internal representation. Generally, only the Property Iterators use the Property Descriptors, and applications rarely need to access them directly.

The rProp_Descriptor_c class is very simple. It has only one data member (a pointer for storing the internal, polymorphic implementation), and many constructors for creating property descriptors for all possible properties. Its member functions are simple gate functions to access the information stored by the Property Descriptor.
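The wrapper technique can be sketched as follows. The class names (DescBase, MemberDesc, ArrayDesc, Descriptor) and their members are hypothetical simplifications, not the Oops classes; the point is that a static array of one wrapper type can hold differently typed internal descriptors, selected by the constructor that the initializer list picks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Internal polymorphic descriptors (hypothetical, heavily simplified).
class DescBase {
public:
    DescBase(const char* name, std::size_t offset) : name_(name), offset_(offset) {
        assert(name != nullptr);
    }
    virtual ~DescBase() {}
    const char* Name() const { return name_; }
    std::size_t Offset() const { return offset_; }
    virtual const char* Kind() const = 0;
private:
    const char* name_;
    std::size_t offset_;
};

class MemberDesc : public DescBase {
public:
    MemberDesc(const char* name, std::size_t offset) : DescBase(name, offset) {}
    const char* Kind() const override { return "member"; }
};

class ArrayDesc : public DescBase {
public:
    ArrayDesc(const char* name, std::size_t offset, std::size_t size)
        : DescBase(name, offset), size_(size) {}
    const char* Kind() const override { return "array"; }
    std::size_t Size() const { return size_; }
private:
    std::size_t size_;
};

// Wrapper: one pointer, one constructor per property kind, and member
// functions that only forward to the internal object.
class Descriptor {
public:
    Descriptor(const char* name, std::size_t offset)
        : impl_(new MemberDesc(name, offset)) {}
    Descriptor(const char* name, std::size_t offset, std::size_t size)
        : impl_(new ArrayDesc(name, offset, size)) {}
    ~Descriptor() { delete impl_; }
    Descriptor(const Descriptor&) = delete;
    Descriptor& operator=(const Descriptor&) = delete;
    const char* Name() const { return impl_->Name(); }
    const char* Kind() const { return impl_->Kind(); }
private:
    DescBase* impl_;
};

struct Sample {
    int id;
    int data[8];
};

// The table mirrors the class declaration and is built during static
// initialization, before main() starts.
static const Descriptor g_sampleProps[] = {
    { "id",   offsetof(Sample, id) },
    { "data", offsetof(Sample, data), 8 },
};
```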

The real functionality of the Property descriptors is implemented in the internal descriptor classes. They have an abstract base class (rProp_DescBase_i), but to understand the functionality of the Property Descriptors, we have to understand how the descendant classes work.

The Abstract Base Class (rProp_DescBase_i)

There are three data members in rProp_DescBase_i, storing information common to all Property Descriptors: the name of the property, the address of the Type Info Record, and the property flags.

Property Name

The name of the property has the same purpose as the name of a variable. When properties are searched, or a single item of a complex data structure is referenced, the Property Name can be used in the same way as member names are used in C++ expressions. For example, you can make the following function call for setting the A.B.strName member variable:
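A sketch of such a call and of the lookup behind it is given here. The SetValStr() signature, the Prop structure, and the lookup code are assumptions made for illustration and do not reproduce the Oops interface; only string leaves are supported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Inner { std::string strName; };
struct Outer { int id; Inner B; };

// Hypothetical descriptor: name, offset, and the table of nested
// properties (null for a string leaf).
struct Prop {
    const char* name;
    std::size_t offset;
    const Prop* children;
    std::size_t childCount;
};

static const Prop kInnerProps[] = {
    { "strName", offsetof(Inner, strName), nullptr, 0 },
};
static const Prop kOuterProps[] = {
    { "B", offsetof(Outer, B), kInnerProps, 1 },
};

// Walks the dotted name one segment at a time and assigns the string leaf.
inline bool SetValStr(void* obj, const Prop* props, std::size_t count,
                      const std::string& path, const std::string& value) {
    assert(obj != nullptr);
    const std::string::size_type dot = path.find('.');
    const std::string head = path.substr(0, dot);
    for (std::size_t i = 0; i < count; ++i) {
        if (head != props[i].name) continue;
        void* addr = static_cast<char*>(obj) + props[i].offset;
        if (dot != std::string::npos)   // descend into the compound property
            return props[i].children != nullptr &&
                   SetValStr(addr, props[i].children, props[i].childCount,
                             path.substr(dot + 1), value);
        if (props[i].children != nullptr) return false;   // not a leaf
        *static_cast<std::string*>(addr) = value;
        return true;
    }
    return false;
}
```

With an object declared as Outer A, the call would read SetValStr(&A, kOuterProps, 1, "B.strName", "Mr. Smith").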

The example assumes that the property names are the same as the variable names. The SetValStr() function searches for a property called "B", then it searches for a property called "strName" in B. If the proper string variable is found, the SetValStr() function of the Type Info Record is called with the address of the string data member and the string representation of the new value ("Mr. Smith").

Property Flags

Property Flags give the Property Iterator some additional information about the property. Some of the flags describe language-dependent information, such as whether the property is a pointer, or its access right (public, protected, private); others describe additional information not provided by the C++ language, such as whether the property is streamable.

Reference to the Type Descriptor

The third common data member is the address of the Type Info Record. It is the link to the type descriptor.

Member Descriptor

The simplest properties are member variables, described by the rPropDescMember_c class. In addition to the information stored in rProp_DescBase_i, it has only one data member, which stores the offset of the variable. The real address of the member variable can be computed by adding this offset to the address of the object.

Member Function Descriptors

Properties sometimes cannot be set or read by directly accessing member variables. It is a common solution in C++ to use gate functions for reading and writing private or protected data members. These functions may have other functionality, such as checking the values, or they may introduce "virtual" properties, which are not represented by a single member variable. For example, a rectangle may store the coordinates of its corners, but it may have properties computed from these coordinates, such as "Height" and "Width". There are special Property Descriptors that make it possible to use member functions as properties.

There are several problems that you will encounter when using member functions as properties:

Memory has to be allocated for storing the value of arguments when the member functions are called. The data value is given in ASCII format when SetValStr() is called. It is converted to a binary value before passing it to the appropriate gate function. The temporary variable cannot be in the Property Descriptor because it is possible that the same Property Descriptor is used in parallel by several threads. It is rather stored in the Property Iterator, as we will see later.

Properties are generally read-write values. Therefore, two gate functions are required to set and get the value of a given property. This seems to be simple, but sometimes these gate function pairs do not exist. For example, a rectangle class (generally used in graphical user interface classes) may have a single function for setting the coordinates by providing four parameters, the x and y coordinates, the width, and the height (such as rect(x,y,w,h)), while it has four functions for getting the values, such as x(), y(), w(), and h(). These cases should be avoided by adding some further member functions.

When the gate functions have several parameters, the situation becomes even more difficult. This can be handled if the property is treated as a compound property, and the parameters are collected to a structure having its own property description.

Gate functions are handled as callback functions. The technique used to implement gate functions is based on the idea published by [Jakubik], but it is simpler. The Property Descriptor class of gate functions has two members for storing the member function pointers of the gate functions. These members are polymorphic descriptors of gate functions, used for calling the set and get member functions. Their base classes have a common interface, and several template classes are derived from them to describe the different function argument lists. (Even the simplest gate functions may have five different kinds of parameters: the argument can be passed by value or by address, in the latter case through a pointer or a reference, each of which may be const or not.)

There are two solutions when the gate function has several arguments. In the simplest case, when all arguments have the same type, they can be described and stored as an array. When the types of the arguments differ, a structure has to be created to store the temporary values of the parameters.
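The gate function technique can be sketched for the simplest case, a getter returning by value and a setter taking its argument by value. The Rect class, the GateBase interface, and the GateByValue template are hypothetical; the sketch also keeps the temporary on the stack of the call, whereas the article places it in the Property Iterator.

```cpp
#include <cassert>
#include <sstream>
#include <string>

class Rect {
public:
    Rect() : left_(0), right_(0) {}
    int  Width() const { return right_ - left_; }   // "virtual" property
    void SetWidth(int w) { right_ = left_ + w; }
private:
    int left_, right_;
};

// Common interface, independent of the class and of the argument type.
class GateBase {
public:
    virtual ~GateBase() {}
    virtual std::string GetValStr(const void* obj) const = 0;
    virtual bool SetValStr(void* obj, const std::string& text) const = 0;
};

// One template per argument-passing form; this one covers by-value gates.
template <class C, class T>
class GateByValue : public GateBase {
public:
    typedef T (C::*Getter)() const;
    typedef void (C::*Setter)(T);
    GateByValue(Getter g, Setter s) : get_(g), set_(s) {
        assert(g != nullptr && s != nullptr);
    }
    std::string GetValStr(const void* obj) const override {
        std::ostringstream out;
        out << (static_cast<const C*>(obj)->*get_)();
        return out.str();
    }
    bool SetValStr(void* obj, const std::string& text) const override {
        std::istringstream in(text);
        T tmp;                        // temporary is local, not in the descriptor
        if (!(in >> tmp)) return false;
        (static_cast<C*>(obj)->*set_)(tmp);
        return true;
    }
private:
    Getter get_;
    Setter set_;
};
```

Because the descriptor itself holds only the two member function pointers, the same descriptor object can be shared by several threads.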

Arrays

The simplest containers are the traditional arrays. The Property Descriptors somehow have to know the size of the array and the number of elements actually stored in the array. There are three possible ways to use C-style arrays:

Fixed size arrays always have a constant number of elements, equal to the size of the array.

Dynamic arrays do not have constant size. The number of elements is stored in a variable somewhere.

Null-terminated arrays also store fewer elements than their size; the number of elements is not stored, and a null terminator marks the end of the elements instead. C strings are the most common example of null-terminated arrays.

Both fixed size and dynamic arrays may use a null terminator, and dynamic and null-terminated arrays may be allocated dynamically (on the heap). The Property Descriptors of arrays support all these cases, with two Property Descriptors for fixed size and dynamic arrays, both supporting a null terminator.

Fixed Size Arrays

This Property Descriptor has three additional data members:

_ArrSize describes the size of the array. This is a simple integer variable and stores the constant size of the array.

_PDItems is the Property Descriptor of the elements stored in the array. If the array stores pointers (possibly to polymorphic objects), the type of the pointer is described here because the pointers, and not the actual objects, are the elements of the array.

The Property Descriptors work closely together with the appropriate Property Iterators. The Property Iterator uses the information stored in the Property Descriptor when the address of the next property is computed. Then, it uses the services of the Type Info class to access the value of the property.

Dynamic Arrays

The Property Descriptor has to describe where the actual size of a dynamic array is stored. It is assumed that the actual number of elements is stored in another property of the class. Therefore, the Property Descriptor stores the address of another Property Descriptor and uses it to get the actual size of the array by calling the GetValSize() function of the Type Info record.

The additional data members are slightly different from a fixed size array:

_MaxArrSize describes the size of the array. This is a simple integer variable and stores the allocated size of the array.

_pPDSize is a reference (pointer) to another property storing the number of elements currently stored in the array.

_PDItems is the Property Descriptor of the elements stored in the array. If the array stores pointers (possibly to polymorphic objects), the type of the pointer is described here because the pointers, and not the actual objects, are the elements of the array.
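How these members cooperate can be sketched with plain offsets. The DynArrayDesc structure and the two helper functions are hypothetical; sizeOffset stands in for _pPDSize, and the element type is fixed to int instead of being described by _PDItems.

```cpp
#include <cassert>
#include <cstddef>

struct Samples {
    int count;        // number of elements actually in use
    int values[8];    // allocated size is 8
};

// Hypothetical descriptor of a dynamic array of int.
struct DynArrayDesc {
    std::size_t maxArrSize;    // cf. _MaxArrSize: allocated size
    std::size_t sizeOffset;    // stands in for _pPDSize: where the count lives
    std::size_t itemsOffset;   // where the first element lives
};

// Reads the current element count from the size property of the object.
inline std::size_t ElementCount(const DynArrayDesc& d, const void* obj) {
    const char* base = static_cast<const char*>(obj);
    const int n = *reinterpret_cast<const int*>(base + d.sizeOffset);
    assert(n >= 0 && static_cast<std::size_t>(n) <= d.maxArrSize);
    return static_cast<std::size_t>(n);
}

// Computes the address of an element, or returns null past the current size.
inline int* ElementAddr(const DynArrayDesc& d, void* obj, std::size_t index) {
    if (index >= ElementCount(d, obj)) return nullptr;
    return reinterpret_cast<int*>(static_cast<char*>(obj) + d.itemsOffset) + index;
}
```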

Standard Containers

Containers could be handled as classes by defining some properties. However, they provide an important abstraction that should be reflected in the property description. The internal structure of containers is complicated and hidden, making it difficult to add and use a property description. For example, the standard set template class generally stores its elements in a binary tree, but this is irrelevant to the user, who simply wants to see the list of the elements.

Here we are talking about the containers of the Standard Template Library. The Type Info Class makes it possible to use only a few functions of the containers for accessing its content, and the Property Descriptors use the services of the Type Info Class. They do not directly access the container. It is possible to use any container having the appropriate interface and it is also possible to extend the list of Property Descriptors to use containers with different interfaces. However, the implementation discussed here fits most of the applications.

The Property Descriptors of STL containers do not need to store any additional data about the variable they belong to. The address of the container object is already stored in the base class, and all other information is stored in the container itself. Even the Property Descriptor of the elements stored in the container is obtained from the Type Info Record. However, different Property Descriptor classes work in different ways, and they vary depending on the type of the container. There are three different possibilities:

rPropDescSTLRandom_c sets and gets the elements of the container using random access operators. This is the natural way of handling vectors and deques, and it makes it easy to generate the name of an item from its index (e.g. "[25]").

rPropDescSTLList_c uses iterators to handle the containers and the insert() function to add new elements. Consequently, it can be used with any container, even with sets and maps.

rPropDescSTLAssociative_c is designed to handle associative containers such as maps and multimaps. It is somewhat similar to random access, but instead of using a simple integer number for identifying elements, any key value can be used. This means that the property descriptions of the key and of the element have to be created with the Type Info Record of the container instance. The name of a property in the container is generated from the key (e.g. ["John Smith"]), and the name and the value are used together when new elements are inserted into the container. (Associative containers have not been implemented in Oops yet. Maps and multimaps can be handled as list type containers.)

Services of Property Descriptors

Property Descriptors generally work with Property Iterators, but sometimes it is necessary to use them directly. For example, when a global container stores objects, the first Property Iterator has to be obtained by calling the CreatePropIterator() function of the container's Property Descriptor.

Property Iterators use the internal polymorphic version of Property Descriptors; therefore, the rProp_Descriptor_c wrapper class gives access to the internal representation and provides only a few functions, which directly call the appropriate function of the internal object.

Information of Property

Some member functions of Property Descriptors give information about the property. This information is used to decide what to do with the property, and how to handle it.

Size() returns the size of the property.

TypeInfo() returns the address of the Type Info Record that belongs to the property. If the property is a pointer to polymorphic classes, the Type Info Record of the pointer's declared type is returned.

TypeInfo( void *apCurPropAddr ) returns the Type Info Record of the real type, i.e. the type of the object that the pointer points to. Polymorphism is taken into account.

Name() returns the name of the property. This is generally the programmer-given name of the property, but in case of containers an index is appended, which makes a difference between elements.

IsCntnr() returns true if the property contains other properties. It returns true for classes, structures, arrays, and containers as well. It is used to check whether a new Property Iterator can be opened for iterating through the content of the property.

IsPropClass() returns true if the type of the property has a Property Interface, i.e. it is descendant of rProp_BaseClass_i.

IsExpandable() returns the address of the Property Descriptor of the elements stored in the property, if the property is expandable. Expandable means that the property is a container and its size can be increased when new elements are added.

Flags() returns the flags of the property. These flags are assigned to the property when the Property Descriptor is created. The following flags are the most important:

rcProp_Ptr: the property is a pointer.

rcProp_Owner: the property owns the pointed object.

rcProp_Readable and rcProp_Writeable: describe whether the property can be read or written.

rcProp_Strm: signals that this property should be written to a stream when the parent object is saved (persistency).

IsPtr() returns true if the property is a pointer. This function simply checks the appropriate flag.

IsOwner() returns true if the property owns the stored object. This function simply checks the appropriate flag. When this flag is set together with the pointer flag, the object pointed to by the property is deleted when a new object is assigned. If this flag is not set, Oops will never delete the object, which may lead to a memory leak.

These functions are used by Property Iterators to get the necessary information about the pointed properties.
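The flag checks described above can be pictured with a minimal sketch. The flag values, the PropDesc structure, and the AssignPointer() helper are assumptions made for illustration only; they are not the Oops definitions, and the helper is restricted to int pointers to keep the example short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real rcProp_* constants may differ.
constexpr std::uint32_t kPropPtr   = 1u << 0;  // stands in for rcProp_Ptr
constexpr std::uint32_t kPropOwner = 1u << 1;  // stands in for rcProp_Owner
constexpr std::uint32_t kPropStrm  = 1u << 2;  // stands in for rcProp_Strm

// Hypothetical stand-in for a Property Descriptor.
struct PropDesc {
    const char*   name;
    std::uint32_t flags;

    bool IsPtr() const   { return (flags & kPropPtr) != 0; }
    bool IsOwner() const { return (flags & kPropOwner) != 0; }
};

// Assigns a new object to a pointer property, honoring the ownership flag.
inline void AssignPointer(const PropDesc& desc, int*& slot, int* newObj) {
    assert(desc.IsPtr());              // only pointer properties can be reassigned
    if (desc.IsOwner()) delete slot;   // owner: the old object is released
    slot = newObj;                     // non-owner: the old object is left alone
}
```

With an owning descriptor the previous object is deleted on assignment; with a non-owning one the caller remains responsible for it.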

Address of Object

The address of the property is necessary to set or get the value of the property. The main task of the Property Descriptor is to calculate this address. The simplest example is getting the address of a class's data member. The Property Descriptor stores the offset of the variable in the object. The Property Iterator can get this offset from the Property Descriptor and calculate the address of the variable by adding the offset to the base address of the object. The situation becomes more complicated in the case of polymorphic objects, arrays, or containers, but the Property Descriptor hides this complexity.
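The simplest case, a data member addressed by its offset, can be sketched as follows. MemberDesc and the free function PropAddr() are hypothetical stand-ins, not the Oops classes; the sketch only shows the "base address plus stored offset" arithmetic.

```cpp
#include <cassert>
#include <cstddef>

struct Point_s { int x; int y; };  // standard-layout, so offsetof is well defined

// Hypothetical descriptor: a name plus the byte offset of the member.
struct MemberDesc {
    const char* name;
    std::size_t offs;
};

// Adds the stored offset to the base address of the object.
inline void* PropAddr(void* base, const MemberDesc& d) {
    assert(base != nullptr);
    return static_cast<char*>(base) + d.offs;
}
```

A descriptor built as MemberDesc{"y", offsetof(Point_s, y)} lets generic code reach the member y of any Point_s instance without naming the member at the call site.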

The following functions are used to calculate the address of the property:

Offs() returns the offset of the property, if it makes sense.

PropAddr() returns the address of the property or the pointer to the property. This function is used when the pointer has to be accessed.

RealAddr() returns the address of the property. This function takes into account pointers and polymorphism, and returns the real address of the property.

ApplyPtr() returns the address of the pointed object, if the property is a pointer.

PropBaseClass() returns the address of the Property Interface when the type of the property is a descendant of the rProp_BaseClass_i class.

Access the Value of Property

When the property already exists, the GetValXXX() and SetValXXX() functions can be used to access the value (XXX stands for Str or Bin). These functions call the appropriate functions of the Type Info Class. The Property Descriptor's version only transforms the given arguments using its knowledge.

When the property does not exist yet (for example, if it is a pointer), the object has to be created first. The AddNewStr() and AddNewBin() functions create the object and set its value, while AddNewPtr() only creates the object and stores its address, so that the value or the content can be set later. This is the case when polymorphic objects are loaded from a stream. The objects are created and their addresses are stored in a container, then a new Property Iterator is used to iterate through the properties of each new object and set their values.

Iterating through Compound or Container Objects

When the property's type is not a Base Type, the Property Descriptor has to give access to the properties of the property. This is the task of the CreatePropIterator() function, which creates a new Property Iterator pointing to the first or the last sub-property. Property Iterators will be discussed later in detail, but the CreatePropIterator() function is the key to understanding how Oops handles the tree structure of data. Property Iterators are used to step through one level of the tree. When the IsCntnr() function of a property returns true, the CreatePropIterator() function can open the branch by creating a new Property Iterator. This new iterator iterates through the properties of the branch.

Inheritance

Inheritance is one of the most important features of every object-oriented language. C++ supports multiple and virtual inheritance, and any RTTI description or library providing persistency should support them as well. Unfortunately, only a few libraries support multiple inheritance (e.g. Microsoft's MFC does not allow using multiple inheritance), because it is quite difficult to handle. Let's see how Oops handles inheritance.

Every class describes only its own properties in the Property Descriptor Table. The properties of ancestor classes are described in the Property Descriptor Table of the ancestor classes; therefore, when the Property Iterator iterates through the properties of a class, it has to find the Property Descriptor Table of ancestor classes to get the inherited properties.

The property description of the class has to contain the list of ancestor classes as well. For example, the Property Descriptor Table may have special items for the ancestor classes. Unfortunately, the real world is not simple; therefore, Oops provides two different solutions for describing inheritance, both having some advantages and disadvantages. The programmer provides this information when the property description of the class is created.

The original description of inheritance is too difficult to use and slow for accessing properties at run-time; therefore, it is used only to initialize a faster internal representation. This internal representation is quite simple and it is built when the RTTI description of the class is used first. The internal representation handles inheritance with a list of records describing all ancestor classes, not only the direct ancestors of the class.

The programmer only defines the direct ancestors in the property description; therefore, the list has to collect information by iterating through all the classes. The list contains the addresses of the ancestor classes' Property Descriptor Tables and the offsets of the ancestor classes' data inside the object. It is built when it is used for the first time, by calling the CreateIteratorTable() function of the class, which builds the list if it has not been built yet and returns its address.

The internal representation of ancestor classes has another major task. When the list of all sub-classes is created, the CreateIteratorTable() function checks whether the next sub-class has already been added to the list. This way, Oops can properly handle virtual inheritance, because the offset of the virtual sub-class does not depend on the route by which it was reached. How to obtain the offset of the virtual sub-class is another problem. The compiler-independent solution used in Oops is discussed later.

Structures and Classes

There is only one little difference between structures and classes in C++. The default access right is public in the case of structures and private in the case of classes. This small difference is not important most of the time, but the philosophy is extended in Oops.

There are two ways of defining the property description of compound types, and both work for structures and classes as well. There are two important differences:

The structure-like description can only access public members, whereas the class-like description makes it possible to access all members, including private and protected ones.

The structure-like description does not touch the original definition of the compound type, whereas the class-like description forces you to use a base class and implement some abstract functions to add the Property Interface. It also makes the Type Info Record and the Property Descriptor Table friends of the class.

Both solutions have advantages; therefore, Oops implements both of them, and the programmer making the property description can decide which one he/she prefers in the given situation.

The structure-like description is not intrusive. It can be used even if the source code of the class is not available (only the object code) or cannot be modified for some reason (for example, when adding a property interface to a library). The Property Descriptor Table could access the private and protected members if it were made a friend of the class, but then the friend statement has to be added to the class definition.

Another advantage of the structure-like description is that the ancestor classes can be simply added to the Property Descriptor Table. This makes the property description simpler and easier to understand. The disadvantage of this solution is that, even if it can properly handle multiple inheritance, it fails on virtual inheritance.

The class-like description handles the inheritance in a different way, working properly even with virtual inheritance, but the price is that this method is intrusive. However, the base class also gives a standard interface, called Property Interface, which makes it possible to pass the object for any program by knowing this interface only.

Two examples illustrate the difference. B_s is a structure and B_c is a class. Both have two ancestors and a member variable. Please note the difference in how the property description handles the inheritance.

In the structure-like description of B_s, the ancestor classes are described at the beginning of the Property Descriptor Table and the definition of the structure is not modified. There is only one thing the programmer has to declare somewhere above the Property Descriptor Table: the rProp_GetTypeInfo() function for structure B_s.

In the class-like description of B_c, the property description is more complicated. Class B_c has to be derived from rProp_BaseClass_i, its definition has to contain an additional macro (rProp_DeclareInterface_d), and above the Property Descriptor Table, it must implement the property description with the rProp_ImplementInterface_d macro. The ancestor classes are passed to this macro and they are not listed in the Property Descriptor Table.

These macros seem mysterious at first glance, but they are quite simple. There are two reasons for using them:

They implement a higher level of abstraction and make the property description easier to read.

They make it possible to remove the property description from a library by simply defining empty macros.

Users of Oops become familiar with the macros quickly. The documentation and the tutorials explain how they work.

Property Base Class

All compound types having a class-style type description are derived from the rProp_BaseClass_i interface class. (Classes derived from rProp_BaseClass_i are called Property Classes in this article.)

The Property Interface provides a good starting point when the application's data structure should be passed to a program dealing with properties, like the Save() of the Stream Library. These functions generally need a Property Iterator to work with the data structure, and Property Classes can create Property Iterators for themselves. It is a simple and convenient solution if all classes have a common interface, where we can get access to their properties.

The rProp_BaseClass_i class has the following member functions:

Similarly to the Type Info Class, it makes it possible to convert the class to string or binary format (SetValXXX(), GetValXXX(), AddNewXXX()).

The second group of member functions creates Property Iterators to the object's properties (Find(), Begin(), RBegin(), End(), and CreatePropIterator()).

GetTypeInfo() returns the Type Info Record.

There are some functions for casting the address of the class to different sub-classes (RealBase(), DynCast()).

Beyond these public services, Property Classes have a function (CreateIteratorTable()) for filling the list of ancestor classes.

The CreateIteratorTable() function

This function creates the list of ancestor classes from the information given by the programmer about inheritance. It is a recursive function. Every compound type knows its immediate ancestors; therefore, CreateIteratorTable() calls the CreateIteratorTable() function of all immediate ancestors. The list of all ancestors is passed as well, and records about the new ancestor classes are added to the list.

The description of the sub-class contains the reference to the Property Descriptor Table and Type Info Record, and the offset of the sub-class from the address of the whole object. This offset depends on the compiler, and multiple and virtual inheritance makes it difficult to determine it. This is why we need CreateIteratorTable(). It is used to determine this offset and fill the list of ancestor classes properly.

CreateIteratorTable() has different implementations for structure and class type property description.

In the case of a structure-like property description, CreateIteratorTable() is the member of the Type Info Class and the offset of the sub-class is stored in a Property Descriptor Table entry describing the ancestor class. This solution is simple and handles multiple-inheritance properly; however, it cannot access private and protected ancestors and cannot calculate the offset of virtual sub-classes properly.

The class style property description handles all cases of inheritance properly. This is why it is preferred even if the class getting property description has to be modified. CreateIteratorTable() is a virtual member function of the class and every class has its own version of CreateIteratorTable(). As a member of the class, CreateIteratorTable() knows the "this" pointer and it can cast it to the type of the ancestor classes. This way, the compiler properly calculates the offset even in case of virtual multiple-inheritance. CreateIteratorTable() also checks whether the sub-class is already added to the list. This way, virtual sub-classes are added to the list only once.
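The idea of letting the compiler compute sub-object offsets from the "this" pointer, while adding virtual bases only once, can be sketched in a self-contained form. AncestorRec, AddOnce(), and the signature of CreateIteratorTable() used here are assumptions for illustration; the real Oops function differs and is generated by macros.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of the ancestor list: class name plus sub-object offset.
struct AncestorRec {
    std::string    name;
    std::ptrdiff_t offs;  // offset of the sub-object from the start of the whole object
};
using AncestorList = std::vector<AncestorRec>;

// Adds a record unless the class is already listed (a virtual base is reached twice).
inline void AddOnce(AncestorList& list, const char* name,
                    const void* sub, const void* whole) {
    assert(sub != nullptr && whole != nullptr);
    for (const AncestorRec& r : list)
        if (r.name == name) return;
    list.push_back({name, static_cast<const char*>(sub) -
                          static_cast<const char*>(whole)});
}

struct Root_c {
    int r = 0;
    virtual ~Root_c() = default;
    virtual void CreateIteratorTable(AncestorList& list, const void* whole) const {
        AddOnce(list, "Root_c", this, whole);
    }
};

struct Left_c : virtual Root_c {
    int l = 0;
    void CreateIteratorTable(AncestorList& list, const void* whole) const override {
        AddOnce(list, "Left_c", this, whole);
        Root_c::CreateIteratorTable(list, whole);  // compiler adjusts "this" to Root_c
    }
};

struct Right_c : virtual Root_c {
    int rr = 0;
    void CreateIteratorTable(AncestorList& list, const void* whole) const override {
        AddOnce(list, "Right_c", this, whole);
        Root_c::CreateIteratorTable(list, whole);
    }
};

struct Both_c : Left_c, Right_c {
    void CreateIteratorTable(AncestorList& list, const void* whole) const override {
        AddOnce(list, "Both_c", this, whole);
        Left_c::CreateIteratorTable(list, whole);   // direct ancestors only
        Right_c::CreateIteratorTable(list, whole);
    }
};
```

Calling CreateIteratorTable() on a Both_c object yields four records (Both_c, Left_c, Root_c, Right_c); the shared Root_c sub-object appears once, with the offset the compiler computed for it.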

Implementing this solution is not trivial, because the CreateIteratorTable() function has to be implemented for every class, and the implementation has to call the ancestor classes' CreateIteratorTable() functions directly. The only way to automate this task is to provide a macro that generates the body of the CreateIteratorTable() function. The list of the ancestors' function calls is passed as an argument to the macro. Macro arguments are separated by commas and C instructions are separated by semicolons; therefore, a macro argument may contain a list of C instructions. (This is a good example of a justified use of macros. We have not found any replacement for this macro.)
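The macro trick, passing several semicolon-separated statements as one macro argument, can be shown with a small sketch. IMPLEMENT_COLLECT_d, Collect(), and the classes below are hypothetical and much simpler than the Oops macros; they only demonstrate the preprocessor technique.

```cpp
#include <cassert>
#include <string>
#include <vector>

using NameList = std::vector<std::string>;

struct A_c { static void Collect(NameList& l) { l.push_back("A_c"); } };
struct B_c { static void Collect(NameList& l) { l.push_back("B_c"); } };

// Hypothetical macro: generates a function body that first runs the statements
// passed as the second argument. They are separated by semicolons, not commas,
// so the preprocessor treats them as a single argument.
#define IMPLEMENT_COLLECT_d(Class, AncestorCalls) \
    void Class::Collect(NameList& l) {            \
        AncestorCalls;                            \
        l.push_back(#Class);                      \
    }

struct C_c : A_c, B_c { static void Collect(NameList& l); };

IMPLEMENT_COLLECT_d(C_c, A_c::Collect(l); B_c::Collect(l))

// Usage: the ancestors are collected before the class itself.
inline NameList CollectAll() {
    NameList l;
    C_c::Collect(l);
    assert(l.size() == 3);
    return l;
}
```

The commas inside the parentheses of each call do not split the argument, because the preprocessor only separates arguments at top-level commas.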

Property Iterators

Using Type Info Records and Property Descriptor Tables is quite complicated. The property-based RTTI system needs a well-known and easy to use interface to make the user's life easier and the programs accessing properties simpler. Property Iterators were developed for this purpose, based on the well-known iterator pattern and STL iterators.

Property Iterators can be treated as pointers to properties, not exactly like C pointers, but in the original sense of the word. A Property Iterator belongs to a Property Container (an instance of a compound type or container) and points to one of its properties. It is a class having some member functions for accessing the value of the property and getting information about it. As with any kind of iterator, Property Iterators can be incremented and decremented for stepping through the list of properties, and they can be compared with other Property Iterators as well.

However, Property Iterators are designed to traverse a hierarchical data structure. When a Property Iterator points to a property having sub-properties (Property Container), the Begin() and End() functions of the iterator can be used to create a new Property Iterator to access the internal data structure.

What does the Property Iterator do?

The main task of Property Iterators is to maintain some variables required to access the selected property of a given object. This information contains the address of the object, the Property Descriptor Table, the Type Info Record, and some kind of index information, which varies for different Property Containers. It may be an index into the Property Descriptor Table, an array, or it may be an STL iterator.

This information is used to calculate the address of the property, which is used in most of the member functions to call the appropriate function of the Property Descriptor. The calculated address is generally cached. Therefore, the Property Iterators may become invalid if the object or the container is changed.

When the iterator is incremented or decremented, the business logic of the Property Iterator updates the internal index information. This job is quite difficult. In case of compound types, the inheritance and the internal structure of Property Descriptor Tables are handled, whereas the appropriate STL iterator has to be used to access elements of containers.

Encapsulation

Unfortunately, a lot of different Property Iterators exist for different kinds of properties, similarly to Property Descriptors. In fact, Property Iterators and Descriptors work together, and each kind of Property Descriptor has its matching Property Iterator.

To make life simpler, Property Iterators are encapsulated in a wrapper class. There is only one public Property Iterator, which hides the actual type of the Property Iterator and makes it possible to use the same type everywhere in application programs.

Property Iterator Groups

Every Property Iterator works in a different way, but there are three basic groups.

Iterating Compound Types

The Property Iterator has to access a member of an object, and to do that, it needs the address of the data member. The iterator knows the base address of the compound object, the offset of the data member (from the Property Descriptor Table), and the offset of the sub-class (from the list of ancestor classes), and it can calculate the required address by adding these offsets to the base address.

Actually, the index of the Property Descriptor Table is incremented (or decremented) when the iterator is incremented (or decremented). The iterator checks whether the next item in the Property Descriptor Table matches the given filtering options and keeps incrementing the index until an appropriate item or the end of the table is found. When the end of the Property Descriptor Table is reached, the iterator gets the Property Descriptor Table of the following ancestor class, and continues searching for the next appropriate item.

Iterating Arrays

The array iterators are the simplest Property Iterators. They maintain an index of the array and calculate the address of items by adding the index multiplied by the size to the base address of the array. The number of elements can be determined with the help of the Property Descriptor of the array, but several array iterators exist for different cases (fixed size, variable size, or null-terminated arrays).
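The address calculation of an array iterator can be sketched as below. ArrayIter is a hypothetical fixed-size variant written for illustration; it is not the Oops class and ignores the variable-size and null-terminated cases.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size array iterator: base address, element size, count, index.
class ArrayIter {
public:
    ArrayIter(void* base, std::size_t elemSize, std::size_t count)
        : base_(static_cast<char*>(base)), size_(elemSize), count_(count) {}

    bool  AtEnd() const { return idx_ >= count_; }
    void  Next()        { assert(!AtEnd()); ++idx_; }

    // Address of the current element: base + index * element size.
    void* Addr() const  { assert(!AtEnd()); return base_ + idx_ * size_; }

private:
    char*       base_;
    std::size_t size_;
    std::size_t count_;
    std::size_t idx_ = 0;
};
```

The iterator never needs to know the element type; the caller, or in Oops the Type Info Record, interprets the returned address.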

Iterating Containers

STL container iterators are the trickiest part of the Property Library. The basic task is simple: The Property Iterator has to store an iterator of the container and it has to make it possible to insert new elements into the container. The problem is that Property Iterators know absolutely nothing about the STL container! (Remember that only Type Descriptors have any knowledge of the handled data types.)

There are two tricks for solving the problem:

The Type Info Class of STL containers provides an extended interface. The member functions of this interface make it possible to get the number of elements, insert elements, create, and handle iterators.

The Property Iterator stores an appropriate iterator pointing to the iterated container. The iterator is stored as an array of bytes without knowing anything about their meaning. When the Property Iterator wants to use the STL iterator, it calls a function of the Type Info Record and passes the address of the memory storing the iterator. This technique is known as the "Type Destroyer is Type Restorer" pattern.
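Both tricks can be combined in a compact sketch: an iterator stored as raw bytes, and a type-specific object that is the only code able to interpret them. ContainerTypeInfo, ContainerTypeInfoT, ErasedIter, and the 64-byte buffer size are assumptions for illustration, not the Oops interface.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical extended interface of a container's Type Info Class.
struct ContainerTypeInfo {
    virtual ~ContainerTypeInfo() = default;
    virtual std::size_t IterSize() const = 0;
    virtual void  IterBegin(void* cntnr, void* mem) const = 0;  // constructs in mem
    virtual bool  IterAtEnd(void* cntnr, const void* mem) const = 0;
    virtual void  IterNext(void* mem) const = 0;
    virtual void* IterDeref(const void* mem) const = 0;
    virtual void  IterDestroy(void* mem) const = 0;
};

// The only place where the concrete container and iterator types are known.
template <class C>
struct ContainerTypeInfoT final : ContainerTypeInfo {
    using Iter = typename C::iterator;
    std::size_t IterSize() const override { return sizeof(Iter); }
    void IterBegin(void* c, void* mem) const override {
        new (mem) Iter(static_cast<C*>(c)->begin());
    }
    bool IterAtEnd(void* c, const void* mem) const override {
        return *static_cast<const Iter*>(mem) == static_cast<C*>(c)->end();
    }
    void IterNext(void* mem) const override { ++*static_cast<Iter*>(mem); }
    void* IterDeref(const void* mem) const override {
        return &**static_cast<const Iter*>(mem);
    }
    void IterDestroy(void* mem) const override { static_cast<Iter*>(mem)->~Iter(); }
};

// The iterator side: holds the STL iterator as opaque bytes.
class ErasedIter {
public:
    ErasedIter(const ContainerTypeInfo& ti, void* cntnr) : ti_(ti), cntnr_(cntnr) {
        assert(ti_.IterSize() <= sizeof(mem_));  // buffer size is an assumption
        ti_.IterBegin(cntnr_, mem_);
    }
    ~ErasedIter() { ti_.IterDestroy(mem_); }
    ErasedIter(const ErasedIter&) = delete;
    ErasedIter& operator=(const ErasedIter&) = delete;

    bool  AtEnd() const { return ti_.IterAtEnd(cntnr_, mem_); }
    void  Next()        { ti_.IterNext(mem_); }
    void* Addr() const  { return ti_.IterDeref(mem_); }

private:
    const ContainerTypeInfo& ti_;
    void* cntnr_;
    alignas(std::max_align_t) unsigned char mem_[64];
};
```

ErasedIter compiles without any knowledge of std::list; every operation on the stored bytes is delegated to the type-specific object that created them.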

Property Iterators have nothing to do when new elements are inserted; they just call the appropriate function of the Property Descriptor. New elements can be inserted to containers through the Property Iterator or the Property Descriptor of the container as well.

Services of Property Iterators

A Property Iterator has many member functions. Most of them are gate functions of the Type Info Record and Property Descriptor of the pointed property, and some of them are used to create a new iterator for iterating through the properties of the pointed property. Property Iterators make it possible to select certain properties and step over all properties not matching the given filtering condition.

Creating Property Iterators

The well-known Begin() and End() functions create a new Property Iterator for the pointed property. This way, the branches of the hierarchical data structure can be opened. The Begin() and RBegin() functions create an iterator pointing to the first and last property, respectively. The End() member function, however, creates an invalid iterator that can be used in both cases. It simply creates an empty Property Iterator wrapper class (storing a NULL pointer).

All these functions have three different forms. The first function set has no arguments. It creates a new iterator and returns it by value. This is the most convenient way, but unfortunately it is slow because the functions have to make a copy of the created iterator when it returns, and the copy constructor clones the internal representation of the iterator.

rProp_Iterator_c SubIter = Iter.Begin();

The second function set initializes the iterator from its argument. The argument is the Property Iterator of the parent object. The iterator that the Begin() function is called for will point to the first property of the object pointed to by the iterator passed. (Please consider this Begin() function an initialization function that does work similar to a constructor, rather than a constructor itself.)

rProp_Iterator_c SubIter;
SubIter.Begin(Iter);

This solution is less common, but works faster than the first one. It may also cause some confusion.

The third function set is not so interesting. It makes it possible to initialize the Property Iterator from a Property Descriptor item, which is useful when the root object is a container or an array.

Filtering

Filtering is an advanced feature of Property Iterators. Sometimes it is necessary to distinguish between properties. For example, some properties may be redundant; therefore, it may not be necessary or possible to save all properties to the stream. In that case, Property Iterators can filter the properties according to the given mask, which is compared with the flags stored in Property Descriptors. For example, the rcProp_Strm flag is used to signal that this property should be saved to the stream.

All functions and constructors creating a Property Iterator have an additional argument (with a default value of 0) to define the filtering condition. The default value means that the filter flags of the parent iterator will be used.

Property Library does not define all fields of the filtering flags. Applications may introduce their own filtering flags to customize the filtering feature.

The filtering flags are taken into account when the Property Iterator is incremented or decremented. If the next property does not match the filtering flags, the iterator is moved further until an appropriate property or the end of the properties is found.
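The skipping behavior can be sketched as follows. FilterIter, the flag values, and the matching rule (all mask bits must be set, and a mask of 0 matches everything) are assumptions for illustration; the article does not spell out the exact comparison Oops performs, and this sketch does not model inheriting the parent iterator's mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint32_t kStrm = 1u << 0;  // hypothetical value for rcProp_Strm
constexpr std::uint32_t kUser = 1u << 8;  // hypothetical application-defined flag

struct PropDesc {
    std::string   name;
    std::uint32_t flags;
};

// Hypothetical filtering iterator over a Property Descriptor Table.
class FilterIter {
public:
    FilterIter(const std::vector<PropDesc>& table, std::uint32_t mask)
        : table_(table), mask_(mask) { Skip(); }

    bool AtEnd() const { return idx_ >= table_.size(); }
    const PropDesc& Desc() const { assert(!AtEnd()); return table_[idx_]; }
    void Next() { assert(!AtEnd()); ++idx_; Skip(); }

private:
    // Moves forward until a property with all mask bits set, or the end, is found.
    void Skip() {
        while (!AtEnd() && (table_[idx_].flags & mask_) != mask_) ++idx_;
    }

    const std::vector<PropDesc>& table_;
    std::uint32_t mask_;
    std::size_t   idx_ = 0;
};
```

Iterating a table with the mask kStrm visits only the properties marked for streaming, which is the behavior a save routine would rely on.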

Accessing the Pointed Property

Member functions of Property Iterators provide access to all information stored by the RTTI system, such as the name, type, and size of the pointed property, or its classifications (container, pointer, Property Class). It is also possible to set and get the value of the property. If the property is extendable, new elements can be inserted through the Property Iterator as well.

Some Special Cases

Template Classes

Template classes are handled in the same way as normal classes. The only difference comes from the fact that every template instance is a new type. Therefore, every template instance has to have its own type description.

Standard containers are handled in the same way. Every instance of an STL container gets its own Type Info Record and Property Descriptor.

Type Definition

In C++, you can introduce new types with the "typedef" keyword. These are not really new types, just alias names for other types. (For example, the compiler uses the original type in the error message instead of the defined type name.) However, type definition is very useful.

Fortunately, the Property Description is also transparent for type definition because the global rProp_GetTypeInfo() function returns the Type Info Record. The argument of this function is a typed pointer. As long as the compiler is able to convert a variable to one of the types having type description, this function will return the address of an appropriate Type Descriptor. (This may lead to problems. The automatic type conversions make it possible to find a type descriptor, even if the given type does not have a type description at all.)
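Both the transparency and the pitfall follow from ordinary overload resolution, as this sketch shows. GetTypeInfo() and TypeInfo are hypothetical stand-ins for rProp_GetTypeInfo() and the Type Info Record; the real function has a different signature.

```cpp
#include <cassert>
#include <string>

struct TypeInfo { std::string name; };

// Hypothetical overloads: one per type that has a type description.
inline const TypeInfo* GetTypeInfo(const int*) {
    static const TypeInfo ti{"int"};
    return &ti;
}

struct Base_s { int id; };

inline const TypeInfo* GetTypeInfo(const Base_s*) {
    static const TypeInfo ti{"Base_s"};
    return &ti;
}

typedef int Age_t;                          // alias: resolves to the int overload

struct Derived_s : Base_s { int extra; };   // has no type description of its own

// Looks up the description through the typed pointer, as the text describes.
template <class T>
const std::string& TypeNameOf(const T& v) {
    const TypeInfo* ti = GetTypeInfo(&v);
    assert(ti != nullptr);
    return ti->name;
}
```

An Age_t variable is reported as "int", which is the desired transparency. A Derived_s object is silently reported as "Base_s", because the pointer converts implicitly to its base; this is the problem the parenthetical remark warns about.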

Enumeration

Enumerations can be handled as integer numbers. This is a simple solution, and does not require any explanation. It works similarly to type definition.

However, it would be much better if we could use the names of the enumeration values instead of their numbers. It would also be helpful to avoid out-of-range values. There are good reasons for using enumerations, and Oops should make it possible to use them correctly.

For example, a date record should look like "day=22; month=May; year=2003" instead of simple numbers ("day=22; month=5; year=2003"). Please note that the numeric form is ambiguous: depending on how the months are represented, either 5 or 4 may mean May.

The solution is very simple from the Property Library point of view. A new Type Info Class has to be written for enumerations, which can convert the enumeration values between binary and string representation. The SetValStr() and GetValStr() functions have to be able to do the conversion.

Oops provides a template Type Info Class for handling enumerations. The class stores a map of the value and string pairs, and the SetValStr() and GetValStr() functions use this map to do the conversion. The constructor initializes the map.

Only one problem is left. C++ compilers do not know anything about enumeration names at run-time; therefore, the program has to provide this information when the Type Info Record is created. Generally, programmers write a function returning the string representation of the enumeration value, and they use this function when an enumeration value is printed or displayed. Similarly, the Oops Library uses a function to initialize the map. It would also be possible to provide a string containing all enumeration values, but the conversion function has some advantages when the enumeration values do not form a simple series of numbers, i.e. when there are holes (invalid values) between them.
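A map-based enumeration Type Info Class of this kind can be sketched as follows. EnumTypeInfo, its constructor arguments, and the way the map is filled (probing a numeric range with the conversion function) are assumptions for illustration; the article does not show how Oops initializes its map.

```cpp
#include <cassert>
#include <map>
#include <string>

enum Month_e { Jan = 1, Feb, Mar, Apr, May };

// Programmer-supplied conversion function; returns nullptr for invalid values.
inline const char* MonthName(int v) {
    switch (v) {
        case Jan: return "Jan";
        case Feb: return "Feb";
        case Mar: return "Mar";
        case Apr: return "Apr";
        case May: return "May";
        default:  return nullptr;
    }
}

// Hypothetical enumeration Type Info Class holding value/string pairs.
template <class E>
class EnumTypeInfo {
public:
    // Probes the range [lo, hi] with the conversion function; holes are skipped.
    EnumTypeInfo(const char* (*toStr)(int), int lo, int hi) {
        assert(toStr != nullptr && lo <= hi);
        for (int v = lo; v <= hi; ++v)
            if (const char* s = toStr(v)) { names_[v] = s; values_[s] = v; }
    }

    bool GetValStr(E value, std::string& out) const {
        auto it = names_.find(static_cast<int>(value));
        if (it == names_.end()) return false;
        out = it->second;
        return true;
    }

    bool SetValStr(E& target, const std::string& text) const {
        auto it = values_.find(text);
        if (it == values_.end()) return false;  // unknown names are rejected
        target = static_cast<E>(it->second);
        return true;
    }

private:
    std::map<int, std::string> names_;
    std::map<std::string, int> values_;
};
```

Because SetValStr() only accepts names found in the map, a stream cannot inject an out-of-range value into the enumeration.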

How to Use Oops

The implementation of Oops Library is complicated and not easy to understand in detail. However, it is quite easy to use it. There are two steps to using Oops:

Creating the property description.

Using objects and libraries having a Property Interface.

It is easy to define the properties and type descriptors with the provided macros, and these macros also make it possible to remove the Property Interface in one application and use it in another one.

On the other hand, Property Iterators make it easy to use the classes and objects having a property interface. For example, a simple recursive function is enough to list all properties on the standard output.
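A self-contained sketch of such a recursive listing is shown below. It is written against a hypothetical Prop tree rather than the real rProp_Iterator_c, so that it compiles on its own; the loop over children plays the role of Begin() and End(), and IsCntnr() decides whether a branch is opened.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical property node standing in for what a Property Iterator points to.
struct Prop {
    std::string       name;
    std::string       value;     // used only by leaves (Base Types)
    std::vector<Prop> children;  // non-empty for compound types and containers

    bool IsCntnr() const { return !children.empty(); }
};

// Recursively lists the properties, indenting each level of the tree.
inline void ListProps(const Prop& p, std::ostream& out, int depth = 0) {
    assert(depth >= 0);
    out << std::string(static_cast<std::size_t>(depth) * 2, ' ') << p.name;
    if (p.IsCntnr()) {
        out << '\n';
        for (const Prop& child : p.children)  // one level of the tree
            ListProps(child, out, depth + 1); // open the branch
    } else {
        out << " = " << p.value << '\n';
    }
}
```

Passing std::cout as the stream lists the tree on the standard output; with the real library, the same shape would be expressed with Property Iterators instead of a vector of children.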

Creating Type Info of Base Types

The users of the Oops Library rarely need to write Type Info Classes. The library implements them for all built-in types and strings. It also provides templates for STL containers and enumerations.

However, it is not difficult to write a new Type Info Class if your application has types better described as Base Types. The Type Info Classes provided by Oops are a good starting point. Just pick one of them and modify the member functions as needed to handle your data type. There are a few functions for converting data to string and binary formats. Writing these functions is 90% of the job.

Adding Property Interface to Classes

Adding property interface to your classes is very simple. You basically have to tell which classes you want to have a Property Interface and which members you want to be properties.

The Type Info Record and the Property Descriptor Table of the class have to be declared and implemented using the appropriate macros. The Oops Library has a lot of example programs and tutorials showing how to do it in different situations.

Using macros is not elegant and may be confusing, because people do not see what the macros are doing. There are two reasons for using macros to define the property description:

They make it possible to remove the property description when it is not needed. Just define all macros empty.

The macros make it easier to create the definition of the property description. They avoid writing the same thing several times and make the code more readable once you are used to them.

However, you do not have to use the macros. All code can be written by hand if you wish. Expanding the macros by hand is also a good way to debug the property description.

Conclusion

The problem of persistency has existed since objects and data records were invented. Consequently, many solutions exist for saving and loading objects. However, the authors consider Oops one of the best and, to their knowledge, the only one providing property-based persistency. Oops is not a new solution either. Its development started in 1996 and the first application (Speech Corrector, http://www.rcs.hu/boxoftricks) using property-based persistency has been on the market for several years.

The competitor libraries generally use different solutions. They can be divided into two groups:

Automated solutions

Persistency with read() and write() functions

In the first case, objects are saved as a block of binary data without knowing what it is. When the raw binary data is loaded back, some pointers (virtual method table) have to be updated. There are several tricky solutions for doing that, but even if they work, they are not an elegant solution. Moreover, the data stream will not be robust or human readable.

The second case is somewhat similar to the solution described here. It also requires some kind of run time type identification and functions for converting the objects to the format of the stream. The libraries provide little help to the programmer, who has to write one or two functions for reading and writing the object to the stream. This way, the programmer could do anything in these functions, but in most cases the variables are simply written to the stream. It is the responsibility of the programmer to read and write the same variables in the same order. The stream might be human readable, but it is never robust.

Property-based persistency as implemented by Oops is somewhat similar to the second solution. The conversion functions of the Type Info Records play a role similar to that of the read/write functions, which receive the stream and the object. Defining the Property Description requires about the same amount of work from the programmer as implementing the read/write functions. You can consider Oops an automated, robust, modular, and scalable implementation of that idea.

The Property Library handles the following features of C++ properly:

Supporting base types, compound types, and containers

Supporting type definitions and enumerations

Property can be anything:

Data member

Pointer

Member function

Array

Standard container (vector, list, and so on)

Multiple inheritance

Virtual inheritance

Sub-classes

Template classes

This list contains almost all possibilities of the language; however, there are some restrictions:

The classes must have a default constructor.

Reference members cannot be used as properties and they generally cannot be initialized in the default constructor. Therefore, it is difficult to add properties to classes having reference data members.
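The two restrictions can be checked at compile time with standard type traits. The sketch below uses hypothetical classes (Settings, WithReference, WithPointer) and a hypothetical helper createEmpty; the pointer-based variant is only one possible workaround and is not something this article prescribes.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper type referenced by the classes below.
struct Settings {
    int level = 0;
};

// A reference member must be bound in a constructor, so this class
// has no default constructor.
struct WithReference {
    explicit WithReference(Settings& s) : settings(s) {}
    Settings& settings;
};

// Possible workaround (an assumption of this sketch): hold a pointer,
// which can start out as null and be set after construction.
struct WithPointer {
    Settings* settings = nullptr;
};

static_assert(!std::is_default_constructible<WithReference>::value,
              "a class with a reference member cannot be default constructed");
static_assert(std::is_default_constructible<WithPointer>::value,
              "the pointer variant can be default constructed");

// Mimics what a generic loader has to do: create an empty object first
// and fill in its properties afterwards.
template <typename T>
T createEmpty() {
    static_assert(std::is_default_constructible<T>::value,
                  "a generic loader needs a default constructor");
    return T();
}

// Small usage example.
inline void demoRestrictions() {
    WithPointer object = createEmpty<WithPointer>();
    assert(object.settings == nullptr);
    Settings shared;
    object.settings = &shared;  // filled in after construction
    assert(object.settings->level == 0);
}
```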

The Oops Library has some important features not provided by any of the competitors:

Not all member variables become properties automatically.

Both member functions and data members can be properties.

Containers can immediately be used as properties.

Supports multiple and virtual inheritance.

The Stream Library of Oops implements property stream classes (text, XML, binary) that are human readable and robust; that is, they tolerate differently ordered, missing, and unknown values.
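The robustness described in the last item can be illustrated with a small name-keyed loader. This is a sketch under stated assumptions, not the stream format or the API of Oops: the "name=value" line format, the Config class, and the loadByName function are all invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical class; the member initializers act as default values.
struct Config {
    int width = 80;
    int height = 24;
    std::string title = "untitled";
};

// Loads "name=value" lines into a Config. Values may come in any order,
// missing names keep their defaults, and unknown names are skipped.
// Returns the number of unknown names that were skipped.
inline int loadByName(std::istream& in, Config& config) {
    const std::map<std::string, std::function<void(const std::string&)>> setters = {
        {"width",  [&config](const std::string& v) { config.width = std::stoi(v); }},
        {"height", [&config](const std::string& v) { config.height = std::stoi(v); }},
        {"title",  [&config](const std::string& v) { config.title = v; }},
    };
    int skipped = 0;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // not a name=value line
        const auto it = setters.find(line.substr(0, eq));
        if (it == setters.end()) {
            ++skipped;  // unknown property: ignore it and carry on
            continue;
        }
        it->second(line.substr(eq + 1));
    }
    return skipped;
}

// Small usage example: reordered, incomplete input with one unknown name.
inline void demoLoadByName() {
    std::istringstream in("title=demo\ncolor=red\nwidth=100\n");
    Config config;
    const int skipped = loadByName(in, config);
    assert(skipped == 1);
    assert(config.height == 24);  // missing value keeps its default
}
```

Because every value is tagged with the name of its property, the loader does not depend on the order in which the writer emitted the values.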

Glossary

Application Generator. Application Generators are program development tools making the program development process quick and easy. They provide a set of components and a nice graphical user interface where someone can build an application by adding and configuring components. The users of these programs do not have to be programmers to build an application and these tools provide a higher abstraction level than the traditional programming languages.

Base Type. The RTTI system provides an immediate description for base types. They are leaves of the property tree, and their internal structure cannot be investigated through the Property Interface.

Class. This word is used for a type introduced by a C++ class. The word "class" is used for the type, while the word "object" is used for its instances.

Compound Type. A C++ class or structure. The type is represented as a collection of members called properties.

Container Type. There are special types and classes in C++ designed for storing objects of different types. They are called containers. Static or dynamic arrays, the standard vector, list, map, and set are all containers.

Object. This word is used for an instance of C++ class or other types. The word "class" is used for the type, while the word "object" is used for its instances.

Property is a common word. Here, "property" is a member of a class or an item of a container made visible by Oops. All or just a few members (both member functions and data members) of classes may be properties. Any application can access properties without knowing the class itself.

Property Descriptor is an object describing a property. It stores information such as the offset of the data member, the address of the Type Info Record describing the type of the property, and the name of the property.

Property Descriptor Table is an array of Property Descriptors describing the properties of a class. Every compound type has its own "Property Descriptor Table." It is related to Type Information stored by Oops.
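As a rough illustration of the two entries above, a descriptor can be modeled as a small record holding a name, an offset, and a pointer to type information. All names in this sketch (TypeInfoSketch, PropertyDescriptorSketch, Motor, findProperty, propertyAddress) are assumptions made for the example and do not reflect the actual data structures of Oops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a Type Info Record.
struct TypeInfoSketch {
    const char* typeName;
    std::size_t size;
};

// Hypothetical descriptor: name, offset of the data member, type information.
struct PropertyDescriptorSketch {
    const char* name;
    std::size_t offset;
    const TypeInfoSketch* type;
};

// Example class; it is standard layout, so offsetof is well defined.
struct Motor {
    int speed;
    double torque;
};

static const TypeInfoSketch intInfo{"int", sizeof(int)};
static const TypeInfoSketch doubleInfo{"double", sizeof(double)};

// The descriptor table of Motor: one entry per property.
static const PropertyDescriptorSketch motorProperties[] = {
    {"speed", offsetof(Motor, speed), &intInfo},
    {"torque", offsetof(Motor, torque), &doubleInfo},
};

// Looks up a descriptor by property name; returns nullptr if it is unknown.
inline const PropertyDescriptorSketch* findProperty(const char* name) {
    for (const PropertyDescriptorSketch& descriptor : motorProperties) {
        if (std::strcmp(descriptor.name, name) == 0) return &descriptor;
    }
    return nullptr;
}

// Computes the address of the property inside a given object.
inline void* propertyAddress(void* object, const PropertyDescriptorSketch& descriptor) {
    return static_cast<unsigned char*>(object) + descriptor.offset;
}

// Small usage example: set a member by name without naming it in the code.
inline void demoDescriptor() {
    Motor motor{10, 2.5};
    const PropertyDescriptorSketch* d = findProperty("torque");
    assert(d != nullptr);
    *static_cast<double*>(propertyAddress(&motor, *d)) = 4.0;
    assert(motor.torque == 4.0);
}
```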

Property Iterator acts like a pointer pointing to a property. It can be incremented or decremented to get the next property. Property Iterators give access to the property and make it possible to go into the details of the pointed property by creating a new Property Iterator.
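The idea of stepping through properties and descending into one of them can be sketched with a small cursor over a tree. PropertyNode, PropertyCursor, and countLeaves are hypothetical names for this illustration, and the sketch assumes C++17, where std::vector may be instantiated with an incomplete element type. It is not the Property Iterator interface of Oops.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical property tree: leaves carry a value, compound nodes carry children.
struct PropertyNode {
    std::string name;
    std::string value;
    std::vector<PropertyNode> children;
};

// Hypothetical cursor over the properties on one level of the tree.
class PropertyCursor {
public:
    explicit PropertyCursor(const std::vector<PropertyNode>& siblings)
        : siblings_(&siblings), index_(0) {}

    // True while the cursor points at an existing property.
    bool valid() const { return index_ < siblings_->size(); }

    PropertyCursor& operator++() { ++index_; return *this; }
    // Stepping back from the first property makes the cursor invalid.
    PropertyCursor& operator--() { --index_; return *this; }

    const PropertyNode& operator*() const { return (*siblings_)[index_]; }

    // Creates a new cursor over the details of the pointed property.
    PropertyCursor descend() const { return PropertyCursor((**this).children); }

private:
    const std::vector<PropertyNode>* siblings_;
    std::size_t index_;
};

// Walks the whole tree and counts the leaf properties.
inline std::size_t countLeaves(PropertyCursor it) {
    std::size_t leaves = 0;
    for (; it.valid(); ++it) {
        if ((*it).children.empty()) {
            ++leaves;
        } else {
            leaves += countLeaves(it.descend());
        }
    }
    return leaves;
}
```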

Run Time Type Identification (RTTI) is a common term for type information available at run time. Generally, programming languages provide access to type information at compile time only, but sometimes this information is required at run time. Persistency is the best example of RTTI applications.

Type Information is a collection of the properties of types. Generally, it contains a string and a binary identifier of the type, some classification (base type, container, compound type), the size of the object, and a way to create and destroy objects of the given type.

Type Info Record is the instance of the type descriptor of a given type. Every type must have its own Type Info Record. The "Type Info Record" is given by the Oops library, or written by the user, or created automatically when the properties of a class are defined.
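The contents listed under Type Information can be pictured as a plain record. The sketch below is an assumption-laden illustration, not the Type Info Record of Oops: TypeKind, TypeInfoSketch, makeTypeInfo, and Sample are invented names. It also shows why a default constructor is needed, because the create function has nothing to pass to a constructor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classification of types.
enum class TypeKind { Base, Container, Compound };

// Hypothetical type information record.
struct TypeInfoSketch {
    const char* name;         // string identifier
    unsigned id;              // binary identifier
    TypeKind kind;            // classification
    std::size_t size;         // size of an object in bytes
    void* (*create)();        // creates a default-constructed object
    void (*destroy)(void*);   // destroys an object made by create
};

// Builds the record for a type T; T must have a default constructor.
template <typename T>
TypeInfoSketch makeTypeInfo(const char* name, unsigned id, TypeKind kind) {
    return TypeInfoSketch{
        name, id, kind, sizeof(T),
        []() -> void* { return new T(); },
        [](void* object) { delete static_cast<T*>(object); }
    };
}

// Example class used in the usage example below.
struct Sample {
    int value = 7;
};

// Small usage example: create and destroy an object through the record only.
inline void demoTypeInfo() {
    const TypeInfoSketch info = makeTypeInfo<Sample>("Sample", 1u, TypeKind::Compound);
    void* object = info.create();
    assert(static_cast<Sample*>(object)->value == 7);
    info.destroy(object);
}
```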
