Allocating objects vs. allocating storage

In my July column, I explained that Standard C and C++ offer somewhat different facilities for allocating and deallocating dynamic memory [1]. C provides a small collection of memory management functions: malloc, calloc, free, and realloc. Although C++ also provides these functions (for compatibility with C), C++ offers the new and delete operators as an arguably better alternative.

In C, you typically allocate dynamic memory for an object of type T by using an expression of the form:

pt = malloc(sizeof(T));

where pt is presumably declared as a "pointer to T". In C++, you typically use a new-expression instead of calling malloc, as in:

pt = new T;

The differences between these two notations are not just superficial. C and C++ handle dynamic allocation in a fundamentally different way: whereas malloc allocates raw storage of indeterminate value, new can create objects of abstract types with coherent initial values. Using new-expressions instead of malloc reduces the possibility of runtime errors arising from questionable pointer conversions and improper initializations.

This month I'll explain how new-expressions interact with constructors and allocation functions in C++. I'll also explain how C programmers can employ a style of memory allocation that derives much of the benefit of using new-expressions.

New-expressions and constructors

In C++, a constructor is a special class member function that initializes objects of its class type. A constructor's function name is always the same as its class name, as in:

class widget
{
public:
    widget(); // a constructor
    ...
};

Constructors provide guaranteed initialization for class objects. Although you declare constructors when you define a class, you don't write calls to those constructors--the compiler generates them for you. Whenever you define an object with a class type, the compiler automatically plants a call to the object's constructor at the right place in the program.
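Here's a minimal sketch of what "the compiler plants the call" means in practice. The constructed counter, the value member, and the count_constructions function are hypothetical additions for illustration only; they aren't part of the widget class shown above.

```cpp
#include <cassert>

// Hypothetical widget: the counter exists only to make constructor calls visible.
class widget
{
public:
    widget() : value(42) { ++constructed; } // a constructor
    int value;
    static int constructed; // number of widgets constructed so far
};

int widget::constructed = 0;

// Defines two widgets and reports how many constructor calls took place.
int count_constructions()
{
    int before = widget::constructed;
    widget a; // no explicit call here, yet widget::widget() runs
    widget b; // and it runs again here
    assert(a.value == 42 && b.value == 42);
    return widget::constructed - before;
}
```

Calling count_constructions() should report two constructor calls, even though the function body never mentions the constructor by name.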

For guaranteed initialization to really be guaranteed, the compiler must generate a call to a constructor wherever the source code creates an object, including in new-expressions. Thus, for a class type such as widget, a new-expression such as:

pw = new widget;

doesn't just allocate storage for a widget; it applies widget's constructor to that storage to produce a properly constructed object.
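As a rough illustration, the following sketch assumes a widget whose constructor sets a hypothetical size member to 10. Neither the member nor the size_of_new_widget function comes from the original class; they just make the constructor's effect observable.

```cpp
#include <cassert>

// Hypothetical widget whose constructor establishes a known initial state.
class widget
{
public:
    widget() : size(10) { } // a constructor
    int size;
};

// Creates a widget with a new-expression and returns the value
// that the constructor stored in it.
int size_of_new_widget()
{
    widget *pw = new widget; // allocates storage, then applies widget::widget()
    int n = pw->size;        // already initialized by the constructor
    assert(n == 10);
    delete pw;               // destroys the object and releases its storage
    return n;
}
```

The point is that no code between the new-expression and the first use of pw has to initialize anything; the object arrives already constructed.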

The primary reason to avoid calling malloc in C++ is that doing so voids the initialization guarantee. Although the conventional C style for calling malloc uses a sizeof expression applied to a type, as in:

pw = malloc(sizeof(widget));

the call actually just passes an integer (the size of the type in bytes). Since malloc never knows the type of whatever it's allocating, it always returns the address of the allocated storage as a void *. Thus, the compiler loses the compile-time type information it needs to choose a constructor.
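One way to see this is to ask the compiler what types are involved. The sketch below assumes a C++11 or later compiler (for static_assert and decltype); the widget definition and the allocate_by_size_only function are hypothetical and serve only as illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Hypothetical widget used only to have something to measure.
class widget
{
public:
    widget() : size(10) { }
    int size;
};

// sizeof(widget) is just an integer of type std::size_t ...
static_assert(std::is_same<decltype(sizeof(widget)), std::size_t>::value,
              "sizeof yields a std::size_t");
// ... and malloc's result carries no trace of widget either.
static_assert(std::is_same<decltype(std::malloc(sizeof(widget))), void *>::value,
              "malloc returns void *");

// Returns true if malloc handed back storage when given only a byte count.
bool allocate_by_size_only()
{
    std::size_t n = sizeof(widget); // the only thing malloc gets to see
    assert(n > 0);
    void *p = std::malloc(n);       // no mention of widget in this call
    bool got_storage = (p != nullptr);
    std::free(p);
    return got_storage;
}
```

Storing the size in n first makes it plain that the type name never reaches malloc: by the time of the call, all that remains is a number.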

As I explained in my previous column on dynamic allocation, the conventional C style for calling malloc provokes a warning or an error message when compiled as C++ [1]. If you want the assignment to compile in C++, you must use a cast, as in:

pw = static_cast<widget *>(malloc(sizeof(widget)));

Casting the pointer result from void * to widget * doesn't affect the contents of the allocated storage. It just forces the compiler to yield to your request to change the pointer's type. Executing this statement leaves pw pointing to an uninitialized widget.
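The sketch below contrasts the two approaches by counting constructor calls. The constructed counter and both helper functions are hypothetical, added purely for illustration. The code deliberately avoids reading the malloc'ed widget, since its contents are indeterminate.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical widget: the counter records each constructor call.
class widget
{
public:
    widget() { ++constructed; } // a constructor
    static int constructed;
};

int widget::constructed = 0;

// Allocates storage for a widget with malloc and a cast, and returns
// the number of constructor calls that resulted.
int constructions_via_malloc()
{
    int before = widget::constructed;
    widget *pw = static_cast<widget *>(std::malloc(sizeof(widget)));
    assert(pw != nullptr);                  // storage exists, but no widget was constructed
    int ran = widget::constructed - before; // the cast ran no constructor
    std::free(pw);
    return ran;
}

// Creates a widget with a new-expression and returns the number of
// constructor calls that resulted.
int constructions_via_new()
{
    int before = widget::constructed;
    widget *pw = new widget;
    int ran = widget::constructed - before;
    delete pw;
    return ran;
}
```

Under these assumptions, constructions_via_malloc() reports zero constructor calls and constructions_via_new() reports one.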

You could say that it's the cast, not malloc, that voids the initialization guarantee. I wouldn't argue with you. However, it's nonetheless difficult to use malloc in C++ without a cast, so the conclusion is still the same: prefer new-expressions to malloc calls.

As with almost any other function, a constructor can have parameters, possibly many. For example, this widget class has a constructor with a single parameter of type int:

class widget
{
public:
    widget(int n); // a constructor
    ...
};

Using this class, a new-expression such as:

pw = new widget;

won't compile because widget's constructor now requires an argument, which this new-expression doesn't provide. In this case, you must provide a parenthesized argument list after the type name in the new-expression, as in: