Stefan has been working in the games industry as a programmer since 2004. Over the last few years he has worked on multi-platform technology for PC, Xbox 360, PlayStation 3 and Wii, and he now focuses on building middleware technology. Stefan can be found on LinkedIn and Facebook, and shares his thoughts on his programming-related blog.

Building a memory system – Part 1: Fundamentals

Today, I want to start a series on how to build your own memory system for use in your game or engine. The series will cover how to handle allocations with vastly different lifetimes using specialized allocators, how to handle alignment restrictions, how to implement debugging features like memory tracking and tagging, and more. Before we can start, we need to delve into the inner workings of new, new[], delete and delete[] – you may be surprised by some of the subtleties involved.

In order to keep things simpler and only concentrate on the crucial elements, we don’t deal with per-class new/delete, and we don’t want to mess with exceptions either, as they are rarely used in run-time game code.

new operator / operator new

The first thing to understand is that there is a difference between the new operator and operator new. Let’s look at a very simple statement involving the keyword new:

T* object = new T;

This is the simplest form of the new operator, probably used in many, many places in your code. What does it really do behind the scenes?

First, a call to operator new (note the difference!) is made to request storage for a single T.

Second, the constructor for T is called, which constructs the new instance of T at the memory address returned by the previous call to operator new.

If T is of fundamental type (e.g. int, float, etc.), or does not have a constructor, no constructor will be called. The above statement will call the simplest form of operator new:

void* operator new(size_t bytes);

Notice the size_t argument – the compiler will automatically insert code for calling operator new with the correct size for a given type, which is sizeof(T) in our case. Because operators behave like ordinary functions, they can be called manually, and can have different overloads as well. Those overloads can also be invoked by using different versions of the new operator, with the compiler generating code for calling the corresponding version of operator new. In fact, there’s another standard version of the new operator, with so-called placement syntax:
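The placement form is written as new (address) T. The following is only a minimal sketch of it: Vec2 and ConstructInPlace are hypothetical names, and a local buffer stands in for a fixed address such as 0x100 so that the code can actually run.

```cpp
#include <cassert>
#include <new>      // declares void* operator new(std::size_t, void*)

// Hypothetical example type, not taken from the article.
struct Vec2
{
    Vec2() : x(1.0f), y(2.0f) {}
    float x;
    float y;
};

// Constructs a Vec2 inside caller-provided storage and reports whether the
// object lives exactly at the start of that storage.
bool ConstructInPlace()
{
    // Raw storage with the right size and alignment; no heap allocation.
    alignas(Vec2) unsigned char storage[sizeof(Vec2)];

    // Placement syntax: calls operator new(sizeof(Vec2), storage), which
    // simply returns 'storage', then runs Vec2::Vec2() at that address.
    Vec2* object = new (storage) Vec2;

    const bool sameAddress =
        static_cast<void*>(object) == static_cast<void*>(storage);
    const bool constructed = (object->x == 1.0f) && (object->y == 2.0f);

    // Objects created this way are destroyed by calling the destructor
    // manually; there is no memory to hand back.
    object->~Vec2();

    return sameAddress && constructed;
}
```

Calling ConstructInPlace() exercises the constructor without a single allocation.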

This can be used to construct instances of classes at a certain place in memory, which in essence is the only way of calling constructors “directly”, because no memory allocation is involved here – the above calls a different overload of operator new, which is the following:

void* operator new(size_t bytes, void* ptr);

Even though this form of operator new takes size_t as the first argument, it does not allocate any memory, and just returns the pointer given in the second argument. That’s why our example simply invokes the constructor T::T() at address 0x100.

The placement-syntax of the new operator is very powerful, because it allows us to invoke different overloads of operator new with an unlimited number of custom arguments. The only rule is that the first argument to every operator new must always be of type size_t, which will automatically be passed to it by the compiler.

The magic of calling operator new is simply done by the compiler. Furthermore, remember that every overload of operator new can be called directly like ordinary functions, and we can do whatever we want with the different overloads. For example, we can even use templates if we want to:
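As a rough illustration of a templated overload, the sketch below assumes a hypothetical LinearArena<Capacity> bump allocator; none of these names come from a real library, and alignment handling is deliberately left out. The overloads are marked noexcept so that a failed allocation yields a null pointer instead of undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size bump allocator (illustration only).
template <std::size_t Capacity>
class LinearArena
{
public:
    LinearArena() : m_offset(0) {}

    // Hands out the next 'bytes' bytes, or nullptr if the arena is full.
    void* Allocate(std::size_t bytes)
    {
        if (bytes > Capacity - m_offset)
            return nullptr;
        void* ptr = m_buffer + m_offset;
        m_offset += bytes;
        return ptr;
    }

    std::size_t Used() const { return m_offset; }

private:
    alignas(std::max_align_t) unsigned char m_buffer[Capacity];
    std::size_t m_offset;
};

// Templated overload of operator new: the first parameter is always the
// size, which the compiler fills in; Capacity is deduced from the argument.
template <std::size_t Capacity>
void* operator new(std::size_t bytes, LinearArena<Capacity>& arena) noexcept
{
    return arena.Allocate(bytes);
}

// Matching overload of operator delete. The arena releases everything at
// once, so there is nothing to do here.
template <std::size_t Capacity>
void operator delete(void*, LinearArena<Capacity>&) noexcept
{
}

// Allocates two ints from a 64-byte arena and returns the bytes consumed.
std::size_t AllocateTwoInts()
{
    LinearArena<64> arena;
    int* a = new (arena) int(1);
    int* b = new (arena) int(2);
    assert(a != nullptr && b != nullptr && a != b);
    assert(*a == 1 && *b == 2);
    return arena.Used();
}
```

Restricting the template to LinearArena<Capacity> rather than an unconstrained parameter keeps the overload from accidentally matching unrelated placement arguments.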

This comes in handy later when we’re about to use different allocators, and want to provide additional arguments such as alignment boundaries. The placement-syntax allows us to conveniently allocate memory with, for example, the following single-line statement:

T* object = new (allocator, alignment, __FILE__, __LINE__) T;
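For a statement like this to compile, a matching overload of operator new has to exist. The sketch below shows one possible shape, assuming a hypothetical StackArena type with an Allocate(bytes, alignment, file, line) member; the interface is an assumption made for illustration, not the allocator design this series will arrive at.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>   // std::align
#include <new>

// Hypothetical arena that honours an alignment request and remembers where
// the last allocation came from (illustration only).
class StackArena
{
public:
    StackArena()
        : m_cursor(m_buffer), m_space(sizeof(m_buffer)),
          m_lastFile(nullptr), m_lastLine(0) {}

    void* Allocate(std::size_t bytes, std::size_t alignment,
                   const char* file, int line)
    {
        void* ptr = m_cursor;
        // std::align moves 'ptr' up to the next 'alignment' boundary and
        // shrinks m_space by the padding; it returns nullptr if the
        // request does not fit.
        if (std::align(alignment, bytes, ptr, m_space) == nullptr)
            return nullptr;
        m_cursor = static_cast<unsigned char*>(ptr) + bytes;
        m_space -= bytes;
        m_lastFile = file;
        m_lastLine = line;
        return ptr;
    }

    const char* LastFile() const { return m_lastFile; }
    int LastLine() const { return m_lastLine; }

private:
    unsigned char m_buffer[256];
    unsigned char* m_cursor;
    std::size_t m_space;
    const char* m_lastFile;
    int m_lastLine;
};

// Overload selected by: new (arena, alignment, __FILE__, __LINE__) T
void* operator new(std::size_t bytes, StackArena& arena,
                   std::size_t alignment, const char* file, int line) noexcept
{
    return arena.Allocate(bytes, alignment, file, line);
}

// Matching operator delete; the arena frees its memory as a whole.
void operator delete(void*, StackArena&, std::size_t, const char*, int) noexcept
{
}

struct Particle { float position[4]; };  // hypothetical payload type

// Returns true if the instance landed on a 16-byte boundary and the arena
// recorded the call site.
bool AllocateAligned()
{
    StackArena arena;
    Particle* p = new (arena, 16, __FILE__, __LINE__) Particle;
    if (p == nullptr)
        return false;
    const bool aligned = (reinterpret_cast<std::uintptr_t>(p) % 16) == 0;
    return aligned && arena.LastFile() != nullptr && arena.LastLine() > 0;
}
```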

delete operator / operator delete

This is probably no big surprise, but again, it is crucial to understand that there is a difference between the delete operator and operator delete. Calling the delete operator on a previously new'ed instance will first call the destructor, and then operator delete. Apart from the reverse order of operations, there’s another difference between new and delete: Regardless of which form of new we used to create the instance, the same version of operator delete will always be called (which is rather unfortunate when trying to implement advanced memory system techniques in later parts of the series):
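This behaviour can be observed with a small experiment. The sketch below replaces the global operator new and operator delete with counting versions built on malloc/free and adds a placement overload taking a hypothetical Tag type; all names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static int g_plainDeleteCalls = 0;   // calls to operator delete(void*)
static int g_taggedDeleteCalls = 0;  // calls to operator delete(void*, Tag)

// Replace the global allocation functions so that every form of new used
// here shares one heap (malloc/free) and calls can be counted.
void* operator new(std::size_t bytes)
{
    void* ptr = std::malloc(bytes != 0 ? bytes : 1);
    if (ptr == nullptr)
        throw std::bad_alloc();
    return ptr;
}

void operator delete(void* ptr) noexcept
{
    ++g_plainDeleteCalls;
    std::free(ptr);
}

// Sized form (used by newer compilers); forwards to the version above.
void operator delete(void* ptr, std::size_t) noexcept
{
    ::operator delete(ptr);
}

struct Tag {};  // hypothetical marker type selecting the placement overload

void* operator new(std::size_t bytes, Tag)
{
    return ::operator new(bytes);
}

// Only called automatically if a constructor throws during new (tag) T.
void operator delete(void* ptr, Tag) noexcept
{
    ++g_taggedDeleteCalls;
    std::free(ptr);
}

struct Widget
{
    Widget() : value(42) {}
    ~Widget() {}
    int value;
};

// Returns true if 'delete' used the plain operator delete even though the
// instance was created with the Tag overload of operator new.
bool DeleteIgnoresPlacementForm()
{
    Tag tag;
    Widget* object = new (tag) Widget;

    const int plainBefore = g_plainDeleteCalls;
    const int taggedBefore = g_taggedDeleteCalls;

    delete object;  // runs ~Widget(), then the plain operator delete

    return (g_plainDeleteCalls == plainBefore + 1) &&
           (g_taggedDeleteCalls == taggedBefore);
}
```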

The only time the compiler calls the corresponding operator delete is when the constructor throws an exception during a new expression, so that the memory can be freed correctly before the exception is propagated to the calling site. This is also the reason why every overload of operator new should have a corresponding version of operator delete, even if it is never called otherwise. But let’s not digress; we don’t want to deal with exceptions any further.

Like operator new, operator delete can also be called directly (like an ordinary function):
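A minimal sketch of doing both halves by hand, using a hypothetical Counter type that tracks how many instances are alive: raw storage comes from operator new, the constructor runs via placement new, and destruction mirrors this with an explicit destructor call followed by operator delete.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts live instances.
struct Counter
{
    static int alive;
    Counter() { ++alive; }
    ~Counter() { --alive; }
};
int Counter::alive = 0;

// Performs by hand what 'new Counter' and 'delete object' do implicitly.
// Returns the number of live instances afterwards.
int CreateAndDestroyManually()
{
    // Step 1 of the new operator: request raw storage.
    void* memory = ::operator new(sizeof(Counter));

    // Step 2: construct the instance in that storage.
    Counter* object = new (memory) Counter;
    assert(Counter::alive == 1);

    // The delete operator, spelled out: destructor first ...
    object->~Counter();

    // ... then operator delete, called like an ordinary function.
    ::operator delete(memory);

    return Counter::alive;
}
```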

If instances are created with the simple placement-form of new, the destructor must always be called manually. Using delete on such an instance would invoke undefined behaviour (because the memory was never allocated with a call to operator new). Keep this in mind whenever you use placement new!

Having thoroughly discussed new/delete, let us take a look at their array siblings, new[] and delete[].

new[] / delete[]

Even though you have probably used them a thousand times already, you may not realize that in something as fundamental as new[] and delete[], there’s already compiler magic involved. The reason for this is that the C++ standard just mandates what new[] and delete[] should do, but not how. Let us take a closer look, again starting with a simple example:

int* i = new int[3];

Similar to the new operator, the above allocates storage for three ints by calling operator new[] (requesting memory), and since int is a fundamental type, there are no constructors to call. Similar to operator new, we can overload operator new[] and use placement-syntax as well:
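A small sketch of such an overload, assuming a hypothetical CountingHeap type that records what was requested. The raw block is released through the recorded pointer rather than the pointer returned by the new expression, which sidesteps any assumption about hidden bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical heap wrapper that records the last request (illustration only).
struct CountingHeap
{
    std::size_t lastRequest;
    void* lastBlock;
};

// Overload selected by: new (heap) T[n]
void* operator new[](std::size_t bytes, CountingHeap& heap) noexcept
{
    heap.lastRequest = bytes;
    heap.lastBlock = std::malloc(bytes);
    return heap.lastBlock;
}

// Matching overload; can also be called directly like an ordinary function.
void operator delete[](void* ptr, CountingHeap&) noexcept
{
    std::free(ptr);
}

// Allocates three ints through the overload and returns the number of
// bytes the compiler asked for.
std::size_t AllocateThreeInts()
{
    CountingHeap heap = { 0, nullptr };

    int* numbers = new (heap) int[3];
    if (numbers == nullptr)
        return 0;

    numbers[0] = 1;
    numbers[1] = 2;
    numbers[2] = 3;
    assert(numbers[0] + numbers[1] + numbers[2] == 6);

    // int has no destructor, so releasing the raw block is all that is left.
    operator delete[](heap.lastBlock, heap);

    return heap.lastRequest;
}
```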

delete[] and operator delete[] behave similarly to delete and operator delete – we can call operator delete[] directly if we wish, but must make sure to call the destructors manually (in reverse order). Nothing too fancy, but what happens with non-POD types?
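To see what the compiler requests for such a type, one can record the size passed to an overload of operator new[]. The sketch below assumes a hypothetical RecordingHeap helper and a small class Test with a user-provided destructor; the measured overhead is implementation-specific, so the code reports it instead of hard-coding a value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// A small non-POD type: the user-provided destructor is what matters here.
struct Test
{
    Test() : value(0) {}
    ~Test() {}
    int value;
};

// Hypothetical helper that remembers the last request (illustration only).
struct RecordingHeap
{
    std::size_t requested;
    void* block;
};

void* operator new[](std::size_t bytes, RecordingHeap& heap) noexcept
{
    heap.requested = bytes;
    heap.block = std::malloc(bytes);
    return heap.block;
}

void operator delete[](void* ptr, RecordingHeap&) noexcept
{
    std::free(ptr);
}

// Returns how many bytes were requested beyond 3 * sizeof(Test).
std::size_t ArrayOverheadBytes()
{
    RecordingHeap heap = { 0, nullptr };

    Test* objects = new (heap) Test[3];
    if (objects == nullptr)
        return 0;

    const std::size_t overhead = heap.requested - 3 * sizeof(Test);

    // Manual clean-up: destructors in reverse order, then the raw block
    // exactly as our overload handed it out.
    for (int i = 3; i > 0; --i)
        objects[i - 1].~Test();
    operator delete[](heap.block, heap);

    return overhead;
}
```

On mainstream compilers the returned overhead is non-zero for this type; the exact figure depends on the compiler and target platform.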

Even though sizeof(Test) == 4 (MSVC 2010, Windows 32-bit platform), our version of operator new[] will get called with an argument of 16 bytes, instead of 12 bytes. Why? Think about how the array needs to be deleted:

delete[] objects;

The compiler must somehow know how many instances of type Test are to be deleted – otherwise it can’t call the instances’ destructors. So what almost every compiler does upon a call to new[] is the following:
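The common scheme is to request room for the N instances plus a few extra bytes, store N in those extra bytes, construct the instances, and return a pointer just past the stored count. The sketch below emulates this by hand, together with the matching clean-up. NewArray, DeleteArray and HeaderSize are hypothetical helpers, the header layout is an assumption rather than any particular compiler's, and types with an alignment larger than what malloc guarantees are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Size of the hidden header: big enough for the count and a multiple of
// T's alignment, so the first element stays correctly aligned.
template <class T>
std::size_t HeaderSize()
{
    const std::size_t align = alignof(T);
    return ((sizeof(std::size_t) + align - 1) / align) * align;
}

// Emulates 'new T[count]'.
template <class T>
T* NewArray(std::size_t count)
{
    const std::size_t header = HeaderSize<T>();
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(header + count * sizeof(T)));
    if (raw == nullptr)
        return nullptr;

    // Store the count directly in front of the first element.
    std::memcpy(raw + header - sizeof(count), &count, sizeof(count));

    // Construct the instances one after another with placement new.
    T* first = reinterpret_cast<T*>(raw + header);
    for (std::size_t i = 0; i < count; ++i)
        new (static_cast<void*>(first + i)) T;

    return first;
}

// Emulates 'delete[] first'.
template <class T>
void DeleteArray(T* first)
{
    if (first == nullptr)
        return;

    unsigned char* bytes = reinterpret_cast<unsigned char*>(first);

    // Go back from the given pointer to read the stored count.
    std::size_t count = 0;
    std::memcpy(&count, bytes - sizeof(count), sizeof(count));

    // Call the destructors in reverse order, then free the raw block.
    for (std::size_t i = count; i > 0; --i)
        first[i - 1].~T();
    std::free(bytes - HeaderSize<T>());
}

// Hypothetical test type that counts live instances.
struct Tracked
{
    static int alive;
    Tracked() : value(0) { ++alive; }
    ~Tracked() { --alive; }
    int value;
};
int Tracked::alive = 0;

// Returns true if three instances were constructed and all were destroyed.
bool RoundTrip()
{
    Tracked* objects = NewArray<Tracked>(3);
    if (objects == nullptr)
        return false;
    const bool allConstructed = (Tracked::alive == 3);
    DeleteArray(objects);
    return allConstructed && (Tracked::alive == 0);
}
```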

Most of the time – if you don’t specify a particular alignment for a class using __attribute__((aligned)) (GCC) – the number of extra bytes requested will be 4, but that depends on both your class’ alignment restrictions and the compiler. For example, if you specify Test to be 16-byte aligned using __declspec(align(16)), MSVC 2010 will request a total of 64 bytes in the example above.

As an example, let us use the definition of class Test from above: If your overload of operator new[] returns the memory address 0x100, Test* objects will point to 0x104, because of the extra 4 bytes requested by the implementation! The memory layout of the 16 bytes would then be:

0x100 – 0x103: N, the number of instances (3)
0x104 – 0x107: objects[0]
0x108 – 0x10B: objects[1]
0x10C – 0x10F: objects[2]

When delete[] is used later on, the compiler inserts code which reads the number of instances N by going back 4 bytes from the given pointer, and calls the destructors in reverse order – but only if the type has a destructor that needs to be called (i.e. it is non-POD). Otherwise, no 4-byte overhead is added in the first place, because there are no destructors to call (as in the new int[3] example above). If you ever wondered what the vector deleting destructor() in MSVC is for, there is your answer. Unfortunately, this compiler-defined behaviour causes problems when using our own overloads of operator new[] and operator delete[].

As an example for using custom overloads, we might want to pass alignment restrictions to operator new[], and return correctly aligned memory (e.g. on a 16-byte boundary). However, the compiler-implicit offset that is added to whatever we return will definitely screw with our alignment. Additionally, when we want to call operator delete[] directly, we somehow need to figure out how many destructors to call (if any).

Which we can’t.

The reason is that we can never be sure whether the compiler inserted some extra bytes (4, 8, or possibly more) in the allocation or not. This is totally compiler-dependent. It might work, but it could also horribly break with some user-defined types. And other compilers could do it differently altogether.

This is also the reason why using delete on instances allocated with new[] invokes undefined behaviour, and vice versa. The compiler-generated code simply tries to access memory which doesn’t belong to it (using delete[] for allocations via new), or not all instances of an array are correctly destructed (using delete for allocations via new[]). This can have zero consequences (if the types don’t have a destructor), or crash your code (if the types have a destructor).

However, with the knowledge of what happens behind the scenes with calls to new, new[], delete and delete[], we can build our own allocation functions which correctly handle simple and array allocations for all types, can use our custom allocators, provide additional information like file name and line number, and more. The next post in this series will show how.

In the meantime, in case you want to read more about global operator new and class operator new (which we didn’t discuss here), here are recommended links: