The Placement New Operator

I’d like to shed a little bit of light on a dusty corner of the C++ language: there’s more than one “new” operator! Well, since you’ve likely encountered the vector new (new[]) operator, I should say there’s more than two “new” operators! I want to cover the concept of a “placement new” operator.
In C++, you cannot call a constructor directly. There’s no way for you to say: someTest->Test(). This makes sense; how would you get an instance of someTest without having called one of its constructors first? But this leaves an interesting problem for the programmer: what do you do when you want to take care of the allocation side of things yourself, but still want the compiler to call your constructor?

There’s the “shotgun” approach where you simply define an operator new function at the global scope. Now every time new is called, your operator new function is fired and you’ve got control over the allocation. But what do you do when you only want the special behavior for some particular class or classes? Then you can do the “rifle” approach where you define an operator new function within the classes you wish to control allocation of. In this way, your new allocator will be called when an instance of your special class is required, but the default new allocator will be called every other time. But this still leaves another case unhandled: what if you want to use the special allocator sometimes, but not other times? That’s where the placement new operator comes in — you can overload this operator, and then the choice goes to the developer as to whether they want to use the special allocator or not when creating the class instance.
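To make the “rifle” approach concrete, here is a minimal sketch of a class-level allocator. The class name `Tracked`, its counter, and the choice of malloc/free are assumptions made up for illustration; they are not taken from any particular codebase.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class that routes its own allocations through malloc/free
struct Tracked {
    static int live_allocations;   // blocks currently handed out by our allocator

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++live_allocations;
        return p;
    }

    static void operator delete(void* p) noexcept {
        if (p) --live_allocations;
        std::free(p);
    }

    int value = 0;
};

int Tracked::live_allocations = 0;

// Returns the counter observed while one Tracked object is alive
int rifle_demo() {
    Tracked* t = new Tracked;      // uses Tracked::operator new
    int* i = new int(7);           // uses the default global operator new
    int seen = Tracked::live_allocations;
    delete i;
    delete t;                      // uses Tracked::operator delete
    return seen;
}
```

Only `Tracked` goes through the custom allocator; the `int` allocation is untouched, which is the difference from the “shotgun” approach.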

Why would you ever want to do this?

Before I answer that question, I want to go into what the placement new operator looks like, how it’s typically used, and what problems can crop up from it.

Creating the placement new operator function is quite similar to creating a regular new operator function.
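As a rough sketch of that similarity, the class below declares a regular operator new next to a placement form. The class `Widget` and the `int tag` parameter are invented for illustration; any parameter list after the size would do.

```cpp
#include <cstddef>
#include <new>

struct Widget {
    // Regular form: only the size parameter
    static void* operator new(std::size_t size) {
        return ::operator new(size);
    }

    // Placement form: size first, then whatever extra parameters you choose.
    // `tag` is a made-up parameter for illustration.
    static void* operator new(std::size_t size, int tag) {
        last_tag = tag;
        return ::operator new(size);
    }

    // Matching placement delete, used only if a constructor throws
    static void operator delete(void* p, int /*tag*/) noexcept {
        ::operator delete(p);
    }

    // Usual delete for fully constructed objects
    static void operator delete(void* p) noexcept {
        ::operator delete(p);
    }

    static int last_tag;
};

int Widget::last_tag = 0;

int placement_form_demo() {
    Widget* w = new (42) Widget;   // extra arguments go in parentheses after `new`
    int tag = Widget::last_tag;
    delete w;
    return tag;
}
```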

You have complete control over what parameters show up after the size parameter, and what parameters you pick depends entirely on what you want to use the function for. The C++ language specification defines two different placement new operators (well, technically there are four, since two are scalar and two are vector):
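For reference, the scalar forms are declared in `<new>` roughly as shown in the comment below (written with C++11-style `noexcept`). The function `standard_forms_demo` is a hypothetical helper that calls both allocation functions directly, without constructing anything, just to show what each one returns.

```cpp
#include <cstddef>
#include <new>

// Declared by <new> (scalar forms; the vector forms are named operator new[]):
//   void* operator new(std::size_t size, const std::nothrow_t&) noexcept;
//   void* operator new(std::size_t size, void* ptr) noexcept;

bool standard_forms_demo() {
    // First form: allocates, but reports failure with nullptr
    void* a = ::operator new(16, std::nothrow);
    if (!a) return false;

    // Second form: allocates nothing and returns the pointer it was given
    alignas(std::max_align_t) unsigned char buf[16];
    void* b = ::operator new(sizeof buf, buf);
    bool same = (b == static_cast<void*>(buf));

    ::operator delete(a);
    return same;
}
```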

The first form is a way for you to call new without having it throw an exception if problems occur during allocation. The second form is a way for you to call new and say “do not allocate memory, just use this memory I’ve already provided for you.” The way you use these at a call site is:
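A minimal sketch of such a call site follows. The names `sc`, `sc2`, and `SomeClass` match the ones used in the rest of this post; the local `storage` buffer and the function `call_site_demo` are assumptions added so the snippet stands on its own. The cleanup lines at the end are explained in the paragraphs that follow.

```cpp
#include <new>

struct SomeClass {
    int value = 0;
};

bool call_site_demo() {
    // First form: returns nullptr instead of throwing std::bad_alloc
    SomeClass* sc = new (std::nothrow) SomeClass;
    if (!sc) return false;

    // Second form: construct into storage we already own
    alignas(SomeClass) unsigned char storage[sizeof(SomeClass)];
    SomeClass* sc2 = new (storage) SomeClass;

    bool in_place = (static_cast<void*>(sc2) == static_cast<void*>(storage));

    sc2->~SomeClass();   // no memory to free; the buffer is a local array
    delete sc;           // memory came from the regular allocator
    return in_place;
}
```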

However, the placement delete operator is a bit of an oddity — the only time placement delete will be called is if the initializer or constructor called during a placement new throws an exception. This is required because otherwise there’s potential for a memory leak to happen. However, once the object is fully constructed, the placement delete operator will never be called for that instance!
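The sketch below tries to make that behavior visible. The `Tag` type, the `Fragile` class, and the call counter are all hypothetical; the point is only that the placement delete runs when the constructor throws and stays silent otherwise.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

struct Tag {};                       // hypothetical placement argument type

static int placement_delete_calls = 0;

struct Fragile {
    explicit Fragile(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }

    static void* operator new(std::size_t size, Tag) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }

    // Called by the compiler only when the constructor above throws
    static void operator delete(void* p, Tag) noexcept {
        ++placement_delete_calls;
        std::free(p);
    }

    // Usual deallocation function for fully constructed objects
    static void operator delete(void* p) noexcept {
        std::free(p);
    }
};

int placement_delete_demo() {
    placement_delete_calls = 0;
    try {
        Fragile* f = new (Tag{}) Fragile(true);   // throws; placement delete runs
        delete f;                                 // never reached
    } catch (const std::runtime_error&) {
    }

    Fragile* ok = new (Tag{}) Fragile(false);     // construction succeeds
    delete ok;                                    // usual delete, not placement delete
    return placement_delete_calls;
}
```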

This brings up an interesting question — from our example code above, how do we destroy the SomeClass object for sc2? Normally, we’d say “call delete on it”, except what would that do? It would fire the destructor method, which is good. But then it would call a deallocation function (operator delete), which can’t possibly know how to free the memory we used in the placement new. The answer is: you explicitly call the destructor yourself!

sc2->~SomeClass();
// Presumably the memory for sc2 will be freed or reclaimed some time later

So what about the sc object? Do we have to call the destructor explicitly there too? The answer to that lies in the C++ specification, Section 18.6.1.1 Clauses 7 and 8. You can imagine the std::nothrow_t version of new to be implemented like this:
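A sketch of that idea is shown here under a stand-in name, `nothrow_new_sketch`, so that it does not replace the real global function. The actual library implementation may differ; this only illustrates the catch-and-return-null shape.

```cpp
#include <cstddef>
#include <new>

// Stand-in for: void* operator new(std::size_t, const std::nothrow_t&) noexcept;
void* nothrow_new_sketch(std::size_t size) noexcept {
    try {
        return ::operator new(size);   // the regular, throwing form
    } catch (const std::bad_alloc&) {
        return nullptr;                // report failure without throwing
    }
}
```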

Because the non-throwing version of operator new simply calls through to the regular version of new, it is safe to use delete to destroy classes created in this fashion. This brings up a very key distinction that may be easy to miss: there’s nothing magical about the placement versions of operator new, they simply change the way you need to think about allocation and deallocation. That is all these operators are designed to do — so if your class’ placement new function allocates memory, you can define an operator delete method on that class to handle deallocation as well. Just remember that you still need to have the placement delete in case construction fails!

Here’s a contrived example (of something you should never, ever do, I might add!) that allows us to allocate more space than the class requires so that we can “lump together” the class and some of its contained data. Please note that I don’t condone using this for production use (I’ve not exactly thought this code all the way through).
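What follows is a sketch of what such a class could look like, reconstructed from the description in the next paragraphs. The member names, the `text()` accessor, and the `lumped_demo` function are assumptions, so treat it as an illustration rather than a definitive listing.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

class Test {
public:
    // The caller must have reserved strlen(text) + 1 extra bytes via placement new
    explicit Test(const char* text)
        : data(reinterpret_cast<char*>(this) + sizeof(Test)) {
        std::strcpy(data, text);
    }

    // Non-placement new is removed so the extra space is always requested
    static void* operator new(std::size_t size) = delete;

    // Placement new: grow the request by `extra` bytes
    static void* operator new(std::size_t size, unsigned short extra) {
        return ::operator new(size + extra);
    }

    // Placement delete, used only if the constructor throws
    static void operator delete(void* p, unsigned short) noexcept {
        ::operator delete(p);
    }

    // Class-level delete that calls through to the global one
    static void operator delete(void* p) noexcept {
        ::operator delete(p);
    }

    const char* text() const { return data; }

private:
    char* data;   // points just past the end of this object
};

bool lumped_demo() {
    const char* str = "hello";
    unsigned short extra = static_cast<unsigned short>(std::strlen(str) + 1);
    Test* t = new (extra) Test(str);

    bool adjacent = t->text() == reinterpret_cast<const char*>(t) + sizeof(Test);
    bool intact = std::strcmp(t->text(), "hello") == 0;

    delete t;
    return adjacent && intact;
}
```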

If you run this example code, you’ll notice that the data pointer truly does follow after the class pointer, and the data is valid. It works by calling the placement new operator defined within the Test class, which increases the size of the requested allocation to account for our extra data and then calls through to the global operator new function. In the constructor, we skip to the end of our class’s space in memory and copy in our string data directly following the class. We can write our constructor like that safely because we know we have removed the ability to call the non-placement new! (Note: if your compiler does not support deleted methods, you can simply remove the “= delete” and not define the method body.) When the user has finished with their class instance, they delete it, which runs the destructor and then calls our operator delete method, which calls through to the global operator delete to free the memory.

There are a few interesting details here though. First, since we know the constructor does not throw any exceptions, why do we define a placement delete? The specification does not demand that a matching placement deallocation function be present, but it is still good form: whenever you have a custom allocator, writing a custom deallocator to match it is just good coding practice. Some compilers even enforce this, such as VC++ (at least as of Visual Studio 2010). The second question is why we define a class-level operator delete that simply calls through to the global one. Again, it seems that some compilers simply require this (likely because it is good practice). However, according to Section 12.5 Clause 4, if a class-level operator delete is not found, the lookup should search the class hierarchy and end with the globally defined operator delete. Yet in VC++ (2010), the compiler claims the delete operator cannot be reached without defining it on the class. The last interesting detail is in the signature of the placement new and placement delete functions themselves: why use unsigned short?

At first, I would have used std::size_t, because that is the result of strlen, and is the most appropriate type to use. However, it would cause a problem with the placement delete function, because operator delete actually has two standard forms:
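The two forms are shown in the comments of the sketch below. To demonstrate that the two-parameter form really is treated as a usual deallocation function, the hypothetical class `Sized` declares only that form and records the size it receives; the class and function names are made up for this example.

```cpp
#include <cstddef>
#include <new>

// The two standard forms:
//   void operator delete(void* ptr);
//   void operator delete(void* ptr, std::size_t size);

struct Sized {
    // With no one-parameter member form declared, this two-parameter member
    // is the usual deallocation function for the class.
    static void operator delete(void* ptr, std::size_t size) noexcept {
        last_size = size;
        ::operator delete(ptr);
    }

    static std::size_t last_size;
    double payload[4];
};

std::size_t Sized::last_size = 0;

std::size_t usual_delete_demo() {
    Sized* s = new Sized;
    delete s;                 // a plain delete-expression reaches the sized form
    return Sized::last_size;
}
```

Because a plain `delete` can already reach a `(void*, std::size_t)` signature, a placement delete with that same signature cannot be told apart from it.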

Either one of those is considered to be a “usual” deallocation function, according to Section 3.7.4.2 Clause 2. This creates an ambiguity with the placement version of operator delete: is it acting as a placement version, or as a usual deallocation function? This exact scenario is called out in Section 5.3.4 Clause 20, and if we were to do it, the program would be ill-formed. However, VC++ does allow you to do this, and will give you proper behavior.

Earlier, I asked you why you would ever want to use the placement new operator. There are three answers to that question: non-throwing allocation, discriminated unions, and arenas.

We’ve already discussed the non-throwing version of new, but let’s talk a bit about discriminated unions and arenas.

Let’s say we have the following union declaration:

union {
    int i;
    float f;
    std::string s;
} u;

Since we cannot call constructors directly, the only way to properly initialize the string member of that union is by using the placement new operator. The union has already allocated the space for the class, and so we can simply use that with the default placement new. When we are done, we don’t need to free any memory (the union owns that), but we do need to call the string’s destructor. So our usage would look like this:
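Here is a minimal sketch of that usage. It names the union `Value` and gives it an explicit constructor and destructor, because a union with a non-trivial member such as std::string otherwise has its default constructor and destructor implicitly deleted; those additions and the function `union_demo` are assumptions made for this example.

```cpp
#include <new>
#include <string>

union Value {
    int i;
    float f;
    std::string s;

    Value() : i(0) {}   // start with the int member active
    ~Value() {}         // the union cannot know which member to destroy
};

std::string union_demo() {
    Value u;

    // Construct the string inside the storage the union already owns
    new (&u.s) std::string("hello");
    std::string copy = u.s;

    // Destroy it explicitly; there is no memory for us to free
    u.s.~basic_string();

    return copy;
}
```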

Effectively, this is the only way to call non-trivial constructors and destructors with discriminated unions.

Arenas are the last interesting reason why you might want to use placement operators. Let’s say that you have an application which communicates via shared memory. You can use a placement new operator to pick whether a class is allocated into the shared memory region, or the private memory region of the application. These memory regions are generally called “arenas” or “zones.” As a programmer, you can create custom allocator classes which allow you to allocate into different arenas, and the caller can determine whether they wish to use placement syntax to use one arena over another.
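A small sketch of the arena idea follows. The `Arena` class here is a hypothetical bump allocator over a fixed local buffer, standing in for a real shared-memory region that would be mapped from the operating system; `Message` and `arena_demo` are likewise invented for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <new>

// Hypothetical bump allocator over a fixed buffer
class Arena {
public:
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > sizeof(buffer_)) throw std::bad_alloc();
        used_ = start + size;
        return buffer_ + start;
    }

    bool owns(const void* p) const {
        const unsigned char* c = static_cast<const unsigned char*>(p);
        std::less<const unsigned char*> before;   // well-defined pointer ordering
        return !before(c, buffer_) && before(c, buffer_ + sizeof(buffer_));
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[1024];
    std::size_t used_ = 0;
};

// Placement forms that take the arena as the extra argument
void* operator new(std::size_t size, Arena& arena) { return arena.allocate(size); }
void operator delete(void*, Arena&) noexcept {}   // nothing to release individually

struct Message {
    int id;
    explicit Message(int i) : id(i) {}
};

bool arena_demo() {
    Arena shared;                                   // stands in for a shared region
    Message* in_arena = new (shared) Message(1);    // caller picks the arena
    Message* on_heap = new Message(2);              // or the default allocator

    bool ok = shared.owns(in_arena) && !shared.owns(on_heap);

    in_arena->~Message();   // arena memory is reclaimed with the arena itself
    delete on_heap;
    return ok;
}
```

The choice between arenas is made at each call site, which is exactly the “sometimes, but not other times” case that class-level operator new cannot express.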

The placement new operator is not something you are likely to run across frequently (except for, possibly, the std::nothrow_t version). However, it’s one of those areas of C++ that is helpful to know exists. It’s not a general purpose tool within the language, but when you need the level of control it provides, it can be invaluable.

13 Responses to The Placement New Operator

Placement new is one of those things that I find to be used more in the embedded world than outside of it, even though it’s a very useful tool to know about (as you’ve pointed out).

(Aside: Many desktop developers I’ve worked with have used virtual inheritance, but haven’t ever used – or even heard of – placement new. I’m exactly the opposite – I know what virtual inheritance is, I know what problem it solves, I know a little about its workings… but I’ve never used it.)

Anyway, regarding placement new, 2 of the main uses for it in my own work are (1) modeling hardware devices and (2) control over static (global) object construction.

For modeling hardware, placement new allows you to construct objects at specific addresses, including memory-mapped hardware regions. For example, a UART might consist of six 32-bit registers located at a specific address range. Obviously you don’t want to allocate any memory – the address range isn’t memory, it’s hardware registers – but you want to “overlay” an object over those hardware registers.

Suppose UART is a class whose layout matches the hardware registers. One of the benefits of placement new over simply using a cast like:

UART *uart = reinterpret_cast<UART *>(0x02FF0000);

is that placement new will invoke the object’s constructor. This can be useful because sometimes you want to reset the hardware to a known state (clear the receive FIFO, set baud rate, etc.) There is a lot more to say on the topic but that’s use case #1 for me.
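As a rough sketch of the overlay technique, the code below constructs a UART object at a caller-supplied address. The register names, the reset values, and the `attach_uart` helper are all invented for illustration; on real hardware the layout and the constructor body would come from the device's datasheet.

```cpp
#include <cstdint>
#include <new>

// Hypothetical register layout: six 32-bit registers, names invented here
struct UART {
    volatile std::uint32_t data;
    volatile std::uint32_t status;
    volatile std::uint32_t control;
    volatile std::uint32_t baud;
    volatile std::uint32_t fifo;
    volatile std::uint32_t interrupt_mask;

    UART() {
        // Put the device in a known state; the values are placeholders.
        // Registers not written here are left untouched.
        control = 0;
        fifo = 1;          // pretend bit 0 clears the receive FIFO
        baud = 115200;
    }
};

static_assert(sizeof(UART) == 6 * sizeof(std::uint32_t),
              "layout must match the register block");

// On hardware, `where` would be the register base address, for example
// reinterpret_cast<void*>(0x02FF0000). Nothing is allocated either way.
UART* attach_uart(void* where) {
    return new (where) UART;
}
```

On a desktop machine the same function can be exercised against an ordinary aligned buffer, which is handy for testing the constructor logic away from the device.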

The 2nd usage, which I sometimes have to use with safety-critical & life-critical systems, is having explicit control over object initialization. Ignoring for now the “globals are evil” mantra, which I generally agree with, sometimes you have objects that live neither on stack, nor on the heap. They have static storage duration, they are defined outside of any function. These are objects which are constructed before the program begins, before you enter main(). Most people never consider the loading / initialization process, everything that happens before main(), but it’s important.

One of the things that happens before main() is that objects of static storage duration are created. This means running all the objects’ constructors. Generally that’s what you want. (“WAIT!” you say… “what do you mean, ‘generally’?!”) In some of the systems I work on, certain faults / failures / exceptions (hardware exceptions, not C++ exceptions) result in a system shutdown / recovery process. This usually involves some sort of error logging, restoring hardware to a safe state (perhaps stopping a motor, disabling a high-voltage operation, etc.) and then re-initializing. But here’s the kicker: the “re-initialization” is often different than the cold-boot initialization. RAM contents in the device are intact (after all, we didn’t lose power). Maybe there is some data in an object that tells us why we failed, and gives us guidance on what to do as we’re coming back. Or maybe there is some data that was in the process of being serialized and written out to a non-volatile memory device when some other task (thread) did a “divide by zero” and necessitated a clean start.

So in these cases, we DON’T want the code that runs before main() (usually called the C/C++ runtime startup code) to unconditionally run all static object constructors – we would overwrite/lose potentially valuable data lingering in some objects. This implies of course that the pre-main() startup code is non-trivial, very intelligent, probably complicated and definitely very important. But when it’s an implanted cardiac device, a guided missile system or a jumbo jet, it’s important.

Anyway, just thought I’d once again weigh in from the “embedded” side.

Keep up the writing. I don’t know how you manage to keep the pace, you obviously have a passion for it.

@Dan: that’s a great point about firing constructors instead of simply typecasting to a memory location, good idea! As for the pre-startup code, that’s a really interesting point. I suppose when you control the hardware, you have a bit more stability with regards to memory addresses. You can’t really get away with that on non-embedded systems due to things like address space layout randomization and process partitioning. But you likely don’t have to worry about that sort of stuff on a nuclear warhead.

But how do you reliably ensure your code runs pre-main? According to spec, anything that happens before main is implementation defined. I suppose you figure out whatever your particular compiler and libc are doing, and plug yourself in?

Good question: “But how do you reliably ensure your code runs pre-main? According to spec, anything that happens before main is implementation defined. I suppose you figure out whatever your particular compiler and libc are doing, and plug yourself in?”

That’s exactly it. It’s one of the nice things I like about embedded work, at least deeply-embedded work – you have complete control over everything. The linker script determines where things are located, and the startup code performs all the initialization. Usually there isn’t an MMU, so you have to fix at link-time where code & data reside. (Let’s just forget about relocatable code for now).

For startup code, toolset vendors always give you something to start with, and many people just use the default startup code (“initialize everything the way I expect”), but sometimes you need more control. An example of such a file is here:

Look near the bottom of the file (before the branch to main) – the section titled “copy .data section (Copy from ROM to RAM)” is how static-duration variables like

int fred = 5;

get their values before entering main.

And the code commented “Clear .bss section (Zero init)” is how uninitialized static-duration variables like

int barney;

are cleared to 0 before entering main.

So, in some of the products I work on, these 2 steps are not unconditionally executed as they are in the example assembly file — there is some work to be done right before these 2 steps, before the memory gets overwritten / initialized.

The above startup code is for C, not C++ — with C++ there is additional code to run the constructors and it’s located in the same area, I just can’t find a good example right now. But hopefully you get the gist.

Another approach I’ve seen – the 2 sections described are just commented out in crt.s – they’re never run, and there is no additional logic in the startup file. What that means is that upon entry to main, “fred” doesn’t (necessarily) hold “5”, and “barney” doesn’t (necessarily) hold 0. The C code is responsible for explicitly initializing all variables when it’s ready to do so, e.g.:
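A minimal sketch of that kind of explicit initialization might look like the following. The variable names come from the discussion above; the function name `init_variables` is made up, and the snippet is written so that it compiles as either C or C++.

```cpp
// With the .data copy and .bss clear removed from crt.s, neither variable
// holds a defined value on entry to main.
int fred = 5;   // the initializer is not applied by the startup code
int barney;     // not zeroed by the startup code

// Hypothetical routine the application calls once it is ready
void init_variables() {
    fred = 5;
    barney = 0;
}
```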

Obviously this is slower & bigger code than the tiny assembly “block copy” loop in crt.s – imagine if you have 1000 variables, it doesn’t scale well! Whereas the assembly code stays the same size, it just copies (or zeroes) a bigger region.

I’m quite familiar with that sort of startup code, as I’ve had to write a bit of it in the past as a compiler writer. The Mach-O linker that I helped write needed plenty of startup code because each application effectively has its own loader embedded in it. So I’ve done my fair share of puttering around in crt. ;-)

But that’s interesting that you effectively have control over crt at the source level as a matter of course. I figured even in the embedded world you had static libraries, and that the crt (at least parts of it) would be available to you. But now I’m guessing that you’ve got nothing but two rocks to bang together, so you get to roll your own standard library functions.

@Justin: the default global versions of placement new and delete may not be replaced (see [new.delete.placement]p1). However, a user can specify their own global version of placement new with a different signature, or override the global allocator with a class-specific allocator.

@aaron Maybe you should add the article below as a prerequisite to this blog post. Placement new is rarely used; I only recently came to know about it, though I still don’t need it. The article below helped me understand yours, which I could hardly follow without it.

promises that it won’t throw any exception, but if “::operator new( size );” throws an exception, wouldn’t “unexpected()” be called in that case, rather than the exception being handled by the “catch (std::bad_alloc)” block?

@Anonymous — that’s not a lame question at all. The non-throwing exception specification (throw() or noexcept) is a promise to callers that an exception won’t be thrown “out of” the function. If an exception is thrown inside the function and caught within the function, everything is fine. Check out [except.spec]p9 in the C++ Standard for the exact wording that allows it.

Who

Aaron Ballman is a software engineer for GrammaTech. He has almost two decades of experience writing cross-platform frameworks in C/C++, compiler & language design, and software engineering best practices and is currently a voting member of the C (WG14) and C++ (WG21) standards committees.

In case you can't figure it out easily enough, the views expressed here are my personal views and not the views of my employer, my past employers, my future employers, or some random person on the street. Please yell only at me if you disagree with what you read.