Forcing compiletime initialization of variables in C++ using constexpr

Every now and then I work on some complex C++ code (mostly stuff running
on Arduino nowadays) so I can write up some code in a nice, concise
and abstracted manner. This almost always involves classes, constructors
and templates, which serve their purpose in the abstraction, but once
you actually call them, the compiler should optimize all of them away as
much as possible.

This usually works nicely, but there was one thing that kept bugging me.
No matter how simple your constructors are, initializing using
constructors always results in some code running at runtime.

In contrast, when you initialize a normal integer variable, or a struct
variable using aggregate initialization, the compiler can do the
initialization completely at compiletime. For example, a small
aggregate-initialized struct can end up as just four bytes (0x12, 0x00,
0x34, 0x56, assuming no padding and big-endian) in the data section of
the resulting object file. This data section is loaded into memory using
a simple loop, which is about as efficient as things get.
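
A minimal sketch of the kind of initialization meant here. The struct layout and the names Foo and x are assumptions, picked so that the initial value matches the four bytes mentioned above:

```cpp
#include <cstdint>

// Hypothetical struct: the member names and types are assumptions chosen
// so that the initial value occupies the four bytes mentioned in the text.
struct Foo {
    uint8_t a;
    bool b;
    uint16_t c;
};

// Aggregate initialization: no constructor is involved, so the compiler
// can place the initial value directly in the data section.
Foo x = {0x12, false, 0x3456};
```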

If the same struct is initialized through a constructor instead, those
four bytes are allocated in the bss section (which is zero-initialized),
with the constructor code being executed at startup. The actual call to
the constructor is inlined of course, but this still means there is code
that loads every byte into a register, loads the address in a register,
and stores the byte to memory (assuming an 8-bit architecture, other
architectures will do more bytes at a time).
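
For comparison, a sketch of the constructor-based variant, using the same hypothetical Foo layout (an assumption, not necessarily the exact code I tested). Whether the startup code really shows up depends on the compiler version and optimization settings:

```cpp
#include <cstdint>

// Same hypothetical layout as before, but now with a (non-constexpr)
// constructor, which means Foo is no longer an aggregate.
struct Foo {
    uint8_t a;
    bool b;
    uint16_t c;

    Foo(uint8_t a, bool b, uint16_t c) : a(a), b(b), c(c) {}
};

// Initialized through the constructor: the compiler is allowed to run
// this at startup instead of emitting the bytes into the data section.
Foo x = Foo(0x12, false, 0x3456);
```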

This doesn't matter much if it's just a few bytes, but for larger
objects, or multiple small objects, having the loading code intermixed
with the data like this easily requires 3 to 4 times as much code as
having it loaded from the data section. I don't think CPU time will be
much different (though first zeroing memory and then loading actual data
is probably slower), but on embedded systems like Arduino, code size is
often limited, so not having the compiler just resolve this at
compiletime has always frustrated me.

Constant Initialization

Today I learned about a new feature in C++11: Constant
initialization. This means that any global variables that are
initialized to a constant expression will be resolved at compiletime
and initialized before any (user) code (including constructors) starts
to actually run.

A constant expression is essentially an expression that the compiler can
guarantee can be evaluated at compiletime. They are required for e.g. array
sizes and non-type template parameters. Originally, constant expressions
included just simple (arithmetic) expressions, but since C++11 you can
also use functions and even constructors as part of a constant
expression. For this, you mark a function using the constexpr keyword,
which essentially means that if all parameters to the function are
compiletime constants, the result of the function will also be
(additionally, there are some limitations on what a constexpr function
can do).
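
A small sketch of what this looks like in practice. The names square, buffer and Tag are made up for illustration:

```cpp
// Hypothetical helper. In C++11 the body of a constexpr function is
// essentially limited to a single return statement.
constexpr int square(int x) { return x * x; }

// An array size must be a constant expression; square(4) qualifies
// because its argument is a compiletime constant.
int buffer[square(4)];

// The same goes for a non-type template parameter.
template <int N>
struct Tag {
    static constexpr int value = N;
};

static_assert(Tag<square(3)>::value == 9, "square(3) is evaluated at compiletime");
```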

So essentially, this means that if you add constexpr to all
constructors and functions involved in the initialization of a variable,
the compiler will evaluate them all at compiletime.
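
Applied to a class with a constructor, that could look like the sketch below (Foo and its members are again hypothetical):

```cpp
#include <cstdint>

struct Foo {
    uint8_t a;
    bool b;
    uint16_t c;

    // constexpr allows this constructor to be used in constant expressions.
    constexpr Foo(uint8_t a, bool b, uint16_t c) : a(a), b(b), c(c) {}
};

// x is not const, but its initializer is a constant expression, so it
// qualifies for constant initialization.
Foo x = Foo(0x12, false, 0x3456);
```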

(On a related note - I'm not sure why the compiler doesn't deduce
constexpr automatically. If it can verify if it's allowed to use
constexpr, why not add it? Might be too resource-intensive perhaps?)

Note that constant initialization does not mean the variable has to be
declared const (i.e. immutable) - it's just that the initial value
has to be a constant expression. These are really different concepts:
it's perfectly possible for a const variable to have a non-constant
expression as its value. In that case the value is set by normal
constructor calls or whatnot at runtime, possibly with side-effects,
without allowing any further changes to the value after that.
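
To illustrate the difference, a minimal sketch (read_setting, counter and setting are hypothetical names):

```cpp
#include <cstdlib>

// Hypothetical function that can only be evaluated at runtime.
int read_setting() { return std::rand() % 10; }

// Not const, but constant-initialized: the initializer is a constant
// expression, and the value can still be changed later.
int counter = 6 * 7;

// const, but not constant-initialized: the value is computed by a normal
// function call during startup and cannot be changed afterwards.
const int setting = read_setting();
```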

Enforcing constant initialization?

Anyway, so much for the introduction of this post, which turned out
longer than I planned :-). I learned about this feature from this great
post by Andrzej Krzemieński. He also writes that it is not really
possible to enforce that a variable is constant-initialized:

It is difficult to assert that the initialization of globals really
took place at compile-time. You can inspect the binary, but it only
gives you the guarantee for this binary and is not a guarantee for the
program, in case you target for multiple platforms, or use various
compilation modes (like debug and retail). The compiler may not help
you with that. There is no way (no syntax) to require a verification
by the compiler that a given global is const-initialized.

If you accidentally forget constexpr on one function involved, or some
other requirement is not fulfilled, the compiler will happily fall back
to less efficient runtime initialization instead of notifying you so you
can fix this.

This smelled like a challenge, so I set out to investigate if I could
figure out some way to implement this anyway. I thought of using a
non-type template argument (which are required to be constant
expressions by C++), but those only allow a limited set of types to be
passed. I tried using __builtin_constant_p, a non-standard gcc
construct, but that doesn't seem to recognize class-typed constant
expressions.

Using static_assert

It seems that using the (also introduced in C++11) static_assert
statement is a reasonable (though not perfect) option. The first
argument to static_assert is a boolean that must be a constant
expression. So, if we pass it an expression that is not a constant
expression, it triggers an error.

For testing, we define a Foo class, which has two constructors: one
accepts an int and is constexpr and one accepts a long and is not
constexpr. Two global variables use them: a is initialized through the
int constructor, so it will be const-initialized, while b is
initialized through the long constructor, so it will not be.
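
A sketch of such a test setup. The member x is an assumption; the description above only fixes the two constructors and the variables a and b:

```cpp
struct Foo {
    long x;  // hypothetical member, just so the constructors store something

    constexpr Foo(int x) : x(x) {}  // usable in constant expressions
    Foo(long x) : x(x) {}           // not constexpr: runs at runtime
};

Foo a = Foo(1);   // int literal selects the constexpr constructor
Foo b = Foo(1L);  // long literal selects the non-constexpr constructor
```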

To use static_assert, we cannot just pass a or b as the condition,
since the condition must have a bool type. Using the comma operator
helps here (the comma accepts two operands, evaluates both and then
discards the first to return the second).
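
Roughly, the comma-operator attempt looks like this (same hypothetical Foo as in the test setup). Presumably both asserts are accepted because the left operand only names the variable and never reads its value, though I have not verified that reasoning against the standard:

```cpp
struct Foo {
    long x;  // hypothetical member

    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

Foo a = Foo(1);
Foo b = Foo(1L);

// The comma operator discards a (or b) and yields true, so the condition
// has type bool. Compilers may warn about the unused left operand.
static_assert((a, true), "a not const-initialized");
static_assert((b, true), "b not const-initialized");
```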

However, this doesn't quite work: neither a nor b results in an error
when used like this. I was actually surprised here - I would have
expected them both to fail, since neither a nor b is a constant
expression. What we can do instead is simply copy the initializer used
for each variable into the static_assert.
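
With the initializers copied in, a sketch could look like this (again using the hypothetical Foo). The failing assert is commented out so the snippet compiles:

```cpp
struct Foo {
    long x;  // hypothetical member

    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

Foo a = Foo(1);
Foo b = Foo(1L);

// Foo(1) uses the constexpr constructor, so this is a constant expression.
static_assert((Foo(1), true), "a not const-initialized");

// Foo(1L) uses the non-constexpr constructor, so enabling this line
// should be rejected by the compiler:
// static_assert((Foo(1L), true), "b not const-initialized");
```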

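Wrapping the initializer in a small constexpr helper function,
ensure_const_init, which takes its argument by value and simply returns
true, achieves the same result but looks nicer (though the
ensure_const_init function does not actually enforce anything, it's
the context in which it's used, but that's a matter of documentation).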
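
A minimal sketch of such a helper. The exact signature is an assumption; all that matters is that it is constexpr and takes its argument by value:

```cpp
struct Foo {
    long x;  // hypothetical member

    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

// Always returns true. The check comes from static_assert, which demands
// that the whole call, including the argument, is a constant expression.
template <typename T>
constexpr bool ensure_const_init(T) { return true; }

Foo a = Foo(1);
static_assert(ensure_const_init(Foo(1)), "a not const-initialized");

Foo b = Foo(1L);
// Should be rejected, since Foo(long) is not constexpr:
// static_assert(ensure_const_init(Foo(1L)), "b not const-initialized");
```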

Note that I'm not sure if this will actually catch all cases. I'm not
entirely sure if the stuff involved with passing an expression to
static_assert (optionally through the ensure_const_init function) is
exactly the same stuff that's involved with initializing a variable with
that expression (e.g. similar to the copy constructor issue below).

The function itself isn't perfect either - it doesn't handle (const)
(rvalue) references, so I believe it might not work in all cases and
might need some fixing.

Also, having to duplicate the initializer in the assert statement is a
big downside - if I now change the variable initializer, but forget to
update the assert statement, all bets are off...

Using constexpr constant

As Andrzej pointed out in his post, you can mark variables with
constexpr, which requires them to be constant initialized. However,
this also makes the variable const, meaning it cannot be changed after
initialization, which we do not want. We can still leverage this, though,
using a two-step initialization: a constexpr variable holds the
initializer, and the actual variable is then copy-initialized from it.
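
A sketch of the two-step approach, using the hypothetical Foo from before (the variable names c, c_init and d_init are assumptions):

```cpp
struct Foo {
    long x;  // hypothetical member

    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

// Step 1: a constexpr variable, which the compiler requires to be
// constant-initialized.
constexpr Foo c_init = Foo(1);
// Step 2: the actual, modifiable variable is copied from it.
Foo c = c_init;

// With the non-constexpr constructor, step 1 should already be rejected:
// constexpr Foo d_init = Foo(1L);
```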

This isn't very pretty either, but at least the initializer is only
defined once. This does introduce an extra copy of the object. With
the default (implicit) copy constructor this copy will be optimized out
and constant initialization still happens as expected, so no problem
there.

Things are different when a user-defined copy constructor is present
that is not declared with constexpr. A variable e copied from a
constexpr e_init is then not constant-initialized, even though e_init
is (this is actually slightly weird - I would expect the initialization
syntax I used to also call the copy constructor when initializing
e_init, but perhaps that one is optimized out by gcc in an even
earlier stage).
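
To make the copy constructor problem concrete, a sketch with a hypothetical class Foo2 (the class name and member are assumptions):

```cpp
struct Foo2 {
    int x;  // hypothetical member

    constexpr Foo2(int x) : x(x) {}
    // User-defined copy constructor that is NOT constexpr.
    Foo2(const Foo2& other) : x(other.x) {}
};

// Accepted: e_init itself is constant-initialized.
constexpr Foo2 e_init = Foo2(1);
// Also accepted without any diagnostic, but this copy goes through the
// non-constexpr copy constructor, so e is not constant-initialized.
Foo2 e = e_init;
```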

Combining this with the static_assert approach, by also passing init
variables such as f_init and g_init to ensure_const_init, is actually
a bit silly - of course f_init and g_init are const-initialized, they
are declared constexpr. I initially tried this separate init variable
approach before I realized I could (need to, actually) add constexpr
to the init variables. However, this silly code does catch our problem
with the copy constructor. This is just a side effect of the fact that
the copy constructor is called when the init variables are passed to
the ensure_const_init function.
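
A sketch of that combination. Foo, Foo2 and the signature of ensure_const_init are the same hypothetical definitions as before:

```cpp
struct Foo {
    long x;
    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

struct Foo2 {
    int x;
    constexpr Foo2(int x) : x(x) {}
    Foo2(const Foo2& other) : x(other.x) {}  // not constexpr
};

template <typename T>
constexpr bool ensure_const_init(T) { return true; }

constexpr Foo f_init = Foo(1);
Foo f = f_init;
// Passing f_init by value copies it with the implicit (constexpr) copy
// constructor, so this is accepted.
static_assert(ensure_const_init(f_init), "f not const-initialized");

constexpr Foo2 g_init = Foo2(1);
Foo2 g = g_init;
// Here the by-value copy needs the non-constexpr copy constructor, so
// enabling this line should be rejected:
// static_assert(ensure_const_init(g_init), "g not const-initialized");
```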

Using two variables

One variant of the above would be to simply define two objects: the one
you want, and an identical constexpr version:

Foo h = Foo(1);
constexpr Foo h_const = Foo(1);

It should be reasonable to assume that if h_const can be
const-initialized, and h uses the same constructor and arguments, that
h will be const-initialized as well (though again, no real guarantee).

This assumes that the h_const object, being unused, will be optimized
away. Since it is constexpr, we can also be sure that there are no
constructor side effects that will linger, so at worst this wastes a bit
of memory if the compiler does not optimize it.

Again, this requires duplication of the constructor arguments, which can
be error-prone.

Summary

There are two significant problems left:

None of these approaches actually guarantee that
const-initialization happens. It seems they catch the most common
problem: Having a non-constexpr function or constructor involved,
but inside the C++ minefield that is (copy) constructors, implicit
conversions, half a dozen of initialization methods, etc., I'm
pretty confident that there are other caveats we're missing here.

None of these approaches are very pretty. Ideally, you'd just write
something like:

constinit Foo f = Foo(1);

or, slightly worse:

Foo f = constinit(Foo(1));

Implementing the second syntax seems to be impossible using a function -
function parameters cannot be used in a constant expression (they could
be non-const). You can't mark parameters as constexpr either.
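
A sketch of why a plain function does not help here. The name const_init is hypothetical, and Foo is the test class from before:

```cpp
struct Foo {
    long x;
    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

// Inside the function, value is an ordinary parameter: it cannot be used
// where a constant expression is required (e.g. in a static_assert), so
// the function has no way to reject non-constant arguments.
template <typename T>
constexpr T const_init(T value) {
    return value;
}

Foo f = const_init(Foo(1));   // can be constant-initialized
Foo g = const_init(Foo(1L));  // compiles just as well, but runs at runtime
```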

I considered using a preprocessor macro to implement this. A macro
can easily take care of duplicating the initialization value (and since
we're enforcing constant initialization, there are no side effects to
worry about). It's tricky, though, since you can't just put a
static_assert statement, or an additional constexpr variable
declaration, inside a variable initializer. I considered using a
C++11 lambda expression for that, but those only get their return type
deduced when the body is a single return statement (otherwise an
explicit return type is needed) and, more importantly, they cannot be
declared constexpr...

Perhaps a macro that completely generates the variable declaration and
initialization could work, but a single macro that generates multiple
statements is still messy (and the usual do {...} while(0) approach
doesn't work in global scope). It's also not very nice...
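
For what it's worth, a rough sketch of such a macro. The name CONST_INIT and the _init suffix are made up, and this inherits the copy constructor caveat from the two-step approach:

```cpp
struct Foo {
    long x;
    constexpr Foo(int x) : x(x) {}
    Foo(long x) : x(x) {}
};

// Hypothetical macro: declares a constexpr helper variable <name>_init
// (which must be constant-initialized) and then the real variable as a
// copy of it. The initializer is only written once.
#define CONST_INIT(type, name, ...)            \
    constexpr type name##_init = __VA_ARGS__;  \
    type name = name##_init

CONST_INIT(Foo, i, Foo(1));

// Should be rejected, since Foo(long) is not constexpr:
// CONST_INIT(Foo, j, Foo(1L));
```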