
2012-06-03

C++11 tuple implementation details (Part 1)

In his "Variadic Templates are Funadic" talk at GoingNative 2012 Andrei Alexandrescu outlines a "recursive" implementation of std::tuple. He also notes that most implementations arrange tuple members in memory in the "wrong" order, that is, std::tuple<int, double, std::string> will have a memory layout where the std::string comes first and the int comes last. The straightforward recursive implementation he presents also has this property.

I looked at the implementation of std::tuple in libc++: it has the "sane" memory layout, and it is also non-recursive, which, according to Howard Hinnant (the primary author of libc++'s tuple implementation), has other benefits too, such as faster compile times.

In this post I'll try to explain the implementation details of a flat (non-recursive) tuple. The implementation presented here is based on the libc++ version — in fact, it's mostly copy-waste, I just simplified some of the hairy bits and omitted some parts, like the allocator related constructors, exception specifications, tie and tuple_cat. I'll return to them in a follow-up post. Some parts (like our internal TupleSize and TupleElement) are a bit more complicated than the ones in the standard library, because we can't just assume/dictate implementation details in the std namespace and we need compatibility with the standard tuple-like classes.

Notes on namespaces

I'm planning to use this Tuple implementation in a project (a simple OpenGL game on Android), so the Tuple class is in namespace migl2, the namespace used in this project. All implementation details are in namespace migl2::detail, and some template specializations are in (or lifted into) namespace std. I'm using namespace std inside namespace migl2::detail, so std::forward, std::enable_if and others will be unqualified there. The overall layout is something like this:
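A minimal sketch of that layout, with placeholder forward declarations standing in for the real contents (the exact declarations are an assumption, only the namespace structure follows the description above):

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace migl2 {

// The public class template lives directly in the project namespace.
template <typename... Types_>
class Tuple;

namespace detail {
    using namespace std;  // forward, enable_if, size_t etc. can be used unqualified
    using std::swap;      // needed in addition to the using-directive

    // Implementation details (TupleLeaf, TupleImpl, ...) go here.
    template <size_t I_, typename ValueType_>
    class TupleLeaf;
}  // namespace detail

}  // namespace migl2

namespace std {
    // Specializations of tuple_size and tuple_element for migl2::Tuple go here.
}
```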

A slightly odd-looking thing here is the using std::swap; line right after using namespace std. At first glance it seems superfluous, but it is in fact necessary. We'll use unqualified calls to swap() to enable Argument Dependent Lookup (ADL). ADL alone works for user-defined types that provide their own swap overload, but not for primitive types (which have no associated namespace) or for user-defined types that rely on the generic std::swap(). The reason is that we define our own swap() for TupleLeaf in the detail namespace: ordinary unqualified lookup finds that one first and stops there, so the std::swap made visible by the using-directive is never considered. Without the "extra" using std::swap, which declares std::swap directly in the detail namespace, the compiler would try to use our TupleLeaf swap() for all types that don't provide their own overload.
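The effect can be reproduced in isolation. The following is a small sketch, not part of the Tuple code: the namespace demo, the struct Leaf and the function swapInts are hypothetical stand-ins for migl2::detail, TupleLeaf and any code that swaps primitive values.

```cpp
#include <cassert>
#include <utility>

namespace demo {  // hypothetical stand-in for migl2::detail
    using namespace std;
    using std::swap;  // without this line the swap() calls below fail to compile

    struct Leaf { int value; };

    // Our own overload, analogous to the swap() for TupleLeaf.
    inline void swap(Leaf& a, Leaf& b) { swap(a.value, b.value); }

    inline void swapInts(int& a, int& b) {
        // int has no associated namespace, so ADL contributes nothing here;
        // std::swap is a candidate only because of the using-declaration.
        swap(a, b);
    }

    inline void example() {
        int a = 1, b = 2;
        swapInts(a, b);
        assert(a == 2 && b == 1);
    }
}  // namespace demo
```

With the using-declaration removed, the only swap found by ordinary lookup is demo::swap(Leaf&, Leaf&), which cannot accept int arguments.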

The non-recursive approach

The basic idea behind the non-recursive tuple implementation is that tuple elements are stored in TupleLeaf base classes, but whereas the recursive implementation uses a deep class hierarchy, we'll use multiple inheritance. In pseudo-code:
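A minimal sketch of the idea, hand-expanded for one concrete tuple. The struct name IntDoubleString and the accessor first() are made up for illustration, and the leaf is reduced to a bare data member:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Each element lives in its own leaf. The index makes every base class
// distinct, even when two elements have the same type.
template <std::size_t I_, typename ValueType_>
struct TupleLeaf {
    ValueType_ value;
};

// Hand-expanded equivalent of a flat tuple of (int, double, std::string):
// one level of multiple inheritance instead of a chain of nested bases.
struct IntDoubleString
    : TupleLeaf<0, int>,
      TupleLeaf<1, double>,
      TupleLeaf<2, std::string> {};

// Element access is a conversion to the right base class.
inline int& first(IntDoubleString& t) {
    return static_cast<TupleLeaf<0, int>&>(t).value;
}

static_assert(std::is_base_of<TupleLeaf<1, double>, IntDoubleString>::value,
              "every element is a direct base class");
```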

Here we declare our TupleLeaf template, and set the IsEmpty value to true when we can use Empty Base Class Optimization. A little bit of trickery is needed here: if the implementation supports final classes, then we must check if our ValueType_ is final or not. Since we can't subclass a final class, we cannot use EBCO in that case.
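A minimal sketch of such a check, assuming a C++14 (or later) standard library where std::is_final is available. The name UseEbco is hypothetical, and the real TupleLeaf may compute its IsEmpty flag differently:

```cpp
#include <type_traits>

// True only when ValueType_ is an empty class that we are allowed to derive from.
template <typename ValueType_>
struct UseEbco
    : std::integral_constant<bool,
          std::is_empty<ValueType_>::value && !std::is_final<ValueType_>::value> {};

struct Empty {};
struct FinalEmpty final {};
struct NonEmpty { int x; };

static_assert(UseEbco<Empty>::value, "empty, non-final class: EBCO applies");
static_assert(!UseEbco<FinalEmpty>::value, "final class: cannot be a base");
static_assert(!UseEbco<NonEmpty>::value, "has data: nothing to optimize");
```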

Lines 11-15:

Swap two TupleLeafs by swapping the contained elements. We'll use this from the swap member functions.

Lines 17-75:

The TupleLeaf implementation for the generic case (no Empty Base Class Optimization). We disable the normal copy-assignment (line 17), as we'll always use the forwarding copy-assignment (lines 53-57). The only interesting bit here is the return type of swap (lines 59-62). Normally it would be void, but we'll later call this swap member from our TupleImpl::swap with template-pack expansion, and to do that we need a function with a non-void return type. See the swallow function below.

Next, we specialize our TupleLeaf for the case when we can use the Empty Base Class Optimization, so as to not waste space for empty tuple members:

MakeTupleIndexes creates a type that encodes the indexes from Start_ to End_, for example: TupleIndexes<0, 1, 2>. It could be a bit simpler, but we'll extend the standard tuple with a constructor that takes fewer parameters than the actual number of tuple elements (and default-constructs the rest), and for that we'll need to be able to create TupleIndexes starting from an arbitrary index instead of 0. The same applies to MakeTupleTypes below. Both templates use a helper template to build their values incrementally. The end result is a typedef for the complete indexes and types.
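The incremental construction can be sketched as follows. This is a simplified version in the style of libc++, under two assumptions that the text does not spell out: the range is half-open (End_ itself is excluded), and End_ is the first template parameter so that Start_ can default to 0. The helper name MakeTupleIndexesImpl is made up.

```cpp
#include <cstddef>
#include <type_traits>

template <std::size_t... Indexes_>
struct TupleIndexes {};

// Helper: appends Start_ to the accumulated indexes until Start_ reaches End_.
template <std::size_t Start_, typename Accumulated_, std::size_t End_>
struct MakeTupleIndexesImpl;

// Recursive step: one more index is appended on every level.
template <std::size_t Start_, std::size_t... Indexes_, std::size_t End_>
struct MakeTupleIndexesImpl<Start_, TupleIndexes<Indexes_...>, End_> {
    typedef typename MakeTupleIndexesImpl<
        Start_ + 1, TupleIndexes<Indexes_..., Start_>, End_>::type type;
};

// Stop condition: Start_ has caught up with End_.
template <std::size_t End_, std::size_t... Indexes_>
struct MakeTupleIndexesImpl<End_, TupleIndexes<Indexes_...>, End_> {
    typedef TupleIndexes<Indexes_...> type;
};

template <std::size_t End_, std::size_t Start_ = 0>
struct MakeTupleIndexes {
    static_assert(Start_ <= End_, "MakeTupleIndexes: invalid range");
    typedef typename MakeTupleIndexesImpl<Start_, TupleIndexes<>, End_>::type type;
};

static_assert(std::is_same<MakeTupleIndexes<3>::type, TupleIndexes<0, 1, 2>>::value,
              "indexes 0, 1, 2");
```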

Note: Since the time of writing, C++14 has introduced std::index_sequence, because the concept proved useful in many situations. The sequence generation itself can also be optimized: instead of the linear recursion described above, one can use a logarithmic implementation, or the sequence generation can even be a compiler intrinsic. The tuple implementation could be updated to use the new index_sequence here.

Here we declare our special type-holder, TupleTypes and then we can define TupleSize for that using the sizeof... operator. TupleSize for our own Tuple uses TupleTypes. For all other types, we fall back on the standard tuple_size template, thus our internal TupleSize will work for std::tuple, std::pair and std::array, too. The cv-qualified types are forwarded to the implementation for the unqualified TupleTypes.

Similarly our internal TupleElement is defined for TupleTypes and Tuple, but falls back on the standard tuple_element for other types:

TupleElement is a classic recursive template. We chomp off the types from the beginning of our list as we decrement the index (I_ in the main template at lines 19-23). The stop condition is when we reach I_ == 0 (lines 13-17). If the remaining TupleTypes is empty before we reach our stop condition, we signal an error with a static_assert (lines 7-11). The end result, if all goes well, is a typedef (TupleElement::type) corresponding to the I_th element in our TupleTypes. The specialization for our final Tuple uses TupleTypes adjusted for const and volatile qualifiers.
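The recursion on TupleTypes can be sketched like this. It is a reduced version that only handles TupleTypes; the fallback to std::tuple_element and the cv-qualified Tuple specializations are left out:

```cpp
#include <cstddef>
#include <type_traits>

template <typename... Types_>
struct TupleTypes {};

template <std::size_t I_, typename Tuple_>
struct TupleElement;

// Ran out of types before the index reached 0: report an error.
template <std::size_t I_>
struct TupleElement<I_, TupleTypes<>> {
    static_assert(I_ != I_, "TupleElement index out of range");
};

// Stop condition: index 0 selects the head of the list.
template <typename Head_, typename... Tail_>
struct TupleElement<0, TupleTypes<Head_, Tail_...>> {
    typedef Head_ type;
};

// Recursive step: drop the head and decrement the index.
template <std::size_t I_, typename Head_, typename... Tail_>
struct TupleElement<I_, TupleTypes<Head_, Tail_...>> {
    typedef typename TupleElement<I_ - 1, TupleTypes<Tail_...>>::type type;
};

static_assert(std::is_same<TupleElement<1, TupleTypes<int, double, char>>::type,
                           double>::value,
              "element 1 is double");
```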

Note: TupleElement could be implemented by letting overload resolution find the correct type for us instead of the linear recursion presented above. See this other nice blog post about tuples for details.
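One way the overload-resolution approach can work, as a rough sketch and not necessarily the technique of the linked post: because every leaf has a unique index, template argument deduction against the base classes picks out the one leaf with the requested index. All names below (Identity, selectLeaf, Flat, ElementByLookup) are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

template <std::size_t I_, typename ValueType_>
struct TupleLeaf { ValueType_ value; };

template <typename Type_>
struct Identity { typedef Type_ type; };

// Declared but never defined: it is only used inside decltype.
// I_ is given explicitly, ValueType_ is deduced from the matching base class.
template <std::size_t I_, typename ValueType_>
Identity<ValueType_> selectLeaf(const TupleLeaf<I_, ValueType_>&);

// Hand-written stand-in for a flat tuple of (int, double, char).
struct Flat : TupleLeaf<0, int>, TupleLeaf<1, double>, TupleLeaf<2, char> {};

template <std::size_t I_, typename Tuple_>
struct ElementByLookup {
    typedef decltype(selectLeaf<I_>(std::declval<Tuple_>())) Selected;
    typedef typename Selected::type type;
};

static_assert(std::is_same<ElementByLookup<1, Flat>::type, double>::value,
              "deduction finds the leaf with index 1");
```

No recursive template instantiation is involved, so the cost no longer grows with the index.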

Now we can return to our helper template that creates TupleTypes for us:

MakeTupleTypes extracts the types from Tuple_ in the range [Start_ .. End_] and stores them in our TupleTypes type. The element types are determined with the TupleElement defined above and appended to a temporary list. As std::tuple_element (and consequently our TupleElement) doesn't work on reference types, we use a typedef for the reference-stripped type of Tuple_ (TupleType at line 8), but we convert the result type from TupleElement to a reference if the original Tuple_ was an lvalue reference (types that were already reference types in the original tuple are not touched by this transformation).

TupleImpl and swallow()

We're ready to assemble these pieces into our internal TupleImpl class that will be the basis of the final Tuple.

The swallow() function is just a little trick that lets us use template-parameter-pack expansion in places where it wouldn't be possible otherwise: when we want to call foo() for each element of the template-parameter-pack, we call swallow() with the expansion like this:

The parameters of swallow() bind to anything except void, hence we need to make sure that our foo() does return something (is not void foo()). That's why we have the "strange" signature of our TupleLeaf::swap() function. Since the function itself doesn't do anything, the actual call will be optimized away, and the only visible effect will be that foo() is called for each element.
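A self-contained sketch of the trick. The functions record() and recordAll() are hypothetical stand-ins for foo() and its caller; record() returns int only so that its result can be passed to swallow():

```cpp
#include <cassert>
#include <vector>

// Accepts any number of arguments of any non-void type and ignores them.
template <typename... Types_>
void swallow(Types_&&...) {}

// Per-element operation with a dummy return value.
inline int record(std::vector<int>& out, int value) {
    out.push_back(value);
    return 0;
}

// Calls record() once for every argument. The order of the calls is
// unspecified, because they are arguments of a single function call.
template <typename... Args_>
void recordAll(std::vector<int>& out, Args_... args) {
    swallow(record(out, args)...);
}

inline void example() {
    std::vector<int> seen;
    recordAll(seen, 1, 2, 3);
    assert(seen.size() == 3);  // all three calls happened, in some order
}
```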

Lines 7-8:

The first template parameter of our TupleImpl template will be our TupleIndexes, and the rest are the actual types stored in the tuple. We'll immediately specialize this template declaration using actual indexes (as non-type template parameters) below.

Lines 10-12:

Here's our (only) specialization using actual indexes, and fixing the first type parameter to be TupleIndexes, and as we mentioned at the beginning, we're using multiple inheritance from TupleLeaf to build our class. The indexes and corresponding types are expanded in tandem, forming the base-classes.
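Stripped of constructors and assignment, the tandem expansion can be sketched as follows. The leaf is reduced to a bare data member, and the typedef Impl exists only for this illustration:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

template <std::size_t I_, typename ValueType_>
struct TupleLeaf { ValueType_ value; };

template <std::size_t... Indexes_>
struct TupleIndexes {};

// Primary template: declared only, every use goes through the specialization.
template <typename Indexes_, typename... Types_>
struct TupleImpl;

// Indexes_ and Types_ are expanded together, so the k-th base class is
// TupleLeaf<k-th index, k-th type>. Both packs must have the same length.
template <std::size_t... Indexes_, typename... Types_>
struct TupleImpl<TupleIndexes<Indexes_...>, Types_...>
    : TupleLeaf<Indexes_, Types_>... {};

typedef TupleImpl<TupleIndexes<0, 1, 2>, int, double, std::string> Impl;

static_assert(std::is_base_of<TupleLeaf<1, double>, Impl>::value,
              "the double lives in the leaf with index 1");
```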

Lines 13-23:

This is our "standard" constructor. As mentioned before, we'll allow construction of our final Tuple with fewer parameters than tuple elements. To allow this, this TupleImpl constructor splits the parameters in two, and expands them separately. The combined number of parameters must of course match the actual number of elements, but we'll take care of that in the final Tuple implementation.

Lines 25-31:

This is our "converting" constructor. We extract the elements from OtherTuple with get() expanded for each index our TupleImpl has, and forward them to the corresponding TupleLeafs. Since std::forward requires an explicit type specifier, we determine the correct type to use by building TupleTypes from OtherTuple and extracting the required type using TupleElement.

Lines 33-40:

This is our "converting" (or forwarding) assignment operator. It's pretty similar to our "converting" constructor above; we just use the swallow() function to call operator= of our TupleLeafs with the corresponding elements from OtherTuple.

Lines 42-46:

We need to define the "standard" assignment operator, too, because we explicitly disabled the standard operator= in our TupleLeaf (we only use the forwarding assignment operator), so the compiler cannot create the default assignment operator for us (as that would use TupleLeaf's default assignment operator, too).

Lines 48-51:

TupleImpl::swap() swaps each leaf piecewise using the swallow() trick.

Note: The swallow() implementation presented here evaluates its arguments in an unspecified order. We could force left-to-right evaluation by turning swallow into a struct with a variadic constructor and using brace-init like this: swallow{(foo(Args), 0)...}. Also note the use of the comma operator in this expansion: calling swallow this way lets us use even void functions, so the trickery with the return type of the internal swap function above is no longer needed.
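A sketch of the brace-init variant. The struct is called Swallow here, and record() and recordInOrder() are hypothetical stand-ins for foo() and its caller. Initializers in a braced list are evaluated left to right, which is what makes the order predictable:

```cpp
#include <cassert>
#include <vector>

// Constructing this struct from a braced list evaluates the initializers
// strictly left to right.
struct Swallow {
    template <typename... Types_>
    Swallow(Types_&&...) {}
};

// Returns void: the comma operator below supplies the dummy value.
inline void record(std::vector<int>& out, int value) { out.push_back(value); }

template <typename... Args_>
void recordInOrder(std::vector<int>& out, Args_... args) {
    Swallow{(record(out, args), 0)...};
}

inline void example() {
    std::vector<int> seen;
    recordInOrder(seen, 1, 2, 3);
    assert(seen.size() == 3);
    assert(seen[0] == 1 && seen[1] == 2 && seen[2] == 3);
}
```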

Some SFINAE helpers: TupleLike, TupleConvertible and TupleAssignable

The following templates will be used by our final Tuple class to disable some template instantiations (via SFINAE).

TupleLike is a simple traits-template to check if the given type can be handled like a tuple, that is: tuple_size, tuple_element and get will work with it. The default implementation just inherits std::false_type, then we specialize it for our final Tuple, our internal TupleTypes and the standard tuple-like classes (std::tuple, std::pair and std::array).

TupleConvertible is similar to std::is_convertible: it just checks that each tuple element of From_ is convertible to the corresponding element of To_. The default TupleConvertible template inherits std::false_type (lines 23-28). The two bool template parameters are defaulted to true if From_ and To_ are both TupleLike. We then specialize this template for the case when both are indeed tuple-like and defer the work to TupleConvertibleImpl (lines 30-37). We need this trick because our TupleConvertibleImpl's first parameter is a bool indicating that From_ and To_ have the same number of elements. We check that with TupleSize, but it would blow up for non-tuple-like types, hence we can only use it when we're sure that both From_ and To_ are indeed tuple-like.

Once we get the size-match handled, TupleConvertibleImpl is pretty straightforward: we start with a default false_type (lines 3-6), then specialize for same-size tuples. This checks if the first elements are convertible, then recursively checks the rest of the elements (lines 8-16). The stop condition is when we reach empty TupleTypes<>, which are considered convertible (lines 18-21).
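The element-wise recursion can be sketched as follows, working directly on TupleTypes. The wrapper ListsConvertible is a hypothetical replacement for the TupleLike and TupleSize machinery, which is left out to keep the sketch short:

```cpp
#include <string>
#include <type_traits>

template <typename... Types_>
struct TupleTypes {};

// Default: sizes differ, so the tuples are not convertible.
template <bool SameSize_, typename From_, typename To_>
struct TupleConvertibleImpl : std::false_type {};

// Same size: check the heads, then recurse on the tails.
template <typename From0_, typename... FromRest_, typename To0_, typename... ToRest_>
struct TupleConvertibleImpl<true, TupleTypes<From0_, FromRest_...>,
                            TupleTypes<To0_, ToRest_...>>
    : std::integral_constant<bool,
          std::is_convertible<From0_, To0_>::value &&
          TupleConvertibleImpl<true, TupleTypes<FromRest_...>,
                               TupleTypes<ToRest_...>>::value> {};

// Stop condition: two empty lists are considered convertible.
template <>
struct TupleConvertibleImpl<true, TupleTypes<>, TupleTypes<>> : std::true_type {};

// Hypothetical convenience wrapper that supplies the size check.
template <typename From_, typename To_>
struct ListsConvertible;

template <typename... From_, typename... To_>
struct ListsConvertible<TupleTypes<From_...>, TupleTypes<To_...>>
    : TupleConvertibleImpl<sizeof...(From_) == sizeof...(To_),
                           TupleTypes<From_...>, TupleTypes<To_...>> {};

static_assert(ListsConvertible<TupleTypes<int, const char*>,
                               TupleTypes<double, std::string>>::value,
              "int converts to double, const char* converts to std::string");
```

Replacing std::is_convertible<From0_, To0_> with std::is_assignable<To0_&, From0_> would give one possible shape for an assignability check along the same lines.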

TupleAssignable is just like TupleConvertible, but we're using std::is_assignable in place of std::is_convertible:

TupleEqual and TupleLess are the usual recursive templates, parametrized on the number of tuple elements, that define an operator() to evaluate the relation for tuple elements up to the given index. The stop condition for both recursions is when I_ == 0, where TupleEqual returns true and TupleLess returns false.

After the optimizer is done with the code, a call like

TupleEqual<2>()(tuple1, tuple2)

ends up as if we had hand-written

get<0>(tuple1) == get<0>(tuple2) && get<1>(tuple1) == get<1>(tuple2).
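A sketch of both helpers in the style of libc++, written against std::tuple and std::get so that it is self-contained. The real versions would use the get() of our own Tuple instead:

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Compares elements 0 .. I_-1 for equality.
template <std::size_t I_>
struct TupleEqual {
    template <typename Tuple_, typename Other_>
    bool operator()(const Tuple_& lhs, const Other_& rhs) const {
        return TupleEqual<I_ - 1>()(lhs, rhs) &&
               std::get<I_ - 1>(lhs) == std::get<I_ - 1>(rhs);
    }
};

template <>
struct TupleEqual<0> {
    template <typename Tuple_, typename Other_>
    bool operator()(const Tuple_&, const Other_&) const { return true; }
};

// Lexicographic less-than over elements 0 .. I_-1.
template <std::size_t I_>
struct TupleLess {
    template <typename Tuple_, typename Other_>
    bool operator()(const Tuple_& lhs, const Other_& rhs) const {
        return TupleLess<I_ - 1>()(lhs, rhs) ||
               (!TupleLess<I_ - 1>()(rhs, lhs) &&
                std::get<I_ - 1>(lhs) < std::get<I_ - 1>(rhs));
    }
};

template <>
struct TupleLess<0> {
    template <typename Tuple_, typename Other_>
    bool operator()(const Tuple_&, const Other_&) const { return false; }
};

inline void example() {
    std::tuple<int, double> a(1, 2.5), b(1, 2.5), c(1, 3.0);
    assert(TupleEqual<2>()(a, b));
    assert(TupleLess<2>()(a, c));
}
```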

Tuple

Finally we can leave the detail namespace and put everything together in our final Tuple template:

The user-visible tuple_size and tuple_element templates just inherit from the corresponding "hidden" templates. Nothing fancy here.

Lines 13-18:

The generic Tuple template builds a detail::TupleImpl and holds that as its only data member.

Lines 20-33 and 107-131:

The get() functions are made friends so that they can access our private impl_ member. The three versions return reference, const-reference or rvalue-reference types depending on the type of the input parameter. We get the desired element by converting our impl_ member to the correct TupleLeaf type via a simple static_cast.

Lines 36-42:

This is the simplest of our constructors: it takes its parameters as const references, matching both the number and exact type of our tuple members. We pass these along to our TupleImpl's constructor, and since we fill out all parameters in the first set, the second set of parameters is empty (empty indexes and types).

Lines 44-59:

This constructor is responsible for constructing a Tuple from values convertible to our tuple elements. We allow fewer parameters than the number of elements we hold; the rest of the elements will be default constructed. As a special case, if the number of supplied parameters is 0, then all of our tuple elements will be default constructed, so this constructor acts as our default constructor, too. This constructor is enabled only if the supplied types are convertible to the corresponding types of our Tuple. This is checked via the TupleConvertible template defined above, after creating suitable TupleTypes from the supplied parameter pack.

Lines 62-67:

Our last constructor takes a single tuple-like argument, and forwards it to the corresponding TupleImpl constructor, thus this constructor is both our copy and move constructor, depending on context. This constructor is enabled only if OtherTuple is convertible to our Tuple. This implicitly checks that OtherTuple is in fact tuple-like. Without this tuple-like check there would be an ambiguity (or other compile error) if we tried to instantiate our Tuple with a single (non-tuple) parameter.

Lines 69-75:

The forwarding move/copy assignment operator is analogous to the copy/move constructor above; the only difference is that instead of checking for TupleConvertible, we check for TupleAssignable to enable this operator.

Lines 77-79, 103-104 and 133-139:

The swap() member function just calls TupleImpl::swap() (the implementation for empty tuples is of course a no-op). We also define a free swap() overload, so that it can be picked up by ADL when needed.

Lines 82-105:

The empty Tuple<> doesn't need to hold any data at all, so we specialize our template for that case to avoid unnecessary memory waste. The constructors are of course simpler than in the general case. We also won't need get functions, and our swap is a no-op.

Lines 141-171:

The two basic relational operators (equals and less-than) are implemented via the corresponding helper templates, the rest are just built using these two, as usual.

Finishing touches: integration with the std namespace

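In the last section we make our Tuple drop-in compatible with std::tuple by specializing the std::tuple_size and std::tuple_element templates for it, and by providing overloads for std::get, lifting the migl2::get functions into the std namespace. Strictly speaking, the standard only permits specializing existing std templates for user-defined types; adding new function overloads such as get to namespace std is formally undefined behavior, even if common compilers accept it.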
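The specialization part can be sketched with a small stand-in type. The struct demo::Point is hypothetical and only plays the role of migl2::Tuple here; the real specializations would inherit from the internal TupleSize and TupleElement templates instead of spelling out the types:

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace demo {
    // Hypothetical tuple-like type standing in for migl2::Tuple.
    struct Point { int x; double y; };
}

namespace std {
    // Specializing std templates for a user-defined type is permitted.
    template <>
    struct tuple_size<demo::Point> : integral_constant<size_t, 2> {};

    template <>
    struct tuple_element<0, demo::Point> { typedef int type; };

    template <>
    struct tuple_element<1, demo::Point> { typedef double type; };
}

static_assert(std::tuple_size<demo::Point>::value == 2, "two elements");
```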

Disclaimer

I'm returning to C++ after 9 years of Java development, so take everything presented here with a grain of salt, and please point out any mistakes in the comments section below, so that I can fix them and not confuse others. As mentioned in the preamble, this code is based on the tuple implementation in libc++, which was created by Howard Hinnant, so hereby I thank him (and all other developers of libc++) for the inspiration.

The complete source for the above Tuple implementation can be found on bitbucket.

4 comments:

I tried compiling your code using the "VS2012 November 2012 CTP" compiler which supposedly supports variadic templates (and it does to some extent). However it didn't swallow your inheritance in the TupleImpl template.

Did you try this compiler yourself? It's free with VS Express.

I thought maybe I should file a bug report. MS has been quite good at correcting compiler bugs before. I just would want to tell them what compiler it works in.

According to my reading of the standard, std::tuple does not support tuple constructor calls with fewer arguments than the number of elements in the tuple (with the rest of the elements being default constructed). The libc++ implementation has the same FirstIndexes/FirstTypes, LastIndexes/LastTypes constructor machinery as you use to allow this (although it doesn't actually seem to permit such use). Any idea why they have this extra complexity if it is not actually supported?