Learn how to capture by move

I think one of the most attractive features of C++11 is lambdas. They simplify and encourage the use of STL algorithms more than before, and they may increase programmer productivity. Lambdas combine the benefits of function pointers and function objects: like function objects, lambdas are flexible and can maintain state, but unlike function objects, their compact syntax doesn’t require a class definition.

Consider a lambda of the form [](const string& s) { ... }: it has no return value and takes a const string& parameter. What about the “[ ]”? That is the lambda introducer (the capture specification), and it tells the compiler we’re creating a lambda expression. As you know, inside the square brackets you can “capture” variables from the outside (the scope where the lambda is created). C++ provides two ways of capturing: by copy and by reference. For example:
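A minimal sketch of the two capture modes follows. The function names and the variable `name` are made up for illustration; only the capture lists `[name]` and `[&name]` matter here.

```cpp
#include <cassert>
#include <string>

// Capture by copy: the lambda keeps its own copy of name.
std::string greet_capturing_by_copy()
{
    std::string name = "Ann";
    auto greet = [name](const std::string& greeting) { return greeting + ", " + name; };
    name = "Bob";              // too late: the lambda already copied "Ann"
    return greet("Hello");
}

// Capture by reference: the lambda refers to the caller's name,
// which must outlive every use of the lambda.
std::string greet_capturing_by_reference()
{
    std::string name = "Ann";
    auto greet = [&name](const std::string& greeting) { return greeting + ", " + name; };
    name = "Bob";              // visible inside the lambda
    return greet("Hello");
}

void capture_example()
{
    assert(greet_capturing_by_copy() == "Hello, Ann");
    assert(greet_capturing_by_reference() == "Hello, Bob");
}
```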

The only difference is the way we capture the string name: this time we capture it by reference, so no copy is involved and the behavior is like passing a variable by reference. Obviously, if name is destroyed before the lambda is executed (or just before name is used), the behavior is undefined: boom!

After this introduction, I’m going to discuss a capturing issue I ran into a few days ago at work: what if I want to capture an object by moving it, instead of either copying or referencing it? Consider this plausible scenario:
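As a sketch of such a scenario: `HugeObject` below is a hypothetical stand-in for any type that is expensive to copy, and `prepare_work` is an invented name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a type that is expensive to copy.
struct HugeObject
{
    std::vector<int> payload = std::vector<int>(1000, 42);
};

// Builds a vector of HugeObject and returns a lambda that uses it.
// The vector is a local variable, so the lambda captures it by copy:
// every HugeObject is duplicated when the closure is created.
std::function<std::size_t()> prepare_work()
{
    std::vector<HugeObject> hugeObj(10);
    // ...fill or preprocess hugeObj here...
    return [hugeObj] { return hugeObj.size(); };
}
```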

This fragment of code prepares a vector of HugeObject (i.e. objects that are expensive to copy) and returns a lambda that uses this vector (the vector is captured by copy because it goes out of scope when the function returns). Can we do better?

“Yes, of course we can!”, I hear you say. “We can use a shared_ptr to reference-count the vector and avoid copying it”:
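One way this suggestion could look, again with a hypothetical `HugeObject` and an invented function name:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a type that is expensive to copy.
struct HugeObject
{
    std::vector<int> payload = std::vector<int>(1000, 42);
};

std::function<std::size_t()> prepare_work_shared()
{
    auto hugeObj = std::make_shared<std::vector<HugeObject>>(10);
    // Capturing the shared_ptr by copy only bumps a reference count;
    // the vector itself is never copied.
    return [hugeObj] { return hugeObj->size(); };
}
```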

I honestly don’t like the use of shared_ptr here, but this should work well. The subtle (possible) issue with this attempt is about style and clarity: why is the ownership shared? Why can’t I treat hugeObj as a temporary to move “inside” the lambda? I think that using a sharing mechanism here is like a hack to fill a gap in the language. I don’t want the lambda to share hugeObj with the outside; I’d like to “prevent” this:
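A sketch of the kind of unwanted sharing meant here (names are illustrative): the scope that created the lambda still holds the vector and can change it behind the lambda’s back.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a type that is expensive to copy.
struct HugeObject
{
    std::vector<int> payload = std::vector<int>(1000, 42);
};

// Returns the size the lambda sees after the outer scope modified the vector.
std::size_t size_seen_after_outside_change()
{
    auto hugeObj = std::make_shared<std::vector<HugeObject>>(10);
    auto toExec = [hugeObj] { return hugeObj->size(); };

    hugeObj->clear();   // the outer scope still co-owns the vector
    return toExec();    // the lambda observes the change
}
```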

The move_on_copy wrapper (a small class template whose copy constructor moves the wrapped value out of its source, so that copy-capturing the wrapper actually moves the value into the lambda) works, but it is not complete yet. To refine it, a couple of comments are needed. The first is about “usability”: the only aim of this class is to “replace” capture-by-copy with capture-by-move, nothing else. Now, capture by move makes sense only when we operate on rvalues and movable objects, so is the following code conceptually correct?

// due to universal references, T is deduced as const vector<HugeObject>&, so no copy/move is involved in move_on_copy's ctor
const vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(hugeObj);
auto toExec = [moved] { ...operate on moved.value... };
// hugeObj here is the same as before

Not only is it useless, but it is also confusing. So, let’s require our users to pass only rvalues:
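A sketch of how such a restriction could be written. The `move_on_copy` struct here is an assumed reconstruction of the wrapper (copying it moves the wrapped value), and the use of `std::enable_if` with `std::is_rvalue_reference` is one possible choice of traits, not necessarily the exact original code.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Assumed shape of the wrapper: "copying" it moves the value out of the source.
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}

    move_on_copy& operator=(const move_on_copy&) = delete;

    mutable T value; // mutable, so the copy constructor can move from a const source
};

// Takes part in overload resolution only when the argument is an rvalue;
// an lvalue argument yields a "no matching function" error.
template<typename T>
auto make_move_on_copy(T&& aValue)
    -> typename std::enable_if<std::is_rvalue_reference<decltype(aValue)>::value,
                               move_on_copy<T>>::type
{
    return move_on_copy<T>(std::move(aValue));
}

std::size_t example_capture_by_move()
{
    std::vector<int> hugeObj(1000, 7);
    auto moved = make_move_on_copy(std::move(hugeObj));
    auto toExec = [moved] { return moved.value.size(); }; // the copy-capture moves
    assert(moved.value.empty()); // the wrapper was emptied by the capture
    return toExec();
}
```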

We “enable” this function only if aValue is an rvalue reference; to do this we make use of a couple of type traits. Strangely, this code does not compile on Visual Studio 2010, so if you use that compiler, try to settle for:
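One plausible fallback is to check the same condition with a static_assert inside the function body instead of in the signature. This is a hedged sketch, not necessarily the code the post originally showed; the `move_on_copy` struct is again an assumed reconstruction.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Assumed shape of the wrapper: "copying" it moves the value out of the source.
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}

    move_on_copy& operator=(const move_on_copy&) = delete;

    mutable T value;
};

// The function is always selected; passing an lvalue trips the static_assert.
// remove_reference keeps the return type well-formed even in that error case.
template<typename T>
move_on_copy<typename std::remove_reference<T>::type> make_move_on_copy(T&& aValue)
{
    static_assert(std::is_rvalue_reference<decltype(aValue)>::value,
                  "make_move_on_copy expects an rvalue, e.g. std::move(x)");
    return move_on_copy<typename std::remove_reference<T>::type>(std::move(aValue));
}

std::size_t example_static_assert_version()
{
    std::vector<int> hugeObj(1000, 7);
    auto moved = make_move_on_copy(std::move(hugeObj));   // fine: rvalue
    // auto bad = make_move_on_copy(hugeObj);             // would trip the static_assert
    return moved.value.size();
}
```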

What aroused my suspicions was the syntax of lambda expressions: if you copy-capture an object, the only way to access its non-const members (i.e. to make changes) is to declare the lambda mutable. This is because a function object should produce the same result every time it is called. If we want to support this requirement, then we have to make a little change:
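One way to make such a change, as a sketch: hide the member and expose it through const and non-const accessors, so that a non-mutable lambda only gets read access. The accessor name `Value` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

template<typename T>
class move_on_copy
{
public:
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}

    move_on_copy& operator=(const move_on_copy&) = delete;

    // Inside a non-mutable lambda the captured wrapper is const,
    // so only the const overload is callable there.
    T& Value() { return value; }
    const T& Value() const { return value; }

private:
    mutable T value; // still mutable, but only the copy constructor relies on it
};

std::size_t append_and_count()
{
    std::vector<int> data(3, 1);
    move_on_copy<std::vector<int>> moved(std::move(data));

    // `mutable` is required here: without it, moved.Value() would return
    // a const reference and push_back would not compile.
    auto toExec = [moved]() mutable -> std::size_t {
        moved.Value().push_back(2);
        return moved.Value().size();
    };
    return toExec();
}
```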

Personally, I doubt this is the best way to capture expensive-to-copy objects. What I mean is that working with rvalues masked by lvalues can be a little harder to understand, which can make the code painful to maintain. If the language supported a syntax like:

HugeObject obj;
auto lambda = [move(obj)] { ... };
// obj was moved, it is clear without need to look at its type

it would be simpler to understand that obj will be in an unspecified state after the lambda creation statement. Conversely, the move_on_copy wrapper requires the programmer to look at obj’s type (or name) to realize it was moved and some magic happened:

HugeObject obj;
auto moved_obj = make_move_on_copy(move(obj)); // this name helps but it is not enough
auto lambda = [moved_obj] { ... };
// obj was moved, but you have to read at least two lines to realize it

[Edit]

As Dan Haffey pointed out (thanks for this) “the move_on_copy wrapper introduces another problem: The resulting lambda objects have the same weakness as auto_ptr. They’re still copyable, even though the copy has move semantics. So you’re in for trouble if you subsequently pass the lambda by value”. So, as I said just before, you have to be aware some magic happens under the hood. In my specific case, the move_on_copy_wrapper works well because I don’t copy the resulting lambda object.
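The weakness can be reproduced with a short sketch (the `move_on_copy` struct is an assumed reconstruction of the wrapper): what looks like a copy of the lambda silently steals the captured vector from the original.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed shape of the wrapper: "copying" it moves the value out of the source.
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}

    move_on_copy& operator=(const move_on_copy&) = delete;

    mutable T value;
};

// Returns (size seen by the original lambda, size seen by its copy).
std::pair<std::size_t, std::size_t> sizes_after_copying_lambda()
{
    std::vector<int> data(5, 0);
    move_on_copy<std::vector<int>> moved(std::move(data));
    auto original = [moved] { return moved.value.size(); };

    auto copy = original;   // looks like a copy, but moves the captured vector
    return std::make_pair(original(), copy());
}
```

Here the original lambda is left with an empty vector, just as an auto_ptr is left null after being “copied”.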

Another important issue is: what semantics do we expect when a function that performed a capture-by-move gets copied? If you used capture-by-move, presumably you didn’t want to pay for a copy, so why copy functions? The copy should be forbidden by design, so you can employ the approach suggested by jrb, but I think the best solution would be having the support of the language. Maybe in a future standard?
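For reference, C++14 did add this kind of support in the form of init-captures (generalized lambda captures). A minimal sketch, assuming a C++14 or later compiler:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

std::size_t example_init_capture()
{
    std::vector<int> hugeObj(1000, 7);

    // The closure member named hugeObj is initialized by moving from the local one.
    auto toExec = [hugeObj = std::move(hugeObj)] { return hugeObj.size(); };

    assert(hugeObj.empty()); // the local vector was moved from
    return toExec();
}
```

With this syntax the move is visible at the capture site, and a copy of the closure is a real copy rather than a disguised move.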

Since other approaches have been proposed, I’d like to share my final note on this topic. I propose a sort of recipe/idiom that is (I think) easy to use in existing codebases. My idea is to use my move_on_copy only with a new function wrapper that I called mfunction (movable function). Unlike other posts I have read, I suggest not writing a totally new type (that may break your codebase) but instead inheriting from std::function. From an OO standpoint this is not perfectly consistent because I’m going to violate (a bit) the is-a principle: my new type will be non-copyable (unlike std::function).

In my opinion, inheriting from std::function avoids reinventing the wheel and allows you to use mfunction where a std::function& is needed. My code is on ideone, as usual.
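The ideone code is not reproduced here. As a rough sketch of the idea only, a movable but non-copyable type derived from std::function could look like this (everything except the names `mfunction` and `move_on_copy` is an assumption):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

// Assumed shape of the wrapper: "copying" it moves the value out of the source.
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    move_on_copy& operator=(const move_on_copy&) = delete;
    mutable T value;
};

template<typename Signature>
class mfunction;

// Hypothetical sketch: a std::function that can be moved but not copied.
template<typename R, typename... Args>
class mfunction<R(Args...)> : public std::function<R(Args...)>
{
    typedef std::function<R(Args...)> base;
public:
    mfunction() = default;

    template<typename F>
    mfunction(F f) : base(std::move(f)) {}

    mfunction(mfunction&&) = default;
    mfunction& operator=(mfunction&&) = default;

    mfunction(const mfunction&) = delete;
    mfunction& operator=(const mfunction&) = delete;
};

static_assert(!std::is_copy_constructible<mfunction<std::size_t()>>::value,
              "mfunction must not be copyable");
static_assert(std::is_move_constructible<mfunction<std::size_t()>>::value,
              "mfunction must be movable");

// Existing code that takes a std::function by reference keeps working.
std::size_t call_through_reference(std::function<std::size_t()>& f) { return f(); }

std::size_t example_mfunction()
{
    std::vector<int> data(5, 0);
    move_on_copy<std::vector<int>> moved(std::move(data));

    mfunction<std::size_t()> f([moved] { return moved.value.size(); });
    mfunction<std::size_t()> g(std::move(f));   // moving is allowed
    // mfunction<std::size_t()> h(g);           // would not compile: copying is deleted
    return call_through_reference(g);
}
```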

[/Edit]

Make the choice you like the most, without forgetting readability and comprehensibility. Sometimes a shared_ptr suffices, even if it may not be the perfect fit.

Lambdas are cool and it’s easy to start working with them. They are very useful in plenty of contexts, from STL algorithms to concurrency stuff. When capturing by reference is not possible and capturing by copy is not feasible, you can consider this “capture by move” idea, and remember that the language always offers a chance to work around its shortcomings.

The move_on_copy wrapper introduces another problem: The resulting lambda objects have the same weakness as auto_ptr. They’re still copyable, even though the copy has move semantics. So you’re in for trouble if you subsequently pass the lambda by value: http://ideone.com/YG37md

I’ve not tested this, but I think you could express your rvalues-only restriction on make_move_on_copy more clearly/directly with std::remove_reference instead of enable_if. With some luck, it might even work on VS2010!

Hi Jeff, thanks for your suggestion. I just think you have to add an extra level of indirection to maintain the same interface and let the compiler deduce the template argument T. In short, this won’t compile:
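To illustrate the deduction problem: a parameter of type `typename std::remove_reference<T>::type&&` puts T in a non-deduced context, so a plain call cannot infer it. The sketch below keeps that signature in a comment and shows one possible forwarding layer. The helper name `make_move_on_copy_impl` is hypothetical, and this is not necessarily the code that was exchanged in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Assumed shape of the wrapper: "copying" it moves the value out of the source.
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    move_on_copy& operator=(const move_on_copy&) = delete;
    mutable T value;
};

// Not callable as make_move_on_copy(std::move(x)), because T cannot be deduced:
//   template<typename T>
//   move_on_copy<T> make_move_on_copy(typename std::remove_reference<T>::type&& aValue);

// Hypothetical helper: T is supplied explicitly by the caller below,
// so T&& is a plain rvalue reference here.
template<typename T>
move_on_copy<T> make_move_on_copy_impl(T&& aValue)
{
    return move_on_copy<T>(std::move(aValue));
}

// Public entry point: deduces T, strips the reference and forwards.
// An lvalue argument is rejected, because it cannot bind to the helper's T&&.
template<typename T>
move_on_copy<typename std::remove_reference<T>::type> make_move_on_copy(T&& aValue)
{
    typedef typename std::remove_reference<T>::type value_type;
    return make_move_on_copy_impl<value_type>(std::forward<T>(aValue));
}

std::size_t example_forwarding_layer()
{
    std::vector<int> data(4, 1);
    auto moved = make_move_on_copy(std::move(data));   // fine: rvalue
    // auto bad = make_move_on_copy(data);             // would not compile: lvalue
    return moved.value.size();
}
```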

Re: comment @ November 30, 2012 at 11:48 pm (I don’t seem to be able to actually reply to said post)

Marco – thanks for the reply, and doh, it didn’t occur to me that the change would prevent type deduction.

Fixing the type by adding the extra level makes perfect sense, although the extra code takes away any readability advantage it had over enable_if.
Still, it ought to produce better error messages. In gcc47, I get “could not convert Foo to Foo&&” instead of “move_on_copy was not declared in this scope”

I made a quick example to test it out this time, and added a refinement of yours that moves the remove_reference stuff around to simplify it a bit. It’s at https://gist.github.com/4183939; I’ve put the full error messages in there as comments for comparison

Thanks Jeff! I think you could write something about this topic; it is not obvious how to require an rvalue reference parameter in a templated function. My approach works, but yours is clearer and simpler to read!

Your solution is interesting but I think it’s harder to maintain and extend. What about employing it in an existing codebase, full of std::function and lambda expressions? Probably, as I said, a shared_ptr wins (for both behavior and clarity; the latter, as you know, is pivotal when you work on a big and complex codebase). Another important point is: what semantics do we expect when a function that performed a “capture-by-move” gets copied? If you used “capture-by-move”, presumably you didn’t want to pay for a copy, so why copy functions? You’re right, the copy should not be possible by design. So you have at least three solutions:

write another wrapper to make non-copyable functions (as you did), paying in compatibility with existing code (you introduced a new type)

use a different approach (e.g. a shared_ptr), paying nothing but sharing ownership when it is not necessary

use the move_on_copy_wrapper judiciously, paying in compatibility with the semantics of std::function (we have to remember not to copy a function that performed a capture-by-move or, at least, we should know what happens under the hood)

The best solution could be having the support of the language.
Thanks for your code and for your comment!

Your `mfunction` is probably a bad idea. It inherits from `std::function`… but `std::function`s are often passed around by value (rather than by reference). Passing an `mfunction` to a callee that expects a `std::function` will cause slicing (i.e., constructing a new `std::function` using `std::function`’s copy constructor), which is probably not what the programmer intended. And unfortunately I don’t think there’s any way for `mfunction` to override or disable the “mfunction-to-std::function” conversion.