I see this myth coming up a lot. People think that lambdas can cause a heap allocation – on Reddit, on StackOverflow, on Channel9 comments, in personal conversations. I don’t blame people for thinking this – C++ is complex. But it always seems to be some vague notion of “heap allocations can happen under some circumstances”. So let’s think about it more clearly.

Lambdas are a language feature. The compiler will create a closure object representing the function literal you write. The closure object is a value type, created as a prvalue and with automatic storage when assigned to auto (as you must, because its type is unutterable). The only way heap allocation can occur is if you by-value capture a variable whose copy constructor does heap allocation (and this isn’t a consequence of the lambda per se, as I hope you can see).

You can use lambdas all day long on the stack, assigned to auto variables, and never incur a heap allocation. Since lambdas are a language feature, it’s unclear if they could do allocation even if it were desirable – in C++, allocation is in the realm of library code, and I think it would almost certainly entail semantic difficulties for the language, as well as difficulties with exceptions and the like.

Where the confusion comes from, perhaps, is a conflation of lambdas with std::function. (Because lambdas are commonly assigned to std::function for storage and passing around?) std::function is a library construct that wraps a callable using type erasure, and may very well incur heap allocation. In order to wrap any callable, it has to create an internal class in the standard type-erasure way, and that might involve an allocation.

However, std::function does have a couple of common optimizations up its sleeve. First is the so-called “small functor optimization” – a buffer inside a std::function object that is typically big enough to store a few pointers and can be used to store the internal object, assuming it will fit. This allows std::function to avoid heap allocation in common cases (typically just one or maybe two captures).

The second optimization is a space optimization. The typical type-erasure pattern involves calling a virtual function on an internal base class, whose derived class is parametrized on the actual type passed in. But every lambda has a different type, so a naive implementation of this could result in many vtables being generated. So std::function commonly optimizes the call machinery, basically by supplanting the normal C++ virtual call process with a free function implementation that doesn’t cause vtable bloat.

And that’s about it. Lambdas don’t (intrinsically) cause any heap allocation. Ever. When you assign to a std::function, that may cause an allocation, but for a small number of captures, the small functor optimization will probably apply.

They’ll recognize the erase-remove idiom and correctly say that it’s removing even numbers from the vector so that what’s left is { 1,3,5 }. So then you ask them how it works, and likely as not, they say something like:

“The remove_if moves all the elements you want to remove to the end, then the erase gets rid of them.”

This isn’t what remove_if does. (And likewise, the other remove* algorithms.) If it did that – which is more work than it does – it would in fact be partition. What remove does is move the elements that won’t be removed to the beginning. A suitable implementation of remove_if, which you can see in the libc++ source, is:

The elements that got removed just got overwritten. They didn’t get moved to the end. After the call to remove_if (before the erase call), the vector was { 1,3,5,4,5 }.

This means that remove can potentially invalidate invariants, e.g. if we expect the sequence to contain unique values (the erase restores the invariant, so erase-remove should always be paired in this case). And if this had been a container of pointers, in all likelihood, memory would have been leaked. Hence item 33 in Effective STL, “Be wary of remove-like algorithms on containers of pointers”.

So remember, remove doesn’t move things to the end. Next time you hear that, congratulate the programmer saying it – they’re one of today’s lucky 10,000!

How many bugs are left? The Lincoln Index (Fri, 06 Mar 2015)

You’re working on some software, and you have some QA folks testing it. How do you know how many bugs are left? You know the bugs that the testers find, but how can you estimate the number that aren’t yet found?

If your testers work independently, and if your feature is not under continuous development (i.e. you’re not adding bugs), (and maybe a couple of other ifs) you can use a thing called the Lincoln Index.

It’s quite simple.

Assume that there are N total bugs (found and as-yet-unfound).
Tester a finds A bugs. A = PaN, where Pa is the probability that tester a finds any given bug.
Similarly, tester b finds B bugs. B = PbN, by the same reasoning.

There will be some bugs found by both testers, call that number C. It is evident that C = PaPbN. (The probability that both find a bug is the product of the individual probabilities.)

Now:

AB / C = (PaN)(PbN) / (PaPbN) = PaPbN² / PaPbN = N² / N = N

So a simple calculation gives us an estimate for N, the total number of bugs.

Adopting C++11: no-brainer features (Sat, 21 Feb 2015)

C++11/14 is a significant change from C++98/03, and features like move semantics take a while to get used to. Also, people tend to be quite conservative about adopting new features (especially if they look unfamiliar). It took us in the games industry a while to move to C++ from C. But here are what I consider the no-brainer, programming-in-the-small things to adopt. Extra safety and expressiveness with zero runtime impact and pretty much no potential for misuse (never say never, but you’d really have to go out of your way).

1. using instead of typedef

Not only is it a whole two characters fewer to type, it’s easier to read in the common function pointer case, and you can use it for template aliases, which often saves a lot more verbosity in the rest of the code. No more typename everywhere! Compare:

// old and busted: typedef
template <class T>
struct Foo
{
  // I can't typedef Foo<T>::type
  // and I have to use typename everywhere
  typedef T type;
};

// Typical usage: a function pointer
typedef void (*funcptr)(int);

2. static_assert

Compile-time asserts; what’s not to like? Odds are you have a homegrown C++98 TMP version of this somewhere; now it’s part of the language. Extra safety for zero runtime cost.

3. nullptr

Again, extra safety for zero cost. Ditch your zeros and NULLs, and you can safely have functions overloaded on pointer and integral types.

4. scoped enums (enum class)

The compiler can help prevent accidental confusion between types, and the enum values don’t leak any more. Hurrah! (It’s like Haskell’s newtype for integers!)

5. forward declarations of enums, underlying type for enums

This works on (new) scoped and (old) unscoped enums alike. You don’t have to put in fake bit-width values to force the size of an enum any more, and you can hide the values separately from the declaration, so you don’t need to recompile everything when you add a value.

6. override

When (for example) the class you’re deriving from changes upstream, and now you’re accidentally not overriding a member function you thought you were… you’d really like to know that. With override, a bug that might take you a couple of hours to track down becomes a trivial-to-fix compile error. (Often mentioned in the same breath as override is final, and it’s fine, but much less usefully applicable.)

Now, there are also a couple of things to stop using.

1. rand()

Stop using rand(). Really, it’s bad. Deep down, we always knew it was “bad” but maybe we convinced ourselves it was OK for “small”/”quick-and-dirty” tasks. It isn’t. And now, it’s not any easier or faster than just doing the Right Thing with <random>. STL explains it all in rand() considered harmful.

2. bind

Lambdas supersede bind in every way. They’re more straightforward to write, easier to read, as powerful and at least as efficient if not more so, and just… you know, not so weird. And if you think lambdas take some getting used to, well, I don’t think I ever got used to bind…

A problem with C++ lambdas? (Thu, 19 Feb 2015)

C++ lambdas are wonderful for all sorts of reasons (especially with their C++14-and-beyond power). But I’ve run into a problem that I can’t think of a good way around yet.

If you’re up to date with C++, of course you know that rvalue references and move semantics are a major thing. At this point, there are a lot of blog posts, youtube/conference videos, and even books about how they work, and also about how forwarding references (formerly known as universal references) work with templates and std::forward to provide nice optimal handling of objects with move semantics.

In C++14 the ability to handle lambda captures with move semantics was added with generalized lambda capture; another piece of the puzzle as far as move-semantic completeness goes.

Another lesser-known but important piece of the complete picture is that class member functions can have reference qualifiers. (You can find this in the standard in section 9.3.1 [class.mfct.non-static]). This means that just as we write member functions with const modifiers, we can overload member functions with & or && modifiers, and they will be called according to the value category of the object (*this). So you know when you can safely call std::move on data members.

Now, lambdas are conceptually like anonymous classes where the body is the operator() and the captured variables are the data members. And we can write lambdas with a mutable modifier indicating that the data members are mutable. (By far the common case is for them to be const, so the ordinary usage of const on a member function is inverted for lambdas.)

I said conceptually because it turns out they aren’t completely like that in at least one important way: lambdas can’t have reference qualifiers. Maybe for good reason – how would the compiler implement that? How would a programmer who wanted that behaviour specify the overloads? These are tricky questions to answer well. But it is a problem: as far as I can tell, there is no way to know, inside a lambda, what the value category of the lambda object is. So the performance promise of the move-semantic model falls down in the face of safety concerns: I don’t know whether I can safely move from a captured variable inside a lambda.

If anyone has any ideas about this, please let me know. Google and StackOverflow can tell me all about how move captures work with lambdas, but nothing about how to move things out of lambdas safely, or divine the value category of a lambda object. All the things I’ve tried have either not worked, or have resulted in suboptimalities of various kinds. (And frankly, if anything had worked, at this point I’d put it down to a compiler quirk and not to be relied on.)

As far as I can tell, it’s a definite shortcoming of C++ that there’s no way to do this in a compile-time, type-inferred, lambda-land way. I don’t see this in the core language issues – is it a known problem, or is there a known way to solve it that I don’t know yet?

An empty initializer list {} shall not be used as the initializer-clause for an array of unknown bound. (note 105)
105) The syntax provides for empty initializer-lists, but nonetheless C++ does not have zero length arrays.

So if GCC/Clang are going to allow zero-length arrays, I think they should be consistent about it and do specialization “correctly”. A friend reports that Clang/LLVM 3.3 does what GCC does here and uses the base template. Clang/LLVM 3.5 uses specialization 2, which is “more” correct. I have yet to try with Clang/LLVM 3.6+.

Refactoring! My initial plan for customizing opener/closer/separator for containers turned out to be unwieldy: I realized that it wouldn’t be possible for me to provide default specializations and also allow clients to specialize. Also, you may have noticed that the code for printing pairs, tuples, strings and arrays didn’t allow for customization at all. So I reworked the customization, allowing a formatter object as a parameter to prettyprint() and providing a default one:

And now I can pass the formatter object through to each specialization of stringifier_select, and use it appropriately for pairs, tuples, strings and arrays, as well as iterables. When I want to override the formatter, I simply specify a new formatter type, implement the opener/closer/separator functions for the type in question the way I want to, and pass an instance to prettyprint.

So far, we can print containers, but what about arrays? And what about “pretty-printing” strings – perhaps we need to wrap them with quotes. Well, we know that with the existing code, both arrays and strings count as outputtable. Both std::string and char* (and its variously const friends) can be passed to operator<<, and so can arrays, because they decay to pointers.

So, we should be able to deal with this using common-or-garden specialization on stringifier_select. Specializing for std::string is easy:

Specializing for arrays (other than of char) is also easy, and gives us a chance to abstract out the code that we used for the iterable printing. By happy circumstance (i.e. by design!), arrays support std::begin() and std::end(), so we can write the following:

And the code for printing iterable things changes likewise. Unlike the situation with char*, we don’t need to deal with const and non-const separately because here, T itself is inferred to be const or not.

And that’s pretty much it – just a couple more things to add. I mentioned enum classes back in part 1, and here’s how we print out their values:

Simple. Two final things to add: first, specialize for pretty-printing bool equivalently to using std::boolalpha; second, distinguish a few “unprintable” things and output something for them – classes without operator<<, unions, nullptr. The code that does this is very similar to what we’ve already seen.

So now, we can pretty-print all “normally” printable things, containers, pairs and tuples, callable things, certain unprintables that we can meaningfully label, and really unprintable things with a fallback. I think that’ll do for the time being. It’s been a fun journey, exploring TMP techniques, C++ type support, mapping over tuples, and the amazing void_t.

This is getting to be a large nested “if-statement”, but it’s easy to follow, and the compiler doesn’t mind, so I don’t. Basically, this is the preferential order I want to use for outputting things. But then I discovered something puzzling, that took me a while to figure out. (In my defence, I discovered it late at night when I was probably not too sharp!)

// this outputs "1"!
cout << [](){} << endl;

Lambdas (and in fact, functions) can be passed to operator<< – which means they’ll get is_outputtable_tag and produce 1 when printed. Not good. I want to print “<callable (function)>” or “<callable (function object)>”. (I’m OK with lambdas and function objects coinciding here.) So why does printing a plain lambda work at all? Well, the answer (which took me too long to see, and doubtless you, learned reader, have already seen) is that a non-capturing lambda has an implicit conversion to a function pointer. And that, like all pointers, has an implicit conversion to bool. Anything non-zero (like a perfectly good pointer) is true, and when you print true (without using boolalpha), you get 1.

Hm. Think think think.

So, I think there is going to be a compromise here. And that compromise is going to be triggered if someone deliberately writes a function object with a conversion to bool and supporting operator<<. Because lambdas have operator() and a conversion to bool, and I don’t want to use operator<< on them. So, it’s void_t to the rescue again, and by now I have macroed the detection code.