Musings on C++

Recently I have been skimming through the presentations from CppCon 2015. I came across Andre Bergner’s short look at an implementation for currying functions in C++. Admittedly I jumped through it without close attention, so I don’t know what his design goals were (beyond the generic currying of functions) nor whether he was working under any limitations. However, I did see that it was basically storing the partially applied arguments in std::tuples and using std::index_sequence machinery to unpack them.

I thought surely this could instead be done with a lambda, to the advantage of much less syntactic noise.

C++ Considerations

With this in mind I decided to have a look at what an implementation might look like were it to use lambdas. I had to set some design goals, and without an actual use for currying a function, I chose them somewhat arbitrarily. Here’s the list:

Issue: It’s not possible to determine, as a compile-time calculation, what arguments an arbitrary function might accept without testing against specific arguments. Functions in general may be overloaded, may have default parameters, and may use variadic templates or variable argument lists. I either have to limit the curry implementation to functions with fixed signatures, or alternatively test whether the curried function is invokable every time a new argument is given.

Decision: Use the more general (and simpler) approach and have the curry implementation support C++ functions with an unknown number and type of arguments, testing whether the function can execute with the arguments provided at each step.
This means a function with default parameters will never accept entries for them in its curried form: it will have invoked the target function as soon as the last non-default parameter was provided.
It also means that if one passes incorrect arguments through, the curried function will continue to accept more ad infinitum; it can never be sure whether some future argument might make the call to the curried function viable.

Issue: C++ functions are very particular about what type qualifiers their arguments have: either for selecting the best match when the function is overloaded or when references come into play. The choice is how to capture the arguments’ types for storage.

Decision: I don’t have any use-case or argument for supporting parameters whose type qualifiers are preserved. As currying comes from functional programming, I figured capture-by-value would be simpler, truer to its origins, and make the code more readable.
One consequence of this is that currying functions which take reference parameters doesn’t make sense under this implementation.

The Implementation

The trick here is really in how to get the arguments into a lambda one-by-one, and then how to get them out of the lambda when we need to eventually invoke the intended function.

Every time the caller provides an additional argument to the curried function, we’ll get the compiler to check whether the function can now be called with the arguments collected so far. If it can, we simply return the result. If it cannot, we return a new function object which has captured all the arguments provided so far and is able to accept a new one – at which point it will again check whether it has sufficient arguments to invoke the original curried function.

It makes sense to split out the captured arguments from the function we’re going to pass them to. So really what our capture looks like is a lambda which takes a function as a parameter and it calls that function with all the arguments in its capture. That way we have a method of easily getting things into the capture (ie: through the lambda’s capture) and out of the capture (invoke the lambda with a function and it will pass all of its captured values to that function).

The implementation is tricky to describe. Hopefully it kind of speaks for itself. There are a few things I’d like to highlight though:

– Clearly we need an entry point which accepts a regular C++ function and returns its curried version. So this is curry(). It simply defers the actual currying to curry_impl(), as the latter needs to take the current capture. To this end curry() provides the initial capture (which is empty, because no arguments have been provided yet).

– We already know that we need two compilation paths: one for when we still need more arguments (returns a function object accepting the additional argument) and another for when we have enough and can invoke the intended function to exact its payload. The split here is in the two curry_impl() functions. Some template SFINAE magic helps the compiler choose which one should be called whenever a new argument is provided to the curried function.

– Looking at the curry_impl() function for the case when the caller still needs to provide more arguments: note the make_new_capture lambda. That’s how we take the new argument the caller has just provided and add it to all the ones already provided. We ask the previous capture (with all the previous arguments) to call the new one with its argument list. We then make a new capture which includes both the new value and all the previous ones. Pretty neat huh? But mind bending!

SFINAE Gotcha

I’m going to assume anyone reading this is familiar with SFINAE (if not, follow the link). There is a little trick to using it in the code sample. It may not be obvious on first look.

Since SFINAE only applies to ill-formed expressions in the immediate context of template argument substitution, any error caused by ill-formed code in the body of a template function is still a hard error. So in the code above it might be tempting to do the switch between which implementation of curry_impl() to call by asking the compiler to look at decltype( capture( func ) ) – which would in effect say “try to get the capture to pass its arguments to the function and tell us the return type”.

This results in a hard error because the mismatch actually occurs in the body of operator() on the capture lambda, and not in the decltype() clause we’d use in the template argument list.

What we need is for the body of the lambda to always be well-formed code but still be able to somehow use SFINAE to determine whether function f can be called with the arguments captured in the capture lambda.

Since the capture lambda is designed to call _any_ function we give it, the solution is simple: define a function object (in this case called is_callable_with) which declares an operator() that accepts any number of arguments. The function object is templated on the type we’d like to test. Then in a decltype() expression we can ask the compiler what the return type would be if an instance of is_callable_with<FUNC> were passed into the capture lambda. The capture lambda can never fail to have a well-formed body, since the is_callable_with<> function object accepts any number and type of arguments.

In this case is_callable_with declares that, in the case where F would be invokable with the arguments it sees, its operator() returns type yes. This then becomes the return type of the capture lambda which calls it, and so we can do some SFINAE magic based on that.

Of course since we never actually invoke is_callable_with<>, it only needs to be a declaration.

Hopefully the code is somewhat readable, if mind-bending. It was a fun exercise but I have no immediate use-case I can think of for it.

Future Considerations

I think some of the downsides to this implementation could be done away with. It is possible, for instance, to see whether adding machinery to curry() could figure out a known signature for the function it is wrapping.

Should the signature be known (or alternatively given to it as a template parameter), then we could do away with the SFINAE test and instead use a recursive template function, since the number and types of the arguments are known when curry() is called.

Additionally if all argument types are known then the curried form can easily be type-erased by a std::function after every application of an argument. Since I don’t have use-cases I would like to solve by currying a function, I don’t know how big a deal this would be.

The existing implementation could be the backup for when the signature is not known. I haven’t given this more than cursory thought because, as mentioned at the start of this post, I have no actual use-case in C++ for any of this.

I also think that with the right incantations of &&, std::forward and its cronies, it’s possible to present the captured types to the final function with all (or most) of their qualifiers. However having a partially bound function where the arguments could be references sounds like a recipe for disaster.

I’ve had the fortune to try out some C++11 features in my day job. Having other programmers start to use some of them, with real use-cases on an actively developing project, has been great. Among the features I was excited to put to use were auto and decltype.

I’ll look at decltype first since it is less contentious than auto. decltype almost always gives what an entry-level C++ programmer would expect. It always gives what the seasoned programmer would expect. It really only has one potentially unexpected outcome to the entry-level programmer: an lvalue expression’s type is a reference type.

So were x a local variable, decltype(x) differs from decltype((x)). The former being the type of x, the latter being a reference to the type of x. One place where this matters is for functions with decltype deduced return types wherein some programmers may be accustomed to wrapping their return values in parentheses:
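A minimal illustration of the difference (hypothetical names, shown with C++14’s decltype(auto) for brevity; the same applies to C++11 trailing decltype return types). A global is used so the snippet stays well-defined: with a local variable instead, by_reference() would hand back a dangling reference, which is exactly the trap.

```cpp
#include <type_traits>

int global = 0;

decltype(auto) by_value()     { return global; }   // decltype(global)   -> int
decltype(auto) by_reference() { return (global); } // decltype((global)) -> int&

// The parentheses alone change the deduced return type:
static_assert(std::is_same<decltype(by_value()), int>::value,
              "unparenthesized: the declared type");
static_assert(std::is_same<decltype(by_reference()), int&>::value,
              "parenthesized: an lvalue expression yields a reference");
```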

The function inadvertently returns a reference to a temporary object now!

I think this is a fairly minor quibble though. The benefits of decltype for generic programming clearly outweigh this minor source of confusion.

auto, on the other hand, I think is more dangerous. However, not for the reasons which seem to be most loudly trumpeted on the blogs and forums I read.

If I may try to summarize auto’s deduction in C++ layman’s terms: it deduces the simplest mutable form for storing a copy of an object. This is fairly straightforward, and the only potentially confusing deductions will, unlike decltype, be highly unlikely for the average programmer to stumble upon.

So what makes auto dangerous?

Hiding the type, in my experience, is very rarely an issue. Most deductions are known by the IDE (and made available to the programmer as code is written). In the infrequent situation that I need to see the variable declaration, the deduction of auto is usually apparent to me from the code it is given to deduce (usually a function call like .begin() or initialization from static values).

Hiding a copy operation isn’t a factor, because I understand that if I need a reference I have to say so (eg: auto&), just as in a pre-auto world I needed to make the type a reference when I wanted one. If I really have to mimic the exact type in a generic way, then decltype comes to the rescue. Furthermore, (N)RVO still applies as usual.

In fact I’d go as far as to say that for the longest time the almost-always-auto rule went a long way toward writing less dependent code, and I was becoming a big proponent of it.

Until I tried to squeeze in a proxy object, that is. Suddenly, code which would have worked written in the pre-auto style broke:
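The types below are hypothetical stand-ins for the scenario described: Config is the real object, and ConfigProxy is a write-through proxy for it.

```cpp
struct Config
{
    int value = 0;
};

// A proxy referring to an original Config; writes go through to the target.
struct ConfigProxy
{
    Config* target;
    int&    value;                               // refers into *target
    operator Config() const { return *target; } // converting makes a copy
};

ConfigProxy GenericQuery(Config& c) { return ConfigProxy{ &c, c.value }; }

void PreAutoStyle(Config& c)
{
    Config obj = GenericQuery(c); // proxy converts: obj is a detached copy
    obj.value = 42;               // the original is untouched
}

void PostAutoStyle(Config& c)
{
    auto obj = GenericQuery(c);   // surprise: obj is a ConfigProxy
    obj.value = 42;               // writes through to the original!
}
```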

Now the code has introduced an additional requirement: that the object returned by the query may not be a proxy object which modifies the original! This introduces a very hard to track down bug, because at first blush the more generic version of the code using auto appears to do the same thing as the pre-auto version. However, this could not be further from the truth.

In a pre-auto world the proxy object is free to use references or pointers internally which allow it to behave much like an l-value reference to another object (without actually having a reference on its type). The caller would specifically state the storage type of obj to be some non-proxy type and the proxy object would presumably have an implicit conversion to correctly construct a copy of the object it represents.

In the post-auto world the more generic implementation has subtly introduced an implicit requirement. It requires that the type returned from GenericQuery() can only be a copy of the object or a reference to the original object. In both cases auto would correctly deduce the storage type to something semantically appropriate. However in the case of the proxy object, auto would create a copy of the proxy object to be worked on. So in actual fact the example function would attempt to make changes to the original object via the proxy instead of the copy.

Furthermore, capturing a proxy type intended to be stored as another type can lead to further frustration if it is then passed on to other templated types with specializations that depend on the expected storage type rather than the proxy type.

Upon some consideration I really wish classes could define what their deduced storage type should be. However this would not only make auto confusing (now a programmer would have to look up a type declaration to see what auto will deduce) but also might have knock-on effects for template type deduction rules.

Unfortunately the only solution that came to mind is to define a storage_type() function which does any transformation necessary to turn an object into a type suitable for storage. The above example code would then be:
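A self-contained sketch of that idea (all names here are hypothetical): storage_type() is the identity for ordinary types, and an overload decays a proxy into a real, detached copy.

```cpp
struct Config { int value = 0; };

// A minimal write-through proxy standing in for a Config.
struct ConfigProxy
{
    Config* target;
    operator Config() const { return *target; }
};

Config      database;                                  // the original object
ConfigProxy GenericQuery() { return ConfigProxy{ &database }; }

// storage_type(): identity by default...
template <typename T>
T storage_type(const T& v) { return v; }

// ...overloaded to turn the proxy into a real copy (non-template, so it is
// preferred over the identity overload for a ConfigProxy argument).
Config storage_type(const ConfigProxy& p) { return p; }

void Example()
{
    auto obj = storage_type(GenericQuery()); // deduces Config, not the proxy
    obj.value = 42;                          // modifies only the copy
}
```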

However this method has its own raft of potential issues based on the visibility of the storage_type function. While these can be mitigated, that’s a separate discussion altogether.

I think this is a game-changer for me. auto now requires extra care and consideration when I am writing code. Clearly auto is still needed to capture lambda types, and it still saves headaches when a type is a highly specialized template with lots of noisy cruft. However, its use in generic code now has to be more considered rather than less.

It’s been quite a while since my last blog post. I’m still doing various things with C++, but haven’t had anything worth blogging about. Recently I began doing some testing with std::tuple to see what compilers might be able to optimize away and what they cannot. One of the first things I wanted to do was use it to store parameters – either for object construction or for use on a callable.

EDIT: A comment by FOONATHAN on this entry has a much tighter implementation of the code which can be found here: http://ideone.com/lUbOFA
The only small thing missing is that the tuple should ideally be forwarded inside apply. Other than that it is much cleaner, but the current VS2015 CTP seems to have a compiler bug and it fails to deduce one of the template parameters for this code.

The Mandate
Write some code to expand a std::tuple into the argument list for any callable, using forwarding mechanics to ensure the rvalue bindings are preferred when the tuple itself is an rvalue (or moved) so that move-only contents of the tuple are also supported.

So basically the method is fairly straightforward: a helper function recursively calls itself, adding a std::get for each successive ‘layer’, until it reaches index 0 of the tuple – at which point it can pass the variadic arguments it has built up on to the function we want to unpack the tuple into. Since the helper needs a specialization for the terminal case when the tuple index is 0 (where the desired function is finally called), it is wrapped as a function object so the struct can be specialized.
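As the original listing isn’t reproduced here, the following is a C++14 sketch reconstructed from that description (the names tuple_unpacker and tuple_into_callable are my own guesses). Forwarding the tuple at each step means std::get yields rvalue references when the tuple itself is an rvalue, which is what makes move-only contents work.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Recursive helper; a struct so the terminal case can be a specialization.
template <std::size_t N>
struct tuple_unpacker
{
    template <typename F, typename Tuple, typename... Args>
    static decltype(auto) apply(F&& f, Tuple&& t, Args&&... args)
    {
        // Peel off element N-1 and recurse toward index 0.
        return tuple_unpacker<N - 1>::apply(
            std::forward<F>(f), std::forward<Tuple>(t),
            std::get<N - 1>(std::forward<Tuple>(t)),
            std::forward<Args>(args)...);
    }
};

template <>
struct tuple_unpacker<0>
{
    template <typename F, typename Tuple, typename... Args>
    static decltype(auto) apply(F&& f, Tuple&&, Args&&... args)
    {
        // All elements unpacked: finally call the target.
        return std::forward<F>(f)(std::forward<Args>(args)...);
    }
};

template <typename F, typename Tuple>
decltype(auto) tuple_into_callable(F&& f, Tuple&& t)
{
    using tuple_t = typename std::decay<Tuple>::type;
    return tuple_unpacker<std::tuple_size<tuple_t>::value>::apply(
        std::forward<F>(f), std::forward<Tuple>(t));
}
```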

In the situation where tuple_into_callable is called with a function as the first parameter (as opposed to a lambda, for example) there is a case where the function could be overloaded. In the case of an overload it would ordinarily be up to the caller to specify which overload is intended. We want the compiler to choose which overload is called based on the tuple’s contained types. So the solution is a handy macro called wrap_overload() which simply turns an overloaded function signature into a generic lambda which is not overloaded. This way tuple_into_callable is able to have its template parameter deduced (there’s only one to choose from, so the caller does not need to specify), and the whole overload selection is delayed until the actual call is made.
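A sketch of the wrap_overload() idea (the macro name follows the post; the overloaded describe() functions are mine, for demonstration): the overload set becomes one generic lambda, so deduction has a single candidate, and overload selection happens inside the lambda at each call.

```cpp
#include <utility>

#define wrap_overload(func) \
    [](auto&&... args) -> decltype(auto) \
    { return func(std::forward<decltype(args)>(args)...); }

// An overloaded function for demonstration purposes:
int describe(int)    { return 1; }
int describe(double) { return 2; }
```

Passing &describe somewhere a single deduced type is expected is ambiguous; wrap_overload(describe) is not, yet the right overload is still chosen per call.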

Currently under Visual Studio 2015’s latest CTP there is a bug where a function returning decltype(auto) deduces an incorrect return type in the case where its return type is that of a function being passed in as a parameter. The TUPLE_FWD_RETURN macro is a little hack to get around that. (For those interested, the return type deduced is a && which causes the object being returned to go out of scope too early, causing an incorrect call to the destructor before the object is used. Correct behavior would be the deduction of a plain value type (non-reference) so that either RVO or move construction can bubble it up.)

I think the test code kind of speaks for itself. It tests that the correct overload for a target function is called, tests having rvalue and lvalue tuples, and also tests being able to work with a tuple containing a move-only type. The only thing to remember is that if the tuple contains a move-only type, then it needs to be moved into the tuple_into_callable function.

In this blog entry I present a fairly simple implementation of the djb2 hash function using constexpr which enables the hash to be computed at compile-time.

Until C++11 it was not possible to provide an easy-to-use compile-time hash function. Template meta-programming does not come to the rescue, as it toys with template expansion – varying template arguments as various templates expand with different specializations and constraints. TMP is suited to type manipulation and, to a limited degree, to doing some work with integer values (as they can be used for template arguments).

Additionally, indexing into an array was not a compile-time operation. So neither the mechanism for doing compile-time calculations (TMP) nor the actual accessing of the elements to be hashed (indexing into an array) could be done nicely at compile time.

I say ‘nicely’ because, if one were to separate the elements, it is possible to implement a hash using a macro, with a use-case looking similar to this:

enum {
JPEG_HASH = HASH('J','P','E','G')
};

Now in C++11 the specification adds a new keyword: constexpr. I won’t describe it in detail, because you’re either aware of its meaning or able to look up a much nicer description than I could write.

However it is worth noting that constexpr, combined with the relaxation of constraints around what operations can be done at compile time, allows us to implement a compile-time hash function with ease. I think the code is simple enough to really speak for itself.
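Since the listing isn’t included here, this is a minimal sketch of djb2 as a C++11 constexpr function. It is written recursively because C++11 constexpr bodies are limited to a single return statement; the function and type names are mine.

```cpp
typedef unsigned long hash_t;

// djb2: hash = hash * 33 + c, seeded with 5381, over a null-terminated string.
constexpr hash_t djb2_hash(const char* str, hash_t hash = 5381)
{
    return *str == '\0'
        ? hash
        : djb2_hash(str + 1, hash * 33 + static_cast<hash_t>(*str));
}

// Evaluated entirely at compile time:
static_assert(djb2_hash("") == 5381, "empty string hashes to the seed");

enum : hash_t { JPEG_HASH = djb2_hash("JPEG") };
```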

Note that at the time of writing, constexpr is not available in VS2013; you will need one of the latest CTP releases of the compiler to test it. It is currently supported by GCC.

This doesn’t seem like much of a blog entry – certainly if you search you’ll find other hash functions being implemented similarly. However this is likely to be step 1 in a more ambitious experiment. If it pans out, I’ll have a few more blog entries which use this specifically.

In this blog entry, I present a code snippet demonstrating how a templated math vector class can have a construction interface which allows it to be composed of any number of smaller or equal-dimension vector types. Its only requirement is that their value type is convertible to its own value type.

Over the last while I have been doing a fair bit in OpenCL at work. It has been quite an interesting experience in discovering where performance bottlenecks lie when coding for the GPU.

However, all that aside, there was one feature in OpenCL which I found quite handy: how composable the vector types are.

It’s quite handy, and for a brief moment I considered all the C-style math libraries I have used during my time writing C++ code in the video games industry and lamented that I didn’t have anything this handy for C++. So I figured: why not try to implement it and see how it comes out?

To get great composability, it’s really just a variadic template for one constructor. It needs SFINAE to restrict its applicability so it doesn’t consume parameter types the vector clearly cannot be composed from, allowing other constructors to handle those cases.

Furthermore there is room to report error conditions with static_assert (such as not supplying enough parameters).

Now Visual Studio 2013 has a few bugs in it. It seems to be a little fragile when it comes to variadic template expansion; I hit a few internal compiler errors where GCC was quite happy with the code. So some of the code is more verbose than it needs to be. Were VS2013 as compliant as GCC, I wouldn’t need a ‘detail’ namespace – the code therein would be located inside the vector class itself, and it would be more succinct too.

Thus the code is more verbose and looks a bit more messy than it ideally would be.

In a rather naive test I wrote some code using a typical C-style struct for a vector, did some operations, and then did similar operations with this class. For a release build with Visual Studio the assembly generated was pretty much the same (aside from a difference in choice of registers), so it at least indicates this should (as expected) be similar in performance.

Of course proper testing would be needed.

So here’s the code for it – just the shell, to demonstrate the aforementioned variadic constructor and swizzle functionality.
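The original listing isn’t reproduced in this chunk, so here is a compact C++11-style sketch of the composing constructor only (swizzles and the SFINAE guards mentioned above are omitted for brevity; all names are mine, not the original code).

```cpp
#include <cstddef>
#include <type_traits>

template <typename T, std::size_t N>
struct vec
{
    T v[N];

    vec() : v() {}

    // Composing constructor: any mix of scalars convertible to T and
    // smaller vec<U, M> values, flattened in order into the components.
    template <typename First, typename... Rest>
    explicit vec(const First& first, const Rest&... rest)
    {
        static_assert(components<First, Rest...>::value == N,
                      "component count must match the vector dimension");
        T* out = v;
        append(out, first);
        int expand[] = { 0, (append(out, rest), 0)... }; // left-to-right
        (void)expand;
    }

private:
    // How many components does a single argument contribute?
    template <typename U> struct arity
        : std::integral_constant<std::size_t, 1> {};
    template <typename U, std::size_t M> struct arity<vec<U, M>>
        : std::integral_constant<std::size_t, M> {};

    // Total components contributed by an argument pack.
    template <typename... Us> struct components
        : std::integral_constant<std::size_t, 0> {};
    template <typename U, typename... Us> struct components<U, Us...>
        : std::integral_constant<std::size_t,
              arity<typename std::decay<U>::type>::value
              + components<Us...>::value> {};

    template <typename U>
    void append(T*& out, const U& scalar) { *out++ = static_cast<T>(scalar); }

    template <typename U, std::size_t M>
    void append(T*& out, const vec<U, M>& src)
    {
        for (std::size_t i = 0; i < M; ++i)
            *out++ = static_cast<T>(src.v[i]);
    }
};
```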

In this blog entry, I present a code snippet to determine whether a given type contains a specific member function. The caller can indicate what kinds of parameters they intend to pass into the function.

A short while ago I wanted to implement some concepts. All I wanted were some fairly rudimentary ones which simply allowed me to categorize a type based on its interface, so that later template algorithms could be optimized based on how much functionality the given type had.

Think iterator classification but with their category being inferred from a compile-time inspection of the iterator’s member functions rather than the traits it ‘exports’.

The Basic Technique

This kind of thing has been around for a while. There are any number of variations which all differ on the details but the general technique remains largely the same:

The member function is mentioned in a meaningful way somewhere in the signature of a templated function overload. SFINAE kicks in, and if the binding cannot be done because the member function does not exist, the overload is rejected from the pool of candidates.
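A sketch of the classic pattern (the names yes, no, check, has_callable_Foo, TYPE and CHECKTYPE follow the text below; the original code isn’t reproduced here). The member-pointer probe only matches when Foo exists with exactly the signature void(int):

```cpp
template <typename TYPE>
struct has_callable_Foo
{
    typedef char yes;  // sizeof(yes) != sizeof(no)
    typedef long no;

    // Matches only when &CHECKTYPE::Foo exists with *exactly* this signature.
    template <typename CHECKTYPE, void (CHECKTYPE::*)( int )>
    struct probe {};

    template <typename CHECKTYPE>
    static yes check( probe<CHECKTYPE, &CHECKTYPE::Foo>* );
    template <typename CHECKTYPE>
    static no check( ... );  // the catch-all

    static const bool value = sizeof( check<TYPE>( 0 ) ) == sizeof( yes );
};

// Example types (mine, for demonstration):
struct ExactMatch { void Foo( int ); };
struct RefMatch   { void Foo( const int& ); };
```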

It allows you to query has_callable_Foo<>::value with any type as the template parameter. The compiler will statically determine whether value is true or false by evaluating an expression contained in the sizeof() clause.

sizeof() is used to determine which overload of check() would be chosen by the evaluated expression. You’ll note the two versions of check() return differently sized types (one a char, the other a long) to help determine which one would be called.

The first implementation of check() contains something which tries to refer to the member function we’re testing for on the TYPE/CHECKTYPE which is given. If that does not exist, the ‘catch-all’ version of check is used.

The Issue With the Old Way

Typically the meat of this technique is in how the compiler is asked to check for the existence of Foo. Traditionally this has been done using a member function pointer.

One of the main limitations with this technique is that it asks the wrong kind of question.

It asks “does the given type contain a function with these parameter types?”. The problem with this question is that it is too restrictive.

If Foo is declared as:

class MyObject
{
public:
void Foo( const int& );
};

…and I ask whether MyObject contains a function Foo() which takes an int, clearly the answer is no. It does not take an int. It takes a const int&.

However I think a much more useful question to ask is “does the given type contain a function which can _accept_ these parameter types?”. If we can get an implementation which asks _that_ question, we’ll get a very different answer.

Yes, while Foo formally accepts a const int&, it certainly can accept an int too! Both an lvalue and rvalue int will happily bind to a const int&.

A New Way of Implementing

Luckily with C++11 we actually have language support to do this! With decltype(), declval() and variadic templates we can actually answer a better question AND improve the usability of the function across the board.
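The following is a C++11 sketch reconstructed from the description below; the identifiers refval, HAS_MEMBER_CHECK and has_callable follow the post, but the exact details are my own guesses. Note how refval() treats a plain by-value T as an lvalue, and only an explicit T&& as an rvalue:

```cpp
#include <type_traits>
#include <utility>

// refval<T>(): like std::declval, except a by-value T behaves as an lvalue.
template <typename T> struct refval_type       { typedef T&  type; };
template <typename T> struct refval_type<T&&>  { typedef T&& type; };

template <typename T>
typename refval_type<T>::type refval();  // declaration only, never defined

// Generates has_callable_NAME<T, Args...>: "can a T call .NAME with values
// of the kinds described by Args...?"
#define HAS_MEMBER_CHECK( NAME ) \
    template <typename T, typename... Args> \
    class has_callable_##NAME \
    { \
        template <typename U> \
        static auto check( int ) \
            -> decltype( refval<U>().NAME( refval<Args>()... ), \
                         std::true_type() ); \
        template <typename U> \
        static std::false_type check( ... ); \
    public: \
        static const bool value = decltype( check<T>( 0 ) )::value; \
    };

HAS_MEMBER_CHECK( Foo )

struct MyObject
{
    void Foo( const int& );
};
```

Asked the better question, MyObject now reports that Foo is callable with an int, because both an lvalue and an rvalue int bind to const int&.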

External to the struct definition I give is an implementation of what I have called refval(). I want to mention two things about it: one is why I have it, and the other is why you may want to place it inside the has_callable struct rather than outside.

To explain why I have it, it’s easier to give an example of the problem.

If you’re going to ask whether a function will accept a parameter of type int, that’s actually an ambiguous question. There are two kinds of value you may have: an lvalue or an rvalue.

int i = 1;
Foo( i ); //using an lvalue
Foo( 5 ); // using an rvalue

When we ask the compiler to evaluate an expression (such as inside the decltype in the has_callable snippet) we have to be specific about which kind of value is going to be passed in. If the caller has an lvalue of type int, then it can bind to an int&; an rvalue of type int cannot. Since the expression inside decltype is unevaluated and never executed, a compile-time technique is needed for giving the compiler a valid expression for each parameter.

Were declval() to be used, it would incorrectly assume that when the parameter is by-value, it will be an rvalue. So refval() simply assumes any by-value parameter might be an lvalue by giving it a reference.

So:

refval< int >() // returns an int& -- behaves like an lvalue is being passed in
refval< int& >() // returns an int& -- remains the same
refval< int&& >(); // returns an int&& -- behaves like an rvalue is passed in

Now if you’re wanting to see whether the function would take an rvalue, you need to use the new C++11 notation for it: int&&

Another place where this makes a big difference is when the parameter is move-only (ie: it cannot be copied). Admittedly it is a niche case, but using refval() ensures that when the caller indicates the intended use is by-value with an lvalue then the has_callable snippet will correctly evaluate to false. If the caller indicates the intended use is specifically by rvalue, then it will evaluate to true since the rvalue can be moved.

I said there were two things I wanted to mention about refval(). The second point is that I have declared it outside of the has_callable struct; I like to keep macros as short as possible, and that’s why. However, if you put the entire snippet inside a namespace but then instantiate the macro (HAS_MEMBER_CHECK) outside of the namespace, the macro will not have access to refval(), which will be in a different scope. The easy fix is to make refval() a member of the has_callable struct so it is guaranteed to always be in scope.

One final remark about this code snippet: it cannot know about friendship. Now that I’ve mentioned it, it might seem obvious, but it’s worth keeping in mind. Should you have a function or class A which is a friend of class B, then asking this snippet whether a private member function of B is callable will result in ‘false’, when in fact that member function is visible from within the intended context (A).

That’s it 🙂 An interesting experiment in C++ nonetheless, and probably a temporary one at that (see: Concepts).