In parts 2 – 5 we have seen four different implementations of the monad. Now, it is time for a shootout. I want to see which version has best time performance and compare it with a regular task to evaluate the cost of using a monad.

The monad implementations are:

The first implementation that copies all its data.

The lazy implementation that evaluates at explicit call.

The references and move semantics implementation.

The pointer based implementation.

The Task at Hand

The challenge is to create a task that is hard to optimize by cleverly removing parts of the functionality. Imagine we have an algorithm that fills a vector with arbitrary data, which is then thrown away unused because the vector goes out of scope. We wouldn’t want the compiler to decide that it is useless to create the vector altogether and skip the code.

Further, I will make the task less predictable by extensive use of the rand() function. I know, rand() is not a great pseudo random engine. However, rand() is not used here to simulate randomness, but to get values that are available only at runtime, not at compile time – when the code optimizer runs. This way, the code cannot have optimizations based on compile time knowledge of constant values.

The algorithm that will be implemented by the functions that are ‘applied’ by the monad, and that will also be implemented by two regular functions is as follows (we will be working with the ValueObject class used in earlier episodes, but with a method added that retrieves a pointer to the payload):

Receive a (pointer / reference to a) ValueObject object.

Retrieve a char from the ValueObject object’s payload, with a random index, stored it in a global data structure.

Create a size for a new ValueObject object, that depends on the size of the received object, and the pseudo random generator.

Main Routine

In the main routine, we have a composition of 25 applications of the onemore function by the bind function. We execute this composition iterations times, which we call a run. We measure the time of each run. We average over the runs, and print the result tot screen.

The number of iterations and runs are given by rand(), which is seeded with a number obtained from the user.

Scoping is such that all objects are created and destroyed in measured time.

This is the code for the reference and move semantics based implementation:

As earlier, the timer code is from Simon Wybranski. The outer braces are just to assure different implementations do not interfere.

The iterations for the lazy monad differ from the above articulation in that it also makes a call: m2() to evaluate the constructed expression.

Regular Algorithms

The monad implementations are compared to regular algorithms that are functionally equivalent, but do not use monads. We will use one variant with references, and one variant with pointers and heap allocation.

The algorithm used is the same as above. The call in the main loop is terrible:

Procedure

Time performance was measured using a release build, with default compiler switches for release builds, and the runs were executed outside of Visual Studio. That is, the program was started from a console window, not from Visual Studio.

Results

And now the results:

So, what do we see?

The monad implementation based on references and move semantics (yellow) is the fastest. But the pointer based implementation has about the same performance.

The regular implementations are only slightly faster (red, orange).

The lazy monad (blue) is very slow, as expected.

Conclusions

The choice is clear. In following episodes, we will work with the monad built on references and move semantics. If required, we might use the pointer based variant as well. I will not use the lazy monad, unless it is clear that very, very special circumstances make it a favorable choice.

I was surprised by the small difference (if at all) between the fast monad implementation and the regular algorithms; the monad covers about twice the source code as the regular function – it also includes the bind function. So (hypothesis), the cost of using the references and move semantics monad is negligible.

Next

Next time we will start looking at various types of monads and how we could build compositions of them.

Like this:

Harder to C++: Monads for Mortals [5], Pointers

In part 2 we have seen a small and elegant implementation of the monad. In part 4 we have seen how references and move semantics can be used to make the monad’s performance independent of the size of the wrapped value. In this part we will see another implementation the monad based on a pointer.

Implementation of a Monad with a Smart Pointer

We have seen in part 2 that the type constructor is represented by a template in C++. In part 2 this is a simple struct, in part 3 it is a std::function holding a lambda expression – so as to implement delayed evaluation, and in this part, it is a smart pointer, the std::unique_ptr to be precise.

I think the most important result here is what we don’t see: calls to the move constructor and the accompanying call to the destructor of the ValueObject.

Pro’s, Con’s, and Questions

The above image of the output is nicely constrained. Functions return the unique_ptr by value, i.e. it gets copied which is ok, and the bind function takes a reference to a unique_ptr to elicit the raw pointer from. We see the calls to ValueObject destructor occur after assignment to m2, and when leaving the anonymous scope.

So, pro’s for this implementation are its simplicity, clarity, and efficiency: the overhead of a unique_ptr is comparable to the overhead of a raw pointer. Values are not copied, so performance is size independent.

You may wonder, though, whether it is a good idea to create all the monad’s values on the heap, which is not so efficient. In part 6 we take a look at what we have so far and see what works best.