Introduction

After attending a free introductory course on functional programming at Coursera I got fascinated by the power of lazy (non-strict) collections. I was taught the basic functioning of the Stream class in Scala (basically a simply linked list with non-strict evaluation). This was a radical change in my view on how collections could be structured and how their elements could be evaluated.

Coming from a mostly imperative world (as a C++ hobbyist), functional programming was an exotic mythical beast I was yearning to tame into a language I felt much more comfortable with.

Browsing in the Internet, my first inspiration came from Example 2 in this Louis Botterill’s techy blog insightful post. In this example, Louis depicts a home-made but very inspiring imitation of the Scala Stream class. I took his design and tried to poured it into an equivalent C++03 class, with no success at all after several days. It was not until I moved to C++11 when I began to see the light at the end of the tunnel… But it is not yet time to show the innards of the resulting implementation.

The fascination of infinite streams

Let´s have a look at this simple instruction:

lazy_stream<int> stream_a = lazy_stream<int>::from(3);

To put it simply, stream_a is the set of all integers bigger than 2, i.e.: {3, 4, 5,…}. It represents an infinite set on which we can apply some intriguing operations we may not be used to. The first one is filtering:

We have just created another infinite set called stream_b. In this case it is the subset of all multiples of 4 in stream_a. The filtering is achieved by means of an anonymous (lambda) function, in the above example the pattern: [](...){...}. The operation represents the set {4, 8, 12,…}.

Ok. Nice, but how can I hold an infinite set in memory? In fact you can’t, but you do not need to. At the point stream_a is created, the only known element is the head of the stream, i.e.: the number 3. The rest (called the tail) is only an embryonic, not yet born stream that can be created by invoking a function we might call tail generator. To invoke the tail generator we use the tail method. Thus:

stream_a.tail();

creates a brand new lazy stream whose head is the number 4 and whose tail is another tail generator (just a function).

But not all the operations are equally lazy. from is pure lazy, but filter is not so. In fact it internally calls tail until the first filter-compliant occurrence is found (number 4 in our example). It is easy to see that if no such occurrence is found we will get into trouble (remember: the stream is infinite).

We have created a new lazy stream stream_c by mapping every value element of stream_b into the pair (value, value+1). stream_c is then: {(4, 5), (8, 9), (12, 13),…}.

Once an infinite stream is created, we may be soon interested in exploring its elements. To take the (say) first 10 elements of stream_c, we write:

lazy_stream<pair_type> stream_d = stream_c.take(10);

Observe that stream_d is a finite stream. In fact, we can take its size:

std::cout << stream_d.size() << std::endl; // Prints 10.

As you may imagine, size is a strict (non-lazy) operation. You have to evaluate all the elements of a lazy stream in order to know how many they are.

If we are only interested in getting a single element, we can use the get method. It just retrieves the element of a specified index n {n=0, 1, 2,…}. Unfortunately, it must evaluate the prior n-1 elements as well.

std::cout << stream_b.get(2) << std::endl; // Prints 12.

A worn-out but very illustrative example of a lazy stream is the extraction of the first n prime number by means of the so called Sieve of Eratosthenes.

The not so fancy finite lazy streams

Not so gorgeous, but they do not look that bad after all. To compose a finite lazy stream, we just prepend elements to an already existing finite stream. The simplest finite lazy stream is the empty one, also called nil.

lazy_stream<int> stream_e = 1 & (2 & lazy_stream<int>::nil);

The use of parentheses is a pity, but all the right associative operators in C++ can be overloaded only as members of a class, so I have been compelled to use a left associative operator &. The equivalent Scala operator #:: needs no parentheses, for instance.

Some interesting notes:

The size of stream_e is 2.

lazy_stream<int>::nil is equivalent to lazy_stream<int>().

We can always prepend an element to an already lazy stream, no matter whether this is finite or infinite.

Save the generation function from, the rest of operations (filter, map, take, etc.) are equally available for finite streams. For integral types we can create, for instance, finite ranges:

There are other operations it is not worth mentioning. I just would like to highlight one very important reducing function in functional languages: fold_left. This operation first takes a starting value of type U (let’s call it start). Then, with each one of the elements in the stream it operates a given function op which takes the start value plus the element value and stores the result in start. It then proceeds with the next element an so on. The final result is a single value of type U.

Let’s add the double 0.5 plus all the elements in stream_f:

// Observe that stream_f is a lazy stream of type int.
// The result is of type double instead.
// (Types of start and result must be the same).
double addition = stream_f.fold_left<double>(0.5)(
[](double accum, int element){
return accum + element;
});

Purpose of the lazy_stream<> class

With the writing of lazy_stream<> I did not intend to make a full-fledged, optimized and efficient piece of code. I just wanted to realized how the new C++11 specification could significantly move the scope of the language to a more functional (less imperative) view.

In fact, one of features of lazy_stream<> class is that all its objects are immutable, i.e.: once created, they cannot be modified. This is a common trait of functional programming objects, generally disregarded by imperative languages, because immutability sometimes leads to a lesser memory efficiency; many times it is much cheaper to modify an object than copying it.

On the other hand, I think this design would have been very tricky if not impossible with previous specifications of C++. I will discuss this idea later.

Implementation details

Both finite and infinite streams keep three smart pointers to dynamically allocated information in the heap, namely:

A std::shared_ptr smart pointer, called head_ptr_, to the head (a value of generic type T).

A std::shared_ptr smart pointer, called tail_ptr_, to the stream tail (another lazy_stream<T>).

Pointers [2] and [3] are mutually exclusive in order to keep the design consistent. Thus, one of them is directly set to nullptr since object creation.

The notation std::function<lazy_stream<T> ()>, aliased in the code as tail_gen_type, is quite weird to a C++03 developer eyes. The above type can wrap any function with no input parameters and returning a lazy stream. That includes:

Plain standalone functions

Member functions (static or non-static)

Lambda (anonymous) functions, with or without closures.

The use of tail_gen_type is the keystone of the whole design. The use of lambdas (with or without closures) representing tail generators is widespread along the code. All those functions can be wrapped (or represented) by the same type: tail_gen_type.

As an illustrative example, let’s have a look at take method implementation. Remember that this method takes an input stream and returns a new stream with the first n elements of the former.

This method, as well as the rest of methods in lazy_stream<>, is const.

If this stream is empty or if we ask for less than one element, we just return an empty stream [1]. Otherwise we evaluate the tail stream, invoking tail() [2]. Eventually we create and return a brand new lazy stream [3] whose head is this’s head and whose tail generator is given by the lambda:

[tail_, n]() -> lazy_stream<T> {
return tail_.take(n-1);
});

Observe that once the new lazy stream is created and returned, no further actions are taken. At this point we only know the head of this second stream (*head_ptr_). If we now invoke tail() on this newborn stream, a third lazy stream will be created by evaluating:

tail_.take(n-1);

in the lambda, where tail_ is the tail of the first stream.

We have been lazy, but in fact we could have been even more. Consider this alternative (but not implemented) code:

The above implementation is indeed a little more efficient than the original one: we do not pass the stream head into the lambda. Besides, it is a little lazier: the first invocation of tail() is inside the lambda. I have disregarded it only for a subjective reason: it loses clarity with respect to the original one and it is a little more verbose. Feel free to choose what you like more.

Because the lazy streams herein developed are immutable, they cannot be assigned (assignment operation is disabled). Copy-construction, on the other hand, is available at a reasonable cost. As we already know, a lazy stream object stores its information in the heap, keeping only two active smart pointers pointing to it. When copying, we are actually sharing resources ownership.

Implementation pitfalls in C++03

My first attempt after the Scala course was to write a lazy stream class in C++03. After racking my brains for a long while, I gave up. In C++03 there are no lambdas, so in order to approach them I tried to use function classes (functors). For instance, let’s turn the following lambda into a functor:

The essential difference lies in the fact that every lambda function with symbolic: type ()=>lazy_stream<T> (in Scala notation), no matter how it may be, can be represented by a std::function<lazy_stream<T> ()>. As far I know an analogous representation for functors is not possible in C++03. Moreover, std::function does not even exist in C++ specifications prior to 2011. Thus, when using C++03 I was not able to give all the functors I wrote a common representation. What unique type in C++03 can hold a functor with any closure parameters representing a function of type: ()=>lazy_stream<T>? I think none. So the key problem was I could not figure out how to write a tail generator in C++03.

Source code

The lazy_stream<> implementation code has been attached, along with a main sample code. Only a subset of the corresponding Scala methods (the most insightful, from my viewpoint) has been translated. The class allows extension and further optimization.

The code has been written with Code::Blocks and compiled with MinGw (mingw-get-inst-20120426).

History

December 2012. Initial version.

May 2014. Version 1.1. Main changes:

Constructors have been templatized in order to support rvalues arguments. (Thanks to Lukasz Gwizdz, another codeproject member, for his insight).

Share

About the Author

I work as a senior industrial engineer for Public Administration in Spain. I have experience in developing software for ballistic computations. I like maths and programming and, above all, riding my mountain bike. I run my own website on computation and numerical calculus at rumigaculum.com. Contact me here.

Thanks for your interest, Mike. I remember having read your article in the past and I found it very interesting from a theoretical point of view.

Only as a caution for newbies eager for exploiting this approach, we should warn them about memory leaks. Regarding this, let me transcribe here one of your article paragraphs (the bold highlighting is mine):

"The above consumer illustrates one drawback to programming in this style: the need for garbage collection. When the consumer replaces s with the stream object’s reference returned by s->tail(), there is often no longer any reference to the object that s had pointed to, but the operating system is unable to reclaim the object’s memory. We did not attempt to provide garbage collection because we are only interested in the concept of programming in this style.".