Abstract

Along with higher-order functions, one of the hallmarks of functional
programming is lazy evaluation. A primary manifestation of lazy
evaluation is lazy lists, generally called streams by Scheme
programmers, where evaluation of a list element is delayed until its
value is needed.

The literature on lazy evaluation distinguishes two styles of
laziness, called even and odd. Odd style streams are ubiquitous among
Scheme programs and can be easily encoded with the Scheme primitives
delay and force defined in R5RS. However, the even style delays
evaluation in a manner closer to that of traditional lazy languages
such as Haskell and avoids an "off by one" error that is symptomatic
of the odd style.

This SRFI defines the stream data type in the even style, some
essential procedures and syntax that operate on streams, and motivates
our choice of the even style. A companion SRFI 41 Stream Library
provides additional procedures and syntax which make for more
convenient processing of streams and shows several examples of their
use.

Rationale

Two of the defining characteristics of functional programming
languages are higher-order functions, which provide a powerful tool to
allow programmers to abstract data representations away from an
underlying concrete implementation, and lazy evaluation, which allows
programmers to modularize a program and recombine the pieces in useful
ways. Scheme provides higher-order functions through its lambda
keyword and lazy evaluation through its delay keyword. A primary
manifestation of lazy evaluation is lazy lists, generally called
streams by Scheme programmers, where evaluation of a list element is
delayed until its value is needed. Streams can be used, among other
things, to compute with infinities, conveniently process simulations,
program with coroutines, and reduce the number of passes over data.
This library defines a minimal set of functions and syntax for
programming with streams.

Scheme has a long tradition of computing with streams. The great
computer science textbook Structure and Interpretation of Computer
Programs uses streams extensively. The example given in R5RS makes
use of streams to integrate systems of differential equations using
the method of Runge-Kutta. MIT Scheme, the original implementation
of Scheme, provides streams natively. Scheme and the Art of
Programming discusses streams.
Some Scheme-like languages also have traditions of using streams:
Winston and Horn, in their classic Lisp textbook, discuss streams, and
so does Larry Paulson in his text on
ML. Streams are an important and useful data structure.

Basically, a stream is much like a list, and can either be null or can
consist of an object (the stream element) followed by another stream;
the difference from a list is that elements aren't evaluated until they
are accessed. All the streams mentioned above use the same underlying
representation, with the null stream represented by '()
and stream pairs constructed by (cons car (delay cdr)),
which must be implemented as syntax. These streams are known as
head-strict, because the head of the stream is always computed,
whether or not it is needed.

Streams are the central data type -- just as arrays are for most
imperative languages and lists are for Lisp and Scheme -- for the
"pure" functional languages Miranda and Haskell. But those streams
are subtly different from the traditional Scheme streams of SICP et
al. The difference is at the head of the stream, where Miranda and
Haskell provide streams that are fully lazy, with even the head of the
stream not computed until it is needed. We'll see in a moment the
operational difference between the two types of streams.

Philip Wadler, Walid Taha, and David MacQueen, in their paper "How to add laziness to a strict language
without even being odd", describe how they added streams to the
SML/NJ compiler. They discuss two kinds of streams: odd streams, as
in SICP et al., and even streams, as in Haskell; the names odd and even
refer to the parity of the number of constructors (delay,
cons, nil) used to represent the stream.
Here are the first two figures from their paper, rewritten in Scheme:
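
A sketch of those two figures in Scheme, reconstructed from the discussion that follows rather than copied from the paper. The numeric suffixes (nil1, cons2, and so on) match the names used later in this section; div12 stands in for the 12div function mentioned below, renamed here because an identifier that starts with a digit is not portable R5RS.

```scheme
;;; FIGURE 1 -- ODD: a stream is a pair whose cdr is a promise.
(define nil1 '())
(define (nil1? strm) (null? strm))
(define-syntax cons1
  (syntax-rules ()
    ((cons1 obj strm) (cons obj (delay strm)))))
(define (car1 strm) (car strm))
(define (cdr1 strm) (force (cdr strm)))

(define (map1 func strm)
  (if (nil1? strm)
      nil1
      (cons1 (func (car1 strm)) (map1 func (cdr1 strm)))))

(define (countdown1 n)
  (cons1 n (countdown1 (- n 1))))

(define (cutoff1 n strm)
  (cond ((zero? n) '())
        ((nil1? strm) '())
        (else (cons (car1 strm) (cutoff1 (- n 1) (cdr1 strm))))))

;;; FIGURE 2 -- EVEN: a stream is a promise that yields a pair or '().
(define nil2 (delay '()))
(define (nil2? strm) (null? (force strm)))
(define-syntax cons2
  (syntax-rules ()
    ((cons2 obj strm) (delay (cons obj strm)))))
(define (car2 strm) (car (force strm)))
(define (cdr2 strm) (cdr (force strm)))

(define (map2 func strm)
  (delay (force
    (if (nil2? strm)
        nil2
        (cons2 (func (car2 strm)) (map2 func (cdr2 strm)))))))

(define (countdown2 n)
  (delay (force (cons2 n (countdown2 (- n 1))))))

(define (cutoff2 n strm)
  (cond ((zero? n) '())
        ((nil2? strm) '())
        (else (cons (car2 strm) (cutoff2 (- n 1) (cdr2 strm))))))

;; Called 12div in the text; divides 12 by its argument.
(define (div12 n) (/ 12 n))

;; The even version computes only the four elements requested.
(define even-result (cutoff2 4 (map2 div12 (countdown2 4))))  ; (3 4 6 12)

;; The odd version, (cutoff1 4 (map1 div12 (countdown1 4))), needs the same
;; four elements but forces a fifth, and so attempts (/ 12 0).
```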

The problem with odd streams is that they do too much work, having an "off-by-one"
error that causes them to evaluate the next element of a stream before it is needed.
Mostly that's just a minor leak of space and time, but if evaluating the next element
causes an error, such as dividing by zero, it's a silly, unnecessary bug.

It is instructive to look at the coding differences between odd
and even streams. We expect the two constructors nil and
cons to be different, and they are; the odd
nil and cons return a strict list, but the
even nil and cons return promises.
Nil?, car and cdr change to
accommodate the underlying representation differences.
Cutoff is identical in the two versions, because it
doesn't return a stream.

The subtle but critical difference is in map and
countdown, the two functions that return streams. They
are identical except for the (delay (force ...)) that
wraps the return value in the even version. That looks odd, but is
correct. It is tempting to just eliminate the (delay (force
...)), but that doesn't work, because, given a promise
x, even though (delay (force x)) and
x both evaluate to x when forced, their semantics are
different, with x being evaluated and cached in one case but
not the other. That evaluation is, of course, the same "off-by-one"
error that caused the problem with odd streams. Note that
(force (delay x)) is something different entirely,
even though it looks much the same.
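
The distinction can be observed with a counter. In this sketch, make-x is a hypothetical function that records each call and returns a promise; it stands for any expression that yields a promise.

```scheme
(define evaluated 0)

;; Hypothetical promise-returning expression; counts how often it runs.
(define (make-x)
  (set! evaluated (+ evaluated 1))
  (delay 42))

;; Using the expression directly evaluates it at once.
(define direct (make-x))                   ; evaluated is now 1

;; Wrapping it in (delay (force ...)) postpones even that evaluation.
(define wrapped (delay (force (make-x))))  ; evaluated is still 1
(define after-wrapping evaluated)

(define first-value (force wrapped))       ; runs make-x: evaluated is now 2
(define second-value (force wrapped))      ; cached: evaluated stays 2
(define after-forcing evaluated)

;; (force (delay ...)) delays nothing: the expression runs immediately.
(define eager (force (delay (make-x))))    ; evaluated is now 3
```

Both direct and wrapped yield 42 when forced; they differ only in when make-x runs, which is exactly the extra step of work that separates odd streams from even ones.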

Unfortunately, that (delay (force ...)) is a major
notational inconvenience, because it means that the representation of
streams can't be hidden inside a few primitives but must infect each
function that returns a stream, making streams harder to use, harder
to explain, and more prone to error. Wadler et al. solve the
notational inconvenience in their SML/NJ implementation by adding
special syntax -- the keyword lazy -- within the
compiler. Since Scheme allows syntax to be added via a macro, it
doesn't require any compiler modifications to provide streams. Shown
below is a Scheme implementation of Figure 1 to 3 from the paper, with
the (delay (force ...)) hidden within
stream-define, which is the syntax used to create a
function that returns a stream:
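
A sketch of the third variant, again reconstructed from the surrounding description rather than copied from the paper. The primitives are those of the even style with a 3 suffix; div12 stands in for the 12div function of the text, renamed because an identifier that starts with a digit is not portable R5RS.

```scheme
;;; FIGURE 3 -- even streams with (delay (force ...)) hidden in syntax.
(define nil3 (delay '()))
(define (nil3? strm) (null? (force strm)))
(define-syntax cons3
  (syntax-rules ()
    ((cons3 obj strm) (delay (cons obj strm)))))
(define (car3 strm) (car (force strm)))
(define (cdr3 strm) (cdr (force strm)))

;; stream-define wraps the whole body in (delay (force ...)), so the
;; functions defined with it never mention promises themselves.
(define-syntax stream-define
  (syntax-rules ()
    ((stream-define (name args ...) body0 body1 ...)
     (define (name args ...)
       (delay (force (begin body0 body1 ...)))))))

(stream-define (map3 func strm)
  (if (nil3? strm)
      nil3
      (cons3 (func (car3 strm)) (map3 func (cdr3 strm)))))

(stream-define (countdown3 n)
  (cons3 n (countdown3 (- n 1))))

;; cutoff3 returns an ordinary list, so it is a plain define.
(define (cutoff3 n strm)
  (cond ((zero? n) '())
        ((nil3? strm) '())
        (else (cons (car3 strm) (cutoff3 (- n 1) (cdr3 strm))))))

(define (div12 n) (/ 12 n))

(define easy-result (cutoff3 4 (map3 div12 (countdown3 4))))  ; (3 4 6 12)
```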

It is now easy to see the notational inconvenience of Figure 2, as
the bodies of map1 and map3 are identical,
as are countdown1 and countdown3. All of
the inconvenience is hidden in the stream primitives, where it
belongs, so functions that use the primitives won't be burdened. This
means that users can just step up and use the library without any
knowledge of how the primitives are implemented, and indeed the
implementation of the primitives can change without affecting users of
the primitives, which would not have been possible with the streams of
Figure 2. With this implementation of streams, (cutoff3 4 (map3 12div
(countdown3 4))) evaluates to (3 4 6 12), as it should.

This library provides streams that are even, not odd. This decision overturns years
of experience in the Scheme world, but follows the traditions of the "pure" functional
languages such as Miranda and Haskell. The primary benefit is elimination of the
"off-by-one" error that odd streams suffer. Of course, it is possible to use even
streams to represent odd streams, as Wadler et al. show in their Figure 4, so nothing
is lost by choosing even streams as the default.

Obviously, stream elements are evaluated when they are accessed, not when they are
created; that's the definition of lazy. Additionally, stream elements must be
evaluated only once, and the result cached in the event it is needed again; that's
common practice in all languages that support streams. Following the rule of R5RS
section 1.1 fourth paragraph, an implementation of streams is permitted to delete a
stream element from the cache and reclaim the storage it occupies if it can prove
that the stream element cannot possibly matter to any future computation.

The fact that objects are permitted, but not required, to be reclaimed has a
significant impact on streams. Consider for instance the following example, due to
Joe Marshall. Stream-filter is a function that takes a predicate and a stream and
returns a new stream containing only those elements of the original stream that pass
the predicate; it can be simply defined as follows:
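
A sketch of that definition, together with the times3 function discussed next. It assumes minimal even-stream primitives defined locally as plain promises (stand-ins for this library's primitives, not the reference implementation), assumes from0 is the stream of integers counting up from zero, uses a hypothetical helper stream-from to build it, and takes three stream-cdrs so that (times3 5) yields 15 as stated in the text.

```scheme
;; Minimal even-stream primitives (an assumption for this sketch).
(define stream-null (delay '()))
(define (stream-null? strm) (null? (force strm)))
(define-syntax stream-cons
  (syntax-rules ()
    ((stream-cons obj strm) (delay (cons obj strm)))))
(define (stream-car strm) (car (force strm)))
(define (stream-cdr strm) (cdr (force strm)))
(define-syntax stream-define
  (syntax-rules ()
    ((stream-define (name args ...) body0 body1 ...)
     (define (name args ...)
       (delay (force (begin body0 body1 ...)))))))

;; Hypothetical helper: the infinite stream n, n+1, n+2, ...
(stream-define (stream-from n)
  (stream-cons n (stream-from (+ n 1))))
(define from0 (stream-from 0))

;; Naive filter: skip elements until one passes the predicate.
(stream-define (stream-filter pred? strm)
  (cond ((stream-null? strm) strm)
        ((pred? (stream-car strm))
         (stream-cons (stream-car strm)
                      (stream-filter pred? (stream-cdr strm))))
        (else (stream-filter pred? (stream-cdr strm)))))

;; Fourth multiple of n in from0, counting zero as the first.
(define (times3 n)
  (stream-car
    (stream-cdr
      (stream-cdr
        (stream-cdr
          (stream-filter
            (lambda (x) (zero? (modulo x n)))
            from0))))))

(define fifteen (times3 5))  ; 15
```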

Called as (times3 5), the function evaluates to 15, as
desired. But called as (times3 1000000), it churns the
disk, creating closures and caching each result as it counts slowly to
3,000,000; on most Scheme systems, this function will run out of
memory long before it computes an answer. A space leak occurs when
there is a gap between elements that pass the predicate, because the
naive definition hangs on to the head of the gap. Unfortunately, this
space leak can be very hard to fix, depending on the underlying Scheme
implementation, and solutions that work in one Scheme implementation
may not work in another. And, since R5RS itself doesn't specify any
safe-for-space requirements, this SRFI can't make any specific
requirements either. Thus, this SRFI encourages native
implementations of the streams described in this SRFI to "do the right
thing" with respect to space consumption, and implement streams that
are as safe-for-space as the rest of the implementation. Of course,
if the stream is bound in a scope outside the stream-filter
expression, there is nothing to be done except cache the elements as
they are filtered.

Although stream-define has been discussed as the basic stream
abstraction, in fact it is the (delay (force ...))
mechanism that is the basis for everything else. In the spirit of
Scheme minimality, the specification below gives stream-delay as the
syntax for converting an expression to a stream; stream-delay is
similar to delay, but returns a stream instead of a promise. Given
stream-delay, it is easy to create stream-lambda, which returns a
stream-valued function, and then stream-define, which binds a
stream-valued function to a name. However, stream-lambda and
stream-define are both library syntax, not fundamental to the use
of streams, and are thus excluded from this SRFI.
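
The layering can be sketched as three small macros. This is an illustration under the assumption that a stream is a plain R5RS promise; an implementation that makes streams a distinct type would wrap the promise, and the stream-cons, stream-car, stream-cdr, and stream-from definitions here are local stand-ins used only to exercise the macros.

```scheme
;; stream-delay: the (delay (force ...)) mechanism, given a name.
(define-syntax stream-delay
  (syntax-rules ()
    ((stream-delay expr) (delay (force expr)))))

;; stream-lambda: a lambda whose body is wrapped in stream-delay.
(define-syntax stream-lambda
  (syntax-rules ()
    ((stream-lambda formals body0 body1 ...)
     (lambda formals (stream-delay (begin body0 body1 ...))))))

;; stream-define: binds a stream-lambda to a name.
(define-syntax stream-define
  (syntax-rules ()
    ((stream-define (name . formals) body0 body1 ...)
     (define name (stream-lambda formals body0 body1 ...)))))

;; Stand-in primitives for the demonstration.
(define-syntax stream-cons
  (syntax-rules ()
    ((stream-cons obj strm) (delay (cons obj strm)))))
(define (stream-car strm) (car (force strm)))
(define (stream-cdr strm) (cdr (force strm)))

;; Hypothetical example: the infinite stream n, n+1, n+2, ...
(stream-define (stream-from n)
  (stream-cons n (stream-from (+ n 1))))

(define third (stream-car (stream-cdr (stream-cdr (stream-from 10)))))  ; 12
```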

Specification

A stream-pair is a data structure consisting of two fields called
the stream-car and stream-cdr. Stream-pairs are created
by the procedure stream-cons, and the stream-car and
stream-cdr fields are accessed by the procedures
stream-car and stream-cdr. There also
exists a special stream object called stream-null, which
is a single stream object with no elements, distinguishable from all
other stream objects and, indeed, from all other objects of any type.
The stream-cdr of a stream-pair must be either another stream-pair or
stream-null.

Stream-null and stream-pair are used to represent streams. A stream
can be defined recursively as either stream-null or a stream-pair
whose stream-cdr is a stream. The objects in the stream-car fields of
successive stream-pairs of a stream are the elements of the stream.
For example, a two-element stream is a stream-pair whose stream-car is
the first element and whose stream-cdr is a stream-pair whose
stream-car is the second element and whose stream-cdr is stream-null.
A chain of stream-pairs ending with stream-null is finite and has a
length that is computed as the number of elements in the stream, which
is the same as the number of stream-pairs in the stream. A chain of
stream-pairs not ending with stream-null is infinite and has undefined
length.

The way in which a stream can be infinite is that no element of the stream is
evaluated until it is accessed. Thus, any initial prefix of the stream can be
enumerated in finite time and space, but still the stream remains infinite.
Stream elements are evaluated only once; once evaluated, the value of a stream
element is saved so that the element will not be re-evaluated if it is accessed
a second time. Streams and stream elements are never mutated; all functions
involving streams are purely applicative. Errors are not required to be
signalled, as in R5RS section 1.3.2, although implementations are encouraged
to detect and report errors.
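
The evaluate-on-access and evaluate-once rules can be observed with a counter. This sketch represents a stream as a plain promise, which is an assumption made for brevity and does not give streams the distinct type the specification calls for; noisy is a hypothetical helper that counts how many elements have been evaluated.

```scheme
;; Stand-in primitives: a stream is a promise yielding a pair or '().
(define stream-null (delay '()))
(define (stream-null? strm) (null? (force strm)))
(define-syntax stream-cons
  (syntax-rules ()
    ((stream-cons obj strm) (delay (cons obj strm)))))
(define (stream-car strm) (car (force strm)))
(define (stream-cdr strm) (cdr (force strm)))

;; Hypothetical helper: returns x and records that it was evaluated.
(define evaluations 0)
(define (noisy x)
  (set! evaluations (+ evaluations 1))
  x)

;; A two-element stream: two stream-pairs ending in stream-null.
(define two
  (stream-cons (noisy 'a) (stream-cons (noisy 'b) stream-null)))
(define count-at-creation evaluations)                 ; 0: nothing evaluated yet

(define first-element (stream-car two))                ; a
(define first-again (stream-car two))                  ; a, from the cache
(define count-after-first evaluations)                 ; 1

(define second-element (stream-car (stream-cdr two)))  ; b
(define count-after-second evaluations)                ; 2

(define ends-in-null
  (stream-null? (stream-cdr (stream-cdr two))))        ; #t
```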

stream-null (constant)

Stream-null is the distinguished nil stream, a single
Scheme object distinguishable from all other objects. If the last
stream-pair in a stream contains stream-null in its cdr field, the
stream is finite and has a computable length. However, there is no
need for streams to terminate.

stream-null => (stream)

(stream-cons object stream) (syntax)

Stream-cons is the primitive constructor of streams,
returning a stream with the given object in its car field and the
given stream in its cdr field. The stream returned by
stream-cons must be different (in the sense of
eqv?) from every other Scheme object. The object may be
of any type, and there is no requirement that successive elements of a
stream be of the same type, although it is common for them to be. It
is an error if the second argument of stream-cons is not a stream.

(stream-delay expression) (syntax)

Stream-delay is the essential mechanism for operating on streams, taking an
expression and returning a delayed form of the expression that can be asked at
some future point to evaluate the expression and return the resulting value. The
action of stream-delay is analogous to the action of delay, but it is specific to
the stream data type, returning a stream instead of a promise; no corresponding
stream-force is required, because each of the stream functions performs the force
implicitly.

(stream-map function stream ...) (library function)

Stream-map creates a newly allocated stream built by
applying function elementwise to the elements of the streams. The
function must take as many arguments as there are streams and return a
single value (not multiple values). The stream returned by stream-map
is finite if the given stream is finite, and infinite if the given
stream is infinite. If more than one stream is given, stream-map
terminates when any of them terminates, and is infinite only if all the
streams are infinite. The stream elements are evaluated in order.
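
A minimal sketch of this behaviour over several streams, assuming streams are plain promises. It is not the reference implementation: any-null?, stream-from, and stream->list are hypothetical helpers introduced for the example, with any-null? playing the role that any from SRFI 1 would otherwise fill.

```scheme
;; Stand-in primitives: a stream is a promise yielding a pair or '().
(define stream-null (delay '()))
(define (stream-null? strm) (null? (force strm)))
(define-syntax stream-cons
  (syntax-rules ()
    ((stream-cons obj strm) (delay (cons obj strm)))))
(define (stream-car strm) (car (force strm)))
(define (stream-cdr strm) (cdr (force strm)))

;; Hypothetical helper: #t if any stream in the list is exhausted.
(define (any-null? strms)
  (cond ((null? strms) #f)
        ((stream-null? (car strms)) #t)
        (else (any-null? (cdr strms)))))

;; Apply func across the heads, then recur lazily on the tails.
(define (stream-map func . strms)
  (delay (force
    (if (any-null? strms)
        stream-null
        (stream-cons (apply func (map stream-car strms))
                     (apply stream-map func (map stream-cdr strms)))))))

;; Hypothetical helpers for the demonstration.
(define (stream-from n)
  (delay (force (stream-cons n (stream-from (+ n 1))))))
(define (stream->list strm)          ; finite streams only
  (if (stream-null? strm)
      '()
      (cons (stream-car strm) (stream->list (stream-cdr strm)))))

(define from0 (stream-from 0))
(define short (stream-cons 10 (stream-cons 20 (stream-cons 30 stream-null))))

;; The result ends when the shorter stream ends.
(define sums (stream->list (stream-map + from0 short)))  ; (10 21 32)
```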

(stream-for-each procedure stream ...) (library function)

Stream-for-each applies procedure elementwise to the elements of the streams,
calling the procedure for its side effects rather than for its values. The
procedure must take as many arguments as there are streams. The value returned by
stream-for-each is unspecified. The stream elements are visited in order.

(stream-for-each display from0) => no value, prints 01234 ...

(stream-filter predicate? stream) (library function)

Stream-filter applies predicate? to each element
of stream and creates a newly allocated stream consisting of those
elements of the given stream for which predicate? returns a
non-#f value. Elements of the output stream are in the
same order as they were in the input stream, and are tested by
predicate? in order.

Implementation

A reference implementation of streams is shown below. It strongly
prefers simplicity and clarity to efficiency, and though a reasonable
attempt is made to be safe-for-space, no promises are made. The reference
implementation relies on the mechanism for defining record types of
SRFI 9, and the functions any and every from SRFI 1. The
stream-error function aborts by calling error as
defined in SRFI 23.

Copyright

Copyright (C) 2003 by Philip L. Bewig of Saint Louis, Missouri, United States of
America. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.