Lists are ubiquitous in Scheme, so it is useful to have available a collection of utility functions that operate on lists.

Take returns a newly-allocated list containing the first n elements of the list xs; if xs has fewer than n elements, it returns all of them. Drop is the opposite of take, returning all elements of a list xs except the first n. Split combines take and drop:
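As a minimal sketch of the three functions just described (not necessarily the Prelude's own definitions; the argument order (n xs) and the use of multiple values in split are assumptions):

```scheme
;; Sketch only: the (n xs) argument order is an assumption.
(define (take n xs)
  (let loop ((n n) (xs xs) (ys '()))
    (if (or (<= n 0) (null? xs))
        (reverse ys)                       ; newly-allocated result
        (loop (- n 1) (cdr xs) (cons (car xs) ys)))))

(define (drop n xs)
  (if (or (<= n 0) (null? xs))
      xs
      (drop (- n 1) (cdr xs))))

;; Assumed here to return two values: the prefix and the remainder.
(define (split n xs)
  (values (take n xs) (drop n xs)))

(take 2 '(a b c d))   ; (a b)
(drop 2 '(a b c d))   ; (c d)
(take 5 '(a b))       ; (a b)
```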

Folds use a user-specified function to reduce a list of values to a single value, and are one of the fundamental idioms of functional programming. Fold-left works left-to-right through the list xs, applying the binary op function to base and the first element of xs, then applying the binary op function to the result of the first op function and the second element of xs, and so on, at each step applying the binary op function to the result of the previous op function and the current element of xs; fold-right works the same way, but right-to-left.
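The two folds might be sketched as below. This is an illustration under the assumption that op receives the accumulator first in fold-left and last in fold-right; subtraction makes the difference in direction visible.

```scheme
;; fold-left: (op (op (op base x1) x2) x3) ...
(define (fold-left op base xs)
  (if (null? xs)
      base
      (fold-left op (op base (car xs)) (cdr xs))))

;; fold-right: (op x1 (op x2 (op x3 base))) ...
(define (fold-right op base xs)
  (if (null? xs)
      base
      (op (car xs) (fold-right op base (cdr xs)))))

(fold-left - 0 '(1 2 3))         ; ((0 - 1) - 2) - 3 = -6
(fold-right - 0 '(1 2 3))        ; 1 - (2 - (3 - 0)) = 2
(fold-right cons '() '(1 2 3))   ; (1 2 3)
```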

The (range [first] past [step]) function takes up to three arguments and returns a list of numbers starting from first and ending before past, incrementing each number by step. If step is omitted, it defaults to 1 if first is less than past, and -1 otherwise; if first is also omitted, it defaults to 0. Arguments may be of any numeric type.
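One way to realize these defaults is sketched below. It is an illustration rather than the Prelude's code, and it assumes a non-zero step (a zero step is not checked).

```scheme
;; Sketch of (range [first] past [step]); assumes step is non-zero.
(define (range . args)
  (let* ((n (length args))
         (first (if (= n 1) 0 (car args)))
         (past (if (= n 1) (car args) (cadr args)))
         (step (if (= n 3) (caddr args) (if (< first past) 1 -1))))
    (let loop ((x first) (xs '()))
      (if (if (positive? step) (< x past) (> x past))
          (loop (+ x step) (cons x xs))
          (reverse xs)))))

(range 5)         ; (0 1 2 3 4)
(range 5 0)       ; (5 4 3 2 1)
(range 0 10 3)    ; (0 3 6 9)
(range 1 2 1/4)   ; (1 5/4 3/2 7/4)
```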

Mappend is like map, but assembles its pieces with append rather than cons. Iterate repeatedly evaluates a function f against the base values bs, after each iteration shifting the base values left by removing the first and appending the newly-calculated value at the end, returning the first n results in a list:
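Both functions can be sketched briefly. The signatures (mappend f xs ...) and (iterate n f b ...) are assumptions made for illustration, and iterate expects at least one base value.

```scheme
;; mappend: map, then splice the resulting lists together.
(define (mappend f . xss)
  (apply append (apply map f xss)))

;; iterate: emit the first base value, then shift the base values
;; left and append (f b ...) at the end; repeat n times.
(define (iterate n f . bs)
  (let loop ((n n) (bs bs) (xs '()))
    (if (zero? n)
        (reverse xs)
        (loop (- n 1)
              (append (cdr bs) (list (apply f bs)))
              (cons (car bs) xs)))))

(mappend (lambda (x) (list x x)) '(1 2 3))   ; (1 1 2 2 3 3)
(iterate 5 (lambda (x) (* 2 x)) 1)           ; (1 2 4 8 16)
(iterate 10 + 1 1)                           ; (1 1 2 3 5 8 13 21 34 55)
```

With two base values and +, iterate produces the Fibonacci numbers, which shows why the base values are shifted rather than replaced.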

The (filter pred? xs) function returns a newly-allocated list that contains only those elements x of xs for which (pred? x) is true. The (remove x xs) function removes all occurrences of x from the list xs, using equal? to make comparisons:
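A minimal sketch of the two functions, written for illustration and not taken from the Prelude source:

```scheme
(define (filter pred? xs)
  (let loop ((xs xs) (ys '()))
    (cond ((null? xs) (reverse ys))
          ((pred? (car xs)) (loop (cdr xs) (cons (car xs) ys)))
          (else (loop (cdr xs) ys)))))

;; remove is a filter that keeps everything not equal? to x
(define (remove x xs)
  (filter (lambda (y) (not (equal? x y))) xs))

(filter odd? '(1 2 3 4 5))   ; (1 3 5)
(remove 3 '(1 3 2 3))        ; (1 2)
```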

Flatten takes a tree represented as a list with sub-lists and returns a list containing only the fringe elements of the tree. The two trees (a (b c)) and ((a b) c) both cause flatten to return (a b c):
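One straightforward way to write flatten is sketched here; it is an illustration and favors clarity over efficiency (append copies its first argument at every level).

```scheme
(define (flatten xs)
  (cond ((null? xs) '())
        ((pair? xs) (append (flatten (car xs)) (flatten (cdr xs))))
        (else (list xs))))    ; a fringe element

(flatten '(a (b c)))   ; (a b c)
(flatten '((a b) c))   ; (a b c)
```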

All? and any? apply a predicate to each member of a list. All? returns #t if (pred? x) is non-#f for every x in the input list, or #f otherwise. Any? returns #f if (pred? x) is #f for every x in the input list, or #t otherwise. Both functions stop applying the predicate to elements of the input list as soon as possible.
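The early exit can be seen in a short sketch, which assumes the argument order (pred? xs):

```scheme
;; Stops at the first element that fails the predicate.
(define (all? pred? xs)
  (cond ((null? xs) #t)
        ((pred? (car xs)) (all? pred? (cdr xs)))
        (else #f)))

;; Stops at the first element that satisfies the predicate.
(define (any? pred? xs)
  (cond ((null? xs) #f)
        ((pred? (car xs)) #t)
        (else (any? pred? (cdr xs)))))

(all? odd? '(1 3 5))   ; #t
(all? odd? '(1 2 5))   ; #f
(any? odd? '(2 4 5))   ; #t
(any? odd? '())        ; #f
```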

List comprehensions join the higher-order idioms map, filter and fold into a highly useful form of syntactic sugar for performing looping computations involving lists. The Standard Prelude provides three types of list comprehension:

(list-of expr clause ...) produces a possibly-null list of objects of the type returned by expr

(sum-of expr clause ...) computes the sum of the elements computed by expr

(fold-of op base expr clause ...) is like sum-of, but with a user-defined operator and base in place of the + and 0 of sum-of

Clauses may be of four types:

(var range [first] past [step]) — Bind var to first, first + step, ..., until reaching past, which is not included in the output. If first is not given it defaults to 0. If step is not given, it defaults to 1 if (< first past) and -1 otherwise. First, past and step may be of any numeric type; if any of first, past or step are inexact, the length of the output list may differ from (ceiling (/ (- past first) step)).

(var in list-expr) — Loop over the elements of list-expr, in order from the start of the list, binding each element of the list in turn to var.

(var is expr) — Bind var to the value obtained by evaluating expr.

(pred? expr) — Include in the output list only those elements x for which (pred? x) is non-#f.

The scope of variables bound in the list comprehension is the clauses to the right of the binding clause (but not the binding clause itself) plus the result expression. When two or more generators are present, the loops are processed as if they are nested from left to right; that is, the rightmost generator varies fastest. As a degenerate case, if no generators are present, the result of a list comprehension is a list containing the result expression; thus, (list-of 1) produces a list containing only the element 1.

List comprehensions are expanded by a macro that calls itself recursively, one level of recursion for each clause plus a final level of recursion for the base case. The complete implementation, which is based on the set constructor in Kent Dybvig’s book The Scheme Programming Language, is given below:

The list comprehensions given here extend the usual notion of list comprehensions with the fold-of comprehension. Another extension is nth-of, which expands a list comprehension and returns the nth item:

Pattern matching provides syntactic sugar to destructure lists, select alternatives, and bind variables, and is provided by the list-match macro. The syntax (list-match expr clause ...) takes an input expr that evaluates to a list. Clauses are of the form (pattern [fender] expr), consisting of a pattern that matches a list of a particular shape, an optional fender that must succeed if the pattern is to match, and an expr that is evaluated if the pattern matches. There are four types of patterns:

() — Matches the null list.

(pat0 pat1 ...) — Matches a list with length exactly equal to the number of pattern elements.

(pat0 pat1 ... . patRest) — Matches a list with length at least as great as the number of pattern elements before the literal dot. PatRest is a list containing the remaining elements of the input list after the initial prefix of the list before the literal dot.

pat — Matches an entire list. Should always appear as the last clause; it’s not an error to appear elsewhere, but subsequent clauses could never match.

Each pattern element may be:

An identifier — Matches any list element. Additionally, the value of the list element is bound to the variable named by the identifier, which is in scope in the fender and expr of the corresponding clause. Each identifier in a single pattern must be unique.

A literal underscore — Matches any list element, but creates no bindings.

A constant — Matches if the expression equals the constant value, but creates no bindings.

A quote expression — Matches if the expression equals the quote expression, but creates no bindings.

A quasiquote expression — Matches if the expression equals the quasiquote expression, but creates no bindings.

All comparisons are made with equal?. The patterns are tested in order, left to right, until a matching pattern is found; if fender is present, it must evaluate as non-#f for the match to be successful. Pattern variables are bound in the corresponding fender and expression. Once the matching pattern is found, the corresponding expression is evaluated and returned as the result of the match. An error is signaled if no pattern matches the input list.

Pattern matching is performed by a macro that expands into a cond expression with one clause per pattern; an auxiliary macro handles the various types of pattern elements. The complete implementation, which is based on an idea of Jos Koot, is given below:

R5RS Scheme does not provide any way to create new variable types or to combine multiple variables into a single unit; R6RS provides structures that perform these tasks. The structures given below are upward-compatible to R6RS structures. Calling (define-structure name field ...) creates a new structure. Define-structure expands into a constructor (make-name field ...), a type predicate (name? obj), and an accessor (name-field x) and setter (set-name-field! x value) for an object x of structure type name.

Scheme provides one-dimensional arrays, which it calls vectors, but no two-dimensional arrays. Kent Dybvig provides a matrix data structure defined as a vector of vectors in The Scheme Programming Language:
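The vector-of-vectors idea can be sketched as follows. The function names and the required init argument are assumptions made for illustration and may differ from the Prelude's matrix interface.

```scheme
;; A matrix is a vector of rows; each row is its own vector.
(define (make-matrix rows columns init)
  (do ((m (make-vector rows))
       (i 0 (+ i 1)))
      ((= i rows) m)
    (vector-set! m i (make-vector columns init))))

(define (matrix-rows m) (vector-length m))
(define (matrix-cols m) (vector-length (vector-ref m 0)))
(define (matrix-ref m i j) (vector-ref (vector-ref m i) j))
(define (matrix-set! m i j x) (vector-set! (vector-ref m i) j x))

(define m (make-matrix 2 3 0))
(matrix-set! m 1 2 'x)
(matrix-ref m 1 2)   ; x
(matrix-ref m 0 2)   ; 0, since rows do not share storage
```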

The for macro is convenient for iterating over the rows and columns of a matrix. The syntax (for (var [first] past [step]) body ...) binds var to first, then iterates var by step until it reaches past, which is not bound; the body statements are executed only for their side-effects. Step defaults to 1 if first is less than past and -1 otherwise; if first is also not given, it defaults to 0:
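A for macro with these defaults could be sketched with syntax-rules as below. This is one possible formulation, not necessarily the Prelude's; it evaluates first, past and step once each and assumes step is non-zero.

```scheme
(define-syntax for
  (syntax-rules ()
    ((_ (var first past step) body ...)
     (let ((f first) (p past) (s step))
       ;; count up with a positive step, down with a negative one
       (let ((done? (if (positive? s) >= <=)))
         (do ((var f (+ var s)))
             ((done? var p))
           body ...))))
    ((_ (var first past) body ...)
     (let ((f first) (p past))
       (for (var f p (if (< f p) 1 -1)) body ...)))
    ((_ (var past) body ...)
     (for (var 0 past) body ...))))

(define acc '())
(for (i 3) (set! acc (cons i acc)))
acc   ; (2 1 0), because the body ran for i = 0, 1, 2
```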

Hash tables are one of the greatest inventions of computer science, permitting very fast retrieval of key/value pairs.

Make-hash creates an instance of an abstract data type of hash tables. Make-hash takes four arguments: Hash is a function that takes a key and returns an integer that provides an address for the bucket where the key/value pair is stored. Eql? is a predicate that takes two keys and returns #t if they are the same and #f otherwise. Oops is the default value returned by the hash table if a requested key is not present in the table. Size is the number of buckets where key/value pairs are stored; it is best to choose size as a prime number of magnitude similar to the expected number of key/value pairs. Make-hash returns a function that, when called with an appropriate message, performs the requested action; for a hash table created by

(define state-tab (make-hash string-hash string=? #f 4093))

the appropriate call is

(state-tab 'message args ...)

where message and args can be any of the following:

insert key value — inserts a key/value pair in the hash table, overwriting any previous value associated with the key

lookup key — retrieves the value associated with key

delete key — removes key and its associated value from the hash table, if it exists

update key proc default — proc is a function that takes a key and value as arguments and returns a new value; if key is present in the hash table, update calls proc with the key and its associated value and stores the value returned by proc in place of the original value, otherwise update inserts a new key/value pair in the hash table with key key and value default.

enlist — returns all the key/value pairs in the hash table as a list

Synonyms are provided for some of the messages; see the source code for details.
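The message-passing design can be illustrated with a small chained hash table. This is a sketch, not the Prelude's implementation: synonyms are omitted, error (not part of R5RS, but widely available) is used for unknown messages, and simple-string-hash is a hypothetical stand-in for the string-hash used above.

```scheme
(define (make-hash hash eql? oops size)
  (let ((table (make-vector size '())))      ; each bucket is an alist
    (define (index k) (modulo (hash k) size))
    (define (without k bucket)               ; bucket minus key k
      (cond ((null? bucket) '())
            ((eql? (caar bucket) k) (cdr bucket))
            (else (cons (car bucket) (without k (cdr bucket))))))
    (define (lookup k)
      (let loop ((bucket (vector-ref table (index k))))
        (cond ((null? bucket) oops)
              ((eql? (caar bucket) k) (cdar bucket))
              (else (loop (cdr bucket))))))
    (define (insert k v)
      (let ((i (index k)))
        (vector-set! table i
          (cons (cons k v) (without k (vector-ref table i))))))
    (define (delete k)
      (let ((i (index k)))
        (vector-set! table i (without k (vector-ref table i)))))
    (define (update k proc default)
      (let loop ((bucket (vector-ref table (index k))))
        (cond ((null? bucket) (insert k default))
              ((eql? (caar bucket) k) (insert k (proc k (cdar bucket))))
              (else (loop (cdr bucket))))))
    (define (enlist)
      (let loop ((i 0) (xs '()))
        (if (= i size)
            xs
            (loop (+ i 1) (append (vector-ref table i) xs)))))
    (lambda (message . args)
      (case message
        ((insert) (apply insert args))
        ((lookup) (apply lookup args))
        ((delete) (apply delete args))
        ((update) (apply update args))
        ((enlist) (enlist))
        (else (error "make-hash: unknown message" message))))))

;; Hypothetical hash function, for illustration only.
(define (simple-string-hash str)
  (let loop ((cs (string->list str)) (h 0))
    (if (null? cs)
        h
        (loop (cdr cs)
              (modulo (+ (* h 31) (char->integer (car cs))) 1000003)))))

(define state-tab (make-hash simple-string-hash string=? #f 4093))
(state-tab 'insert "MO" "Missouri")
(state-tab 'lookup "MO")   ; "Missouri"
(state-tab 'lookup "XX")   ; #f, the oops value
```

Because the table is captured in a closure, callers can reach it only through the messages, which is what makes the data type abstract.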

Make-dict provides the abstract data type of an ordered map, sometimes called a dictionary. Unlike hash tables that only take an equality predicate, the dictionary takes a less-than predicate lt? so that it can iterate over its key/value pairs in order. The dictionary also provides order statistics, so you can select the nth key/value pair in order or find the ordinal rank of a given key. The implementation uses AVL trees, so any access to a particular key/value pair takes time O(log n). A dictionary is created by:

A dictionary is represented by a function, and once it has been created, say by (define dict (make-dict string<?)), operators are applied in message-passing style, such as (dict 'message args ...) where message and args can be any of the following:

empty? — returns #t if the dictionary is empty, else #f [nil?]

lookup key — returns a (key . value) pair corresponding to key, or #f if key is not present in the dictionary [fetch, get]

insert key value — returns a newly-allocated dictionary that includes all the key/value pairs in the input dictionary plus the input key and value, which replaces any existing key/value pair with a matching key; duplicate keys are not permitted [store, put]

update proc key value — returns a newly-allocated dictionary; if key is already present in the dictionary, the value associated with the key is replaced by (proc k v), where k and v are the existing key and value; otherwise, the input key/value pair is added to the dictionary

delete key — returns a newly-allocated dictionary in which key is not present, whether or not it is already present [remove]

size — returns the number of key/value pairs in the dictionary [count, length]

nth n — returns the nth key/value pair in the dictionary, counting from zero

rank key — returns the ordinal position of key in the dictionary, counting from zero

map proc — returns a newly-allocated dictionary in which each value is replaced by (proc k v), where k and v are the existing key and value

fold proc base — returns the value accumulated by applying the function (proc k v b) to each key/value pair in the dictionary, accumulating the value of the base b at each step; pairs are accessed in order of ascending keys

for-each proc — evaluates for its side-effects only (proc k v) for each key/value pair in the dictionary in ascending order

to-list — returns a newly-allocated list of all the (key . value) pairs in the dictionary, in ascending order [enlist]

from-list xs — returns a newly-allocated dictionary that includes all the key/value pairs in the original dictionary plus all the (key . value) pairs in xs; any key in xs that already exists in the dictionary has its value replaced by the corresponding value in xs

make-gen — returns a function that, each time it is called, returns the next (key . value) pair from the dictionary, in ascending order; when the key/value pairs in the dictionary are exhausted, the function returns #f each time it is called [gen]

Synonyms are provided for some of the operations, as given in square brackets above.

The three input processors for-each-input, map-input and fold-input operate on an input file or port in a manner similar to the way for-each, map and fold-left operate on lists. All three take an optional final argument. If the final argument is missing, input is taken from the current input port; as a side effect, all characters on the port are exhausted. If the final argument is a port, input is taken from the port, which is left open when the function returns; as a side effect, all characters on the port are exhausted. If the final argument is a string, it is taken as the name of a file which is opened, used as the source of input, and closed before the function returns. For all three functions, reader is a function that returns the next item from the input, and proc is a function that operates on each item.

Read-line reads the next line of text from a named port or, if none is given, from the current input port. A line is a maximal sequence of characters terminated by a newline, a carriage return, or both characters in either order; the final line in a file need not be terminated.

Most Scheme systems provide a sort function, and it is required in R6RS. For those that don’t, we provide this sort, which is stolen from Kent Dybvig’s book The Scheme Programming Language; it is stable, applicative, miserly about garbage generation, and fast, and it also provides a merge function. Lt? is a predicate that takes two elements of xs and returns #t if the first precedes the second and #f otherwise:

Like its Unix counterpart, unique returns its input list with adjacent duplicates removed. Uniq-c returns its input list paired with a count of adjacent duplicates, just like Unix uniq with the -c flag.
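Both functions can be sketched in a few lines. The signatures (unique eql? xs) and (uniq-c eql? xs), and the (item . count) pairs returned by uniq-c, are assumptions made for illustration.

```scheme
;; Keep the last element of each run of adjacent duplicates.
(define (unique eql? xs)
  (cond ((null? xs) '())
        ((null? (cdr xs)) xs)
        ((eql? (car xs) (cadr xs)) (unique eql? (cdr xs)))
        (else (cons (car xs) (unique eql? (cdr xs))))))

;; Return (item . count) pairs, one per run of adjacent duplicates.
(define (uniq-c eql? xs)
  (if (null? xs)
      '()
      (let loop ((xs (cdr xs)) (prev (car xs)) (k 1) (result '()))
        (cond ((null? xs)
               (reverse (cons (cons prev k) result)))
              ((eql? (car xs) prev)
               (loop (cdr xs) prev (+ k 1) result))
              (else
               (loop (cdr xs) (car xs) 1 (cons (cons prev k) result)))))))

(unique = '(1 1 2 3 3 3 1))   ; (1 2 3 1)
(uniq-c = '(1 1 2 3 3 3 1))   ; ((1 . 2) (2 . 1) (3 . 3) (1 . 1))
```

Only adjacent duplicates are merged, so the trailing 1 survives; sorting the input first removes all duplicates.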

Vectors are sorted with the Bentley/McIlroy quicksort. The comparison function (cmp a b) returns an integer that is less than, equal to, or greater than zero when its first argument is less than, equal to, or greater than its second.

Identity returns its only argument and constant returns a function that, when called, always returns the same value regardless of its argument. Both functions are useful as recursive bases when writing higher-order functions.

(define (identity x) x)

(define (constant x) (lambda ys x))

Fst returns its first argument and snd returns its second argument; both functions occasionally find use when composing strings of functions:

(define (fst x y) x)

(define (snd x y) y)

Function composition creates a new function by applying multiple functions, one after the other, each to the result of the one before. In the simplest case, there are only two functions, f and g, composed as ((compose f g) x) ≡ (f (g x)); the composition can be bound to create a new function, as in (define fg (compose f g)). Compose takes one or more procedures and returns a new procedure that performs the same action as the individual procedures would if called in succession.
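A simplified compose is sketched below. It assumes that each procedure returns a single value; only the rightmost procedure may take several arguments. The Prelude's version may be more general.

```scheme
;; (compose f g h) applies h first, then g, then f.
(define (compose . fs)
  (let ((gs (reverse fs)))
    (lambda args
      (let loop ((gs (cdr gs)) (x (apply (car gs) args)))
        (if (null? gs)
            x
            (loop (cdr gs) ((car gs) x)))))))

((compose car cdr) '(1 2 3))                 ; 2, same as (car (cdr '(1 2 3)))
((compose (lambda (x) (* x 2)) +) 1 2 3)     ; 12
```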

Complement takes a predicate and returns a new predicate that returns #t where the original returned #f and #f where the original returned non-#f. It is useful with functions like filter and take-while that take predicates as arguments.

(define (complement f) (lambda xs (not (apply f xs))))

Swap takes a binary function and returns a new function that is similar to the original but with arguments reversed. It is useful when composing or currying functions that take their arguments in an inconvenient order.

(define (swap f) (lambda (x y) (f y x)))

A section is a procedure which has been partially applied to some of its arguments; for instance, (double x), which returns twice its argument, is a partial application of the multiply operator to the number 2. Sections come in two kinds: left sections partially apply arguments starting from the left, and right sections partially apply arguments starting from the right. Left-section takes a procedure and some prefix of its arguments and returns a new procedure in which those arguments are partially applied. Right-section takes a procedure and some reversed suffix of its arguments and returns a new procedure in which those arguments are partially applied.
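The two kinds of section can be sketched as follows; the definitions are illustrative and assume the argument conventions described above, including the reversed suffix for right-section.

```scheme
;; Fix a prefix of the arguments.
(define (left-section proc . args)
  (lambda xs (apply proc (append args xs))))

;; Fix a suffix of the arguments, given in reverse order.
(define (right-section proc . args)
  (lambda xs (apply proc (append xs (reverse args)))))

(define double (left-section * 2))
(double 21)                      ; 42
((left-section - 10) 3)          ; (- 10 3) = 7
((right-section - 10) 3)         ; (- 3 10) = -7
((right-section list 3 2) 1)     ; (1 2 3)
```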

Currying is the technique of rewriting a function that takes multiple arguments so that it can be called as a chain of functions that each take a single argument; the technique is named after the mathematician Haskell Curry, who discovered it (Moses Schönfinkel discovered the technique independently, but the term schönfinkeling never caught on). For example, if div is the curried form of the division operator, defined as (define-curried (div x y) (/ x y)), then inv is the function that returns the inverse of its argument, defined as (define inv (div 1)).
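A define-curried macro with this behavior might be sketched as below; curried-lambda is a helper name introduced here for illustration.

```scheme
;; Turn (x y ...) into nested one-argument lambdas.
(define-syntax curried-lambda
  (syntax-rules ()
    ((_ () body body* ...) (begin body body* ...))
    ((_ (arg arg* ...) body body* ...)
     (lambda (arg) (curried-lambda (arg* ...) body body* ...)))))

(define-syntax define-curried
  (syntax-rules ()
    ((_ (func arg ...) body body* ...)
     (define func (curried-lambda (arg ...) body body* ...)))))

(define-curried (div x y) (/ x y))
(define inv (div 1))
(inv 4)         ; 1/4
((div 10) 2)    ; 5
```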

The integer square root of a positive number is the greatest integer that, when multiplied by itself, does not exceed the given number. The integer square root can be computed by Newton’s method of approximation via derivatives:
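A plain integer Newton iteration is sketched here for illustration; the zero case is handled separately as an added convenience. Starting from x = n, each step replaces x with the integer average of x and n/x, stopping when the estimate no longer decreases.

```scheme
(define (isqrt n)
  (if (zero? n)
      0
      (let loop ((x n))
        (let ((y (quotient (+ x (quotient n x)) 2)))
          (if (< y x)
              (loop y)
              x)))))

(isqrt 10)   ; 3
(isqrt 16)   ; 4
(isqrt 2)    ; 1
```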

Modular exponentiation is provided by the function (expm b e m). This is equivalent to (modulo (expt b e) m), except that the algorithm avoids the calculation of the large intermediate exponentiation by performing multiply-and-square in stages.
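The multiply-and-square idea can be sketched as below, assuming a non-negative exponent e and a positive modulus m. Every intermediate product is reduced modulo m, so no number larger than m squared is ever formed.

```scheme
(define (expm b e m)
  (let loop ((b (modulo b m)) (e e) (r 1))
    (cond ((zero? e) (modulo r m))
          ((odd? e)                        ; low bit set: multiply into r
           (loop (modulo (* b b) m) (quotient e 2) (modulo (* r b) m)))
          (else                            ; low bit clear: square only
           (loop (modulo (* b b) m) (quotient e 2) r)))))

(expm 2 10 1000)   ; 24, since 2^10 = 1024
(expm 3 4 5)       ; 1, since 3^4 = 81
```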

Common Lisp provides a full suite of bit operators and a bit vector sequence datatype. Scheme, in its minimalism, provided neither until R6RS required a minimal suite of bit operators, but still no bit vectors. Our suite is small but useful. Note that all the bit functions are grossly inefficient; you should use whatever your Scheme implementation provides instead of relying on the functions given below.

Bit-wise operators consider numbers as a sequence of binary bits and operate on them using the logical operations and, inclusive-or, exclusive-or, and not; they are implemented using basic arithmetic. An arithmetic shift multiplies (positive shift) or divides (negative shift) by powers of two:
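The arithmetic approach can be sketched for non-negative integers as below. This is an illustration only: negative operands are not handled, and the names logand, logior, logxor and ash are assumed to match the Prelude's.

```scheme
;; Each operator peels off the low bit with odd? and recurs on the
;; remaining bits with quotient. Non-negative arguments only.
(define (logand a b)
  (if (or (zero? a) (zero? b))
      0
      (+ (* 2 (logand (quotient a 2) (quotient b 2)))
         (if (and (odd? a) (odd? b)) 1 0))))

(define (logior a b)
  (cond ((zero? a) b)
        ((zero? b) a)
        (else (+ (* 2 (logior (quotient a 2) (quotient b 2)))
                 (if (or (odd? a) (odd? b)) 1 0)))))

(define (logxor a b)
  (cond ((zero? a) b)
        ((zero? b) a)
        (else (+ (* 2 (logxor (quotient a 2) (quotient b 2)))
                 (if (eq? (odd? a) (odd? b)) 0 1)))))

;; Positive shift multiplies by 2^shift, negative shift divides.
(define (ash n shift)
  (if (negative? shift)
      (quotient n (expt 2 (- shift)))
      (* n (expt 2 shift))))

(logand 12 10)   ; 8   (1100 and 1010 = 1000)
(logior 12 10)   ; 14  (1100 or  1010 = 1110)
(logxor 12 10)   ; 6   (1100 xor 1010 = 0110)
(ash 5 2)        ; 20
(ash 20 -2)      ; 5
```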

Bit vectors are implemented using arrays of characters. They are represented as a pair with a vector of eight-bit characters in the car and the length of the bit vector in the cdr. Make-bitvector creates a bit vector of the requested length with all bits zero unless the optional argument val is one; note the logic to ensure any “slop” bits at the end of the bit vector are set to zero, which is useful in the bitvector-count function:

Bitvector-count returns the number of one-bits in the input bit vector. The counts per byte are pre-calculated in the counts vector. Note that the last byte is not special, because make-bitvector was careful to ensure that any “slop” bits are zero:

This is the portable, high-quality random number generator provided in the Stanford GraphBase by Donald E. Knuth. Based on the lagged Fibonacci generator (Knuth calls it the subtractive method) a_n = (a_{n-24} − a_{n-55}) mod m, where m is an even number (we take m = 2^31) and the numbers a_0 through a_54 are not all even, it provides values on the range zero inclusive to 2^31 exclusive, has a period of at least 2^55 − 1 but is plausibly conjectured to have a period of 2^85 − 2^30 for all but at most one choice of the seed value, and the low-order bits of the generated numbers are just as random as the high-order bits. You can read Knuth’s original version of the random number generator at http://tex.loria.fr/sgb/gb_flip.pdf; see also our exercise GB_FLIP. This random number generator is suitable for simulation but is not cryptographically secure.

Called with no arguments, (rand) returns an exact rational number on the range zero inclusive to one exclusive. Called with a single numeric argument, (rand seed) resets the seed of the random number generator; it is best if the seed is a large integer (eight to ten digits — dates in the form YYYYMMDD work well), though seeds like 0.3 (three-tenths of 2^35) and -42 (2^35-42) also work. Since the shuffling algorithm requires its own data store, knowing the current seed is not sufficient to restart the generator. Thus, (rand 'get) returns the complete current state of the generator, and (rand 'set state) resets the generator to the given state, given a state in the form provided by (rand 'get). (randint n) returns a random non-negative integer less than n and (randint first past) returns a random integer between first inclusive and past exclusive.

Two versions of the mod-diff function are included. You should use the fast version if your Scheme provides logand natively, or the generic version otherwise. Note that logand is sometimes provided under a different name; for instance, it is bitwise-and in R6RS.

Fortune selects an item randomly from a list; the first item is selected with probability 1/1, the second item replaces the selection with probability 1/2, the third item replaces that selection with probability 1/3, and so on, so that the kth item replaces the current selection with probability 1/k. As a result, each item in a list of n items is the final selection with probability 1/n, even though the length of the list is not known in advance. The name derives from the Unix game of the same name, which selects an epigram randomly from a file containing one per line.
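The selection rule can be sketched as below. To keep the example self-contained, the random source is passed in as a parameter; fortune-with and its randint argument (a function returning a non-negative integer less than n) are hypothetical names, not part of the Prelude.

```scheme
;; The kth item replaces the current selection when (randint k)
;; returns 0, which a uniform source does with probability 1/k.
(define (fortune-with randint xs)
  (let loop ((n 1) (x #f) (xs xs))
    (cond ((null? xs) x)
          ((zero? (randint n)) (loop (+ n 1) (car xs) (cdr xs)))
          (else (loop (+ n 1) x (cdr xs))))))

;; Deterministic stand-ins for the random source make the rule visible:
(fortune-with (lambda (n) 0) '(a b c))         ; c, every item replaces
(fortune-with (lambda (n) (- n 1)) '(a b c))   ; a, only the first is taken
```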

Rather than writing one-armed ifs, it is generally better to use a when or unless, which declares precisely what it is. When and unless are provided natively by many Scheme systems; for those that don’t, they are given below:
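For a Scheme system that lacks them, the two forms could be sketched with syntax-rules as follows:

```scheme
(define-syntax when
  (syntax-rules ()
    ((_ pred? expr expr* ...)
     (if pred? (begin expr expr* ...)))))

(define-syntax unless
  (syntax-rules ()
    ((_ pred? expr expr* ...)
     (if (not pred?) (begin expr expr* ...)))))

(when (> 2 1) 'yes)     ; yes
(unless (> 1 2) 'yes)   ; yes
```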

The call-with-values syntax makes it difficult to use multiple values. The let-values syntax required by R6RS and present in many R5RS systems is far more convenient. An implementation, stolen from Kent Dybvig, is given below for those Scheme systems that lack let-values:

Generators provide an easy-to-use syntax for separating the production of values from their consumption, and are provided natively in many other languages (sometimes they are called iterators). A function defined by define-generator creates a function that, when called, returns the next value in a sequence. For instance:

Astronomers calculate the julian number of a date as the number of days elapsed since January 1, 4713 BC. The Gregorian calendar, promulgated by Pope Gregory XIII on February 24, 1582, is the civil calendar used in much of the world. Functions julian and gregorian translate between the two calendars; year must be specified as the full four-digit number (unless you want years in the first millennium), month ranges from 1 for January to 12 for December, day ranges from 1 to 31, and the day of the week can be calculated as the julian number modulo 7, with 0 for Monday and 6 for Sunday:
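The forward conversion can be sketched with the widely used integer formula for the julian day number of a Gregorian date. This is an illustration and may differ in detail from the Prelude's julian; day-of-week is a hypothetical helper added for the example.

```scheme
(define (julian year month day)
  (let* ((a (quotient (- 14 month) 12))    ; 1 for January and February
         (y (+ year 4800 (- a)))           ; years counted from March
         (m (+ month (* 12 a) -3)))        ; March = 0 ... February = 11
    (+ day
       (quotient (+ (* 153 m) 2) 5)        ; days before month m
       (* 365 y)
       (quotient y 4)
       (- (quotient y 100))
       (quotient y 400)
       -32045)))

;; 0 = Monday ... 6 = Sunday
(define (day-of-week year month day)
  (modulo (julian year month day) 7))

(julian 2000 1 1)        ; 2451545
(day-of-week 2000 1 1)   ; 5, a Saturday
```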

For several centuries, the calculation of the date of Easter, a calculation known as the computus, was the most important scientific endeavor of the entire world. Function easter calculates the julian number of the date of Easter for a given year. If offset is given, it is the number of days before or after Easter; for instance, to compute the date of Mardi Gras, give an offset of -47:

The assert macro is useful when testing programs. The syntax (assert expr result) computes expr and result; if they are the same, assert produces no output and returns no value. But if expr and result differ, assert writes a message that includes the text of expr and the result of computing both expr and result. Assert is a macro, not a function, because it prints the literal expr as part of its output, making it easy in a long sequence of assertions to know which is in error. Assert produces no output if all is well, on the theory that “No news is good news.”
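Such a macro might be sketched as below. The use of equal? for the comparison and the wording of the message are assumptions; quoting expr inside the template is what lets the macro print the literal text of the expression.

```scheme
(define-syntax assert
  (syntax-rules ()
    ((_ expr result)
     (let ((actual expr) (expected result))
       (if (not (equal? actual expected))
           (begin
             (display "failed assertion: ") (write 'expr) (newline)
             (display "expected: ") (write expected) (newline)
             (display "returned: ") (write actual) (newline)))))))

(assert (+ 1 2) 3)   ; silent: the values agree
(assert (+ 1 2) 4)   ; reports the literal (+ 1 2), expected 4, returned 3
```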

Define-integrable is similar to define for procedure definitions except that the code for the procedure is integrated (some people would say inlined) whenever the procedure is called, eliminating the function-call overhead associated with the procedure. Any procedure defined with define-integrable must appear in the source code before the first reference to the defined identifier. Lexical scoping is preserved, macros within the body of the defined procedure are expanded at the point of call, the actual parameters to an integrated procedure are evaluated once and at the proper time, integrable procedures may be used as first-class values, and recursive procedures do not cause indefinite recursive expansion. Define-integrable appears in Section 8.4 of R. Kent Dybvig’s book The Scheme Programming Language:

Scheme provides hygienic macros (though syntax-case provides a way to safely bend hygiene); Common Lisp, by comparison, provides unhygienic macros. There are some circumstances where unhygienic macros are more convenient than hygienic macros; Paul Graham provides numerous examples in his book On Lisp. Define-macro provides unhygienic macros for Scheme:

The original version of isqrt, shown below, has been replaced by a version due to Henri Cohen (see algorithm 1.7.1 of his textbook A Course in Computational Algebraic Number Theory) which is prettier, faster, and generates less garbage:

The Standard Prelude at one time included a function to generate the permutations of a list. That function has been removed because it is too specific for a general-purpose Standard Prelude. The original function, with its description, appears below:

It is sometimes useful to generate a list of the permutations of a list. The function below is from Shmuel Zaks, A new algorithm for generation of permutations, Technical Report 220, Technion-Israel Institute of Technology, 1981: