
LOOP is a generalized iteration form supporting extensible iterator macros, keyword updates, and full recursion. The idea is to create a loop as simple and close to natural Scheme as possible, while still being extensible.

The parameters have default steps, so we don't need to pass them explicitly anymore (though we can still do so if we wanted to override the defaults).
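To make the idea of default steps concrete, here is a minimal sketch that uses only the standard DO form rather than LOOP itself. It assumes nothing about the egg beyond the DO-style variables this page mentions; the name FOLD-SKETCH is hypothetical. Each variable carries an initial value and a step, so nothing has to be passed explicitly on each iteration.

```scheme
;; Plain Scheme illustration using standard DO, not the LOOP macro.
;; Both variables are stepped in parallel from their previous values.
(define (fold-sketch kons knil ls)
  (do ((ls ls (cdr ls))                      ; init, then default step
       (knil knil (kons (car ls) knil)))     ; init, then default step
      ((not (pair? ls)) knil)))              ; termination test and result

(fold-sketch cons '() '(1 2 3)) ; => (3 2 1)
```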

In addition, we provide extensible iterators to automatically handle the logic of stepping, fetching, and checking termination on sequence types. To use an iterator, specify one or more variable names followed by `<-' followed by the iterator and any parameters:

(x <- in-foo bar baz qux)

To iterate over a list, use the IN-LIST macro:

(x <- in-list ls)

This will bind X to the successive elements of LS in the body of the loop.
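As a rough picture of what this iterator saves you from writing, the following named let does the stepping, fetching, and termination check by hand. It is a sketch in plain Scheme, not the expansion the macro actually produces; SUM-LIST and PAIR are hypothetical names.

```scheme
;; Hand-written equivalent of iterating X over the elements of LS.
(define (sum-list ls)
  (let lp ((pair ls) (sum 0))
    (if (null? pair)                 ; termination check
        sum
        (let ((x (car pair)))        ; fetch: X is the current element
          (lp (cdr pair)             ; step: move to the next pair
              (+ sum x))))))

(sum-list '(1 2 3 4)) ; => 10
```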

Now, when iterating automatically, the loop will also terminate automatically if it encounters the end of its input. In such a case you may want to specify a return value. You can do this by putting a => clause, followed by the return expression, after the loop bindings.

Note we can still call, or not call, the loop itself in the body according to whatever logic we want, and re-enter it possibly multiple times. However, in this common case where the entire body is reduced to just calling the loop again, we can omit it by using an anonymous loop:

No flexibility is lost over named let, yet we've gained the convenience of iterators. If you wanted to change the above to work on vectors, all you would need to do is change the iterator:

(x <- in-vector vec)

and it works as expected.

Bindings and scope

Iterator macros may introduce variables in three different lexical scopes:

Loop variables

Analogous to the variables in a named let, these are initialized once and updated on each iteration through the loop. In the example above, KNIL is a loop variable (as are all named let and DO-style variables).

Body variables

Bound in the body, these are usually derived directly from loop variables. They can't be overridden (see below) and are not available in the final expression. In the example above, X is a body variable.

Final variables

Bound once in the return expression, these are sometimes used for a final computation such as reversing a consed-up list.

Within each of these three lexical scopes, all variables are updated in parallel, and none are ever mutated (unless the programmer does so manually). This referential transparency is important to achieve full non-tail recursion and re-entrancy.

In many cases the loop variables will be implicit and unnamed. For instance, IN-LIST uses a loop variable to cdr down the list of pairs, binding X to the successive cars. However, in such cases the iterator usually lets you explicitly name the loop variable if you want access to it.

Loop variables may be manually overridden on a recursive call. You can either use the original positional arguments, or specify individual values by name with the <- syntax, punning the initial binding. Thus in

(loop lp ((x ls <- in-list ls)) ...)

the recursive calls

(lp)
(lp (cdr ls))
(lp ls <- (cdr ls))

are all the same. Note that we are binding the loop variable LS, not X which is considered to be always derived from the loop variable. Note also that there is no need to recurse on CDR - we could use CDDR, or a completely unrelated list, or '() to force an early termination.
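The freedom to recurse on something other than the CDR can be sketched with ordinary named lets, where the list variable is always passed explicitly. This is plain Scheme rather than LOOP syntax, and both procedure names are hypothetical.

```scheme
;; Recurse on CDDR to visit every other element.
(define (every-other ls)
  (let lp ((ls ls) (acc '()))
    (if (null? ls)
        (reverse acc)
        (let ((x (car ls)))                        ; X is derived from LS
          (lp (if (pair? (cdr ls)) (cddr ls) '())  ; skip one element
              (cons x acc))))))

;; Pass '() to force an early termination.
(define (take-until-negative ls)
  (let lp ((ls ls) (acc '()))
    (cond ((null? ls) (reverse acc))
          ((negative? (car ls)) (lp '() acc))      ; stop at once
          (else (lp (cdr ls) (cons (car ls) acc))))))

(every-other '(a b c d e))          ; => (a c e)
(take-until-negative '(1 2 -3 4))   ; => (1 2)
```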

The following example flattens a tree into a list, using minimal conses and stack. This serves as an example of naming implicit loop variables, binding loop variables, and non-tail recursion.
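For comparison, here is a plain named let analogue of such a flattening procedure. It is a sketch of the technique (a non-tail inner call whose result is fed into the outer tail call), not the LOOP version this page refers to; FLATTEN-SKETCH is a hypothetical name.

```scheme
;; Flatten a tree of nested lists into a flat list.
;; The result is consed up in reverse and reversed once at the end.
(define (flatten-sketch tree)
  (reverse
   (let lp ((ls tree) (res '()))
     (cond
      ((null? ls) res)
      ((pair? (car ls))
       ;; Non-tail call: flatten the subtree into RES first,
       ;; then continue down the rest of the list.
       (lp (cdr ls) (lp (car ls) res)))
      (else
       ;; Anything that is not a pair, including '(), is kept as an element.
       (lp (cdr ls) (cons (car ls) res)))))))

(flatten-sketch '(1 (2 3) ((4)) 5)) ; => (1 2 3 4 5)
```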

The scope of the final expression will include all the final variables, as well as the last instances of all the loop variables, at least one of which will correspond to a true termination condition (you could manually check the others to see if the sequence lengths were uneven). The body variables are not bound; however, the loop itself, if named, is available so that you can restart the loop with all new initial values if you want.

in-string / in-string-reverse

Iterate over the characters of a string. Proceeds from <start>, inclusive, to <end>, exclusive. By default <start> is 0 and <end> is the string length, thus iterating over every character.

You can specify a step other than the default 1, for example 2 to iterate over every other character.

The reverse version steps from one less than the end, continuing until you step below the start. Thus with the same <start> and <end> and a <step> of 1 (or any divisor of the difference), the two forms will iterate over the same characters but in the reverse order.

Note this works correctly with the utf8 egg, but is not optimal in such cases because the use of numeric indexes is slow.
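The index arithmetic described above can be sketched by hand for the default step of 1. This is a plain Scheme illustration of the <start>/<end> bounds only, not the macros' implementation, and both procedure names are hypothetical.

```scheme
;; Characters from START (inclusive) to END (exclusive), in order.
(define (string-chars str start end)
  (let lp ((i start) (acc '()))
    (if (>= i end)
        (reverse acc)
        (lp (+ i 1) (cons (string-ref str i) acc)))))

;; Same bounds, but starting at END - 1 and stopping below START.
(define (string-chars-reverse str start end)
  (let lp ((i (- end 1)) (acc '()))
    (if (< i start)
        (reverse acc)
        (lp (- i 1) (cons (string-ref str i) acc)))))

(string-chars "hello" 1 4)         ; => (#\e #\l #\l)
(string-chars-reverse "hello" 1 4) ; => (#\l #\l #\e)
```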

in-port / in-file

Iterate over data read from a port, defaulting to (CURRENT-INPUT-PORT) for IN-PORT, and a port opened by (OPEN-INPUT-FILE <path>) for IN-FILE. The reader defaults to READ-CHAR, and the termination test defaults to EOF-OBJECT?.

The stateful nature of ports means that these are not referentially transparent, and you can't save a loop iteration to go back to later. In particular, IN-FILE will close its port on the first termination, causing an error if you attempt to re-enter the same loop again.

in-range / in-range-reverse

Step through the real numbers beginning with <from> (default 0), until they would be greater than (less than, in the -reverse case) or equal to <to>; thus <to> is never included. <step> defaults to 1.

Two arguments indicate <from> and <to>, so provide the default <from> of 0 if you're only interested in <to> and <step>.
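The bounds described above can be pictured with two small list-building procedures. This is a sketch in plain Scheme under one assumption not stated on this page: that the -reverse form subtracts <step> on each iteration. The procedure names are hypothetical.

```scheme
;; Numbers FROM, FROM+STEP, ... that are strictly less than TO.
(define (range->list from to step)
  (let lp ((i from) (acc '()))
    (if (>= i to)
        (reverse acc)
        (lp (+ i step) (cons i acc)))))

;; Assumed reverse behaviour: FROM, FROM-STEP, ... strictly greater than TO.
(define (range-reverse->list from to step)
  (let lp ((i from) (acc '()))
    (if (<= i to)
        (reverse acc)
        (lp (- i step) (cons i acc)))))

(range->list 0 10 3)         ; => (0 3 6 9)
(range-reverse->list 10 0 3) ; => (10 7 4 1)
```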

These macros are subject to change in the near future.

in-random

syntax: (<number> <- in-random [<range> [<low>]])

With no arguments, <number> is bound to a random inexact number uniformly distributed between 0.0 and 1.0, inclusive, on each iteration.

With a single argument, <number> is bound to a random integer uniformly distributed over 0..<range>-1, inclusive.

With two arguments, <number> is bound to a random integer uniformly distributed over <low>..<low>+<range>-1, inclusive.

These are conceptually infinite sequences, and themselves never cause the loop to terminate.

in-random-element

syntax: (<element> <- in-random-element <vector-or-list>)

On each iteration, <element> is bound to a random object uniformly chosen from the elements of the <vector-or-list> source.

Elements may be repeated, so this is a conceptually infinite sequence.

in-permutations

syntax: (<perm> <- in-permutations <list> [<n>])

With one argument, <perm> is bound to the successive permutations of the elements of <list> in lexicographic order. No assumptions about the elements are made - if <list> is a multi-set, duplicate permutations will arise.

This is very fast and mutation free. It uses only O(k) space, where k is the number of elements in <list>. Beware that the number of permutations of n elements is n!, which grows extremely fast.
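To illustrate the ordering and the multi-set caveat, here is a naive list-building sketch in plain Scheme. It is not the iterator's algorithm: it materializes all n! permutations at once rather than using O(k) space, and both procedure names are hypothetical.

```scheme
;; Remove the element at index I (0-based) from LS.
(define (remove-at ls i)
  (if (= i 0)
      (cdr ls)
      (cons (car ls) (remove-at (cdr ls) (- i 1)))))

;; All permutations of LS, ordered lexicographically by element position.
;; Elements are never compared, so equal elements yield duplicates.
(define (permutations-sketch ls)
  (if (null? ls)
      (list '())
      (let lp ((i 0) (result '()))
        (if (= i (length ls))
            result
            (lp (+ i 1)
                (append result
                        (map (lambda (p) (cons (list-ref ls i) p))
                             (permutations-sketch (remove-at ls i)))))))))

(permutations-sketch '(1 2 3))
;; => ((1 2 3) (1 3 2) (2 1 3) (2 3 1) (3 1 2) (3 2 1))
(permutations-sketch '(a a))
;; => ((a a) (a a))
```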

in-hash-table

Iterate over the <key> and <value> pairs of a hash-table <table>. The current <key> being iterated over may be deleted from the table or have its value in the table changed safely.

The result is unspecified if you add or remove other entries in the table while it is being iterated over. If you want to capture a safe snapshot of the table first, you can convert it to an alist and iterate over those values.

collecting

This is the only standard iterator that introduces a final variable. <list> is bound only in the => final clause. By default,

a <cons> of APPEND-REVERSE will append all the <expr>'s into a list.

a <finalize> of REVERSE-LIST->VECTOR will collect a vector, and IDENTITY will collect a reversed list.

By specifying all of <cons>, <finalize> and <init> you could collect into any data structure.

The optional <rev> is a loop variable representing the intermediate consed results. You may override this manually to include or exclude values, or even reset the collected results mid-loop.

This is really just syntactic sugar over an accumulated list to save you the trouble of reversing manually at the end.
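The pattern being sugared can be written out by hand as follows. This is a plain Scheme sketch of accumulate-then-reverse, with REV playing the role of the intermediate consed results; the procedure names are hypothetical, and the vector variant uses standard LIST->VECTOR rather than REVERSE-LIST->VECTOR.

```scheme
;; Collect the squares of the elements of LS into a list.
(define (squares ls)
  (let lp ((pair ls) (rev '()))
    (if (null? pair)
        (reverse rev)                          ; finalize: undo the reversal
        (lp (cdr pair)
            (cons (* (car pair) (car pair)) rev)))))

;; Same accumulation, finalized into a vector instead.
(define (squares-vector ls)
  (let lp ((pair ls) (rev '()))
    (if (null? pair)
        (list->vector (reverse rev))
        (lp (cdr pair)
            (cons (* (car pair) (car pair)) rev)))))

(squares '(1 2 3))        ; => (1 4 9)
(squares-vector '(1 2 3)) ; => #(1 4 9)
```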

Implicit matching

For any body variable (as described above, the ones derived from iterators, e.g. the elements in a list), instead of a simple name you can use any sexp, and it will be matched against the result as in Common Lisp's destructuring-bind, except using the matchable syntax (described in Pattern Matching). So for example, to iterate nicely over the pairs in an alist, you just do

(loop (((k . v) <- in-list alist)) (print "key: " k " value: " v))

This costs nothing if you don't use it, and is fast even if you do.

Extending

Adding your own iterators is easy. When a loop includes a binding such as

(left ... <- in-iterator right ...)

then the iterator itself is called as a macro in the following form:

(in-iterator ((left ...) (right ...)) next . rest)

where next and rest are the continuation. The continuation expects to be passed the appropriate information to insert in the loop, in the following form:

Note the outer let bindings are empty because we don't have anything to remember - the loop just proceeds by cdr'ing down the LS loop variable. In an iterator such as IN-VECTOR, where you repeatedly VECTOR-REF the same vector, you'd want to bind the vector once so that it's not evaluated multiple times.

The final result bindings are also usually empty. Currently they are only used by COLLECTING to reverse the list that has been accumulated so far.