In section 3 we said that programs were
collections of function definitions and possibly some variable
definitions, too. To guide the division of labor among functions, we also
introduced a rough guideline:

Formulate auxiliary function definitions
for every dependency between quantities in the problem statement.

So far the guideline has been reasonably effective, but it is now time to
take a second look at it and to formulate some additional guidance
concerning auxiliary functions.

In the first subsection, we refine our original guideline concerning
auxiliary functions. The suggestions mostly put into words the experience
that we gained with the exercises. The second and third subsections
illustrate two of the ideas in more depth; the last one is an extended
exercise.

When we develop a program, we may hope to implement it with a single
function definition but we should always be prepared to write auxiliary
functions. In particular, if the problem statement mentions several
dependencies, it is natural to express each of them as a function. Others who
read the problem statement and the program can follow our reasoning more
easily that way. The movie-theater example in
section 3.1 is a good example of this style of
development.

Otherwise, we should follow the design recipe and start with a thorough
analysis of the input and output data. Using the data analysis we should
design a template and attempt to refine the template into a complete
function definition. Turning a template into a complete function
definition means combining the values of the template's subexpressions
into the final answer. As we do so, we might encounter several situations:

If the formulation of an answer requires a case analysis of the
available values, use a cond-expression.

If a computation requires knowledge of a particular domain of
application, for example, drawing on (computer) canvases, accounting,
music, or science, use an auxiliary function.

If a computation must process a list, a natural number, or some other
piece of data of arbitrary size, use an auxiliary function.

If the natural formulation of the function isn't quite what we want,
it is most likely a generalization of our target. In this case, the main
function is a short definition that defers the computation to the
generalized auxiliary program.

The last two criteria are situations that we haven't discussed yet. The
following two subsections illustrate them with examples.

After we determine the need for an auxiliary function, we should add a
contract, a header, and a purpose statement to a WISH LIST of
functions.

Guideline on Wish Lists

Maintain a list of functions that must be developed
to complete a program. Develop each function according to a design
recipe.

Before we put a function on the wish list, we must check whether
something like the function already exists or is already on the wish
list. Scheme provides many primitive operations and functions, and so do
other languages. We should find out as much as possible about our working
language, though only when we settle on one. For beginners, a superficial
knowledge of a language is fine.

If we follow these guidelines, we interleave the development of one
function with that of others. As we finish a function that does not depend
on anything on our wish list, we can test it. Once we have tested such
basic functions, we can work our way backwards and test other functions
until we have finished the wish list. By testing each function rigorously
before we test those that depend on it, we greatly reduce the effort of
searching for logical mistakes.

People need to sort things all the time. Investment advisors sort
portfolios by the profit each holding generates. Doctors sort lists of
transplant patients. Mail programs sort messages. More generally, sorting
lists of values by some criteria is a task that many programs need to
perform.

Here we study how to sort a list of numbers not only because it is important
for many programming tasks, but also because it provides a good case study of
the design of auxiliary programs. A sorting function consumes a list and
produces one. Indeed, the two lists contain the same numbers, though the
output list contains them in a different order. This is the essence of the
contract and purpose statement:

;; sort : list-of-numbers -> list-of-numbers
;; to create a sorted list of numbers from all the numbers in alon
(define (sort alon) ...)

Using this template, we can finally turn to the interesting part of the
program development. We consider each case of the cond-expression
separately, starting with the simple case. If sort's input is
empty, then the answer is empty, as specified by the
example. So let's assume that the input is not empty. That is,
let's deal with the second cond-clause. It contains two
expressions and, following the design recipe, we must understand what they
compute:

(first alon) extracts the first number from the input;

(sort (rest alon)) produces a sorted version of
(rest alon), according to the purpose statement of the function.

Putting together these two values means inserting the first number into its
appropriate spot in the sorted rest of the list.

Let's look at the second example in this context. When sort
consumes (cons 1297.04 (cons 20000.00 (cons -505.25 empty))), then

(first alon) evaluates to 1297.04,

(rest alon) is (cons 20000.00 (cons -505.25 empty)), and

(sort (rest alon)) produces (cons 20000.00 (cons -505.25 empty)).

To produce the desired answer, we must insert 1297.04 between the
two numbers of the last list. More generally, the answer in the second
cond-line must be an expression that inserts (first alon) in its
proper place into the sorted list (sort (rest alon)).
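The pieces of this example can be written down and checked directly. The following is only a minimal sketch: it assumes a Racket-style Scheme in which empty, first, and rest are available, and the names example-alon, first-number, sorted-rest, and desired-answer are introduced here for illustration. Because sort is not yet complete, the sorted rest is written out by hand.

```scheme
;; the input of the second example (the name is illustrative)
(define example-alon
  (cons 1297.04 (cons 20000.00 (cons -505.25 empty))))

;; the first number of the input: 1297.04
(define first-number (first example-alon))

;; what (sort (rest example-alon)) must produce, written out by hand
(define sorted-rest (cons 20000.00 (cons -505.25 empty)))

;; the desired answer places first-number between the two numbers
;; of sorted-rest
(define desired-answer
  (cons (first sorted-rest)
        (cons first-number (rest sorted-rest))))
```

Building desired-answer by hand works for this one example only; a general solution has to search the sorted list for the proper place.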

Inserting a number into a sorted list isn't a simple task. We may have to
search through the entire list before we know what the proper place
is. Searching through a list, however, can be done only with a function,
because lists are of arbitrary size and processing such values requires
recursive functions. Thus we must develop an auxiliary function that
consumes the first number and a sorted list and creates a sorted list from
both. Let us call this function insert and let us formulate a
wish-list entry:

;; insert : number list-of-numbers -> list-of-numbers
;; to create a list of numbers from n and the numbers on alon
;; that is sorted in descending order; alon is already sorted
(define (insert n alon) ...)

The answer in the second cond-line of sort,
(insert (first alon) (sort (rest alon))), says that in order to produce
the final result, sort extracts the first item of the non-empty list,
computes the sorted version of the rest of the list, and inserts
the former into the latter at its appropriate place.

Of course, we are not really finished until we have developed
insert. We already have a contract, a header, and a purpose
statement. Next we need to make up function examples. Since the first input
of insert is atomic, let's make up examples based on the data
definition for lists. That is, we first consider what insert
should produce when given a number and empty. According to
insert's purpose statement, the output must be a list, it must
contain all numbers from the second input, and it must contain the first
argument. This suggests the following:

(insert 5 empty)
;; expected value:
(cons 5 empty)

Instead of 5, we could have used any number.

The second example must use a non-empty list, but then, the idea for
insert was suggested by just such an example when we studied how
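sort should deal with non-empty lists. Specifically, we said that
sort had to insert 1297.04 into (cons 20000.00
(cons -505.25 empty)) at its proper place:

(insert 1297.04 (cons 20000.00 (cons -505.25 empty)))
;; expected value:
(cons 20000.00 (cons 1297.04 (cons -505.25 empty)))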

In contrast to sort, the function insert consumes two inputs. But we know that the first one is a number and atomic. We
can therefore focus on the second argument, which is a list of numbers and
which suggests that we use the list-processing template one more time:

The only difference between this template and the one for sort is
that this one needs to take into account the additional argument n.

To fill the gaps in the template of insert, we again proceed on a
case-by-case basis. The first case concerns the empty list. According to
the purpose statement, insert must now construct a list with one
number: n. Hence the answer in the first case is (cons n empty).

The second case is more complicated than that. When alon is not
empty,

(first alon) is the first number on alon, and

(insert n (rest alon)) produces a sorted list consisting of
n and all numbers on (rest alon).

The problem is how to combine these pieces of data to get the answer. Let
us consider an example:

(insert 7 (cons 6 (cons 5 (cons 4 empty))))

Here n is 7 and larger than any of the numbers in the
second input. Hence it suffices if we just cons 7 onto
(cons 6 (cons 5 (cons 4 empty))). In contrast, when the
application is something like

(insert 3 (cons 6 (cons 2 (cons 1 (cons -1 empty)))))

n must indeed be inserted into the rest of the list. More
concretely,

(first alon) is 6

(insert n (rest alon))
is
(cons 3 (cons 2 (cons 1 (cons -1 empty)))).

By adding 6 onto this last list with cons, we get the
desired answer.

Here is how we generalize from these examples. The problem requires a
further case distinction. If n is larger than (or
equal to) (first alon), all the items in alon are no larger than
n; after all, alon is already sorted. The result is
(cons n alon) for this case. If, however, n is
smaller than (first alon), then we have not yet found the proper place to
insert n into alon. We do know that the first item of
the result must be (first alon) and that n must be
inserted into (rest alon). The final result in this case is

(cons (first alon) (insert n (rest alon)))

because this list contains n and all items of alon in
sorted order, which is what we need.

The translation of this discussion into Scheme requires the formulation of
a conditional expression that distinguishes between the two possible
cases:

(cond
  [(>= n (first alon)) ...]
  [(< n (first alon)) ...])

From here, we just need to put the proper answer expressions into the two
cond-clauses. Figure 33 contains the complete
definitions of insert and sort.
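As a reading aid, here is a sketch of what the two definitions look like when the answers derived above are placed into the cond-clauses. It assumes a Racket-style Scheme with empty, empty?, first, and rest; if the language in use already provides a function named sort, the definition below shadows it or would need a different name.

```scheme
;; sort : list-of-numbers -> list-of-numbers
;; to create a sorted list of numbers from all the numbers in alon
;; (descending order)
(define (sort alon)
  (cond
    [(empty? alon) empty]
    [else (insert (first alon) (sort (rest alon)))]))

;; insert : number list-of-numbers -> list-of-numbers
;; to create a list of numbers from n and the numbers on alon
;; that is sorted in descending order; alon is already sorted
(define (insert n alon)
  (cond
    [(empty? alon) (cons n empty)]
    [else (cond
            [(>= n (first alon)) (cons n alon)]
            [(< n (first alon))
             (cons (first alon) (insert n (rest alon)))])]))
```

Following the wish-list discipline, insert can be tested on its own first, for instance with (insert 5 empty) and (insert 3 (cons 6 (cons 2 (cons 1 (cons -1 empty))))), before sort is tested on the examples from the beginning of this subsection.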

represents a triangle. The question is what empty means as a
polygon. The answer is that empty does not represent a polygon and
therefore shouldn't be included in the class of polygon representations. A
polygon should always have at least one corner, and the lists that represent
polygons should always contain at least one posn. This suggests the
following data definition:

A polygon is either

(cons p empty) where p is a posn, or

(cons p lop) where p is a posn
structure and lop is a polygon.
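To make the revised data definition concrete, the following sketch builds two polygons and checks them against the definition. It is an illustration under stated assumptions: the posn structure is defined here because plain Racket does not provide it (the teaching languages do), the coordinates are made up, and polygon? is a hypothetical checker that is not part of the original program.

```scheme
;; a posn has an x and a y coordinate
(define-struct posn (x y))

;; the smallest polygon according to the data definition: one corner
(define one-corner (cons (make-posn 10 10) empty))

;; a triangle: three corners (the coordinates are made up)
(define triangle
  (cons (make-posn 10 10)
        (cons (make-posn 60 10)
              (cons (make-posn 10 60) empty))))

;; polygon? : any -> boolean
;; to determine whether v is built according to the data definition
;; (hypothetical helper, for illustration only)
(define (polygon? v)
  (and (cons? v)
       (posn? (first v))
       (or (empty? (rest v))
           (polygon? (rest v)))))
```

With these definitions, (polygon? triangle) and (polygon? one-corner) produce true, while (polygon? empty) produces false, which mirrors the decision to exclude empty from the class of polygon representations.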

In short, a discussion of how the chosen set of data (lists of
posns) represents the intended information (geometric polygons)
reveals that our choice was inadequate. Revising the data definition
brings us closer to our intentions and makes it easier to design the
program.

Because our drawing primitives always produce true (if anything),
it is natural to suggest the following contract and purpose statement:

Given that both clauses in the data definition use cons, the first
condition must inspect the rest of the list, which is empty for
the first case and non-empty for the second one. Furthermore, in the first
clause, we can add (first a-poly); and in the second case, we not
only have the first item on the list but the second one, too. After all,
polygons generated according to the second clause consist of at least two
posns.

Now we can replace the ``...'' in the template to obtain a complete
function definition. For the first clause, the answer must be
true, because we don't have two posns that we could
connect to form a line. For the second clause, we have two posns,
we can draw a line between them, and we know that (draw-polygon
(rest a-poly)) draws all the remaining lines. Put differently, we can
write

(draw-solid-line (first a-poly) (second a-poly))

in the second clause because we know that a-poly has a second
item. Both (draw-solid-line ...) and (draw-polygon ...) produce
true if everything goes fine. By combining the two expressions
with and, draw-polygon draws all lines.

Unfortunately, testing it with our triangle example immediately reveals a
flaw. Instead of drawing a polygon with three sides, the function draws
only an open curve, connecting all the corners but not closing the curve:

Mathematically put, we have defined a more general function than the one we
wanted. The function we defined should be called ``connect-the-dots'' and
not draw-polygon.
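The flaw can be observed without a canvas. The sketch below assumes a Racket-style Scheme; posn is defined locally, the triangle's coordinates are made up, and draw-solid-line is a hypothetical stand-in for the real drawing primitive that only counts its calls and produces true.

```scheme
(define-struct posn (x y))

;; number of lines "drawn" so far (illustration only)
(define lines-drawn 0)

;; draw-solid-line : posn posn -> true
;; stand-in for the drawing primitive: counts the call, draws nothing
(define (draw-solid-line p q)
  (begin
    (set! lines-drawn (+ lines-drawn 1))
    true))

;; draw-polygon : polygon -> true
;; first attempt; it really just connects the dots
(define (draw-polygon a-poly)
  (cond
    [(empty? (rest a-poly)) true]
    [else (and (draw-solid-line (first a-poly) (second a-poly))
               (draw-polygon (rest a-poly)))]))

(define triangle
  (cons (make-posn 10 10)
        (cons (make-posn 60 10)
              (cons (make-posn 10 60) empty))))

;; draws the lines between consecutive corners only
(define result (draw-polygon triangle))
```

After this runs, lines-drawn is 2 rather than 3: the line from the last corner back to the first one is missing.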

To get from the more general function to what we want, we need to figure
out some way to connect the last dot to the first one. There are several
ways to accomplish this goal, but all of them mean that we define the main
function in terms of the function we just defined or something like it. In
other words, we define one auxiliary function in terms of a more general
one.

One way to define the new function is to add the first position of a
polygon to the end and to have this new list drawn. A symmetric method is
to pick the last one and add it to the front of the polygon. A third
alternative is to modify the above version of draw-polygon so that
it connects the last posn to the first one. Here we discuss the
second alternative; the exercises cover the other two.

To add the last item of a-poly at the beginning, we need something
like

(cons (last a-poly) a-poly)

where last is some auxiliary function that extracts the last item
from a non-empty list. Indeed, handing this list to the connect-the-dots
function gives us the definition of draw-polygon, assuming we define
last: see figure 34.

Formulating the wish list entry for last is straightforward:

;; last : polygon -> posn
;; to extract the last posn on a-poly
(define (last a-poly) ...)

And, because last consumes a polygon, we can reuse the template
from above:

Turning the template into a complete function is a short step. If the list
contains just one item, this item is the desired result. If
(rest a-poly) is not empty, (last (rest a-poly))
determines the last item of a-poly. The complete definition of
last is displayed at the bottom of figure 34.
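A sketch of last along these lines, together with the list that results from adding the last corner to the front, might look as follows. It assumes a Racket-style Scheme; posn is defined locally, and the names p1, p2, p3, triangle, and closed-triangle as well as the coordinates are made up for illustration. If the language already provides a function named last, this definition shadows it.

```scheme
(define-struct posn (x y))

;; last : polygon -> posn
;; to extract the last posn on a-poly
(define (last a-poly)
  (cond
    [(empty? (rest a-poly)) (first a-poly)]
    [else (last (rest a-poly))]))

;; three corners of an example triangle
(define p1 (make-posn 10 10))
(define p2 (make-posn 60 10))
(define p3 (make-posn 10 60))
(define triangle (cons p1 (cons p2 (cons p3 empty))))

;; the last corner added to the front: p3, p1, p2, p3
(define closed-triangle (cons (last triangle) triangle))
```

Connecting the dots of closed-triangle yields the lines from p3 to p1, from p1 to p2, and from p2 to p3, that is, all three sides of the triangle.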

In summary, the development of draw-polygon naturally led us to
consider a more general problem: connecting a list of dots. We solved the
original problem by defining a function that uses (a variant of) the more
general function. As we will see again and again, generalizing the purpose
of a function is often the best method to simplify the problem.

Exercise 12.3.1.
Modify draw-polygon so that it adds the first item of
a-poly to its end. This requires a different auxiliary function:
add-at-end.

Exercise 12.3.2.
Modify connect-dots so that it consumes an additional
posn structure to which the last posn is connected.

Then modify draw-polygon to use this new version of
connect-dots.

Accumulator: The new version of connect-dots is a simple
instance of an accumulator-style function. In part VI we will
discuss an entire class of such problems.

Newspapers often contain exercises that ask readers to find all possible words
made up from some letters. One way to play this game is to form all possible
arrangements of the letters in a systematic manner and to see which
arrangements are dictionary words. Suppose the letters ``a,'' ``d,'' ``e,''
and ``r'' are given. There are twenty-four possible arrangements of these
letters:

ader  daer  dear  dera
aedr  eadr  edar  edra
aerd  eard  erad  erda
adre  dare  drae  drea
arde  rade  rdae  rdea
ared  raed  read  reda

The three legitimate words in this list are ``read,'' ``dear,'' and
``dare.''

The systematic enumeration of all possible arrangements is clearly a task
for a computer program. It consumes a word and produces a list of the
word's letter-by-letter rearrangements.

One representation of a word is a list of symbols. Each item in the
input represents a letter: 'a, 'b, ..., 'z.
Here is the data definition for words:

A word is either

empty, or

(cons a w) where a is a symbol ('a,
'b, ..., 'z) and w is a word.

Exercise 12.4.1.
Formulate the data definition for lists of words. Systematically make up
examples of words and lists of words.

Let us call the function arrangements. Its template is that of a
list-processing function:

Given the contract, the supporting data definitions, and the examples, we
can now look at each cond-line in the template:

If the input is empty, there is only one possible rearrangement of
the input: the empty word. Hence the result is (cons empty empty),
the list that contains the empty list as the only item.

Otherwise there is a first letter in the word: (first a-word) is
that letter, and the recursion produces the list of all possible
rearrangements for the rest of the word. For example, if the list is

(cons 'd (cons 'e (cons 'r empty)))

then the recursion is (arrangements (cons 'e (cons 'r empty))). It
will produce the result

(cons (cons 'e (cons 'r empty))
  (cons (cons 'r (cons 'e empty))
    empty))

To obtain all possible rearrangements for the entire list, we
must now insert the first item, 'd in our case, into all of these
words between all possible letters and at the beginning and end.

The task of inserting a letter into many different words requires
processing an arbitrarily large list. So, we need another function, call it
insert-everywhere/in-all-words, to complete the definition of
arrangements:
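The two cases discussed above suggest a definition along the following lines. This is a sketch that assumes a Racket-style Scheme. The auxiliary function stays on the wish list: the placeholder below merely signals an error, so only the empty case can be tried until the auxiliary function has been developed.

```scheme
;; insert-everywhere/in-all-words : symbol list-of-words -> list-of-words
;; wish-list entry; this placeholder only signals an error
(define (insert-everywhere/in-all-words s a-low)
  (error "insert-everywhere/in-all-words is still on the wish list"))

;; arrangements : word -> list-of-words
;; to create a list of all rearrangements of the letters in a-word
(define (arrangements a-word)
  (cond
    [(empty? a-word) (cons empty empty)]
    [else (insert-everywhere/in-all-words
           (first a-word)
           (arrangements (rest a-word)))]))
```

For now, (arrangements empty) produces (cons empty empty); every non-empty word reaches the placeholder and stops with the error message.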

Exercise 12.4.2.
Develop the function insert-everywhere/in-all-words. It consumes a
symbol and a list of words. The result is a list of words like its second
argument, but with the first argument inserted between all letters and at
the beginning and the end of all words of the second argument.

Hint: Reconsider the example from above. We stopped and decided that we
needed to insert 'd into the words (cons 'e (cons 'r empty)) and
(cons 'r (cons 'e empty)). The following is therefore
a natural candidate: