Look at the computer you use --- it was assembled from ``prefab''
pieces --- the display was built by one manufacturer, and it is bolted
and wired to the ``motherboard'' and CPU, which were built by another.
The case was built again by someone else, as was the disk drive
and any other gadgets (printers, DVD drives, cameras) that are attached.

A little history

The idea of assembling a product from parts that fit
together is not new. It was first devised by firearms manufacturers
in the eighteenth century. Gunmakers found it impossible to build
and service all the guns wanted by hunters (and, alas, soldiers).
To cope, the gunmakers designed standard-sized parts that
fitted together, like a jigsaw puzzle does, into a gun.
This way, more guns could be built, and when a part on the gun broke,
it could be replaced by a new, duplicate part.

The idea of standard parts was used by Henry Ford
to develop the modern assembly line,
where entire automobiles were built from standard-sized parts,
connected in a standard way. Almost all products we buy today (even food!)
are built using assembly-line techniques.

Programs are assembled from parts, too

Assembly-line ideas also apply to computer programs:
all large computer programs
are assembled from parts (``components'') that connect together in
standard ways. Since programs look like recipes in an odd
mathematical language, it is not immediately clear what it means
to build a recipe from ``components.'' But if you think about how a book
is built from a sequence of chapters, and how a person
who first appears
in one chapter reappears in a later chapter, then you are getting
the idea --- the characters and settings are ``parts'' that are ``assembled''
into a plot. An author rarely writes a book
from Chapter First to Chapter Last with all the characters, setting, and plot
completely fixed
in place, and large computer programs are
rarely written by one person going from the first command to
the last --- both books and programs are built from parts
that connect together.

There are two standard ways of building ``parts'' of
computer programs that connect together: functions
and modules. Stated crudely, functions are ``small parts''
(like nuts and bolts) that are used in lots of places, and modules
are ``big parts'' (like engine and transmission assemblies) that must be
bolted together to make the final product. In this chapter, we learn
how to manufacture the ``small parts''; the next chapter tells us
about the ``big parts.''

If we look at the programs in the previous two chapters,
we see that they are messy --- they are getting too large to
understand easily,
mostly because they contain lots of little, distracting details.
Also, the exact same details are appearing
in more than one place --- some of the commands were written
in one position and ``cut and pasted'' into another. This is bad,
especially if we have to change the program some months in the future ---
we will be forced to reread the entire program to find all the cut-and-paste
coding that must be changed.

About fifty years of programming
experience suggests that a human can understand and keep in mind
about 50 lines of computer code at one time --- just
50 lines! This is not enough to build any real program. How can
we develop programs that are longer? In addition to writing
good comments for the code, the solution lies in writing
the program in pieces (of no more than 50 lines each) and connecting
them together, much like a book is written in chapters.

The function is the best-known construction
for building program pieces. (Functions are also called
``subroutines,'' ``procedures,'' and ``methods.'' There are some
nuances in using these terms, but they don't matter here.)
Simply stated, a function is a sequence of commands bundled up
with a name,
and when we later mention the name, the bundled-up commands execute.
(It is like when your instructor barks, ``Read Chapter 10!'' You ``execute''
Chapter 10.)

In your Python program,
you are already using functions that were written by the Python designers.
Let's review a few:

print EXPRESSION:
The word, print, is the name of some commands that
tell the computer to display the (answer computed by)
EXPRESSION.

The EXPRESSION is called the function's argument:
it is extra information that the function requires
to do its job, which here is printing something on the display.

We use the function in examples like, print "Your score is :", score.

raw_input(STRING):
The word, raw_input, is the name of some commands that
display the argument, STRING, on the computer's display. When the user
types some symbols and presses Enter, the commands grab the symbols,
build a string from them, and
return (give back) the string for the program to use.

We use the function in examples like, name = raw_input("Please type your name: ")

int(STRING):
The word, int, names some commands that examine the argument
STRING, make an integer from it, and return the integer.

We use the function in examples like, age = int(raw_input("Please type your age: ")). This example also uses the raw_input function,
whose answer becomes the argument for the int function.

We have used these functions many times, and indeed, we often
use them more than once in the same program.
Such functions are the ``nuts and bolts'' of programming, because they
are little ``packets of commands'' that do useful things
when they execute. Without these functions, we would be forced to
write long sequences of ugly commands for painting letters on
the display, or talking to the buttons on the keyboard, or moving about
the bits inside a computer word full of characters.

Just like there are different forms of parts for building a TV or car,
there are different forms of functions:

fixed parts: These are primitive parts, like nuts and bolts.
For programs, these are functions that do exactly the same computation
each time they are used. One built-in Python example is
exit(), which kills your program and the Python interpreter.

adjustable parts: These are parts that are adjusted
or reset each time they are inserted into an assembly. A spring is
a simple example; another is a volume control (potentiometer).
For programs, these are functions that use arguments to do their
work; an example is print ARGUMENT.

responsive parts: These are parts that respond when they are
plugged in, like a light bulb. For programs, these are functions
that return results; Python examples are
int(STRING), which uses the STRING
of digits to return an integer, and len(SEQUENCE), which returns
an integer that is the length of the argument, SEQUENCE.
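As a rough illustration of the three forms, here is a minimal sketch. The function names (printDivider, greet, double) are hypothetical and do not come from this chapter, and print is written with parentheses so that the sketch runs under both Python 2 and Python 3.

```python
# Hypothetical examples of the three forms of functions.

def printDivider():
    """A fixed part: it does exactly the same thing on every call."""
    print("--------------------")

def greet(name):
    """An adjustable part: its argument adjusts what it does."""
    print("Hello, " + name + "!")

def double(number):
    """A responsive part: it returns a result to its caller."""
    return number * 2

printDivider()
greet("Jane")
answer = double(21)   # the returned int is saved in a variable
print(answer)
```

Only double hands an answer back to the program that called it; the other two do their work and return nothing useful.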

The function is named error, and the keyword, def,
tells us (and the Python interpreter) that this line defines
a new function. The matching parentheses, (),
state that the function requires no argument. Note the colon afterwards.
(The space between the right parenthesis and the colon is optional,
and most people omit it.)

The function's commands (its body) are the two indented commands
underneath.
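A definition that matches this description might look like the following minimal sketch. The wording of the message is an assumption; the function's name, its empty parentheses, and its two-command body that uses a variable named text follow the surrounding description. (print is written with parentheses so that the sketch also runs under Python 3.)

```python
def error():
    text = "Error: I cannot use that input. Sorry.\n"   # assumed wording
    print(text)

error()   # mentioning the name executes the two bundled commands
```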

If you wish, you may pretend that

A function, like error,
is a ``little program'' that is started by
typing its name. Most importantly, the little program can be
started by another program.

Figure 1 shows a sample program that will manage cash withdrawals
for an ATM (automated teller machine);
the program asks the user to type a dollars-and-cents amount,
and if the amount is
in error, then the error function is executed by the main program:

The Python interpreter reads the program and encounters
def error():. Both the function's
name and its body are saved in the program's namespace for later
use. The function does not yet execute.

At the next line,
the user is asked to type an input.
The user types -4, and the int, -4, is
assigned to dollars.

The condition, dollars < 0, computes to True, so
error() is called:

The execution ``pauses'' at the point of the call,
and the computer retrieves the two commands named by error.

The two commands execute,
printing the message in the command window.

When the function's body is finished, execution resumes
at the point where the program was paused (where error() was
called).

The user is asked to type a second input, and a similar sequence of
steps occurs.

Remember:

When a function is
called, execution pauses at the call, and the
commands in the function's body are retrieved and executed. When the
function's body finishes,
execution resumes at the point where it was
paused.
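The pause-and-resume behaviour can be observed directly. In this minimal sketch, a hypothetical list named trace records the order in which commands execute; the list and the recorded strings are illustrative and are not part of the ATM program.

```python
trace = []   # hypothetical log of the order of execution

def error():
    trace.append("inside error's body")

trace.append("before the call")
error()                      # execution pauses here while error runs
trace.append("after the call")

for step in trace:
    print(step)
```

The entry made inside the function lands between the two entries made by the main program, which is the order the rule above predicts.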

Here are some diagrams that visualize these steps:
The program starts in this configuration:

and error's definition is saved for later use.
(Actually, what is saved is the address, that is, the instruction number,
of the start of error's body.)

After the input is read and is found to be negative,

function error is called, and its commands are executed just as if
they were copied into the position where the function was called:

The function's commands execute
while the program is paused, and they use a new, private namespace that is
constructed for the function's use:

Once the function finishes, execution resumes in the program:

The diagrams reveal that function error has its own,
``local'' namespace for its own, ``local'' variable, text.
The function's namespace disappears when the function finishes.
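A small experiment suggests the same thing. In this sketch (the message wording is assumed), variable text exists only while error executes, so the main program cannot see it once the call is over:

```python
def error():
    text = "Error: I cannot use that input. Sorry.\n"   # local to error
    print(text)

error()

# After the call, error's private namespace is gone, so the name
# text is not defined in the main program's (global) namespace:
print("text" in globals())
```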

It's true that function error does just a small task and
is not essential, but
compare the program in the Figure with the one that follows below,
without the function.
The one below is a bit harder to understand because there are more details
in the way:

Also,
if we decide to change the error message to something different,
then we must change the message in two places in the second program.
In the first program, we change just the function definition, say
like this:

def error() :
    print "Error. Sorry.\n"

When we define ``helper'' functions like error for
doing such ``little things,'' we will find it easier
to customize or change
our program in the future.

The examples of the error-printing functions show us the first
use of functions:

A function can
do one small thing that must be done over and over
from different places in a program.

We call such functions helper functions, because they
help us
write smaller, simpler programs with function calls
in place of duplicated, copy-and-paste codings.

It is important to insert into every function a comment that describes
what the function does. This is so important in practice that the
Python interpreter reads and saves the comment for later use.
But you must write
the comment as a string, placed immediately after the function's
header line, if you want the Python interpreter to remember the comment.
Here is the reformatted error function, using
a Python-style documentation string:

The documentation string is indented just like the rest of the body
of the function, and it begins and ends with three double-quotes.
(For this reason, the documentation string can span multiple lines.)
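A version of error that carries a documentation string might be sketched as follows. The wording of the documentation string and of the message are assumptions made for illustration.

```python
def error():
    """error prints a message that tells the user the input was bad.

    It needs no arguments and returns no answer.
    """
    text = "Error: I cannot use that input. Sorry.\n"   # assumed wording
    print(text)

error()   # the documentation string does not change what the call does
```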

The documentation string causes no harm when the function executes,
and in fact,
you can now print it, using the print command.
Try this:

print error.__doc__

This prints error's documentation string.
Also, there are some development-environment tools for Python that use the
documentation strings to generate on-line documentation pages for
Python programs. So, we will use the documentation strings in our
functions.

Most of Python's prewritten functions have documentation strings.
For example, try
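One built-in function to try, chosen here purely as an illustration, is len:

```python
# Print the documentation string that comes with the built-in len.
print(len.__doc__)
```

The exact wording that appears depends on the version of Python in use.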

A function is a ``small piece'' of a program, and like a small mechanical
part, like a nut or bolt, we should be able to test the function
by itself before we insert it into a program. Python makes this
easy to do, by means of interactive testing.

This tells the Python interpreter to read Test.py,
even though it is an incomplete program.
Notice the -i --- this states that, after the program
is read, the Python interpreter allows you to use the
functions and variables in Test.py just as if you
had typed them by hand.

Now you can interactively test the function in Test.py --- type

>>> error()

This executes the function, just as if it were called from within a program.
You will see printed, as expected,

Functions might also require arguments, but they do not receive
them via raw_input. Instead, the arguments are given
to the function by the program that calls it. The argument
enters the function via a new variable, called a parameter.
Here is one example:

The definition line begins with def, followed by the function's
name, followed by a variable name in parentheses, and then the colon.
(The documentation string will be examined
in a moment.)
Here, message is a Python variable, just like variable text.
It is called a parameter (or formal parameter),
and it ``connects'' to an argument, somewhat like this:

The documentation string summarizes what the function does and
explains how the parameter will be used within the function.
This information is helpful when others call the function.
(Also, it is a standard Python style to leave a blank line between the
first line of the documentation string, which explains what the
function does, and the rest, which explains the parameter(s).)

In errorMessage("You typed a negative number."), the
"You typed a negative number." part is called the argument.
When the function is called, the argument
is assigned to the parameter.

Here, the expression, "The input, " + data + ", is incorrect."
is computed to its answer, a string, before function errorMessage
starts its work. For example, if the user types 123
as the input for variable data, then
the argument to the function computes to
"The input, 123, is incorrect." and only then does the function
start (by assigning the string to parameter message).
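The order of events can be sketched as follows. The body of errorMessage shown here is an assumption; the parameter name, the style of the documentation string, and the argument expression follow the description above.

```python
def errorMessage(message):
    """errorMessage prints an error message.

    parameter: message - a string, the text that explains the error
    """
    print("Error: " + message)    # assumed body

data = "123"    # stands in for a string that the user typed

# The argument expression is computed to a string first; only then
# is the string assigned to parameter message and the body executed.
errorMessage("The input, " + data + ", is incorrect.")
```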

Look again at the printAlert function; it isn't so easy
to remember the order we must use for its three arguments.
In this situation, Python allows you to call the function with
keyword arguments, where you write each argument as an assignment
to the name of its parameter. You
do it like this:
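As a minimal sketch of the idea, the stand-in function below takes three parameters. Its parameter names (message, symbol, width) and its body are hypothetical and need not match the printAlert function referred to above.

```python
def printAlert(message, symbol, width):
    """printAlert prints message between two border lines.

    parameters (hypothetical, for illustration only):
      message - a string, the text to display
      symbol - a one-character string used to draw the border
      width - an int, how many symbols each border holds
    """
    border = symbol * width
    print(border)
    print(message)
    print(border)

# Positional call: we must remember the order of the parameters.
printAlert("Low balance", "*", 20)

# Keyword call: each argument names its parameter, so any order works.
printAlert(width=20, symbol="*", message="Low balance")
```

Both calls print the same three lines; the keyword form just spares us from remembering the order.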

The last line,
return answer, commands that the number named by answer
be inserted at exactly the position where the function was called.

For example, if the calling command was this:

one_third = reciprocal(3)

then the answer, 0.33333333333333331 ``replaces'' the function's call
and is assigned to variable
one_third. Or, if the calling command was this:

print "1/3 is", reciprocal(3)

then the answer ``replaces'' the function's call and we see printed
(print rounds a float to twelve significant digits when it displays it),

1/3 is 0.333333333333
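A definition consistent with this description might be sketched as follows. The test on n and the final return answer follow the text; the value given to answer when n is 0 is an assumed placeholder.

```python
def reciprocal(n):
    """reciprocal computes 1/n.

    parameter: n - a number, which should be nonzero
    returns: the reciprocal of n, a float
    """
    if n != 0:
        answer = 1.0 / n
    else:
        answer = 0.0    # assumed placeholder for a bad argument
    return answer

one_third = reciprocal(3)     # the returned float replaces the call
print(one_third)
```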

Here is a second example, where we write a function to do something
slightly complicated and important. (We will use the function in
the ATM program we are building.)

Accounting programs maintain balances in cents-only format
(for example, $30.95 is stored as 3095), because
bankers hate to lose fractions of pennies due to fractional-arithmetic
imprecision. But when we print a cents-only amount,
we must reformat it as a dollars-cents string (so that 3095
prints as "30.95"). Here is a function that
does the reformatting:
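One way such a function might be written is sketched below. The body is an assumption; the name formatDollarsCents, the parameter amount, and the final return answer follow the walk-through in this section, which also shows the returned string beginning with a dollar sign. (print is written with parentheses so that the sketch runs under both Python 2 and Python 3.)

```python
def formatDollarsCents(amount):
    """formatDollarsCents reformats a cents-only amount for printing.

    parameter: amount - a nonnegative int, a cents-only amount
    returns: a string of the form "$D.CC", for example "$32.05"
    """
    dollars = amount // 100      # whole dollars
    cents = amount % 100         # leftover cents, 0 to 99
    answer = "$" + str(dollars) + "."
    if cents < 10:
        answer = answer + "0"    # pad so that 5 cents reads "05"
    answer = answer + str(cents)
    return answer

current_balance = 3205
print("Balance is " + formatDollarsCents(current_balance))
```

The padding step is the delicate part: without it, 3205 would come out as "$32.5".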

To learn the string it must print,
the print command computes
the expression,

"Balance is " + formatDollarsCents(current_balance)

To compute this,
formatDollarsCents(current_balance) is invoked.

Execution pauses in the middle of the expression
so that formatDollarsCents can compute and return
an answer:

Parameter amount is assigned 3205 (which is the value in variable
current_balance).

The commands in formatDollarsCents execute one by
one.

The last command,
return answer, returns
"$32.05" to the exact position
where the function was called.

Execution resumes, and
"Balance is " + "$32.05" computes to
"Balance is $32.05".

The string becomes the argument to the print function,
which prints it on the display.

The formatting function will be used within the ATM
program. The function contained some delicate computation steps, and it
was good that we wrote and tested the function separately.
Indeed, this is a second standard use of functions:

A function can be a ``little program'' that does
an important activity.

Such a function should
be designed and tested by itself
before it is inserted into a completed program.

The improved ATM program

Here is the latest version of the ATM program; it uses the function
we just wrote to print the amount the user wishes to withdraw:

A function is a ``little program,'' and it will compute an appropriate
answer if it is given appropriate arguments for its parameters.
The documentation strings that we write describe the appropriate arguments
and answers.

The restrictions on a function's arguments that make the arguments appropriate
are called preconditions, and the form of answer that is
computed and returned is called a postcondition.
Often the precondition and postcondition are stated in informal English,
but if we wish to formalize the algebraic knowledge produced by a
function, we can write pre- and postconditions in an algebraic way.

For example, here is the reciprocal function from the previous
section, where its documentation string is augmented with more precisely
stated pre- and postcondition information:

The precondition lists an algebraic condition under which the function
will behave properly, and the postcondition lists the algebraic property
of the function's answer (named, ``answer'') that holds true,
provided that the function's arguments made the precondition
hold true.

Let's look more closely at this relationship, that is, how a precondition
can be used to calculate how a function behaves and to derive
the knowledge asserted in the postcondition. We use the usual
algebraic reasoning on the function's body:

With the precondition, we guarantee that the
if-command will compute its test to True and the division
will take place. This ensures the truth of the postcondition.

A function's pre- and postconditions are used to do logical reasoning
when the function is called. For example, the main program might
invoke reciprocal. We can reason about the knowledge that is
produced by the function call, like this:

immediately before the invocation, we calculate
that the logical condition, PRE(args), holds true,
that is, where args are substituted for params;

then we
assert that POST(x) holds true after the function finishes, that is,
where x is substituted for answer.

We saw this reasoning applied in the previous call to reciprocal:

x = ...
# assert: x > 2
# implies: x != 0
# Since reciprocal's precondition is n != 0,
# and since x is the argument for parameter n, we must show that
# x != 0 to show that we are calling reciprocal correctly.
# This is true here.
y = reciprocal(x)
# reciprocal's postcondition is answer == 1.0/n,
# and since the answer is assigned to y, we have that
# y == 1.0/x

Since we proved that the precondition is true for
argument, x, we asserted the postcondition for
variable y.

Of course, the computer will let us call reciprocal incorrectly,
like this:

x = 0
y = reciprocal(x)

but no useful knowledge is produced from the call,
because the function's
precondition is False. If it is crucial that the function is
computed only when its precondition is True, then we might use a
Python assert command to enforce the precondition.
This might be done as follows:
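A minimal sketch of this idea appears below. It assumes that the precondition is n != 0 and the postcondition is answer == 1.0/n, as stated in the reasoning earlier in this section; the wording of the assert's message is an assumption.

```python
def reciprocal(n):
    """reciprocal computes 1/n.

    precondition: n != 0
    postcondition: answer == 1.0 / n
    """
    assert n != 0, "reciprocal's precondition, n != 0, is violated"
    answer = 1.0 / n
    return answer

y = reciprocal(4)     # the precondition holds, so y == 1.0/4
print(y)

try:
    reciprocal(0)     # the precondition is False: the assert halts the call
except AssertionError as problem:
    print("Call rejected: " + str(problem))
```

With the assert in place, an incorrect call stops at once instead of quietly producing an answer about which nothing is known.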

The postcondition relates the structure of the
answer string to the function's parameter.
This knowledge ensures users of the function that the function computes
correct information for printing, like this:

Using preconditions and postconditions, we can write precise algebraic
properties of functions and insert these properties into the programs
that call the functions. In this way, we can continue to calculate
the
precise knowledge calculated by the main program that calls the functions.

Functions that do ``little things'' are usually self contained:
Arguments are deposited into their parameters, and the functions
print or return answers.
But functions can also
build and help maintain a program's data structure,
which is not contained within the function. Such a data structure
is sometimes called a global variable.

We can understand global variables with a simple example: Say that
a program uses a ``clock,'' and
every sixty clock ticks, an alert must sound.
The program makes a variable for the clock and starts it at zero:

clock = 0

Within the program, there are places where the clock should tick --- say,
after every third instruction. It is best to write a helper function,
tick(), to make the clock tick by 1. The function is called
within the program somewhat like this:

Look at the first command in the body of tick:
global clock tells the Python interpreter where to find
the variable that must be incremented.
When the function is called and the global clock
command is executed,

a notation is placed in tick's namespace that variable
clock is found in the ``global'' namespace.

When the following assignment, clock = clock + 1
executes, the notation in tick's namespace directs the
computer to find the value of clock in the global namespace.
The assignment to clock is done to the variable in the
global namespace, and the function finishes:

The ATM program with a global variable

The ATM program we are developing should consult the customer's
bank account to learn the customer's balance before conducting
any withdrawal.

In reality, the customer's bank account is kept in a separate database
program at the bank. Such a database will be a data structure --- like
a list --- of customers' balances. As a simple example,
say that each customer has a bank-account number that is a nonnegative
int, such as 0 or 1 or 2 .... Then, we can keep the balances for
the bank's customers in a list. For example,
if the bank has three customers, where
account 0 has a balance of $100.50, and
account 1 has a balance of $30.00, and
account 2 has a balance of $250.66,
then the list of accounts might be defined as

all_accounts = [10050, 3000, 25066]

where the amounts are stored in pennies.

If a customer wishes to withdraw money from their account,
the customer's account in the list must be consulted, and the
amount withdrawn must be subtracted from the account's balance.
It is best to write functions that get a customer's balance
and that do withdrawals.
This is because the list of accounts is a critical piece of information
(to the bank!) and it is best to use maintenance functions that
correctly maintain the list.

========
# BankDatabase is a simple simulation of a bank's database.
# The database remembers the cash saved for each account number.
# This is done with a list of ints.  For example, the list,
#    all_accounts = [10050, 3000, 25066]
# has recorded that
#    account 0 has a balance of $100.50, and
#    account 1 has a balance of $30.00, and
#    account 2 has a balance of $250.66

# Here is a sample database to get started.
# (Usually, this information is read from a disk file.)
all_accounts = [10050, 3000, 25066]

#### Maintenance functions for the database:

def getBalance(account_num) :
    """getBalance returns the account's balance

       parameter: account_num - an int, the account number
       returns the balance, in cents
    """
    global all_accounts
    return all_accounts[account_num]

def withdraw(account_num, cash) :
    """withdraw removes cash from balance.  If cash is greater than balance,
       then only the remaining balance is withdrawn.

       parameters: account_num - an int, the account number
                   cash - an int, a cents-only amount
       returns the amount of cash withdrawn from the balance
    """
    global all_accounts
    balance = all_accounts[account_num]
    if cash > balance :
        amount_withdrawn = balance
    else :
        amount_withdrawn = cash
    # revise the account info:
    all_accounts[account_num] = balance - amount_withdrawn
    return amount_withdrawn
========

These two functions ``maintain'' the bank accounts.
Since the two functions share the list, all_accounts,
it is global to both.
For example, if an ATM wishes to learn the balance for account 2,
it would call the function, getBalance, like this:

balance = getBalance(2)

and a withdrawal of $10 from the same account would be done by this
function call:

cash_withdrawn = withdraw(2, 1000)

Here is the final version of the ATM program with all its functions in
place. It uses the simulated bank database seen above.

Of course, a real-life bank database would be a huge data structure
of bank accounts that live in a separate program on a separate
computer. The ATM program would communicate with the database to
obtain balance and withdrawal information. In such a case,
the functions are used by the ATM program to communicate with the
database.
Even though our ATM example is simplistic, it is faithful to a
real-life ATM and bank database. In the next chapter, we learn
how to write programs in separate files that can communicate with
each other by means of functions.

The clock and ATM examples in the previous section showed how
functions can correctly maintain a global variable.
We call such functions maintenance functions.
This pairing of a global variable with its maintenance functions
is so important to building large programs
that we give three examples now.

Bank account

In the previous section, we modelled a
bank account that might be used with an ATM. The example is important and
we repeat it here, along with some additional functions that
let a bank officer add new accounts and make deposits.

==================================================
# BankDatabase is a simple simulation of a bank's database.
# The database remembers the cash saved for each account number.
# This is done with a list of ints.  For example, the list,
#    all_accounts = [10050, 3000, 25066]
# has recorded that
#    account 0 has a balance of $100.50, and
#    account 1 has a balance of $30.00, and
#    account 2 has a balance of $250.66

# Here are some sample accounts to get started.
# Usually, this information is read from a disk file.
all_accounts = [10050, 3000, 25066]

# Maintenance functions for the database:

# A banker uses these functions to create new accounts and deposit money:

def addAccount(cash) :
    """addAccount creates a new bank account and assigns to it
       an account number

       parameter: cash - an int, the initial deposit
       returns: the account number, an int, for the newly created account
    """
    global all_accounts
    all_accounts = all_accounts + [cash]
    new_account_num = len(all_accounts) - 1
    return new_account_num

def deposit(account_num, cash) :
    """deposit adds cash to an account.

       parameters: account_num - an int, the account number
                   cash - an int, the amount to deposit
    """
    global all_accounts
    all_accounts[account_num] = all_accounts[account_num] + cash

# The ATMs send input to these functions, which compute and return answers:

def getBalance(account_num) :
    """getBalance returns the account's balance

       parameter: account_num - an int, the account number
       returns the balance, in cents
    """
    global all_accounts
    return all_accounts[account_num]

def withdraw(account_num, cash) :
    """withdraw removes cash from balance.  If cash is greater than balance,
       then only the remaining balance is withdrawn.

       parameters: account_num - an int, the account number
                   cash - an int, a cents-only amount
       returns the amount of cash withdrawn from the balance
    """
    global all_accounts
    balance = all_accounts[account_num]
    if cash > balance :
        amount_withdrawn = balance
    else :
        amount_withdrawn = cash
    # revise account info:
    all_accounts[account_num] = balance - amount_withdrawn
    return amount_withdrawn
=========================================================

The functions, withdraw and getBalance, do the
maintenance and are carefully written so that the value of
balance is never altered incorrectly (e.g., a negative
value for balance never arises).
People and ATMs that wish to examine and alter the account's
balance use the maintenance functions to do this delicate work ---
they do not
assign to balance directly.
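The care taken by withdraw can be seen in a small experiment. The sketch below restates withdraw in compact form so that it runs on its own; the use of min is a shorthand for the if-else in the listing above, and the sample call is illustrative.

```python
all_accounts = [10050, 3000, 25066]   # sample balances, in cents

def withdraw(account_num, cash):
    """Compact restatement of withdraw from the listing above."""
    balance = all_accounts[account_num]
    amount_withdrawn = min(cash, balance)   # never more than the balance
    all_accounts[account_num] = balance - amount_withdrawn
    return amount_withdrawn

# Account 1 holds $30.00; ask for $50.00:
cash_withdrawn = withdraw(1, 5000)
print(cash_withdrawn)      # only the 3000 cents that were there
print(all_accounts[1])     # the balance stops at 0, not -2000
```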

The above coding would almost certainly live in a separate file on
a separate computer; the bank's employees and the ATMs would
be situated on other computers and use their own programs to
call the withdraw and getBalance functions.
(We learn how to make this arrangement in the next chapter.)

Office telephone directory

Say that our company must build and maintain an electronic
telephone directory that is shared by its employees.
For design reasons and for practical reasons,
we build the telephone directory by itself:
We choose a data structure for the telephone directory
and functions that will correctly maintain the directory.

Perhaps the directory looks like the one from the previous chapter,

so we use a dictionary data structure, where information
is stored like this:

The company's employees wish to
(i) look up
phone numbers, (ii) insert new people and their phone numbers,
and (iii) delete people who quit the company.
Since there are three actions one can do with the telephone book
(dictionary), we write three maintenance functions, one for each action.

Each function is simple to write, and we quickly design
and code the computerized telephone book:

========================================================
# The telephone book: It maps a person's name to their phone number,
# e.g., tel_book = {"Jane Sprat": 4098, "Jake Jones": 4139, "Lana Lang": 2135}
tel_book = { }

# The functions that maintain the telephone book:

def lookup(name) :
    """lookup finds the phone number for name and returns it

       parameter: name - the person we look for
       precondition: name in tel_book
       returns: the phone number for name.  If name not in the book, -1 returned
       postcondition: number == tel_book[name]
    """
    global tel_book
    number = -1
    if name in tel_book :
        number = tel_book[name]
    return number

def insert(name, number) :
    """insert inserts the new name and number into the book.
       If the name is already in the phone book, _no action is taken_!

       parameters: name - the person; number - their phone number
       precondition: name not in tel_book
       postcondition: tel_book[name] == number
    """
    global tel_book
    if name not in tel_book :
        tel_book[name] = number

def delete(name) :
    """delete removes the entry for name in the book.

       precondition: name in tel_book
       postcondition: name not in tel_book
    """
    global tel_book
    if name in tel_book :
        del tel_book[name]
=================================================================

Next, we should place this coding in a file and interactively test
the functions, one at a time. The test session might go like this:
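Written as a script rather than typed at the >>> prompt, such a session might be sketched as follows. The maintenance functions are restated in compact form (without their documentation strings) so that the sketch runs on its own, and the particular calls are illustrative.

```python
tel_book = {}

def lookup(name):
    number = -1
    if name in tel_book:
        number = tel_book[name]
    return number

def insert(name, number):
    if name not in tel_book:
        tel_book[name] = number

def delete(name):
    if name in tel_book:
        del tel_book[name]

insert("Jane Sprat", 4098)
insert("Jake Jones", 4139)
print(lookup("Jane Sprat"))     # look up a name that is present
print(tel_book)                 # peek at the global variable
insert("Jane Sprat", 9999)      # duplicate name: no action is taken
delete("Jake Jones")
print(lookup("Jake Jones"))     # a missing name yields -1
```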

and so on. Notice that, in between our function testing,
we can interactively peek inside the
global variable and see its value.
This is useful for checking that
the variable is correctly updated by the functions.
We are allowed to do this
because we are the software developers, and we want to see all the
details within the program we build. But the people who will
use our telephone book will not do this --- they will use the
functions as the ``entry points'' into the telephone book.

Slide puzzle

When we model within the computer a chess game or the world's weather
patterns, we build an electronic version of the real-world
object we study. The standard approach to computer modelling a physical
object is to define a data structure to represent the object
and define functions that manipulate and maintain the object.

Last chapter we saw how to use a grid to model a slide puzzle,
which is a simple example of a physical game board.
We can extract the modelling of the puzzle and
the function that slides the puzzle's pieces.
Here are the data structure and its
maintenance functions:

The difficult details about sliding a piece into the puzzle's empty
space are all contained within function move. Similarly,
the details of formatting of the puzzle for printing are contained
within printPuzzle. This makes it truly easy to write
programs that play with the puzzle:

Summary

A function can do maintenance
to a global variable (data structure) to keep
the variable correctly updated.

When the global variable and its maintenance functions are used by a
larger program, the program calls the maintenance
functions to do the correct updating of the global variable.

In the next chapter, we will learn how to package a global variable
and its helper functions into a separate file --- a module ---
which is a ``big part''
or ``subassembly'' that can be connected to other files
to assemble
a big program.

When a function is called, arguments are assigned to parameters. This
is just assignment.
Since we can assign data structures like sequences and dictionaries
to variables, we can use data structures as arguments to parameters as well.
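As a small illustration, the sketch below passes a list and a dictionary as arguments. The function names total and addEntry and the sample data are hypothetical. Because binding an argument to a parameter is just assignment, the parameter book names the very same dictionary as phones, so the update made inside the function is seen afterwards.

```python
def total(numbers) :
    """total returns the sum of the ints in the sequence, numbers."""
    answer = 0
    for n in numbers :
        answer = answer + n
    return answer

def addEntry(book, name, number) :
    """addEntry saves number under name in the dictionary, book."""
    book[name] = number

scores = [3, 5, 7]
phones = {}

print(total(scores))               # the list is assigned to parameter numbers
addEntry(phones, "Ann", 5551234)   # the dictionary is assigned to parameter book
print(phones)
```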

The previous section emphasized the importance of writing maintenance
functions that contain technical details for updating and maintaining
important data structures. We learned in the previous two chapters
that data structures usually have important internal properties,
called data invariants, that must always be kept true.

For example, the slide puzzle program used a grid to represent
the puzzle. The key data invariant for the puzzle is this:

That is, no matter how often puzzle is updated, its cells hold
numbers that are exactly some permutation (mixed-up ordering) of 0 to 15.

A data structure's maintenance functions must maintain
the data structure's data invariant.
Since the maintenance functions are the only functions
that have permission to directly change the data
structure, we should always carefully check, when each maintenance
function is called and finishes its work, that the data invariant
still holds true.

For the slide puzzle in the previous section, it is clear that the
printPuzzle function does not alter the puzzle, so it
maintains the puzzle's data invariant. When we study the
move function, we notice that it alters the puzzle here:

Because we know that empty_space is a pair of two
numbers that remembers the coordinates of where 0 lives in the
puzzle, we can confirm that puzzle's data invariant is
kept true by these two commands, which exchange a 0 in one cell
with an integer in another cell.
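The invariant can also be checked mechanically. The sketch below assumes a 4-by-4 grid coded as a list of rows; the function name invariantHolds and the sample arrangement are hypothetical. It confirms that exchanging the 0 with a neighbouring piece keeps the invariant true, whereas a careless direct assignment breaks it.

```python
def invariantHolds(grid) :
    """invariantHolds returns True exactly when the cells of the 4-by-4 grid
       hold some permutation of the numbers 0 to 15.
    """
    cells = []
    for row in grid :
        cells = cells + row
    return sorted(cells) == list(range(16))

puzzle = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 0, 15]]
empty_space = (3, 2)      # where 0 lives

print(invariantHolds(puzzle))

# Exchange the 0 with the piece to its right, as a move would do:
(row, col) = empty_space
puzzle[row][col] = puzzle[row][col + 1]
puzzle[row][col + 1] = 0
empty_space = (row, col + 1)
print(invariantHolds(puzzle))    # still a permutation of 0 to 15

puzzle[0][0] = 15         # a careless direct update: 15 now appears twice
print(invariantHolds(puzzle))    # the invariant is broken
```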

Data invariants are critical to banking applications.
Consider one more time the bank-account example and note carefully
the data invariant attached to the account's balance:

This reasoning is absolutely critical to the bank, which does not
wish to give away more money than has been deposited!

Because we allow only the maintenance functions to alter the
data structures, we need check the data invariants only within
the maintenance functions.
This greatly reduces the work we must do to ensure that the entire
program, which might be truly huge, operates correctly.

We have already learned that functions are ``small parts'' that are useful
for

doing one small thing that must be done over and over from different
places in a program. This is called bottom-up programming,
and the functions are called helper functions.

doing one important thing that must be designed and tested by itself
before it is
inserted into the rest of the program. This is called
top-down programming. Such functions are sometimes called
top-down or stepwise-refined functions.

doing one thing to a global variable (data structure) to help keep
the global variable correctly updated. This is called modular programming, and the functions are called maintenance functions.

In the ATM program, we saw all three uses of functions:

bottom-up:
The errorMessage function was invented because error messages
were printed at multiple places in the program.

top-down: The formatDollarsCents function was written
to contain the commands that convert a cents amount
into a formatted dollars-and-cents amount.

modular: The withdraw function was written to
update balance each time cash is withdrawn.
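As a rough sketch of the first two functions, assuming that amounts are kept as whole, nonnegative numbers of cents, the codings might look like this. The bodies and the exact output format are assumptions, not the ATM program's own text.

```python
def errorMessage(text) :
    """errorMessage prints text as an error message.
       (bottom-up: it is called from many places in the program)
    """
    print("Error: " + text)

def formatDollarsCents(amount) :
    """formatDollarsCents formats a cents amount as dollars and cents.
       parameter: amount - a nonnegative int, the amount in cents
       returns: a string of the form $DOLLARS.CENTS
       (top-down: one important step, designed and tested by itself)
    """
    dollars = amount // 100
    cents = amount % 100
    return "$%d.%02d" % (dollars, cents)

print(formatDollarsCents(1205))
errorMessage("bad amount")
```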

All three reasons for coding functions will be used in big programs,
and we must look for opportunities to employ them.
Most programmers find bottom-up programming is easy to learn (you spot
duplicate code or you find yourself copying-and-pasting code
so you write a function that holds the duplicate code);
modular programming is a bit tougher to learn (you look at how the program
alters the main
data structure and you write a function containing the commands that
do the alterations);
and top-down programming is the toughest to do well (you think about
what are the ``big steps'' the program must do to solve the problem
and you define a function for each big step).

We now do a case study to illustrate how these techniques might
be used.

In Chapter 3, we built a checkbook-assistant program that logged
a sequence of bank-account deposits and withdrawals, maintained the
balance, and printed the ledger when asked.

The program we wrote (Chapter 3, Figure 3?) was acceptable, but if
we reread it, we see that there are too many little details in the
main program, the same details appear in multiple places, and
many of the commands do tiny steps on the ledger data structure.
Now that we know about functions, we can redesign the program and
do better. (By the way, many useful programs are written twice ---
once to understand how to solve the problem, and once to build
a good solution!)

Let's progress through the usual stages of program design and implementation.

Stage 2: Design the data structures

The behaviors suggest that a program (or a person) who logs
transactions and keeps a balance would require
a variable to remember the current balance and a ledger to remember
the transactions.
Here is a diagram
of how the data structures might look
after the first two transactions in the above example:

The diagram suggests that the ledger is a sequence of transactions.
We know that either a tuple or a list can be used to make a sequence;
for now, we reuse the tuple from Chapter 3, because once a transaction
is constructed and added to the ledger, there is no need to index it
and alter its contents.

As for the transactions,
each transaction might be a single string, or we might
make each transaction into a tuple of its own, so that the
ledger looks like a tuple of tuples:

The earlier sections in this chapter suggest that it will be
useful to write some functions that maintain the balance and ledger.
We will do this, but perhaps it is a bit too soon to know exactly
what these functions might be. (We might guess that we will
need a function that adds a new transaction to the ledger, and
a function that updates the current balance. But what parameters
should they use? What kind of answers should they return?
Maybe it is too soon to know!)

In a situation where we are unsure which functions to write, we
can continue to the next stage of development and later return
to writing functions that update the data structure.

Stage 3: Write the algorithm

In Chapter 3, we noted that the accounting program was processing
a sequence of user transactions and that there is a pattern for
doing this.
Here is the algorithm that was proposed in Chapter 3; it is the
standard indefinite-iteration pattern:

Notice something important: phrases like
``process a check-writing request'' and
``process a deposit'' and
``print the contents of the ledger and the balance''
are describing important computation steps that we will probably
want to design, write, and test one at a time.
This suggests we might make each of these steps into a function:

Clearly, we should bundle these steps into a function that we
write and test separately.

Even so,
there are still too many details in the deposit step.
First, a function that holds the commands that
read the deposit amount would be helpful --- we will need a similar
function for the check-writing step, anyway.
Also, we see a command that is altering the global variable, balance,
and there will be commands that alter the global
variable, ledger. These commands are good candidates for
bundling into functions that maintain the global data structures.

Before we invent these functions, let's quickly compare
the refinement of the deposit step
to the refinement
of the check-writing step:

read the amount of the check
read the date and payee
subtract the amount of the check from the current balance
append the check info to the end of the ledger

We see that the same commands will be needed
to read the amount of the check as the ones used for reading the
amount of the deposit! It is dead clear that we should define a function
for reading a dollars-cents amount.

We also see that
the last step, that of appending the transaction to
the ledger, is a repeat of the last step in the deposit
transaction. So, a function should be written for this, also.

Here are the two functions that will be called by both the deposit
and check-writing steps:

def logTransaction(kind, amount, message) :
    """logTransaction adds a transaction to the end of the ledger.
       parameters: kind - a string, either "deposit" or "check"
                   amount - an int, the amount of the transaction
                   message - a string, the details about the transaction
    """
    pass   # ...we must refine and write this one; for now, we use pass
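For the other job, reading a dollars-cents amount, the heart of the work is converting what the user typed into a number of cents. Here is a minimal sketch of that conversion. The name convertDollarsCents, the DOLLARS.CENTS input format, and the use of -1 to signal bad input are all assumptions; a function that reads from the keyboard would pass the typed string to it.

```python
def convertDollarsCents(text) :
    """convertDollarsCents converts a string like "12.50" into cents.
       parameter: text - a string of the form DOLLARS.CENTS
       returns: the amount in cents, an int, or -1 if text is badly formed
    """
    parts = text.strip().split(".")
    if len(parts) == 2 and parts[0].isdigit() \
       and len(parts[1]) == 2 and parts[1].isdigit() :
        return (int(parts[0]) * 100) + int(parts[1])
    else :
        return -1

print(convertDollarsCents("12.50"))
print(convertDollarsCents("twelve"))
```

Both the deposit step and the check-writing step can call the same function, which is the point of writing it once.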

Now, it is time to apply ``modular programming'': When a data structure
(or an important global variable), like balance or ledger,
is altered inside the algorithm, we should seriously consider writing
a function that holds the details about the alteration
and place that function next to the data structure.
This helps us see better how the data structure is constructed and maintained.

For the example,
we might define this function, which bundles together the changes
made by a deposit
to the global variables, balance and ledger:

This looks pretty, because the function calls are explaining
to the human reader the steps required to process a check.
(Some people call such readable commands self-documenting code ---
there is no need for extra comments to explain what the program is doing.)
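To make the modular style concrete, here is a minimal sketch in which every change to balance and ledger goes through a function placed next to the data structures. The names recordDeposit and recordCheck, the body given to logTransaction, and the sample transactions are hypothetical; the ledger is the tuple of tuples described in Stage 2.

```python
balance = 0     # the current balance, in cents
ledger = ()     # a tuple of transactions; each is a tuple (kind, amount, message)

def logTransaction(kind, amount, message) :
    """logTransaction adds a transaction to the end of the ledger."""
    global ledger
    ledger = ledger + ((kind, amount, message),)

def recordDeposit(amount, message) :
    """recordDeposit adds amount to the balance and logs the deposit."""
    global balance
    balance = balance + amount
    logTransaction("deposit", amount, message)

def recordCheck(amount, message) :
    """recordCheck subtracts amount from the balance and logs the check."""
    global balance
    balance = balance - amount
    logTransaction("check", amount, message)

recordDeposit(10000, "1 Sept, paycheck")
recordCheck(2500, "3 Sept, Acme Hardware")
print(balance)
print(ledger)
```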

Stage 4: Coding and Testing

Here is the program we have written, using the mixture of
bottom-up, top-down, and modular-programming techniques:

The program includes a function, formatDollarsCents,
that we studied earlier in the chapter.

At this point, we should test the program. If you do this, try to
``break'' the program by typing silly input values and also by
writing checks for more money than you deposit. It is clear that the
program must be improved before we give it to
an accountant to use!

Stage 5: Document

The previous Figure has already included commentary for the program, so that
you can understand how the data structures and functions operate. But
it is also a good
idea to check that the codings match
the comments that are inserted with them.

Modifying the program

Functions are ``small parts,'' and sometimes a part must be removed
and replaced by another. For practice, let's do this.

Just now, the program uses a ledger data-structure that is
a tuple of transactions, where each transaction is itself a
tuple, e.g.,

Because we used modular-programming techniques, we have one
function, logTransaction, which makes all the updates
to the ledger. This is the only function that must be changed in the
entire program! We ``unplug'' the existing logTransaction function
and replace it with this one:

Then
we change the initialization assignment at the top of the program
to say

ledger = [ ]

and we are finished!

This little alteration shows us how easy it is to alter and improve
a program that is built from functions --- we unplug one function and
replace it by its improvement.
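A minimal sketch of such a replacement, assuming that the new ledger is a list of the same (kind, amount, message) tuples, might be coded as follows. The body shown is one plausible coding and the sample transactions are invented.

```python
ledger = [ ]    # now a list of transactions, rather than a tuple

def logTransaction(kind, amount, message) :
    """logTransaction adds a transaction to the end of the ledger.
       parameters: kind - a string, either "deposit" or "check"
                   amount - an int, the amount of the transaction
                   message - a string, the details about the transaction
    """
    ledger.append((kind, amount, message))

logTransaction("deposit", 10000, "1 Sept, paycheck")
logTransaction("check", 2500, "3 Sept, Acme Hardware")

# Code that merely reads the ledger, like this loop, needs no change,
# because tuples and lists are traversed in the same way:
for transaction in ledger :
    print(transaction)
```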

To Conclude...

Algorithm refinement
is often a mix of bottom-up, top-down, and modular programming strategies.
If doing all of them at once is overwhelming, then
use the ``beginner's strategy''
to programming:

Build a first version of your program, as far as you can go, without
using any functions.

Then, look for little sequences of commands that are repeating
in multiple places. Extract the commands and bundle them into functions.

Next, you can go to the ``intermediate strategy'':

Apply the beginner's strategy as much as you can.

Look inside the program to see where there are
sequences of little commands that alter the program's data structure(s).
Extract the commands and
bundle them into functions. Place those functions by the data structure.

Finally, you can apply the ``full strategy'':

Apply the intermediate strategy as much as you can.

Look at the program and see which sequences of commands are
steps that work together to do one ``bigger thing.'' Bundle each sequence of
commands into a function and replace the sequence by a function call.

These strategies are a methodical way of learning
bottom-up, modular, and
top-down programming. (I realize that the full strategy is not exactly
top-down. But it is difficult to blindly apply a true top-down
strategy without a lot of bottom-up experience!)

The final version of the checkbook accountant is well organized,
and in particular, its
main program is pretty and can be easily understood by a human.
But the list of data structures
and functions that precede the program is long and distracting.

In the next chapter, we will learn how to use Python modules
to split the above program into two ``big pieces'': one piece will hold
the data structures and their maintenance
functions, and the other piece will hold
the main program and its helper functions.

6A.10 Design: A program creates a special world and functions give the language of that world

The previous case study extracted a collection of functions that
talk about the actions we make with bank accounts: we read dollars-cents
amounts, we make deposits and withdrawals, we print ledgers, and we
format numbers for printing. These functions form a little
``language'' for talking about and working with checking accounts.
There is something important going on here:

Every computer program builds its own ``special world'' inside the computer.

For example, a program that helps you play chess builds the world
of chess in the computer;
a program that does bank-account management builds the world of
bank accounts in the computer; the slide-puzzle program built a slide
puzzle inside the computer.

Functions define the terminology --- the language --- of
actions
in the special world.

Each
function used in the program is a
``little piece'' of code that
makes an action.
Each action is a significant concept, because
we have bothered to name it!

For these two reasons,
extracting and defining functions
is not an idle, cosmetic task --- it is a central
part of programming, because along with the data structures,
the functions define the program's special
world and give us the language to talk about how that world operates.

Many so-called domain-specific programming languages
grew out of functions extracted from
programs that were built to operate in a specific problem domain ---
the family of functions became the basis of a new programming language!

After you complete this course, you will likely apply your computing
skills in other courses. Say that you take a biology class, and you must
write computer programs to do sequence matching of data to gene patterns.
What is the ``language'' of genes and matching? You will uncover the
answer by constructing your programs and naming the functions used
in those programs.

At this point, you are ready to read Chapter 6 of Dawson's text.
Study carefully his tic-tac-toe example, and try to deduce which
of his functions were created from bottom-up, top-down, and
modular programming styles. Think about whether the functions he
chose define a language for talking about and playing tic-tac-toe.

In the earlier sections of this chapter, we encountered diagrams that
showed what happens inside the computer when functions are called.
The diagrams were helpful
but informal. Now, we study a more precise description of what
happens when functions are defined and called.
Here is a quick review of the semantics of functions:

When a function is defined, the function's name
is saved in the namespace.

When a function is invoked, a new namespace is constructed for the
function's private use and the function's commands execute with the
namespace. When the function finishes, the
namespace is erased.

Private namespaces are a simple and natural idea, but
they are profound
and generated a revolution in computer programming when they first
appeared in the late 1950s. A function's private namespace is also known
as an activation record.
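A small experiment shows a private namespace being erased. In this sketch, which uses a hypothetical function named double, the variable answer exists only while double executes; once the call finishes, the name can no longer be found.

```python
def double(n) :
    """double returns twice its argument."""
    answer = n * 2      # answer lives in double's private namespace
    return answer

result = double(21)
print(result)

try :
    print(answer)       # the private namespace was erased when double finished
    visible = True
except NameError :
    visible = False
print(visible)
```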

Here is a tricky example that we use to illustrate
functions, parameters, global variables, and
private namespaces:

The Python interpreter has an instruction counter (i.c.)
to remember the next command to execute, and it also has
a namespace variable (n.s.) to remember the namespace to use.
Storage is organized into a local-namespace area, and a global-namespace/heap
area.

After the command at Line 3 does its work, initializing a to 2,
functions f and g are saved in Test's
namespace. Only the initial instruction numbers of the functions are saved.

Next, Line 14 constructs a list in the heap and saves its address in
the variable cell for b:

Line 15 calls f. These steps are undertaken:

The two arguments are computed
to their values: b is found in the namespace, and it computes
to addr1; 4+1 computes to 5.

f is found in the namespace, and i.c. is reset to
instruction 6.

A private namespace for f is constructed in the area
for local namespaces; the namespace holds
the names of parameters x and y, and the two arguments
are placed in those cells. An extra name, global,
is saved in the namespace to remember the namespace in which
f was originally defined. Variable n.s. is set to
f's namespace.

Here is the configuration:

The Python interpreter has paused the execution
at Line 15 and will execute the function at Line 6. The namespace used
is f's.

Line 6 adds a to the namespace, but a cell is not
created; instead, a is marked as belonging to
the global namespace (Test):

Line 7 starts an assignment to a.
First, since a is already present
in f's namespace, no cell is constructed.

Next, the expression, g(x[0] * y), must be computed, so that it
can be assigned to a. This means x[0] * y must be
computed. First, x is found in the namespace;
it computes to addr1. This means that cell 0 at addr1
is found and its value, 3, is returned. Next, y's value,
5, is found. Finally, 3*5 computes to 15.

The next step pauses execution within f and starts execution
at g. A namespace for g is constructed in the local
namespace area,
and the argument, 15, is placed in y's cell there. (This is a
different y than the one in f's namespace!)
Both i.c. and n.s. are reset:

Line 10 constructs a cell in the current namespace (g's):

Line 11 says to compute y*z, which is 90, and return it
to function f. At this point, g's namespace is erased,
and both i.c. and n.s. are reset to their
values just before the call to g:

To finish the assignment at Line 7, 90 is assigned to a's
cell. In the current namespace, a is marked global,
so the cell for a in namespace Test receives 90.
At this point, function f is finished, so its namespace is
erased, and both i.c. and n.s. are reset:
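The steps traced in this section are consistent with a program of the following shape. It is a reconstruction from the trace, offered as a sketch: the line positions do not match the Line numbers used in the trace, the value 6 for z is inferred from the answer 90, and the list's contents beyond its first cell, 3, are assumed.

```python
a = 2

def f(x, y) :
    global a              # a is marked as belonging to the global namespace
    a = g(x[0] * y)       # x[0] * y computes to 3 * 5, which is 15

def g(y) :
    z = 6                 # a cell for z is made in g's private namespace
    return y * z          # 15 * 6 is 90; then g's namespace is erased

b = [3, 4]                # the list is built in the heap; b holds its address
f(b, 4 + 1)
print(a)                  # the global cell for a now holds 90
```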

Syntax and semantics of functions

DOC_STRING (this is optional) is an indented
(multi-line) string, bracketed
by triple double-quotes, describing the purpose of the function and its
parameters.

COMMANDs is an indented sequence of one or more commands, which
can include the new command form,

return EXPRESSION

The semantics of a function definition is that the function's name,
its parameters, and its commands are
saved in the current namespace.

A function call (invocation) has the format,

NAME(ARGUMENTS)

where

NAME is the name of a previously defined function

ARGUMENTS are zero or more EXPRESSIONs.

The semantics of a function call operates as follows:

The arguments are computed to answers, one by one, from left to right.

The function's name and its body are located in the namespace.

Execution at the position of the function call is paused.

The arguments are assigned, one by one, to the parameters listed in
the function's definition.

The commands within the function are executed.

When the commands finish, execution resumes at the position where
the function was called.

When a return EXPRESSION command is executed within the
function's body, the expression part is computed to an answer,
the function immediately terminates, and the answer is inserted exactly
in the position where the function was called.

A function can also be called with keyword arguments.

The documentation string of a function, F, can be
printed with the command, print(F.__doc__)
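The last three points can be tried out on a small example. The function area below is hypothetical, chosen only to show an early return, a call with keyword arguments, and the printing of a documentation string.

```python
def area(width, height) :
    """area returns the area of a width-by-height rectangle."""
    if width < 0 or height < 0 :
        return 0          # the function terminates here; the rest is skipped
    return width * height

print(area(3, 4))                      # arguments matched to parameters by position
print(area(height = 4, width = 3))     # keyword arguments, matched by name
print(area(-1, 4))
print(area.__doc__)
```

With keyword arguments, each argument is assigned to the parameter it names, so the order in which the arguments are written does not matter.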

Functions can be written and tested separately, using interactive
testing:

Place the function, fred, in a file by itself, say, Test.py.

Open a command window and start the Python interpreter interactively:

$ python -i Test.py

Interactively test the function:

>>> fred(...arguments...)

This executes the function, just as if it were called from within a program.

If you change the coding of fred, you can retest it by stopping
the Python interpreter and starting over.

There are three important uses of functions:

doing one small thing that must be done over and over from different
places in a program. This is called bottom-up programming.

doing one important thing that must be designed and tested by itself
before it is
inserted into the rest of the program. This is called
top-down programming.

doing one thing to a global variable (data structure) to help keep
the global variable correctly updated. This is called modular programming.