Lisp Lists

Lisp Lists

In this section we introduce another generic collection type,
this time called LispList. This type is not provided
as a standard part of Java, so to use it you will have to copy the
file LispList.class from the location
~mmh/DCS128/lispLists/LispList.class, or just download
it from the link. As a generic type, it should be thought of as having type
LispList<E>, meaning that a full type
needs to be made by combining LispList with some
base type in the same way as we combined ArrayList
with base types. So LispList<Integer> is the
type "Lisp list of integers", LispList<String>
is the type "Lisp list of strings" and so on. The code in
the file LispList.class makes use of code from another file
LispList$Cell.class, so you
will need to copy this from the same directory to thesame directory as the
file LispList.class, or download it from the link.

The name "Lisp list" comes from the programming language
Lisp,
which is one of the oldest programming languages
in existence, having first been introduced in the 1950s. Lisp
was the first example of a functional programming language,
a more modern example of such a language is
Haskell, although
Lisp is still widely used. The name "Lisp" comes from
"list processor", with lists being the fundamental data structure of
the language. That is why we use the term "Lisp List" to name this
data structure here.

The reason for considering programming with Lisp lists is that it is
a particularly simple data structure, yet a very powerful one.
Because of this, some fundamental aspects of algorithms and data structures
are best illustrated by considering them in terms of Lisp lists.
Some universities, for example
Oxford,
Cambridge and
Imperial College
in the UK and
MIT in the States, use a functional
language as the first language for teaching programming,
believing it the best way to put across important issues which get
buried in the detail of more widely used languages such as Java.

A Lisp list is a data structure which consists of an ordered collection
of items, in that way it is like the arrayList
we considered previously. However, arrayLists are mutable, that is there
are operations on them which change their content, and are considered
as an enhanced form of array, that is programming with them is centred
round accessing their components using an integer index. Lisp lists
are immutable and are not primarily manipulated using indexes. Only the
following operations are provided:

head - returns the first item of the list it is called on

tail - returns a new list consisting of all but the first
item of the list it is called on

cons - takes an argument and returns a new list whose
head is the argument and whose tail is the list it is called on

isEmpty - returns true if the list it
is called on is the empty list, returns false otherwise

Conventionally, we write lists enclosed with the square brackets
[ and ], with the items in the list in
order separated by commas. So, for example, [30,15,7,12]
is the way we would write a list of type LispList<Integer>
whose head is 30 and whose tail is [15,7,12].
If ls is a variable of type LispList<Integer>
which refers to an object whose value we write as [30,15,7,12],
then ls.head() will return 30 and
ls.tail() will return a reference to an object whose value
is written as [15,7,12]. The call ls.tail()
does not change the object ls refers to, it remains with
the value represented as [30,15,7,12], but the assignment
ls1=ls.tail() would cause the variable ls1 to
start referring to a new object which can be represented as
[15,7,12]. Also the call ls.tail() could
be passed directly as an argument to a method, or used directly
to have a further method called on it. So, for example,
ls2=ls.tail().tail() would cause ls2 to refer
to a new object whose value can be written [7,12].
Similarly, ls.cons(8) does not change the object ls
refers to, it continues to be one represented by [30,15,7,12],
but ls3=ls.cons(8) would cause the variable ls3
to start referring to a new object whose value can be represented by
[8,30,15,7,12]. The empty list is represented by
[], an attempt to call the methods head or
tail on a variable referring to an object representing
the empty list would cause an exception to be thrown, but the
cons operation can be called on such a variable.

Note that in Lisp the operations we call head and tail
are called car and cdr respectively, this is
for historical reasons connected with the original implementation
of the language. In some data structures textbooks, the term "tail" is used
to mean the last element of a list. It is unfortunate that these two
different uses of "tail" have arisen with lists, but in these notes we will
only ever use the term "tail" when applied to a list to mean "all but the
first element". So Lisp lists are broken down using the operations
head and tail, and assembled using the
operation cons. We also need a way of getting an object
representing the empty list, which in Java is done by a static method
called empty. In general, when you call a static
method from another class, you attach the call to the class name,
but here we have the additional aspect of calling a generic static method.
To create a new object of type LispList<Integer>
representing an empty list, you use the call
LispList.<Integer>empty(). The <Integer>
after the dot which attaches the method call to the class name
LispList gives a value to the type variable
of this method, in this case setting it to Integer.
But you can omit it if the value of the type variable can be
worked out otherwise, for example if the value being returned
by the call to the method empty is being assigned
to a variable of type LispList<Integer> we know
the type variable must be Integer so the call can be
just LispList.empty().

To maintain the simplicity of the LispList generic
type, there are no additional methods provided except for a
toString method so that a suitable text representation of
Lisp Lists is given, and an equals method to enable them to
be compared for equality. There are no public constructors
at all. Any other operations must be provided by writing static
methods as required. The methods cons, tail and
empty are the only way to construct new LispList
objects. You can find code for Lisp lists in the directory
~mmh/DCS128/code/lispLists. To use the type
LispList, you will need to copy two .class files into
your directory, LispList.class and LispList$Cell.class.
The second of these is a class which is only used internally by objects
of type LispList.

Iterative methods on Lisp Lists

If the variable ls refers to an object of type
LispList<T>, then ls.head() gives
the first item in the list, ls.tail().head() gives the
second item in the list, ls.tail().tail().head() gives the
third item in the list, and so on. This is because the tail of
a list is the list of every item except the first one, so the head of the
tail must be the second item, and so on. A common way of dealing with lisp
lists is to have a variable which is first set to a list, then set to the
tail of the list, then to the tail of the tail, then to the tail of
the tail of the tail, and so on. If the variable is set to its own
tail, it will refer to the portion of the list with one less item at
the front than it used to, and keep resetting it this way means it
goes through the whole list with the head of the object referenced
by the variable being each item in the original list in turn. When the
variable refers to the empty list, it has gone through the whole list.

For example, suppose ls1 is of type
LispList<Integer> and we want to add
together all the integers in the list. The following code
fragment will do this:

Note that, strictly, ls2.head() is of type Integer,
not int, but assigning it to a variable of type int
will convert it to the int equivalent, using the principle of
"unboxing" we described previously.

After this code fragment has been executed, the variable count
holds the value obtained by adding together all the integers in the
list referred to by ls1, and ls1 continues
to refer to the whole list. Suppose ls1 refers to the
list [8,30,15,7,12]. Then the first time round the loop,
ls2 also refers to [8,30,15,7,12], and
l2.head() returns 8. The update statement
in the for loop causes ls2 to be reset to refer to
[30,15,7,12] on the second time round the loop, with
l2.head() returning 30. The next time
round the loop, ls2 refers to [15,7,12]
and ls2.head() returns 15 and so on.
This could be put into a static method:

Code for reading a Lisp list of integers and adding the integers in it
can be found in the files
UseLispLists1.java and
UseLispLists2.java.
The second of these shows the operation done through the static method
given above. It demonstrates that, for the reasons we showed
previously,
although ls=ls.tail() is executed within the method, this
does not change what the list variable used as an argument to the
method refers to. The variable ls inside the method
is a local variable, which is initially set to the object passed
as the argument to the method, but if it is changed to refer to another
object, the variable that was used as the argument to the method call
stays referring to its original object. So a call to sum(ls)
will return the sum of integers in the Lisp list referred to by ls,
but will not change what ls refers to.

The code given in the file also includes a method parseIntLispList
which takes a string and gives the Lisp list of integers it represents
using the text representation given above. This uses some of the methods
provided in the Java library we discussed
previously.

The method sum here is not generic because it relies
on the base type of the Lisp list it handles being Integer.
However, we can consider generic methods for operations which do not
rely on the base type. We have seen that for arrayLists there is a method
provided by Java which gives the position of an item in an arrayList. Here
is a static method which provides a similar operation for Lisp lists:

This gives the position of any item in a Lisp list of items of any type, so long
as the type of the item and the base type of the Lisp list are the same.
It uses the Java convention that counting positions starts at position
0 and returning -1 if the item does not
occur in the list. It uses the technique we used
previously of a
for loop, where the exit test for the for loop is when the whole
of the structure has been gone through and the item not found, and
where there is a return statement in the for loop which is executed
to return the position (and hence exit the for loop and the whole method)
when it is found. This is a case of a for-loop with two update
statements - ls=ls.tail() and pos++, changing
what the list variable ls refers to and the value of
pos giving the position in the original list of the
current head of ls. It is possible to have more than one
update statement in a for loop, if so the update statements are
separated by commas.

An alternative way of programming the loop would be to have it exit
either if the list becomes the empty list, or if the head of the list
is the item whose position we are trying to find. Then -1
is returned if the list is the empty list, and the value of the position
variable if it is not as that means the loop has been exited because the
position of the item has been found. Then the body of the loop is empty,
all it does is update and test. Here is the code for this:

The files UseLispLists3.java
and UseLispLists4.java
give these two versions of the method, and support code to test them
with lists of integers and lists of strings. A separate method
parseStringLispList is needed to convert a string
representing a list of strings into the actual Lisp list object.
The text representation is the conventional one we used above,
starting with [ and ending with
] with the strings separated by commas, spaces before
and after the commas are not treated as part of the strings in the list,
otherwise all characters are treated as characters in the strings.

Now let us consider the operation of changing all occurrences of one
item to another, as we have already considered with arrays and
arrayLists. Since Lisp lists are immutable, we can only do this
constructively. A first attempt at such a change method might look
rather like the one we used
previously for a constructive
change in arrayLists, going through the items of the argument collection,
adding them to a new collection which is initially empty, but if the item is
the one to be changed, adding the one it is to be changed to instead. With
arrayLists, the cons method is the only way to
construct a new list which represents adding an item to an existing
list. This gives the following code:

You will find this code and supporting code to demonstrate it in the file
UseLispLists5.java.
If you run this, you will find it makes the change as required, but returns
the list in reverse order. If ls is set to represent the
list written [2,12,4,17,21,4,9,10,4] and m is
set to 4 and n is set to 50,
then the call change(ls,m,n) will return a reference to
a Lisp List object with textual representation
[50,10,9,50,21,17,50,12,2].

The reason for this is that we can only take items from the front of
a list and join them to the front of a list. If you repeatedly take items
from the front of one list and join them to the front of another, you
will end up reversing the list. If we want the list in the correct
order, a simple way to do it would be to use the samer technique again
to reverse the reversed list, so this would make the method:

This code and supporting code to demonstrate it can be found in the file
UseLispLists6.java.
Note that ls3 = ls3.cons(ls2.head()) means the
same as

{
T h = ls2.head();
ks3 = ls3.cons(h);
}

You can express this using a local variable to hold the value returned
by ls2.head() then making that variable the argument to
the call to cons, or you can make the call
ls2.head() directly the argument to the call to cons.
It makes no difference to the way the code executes when run in Java,
deciding how to write it is just a matter of what you feel looks clearer.

Methods which are centred around the use of a loop are known as
iterative. An iterative algorithm is a way of solving a problem
which is best explained in terms of making repeated changes to some
data until the desired condition is met. In the above example, the
data being changed in each loop consist of two variables referring to
Lisp list objects. Each time round the loop, one of them is changed to
refer to a shorter list, the other is changed to refer to a longer list.
We know the loop will only be gone through a certain number of times,
because the loop test is that the variable which keeps being set to
a shorter list refers to a non-empty list - eventually it will refer
to an empty list and the loop woll be exited. Then the variable which
kept being set to a longer list holds the desired result. No actual
object is changed, since Lisp list objects are immutable meaning they
can't be changed. But new objects are created and the variable change
their values because they are set to refer to these new objects.

The alternative to iteration when thinking about problem solving is
recursion. A recursive approach to problem solving uses the
consideration that we can often solve a problem by solving a smaller
version of the same problem then adapting the solution, or sometimes
using more than one solutions to smaller versions of the problem.
For example, suppose I need to make a journey from town A to town B,
and I don't know a direct route. I could pick another town, let's
call it C, where I know a direct route from A to C and C is closer
to B than it is to A. Then I have reduced the problem of finding a
route from A to B to one of finding a route from B to C and putting
the route from A to C I know on the front of it. Finding a route from
B to C is done in the same way, by reducing the problem to finding a
route to B from a town closer to B, and eventally the problem will be
reduced to one where there is a clear direct route and it does not have
to be broken down further. An alternative way of solving the problem
is to pick a town midway between A and B, again we'll call it C.
Now the problem has been reduced to two similar problems: finding a route
from A to C and finding a route from C to B. Eventually, if we solve
those two problems in the same way, we'll get down to routes which
are short enough for a direct route to be found.

Recursive approaches to solving Lisp list manipulation problems often
work well because Lisp lists are a recursive data type. A Lisp
list is either the empty list, or it consists of two parts its head and
its tail, and the tail is itself a Lisp list. So a Lisp list is defined
in terms of a Lisp list. Suppose we want to change all occurrences of
m to n in a Lisp list, ls. If
we change all occurrences of m to n in the
tail of ls (remember this involves creating a new Lisp
list object representing the change, and not destructively changing
the tail of ls1) we almost have the desired Lisp list.
All we need to do is put n on the front of it if
m was on the front of the original list, or put the front
item, of the original list at the front of the new list if m
was not at the front of the original list. We also need to consider that
if we wish to change all occurrences of m to n
in the Lisp list ls, then if ls is the empty
list, then the answer is the empty list. This gives us the following
code which performs the same change operation as above:

Again, there is no right or wrong way to express a solution to this
problem. You may find the iterative solution easier to understand, or
you may find the recursive solution easier to understand. Recursive
solutions often have a simpler structure, but require a bit more
intuitive thinking to appreciate. A good computer scientist should
be able to understand and devise recursive and iterative solutions
to problems. If recursion seems a rather strange way to go about
tackling a problem, you will need to practice with it until you
find it natural.

Let us consider recursive version of the other operations we have discussed
on Lisp lists. The sum of the all the elements in the empty list is clearly
0. The sum of all the elements in any other list is equal
to the first element added to the sum of the elements in the rest of
the list. I hope this seems intuitively obvious, and yet it is very
close to a program to give the sum. The "sum of all the elements in the
rest of the list" is a recursive call to the sum operation
taking as its argument the tail of the original argument. So this
gives the following code:

A recursive method must always have a case when it doesn't make
a recursive call. With Lisp lists this is often, but not always, when
the argument list is the empty list. The case when no recursive
call is made is known as the base case. Any recursive call must
always have arguments which are closer to a base case than the original
arguments, since we want a succession of recursive calls eventually to come
to an end in the base case. So with Lisp lists this generally (but not
always) means the recursive call takes as an argument the tail of an original
argument.

If we consider the operation indexOf we considered
previously, there are two base cases. One is when the list is empty,
then the item cannot occur in it so it must return -1. The
other is when the item is the first item in the list, then it must
return 0. Otherwise, the index of an item in
a list is one more than its index in its tail. For example, the
position of 80 in [20,90,70,50,80,60] is
4 (remember with the usual Java convention of starting
counting at 0), this is one greater than the index of 80
in [90,70,50,80,60]. But this relationship is not
quite right. The index of 40 in [20,90,70,50,80,60]
is -1 (meaning it doesn't occur), but the index of
40 in [90,70,50,80,60] is also -1.
So the index of an item in a Lisp list is -1 if the
list is empty, 0 is it is the first element, -1
if the index in the tail is -1, otherwise it is one greater than
its index in the tail. This translates to the following code:

You can see from this that when programming using recursion, it often
makes more sense to think in terms of logical relationships between
the result of a call and the result of a recursive call than it does to
think of it in terms of one-by-one steps as with iteration.

The file
UseLispLists2a.java
is the same as the file
UseLispLists2.java
and the file
UseLispLists3a.java
is the same as the file
UseLispLists3.java
except that they use recursion as given above, rather than iteration.
You will see that when you run the code using the recursive method
the result is exactly the same as when the code is run using the
iterative method. This is correct - the important thing is not how
the code is programmed, but whether it gives the right result.
There may be more than one way of programming an operation to give
a particular result, but someone who uses a program to get a particular
result is only concerned that the result is correct.

Actually, as we shall see later, there is also the issue of efficiency.
Different ways of programming an operation may give the same result in
terms of what the method for the operation does or returns, but one
way may take much longer than another. Someone who uses a program will
be concerned that it operates as quickly as possible.