Infinite Trees

Kathi Fisler

1Motivation

Up until now, we’ve mostly looked at trees as a data structure, but we
haven’t looked at them in the context of any particular problems.
Trees are used in many applications in computer science, robotics, and
game design. One common application is for representing the possible
moves in a game or search problem. Imagine that you were exploring
the space that a player (character or robot) could move through. On
each move, the player would change position (and possible acquire the
ability to do new tasks based on location – we’ll set that aside for
now).

Imagine that you wanted to run programs to optimize the movement of
your player through the space. You would need a representation of the
space, a way to determine the new position after each move, and a way
to estimate which moves now could yield the best payoff a few moves
down the road. Trees are a useful data structure here: each node
contains data about the present location (where it is, its features,
etc), each type of move yields a new location (aka node). For
example, consider a grid-based simple game in which a player can only
move between adjacent squares (ie, forward, backward, left, right).
This yields a location tree like:

------- <4,8>------

//\\

<4,9><4,7><3,8><5,8>

/ / \ \

----- /\ -------

//\\

<3,9><3,7><2,8><4,8>

(Ignore that we have <4,8> twice in this picture – we’ll return to
that tomorrow.)

The problem with building this tree upfront is that set of possible
game/robot configurations (or positions) tends to be large, if not
(effectively) infinite. In this case, creating the tree upfront
simply isn’t feasible. Instead, we need to create the tree in a
demand-driven fashion: we create each node when we need it,
remembering just enough information to let us create more nodes as
necessary.

Our task today is to figure out how to develop infinite trees in
Java. To simplify the problem, we will work simply with binary
trees, each of which has an integer for its datum. The left child of
each node is the node with data value minus 1; the right child is the
node with data value plus 1. In other words:

0

/ \

-11

/\...

-20

......

2Modeling Infinite Trees

Let’s start with our previous definition of binary trees and see how
well it fits our new problem:

interface INode {}

class Leaf implements INode {

Leaf(){}

}

class Node implements INode {

int data;

INode left;

INode right;

Node (int data, INode left, INode right) {

super(data);

this.left = left;

this.right = right;

}

}

class Examples {

INode myTree = new Node(4,

new Node(3, new Leaf(), new Leaf()),

new Leaf());

}

This isn’t a great representation, because a Leaf denotes the
end of the tree. In our desired problem, the tree never ends, it just
hasn’t necessarily been built yet. There is also no notion here that
a Leaf has data, but not (yet) children. So this shape doesn’t
capture the essence of our problem. Instead, we need classes that
capture the "unfinished" nature of some nodes. We will omit leaves
entirely.

Following conventional CS terminology, we will call unbuilt nodes
"promises":

class NodePromise implements INode {

int data;

NodePromise(int data) {

this.data = data;

}

}

Our prior example would be written as following using
NodePromise:

class Examples {

INode myTree = new Node(4,

new NodePromise(3),

new NodePromise(5));

}

That is, the tree has one completed node (with data 4), and two
"pending" nodes (for 3 and 5) that we might create on demand.

To demand that a node be created, we "force" the promise of a node to
be carried out ("force" is again the conventional CS term here). In
particular, our NodePromise class will have a force
method that builds-out one more node of the tree:

private INode force() {

return new Node(this.data,

new NodePromise(this.data - 1),

new NodePromise(this.data + 1));

}

So if we were to run myTree.left.force(), we would get a
Node with data 3 and NodePromise children for
values 2 and 4.

2.1The Class Hierarchy So Far

To summarize, we now have the following class hierarchy for infinite
trees, cleaned up with an abstract class to capture the data common to
the two kinds of INode:

interface INode {}

abstract class AbsNode implements INode {

int data;

AbsNode(int data) {

this.data = data;

}

}

class Node extends AbsNode {

private INode left;

private INode right;

Node (int data, INode left, INode right) {

super(data);

this.left = left;

this.right = right;

}

}

class NodePromise extends AbsNode {

NodePromise(int data) {

super(data);

}

private INode force() {

return new Node(this.data,

new NodePromise(this.data - 1),

new NodePromise(this.data + 1));

}

}

3When to Force Node Creation

Now that we have force, we need to decide when to use it. We
said that we wanted to construct the tree "on demand", meaning that if
a user of our code wants to see a node, we should force it to exist.
The user should not have to think about whether the node already
exists. They should simply be able to ask for a node and get it back;
it is up to our code to create that node if needed. Put differently,
the user should have the illusion of having an infinite tree. That we
don’t create it as such is an implementation detail of no concern to
the user.

Notice that Node has fields for left and right,
while NodePromise does not. This suggests that the
INode interface needs to provide methods getLeft() and
getRight() that will get the corresponding subtree. For
Node objects, these methods just return the corresponding
(existing) node. For NodePromise objects, these methods force
creation of the new nodes:

interface INode {

INode getLeft();

INode getRight();

}

class Node extends AbsNode {

private INode left;

private INode right;

// constructor omitted

public INode getLeft() { return this.left; }

public INode getRight() { return this.right; }

}

class NodePromise extends AbsNode {

// constructor omitted

private INode force() {

return new Node(this.data,

new NodePromise(this.data - 1),

new NodePromise(this.data + 1));

}

public INode getLeft() { return this.force().getLeft(); }

public INode getRight() { return this.force().getRight(); }

}

To convince ourselves that this does hide the details from a user, we
can experiment with our interface (in DrJava, you can use the
interactions window for this).

> NodePromise root = new NodePromise(0);

> root.getLeft().getData()

-1

> root.getLeft().getLeft().getData()

-2

> root.getLeft().getRight().getLeft().getData()

-1

> root.getLeft().getRight().getRight().getData()

1

This assumes that we also have an getData method in the
AbsNode class that returns the data (see the final posted code
file for this). We see that a user can’t tell that we are using
NodePromise underneath, even though we have not created the
full tree explicitly (as seen in the definition of root). We
also know that force must be working, since we started with
only a NodePromise. In addition, we know that we are not
actually creating the infinite tree, as our test case terminated
without a stack overflow error.

4Computing the Size

Of course, we might feel more confident that we aren’t constructing
the whole tree if we added a method to compute its size. What if we
added the following method to AbsNode?

abstract class AbsNode implements INode {

int data;

// constructor omitted

public int getData() { return this.data; }

public int size() {

return 1 + getLeft().size() + getRight().size();

}

}

If you run this method on our root example, Java eventually reports a
"stack overflow" error. This means that the size computation
did not terminate. Why not? If you think about our data structure
carefully, it has no base case. This seems weird: a recursive data
structure with no real base case. Haven’t we always needed a base
case (such as leaf) in order to finish building data, or to make
functions terminate? Yes. We only get away with building infinite
data here because we defer building all the data via the promise.

As for termination, when you follow templates on data with a base
case, your functions are guaranteed to terminate. For functions on
infinite data, we need to rely on something other than the data
definition for termination. For example, we can change the problem to
introduce a pseudo-base case (ie, terminate on NodePromise).
Or, we can do a computation that has some other argument on which we
can terminate; if you expand a tree out by a fixed number of levels,
as the lookAhead function will below, then the number of levels
gives a way to guarantee termination. Either way, you need to make
some argument about termination, rather then relying on the data
structure to do that for you.

As a side note: some of you may have learned about induction in high
school or other courses. Induction is how we usually prove that
functions over recursively-defined data are correct. Induction,
however, relies on having a base case, which isn’t true here. If you
are interested in the mathematics of infinite data structures, look up
co-induction, which handles reasoning on infinite data structures.

To see the "pseudo base-case" idea in action, we can choose to compute
the size of the demanded portion of the tree, we can stop
counting the size at promised nodes. We can achieve that by
overwriting the size computation in the NodePromise
class:

public int size() { return 0; }

5Lazy Computation and Two Kinds of Functions

What you’re seeing here is a powerful programming concept known as
laziness. In lazy computation, we compute only what is needed,
extending the data if more cases are required. Laziness is an
interesting concept that we cover in depth in the Programming
Languages course (for anyone who is interested). For purposes of this
class, you should understand that demand-driven computation is
sometimes essential, and that you can use methods such as force
to achieve it.

Stop and think about force for a moment. It is a rather
different function than ones you have seen before. Most functions
simply provide names for computations. When we give them arguments,
they substitute those arguments into the function body and return a
result. In contrast, force exists to delay a
computation (namely, the construction of the tree). It doesn’t take
arguments. Those of you who came through 1102 saw this concept before
through lambda. The core idea is important in computer science
overall, however: functions do not always abstract over behavior; they
also can set up a computation to be performed later.

6Looking Ahead in the Tree

We mentioned that a common operation on game trees is to look ahead a
few steps to help determine which of several moves to take. To wrap
up this example, let’s write a lookAhead function that takes an
integer indicating how many levels down to expand the tree, and
returns a list of the possible nodes at that depth. A real game
program would then perform some computation over that list of nodes to
select which move to make next.

First, let’s agree on the interface and examples:

interface INode {

int getData();

INode getLeft();

INode getRight();

IList<INode> lookAhead (int depth);

int size();

}

Calling lookAhead on a node at depth 0 should return a list
containing only the node (irregardless of whether that node is a
NodePromise). At any depth larger than 0, lookAhead
will simply append the results of calling lookAhead on each of
its subtrees at depth-1.

Note that nothing in this description differentiates between
Node and NodePromise, so we can simply put the
lookAhead computation in the AbsNode class:

This code uses a method append that concatenates the results of
two lists. We haven’t written this method on lists before, but could
easily add it to our running IList implementation if we so
chose.

6.1Using Java’s Built-In List Classes

Actually, this is a good opportunity to introduce you to Java’s
built-in classes for lists. Java uses the term LinkedList for
the kinds of lists you have created in Racket and Java. If you want
to use the LinkedList classes, you first need the following
statement at the top of your file:

import java.util.LinkedList;

The LinkedList package has several minor and one major
difference from the lists we’ve created manually. The minor
differences arise in the names of operators. We’ll just show you
those as we need the operators, or you can look up the Java
LinkedList interface with an easy Google search. The major
difference is that Java’s operators do not return a new list.
Instead, they modify an existing list. We’ll get into the serious
implications of this difference next week. For now, it largely means
that you have to write your code somewhat differently than before.
This is best seen by example. Consider the following lookAhead
code as written against the built-in list library:

public LinkedList<INode> lookAhead(int depth) {

if (depth == 0) {

LinkedList<INode> newlist = new LinkedList<INode>();

newlist.add(this);

return newlist;

}

else {

LinkedList<INode> leftresult =

this.getLeft().lookAhead(depth - 1);

LinkedList<INode> rightresult =

this.getRight().lookAhead(depth - 1);

leftresult.addAll(rightresult);

return leftresult;

}

}

First, we see that the class name for a built-in list is
LinkedList. Now look inside the case when the depth is zero.
The built-in classes do not distinguish between
empty and non-empty lists as our classes do. When we ask for a new
LinkedList, we simply get back a new list with no elements. To
add this as an element to the list, we use the add operator:

newlist.add(this);

After adding this, we can return newlist.

If you try to write these three lines more compactly as

return new LinkedList<INode>().add(this);

Java gives you a type error. Our lookAhead function needs to
return a list, but the add operator doesn’t return a list.
Instead, it returns void, which you can think of as the
"return nothing" type. The standard list operations all return either
void or boolean (to indicate success or failure of the
operation). As a result, you can’t compose list operations as we have
done previously. Instead, you create a variable to hold the list(s),
write each operation as an independent statement, then return
whichever list holds your result at the end. The sequence of
statements in the else case of lookAhead give you
another example of this stylistic difference.

6.2Testing lookAhead

One way to test lookAhead is to make sure that we are getting the
number of nodes we expect from looking ahead at various depths. The
following set of test cases shows what we expect here, using Java’s
built-in size method on lists to compute our answer:

class Examples {

NodePromise root = new NodePromise(0);

boolean testLookAhead0 (Tester t) {

return t.checkExpect(root.lookAhead(0).size(),1);

}

boolean testLookAhead1 (Tester t) {

return t.checkExpect(root.lookAhead(1).size(),2);

}

boolean testLookAhead3 (Tester t) {

return t.checkExpect(root.lookAhead(3).size(),8);

}

}

For more detailed testing, we would want to check the contents of each
node. We will hold off on that for now, as we haven’t yet shown you
how to iterate over Java’s built-in lists.

7And Some More Java Lessons ...

Actually, this example works well for illustrating some other
Java-isms that you need to know coming out of this course. The final
version of the code, posted to the website, illustrates three
additional features/techniques that you need to understand:

Access modifiers: Java lets you control access to fields and
methods across classes. You’ve already encountered "public" as an
annotation for methods, but we’ve never discussed what it means. This
code illustrates three different access modifiers: public, private,
and protected.

Private fields and methods are only visible inside of
the class (they are not accessible in subclasses); force in
NodePromise is a good example of a private method: no class
other than NodePromise should know about or be able to use this
method. As a general rule, you should make all fields private
(providing methods to access them, as we did with getData)
unless you have a good reason to do otherwise. Once we get into
mutation, other classes could modify your non-private data, which
can become problematic.

Protected fields and methods are visible within the class
and its subclasses. The data in AbsNode is protected to
restrict its direct access to Node and NodePromise. Had
it been private, neither subclass could use the expression
this.data.

Public fields and methods are accessible to any class or
object. Methods in interfaces are required to be public (that’s the
spirit of being an interface).

This summary really is all you need to know about
public/private/protected. We will begin to use these modifiers
consistently so that you get comfortable with them and their
constraints. There are any number of descriptions of these online if
you feel you need more detail.

Wrapper classes: It is common to use a separate class to "wrap"
an instance of your datastructures. This is useful when you want to
give someone either an obscured or persistent reference to your data,
while allowing you to modify the data behind the scenes. The
GameTree class in the posted code is a wrapper for nodes. It
is useful in this case because a user should not need to know that
there is a difference between Node and NodePromise
(i.e., that we defer the computation of some nodes is really an
implementation detail). Instead, we allow a user to create a
GameTree, which we in turn initialize with a NodePromise.

Multiple Constructors: Notice that there are two constructors in
the GameTree class: one takes a value for the rootdata and one
initializes the rootdata to 0. This provides flexibility to your
user. In a real game, the initial data might be too complex for a
user to provide, or the user might customize only part of the initial
data. Multiple constructors provide flexibility when different
contexts would use your program in different ways.

8Summary

We’ve covered a lot of ground here. What should you understand from
this lecture?

That we can support infinite data by defining classes and
methods to build more data as it is needed.

How to create and augment Java’s built-in lists (iteration will
come later).

Java’s access modifiers, and when to use each.

The idea of wrapper classes and multiple constructors to give
users a need-to-know view of your data.