Core Java: Collections Framework and Algorithms

This sample book chapter shows how Java technology can help you accomplish the traditional data structuring needed for serious programming, and introduces you to the fundamental data structures that the standard Java library supplies.


Object-oriented programming (OOP) encapsulates data inside classes, but this doesn't make the way in which you organize the
data inside the classes any less important than in traditional programming languages. Of course, how you choose to structure
the data depends on the problem you are trying to solve. Does your class need a way to easily search through thousands (or
even millions) of items quickly? Does it need an ordered sequence of elements and the ability to rapidly insert and remove elements in the middle of the sequence? Does it need an array-like structure with
random-access ability that can grow at run time? The way you structure your data inside your classes can make a big difference
when it comes to implementing methods in a natural style, as well as for performance.

This chapter shows how Java technology can help you accomplish the traditional data structuring needed for serious programming.
In college computer science programs, a course called Data Structures usually takes a semester to complete, so there are many, many books devoted to this important topic. Exhaustively covering
all the data structures that may be useful is not our goal in this chapter; instead, we cover the fundamental ones that the
standard Java library supplies. We hope that, after you finish this chapter, you will find it easy to translate any of your
data structures to the Java programming language.

Collection Interfaces

Before the release of JDK 1.2, the standard library supplied only a small set of classes for the most useful data structures:
Vector, Stack, Hashtable, BitSet, and the Enumeration interface that provides an abstract mechanism for visiting elements in an arbitrary container. That was certainly a wise
choice—it takes time and skill to come up with a comprehensive collection class library.

With the advent of JDK 1.2, the designers felt that the time had come to roll out a full-fledged set of data structures. They
faced a number of conflicting design decisions. They wanted the library to be small and easy to learn. They did not want the
complexity of the “Standard Template Library” (or STL) of C++, but they wanted the benefit of “generic algorithms” that STL
pioneered. They wanted the legacy classes to fit into the new framework. As all designers of collection libraries do, they
had to make some hard choices, and they came up with a number of idiosyncratic design decisions along the way. In this section,
we will explore the basic design of the Java collections framework, show you how to put it to work, and explain the reasoning
behind some of the more controversial features.

Separating Collection Interfaces and Implementation

As is common for modern data structure libraries, the Java collection library separates interfaces and implementations. Let us look at that separation with a familiar data structure, the queue.

A queue interface specifies that you can add elements at the tail end of the queue, remove them at the head, and find out how many elements
are in the queue. You use a queue when you need to collect objects and retrieve them in a “first in, first out” fashion (see
Figure 2-1).

As of JDK 5.0, the collection classes are generic classes with type parameters. If you use an older version of Java, you need
to drop the type parameters and replace the generic types with the Object type. For more information on generic classes, please turn to Volume 1, Chapter 13.

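A queue can be implemented in several ways, for example with a circular array or with a linked list. Each implementation can be expressed by a class that implements the Queue interface.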
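To make the distinction concrete, here is a minimal sketch of a queue interface with the three operations described above, together with two implementations. All three types are hypothetical teaching examples: SimpleQueue is not java.util.Queue (which declares more methods), and the two classes are deliberately stripped down.

```java
import java.util.NoSuchElementException;

// Hypothetical, simplified queue interface (not java.util.Queue).
interface SimpleQueue<E>
{
   void add(E element);   // append at the tail
   E remove();            // remove and return the head
   int size();            // number of elements currently stored
}

// Bounded queue backed by a circular array.
class CircularArrayQueue<E> implements SimpleQueue<E>
{
   private final Object[] elements;
   private int head;    // index of the next element to remove
   private int tail;    // index at which the next element is stored
   private int count;

   CircularArrayQueue(int capacity)
   {
      elements = new Object[capacity];
   }

   public void add(E element)
   {
      if (count == elements.length) throw new IllegalStateException("Queue full");
      elements[tail] = element;
      tail = (tail + 1) % elements.length;   // wrap around at the end of the array
      count++;
   }

   @SuppressWarnings("unchecked")
   public E remove()
   {
      if (count == 0) throw new NoSuchElementException();
      E result = (E) elements[head];
      elements[head] = null;                 // drop the reference
      head = (head + 1) % elements.length;
      count--;
      return result;
   }

   public int size() { return count; }
}

// Unbounded queue backed by a singly linked list.
class LinkedListQueue<E> implements SimpleQueue<E>
{
   private static class Link<T>
   {
      T data;
      Link<T> next;
      Link(T data) { this.data = data; }
   }

   private Link<E> head;
   private Link<E> tail;
   private int count;

   public void add(E element)
   {
      Link<E> link = new Link<E>(element);
      if (tail == null) head = link;
      else tail.next = link;
      tail = link;
      count++;
   }

   public E remove()
   {
      if (head == null) throw new NoSuchElementException();
      E result = head.data;
      head = head.next;
      if (head == null) tail = null;
      count--;
      return result;
   }

   public int size() { return count; }
}
```

Both classes satisfy the same interface, so code written against SimpleQueue cannot tell them apart except through performance and capacity limits.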

The Java library doesn't actually have classes named CircularArrayQueue and LinkedListQueue. We use these classes as examples to explain the conceptual distinction between collection interfaces and implementations.
If you need a circular array queue, you can use the ArrayBlockingQueue class described in Chapter 1 or the implementation described on page 128. For a linked list queue, simply use the LinkedList class—it implements the Queue interface.

When you use a queue in your program, you don't need to know which implementation is actually used once the collection has
been constructed. Therefore, it makes sense to use the concrete class only when you construct the collection object. Use the interface type to hold the collection reference.
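As a small illustration, the following sketch names the concrete class only in the constructor call and uses the java.util.Queue interface type everywhere else. Because CircularArrayQueue is not a library class, we substitute ArrayBlockingQueue here; the class name ExpressLaneDemo and the sample strings are made up for this example.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class ExpressLaneDemo
{
   static String firstServed()
   {
      // The concrete class is named only here, in the constructor call.
      Queue<String> expressLane = new ArrayBlockingQueue<String>(100);
      expressLane.add("Harry");
      expressLane.add("Sally");
      // Everything else uses only methods of the Queue interface.
      return expressLane.remove();
   }
}
```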

With this approach, if you change your mind, you can easily use a different implementation. You only need to change your program
in one place: the constructor call. If you decide that a LinkedListQueue is a better choice after all, you replace only the class name in that call.
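The switch might look like the following sketch, which uses the library's LinkedList class in place of the hypothetical LinkedListQueue. The helper method serveAll and the class name LinkedLaneDemo are assumptions made for this example; the point is that serveAll depends only on the Queue interface and therefore needs no change.

```java
import java.util.LinkedList;
import java.util.Queue;

class LinkedLaneDemo
{
   // Depends only on the Queue interface, so it is unaffected by the switch.
   static String serveAll(Queue<String> lane)
   {
      StringBuilder served = new StringBuilder();
      while (!lane.isEmpty())
         served.append(lane.remove()).append(' ');
      return served.toString().trim();
   }

   static String run()
   {
      // Only this constructor call differs from an array-based version.
      Queue<String> expressLane = new LinkedList<String>();
      expressLane.add("Harry");
      expressLane.add("Sally");
      return serveAll(expressLane);
   }
}
```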

Why would you choose one implementation over another? The interface says nothing about the efficiency of the implementation.
A circular array is somewhat more efficient than a linked list, so it is generally preferable. However, as usual, there is
a price to pay. The circular array is a bounded collection—it has a finite capacity. If you don't have an upper limit on the number of objects that your program will collect,
you may be better off with a linked list implementation after all.
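The capacity limit can be observed directly. The sketch below, with a made-up class name BoundedQueueDemo, uses the offer method of java.util.Queue, which returns false when a bounded queue is full (the add method of ArrayBlockingQueue would throw an IllegalStateException instead).

```java
import java.util.Queue;

class BoundedQueueDemo
{
   // Tries to insert the integers 0 .. n - 1 and returns how many were accepted.
   static int fill(Queue<Integer> queue, int n)
   {
      int accepted = 0;
      for (int i = 0; i < n; i++)
         if (queue.offer(i))   // false means the queue had no room
            accepted++;
      return accepted;
   }
}
```

Passing a new ArrayBlockingQueue<Integer>(3) and asking for five insertions yields three accepted elements, whereas a LinkedList accepts all five.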

When you study the API documentation, you will find another set of classes whose names begin with Abstract, such as AbstractQueue. These classes are intended for library implementors. To implement your own queue class, you will find it easier to extend
AbstractQueue than to implement all the methods of the Queue interface.
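As a rough sketch of how little is required, the hypothetical class below extends AbstractQueue and supplies only offer, poll, peek, size, and iterator. It simply delegates storage to a LinkedList; a real implementation would presumably manage its own storage.

```java
import java.util.AbstractQueue;
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical queue that delegates storage to a LinkedList.
class DelegatingQueue<E> extends AbstractQueue<E>
{
   private final LinkedList<E> items = new LinkedList<E>();

   public boolean offer(E element)
   {
      // poll and peek use null to signal "empty", so null elements are rejected.
      if (element == null) throw new NullPointerException();
      items.addLast(element);
      return true;
   }

   public E poll() { return items.pollFirst(); }   // null if empty
   public E peek() { return items.peekFirst(); }   // null if empty
   public int size() { return items.size(); }
   public Iterator<E> iterator() { return items.iterator(); }
}
```

The inherited methods add, remove, element, clear, and contains all work without further effort, because AbstractQueue and its superclass AbstractCollection define them in terms of the five methods above.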

Collection and Iterator Interfaces in the Java Library

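The fundamental interface for collection classes in the Java library is the Collection interface. The interface has two fundamental methods: add, which accepts an element of type E and returns a boolean, and iterator, which returns an Iterator<E>.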

There are several methods in addition to these two; we discuss them later.

The add method adds an element to the collection. The add method returns true if adding the element actually changes the collection, and false if the collection is unchanged. For example, if you try to add an object to a set and the object is already present, then
the add request has no effect because sets reject duplicates.
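A short experiment shows the difference. The helper addTwice and its class name are made up for this sketch; it adds the same string twice and records both return values.

```java
import java.util.Collection;

class AddResultDemo
{
   // Adds s to c twice and returns the two results of add.
   static boolean[] addTwice(Collection<String> c, String s)
   {
      boolean first = c.add(s);
      boolean second = c.add(s);
      return new boolean[] { first, second };
   }
}
```

With a HashSet the second call returns false because the set is unchanged; with an ArrayList both calls return true because a list accepts duplicates.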

The iterator method returns an object that implements the Iterator interface. You can use the iterator object to visit the elements in the collection one by one.

Iterators

The Iterator interface has three methods: next, hasNext, and remove. By repeatedly calling the next method, you can visit the elements of the collection one by one. However, if you reach the end of the collection, the next method throws a NoSuchElementException. Therefore, you need to call the hasNext method before calling next. That method returns true if the iterator object still has more elements to visit. If you want to inspect all elements in a collection, you request
an iterator and then keep calling the next method while hasNext returns true.
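The loop just described might look like the following sketch. The class IteratorLoopDemo and the choice of summing string lengths are only placeholders for whatever you want to do with each element.

```java
import java.util.Collection;
import java.util.Iterator;

class IteratorLoopDemo
{
   static int totalLength(Collection<String> c)
   {
      int total = 0;
      Iterator<String> iter = c.iterator();
      while (iter.hasNext())            // check before every call to next
      {
         String element = iter.next();
         total += element.length();     // do something with element
      }
      return total;
   }
}
```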

As of JDK 5.0, there is an elegant shortcut for this loop. You write the same loop more concisely with the “for each” loop:

for (String element : c)
{
   do something with element
}

The compiler simply translates the “for each” loop into a loop with an iterator.

The “for each” loop works with any object that implements the Iterable interface, an interface with a single method:

public interface Iterable<E>
{
   Iterator<E> iterator();
}

The Collection interface extends the Iterable interface. Therefore, you can use the “for each” loop with any collection in the standard library.
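The same mechanism is available to your own classes. As a sketch, the hypothetical Countdown class below is not a collection at all, yet it can be used in a “for each” loop because it implements Iterable.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class that yields start, start - 1, ..., 1.
class Countdown implements Iterable<Integer>
{
   private final int start;

   Countdown(int start) { this.start = start; }

   public Iterator<Integer> iterator()
   {
      return new Iterator<Integer>()
      {
         private int current = start;

         public boolean hasNext() { return current > 0; }

         public Integer next()
         {
            if (current <= 0) throw new NoSuchElementException();
            return current--;
         }

         public void remove() { throw new UnsupportedOperationException(); }
      };
   }

   // Works with any Iterable, including every standard collection.
   static int sum(Iterable<Integer> values)
   {
      int total = 0;
      for (int v : values) total += v;
      return total;
   }
}
```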

The order in which the elements are visited depends on the collection type. If you iterate over an ArrayList, the iterator starts at index 0 and increments the index in each step. However, if you visit the elements in a HashSet, you will encounter them in essentially random order. You can be assured that you will encounter all elements of the collection
during the course of the iteration, but you cannot make any assumptions about their ordering. This is usually not a problem
because the ordering does not matter for computations such as computing totals or counting matches.
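You can observe the visiting order with a small helper such as the one sketched here (the class and method names are made up). It records the elements in the order in which the iterator delivers them.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class VisitOrderDemo
{
   // Returns the elements of c in iteration order.
   static List<String> visitOrder(Collection<String> c)
   {
      List<String> visited = new ArrayList<String>();
      for (String element : c) visited.add(element);
      return visited;
   }
}
```

For an ArrayList the result matches the index order. For a HashSet the result contains the same elements, but you should not rely on any particular sequence.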

Old-timers will notice that the next and hasNext methods of the Iterator interface serve the same purpose as the nextElement and hasMoreElements methods of an Enumeration. The designers of the Java collection library could have chosen to make use of the Enumeration interface. But they disliked the cumbersome method names and instead introduced a new interface with shorter method names.

There is an important conceptual difference between iterators in the Java collection library and iterators in other libraries.
In traditional collection libraries such as the Standard Template Library of C++, iterators are modeled after array indexes.
Given such an iterator, you can look up the element that is stored at that position, much like you can look up an array element
a[i] if you have an array index i. Independently of the lookup, you can advance the iterator to the next position. This is the same operation as advancing
an array index by calling i++, without performing a lookup. However, the Java iterators do not work like that. The lookup and position change are tightly
coupled. The only way to look up an element is to call next, and that lookup advances the position.

Instead, you should think of Java iterators as being between elements. When you call next, the iterator jumps over the next element, and it returns a reference to the element that it just passed (see Figure 2-3).
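One way to see the "between elements" position is with a ListIterator, a subinterface of Iterator that is not otherwise discussed in this section; its nextIndex method reports the index of the element that the next call to next would return. The sketch below, with a made-up class name, logs that index before and after a call to next.

```java
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class BetweenElementsDemo
{
   static String trace()
   {
      List<String> names = Arrays.asList("Amy", "Bob");
      ListIterator<String> it = names.listIterator();
      StringBuilder log = new StringBuilder();
      log.append(it.nextIndex());               // iterator sits before element 0
      log.append(' ').append(it.next());        // jumps over element 0 and returns it
      log.append(' ').append(it.nextIndex());   // now sits between elements 0 and 1
      return log.toString();
   }
}
```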

Here is another useful analogy. You can think of Iterator.next as the equivalent of InputStream.read. Reading a byte from a stream automatically “consumes” the byte. The next call to read consumes and returns the next byte from the input. Similarly, repeated calls to next let you read all elements in a collection.

Removing Elements

The remove method of the Iterator interface removes the element that was returned by the last call to next. In many situations, that makes sense—you need to see the element before you can decide that it is the one that should be
removed. But if you want to remove an element in a particular position, you still need to skip past the element. For example,
here is how you remove the first element in a collection of strings.

Iterator<String> it = c.iterator();
it.next(); // skip over the first element
it.remove(); // now remove it

More important, there is a dependency between calls to the next and remove methods. It is illegal to call remove if it wasn't preceded by a call to next. If you try, an IllegalStateException is thrown.

If you want to remove two adjacent elements, you cannot simply call

it.remove();
it.remove(); // Error!

Instead, you must first call next to jump over the element to be removed.

it.remove();
it.next();
it.remove(); // Ok
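The following sketch puts both cases into runnable form. The class and method names are hypothetical; the behavior shown is that of the iterators of standard collections such as ArrayList.

```java
import java.util.Collection;
import java.util.Iterator;

class IteratorRemoveDemo
{
   // Removes the first two elements of c, which must hold at least two.
   static void removeFirstTwo(Collection<String> c)
   {
      Iterator<String> it = c.iterator();
      it.next();
      it.remove();
      it.next();     // jump over the second element before removing it
      it.remove();
   }

   // Returns true if a second remove without an intervening next is rejected.
   static boolean doubleRemoveIsRejected(Collection<String> c)
   {
      Iterator<String> it = c.iterator();
      it.next();
      it.remove();
      try
      {
         it.remove();   // not preceded by a call to next
         return false;
      }
      catch (IllegalStateException e)
      {
         return true;
      }
   }
}
```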

Generic Utility Methods

Because the Collection and Iterator interfaces are generic, you can write utility methods that operate on any kind of collection. For example, you can write a generic
method that tests whether an arbitrary collection contains a given element by visiting each element and comparing it to the given object with equals.
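Such a method might be sketched as follows. The enclosing class name CollectionUtil is made up, and the null handling is our own addition so that the method does not fail on collections that hold null.

```java
import java.util.Collection;

class CollectionUtil
{
   // Returns true if c holds an element equal to obj.
   static <E> boolean contains(Collection<E> c, Object obj)
   {
      for (E element : c)
         if (obj == null ? element == null : obj.equals(element))
            return true;
      return false;
   }
}
```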

The designers of the Java library decided that some of these utility methods are so useful that the library should make them
available. That way, library users don't have to keep reinventing the wheel. The contains method is one such method.

In fact, the Collection interface declares quite a few useful methods that all implementing classes must supply. Among them are size, isEmpty, contains, containsAll, addAll, remove, removeAll, clear, retainAll, and toArray.

Many of these methods are self-explanatory; you will find full documentation in the API notes at the end of this section.
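For a quick impression of how these methods behave, the sketch below applies several of them to an ArrayList and logs each result. The class name, the sample names, and the logging format are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class BulkOperationsDemo
{
   static List<String> demo()
   {
      Collection<String> staff = new ArrayList<String>(Arrays.asList("Amy", "Bob", "Carl"));
      Collection<String> managers = Arrays.asList("Bob", "Doug");
      List<String> log = new ArrayList<String>();

      log.add("containsAll=" + staff.containsAll(managers));   // Doug is not on the staff
      log.add("addAll=" + staff.addAll(managers));             // a list accepts the duplicate Bob
      log.add("size=" + staff.size());
      log.add("removeAll=" + staff.removeAll(Arrays.asList("Carl")));
      log.add("retainAll=" + staff.retainAll(managers));       // drops everyone who is not a manager
      log.add("toArray=" + Arrays.toString(staff.toArray()));
      staff.clear();
      log.add("isEmpty=" + staff.isEmpty());
      return log;
   }
}
```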

Of course, it is a bother if every class that implements the Collection interface has to supply so many routine methods. To make life easier for implementors, the library supplies a class AbstractCollection that leaves the fundamental methods size and iterator abstract but implements the routine methods in terms of them. For example, AbstractCollection implements contains by iterating over the elements and comparing each one to the argument.
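The idea can be sketched with a stripped-down stand-in. SketchCollection is hypothetical and is not the source of java.util.AbstractCollection; it only shows how routine methods can be written in terms of the two abstract ones. The factory method over is likewise an assumption, added so that the sketch can be tried out.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-in for java.util.AbstractCollection.
abstract class SketchCollection<E> implements Iterable<E>
{
   // The two fundamental methods are left to subclasses.
   public abstract Iterator<E> iterator();
   public abstract int size();

   // Routine methods are implemented in terms of them.
   public boolean isEmpty() { return size() == 0; }

   public boolean contains(Object obj)
   {
      for (E element : this)   // uses iterator()
         if (obj == null ? element == null : obj.equals(element))
            return true;
      return false;
   }

   // Hypothetical helper: a concrete subclass backed by an existing list.
   static <T> SketchCollection<T> over(final List<T> data)
   {
      return new SketchCollection<T>()
      {
         public Iterator<T> iterator() { return data.iterator(); }
         public int size() { return data.size(); }
      };
   }
}
```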

A concrete collection class can now extend the AbstractCollection class. It is now up to the concrete collection class to supply an iterator method, but the contains method has been taken care of by the AbstractCollection superclass. However, if the subclass has a more efficient way of implementing contains, it is free to do so.
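Here is a sketch of such a concrete class. IntRange is a hypothetical read-only collection of consecutive integers; it supplies size and iterator, inherits methods such as isEmpty, containsAll, and toString from AbstractCollection, and overrides contains because a range can answer membership queries without iterating.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical read-only collection of the integers from, from + 1, ..., to - 1.
class IntRange extends AbstractCollection<Integer>
{
   private final int from;
   private final int to;

   IntRange(int from, int to)
   {
      this.from = from;
      this.to = Math.max(from, to);   // an inverted range is treated as empty
   }

   public int size() { return to - from; }

   public Iterator<Integer> iterator()
   {
      return new Iterator<Integer>()
      {
         private int current = from;

         public boolean hasNext() { return current < to; }

         public Integer next()
         {
            if (current >= to) throw new NoSuchElementException();
            return current++;
         }

         public void remove() { throw new UnsupportedOperationException(); }
      };
   }

   // More efficient than the inherited version, which walks the iterator.
   @Override
   public boolean contains(Object obj)
   {
      return obj instanceof Integer && (Integer) obj >= from && (Integer) obj < to;
   }
}
```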

This is a good design for a class framework. The users of the collection classes have a richer set of methods available in
the generic interface, but the implementors of the actual data structures do not have the burden of implementing all the routine
methods.

java.util.Collection<E> 1.2

Iterator<E> iterator()

returns an iterator that can be used to visit the elements in the collection.

int size()

returns the number of elements currently stored in the collection.

boolean isEmpty()

returns true if this collection contains no elements.

boolean contains(Object obj)

returns true if this collection contains an object equal to obj.

boolean containsAll(Collection<?> other)

returns true if this collection contains all elements in the other collection.

boolean add(E element)

adds an element to the collection. Returns true if the collection changed as a result of this call.

boolean addAll(Collection<? extends E> other)

adds all elements from the other collection to this collection. Returns true if the collection changed as a result of this call.

boolean remove(Object obj)

removes an object equal to obj from this collection. Returns true if a matching object was removed.

boolean removeAll(Collection<?> other)

removes from this collection all elements from the other collection. Returns true if the collection changed as a result of this call.

void clear()

removes all elements from this collection.

boolean retainAll(Collection<?> other)

removes all elements from this collection that do not equal one of the elements in the other collection. Returns true if the collection changed as a result of this call.

Object[] toArray()

returns an array of the objects in the collection.

java.util.Iterator<E> 1.2

boolean hasNext()

returns true if there is another element to visit.

E next()

returns the next object to visit. Throws a NoSuchElementException if the end of the collection has been reached.

void remove()

removes the last visited object. This method must immediately follow an element visit. If next has not yet been called, or if remove has already been called since the last call to next, the method throws an IllegalStateException.