Selecting the Best Java Collection Class for Your Application

Set Classes

Sets are collections that contain no duplicates. Attempting to add an object to a set that already contains that object will have no effect. Collection classes that organize objects into sets implement the Set interface.

Classes that implement the Set interface are not required to keep their elements in any particular order. If all that you know about a collection object is that its class implements the Set interface, then you cannot predict the order in which an iterator on the collection will traverse the contents of the collection.

There is a sub-interface of Set named SortedSet. Classes that implement the SortedSet interface keep the elements of a set in a sorted order.

If you have decided that a set is the organization that you need, here are suggestions regarding which class to use:

HashSet: HashSet is the most commonly used class for managing a set of objects. Objects can be added to, removed from, and found in a HashSet quickly. The amount of time required is unrelated to the number of objects in the set, assuming that the parameters of the HashSet have been properly tuned. A HashSet object does not keep the objects in a set in any particular order. If you need a set that keeps its elements sorted in some order, then consider using the TreeSet class.

TreeSet: The TreeSet class keeps the elements of a set sorted in a specified order. However, the time that it takes to add, remove, or find elements in a set is generally longer than that required for a set managed by a HashSet. The time required for these operations is proportionate to the log of the number of elements in the set.

EnumSet: The EnumSet class provides a fast and compact way of representing a set of enum objects. All elements of an EnumSet set must come from the same enum type, and they are always sorted in their natural order.

LinkedHashSet: The LinkedHashSet class is similar to the HashSet class, except that it keeps its elements in the same order in which they were inserted into the set. Adding and removing elements in a LinkedHashSet object is a little bit slower than for a HashSet object due to the additional overhead of maintaining the order of its elements.

ConcurrentSkipListSet: Sets managed by the ConcurrentSkipListSet class can be safely modified by multiple threads at the same time, with no need to use locks. If a set managed by one of the other set classes may be updated by more than one thread, you will need locks to prevent multiple threads from updating the set at the same time. You can use the ConcurrentSkipListSet class to avoid threads having to wait for locks. A ConcurrentSkipListSet object keeps the elements of its set sorted in a specified order. Like TreeSet, the time that it takes to add, remove, or find elements in a ConcurrentSkipListSet is proportionate to the log of the number of elements in the set. However, the ConcurrentSkipListSet takes longer to do these things. One operation that takes a particularly long time is size, which takes time proportionate to the number of elements in the set.

CopyOnWriteArraySet: The CopyOnWriteArraySet class is used to solve a specific concurrency problem without having to resort to locks. While an iterator is iterating over a HashSet or TreeSet object, to ensure that the iterator sees a consistent view of the set, the contents of the set must not change. If they do, then next time the iterator's next or previous method is called, the method will throw a ConcurrentModificationException exception. To avoid this situation, you could create a lock to prevent methods that would modify the HashSet or TreeSet object while an iterator is active. If you rarely iterate over a set at the same time it is being modified, then using a lock is a good solution. However, if iterating over a set is common, then solving the problem with a lock can create another problem: the amount of time threads spend waiting to modify the set may be unacceptably high. The CopyOnWriteArraySet class has one feature that distinguishes it from the other classes. All operations that change the contents of a CopyOnWriteArraySet collection cause the underlying array to be replaced with a copy of itself before the contents of the array are changed. Any active iterators will continue to see the unmodified array, so there is no need for locks. Iterators created by a CopyOnWriteArraySet object cannot change the underlying array. Though these iterators do have methods for changing the underlying collection, their implementations throw an UnsupportedOperationException instead of modifying the underlying collection.

Queue Classes

Queues impose an order on the objects they contain. They have operations to add objects at the end of the order and remove objects from the start of the order. Most commonly, queues use the order in which objects are added as the ordering for their contents. Queues that use the order in which objects are added are sometimes referred to as FIFO (first in, first out) structures.

Classes that organize a collection as a queue implement the Queue interface. The methods defined in the Queue interface return immediately. This means that an attempt to remove an object from an empty queue fails either by returning null or throwing an exception, depending on which method is called. An attempt to add an object to a size-limited queue that is full fails either by returning false or throwing an exception.

Some classes that organize a collection as a queue implement a sub-interface of Queue named BlockingQueue. The BlockingQueue interface adds methods to Queue that may not return immediately if a queue is empty or full. The put method will wait to return or "block" until the queue is no longer full. The take method will block until the queue is no longer empty. There are also similar methods named offer and poll that will time out if they block for more than a specified amount of time.

If you have decided that a queue is the organization to use, here are suggestions regarding which class to use:

LinkedList: The LinkedList class can be used to organize objects into a queue that keeps objects in the order they are added. The add and remove methods take a short amount of time, which is independent of how many objects are in the queue. Queues based on the LinkedList class do not block.

PriorityQueue: A PriorityQueue object keeps objects in a queue sorted in a specified order. Its add and remove methods take time that is proportionate to the number of objects in the queue. Queues based on the PriorityQueue class do not block.

PriorityBlockingQueue: A PriorityBlockingQueue object keeps objects in a queue sorted in a specified order. Its add and remove methods take time that is proportionate to the number of objects in the queue. Queues based on the PriorityQueue include methods that block.

ArrayBlockingQueue: The ArrayBlockingQueue class can be used to organize objects into a queue that keeps objects in the order they are added. The add and remove methods take a short amount of time that is independent of how many objects are in the queue. Queues based on the ArrayBlockingQueue class can block. For situations when multiple threads are waiting to perform operations on a queue, the behavior of an ArrayBlockingQueue object will be more consistent and predictable than for a LinkedBlockingQueue. When multiple threads are waiting for blocked method calls of an object to return, there is normally no guarantee in which order the method calls will return. The ArrayBlockingQueue class's constructor takes an optional argument that can specify that the ArrayBlockingQueue object will force its blocked method calls to return in the same order they were called. A restriction of the ArrayBlockingQueue class is that the capacity of ArrayBlockingQueue objects must be specified as an argument to their constructor. The capacity of a ArrayBlockingQueue object cannot be changed after it is constructed.

LinkedBlockingQueue: The LinkedBlockingQueue class can be used to organize objects into a queue that keeps objects in the order they are added. The add and remove methods take a short amount of time that is independent of how many objects are in the queue. Queues based on the LinkedBlockingQueue class can block. When multiple threads are waiting to perform operations on a queue, the behavior of a ArrayBlockingQueue object will be more consistent and predictable than that of a LinkedBlockingQueue. However, the LinkedBlockingQueue class may provide better throughput.

ConcurrentLinkedQueue: The ConcurrentLinkedQueue class can be used to organize objects into a queue that keeps objects in the order they are added. The ConcurrentLinkedQueue class is a good implementation choice when many threads will seek to get objects into and out of a queue. This is because ConcurrentLinkedQueue can safely be modified by multiple threads at the same time. No locks are needed. The methods to add objects to or remove objects from a ConcurrentLinkedQueue take a constant amount of time that is not related to the number of objects in the queue. However, the size operation takes time proportionate to the number of objects in the queue.

DelayQueue: A DelayQueue is a specialized blocking queue that consults the objects it contains when they can be removed from the queue. Objects in a DelayQueue must implement the Delayed interface. The Delayed interface defines a getDelay method that returns a delay expressed in a specified unit of time. Objects cannot be removed from a DelayQueue while their delay is a positive number.

SynchronousQueue: The SynchronousQueue class is a specialized class for coordinating producers and consumers of objects. The SynchronousQueue class blocks. A SynchronousQueue object is always empty. A SynchronousQueue object's put method blocks unless another thread is waiting for the object's take method to return. A SynchronousQueue object's take method blocks unless another thread is waiting for the object's put method to return.