Writing Type-Safe Collections in C#

Introduction

Statically typed, compiled programming languages allow earlier error checking, better enforcement
of programming styles, and generation of more efficient object code than dynamically typed,
interpreted languages, where all type-consistency checks are performed at run time. However,
even in statically typed languages, there is often a need to deal with data whose
type cannot be determined at compile time. In such cases, a certain amount of
dynamic type checking is required in order to preserve type safety.

For example, the collection classes in the System.Collections namespace of the .NET Framework store
their elements as System.Object references, so retrieving an element requires an unsafe downcast to
the desired type. Since an object of any type can be treated as a System.Object (the ultimate base
class of all classes in the .NET Framework), we can essentially create containers into which we can
put anything at all. A user of such a container is then faced with the problem of guessing the type of
each object. Hence it is the responsibility of the programmer to make sure that the collection is populated
with objects of the correct type, and that the necessary checks are performed when retrieving objects from
the collection. As developers join a team or take over a project, it becomes
quite difficult to enforce such discipline, and the potential for error increases greatly.

Fortunately, instead of relying on verbal communications or written guidelines to enforce
this discipline, developers can apply object-oriented techniques to achieve compile-time type safety
with collections. In this article, I will describe the various ways in which type-safe
collections can be written in C#, and the advantages and disadvantages of each approach.

First Approach: Inheriting from Existing Collection Classes

One obvious approach that comes to mind is to derive from an existing collection class and override
all methods that need to enforce type checking. This allows us to reuse most of the methods in the base class
for free, and override only a few. It also gives us the flexibility of passing around the derived class
wherever the base class type is expected. For example, the following would create a type-safe
ArrayList for storing a list of Customer objects:
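
A minimal sketch of such a class is shown here. The Customer class and its Name field are
hypothetical, introduced only for illustration, and only the indexer and Add are made type-safe.

```csharp
using System;
using System.Collections;

// Hypothetical element type, used only for illustration.
public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

// Sketch of the inheritance approach: an ArrayList restricted to Customer objects.
public class CustomerArrayList : ArrayList
{
    // Typed indexer that hides the object-typed indexer of ArrayList.
    public new Customer this[int index]
    {
        get { return (Customer)base[index]; }
        set { base[index] = value; }
    }

    // Typed overload that callers are expected to use.
    public int Add(Customer customer)
    {
        return base.Add(customer);
    }

    // The untyped Add is still reachable, so reject anything that is not a Customer.
    public override int Add(object value)
    {
        if (!(value is Customer))
            throw new ArgumentException("Only Customer objects are allowed.", "value");
        return base.Add(value);
    }

    // Insert, AddRange, InsertRange, SetRange, etc. are NOT overridden in this sketch,
    // so they still accept any object.
}
```

Note that the override of Add(object) can only reject a wrong type at run time; the compile-time
guarantee comes from the typed overload and the typed indexer.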

While this approach helps us achieve what we want, it has a major drawback: we need to override
every method of ArrayList that can compromise its type safety, including overloaded
methods. Since all of the public and protected methods of ArrayList are available to
CustomerArrayList, failing to override even one type-unsafe method exposes the base
ArrayList's method to the programmer, which is exactly the condition we are trying to
avoid. Besides, with every new release of the .NET Framework, we need to check whether any new method
was added to ArrayList that could break the type safety of CustomerArrayList, and then override that
method. This problem exists not only for ArrayList, but also for every collection class in
System.Collections that needs to be made strongly typed. This can be quite a tedious programming
task. Surely there's a better way to achieve our goal.

Second Approach: Inheriting from CollectionBase and DictionaryBase

The System.Collections.Specialized namespace in the .NET Framework Class Library contains a few
specialized, strongly-typed collections that can contain only strings. For example,
StringCollection represents a collection of strings, NameValueCollection represents a collection
of associated string keys and string values that can be accessed by key or by index, and
StringDictionary implements a hashtable with string keys and string values.
Other namespaces provide a few more specialized, strongly-typed collections, such as
System.ComponentModel.AttributeCollection, System.Net.CookieCollection, and
System.Web.UI.WebControls.ListItemCollection.

System.Collections.CollectionBase

To create type-safe collections other than the ones supported, the .NET Framework
provides a class called CollectionBase in the System.Collections
namespace. This is an abstract class that needs to be subclassed in order to be useful.
It provides basic functionality such as a count of the number of elements (Count), removal of
an element at a particular location (RemoveAt), and so on: functionality that does not compromise the
type safety of the collection. All other methods (like Add, Remove,
Insert, etc.) need to be implemented by the subclasses of CollectionBase.

The .NET Framework provides a few type-safe implementations
of CollectionBase. These implementations are for specialized object types.
Most often we require collections of custom object types, in which case we need to construct our own
collection. Below, we create a type-safe collection of Customer objects by deriving
from CollectionBase.
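
A minimal sketch of such a collection follows, assuming a hypothetical Customer class with a Name
field. The choice of members to expose is also an assumption; each typed member simply delegates
to the protected List property.

```csharp
using System.Collections;

// Hypothetical element type, used only for illustration.
public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

// Strongly typed list of Customer objects built on CollectionBase.
// Count, Clear, and RemoveAt are inherited and need no typed wrapper.
public class CustomerList : CollectionBase
{
    public Customer this[int index]
    {
        get { return (Customer)List[index]; }
        set { List[index] = value; }
    }

    public int Add(Customer customer)
    {
        return List.Add(customer);
    }

    public void Insert(int index, Customer customer)
    {
        List.Insert(index, customer);
    }

    public void Remove(Customer customer)
    {
        List.Remove(customer);
    }

    public bool Contains(Customer customer)
    {
        return List.Contains(customer);
    }

    public int IndexOf(Customer customer)
    {
        return List.IndexOf(customer);
    }
}
```

With this class, a call such as customers.Add("some string") is rejected by the compiler, because
the only Add visible on CustomerList takes a Customer.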

CollectionBase encapsulates an
ArrayList and provides access to it via a protected property called InnerList.
CollectionBase also contains a protected property called List, which is
nothing but the CollectionBase object itself, returned as an IList (notice that
CollectionBase implements IList). In the code provided above, all calls to
CustomerList are delegated to the List object. They could be delegated to the InnerList
object as well, so the following code would also be perfectly valid:
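
A sketch of the same kind of collection delegating to InnerList instead, again assuming a
hypothetical Customer class; only Add and the indexer are shown.

```csharp
using System.Collections;

// Hypothetical element type, used only for illustration.
public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

// Variant that talks to the encapsulated ArrayList directly.
public class CustomerList : CollectionBase
{
    public Customer this[int index]
    {
        get { return (Customer)InnerList[index]; }
        set { InnerList[index] = value; }
    }

    public int Add(Customer customer)
    {
        // Goes straight to the ArrayList; no CollectionBase hooks are involved.
        return InnerList.Add(customer);
    }
}
```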

The difference is that List.Add is a wrapper around InnerList.Add. Before
List.Add calls InnerList.Add, it calls OnValidate and OnInsert.
After calling InnerList.Add, it calls OnInsertComplete. OnValidate,
OnInsert, and OnInsertComplete are virtual methods defined by CollectionBase.
They can be overridden to perform some custom validation and processing while accessing the members
of InnerList.
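
As an illustration of these hooks, the sketch below overrides OnValidate so that only Customer
objects pass validation. The Customer class and the error message are assumptions made for the
example.

```csharp
using System;
using System.Collections;

// Hypothetical element type, used only for illustration.
public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

public class CustomerList : CollectionBase
{
    public int Add(Customer customer)
    {
        // List.Add runs OnValidate and OnInsert before touching InnerList.
        return List.Add(customer);
    }

    // Called by the IList members of CollectionBase before an element is stored.
    protected override void OnValidate(object value)
    {
        base.OnValidate(value); // the base implementation rejects null
        if (!(value is Customer))
            throw new ArgumentException("Only Customer objects are allowed.", "value");
    }
}
```

With this override in place, a caller who casts the collection to IList and tries to add a
string gets an ArgumentException at run time, and the collection is left unchanged. A call that
goes to InnerList directly would skip this check.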

At this point, you're probably wondering how delegation to the List object works at all. Since
List is nothing but the CollectionBase object itself returned
as an IList, calling List.Add seems like just another way of
calling CollectionBase.Add. Yet if you look at the member definitions for
CollectionBase, you will not see any public or protected member called Add. So how are we able
to call List.Add, and how does this code compile and run?
The answer lies in the fact that CollectionBase defines Add as an
explicit interface member implementation
of IList, which means that Add is, in some sense, a private method that can only be accessed through
a reference of type IList. For this reason, List returns the CollectionBase object
as an IList.
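
The mechanism can be seen in isolation in the following sketch. IGreeter and QuietGreeter are
hypothetical names invented for this example; the Greeter property plays the role that List plays
in CollectionBase.

```csharp
// Hypothetical interface, used only for illustration.
public interface IGreeter
{
    string Greet(string name);
}

public class QuietGreeter : IGreeter
{
    // Explicit interface member implementation: no access modifier, and the
    // member name is qualified with the interface name.
    string IGreeter.Greet(string name)
    {
        return "Hello, " + name;
    }

    // Returns this object typed as the interface, so that Greet becomes callable.
    protected IGreeter Greeter
    {
        get { return this; }
    }

    public string GreetCustomer(string name)
    {
        return Greeter.Greet(name);
    }
}
```

Given QuietGreeter g = new QuietGreeter(), the call g.Greet("Ann") does not compile, while
((IGreeter)g).Greet("Ann") and g.GreetCustomer("Ann") both do.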

System.Collections.DictionaryBase

While CollectionBase allows us to implement strongly-typed collections of objects, another
class called DictionaryBase provides the abstract base class for creating
strongly-typed collections of (key, value) pairs. Just as CollectionBase encapsulates an
ArrayList, DictionaryBase encapsulates a Hashtable and provides access
to it via a protected property called InnerHashtable. DictionaryBase also contains
a protected property called Dictionary, which is nothing but the DictionaryBase object
itself returned as an IDictionary (notice that DictionaryBase implements
IDictionary). The semantics of defining explicit interface member implementations are the same
as in the previous case, which means that even though certain methods like Add and Remove
are defined in DictionaryBase, they can only be accessed through a reference of type IDictionary.

The following is a type-safe hashtable that stores int/Customer pairs as opposed to
object/object pairs stored in a Hashtable. All calls to
CustomerTable are delegated to the Dictionary object--they could be delegated to the InnerHashtable object as well.
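
A minimal sketch of such a table, assuming a hypothetical Customer class; the set of members
exposed here is an illustrative choice.

```csharp
using System.Collections;

// Hypothetical element type, used only for illustration.
public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

// Strongly typed table mapping int keys to Customer values.
// Count and Clear are inherited from DictionaryBase.
public class CustomerTable : DictionaryBase
{
    public Customer this[int id]
    {
        // Yields null when the key is absent, as Hashtable does.
        get { return (Customer)Dictionary[id]; }
        set { Dictionary[id] = value; }
    }

    public void Add(int id, Customer customer)
    {
        Dictionary.Add(id, customer);
    }

    public void Remove(int id)
    {
        Dictionary.Remove(id);
    }

    public bool Contains(int id)
    {
        return Dictionary.Contains(id);
    }
}
```

The int keys are still boxed when they reach the underlying Hashtable; the wrapper only makes the
public surface type-safe.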

CollectionBase and DictionaryBase solve the problem more elegantly than the
first approach. They take most of the work out of implementing a type-safe collection. However, they
suffer from some drawbacks that we did not see in the first approach:

Since CustomerList is not of type ArrayList, we cannot pass a
CustomerList wherever an ArrayList is expected. We could provide a method
that returns the InnerList object, but that would violate the rules of encapsulation.

If we want type-safe versions of more specialized collections like Stack and
Queue, we're out of luck using this approach.

Third Approach: Containing Existing Collection Classes

Even though the above approach works well for most collection classes, it does not address creation of
type-safe Stack and Queue classes. To implement these, we could either use the
inheritance method we saw in the first approach, or we could use the
containment/delegation method (similar to the second approach, except that this time we do it independent
of CollectionBase). The following class creates a type-safe Stack
of Customer objects using containment and delegation.
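
A minimal sketch of such a class, assuming a hypothetical Customer class. It exposes only the
core stack operations and forwards each of them to a private Stack.

```csharp
using System.Collections;

// Hypothetical element type, used only for illustration.
public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

// Type-safe stack built by containment: the untyped Stack is a private detail.
public class CustomerStack
{
    private readonly Stack stack = new Stack();

    public int Count
    {
        get { return stack.Count; }
    }

    public void Push(Customer customer)
    {
        stack.Push(customer);
    }

    public Customer Pop()
    {
        return (Customer)stack.Pop();
    }

    public Customer Peek()
    {
        return (Customer)stack.Peek();
    }

    public void Clear()
    {
        stack.Clear();
    }
}
```

Because the contained Stack is never exposed, no untyped member can leak through, whatever
methods Stack itself has.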

This approach gives us more flexibility in implementing our collections. For example, we could use the
same class to implement multiple collections. Also, containment protects us from any future changes
to the collection classes. The downside of this approach is the same as we saw with the second
approach: since CustomerStack is not of type Stack,
it cannot be passed wherever a Stack is expected.

In the above code, we could make CustomerStack implement the ICollection interface,
but in that case, we must be prepared to provide implementations for all of the methods and properties of
ICollection. By not implementing ICollection, we keep the freedom to provide only
the minimum functionality required for our stack implementation,
while still leveraging the contained Stack object by delegating all calls to it.

Conclusion

Even though all of the above approaches achieve compile-time type safety, none of them is
particularly elegant or efficient, for the following reasons:

All of them require us to write every type of collection for every type of object we wish to
store. This sacrifices polymorphism and results in code duplication.
The size of an application that wishes to achieve such
precise type safety can increase rapidly, and fixing bugs in the duplicated sections can be quite a task.

Since the collection classes in the .NET Framework store their elements as object references, using
them with value types leads to unnecessary boxing and unboxing,
which degrades performance. All three approaches discussed in this article make use of the built-in
collection classes, so they all suffer from the same performance problem.
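
The boxing cost can be made concrete with a small sketch. BoxingDemo and SumBoxed are hypothetical
names; the point is only to show where the conversions happen when int values go through an
ArrayList.

```csharp
using System.Collections;

public static class BoxingDemo
{
    // Stores the integers 0..count-1 in an ArrayList and adds them up again.
    public static int SumBoxed(int count)
    {
        ArrayList numbers = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            numbers.Add(i); // each int is boxed into a new object on the heap
        }

        int sum = 0;
        foreach (object boxed in numbers)
        {
            sum += (int)boxed; // each element is unboxed back to an int
        }
        return sum;
    }
}
```

A typed wrapper around ArrayList hides the casts from the caller, but the box and unbox
operations still take place inside the wrapper.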

The problems faced by the Collections Framework are precisely the problems that generic types
try to solve, but until we see support for generics in C#, we will have to either implement our own
type-safe collections, or succumb to writing type-unsafe code that rears its ugly head during a demo.

Amit Goel has been developing object-oriented applications for several years.
You can learn more about him at www.amitgoel.com.