2. Linked Data Structures

In our example application, we need to record the names of all the
people that enroll in a course. We have already established that it is
more efficient to use dynamic allocation to record their names.
Therefore, for each name we know starting at what memory location we
have copied it. We call this memory location the address of
the name.

Now, in order to record the names of all the students in
the course, what should we do? We could use an array that's big enough
to accommodate the largest number of students we can reasonably
expect. For each student, we'd record his/her name as illustrated
earlier, and store the corresponding address in the next available
cell of the array.

This array might have to be quite big. And what if we wanted to do
the same for all courses offered at the university. We'd have to
allocate very large arrays for each course, even though most of them
would not be at full capacity; therefore, large portions of each array
would remain used, yet would consume memory.

Let's take a look at what our example might look like. For
concreteness we suppose that the name JOE starts at memory location 1
and MARYJANE at memory location 7.

Note the order in which names (or rather the indices of their
memory location) are entered into the array. The first student is in
the first position; the next student is in the next position, etc...
When we consult this array, we rely on the expectation that the next
student can be found in the next location. We depend on
the fact that these locations are contiguous. In other words,
the logical relation next is represented by the physical
order of the locations in the array.

With a linked representation, this logical relation is represented
explicitly. Along with each student entry, we store the address
of the next student entry:

The first student entry, that of JOE, begins at location 1. It
starts with the address of the next student entry, which happens to be
MARYJANE at location 8. The entry for MARYJANE is linked to the next
entry (not shown in the diagram) that begins, say, at location 23 for
example.

Each student entry is linked to the next one. What happens
when there is no next one? How do we recognize the end of a chain? We
use the invalid address 0 for this purpose. In C, this invalid address
is represented by the constant NULL (not to be confused
with the NUL character '\0', although they both happen to
be defined as 0). Thus, if there were no further entries after
MARYJANE, we would have:

You can see that this method is much more economical than to
statically allocate a large array. Each student entry can be stored
anywhere in memory, and we can examine in turn all the students
enrolled in a course by following the links until we find one whose
value is NULL.

Furthermore, we can similarly envision a linked data structure of
all courses offered at the university. Each course would be
represented by a pair of positions (dynamically allocated): the first
would point to the first student in the first course (see
above), and the second would point to the next course. For
example, we might have something like this:

2.1. Conclusion

The combination of dynamic allocation and linked data structures is
extremely attractive because it utilizes memory with very high
efficiency. There is a price to be paid however.

As mentioned above, the user is responsible for correctly
returning memory when it is no longer needed.

Linked structures do require extra memory: space is needed for
storing the links. This overhead is usually small compared to
the increased utilization of memory gained by dynamic
allocation, but there are circumstances where the space for
links is a very high cost. We need to be wary of this.

The code for manipulating linked structures is sometimes quite a
lot more complex than the code for contiguous structures. But
this is not of great concern because when you use abstract
data types, the user does not know how the structure is
implemented. If the implementation is complex, that is the
implementer's concern; the complexity will be confined to the
few core operations that define the data type. The vast
majority of the code will be independent of the implementation
details and unaffected by its complexity.