Data Structures: The Linked List

If someone were to ask you, "What is a linked list good for," what would you
say? If you are like many developers, your response might be, "not much." We
would probably struggle to think of a situation in our career when we needed to
use one.

However, a linked list is not only a useful data structure, it is so often used
that we do not even notice it. From Deques to Stacks, it is used in situations
where fast insertion and removal is required.

Let's take a brief look at the basic structure of a linked list to understand
how it works.

The Basics

A linked list is a structure used to store an ordered set of data. Similar to
an array, the linked list can be used to store and access individual members of
a collection.

Each item, or node, in a linked list points the next node. If needed, a node
can also point to its predecessor. This is known as a doubly linked list. Each
node will also contain some sort of data. This can be as simple as an integer,
as complex as another structure, or even a pointer to another structures. The
sky's the limit, and the details depend on what problem you are trying to
solve.

Characteristics of the Linked list

As with any other data structure, linked lists are good at some things and bad
at others. The characteristics of any data structure dictate the optimal use
cases. So, what are the characteristics of a linked list?

Dynamic Sizing

Unlike an array, the members of a collection do not need to be stored next to
each other in memory. An array requires blocks of contiguous memory. A linked
list only needs a reference to the next (and, if necessary, the previous) node.
This design gives linked lists a characteristic of being dynamic. In other
words, the size of the collection can be easily altered during run-time.

Let me explain

Using a standard array, we (or the compiler) must define a contiguous block of
memory to store our data. As an example, we could use an array to store 100
integers. That is 400 bytes (assuming x86) of contiguous memory. There is no
issue with this. We have an insane amount of virtual address space available to
us (128 TB in Windows 10). Finding 400 contiguous bytes, or even 4 billion
contiguous bytes is not a problem. But what happens when we fill up this array
and still have more values to store?

Now, I know what you're thinking: "In my language I can't fill up an array.
It just gets bigger." You're right. Let's take it to the next step though and
ask the question: how does the array get bigger?

Many languages like Node, Python, and C# can define an array of any size using
something called a dynamic array. Dynamic arrays have their advantages, but their
use comes at a cost. When the data of a dynamic array exceeds its pre-allocated
space, it is resized. This is most often done by duplicating the entire array,
including additional space for new values.

In our example, a system using dynamic arrays would lock a minimum of 804 bytes
of memory and ask the processor to copy all 400 bytes to the new memory
location. With 400 bytes this is not an issue. With 4 billion bytes, though,
you can see where this is heading. While it might be easy to find 4G of
consecutive address space, it is no trivial task to actively read and write
over 8G of memory on a system that might have only 4G of physical RAM.

In this situation, using a linked list would use more memory (overhead for
pointers to the next node) but would not require it all be mapped
simultaneously. Nor would it require the processor copy large blocks of memory.
It would simply allocate an additional 12 bytes of storage: 8 bytes for the
next pointer, and 4 bytes for the value, then put the location of this new node
in the next field of the last entry. Insertion time is constant.

Fast Insertion and Deletion

As we learned from the above section, insertion time is constant. It is
probably no surprise that deletion time is also constant. Just as adding a new
integer to our collection required an allocation and a pointer reference
update, a deletion only requires a pointer reference update and a
de-allocation. But this is not only true at the ends of a list. It is true
anywhere in the list.

As you can see, a node can be added between two other nodes just as easily no
matter where it is in the list. This is different from an array where each
member below the point of insertion will need to be moved to make room for the
new value. An operation like that could be expensive.

Space Efficiency

Another characteristic of a linked list is space efficiency. This does not mean
that a linked list takes up the minimum amount of memory required to store a
collection of values. It does not. It means that it does not waste any space,
and has minimal overhead. If you have a collection of 50 integers, you are only
using the space needed to store 50 integers. You do not need to reserve
additional space because you plan to store more values. That space can simply
be allocated when needed.

LinkedList:1->2->3Array:[1|2|3||]

Linear Search

Unlike arrays, lists are not well suited for random access. A linked list has
no way to directly access a specific node at a random offset in a collection.
The only option is to do a linear search through the list and find the node
being requested.

In addition, the linked list has no special facility to check membership of a
specific value. Once again, the list must be searched to find the answer.

As an example, if we want to get the 5th node in the list, we start at the head
of the list and follow to the next node, repeating until we reach the 5th node.
If we wanted to know if the value 400 was contained in the list, we would have
to start at the head and test each value in turn until we find it. Once again,
the 4 billion members because problematic.

This is NOT an issue in standard arrays. We can simply go to the offset of
a member in a constant time. If the array is sorted, we can use a binary search
to expedite membership tests. An array is also capable of performing a linear
search so linked lists hold no advantage in this area.

Use Cases

Linked lists are good at fast insertion and fast removal of members, and are
space efficient, but are terrible at random access and membership checks. With
these characteristics in mind, what problems are linked lists suited to solve?

Stacks

From undo/redo tracking to syntax checking to returning to a preempted
workload, stacks are powerful data structures. They are also quite simple - but
why are we talking about stacks?

There is more than one way to implement a stack but using a linked list is
simple, especially if your programming language already supplies a linked list.
If you insert and remove from only one end of a linked list, you effectively
have a stack. All that is missing is handling of exceptional cases such as an
empty list.

Queues

Workload management, rate limiting, or turn-based problems are just a few things
that queues are good at.

To implement a queue with a doubly linked list, simply write to one end and
remove from the other end. Once again, some exceptional condition handling
would be required.

// Assume that head->next points to tail and tail->prev points to head.node*head;node*tail;voidenqueue(inti){node*n=malloc(sizeof(node));n->value=i;n->next=headhead->prev=n;head=n;}intdequeue(){intvalue=tail->value;node*n=tail->prev;free(tail);tail=n;returnvalue;}

Conclusion

Linked lists are often overlooked. Although they solve similar problems as
arrays, they have strengths that arrays do not. Linked lists are well suited to
tasks that require quick adding and removing, or tasks where sequence is
important, but would be a poor choice if you need to query or check membership
of a large collection.

As with any other data structure, it is useful to consider the positive and
negative characteristics and how they might relate to your specific problem.

Other Thoughts

One more consideration regarding linked lists that I find intriguing:
understanding lists is essential to understanding many more complex data
structures such as Binary Search Trees, Heaps, and Graphs. Each of these
structures keep references to their neighbors in the same way lists do, and it
is simply how the data is organized within the structure that gives it unique
characteristics. A linked list is nothing more than a simple tree in the way a
tree is a simple graph.