Mastering data structures in Ruby — Graphs

A graph is a data structure that allows us to represent data in terms of objects and relationships. Objects on a graph are called vertices (or nodes), and their connections to other nodes are called edges.

A <-- (this is vertex/node.) / \ B - C \ / <- (this is an edge.) D

A graph is said to be undirected where relationships between objects are bidirectional (like the one in the section above) or directed otherwise.

Directed graphs can be cyclic when a node points back to the node that references it, or acyclic when there are no back-pointers. For instance, compiler’s code generators may transform ASTs(abstract syntax trees) to DAGs (directed acyclic graphs) before emitting code for the target platform.

A / \ / \ v v B ----> C

* Directed Acyclic Graph.

As it happens with binary trees, you can search nodes by doing an either, breadth-first or depth-first traversal. Also, as well as binary trees, this kind of graphs don’t allow duplicate keys.

Interface for Graphs

The interface for general purposes graphs it’s quite simple. It’s just a small set of operations that allows us to add, remove, connect, disconnect, and search for vertices. That’s it.

Depending on the specific use of the graph, you may have to add additional properties. For instance, you might have to add weight to the edges if you are trying to solve the shortest path from one node to the other.

Methods

Graph interface.

Implementation details

At its core, a graph is just a sequence of connected nodes. To represent that sequence we are going to use a singly linked list where each entry on the list points to an instance of the Vertex class, the class we use to represent a node and its edges.

The singly linked list we are going to use this time is a slight variation of the one that we built in the first post of this series. The only addition to this version is a method that allows us to remove elements in constant time. (You can see the monkey patched version in the code attached to this post.)

This is how the ASCII representation of this data structure looks like:

[ vertex | {edges} | next] -> [ vertex | {edges} | next ] -> X (nil)

Each vertex (node) contains a key that uniquely identifies it and a set of edges attached to it.

To represent edges on the graph, we use the set data structure that we built previously on this series.

Initialize

This method is the graph’s constructor, and the only thing it does is initialize the list of vertices.

The complexity of this method is O(1).

def initialize @vertices = LinkedList.newend

Find Vertex

This method finds a vertex based on its key.

Since this method delegates the task to a singly linked list, the search ends up being linear, and so the complexity is O(n).

def find_vertex key @vertices.find_first { |v| v.data.key == key }end

Insert Vertex

This method adds a vertex to the graph.

Since this method has to guarantee that the graph contains no duplicates, its complexity is O(n), where n is the number of vertices in the graph.

def insert_vertex key return if find_vertex key

vertex = Vertex.new key @vertices.insert vertexend

Insert Edge

This method connects two nodes by adding an edge to the graph.

The complexity of this method is O(n), where n is the number of vertices in the graph.

Remove Vertex

This method removes a vertex from the graph.

Since this method is the most complex one, we are going to analyze it step by step.

The first thing we have to do is to look for the vertex that matches the key that we want to remove. This time we can’t use the standard find_vertex method because what we need is the node that contains the vertex on the vertices list and also the one that precedes it. By having those nodes, we can remove the target vertex in constant time.

Once we found the target node, we have to make sure that it doesn’t contain references to other vertices. If the node contains references, we can’t remove it.

If we pass the previous check, the node can be safely removed; To do so, we use the node that precedes the target node (captured while searching) and called the method remove_next on the modified version of the singly linked list that we use to store vertices.

The complexity of this method is O(n + e), where n is the number of vertices in the graph and e is the number of edges.