Existence of the de Bruijn Cycles (Sequences) via Graphs

There are nk strings of length k written with n distinct symbols, such as, e.g., 0, 1, 2, ..., n-1.

de Bruijn cycle (also de Bruijn sequence), say dBC(n, k), is a cyclic sequence that when written on a circle each of the nk sequences appears in it exactly once as a subsequence.

For example, here's one dBC(3, 3):

000222122021121020120011101

which should be imagined written in a circle. In this example, the length of the string is 3³ which is clearly the absolute minimum possible. On the other hand, any cyclic string of length greater than 3³ would contain at least one substring of length 3 at least twice. To generalize, de Bruijn cycle dBC(n, k) - if exists - should have length nk.

There are de Bruijn cycles for any positive integers n and k.

Proof

For k = 1, any string that lists all n available symbols in any order is dBC(n, 1).

Given a string abc...de of length k > 1 written with n symbols, there are n strings that could follow it in a dBC(n, k) if such one existed:

bc...de0, bc...de1, ..., bc...de(k-1).

Let's form a digraph - a de Bruijn graph - with nk vertices, labelled by the nk strings of length k in n symbols. We join the node labeled abc...de to the n nodes so constructed and label the edge with a single symbol with which the node "to" begins. For example, the edge joining abc...de to bc...de2 is labelled "2".

Similary, there are also n strings that could have preceeded it in a dBC(n, k), had one existed:

0abc...d, 1abc...d, ..., (k-1)abc...d.

This tells us that every vertex on a de Bruijn's graph has both indegree and outdegree equal to n.

Furthermore, a de Bruijn graph is necessarily connected. There is a pass from any vertex abc...de to any other vertex uvw...yz. For example. For k = 5, here is a path from abcde to uvwyz:

abcde, bcdeu, cdeuv, deuvw, euvwy, uvwyz.

It follows that a de Bruijn graph always has an Euler cycle. Pick any such cycle, record the successive edge labels in a string. The result will be one of de Bruijn cycles dBC(n, k+1).

Example 1

Let's construct dBC(2, 3). To this end, form a graph with vertices 00, 01, 10, and 11, and join them as shown:

Each vertex has the indegree and outdegree equal 2. Let's pick one of the Euler paths, say,

00→00→01→11→11→10→01→10→00

The record of the labels becomes 01110100, a string of length 2³ - as expected.

Example 2

Let's construct dBC(2, 4). To this end, form a graph with 8 vertices

000, 001, 010, 011, 100, 101, 110, 111,

and join them as shown below

There are 16 edges so that any Euler path has length 16 = 24 - the required length of a de Bruijn cycle dBC(2, 4).

Example 3

To see that the algorithm works in general, start recording with the label of the first vertex. Then at each step, i.e., after moving along an Euler path to the next vertex and recording one edge label, you shall be able to read that vertex's label at the end of the string. Closing the Euler cycle will result in a string of length (n+1)k + n, the latter n accounting for the label of the first vertex. This now can be dropped, leaving successive vertex labels to be read from the string.