1.7.8 Longest Common Substring

INPUT OUTPUT

Input Description:
A set S of strings S_1,...,S_n.

Problem:
What is the longest string c such that for each S_i,
1 \leq i \leq n, the characters of c appear as a scattered
subsequence of S_i?

Excerpt from
The Algorithm Design Manual:
The problem of longest common subsequence arises whenever we search for
similarities across multiple texts.
A particularly important application is in finding a consensus among
DNA sequences.
The genes for building particular proteins evolve with time, but the
functional regions must remain consistent in order to work correctly.
By finding the longest common subsequence of the same gene in different
species, we learn what has been conserved over time.

The longest common substring problem
is a special case of edit
distance, when substitutions are forbidden and only exact character match, insert, and delete are allowable edit
operations. Under these conditions,
the edit distance between p and t is n+m-2 |lcs(p,t)|,
since we can delete the
missing characters from p to the lcs(p,t) and insert the missing characters
from $t$ to transform p to t.
This is particularly interesting because
the longest common subsequence can be faster to compute than edit distance.