Implement Diff Utility

Implement your own diff utility. i.e given two similar strings, efficiently list out all differences between them.

The diff utility is a data comparison tool that calculates and displays the differences between two text. It tries to determine the smallest set of deletions and insertions to create one text from the other. Diff is line-oriented rather than character-oriented, unlike edit distance.

For example,

Input:
string X = “XMJYAUZ”;
string Y = “XMJAATZ”;

Output:
X M J -Y A -U +A +T Z

(- indicates that character is deleted from Y but it was present in X)
(+ indicates that character is inserted in Y but it was not present in X)

We can use Longest Common Subsequence (LCS) to solve this problem. The idea is to find a longest sequence of characters that is present in both original sequences in the same order. From a longest common subsequence it is only a small step to get diff-like output –

If an character is absent in the subsequence but present in the first original sequence, it must have been deleted (indicated by the ‘–’ marks).

If it is absent in the subsequence but present in the second original sequence, it must have been inserted (indicated by the ‘+’ marks).