List comparing algorithms

Generally, we recommend using Levenshtein, because it’s the smartest one.
But use it with caution: it can be slow for long lists,
say more than 300 elements.

The main advantage of the Simple algorithm is speed: it has linear computational complexity.
Its main disadvantage is verbose output.

Choose the Set algorithm if you don’t care about item ordering.
JaVers will convert all Lists to Sets before comparison.
This algorithm produces the most concise output (only ValueAdded and ValueRemoved).
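
The set-based comparison described above can be illustrated with a minimal, self-contained sketch. It is not JaVers code: the class name `SetCompareSketch` and the methods `added` and `removed` are hypothetical names chosen for illustration, and the sketch only assumes that ordering and duplicates are discarded before the two collections are compared.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; not the JaVers implementation.
class SetCompareSketch {

    // Elements present in 'right' but missing from 'left' (the "added" values).
    static <T> Set<T> added(List<T> left, List<T> right) {
        Set<T> result = new LinkedHashSet<>(right); // list -> set, duplicates dropped
        result.removeAll(new LinkedHashSet<>(left));
        return result;
    }

    // Elements present in 'left' but missing from 'right' (the "removed" values).
    static <T> Set<T> removed(List<T> left, List<T> right) {
        return added(right, left);
    }

    public static void main(String[] args) {
        List<String> left = List.of("a", "b", "c");
        List<String> right = List.of("c", "a", "d");
        System.out.println("added: " + added(left, right));     // prints "added: [d]"
        System.out.println("removed: " + removed(left, right)); // prints "removed: [b]"
    }
}
```

Note that reordering 'a' and 'c' produces no change at all here, which is why this kind of comparison yields only added and removed values.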

Simple vs Levenshtein algorithm

The Simple algorithm generates changes for shifted elements (when elements are inserted or removed in the middle of a list).
By contrast, the Levenshtein algorithm calculates a short and clear change list even when elements are shifted.
It doesn’t care about index changes for shifted elements.

For example, when we remove one element from a list:

javers.compare(['a','b','c','d','e'],['a','c','d','e'])

the change list will differ depending on the chosen algorithm:

Output from Simple algorithm

(1). 'b'>>'c'
(2). 'c'>>'d'
(3). 'd'>>'e'
(4). removed:'e'

Output from Levenshtein algorithm

(1). removed: 'b'

But when both lists have the same size:

javers.compare(['a','b','c','d'],['a','g','e','i'])

the change list will be the same:

Simple algorithm

(1). 'b'>>'g'
(2). 'c'>>'e'
(3). 'd'>>'i'

Levenshtein algorithm

(1). 'b'>>'g'
(2). 'c'>>'e'
(3). 'd'>>'i'
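
The Simple outputs in both examples are what an index-by-index comparison would produce. The sketch below is a rough illustration of that idea under our own assumptions, not the JaVers source: `SimpleCompareSketch` and `compare` are hypothetical names, and changes are rendered as plain strings in the notation used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Illustrative sketch only; not the JaVers implementation.
class SimpleCompareSketch {

    // Compares elements position by position in a single pass (linear time).
    static <T> List<String> compare(List<T> left, List<T> right) {
        List<String> changes = new ArrayList<>();
        int common = Math.min(left.size(), right.size());

        // Same index on both sides: report a value change if the elements differ.
        for (int i = 0; i < common; i++) {
            if (!Objects.equals(left.get(i), right.get(i))) {
                changes.add("'" + left.get(i) + "'>>'" + right.get(i) + "'");
            }
        }
        // Trailing indexes that exist only on the left were removed.
        for (int i = common; i < left.size(); i++) {
            changes.add("removed:'" + left.get(i) + "'");
        }
        // Trailing indexes that exist only on the right were added.
        for (int i = common; i < right.size(); i++) {
            changes.add("added:'" + right.get(i) + "'");
        }
        return changes;
    }

    public static void main(String[] args) {
        // Removing 'b' shifts every later element, so each shifted index is reported.
        System.out.println(compare(List.of("a", "b", "c", "d", "e"),
                                   List.of("a", "c", "d", "e")));
        // prints ['b'>>'c', 'c'>>'d', 'd'>>'e', removed:'e']
    }
}
```

Because the comparison never looks beyond the current index, a single removal in the middle of the list shows up as a chain of value changes followed by one removal at the end.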

More about Levenshtein distance

The idea is based on the Levenshtein edit distance
algorithm, usually used for comparing Strings.
It answers the question: what changes are needed to go from one String to another?

Since a list of characters (i.e. a String) is equal to a list of objects up to isomorphism,
we can use the same algorithm to find the Levenshtein edit distance for lists of objects.

The algorithm is based on computing the shortest path in a DAG. It takes O(nm) space
and O(nm) time. Further work should improve it to take O(n) space and O(nm) time (n and m being
the lengths of the two compared lists).
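
To make the idea concrete, here is a minimal sketch of Levenshtein edit distance applied to lists of objects. It uses the textbook dynamic-programming table, which can be read as a shortest-path computation over a grid-shaped DAG, and it should not be taken as the JaVers implementation. The names `LevenshteinListSketch`, `table` and `changes` are hypothetical, and changes are rendered as plain strings in the notation used in the examples above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Illustrative sketch only; not the JaVers implementation.
class LevenshteinListSketch {

    // d[i][j] = minimal number of edits turning the first i elements of 'left'
    // into the first j elements of 'right'. Uses O(nm) space and time.
    static <T> int[][] table(List<T> left, List<T> right) {
        int n = left.size(), m = right.size();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i; // remove everything
        for (int j = 0; j <= m; j++) d[0][j] = j; // add everything
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = Objects.equals(left.get(i - 1), right.get(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,
                          Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }
        return d;
    }

    // Walks the table backwards from d[n][m] to recover one minimal edit script.
    static <T> List<String> changes(List<T> left, List<T> right) {
        int[][] d = table(left, right);
        List<String> out = new ArrayList<>();
        int i = left.size(), j = right.size();
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0) {
                boolean same = Objects.equals(left.get(i - 1), right.get(j - 1));
                if (d[i][j] == d[i - 1][j - 1] + (same ? 0 : 1)) {
                    // Diagonal step: either a match (no change) or a value change.
                    if (!same) out.add("'" + left.get(i - 1) + "'>>'" + right.get(j - 1) + "'");
                    i--; j--;
                    continue;
                }
            }
            if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                out.add("removed:'" + left.get(i - 1) + "'");
                i--;
            } else {
                out.add("added:'" + right.get(j - 1) + "'");
                j--;
            }
        }
        Collections.reverse(out); // the script was collected back to front
        return out;
    }

    public static void main(String[] args) {
        System.out.println(changes(List.of("a", "b", "c", "d", "e"),
                                   List.of("a", "c", "d", "e"))); // prints [removed:'b']
    }
}
```

The full table is kept here because the backward walk needs it to recover the edit script. Computing only the distance would need just two rows of the table at a time.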

Custom Comparators

There are cases where JaVers’ diff algorithm isn’t appropriate,
and you need to implement your own comparator for certain types.