I don't quite feel like I can implement the algorithm based on the
paper. It seems to outline fairly clearly how to detect longer patterns
with statistical significance, but I don't see how it selects
equivalence sets. Can anyone shed any light on that piece?

That being said, the algorithm looks extremely straightforward to me;
as described in the paper, it's a brute-force scan for exactly what
we'd expect a pattern to be.
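
To make "brute-force scan" concrete, here is a minimal sketch of the
general idea as I read it. This is my own illustration, not the paper's
algorithm: it replaces the statistical-significance test with a plain
count threshold and does nothing about equivalence sets. The function
name and all parameters are made up for the example.

```python
from collections import Counter


def repeated_ngrams(tokens, min_len=2, max_len=4, min_count=2):
    """Return {ngram: count} for every n-gram seen at least min_count times.

    Assumption: a raw count threshold stands in for a real significance
    test, which this sketch does not implement.
    """
    counts = Counter()
    # Slide a window of every allowed length over the whole sequence.
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    # Keep only the n-grams that recur often enough.
    return {gram: c for gram, c in counts.items() if c >= min_count}


tokens = "the cat sat on the mat and the cat ran".split()
patterns = repeated_ngrams(tokens)
print(patterns)
```

The scan is quadratic-ish in practice (every window length at every
position), which is fine for a toy input but is exactly the kind of
cost a real implementation would need to manage.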

Also note that the algorithm implicitly assumes distinct language
literals and has poor recognition of identical patterns occurring in a
neighboring part of the board.

The paper is entitled "Automatic acquisition and efficient
representation of syntactic structures", and claims to present an
algorithm for learning "sequences", not only language-based but also
"sheet music or protein sequences".