Lempel-Ziv Algorithm: Coding a Text with Digits

Suppose our text contains not only letters and punctuation marks, but digits as well. How can we tell, in the coded chain, whether a digit is the last character of a string (and thus a character from the original text) or part of the reference number of a prefix? There is an easy solution to this problem: give all the reference numbers a fixed length, adding leading zeros when necessary. In our example we have fewer than 100 strings, so we can use two (decimal) digits for each reference number.
In our example, the original string of:

Thus, we can use the same symbols for the prefix and for the last character: all our encoded strings now have the same length, so the position of a character within an encoded string unambiguously indicates its role. This idea allows us to compress binary text using only 0s and 1s. You will see an example on the next page.
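
The fixed-length idea can be tried out with a small Python sketch. It is only an illustration, not the tutorial's reference implementation: the function names (lz_parse, encode, decode) are hypothetical, and the sketch assumes that reference number 0 stands for the empty prefix and that the decoder is told the reference width separately.

```python
def lz_parse(text):
    """Split text into strings, each a previously seen string plus one new character.

    Returns (prefix_reference, last_character) pairs.
    Assumption: reference 0 denotes the empty prefix.
    """
    index = {"": 0}  # string -> reference number
    pairs = []
    prefix = ""
    for ch in text:
        if prefix + ch in index:
            prefix += ch  # keep extending while the string is already known
        else:
            pairs.append((index[prefix], ch))
            index[prefix + ch] = len(index)
            prefix = ""
    if prefix:
        # The text ended inside a known string: emit it as prefix + last character.
        pairs.append((index[prefix[:-1]], prefix[-1]))
    return pairs


def encode(text):
    """Return (width, code), where every reference number has exactly `width` digits."""
    pairs = lz_parse(text)
    # Count the strings to choose the width, e.g. fewer than 100 strings -> 2 digits.
    width = len(str(len(pairs)))
    code = "".join(f"{ref:0{width}d}{ch}" for ref, ch in pairs)
    return width, code


def decode(width, code):
    """Rebuild the text: position alone tells a reference digit from a text character."""
    strings = [""]  # strings[0] is the empty prefix
    step = width + 1  # `width` reference digits followed by one character
    for i in range(0, len(code), step):
        ref = int(code[i:i + width])
        last = code[i + width]
        strings.append(strings[ref] + last)
    return "".join(strings[1:])


width, code = encode("ab12ab12ab")
print(width, code)
print(decode(width, code))
```

Because every encoded string occupies width + 1 characters, the decoder never has to guess whether a digit belongs to a reference number or to the original text, even when the text itself is made of digits.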

To summarize, we have 4 steps:

parsing

counting the number of strings to choose the fixed length of the prefix reference numbers