Text::Lossy is a collection of text filters for lossy compression. "Lossy compression" changes the data in a way that is irreversible but results in a smaller file size after compression. One of the best-known uses of lossy compression is the JPEG image format.

Note that this module does not perform the actual compression itself; it merely changes the text so that it can be compressed better.

Text::Lossy uses an object-oriented interface. You create a new Text::Lossy object, set the filters you wish to use (described below), and call the "process" method on the object. You can call this method as often as you like. In addition, there is a method that produces a closure (an anonymous subroutine) which acts like the "process" method on the given object.

This method takes a list of filter names and adds them to the filter list of the filter object, in the order given. This allows a programmatic selection of filters, for example via the command line. Returns the object for method chaining.

If the filter is unknown, an exception is thrown. This may happen when you misspell the name, forget to use a module that registers the filter, or forget to register it yourself.

Returns a code reference that closes over the object. This code reference acts like a bound "process" method on the constructed object. It can be used in places like Text::Filter that expect a code reference that filters text.

The code reference is bound to the object, not to a particular state of the object. Adding filters to the object after calling as_coderef will also change the behaviour of the code reference.
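The interplay between the filter list, "process", and the code reference can be illustrated with a small model. This is a minimal Python sketch, not the module's Perl implementation: the class name LossyModel, the method name add, the AVAILABLE table, and the filter names "strip" and "lower" are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Filter = Callable[[str], str]

# Hypothetical table of available filters (names are illustrative only).
AVAILABLE: Dict[str, Filter] = {
    "strip": str.strip,
    "lower": str.lower,
}


class LossyModel:
    """Toy model of an object that holds an ordered filter list."""

    def __init__(self) -> None:
        self.filters: List[Filter] = []

    def add(self, *names: str) -> "LossyModel":
        # Unknown names raise an exception, e.g. after a misspelling.
        for name in names:
            if name not in AVAILABLE:
                raise KeyError(f"unknown filter: {name}")
            self.filters.append(AVAILABLE[name])
        return self  # returning the object allows method chaining

    def process(self, text: str) -> str:
        # Apply the filters in the order they were added.
        for func in self.filters:
            text = func(text)
        return text

    def as_coderef(self) -> Filter:
        # The closure captures the object, not a snapshot of its filters.
        return lambda text: self.process(text)


model = LossyModel().add("strip")
code = model.as_coderef()
before = code("  Hello  ")   # only "strip" is active here
model.add("lower")           # added after the closure was created
after = code("  Hello  ")    # the same closure now also lowercases
```

Because the closure looks up the filter list at call time, the second call sees the filter that was added after the closure was created.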

Collapses any whitespace (\s in regular expressions) to a single space, U+0020. Whitespace at the beginning of the text is stripped completely. Whitespace at the end is also collapsed to a single space, to help separate lines. Text consisting only of whitespace results in an empty string.
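The rules above can be sketched in a few lines. This is an illustrative Python approximation of the described behaviour, not the module's own code; note that Python's \s and Perl's \s may differ slightly on exotic Unicode whitespace.

```python
import re


def whitespace(text: str) -> str:
    # Collapse every run of whitespace into a single space (U+0020).
    collapsed = re.sub(r"\s+", " ", text)
    # Strip the leading space completely; a trailing space is kept.
    return collapsed.lstrip(" ")


example = whitespace("  Hello \n\t world \n")
only_space = whitespace(" \n\t ")
```

Here example ends in a single space, and only_space is the empty string, since the whole input counts as leading whitespace.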

A variant of the "whitespace" filter that leaves newlines on the end of the text alone. Other whitespace at the end will get collapsed into a single newline. If the text ends in whitespace that does not contain a newline, it is replaced by a space, as before.

This filter is most useful if you are creating a Unix-style text filter, and do not want to buffer the entire input before writing the (only) line to stdout. The newline at the end will allow downstream processes to work on new lines, too. Otherwise, this filter is not quite as efficient as the whitespace filter.

Any newlines in the middle of the text are collapsed to a space, too. This is especially useful if you are reading in "paragraph mode", e.g. $/ = '', as you will get one long line per former paragraph.
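A rough Python sketch of this variant follows. It is an approximation built from the description above, not the module's implementation. The handling of input that consists only of whitespace is an assumption made here (it yields an empty string); the text does not specify that case for this filter.

```python
import re

_TRAILING = re.compile(r"\s+\Z")


def whitespace_nl(text: str) -> str:
    tail = ""
    match = _TRAILING.search(text)
    if match:
        # Trailing whitespace becomes a newline if it contains one,
        # otherwise a single space.
        tail = "\n" if "\n" in match.group(0) else " "
        text = text[:match.start()]
    # Interior runs (including newlines) collapse to one space;
    # leading whitespace is stripped completely.
    body = re.sub(r"\s+", " ", text).lstrip(" ")
    # Assumption: whitespace-only input produces an empty string.
    return body + tail if body else ""


with_newline = whitespace_nl("  Hello \n world \t\n ")
without_newline = whitespace_nl("Hello   world  \t")
```

The first call keeps a single newline at the end, while the second ends in a single space because its trailing whitespace contains no newline.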

A variant of "punctuation" that replaces punctuation with a space character, U+0020, instead of removing it completely. This is usually less efficient for compression, but retains more readability, for example in the presence of URLs or email addresses.
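The difference between removing and replacing punctuation can be shown on an email address. This is a Python sketch under two assumptions that are not stated in the text: "punctuation" is taken to mean any character in a Unicode P* general category, and each punctuation character is replaced individually rather than as a run.

```python
import unicodedata


def _is_punct(ch: str) -> bool:
    # Assumption: punctuation means Unicode general categories Pc, Pd, Po, ...
    return unicodedata.category(ch).startswith("P")


def punctuation(text: str) -> str:
    # Remove punctuation completely.
    return "".join(ch for ch in text if not _is_punct(ch))


def punctuation_sp(text: str) -> str:
    # Replace each punctuation character with a space (U+0020).
    return "".join(" " if _is_punct(ch) else ch for ch in text)


removed = punctuation("user@example.com")
spaced = punctuation_sp("user@example.com")
```

With removal, the parts of the address run together into one token; with replacement, they stay separated and remain recognisable.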

Leaves the first and last letters of a word alone, but replaces the interior letters with the same set, sorted by the sort function. This is done on the observation (source uncertain at the time) that words can still be made out if the letters are present, but in a different order, as long as the outer ones remain the same.

This filter may not work as proposed with every language or writing system. Specifically, it uses word-boundary matches (\b) to determine which letters to leave alone.
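As an illustration, the idea can be approximated in Python as follows. This is a sketch only: the regular expression, the restriction to words made purely of letters, and the use of Python's default code point ordering in sorted are assumptions made here, not details taken from the module.

```python
import re

# A letter is "word character, but not a digit or underscore".
_LETTER = r"[^\W\d_]"
_WORD = re.compile(rf"\b({_LETTER})({_LETTER}+)({_LETTER})\b")


def alphabetize(text: str) -> str:
    def sort_interior(match: "re.Match[str]") -> str:
        first, interior, last = match.groups()
        # Keep the outer letters, sort the interior ones.
        return first + "".join(sorted(interior)) + last

    return _WORD.sub(sort_interior, text)


scrambled = alphabetize("words remain readable")
```

Words of one or two letters have no interior and are left unchanged. Repeated interior letters end up adjacent, which tends to help a compressor that follows.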

Adds one or more named filters to the set of available filters. Filters are passed in an anonymous hash. Previously defined mappings may be overwritten by this function. Specifically, passing undef as the code reference removes the filter.
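The registration semantics described above (add, overwrite, and remove by passing an undefined value) can be modelled with a plain mapping. This is a hypothetical Python sketch: REGISTRY, the filter names, and the use of None in place of Perl's undef are illustrative assumptions, not the module's API.

```python
from typing import Callable, Dict, Optional

Filter = Callable[[str], str]

# Hypothetical global table of available filters.
REGISTRY: Dict[str, Filter] = {}


def register_filters(mapping: Dict[str, Optional[Filter]]) -> None:
    for name, func in mapping.items():
        if func is None:
            # None stands in for Perl's undef: remove the filter.
            REGISTRY.pop(name, None)
        else:
            # New names are added; existing names are overwritten.
            REGISTRY[name] = func


register_filters({"shout": str.upper, "remove_ps": lambda t: t.replace("p", "")})
register_filters({"shout": str.lower})   # overwrites the earlier mapping
register_filters({"remove_ps": None})    # removes the filter again
```

After these three calls only "shout" remains registered, and it maps to the lowercasing function.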

One thing to note is that the Text::Lossy filters do not follow Text::Filter's convention that lines "to be skipped" should result in an undef. This means you need to expect completely empty lines (q{}, not even a newline character) in your output. This should be no problem if you print to a file handle or append to a string, but may be surprising if you are filtering an array of lines.
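The effect on an array of lines can be seen in a small Python sketch. The whitespace function below is a stand-in approximation of the filter of that name, and dropping empty results afterwards is just one possible way to deal with them, shown here as an assumption rather than a recommendation from the module.

```python
import re


def whitespace(text: str) -> str:
    # Stand-in for a whitespace-collapsing filter (approximation).
    return re.sub(r"\s+", " ", text).lstrip(" ")


lines = ["first line\n", "   \n", "second line\n"]

# Filtering line by line leaves an empty string where a blank line was.
results = [whitespace(line) for line in lines]

# One way to cope: drop the empty results explicitly.
kept = [result for result in results if result != ""]
```

The middle entry of results is a completely empty string, so code that expects one non-empty entry per input line needs a step like the one that builds kept.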