This module provides transparent methods to maintain thesaurus files. It uses a subset of ISO 2788, which defines some standard features of thesaurus files. The module also supports multilingual thesauri and some extensions to the ISO standard.

A thesaurus is a classification structure. We can see it as a graph where the nodes are terms and the edges are relations between terms.

ISO 2788 includes a set of relations that can be treated as standard, but this module also accepts user-defined ones, so it can be used with both ISO and non-ISO thesaurus files.

Thesaurus files used with this module are plain ASCII documents. A file can contain processing instructions, comments and term definitions. Processing instructions are used to define new relations and the mathematical properties that hold between them.

Comments can appear on any line, but the comment character (#) must be the first character on the line, with no spaces before it. A comment extends to the end of the line (up to the first line break).

Processing instruction lines follow the same rule as comments, but start with the percent sign (%). These instructions are described later in this document.

A term definition cannot contain empty lines, because empty lines separate definitions from each other. The first line of a term definition record holds the term being defined. The following lines define its relations with other terms: each starts with the abbreviation of the relation (in upper case), followed by spaces and then a comma-separated list of terms.

The same relation can appear on more than one line; the module concatenates the lists. To continue a list on the next line, either repeat the relation abbreviation or leave some spaces between the start of the line and the list of terms.
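The record layout described above can be illustrated with a small parser. This is only a sketch of the rules as stated in this document, not the module's own loader: the sample terms are made up, BT, NT and RT are the usual ISO 2788 abbreviations (broader, narrower and related term), and parse_thesaurus is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical sample file; the terms are invented for illustration.
my $sample = <<'END';
# a comment: the '#' is in the first column
Animal
NT Cat, Dog
NT Bird

Cat
BT Animal
RT Dog,
   Lion
END

# Illustrative helper, not part of the module.
sub parse_thesaurus {
    my ($text) = @_;
    my (%thesaurus, $term, $rel);
    for my $line (split /\n/, $text) {
        next if $line =~ /^[#%]/;             # skip comments and instructions
        if ($line =~ /^\s*$/) {               # an empty line ends the record
            undef $term;
            undef $rel;
            next;
        }
        if (!defined $term) {                 # first line names the term
            ($term = $line) =~ s/^\s+|\s+$//g;
            $thesaurus{$term} ||= {};
            next;
        }
        my $list;
        if ($line =~ /^([A-Z]+)\s+(.*)$/) {   # "REL term, term"
            ($rel, $list) = ($1, $2);
        } elsif (defined $rel && $line =~ /^\s+(.*)$/) {
            $list = $1;                       # continuation of the last list
        } else {
            next;                             # not covered by this sketch
        }
        # Lists for a repeated relation are concatenated.
        push @{ $thesaurus{$term}{$rel} },
            grep { length } map { s/^\s+|\s+$//gr } split /,/, $list;
    }
    return \%thesaurus;
}

my $t = parse_thesaurus($sample);
print join(', ', @{ $t->{Animal}{NT} }), "\n";   # Cat, Dog, Bird
print join(', ', @{ $t->{Cat}{RT} }), "\n";      # Dog, Lion
```

The two NT lines of Animal end up in one list, and the indented line under Cat continues the RT list, which are the two ways of extending a relation mentioned above.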

To present a thesaurus, we need a term from which to start. Normally a thesaurus has some kind of top level from which navigation begins. This command specifies that term, which is used whenever no term is given.

The internationalization functions, interfaceLanguages and interfaceSetLanguage, should be used before any other function or constructor. Importantly, when a saved thesaurus is loaded, the descriptions defined in that file are not translated.

interfaceLanguages()

This function returns the list of languages that can be used with the current version of the module.

interfaceSetLanguage( <lang-name> )

This function turns on the specified language, so it should be the first function you call when using this module. The default language is Portuguese, but future versions may change this, so you should call it anyway.

This module uses Perl's object-oriented programming model, so you must first create an object with one of the thesaurusNew, thesaurusLoad or thesaurusRetrieve functions. All other commands are then called as methods on that object.

To use the thesaurusLoad function, you must supply a file name. The file should be an ISO ASCII file as defined in the earlier sections. The function returns an object with the contents of the file. If the file does not define relations and descriptions for the ISO classes, they are added.

Also,

$obj = thesaurusLoad({ completed => 1}, 'iso-file');

can be used to say that the thesaurus does not need to be completed after loading.

Reading and parsing text files is slow, so this module can also save thesauri to, and load them from, Storable files. This function takes the name of a file that was saved with the storeOn function.
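As a rough illustration of why a Storable file avoids the parsing step, the sketch below round-trips a data structure with Perl's core Storable module. The hash is only a stand-in: this document does not describe the module's internal object layout, so the structure and the sample terms are assumptions.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Stand-in data: a plain hash, not the module's real object layout.
my %thesaurus = (
    Animal => { NT => [ 'Cat', 'Dog' ] },
    Cat    => { BT => [ 'Animal' ] },
);

# A temporary file that is removed when the program exits.
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

store(\%thesaurus, $file);        # binary dump of the whole structure
my $copy = retrieve($file);       # reload without any text parsing

print "$copy->{Cat}{BT}[0]\n";    # Animal
```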

This method dumps the object to an ISO ASCII file; you must supply a file name. Note that loading a file with thesaurusLoad and then calling save does not reproduce the original file: comments are removed and processing instructions may be added.

Note: if the process fails, this method returns 0. All other methods die when they fail to save to a file.

This method declares that two relation classes are inverses of each other. Note that any property previously defined for either relation will be deleted. If either relation does not exist, it will be added.

This function completes the thesaurus based on the inversion properties. It is only needed when terms and relations are added through this API; whenever the module loads an ISO thesaurus file, the thesaurus is completed automatically.
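The idea of completing a thesaurus from inverse relations can be sketched on a plain hash of terms. This is not the module's implementation: complete_thesaurus and the %inverse table are hypothetical names, and the pairs (BT and NT as inverses, RT as its own inverse) are the usual ISO 2788 reading, written out here as an assumption.

```perl
use strict;
use warnings;

# Assumed inverse pairs, spelled out for the example.
my %inverse = (BT => 'NT', NT => 'BT', RT => 'RT');

# Hypothetical helper, not the module's method.
sub complete_thesaurus {
    my ($t) = @_;
    for my $term (keys %$t) {                 # key list is taken up front
        for my $rel (keys %{ $t->{$term} }) {
            my $inv = $inverse{$rel} or next; # no inverse known: skip
            for my $other (@{ $t->{$term}{$rel} }) {
                # Create the reverse list if needed, then add $term once.
                my $back = $t->{$other}{$inv} ||= [];
                push @$back, $term unless grep { $_ eq $term } @$back;
            }
        }
    }
    return $t;
}

my $t = complete_thesaurus({ Cat => { BT => ['Animal'], RT => ['Lion'] } });
print join(', ', sort keys %$t), "\n";   # Animal, Cat, Lion
print "@{ $t->{Animal}{NT} }\n";         # Cat
```

Only Cat was defined explicitly; the entries for Animal and Lion appear because each relation implies its inverse.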

The downtr method is used to produce output from a set of terms. When no term is given, the whole thesaurus is used. Its arguments are a term and an associative array (the handler) of anonymous subroutines, one for each relation to be processed. These subroutines can use the pre-instantiated variables $term, $rel and @terms. The handler can also have three special functions: -default (the handler for relations that have no function of their own), -eachTerm (executed on the output of each term, received as $_) and -end (executed on the combined output of the other functions, received as $_).

If an -order array reference is provided, the relations are processed in the corresponding order.
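To make the handler mechanism more concrete, here is a simplified, self-contained traversal in the same spirit. It is a sketch under assumptions, not downtr itself: the walk function, its argument order and the plain-hash thesaurus are all hypothetical, and only the handler keys (-default, -eachTerm, -end, -order) and the variables $term, $rel and @terms follow the description above.

```perl
use strict;
use warnings;

our ($term, $rel, @terms);   # package variables seen by the handlers

# Hypothetical, simplified stand-in for downtr.
sub walk {
    my ($t, $handler, @start) = @_;
    @start = sort keys %$t unless @start;   # no term given: whole thesaurus
    my $out = '';
    for my $name (@start) {
        local $term = $name;
        my $rels  = $t->{$name} || {};
        my @order = $handler->{-order}
                  ? @{ $handler->{-order} }
                  : sort keys %$rels;
        my $piece = '';
        for my $r (@order) {
            next unless $rels->{$r};
            local ($rel, @terms) = ($r, @{ $rels->{$r} });
            # Relation-specific function first, then the -default one.
            my $code = $handler->{$r} || $handler->{-default} or next;
            $piece .= $code->();
        }
        if (my $each = $handler->{-eachTerm}) {
            local $_ = $piece;              # output of this term
            $piece = $each->();
        }
        $out .= $piece;
    }
    if (my $end = $handler->{-end}) {
        local $_ = $out;                    # output of everything else
        $out = $end->();
    }
    return $out;
}

my %t = (Cat => { BT => ['Animal'], RT => ['Lion', 'Tiger'] });
my $result = walk(\%t, {
    -order    => [ 'RT', 'BT' ],
    RT        => sub { "  see also: @terms\n" },
    -default  => sub { "  $rel: @terms\n" },
    -eachTerm => sub { "$term\n$_" },
    -end      => sub { "THESAURUS\n$_" },
}, 'Cat');
print $result;
```

Here RT has its own function, BT falls back to -default, and the -order list puts RT before BT in the output for Cat.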

This method writes a thesaurus in LaTeX format... The first argument is a tag substitution hash. The translation is done with the downtr function; a downtr handler can be given to tune some details of the transformation...

This method writes a thesaurus in XML format... The first argument is a tag substitution hash. The translation is done with the downtr function; a downtr handler can be given to tune some details of the transformation...