English abstract

Application of finite-state transducers to the unification of term variants. An approach based on finite-state techniques has been applied to the unification of terms in Spanish. Conflation algorithms are computational procedures used in some Information Retrieval (IR) systems to reduce semantically equivalent term variants to a normalized form. The programs that usually carry out this process are called stemmers and lemmatizers. The objective of this work is to evaluate the deficiencies and errors of lemmatizers in the conflation of terms. The lemmatizer was built with a linguistic tool that allows the construction of electronic dictionaries represented internally as Finite-State Transducers (FST). The lexical resources developed were applied to a verification corpus to evaluate the performance of these lexical parsers. The evaluation metric was an adaptation of coverage and precision measures. The results show that the main limitation of unifying term variants with finite-state technology is under-analysis.
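
The general idea of dictionary-based conflation and its evaluation can be illustrated with a minimal Python sketch. It is not the tool or the metric used in this work: the class `LemmaTransducer`, the function `coverage_and_precision`, the sample word forms, and the exact definitions of coverage (share of tokens that receive any analysis) and precision (share of analysed tokens whose lemma is correct) are all assumptions made for illustration. The dictionary is stored as a character trie, a simple deterministic transducer whose final states emit a lemma.

```python
from typing import Dict, List, Optional, Tuple


class LemmaTransducer:
    """Toy electronic dictionary: a character trie whose final states emit a lemma."""

    def __init__(self) -> None:
        # transitions[state][char] -> next state; state 0 is the start state.
        self.transitions: List[Dict[str, int]] = [{}]
        # output[state] -> lemma emitted when a word ends in that state.
        self.output: Dict[int, str] = {}

    def add(self, variant: str, lemma: str) -> None:
        """Add one (variant, lemma) dictionary entry."""
        state = 0
        for ch in variant:
            nxt = self.transitions[state].get(ch)
            if nxt is None:
                self.transitions.append({})
                nxt = len(self.transitions) - 1
                self.transitions[state][ch] = nxt
            state = nxt
        self.output[state] = lemma

    def lemmatize(self, token: str) -> Optional[str]:
        """Return the lemma, or None when the token is not in the dictionary."""
        state = 0
        for ch in token:
            nxt = self.transitions[state].get(ch)
            if nxt is None:
                return None  # no path for this token: it gets no analysis
            state = nxt
        return self.output.get(state)


def coverage_and_precision(
    fst: LemmaTransducer, gold: List[Tuple[str, str]]
) -> Tuple[float, float]:
    """Assumed definitions: coverage = analysed / total, precision = correct / analysed."""
    analysed = 0
    correct = 0
    for token, lemma in gold:
        result = fst.lemmatize(token)
        if result is None:
            continue
        analysed += 1
        if result == lemma:
            correct += 1
    coverage = analysed / len(gold) if gold else 0.0
    precision = correct / analysed if analysed else 0.0
    return coverage, precision


# Hypothetical dictionary entries and verification corpus.
fst = LemmaTransducer()
for variant, lemma in [("canto", "cantar"), ("cantas", "cantar"), ("libros", "libro")]:
    fst.add(variant, lemma)

gold = [
    ("canto", "cantar"),
    ("cantas", "cantar"),
    ("libros", "libro"),
    ("cantábamos", "cantar"),  # missing from the dictionary
]
coverage, precision = coverage_and_precision(fst, gold)
```

In this toy setup the form missing from the dictionary receives no analysis at all, so it lowers coverage without affecting precision. That is the kind of failure the abstract identifies as the main limitation of the approach.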

References

ADAMSON, G. W. and BOREHAM, J. The use of an association measure based on character structure to identify semantically related pairs of words and document titles. Information Storage and Retrieval, v. 10, n. 1, p. 253-260, 1974.