In order to solve problems of reliability of systems based on lexical repetition and problems of adaptability of language-dependent systems, we present a context-based topic segmentation system based on a new informative similarity measure based on word co-occurrence. In particular, our evaluation with the state-of-the-art in the domain i.e. the c99 and the TextTiling algorithms shows improved results both with and without the identification of multiword units.

Subjects: 1.10 Information Retrieval;
13. Natural Language Processing

Submitted: Apr 24, 2007

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.