discourse analysis

Purpose - The object of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, we classified sentences into four major argumentative roles: objective, method, result, conclusion. The idea is that if the role of each sentence can be marked up, this metadata can be used during information retrieval to search for particular types of information, such as the novelty, conclusions, methodologies, or aims of a piece of scientific work.

Methodology - Two approaches were tested: linguistic cues and positional heuristics. Linguistic cues are lexico-syntactic patterns modeled as regular expressions and implemented in a linguistic parser. Positional heuristics use the relative position of a sentence in the abstract to deduce its argumentative class.

Findings - Our experiments showed that positional heuristics attained a much higher degree of accuracy on Medline abstracts, with an F-score of 64%, whereas linguistic cues attained an F-score of only 12%. This is mostly because sentences in different argumentative roles are not always signalled by surface linguistic cues.

Research limitations/implications - A limitation of this study is that we were not able to test other methods for this task, such as machine learning techniques, which have been reported to perform better on Medline abstracts. Also, to compare the results of our study with earlier studies using Medline abstracts, the different argumentative roles present in Medline had to be mapped onto four major argumentative roles. This may have biased the performance of sentence classification by positional heuristics in its favor.

Originality/value - To the best of our knowledge, our study presents the first instance of evaluating linguistic cues and positional heuristics on the same corpus.
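To make the two approaches concrete, here is a minimal Python sketch of each. It is an illustration only and not the authors' implementation. The cue patterns in `CUE_PATTERNS` are hypothetical examples written for this sketch. The equal-quarter split in `classify_by_position` is likewise an assumption, since the abstract does not state the positional boundaries used. The study also ran its patterns inside a linguistic parser, which this sketch replaces with plain regular expressions.

```python
import re

# The four argumentative roles named in the abstract.
ROLES = ("objective", "method", "result", "conclusion")

# Hypothetical surface cues, one pattern per role (assumed, not from the study).
CUE_PATTERNS = {
    "objective": re.compile(
        r"\b(aim|objective|purpose) of (this|the) (study|paper|work)\b", re.I
    ),
    "method": re.compile(
        r"\bwe (used|measured|classified|tested)\b"
        r"|\bwere (recruited|randomized|randomised)\b",
        re.I,
    ),
    "result": re.compile(
        r"\b(results|experiments) (show|showed|indicate|indicated)\b", re.I
    ),
    "conclusion": re.compile(
        r"\bwe conclude\b|\bin conclusion\b|\bthese (findings|results) suggest\b",
        re.I,
    ),
}


def classify_by_cue(sentence):
    """Return the first role whose cue pattern matches, or None if no cue fires."""
    for role in ROLES:
        if CUE_PATTERNS[role].search(sentence):
            return role
    return None


def classify_by_position(index, total):
    """Assign a role from the sentence's relative position in the abstract.

    Assumes the abstract divides into four equal parts, in role order.
    """
    if total <= 0 or not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    relative = index / total
    if relative < 0.25:
        return "objective"
    if relative < 0.5:
        return "method"
    if relative < 0.75:
        return "result"
    return "conclusion"
```

A sentence such as "Patients were followed for two years." matches none of the cue patterns, so `classify_by_cue` returns `None`, whereas `classify_by_position` always returns a role. This matches the explanation given in the findings: many sentences carry no surface cue to announce their role.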