Abstract: PaperMaker is a novel IT solution that receives a scientific manuscript via a Web interface, automatically analyses the publication, evaluates consistency parameters and interactively delivers feedback to the author. It analyses the proper use of acronyms and their definitions, and the use of specialized terminology. It provides Gene Ontology (GO) and Medline Subject Headings (MeSH) categorization of text passages, the retrieval of relevant publications from public scientific literature repositories, and the identification of missing or unused references.

Friday, 3 September 2010

Imagine that we have a set of mailing lists (e.g. Dbworld, Corpora-List, Linguist, BioNLP, moses-support, and so on). Each such mailing list contains a large number of questions and answers given by domain experts or semi-experts. A couple of situations can arise:
Situation 1: a new user raises a question which has already been partially or fully answered in one or more email threads of the mailing list. Do we need a question answering system or a summarizer in this context?
Situation 2: a new user wants to search for a topic of interest. Is a retrieval system needed?
Situation 3: such a mailing list needs a classification of its email threads?
Situation 4: TBA
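To make Situation 2 a bit more concrete, here is a minimal sketch of thread retrieval using plain TF-IDF weighting and cosine similarity. Everything in it is an assumption for illustration: the function names (`build_index`, `search`), the smoothed IDF formula, and the idea that each thread is already available as one string keyed by a thread id. A real system would also need email parsing, quote stripping, stemming, and so on.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(threads):
    """Build TF-IDF vectors for threads given as {thread_id: text}.

    Returns (vectors, idf). The smoothed IDF below is one common choice,
    not the only one.
    """
    counts = {tid: Counter(tokenize(text)) for tid, text in threads.items()}
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        tid: {t: c * idf[t] for t, c in term_counts.items()}
        for tid, term_counts in counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, vectors, idf, top_k=3):
    """Return up to top_k (thread_id, score) pairs with a non-zero score."""
    q_counts = Counter(tokenize(query))
    # Query terms never seen in the collection are ignored.
    q_vec = {t: c * idf[t] for t, c in q_counts.items() if t in idf}
    scored = sorted(
        ((cosine(q_vec, vec), tid) for tid, vec in vectors.items()),
        reverse=True,
    )
    return [(tid, score) for score, tid in scored[:top_k] if score > 0]
```

Usage would look like `vectors, idf = build_index(threads)` followed by `search("moses language model", vectors, idf)`. The same ranked list could also serve as a first step for Situation 1, by handing the top threads to a summarizer or QA component.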

Wednesday, 9 June 2010

Consider a scenario in which the system takes a research topic as input and needs to automatically generate a summary of related work relevant to that topic.
--> I think this research problem is still open and actually very challenging. It requires advanced processing which combines many fields in AI such as: NLP, IR, IE, ...

Some initial work (including mine) is as follows:
1) Scientific Paper Summarization Using Citation Summary Networks by Qazvinian V. et al. (COLING 2008).
--> this work only targets single article summarization using a clustering approach based on citation summary networks.
2) Generating surveys of scientific paradigms by Saif Mohammad et al. (NAACL 2009).
--> this work explores the usefulness of citation summaries compared to summaries built from the abstracts or full text of articles.
3) Towards Automated Related Work Summarization by Cong Duy Vu HOANG et al. (COLING 2010)
--> this work does not use citation summaries but tries to take advantage of the full text of articles in generating a related work summary.
It makes a strong assumption that each related work summary follows a topic hierarchy tree, which is provided as input to the summarization system. The system then proposes two different strategies (general & specific content summarization) based on a manual rhetorical analysis of how humans use the topic hierarchy tree to write a related work summary.
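For readers who want to picture the input, here is a small sketch of how a topic hierarchy tree could be represented and walked. The `TopicNode` class, the `plan_summary` function, and the example topics are hypothetical names of mine. The rule used to pick a strategy (internal nodes get "general", leaf nodes get "specific") is an assumption made for illustration here, not a statement of how the paper's system decides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TopicNode:
    """One node of a (hypothetical) topic hierarchy tree."""
    topic: str
    children: List["TopicNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def plan_summary(node: TopicNode, depth: int = 0) -> List[Tuple[int, str, str]]:
    """Walk the tree depth-first and return (depth, topic, strategy) tuples.

    Assumption for this sketch only: internal nodes are tagged "general"
    and leaf nodes are tagged "specific".
    """
    strategy = "specific" if node.is_leaf() else "general"
    plan = [(depth, node.topic, strategy)]
    for child in node.children:
        plan.extend(plan_summary(child, depth + 1))
    return plan


# Example tree; the topics are made up for illustration.
tree = TopicNode("text summarization", [
    TopicNode("single-document"),
    TopicNode("multi-document", [TopicNode("citation-based")]),
])

for depth, topic, strategy in plan_summary(tree):
    print("  " * depth + f"{topic} [{strategy}]")
```

Each tuple in the plan marks where a real summarizer would select sentences for that topic, so the tree fixes the order and nesting of the generated text.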
4) Identifying Non-Explicit Citing Sentences for Citation-Based Summarization by Vahed Qazvinian and Dragomir R. Radev (ACL 2010)
--> TBA
5) Context Identification of Sentences in Related Work Sections using a Conditional Random Field: Towards Intelligent Digital Libraries by Angrosh M. A. et al. (JCDL 2010)
6) Imitating Human Literature Review Writing: An Approach to Multi-document Summarization by Jaidka K. et al. (ICADL 2010)
7) Analysis of the Macro-Level Discourse Structure of Literature Reviews by Jaidka K. et al. (Online Information Review)
8) Ultimate Research Assistant: http://ultimate-research-assistant.com/
9) iResearch Reporter: http://iresearch-reporter.com//
10) TBA

Future work (what comes to my mind now) includes:
- Given a research topic --> automatically generate a topic hierarchy tree of that topic.
- A systematic comparison of summaries built from citations, abstracts, full text of articles. Which ones are more useful to users?
- An initial add-in component integrated into online ACL anthology system.
- Other ways to improve summarization performance (e.g. using rhetorical discourse analysis, ...)
- ...
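As a starting point for the comparison of citation-based, abstract-based, and full-text-based summaries, one cheap automatic proxy is unigram recall against a human-written reference, in the spirit of ROUGE-1. This is only a sketch under my own assumptions: the function names are hypothetical, there is no stemming or stopword handling, and an automatic score does not answer the "useful to users" question, which would still need a user study.

```python
import re
from collections import Counter


def unigrams(text):
    """Count lowercase alphanumeric tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def unigram_recall(candidate, reference):
    """Fraction of reference unigrams also found in the candidate.

    Counts are clipped, so a word repeated in the candidate cannot be
    credited more often than it occurs in the reference.
    """
    ref = unigrams(reference)
    cand = unigrams(candidate)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[term]) for term, count in ref.items())
    return overlap / total


def compare_sources(summaries, reference):
    """Score summaries given as {source_name: text}, best first."""
    scores = {name: unigram_recall(text, reference) for name, text in summaries.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `compare_sources({"citations": ..., "abstracts": ..., "full text": ...}, reference)` gives a ranked list of the three sources for one topic.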
--
Cheers,
Vu