Mining Answers from Texts and Knowledge Bases

Papers from the AAAI Spring Symposium

Textual documents are produced at a far greater rate than reliable knowledge bases and reasoning mechanisms. Yet the information expressed in on-line textual documents, whether on the Internet or in large text repositories, cannot be used computationally unless it is associated with expert knowledge bases. Incorporating textual information into knowledge bases is not simple, because of the multiple forms of ambiguity that characterize natural language texts. Today, however, sufficiently large knowledge bases are available, and natural language processing technologies have matured enough to process real-world documents, extract information, and answer natural language questions with good accuracy.

Part of the recent success of open-domain question answering (Q/A) is due to novel combinations of technology developed in the 1990s (e.g., named entity recognizers) with techniques used in the 1980s (e.g., abductive interpretations of texts) and novel indexing/retrieval mechanisms (e.g., passage retrieval). Relevant knowledge discovered on the web can be combined with domain knowledge provided by large-scale knowledge bases (e.g., Cyc, WordNet, UT's Component Library, IEEE's Standard Upper Ontology effort). Furthermore, textual information extraction techniques can be enhanced by using world knowledge available in large lexico-semantic knowledge bases.

Achieving orders-of-magnitude improvement in question answering performance requires us to make synergistic use of these advances. Extraction and text mining methods must make use of knowledge-based inference, both in support of the extraction task and for post-processing the extracted information. Knowledge bases, in turn, need to rely on large corpora to support their initial creation and their subsequent testing and maintenance.

The symposium brought together diverse techniques for text and answer mining from AI (more specifically, natural language processing, machine learning, and knowledge representation and reasoning) with techniques from information retrieval (from text collections or from the web) and from data tracking and detection. The invited talks and contributed papers focus on topics such as common sense knowledge bases, linguistic knowledge bases, the use of dialog in query formulation, inference and query evaluation techniques, and techniques for evaluating the competence of question answering systems.