Most biomedical text mining systems target only text information and do not provide intelligent access to other important data such as Figures. More than any other documentation, figures usually represent the """"""""evidence"""""""" of discovery in the biomedical literature. Full-text biomedical articles nearly always incorporate images that are the crucial content of biomedical knowledge discovery. Biomedical scientists need to access images to validate research facts and to formulate or to test novel research hypotheses. Evaluation has shown that textual statements reported in the literature are frequently noisy (i.e., contain """"""""false facts""""""""). Capturing images that are essentially experimental """"""""evidence"""""""" to support the textual """"""""fact"""""""" will benefit biomedical information systems, databases, and biomedical scientists. We are developing a biomedical literature figure search engine BioFigureSearch. We develop innovative algorithms and models in natural language processing, image processing, machine learning and user interfacing. The deliverables will be novel biomedical natural language figure processing (bNLfP) algorithms and iBioFigureSearch allowing biomedical scientists to access figure data effectively, and open-source tools that will enhance biomedical information retrieval, summarization, and question answering. The bNLfP algorithms we will be developing can be applied or integrated into other biomedical text-mining systems.

Public Health Relevance

This project proposes innovative algorithms and models in natural language processing, image processing, machine learning, and user interfacing, to return figures in response to biomedical queries. It is anticipated that the algorithms, models, and tools developed will significantly enhance biomedical scientists'access to figures reported in literature, and thereby expedite biomedical knowledge discovery.