Department of Computer Science (CS) assistant professor Niranjan Balasubramanian has received a grant from the National Science Foundation (NSF) for his latest research into language understanding algorithms. The collaborative project, Scalable Schema-based Event Extraction, aims to develop machine learning algorithms that can acquire the kind of common sense knowledge that typically underlies human communication but that computers have traditionally lacked.

A major hurdle in modern language understanding is that the algorithms work well only in narrow domains (e.g., extracting information from crime reports). General language understanding is hard for these algorithms because they lack common sense knowledge. When two people converse, they frequently rely on common sense to make the inferences needed to communicate properly. A computer lacks this knowledge, which limits its ability to perform tasks that depend on understanding text, such as answering questions. Balasubramanian believes that algorithms that automatically learn to represent and organize information about many real-world scenarios could make up for this lack of human common sense knowledge.

“Unlike prior extraction research which focused on understanding individual events and atomic relations, this research studies how to extract richer event structures that describe broader scenarios,” according to Balasubramanian.

This NSF grant, which totals just under $400k in funding for Stony Brook, is a collaborative effort with Nate Chambers at the United States Naval Academy. The research centers on inducing event schemas that represent real-world scenarios and on learning extractors that identify these schemas in text. The project will use crowd-sourcing to produce the largest and most diverse set of event schemas, enabling consistent and clear evaluation of future models. By teaming with the United States Naval Academy, the research has the potential to impact a broad community, including students from a variety of socio-economic backgrounds and ethnicities.

The project is currently underway and will run for three years, with an estimated completion date of August 31, 2019.

“Niranjan is a rising star in natural language processing. His natural language research has attracted NSF’s attention with a prior grant as well as funding from the Allen Institute for Artificial Intelligence,” according to Arie Kaufman, CS department chair.

Professor Balasubramanian received his PhD from the University of Massachusetts Amherst, where he was part of the Center for Intelligent Information Retrieval (CIIR). Before starting his PhD studies, he was a software engineer at the Center for Natural Language Processing (CNLP) at Syracuse University. He completed his master’s degree in computer science at the University at Buffalo in 2003. Prior to joining Stony Brook, Balasubramanian was a post-doctoral researcher at the Turing Center at the University of Washington. He has a dual role at Stony Brook as a member of both the computer science and biomedical informatics departments.