Georgia Tech Artificial Intelligence Research Includes Collaborative Approaches with Humans, Automating Content, and More

Tue, 02/06/2018

Georgia Tech’s latest artificial intelligence research, presented Feb. 2-7 at the AAAI Conference on Artificial Intelligence in New Orleans, demonstrates some of the many approaches to developing capabilities for the next generation of autonomous machines.

Four faculty from the Schools of Interactive Computing and Computational Science and Engineering had research accepted into the program. They include Interactive Computing’s Dhruv Batra, Ashok Goel and Mark Riedl, and CSE’s Le Song.

Building for Creativity

Among the accepted Georgia Tech research is work on deep neural networks to teach AI agents how to write and construct narratives with a human collaborator, allowing for stories to be generated in new ways.

Researchers have come up with a method to simplify sentences into “events,” akin to an elementary school grammar lesson. Understanding the subject, verb and other constituent parts of a sentence makes it easier for the computer to generate a reasonable next event in a story. The event the AI generates is then translated back into a human-readable sentence.
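As a rough illustration of the round trip described above, the following Python sketch reduces a very simple sentence to an event and turns an event back into a sentence. Everything in it is an assumption made for illustration: the `Event` class, its three fields, and the naive word-order rule are hypothetical stand-ins, not the researchers' representation, and the neural network that would generate the next event is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional

ARTICLES = {"a", "an", "the"}


@dataclass(frozen=True)
class Event:
    """Hypothetical simplified event: who did what, optionally to what."""
    subject: str
    verb: str
    obj: Optional[str] = None


def sentence_to_event(sentence: str) -> Event:
    """Toy extraction that assumes a plain 'Subject verb object' sentence."""
    words = sentence.rstrip(".!?").lower().split()
    words = [w for w in words if w not in ARTICLES]
    if len(words) < 2:
        raise ValueError("need at least a subject and a verb")
    subject, verb = words[0], words[1]
    obj = " ".join(words[2:]) or None
    return Event(subject, verb, obj)


def event_to_sentence(event: Event) -> str:
    """Toy realization: join the event parts back into a readable sentence."""
    parts = [event.subject, event.verb]
    if event.obj:
        parts.append(event.obj)
    text = " ".join(parts)
    return text[0].upper() + text[1:] + "."


event = sentence_to_event("The knight drew a sword.")
print(event)
print(event_to_sentence(event))
```

A real system would rely on a language parser rather than word position, and would restore the details this toy version drops, such as articles and tense.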

“We can use these methods in an AI that goes back and forth with someone, co-creating a brand new story in real-time,” says Lara Martin, Ph.D. candidate in Human-Centered Computing and lead researcher. “More importantly, this system will be able to continue a story about any topic, which is crucial for improvisation.”

Mark Riedl, director of the Entertainment Intelligence Lab and co-author of the paper, has developed many systems to advance AI creativity, a domain that can spur growth in the field.

“As human-AI interaction becomes more common, it becomes more important for AIs to be able to engage in open-world improvisational storytelling,” he says. “This is because it enables AIs to communicate with humans in a natural way without sacrificing the human’s perception of agency.”

Creating Context for Visual Media

Another Georgia Tech innovation is a method to create captions for images from any digital file, online or offline. The research team studied current machine learning models for automatic image captioning and found that they were limited in producing robust output. The team looked to improve on what they considered boring, generic descriptions. Their approach, Diverse Beam Search, is an algorithm that tries to capture the richness of language by generating a diverse set of descriptions that humans generally prefer.
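To make the idea of generating a diverse set of descriptions concrete, here is a minimal Python sketch of a group-based beam search with a diversity penalty. It is a sketch under stated assumptions, not the team's implementation: the tiny `BIGRAMS` word table stands in for a real captioning model, the penalty simply counts how often earlier groups picked the same word at the same step, and all names and parameters are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy "model": probability of the next word given the last word.
BIGRAMS = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"cat": 0.7, "kitchen": 0.3},
    "the": {"cat": 0.55, "kitchen": 0.45},
    "cat": {"sleeps": 0.6, "sits": 0.4},
    "kitchen": {"gleams": 0.6, "sits": 0.4},
}


def next_log_probs(prefix):
    """Log-probabilities of each possible next word for a partial caption."""
    last = prefix[-1] if prefix else "<s>"
    return {tok: math.log(p) for tok, p in BIGRAMS[last].items()}


def diverse_beam_search(steps, num_groups, beams_per_group, diversity_strength):
    """Run one small beam per group; later groups are penalized for reusing
    words that earlier groups already chose at the same step."""
    # Each group holds a list of (tokens, model log-probability) hypotheses.
    groups = [[((), 0.0)] for _ in range(num_groups)]
    for _ in range(steps):
        used = Counter()  # words picked at this step by earlier groups
        for g in range(num_groups):
            candidates = []
            for tokens, score in groups[g]:
                for tok, log_p in next_log_probs(tokens).items():
                    new_score = score + log_p
                    # Rank by the penalized score, but keep the model score.
                    penalized = new_score - diversity_strength * used[tok]
                    candidates.append((penalized, new_score, tokens + (tok,)))
            candidates.sort(key=lambda c: c[0], reverse=True)
            top = candidates[:beams_per_group]
            groups[g] = [(toks, s) for _, s, toks in top]
            used.update(toks[-1] for toks, _ in groups[g])
    return [[" ".join(toks) for toks, _ in group] for group in groups]


# The toy table only supports three-word captions, so steps must not exceed 3.
print(diverse_beam_search(3, 2, 1, diversity_strength=0.0))
print(diverse_beam_search(3, 2, 1, diversity_strength=5.0))
```

With the penalty set to zero, both groups settle on the same most likely caption. With a strong penalty, the second group is pushed toward different words and produces a different caption. Whether the stored score should also include the penalty is a design choice, and this sketch keeps the two separate for readability.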

“We categorized images based on their complexity and observed that on ‘complex’ scenes, say, a view of a kitchen with multiple objects, our method indeed resulted in significant improvements in captions,” says Ashwin Vijayakumar, Ph.D. student in Computer Science and lead author.

Simpler images were tougher for the AI system: the internet’s many cat closeups can only be described in so many ways, according to Vijayakumar.