IBM Research Project Debater: Closer To Passing The Turing Test

Alan Turing, one of the founders of modern computing, posited (in “Computing Machinery and Intelligence”) a test: hold a conversation with a computer and see whether it can be distinguished from a conversation with a human. One of the main goals of artificial intelligence (AI) is to pass that test.

IBM Research recently announced an intriguing step towards passing the Turing test. Project Debater is an attempt to have a computer hold a real-time debate with a human. The team is led by IBM’s research lab in Haifa, Israel, and the result was a short debate that included four-minute opening statements and four-minute rebuttals by both Project Debater and Noa Ovadia, the 2016 Israel national debate champion.

Arvind Krishna, Director, IBM Research, posted an article about the system. It is worth reading, though I take issue with parts of it. Mr. Krishna describes three technologies as “breaking new ground”:

Data-driven speech writing and delivery

Listening comprehension

Modelling human dilemmas “in a unique knowledge graph”

Natural Language And Computing

In the words of Meat Loaf, “two out of three ain’t bad”. The first point is the one with which I have a problem. My contention is that it is an important advance in the area, but hardly groundbreaking. Narrative Science and Automated Insights are private companies that provide natural language generation (NLG) services to write articles based on detailed, primarily tabular, information. From minor league baseball articles to financial reporting, AI writing is expanding its markets.
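To make the tabular approach concrete, here is a minimal sketch of template-based NLG: a fixed sentence structure wrapped around one row of data. It is purely illustrative and is not how Narrative Science or Automated Insights build their products; the GameResult record, the describe_game function and the team names are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    """One row of tabular data (hypothetical schema for illustration)."""
    home: str
    away: str
    home_runs: int
    away_runs: int


def describe_game(game: GameResult) -> str:
    """Wrap a fixed sentence template around the facts in one record."""
    if game.home_runs == game.away_runs:
        return f"{game.home} and {game.away} tied {game.home_runs}-{game.away_runs}."
    if game.home_runs > game.away_runs:
        winner, loser = game.home, game.away
        high, low = game.home_runs, game.away_runs
    else:
        winner, loser = game.away, game.home
        high, low = game.away_runs, game.home_runs
    # The only "style" decision: pick a verb from the margin of victory.
    verb = "edged" if high - low == 1 else "beat"
    return f"{winner} {verb} {loser} {high}-{low}."


summary = describe_game(GameResult("Hawks", "Owls", 5, 4))
print(summary)
```

Every sentence the function can produce is already implied by the table's columns, which is why this style of writing, useful as it is, stays close to the data.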

Meanwhile, the growth of voice assistants such as Amazon Echo and Google Home is improving speech delivery via natural language processing (NLP). Writing and spoken delivery are advancing on multiple fronts.

There is a key advance specific to Project Debater, and it came out in a conversation with Dr. Ranit Aharonov, Manager of the world-wide Debater team, and Dr. Noam Slonim, Principal Investigator of the Debater team. Ranit and Noam were very clear that their research deals with loosely structured information: language. They are not accessing tables of data but are focused on analysis of large bodies of text from articles, conversations and other sources. “The focus that differentiates Project Debater from existing NLG systems is the ability to create a persuasive narrative out of billions of sentences, rather than a narrative generated out of tabular information,” said Dr. Aharonov. “Building an argument out of carefully pinpointed small pieces of text, supported by facts, is more complex than creating standard sentence structure around facts.”

That difference is important to understand. While it is still a challenge to create readable, accurate, flowing narratives based on the facts, it is another level of complexity to understand weights, both logical and emotional, on facets of an argument, build a narrative around that, and then support the narrative from chosen facts.

While Narrative Science, Automated Insights, and other competitors are certainly leveraging NLP right alongside their data analysis, the IBM team’s focus on text-based content is clearly driving the field forward. That leads us to the other points listed above.

Advancing Semantic Analysis

What is intriguing about Project Debater lies in the last two points above. It is one thing to parse and then glue sentences into a speech; it is quite another to use semantics to analyze a long, spoken argument for its key points and then prioritize those points to prepare a response. That the researchers chose debate as a first test was smart, as debaters make their arguments in a much more formal and logical manner than most of us do during regular, informal speech. Relatively speaking, Ms. Ovadia’s arguments are far easier to understand, which makes debate a great way for IBM to demonstrate the advance.

The comprehension is the visible result of the third point. The underlying driver is the system’s ability to analyze the opponent’s argument in order to lay out the needed response. The process is far more complex than the above-mentioned ability to take data sets from a baseball game or accounting system and describe the results.

A foundation of that capability is the ability to leverage a structure called a knowledge graph (Figure 1). This is a type of data structure that, unlike a typical relational table, treats individual pieces of knowledge as objects that can be related to each other in a many-to-many relationship. Building the knowledge graph is more than linking content based on clear facts.

Sentiment analysis and other techniques are needed to understand the relationships and value weights linking points in the argument being made.
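As a rough illustration of the idea, the sketch below stores claims as nodes and connects them with labeled, weighted relations, so that one argument can attach to many topics and one topic to many arguments. It is an assumption-laden toy, not IBM's design: the KnowledgeGraph class, the relation names, the example claims and the weights are all hypothetical, and in a real system the weights would come from techniques such as sentiment analysis rather than being typed in by hand.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph: nodes are claims, edges are labeled and weighted."""

    def __init__(self):
        # source claim -> list of (relation, target claim, weight)
        self._edges = defaultdict(list)

    def relate(self, source, relation, target, weight=1.0):
        """Link two claims; any claim may take part in many links."""
        self._edges[source].append((relation, target, weight))

    def related(self, source, relation=None):
        """Return edges leaving a claim, optionally filtered by relation."""
        return [
            edge for edge in self._edges.get(source, [])
            if relation is None or edge[0] == relation
        ]

    def sources_of(self, target):
        """Return every claim that links to the given target claim."""
        return sorted(
            src for src, edges in self._edges.items()
            if any(tgt == target for _, tgt, _ in edges)
        )


graph = KnowledgeGraph()
# Hypothetical claims and hand-picked weights, for illustration only.
graph.relate("subsidize space exploration", "supported_by",
             "inspires students to study science", 0.8)
graph.relate("subsidize space exploration", "opposed_by",
             "public funds are needed elsewhere", 0.6)
graph.relate("subsidize space exploration", "opposed_by",
             "private firms can do it cheaper", 0.3)
graph.relate("build a new stadium", "opposed_by",
             "public funds are needed elsewhere", 0.7)

# Anticipate the heaviest objection to a topic.
strongest = max(
    graph.related("subsidize space exploration", "opposed_by"),
    key=lambda edge: edge[2],
)
# The same objection is reusable across topics (many-to-many).
shared = graph.sources_of("public funds are needed elsewhere")
print(strongest[1], shared)
```

The point of the structure is the reuse: an objection recorded once can be retrieved for any topic it bears on, which a single relational table handles far less gracefully.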

“The human-dilemma knowledge graph provides Project Debater the relationships needed to understand, organize and leverage the different arguments humans use to make a point,” said Dr. Slonim. “This allows the system to better anticipate the arguments that may emerge during the debate and respond in a more natural manner.”

Debate and Turing

What IBM’s Project Debater is working on is critical to natural language processing. It is the “natural” in that phrase that is still the challenge. Syntax is well understood and a number of advances in the last decade have helped move virtual assistants into the market. What debate analysis is doing is moving past the simple semantic analysis of a sentence and into the realm of real conversation.

Debate is an artificially constrained form of human conversation. Given those constraints, it is a great way to focus on AI for real-time learning. As systems become better at responding naturally within the debate format, companies such as IBM can then extend the conversational systems to more natural forms of speech.

The day two robots are in a pub, arguing about the results of a soccer game based on an article written by an AI system, is a day that is coming.

David A. Teich is interested in business intelligence (BI), artificial intelligence (AI), machine learning (ML) and other advanced technologies, focused on how they help businesses improve performance. He's an analyst and consultant in those areas as well as in high tech, B...