The Talking Thinking Teaching Head

From Talking Heads to Teaching Heads

Towards the Total Turing Test

How can real-time interaction between humans and machines be made significantly more effective? The Thinking Head team takes current research on Talking Heads into the realm of Thinking Heads, in the process addressing a range of fundamental interdisciplinary issues about verbal-aural communication, the most efficient of human communication systems. The approach is novel in its integration of best-practice Talking Head science and technology with careful analysis and evaluation from the perspective of cognitive science to create a tight feedback loop for Thinking Head development and elaboration.

The Head X Research Platform is one of Flinders AILab's major research outputs for this project, taking us into the realm of a high-fidelity Thinking Head with a platform that is freely available to other researchers – both for basic research, further developing the individual technologies in the pipeline, and for applied research, where our own applications focus in the area of Assistive and Educational Technology.

The research and technology are relevant to human-machine communication, telecommunications, e-commerce, and mobile phone technology; to personalised aids for disabled users, the hearing impaired, the elderly, and children with learning difficulties; and to foreign language learning. They will also facilitate the development of animation in new media, film, and in particular games. The various Heads have been demonstrated widely, with public visibility for the project facilitated by high-profile installations and exhibitions, including the Arts Festival preceding the Beijing Olympics, and a permanent display, as well as occasional robotic displays, at the Powerhouse Museum in Sydney.

The Thinking Head incorporates components focussed on dialogue management, speech generation and speech understanding. At the same time the project seeks to move beyond the current engineering orientation to explore the evolution of interactive behaviour and the role of emotion and facial gestures in communication. The ability of the Thinking Head to display and understand emotions and gestures is being explored in association with performance artists and technologists at our partner institutions, and is leading to increased understanding of how to produce realistic animation models for the game and movie industries. In a multiyear interactive museum display, a large projection screen was used to display word associations while "colouring" the ambience to match the emotions being expressed.

Future directions for the Talking Head will incorporate and extend the Flinders University Lip Reading and Audio-Visual Speech Recognition technology developed by Prof. David Powers and Dr Trent Lewis, which is integrated with Auditory Speech Recognition and Speech Synthesis technology from Carnegie Mellon University in partnership with A/Prof. Alan Black and Dr Tanja Schultz at CMU. We are also starting to use EEG to monitor subjects interacting with the Thinking Head in order to understand their learning and engagement with the technology, as well as to develop a Hybrid AudioVisual Brain Computer Interface technology that uses multimodal input to improve speech understanding.

KIT has an associated program in Evolutionary Robotics and Natural Language Learning, building on Prof. Powers' Robot Baby and Language Learning research as well as the research of Dr Martin Luerssen and Dr Richard Leibbrandt on Grammar Evolution and Induction of Part of Speech categories from child-directed speech (CHILDES). This will seek to evolve improved architectures and develop the adaptability required to deal with changing social, linguistic and environmental conditions. Another way of looking at this is that we are aiming to develop a system that can pass the Total Turing Test or TTT. Turing felt that to pass his Imitation Game, the traditional "pen pal" Turing Test, it would be necessary for the computer to actually learn as a robot, and deal with the real world and its social and cultural context. This includes behaving and acting in a way that is indistinguishable from humans, and thus also includes addressing Human Computer Interaction at the level of Gestures, Emotions and Expressions – see the Role of Emotion (sidebar Feature on this page). In fact Harnad and Schweizer have each proposed higher levels of indistinguishability or TTTTs: such Total Total Turing Tests or Truly Total Turing Tests have even stronger conditions.

Milne, M.K., Luerssen, M.H., Lewis, T.W., Leibbrandt, R.E., & Powers, D.M., 2010. Development of a Virtual Agent Based Social Tutor for Children with Autism Spectrum Disorders. Proceedings of the International Joint Conference on Neural Networks 2010, 1555-1563.

Feature

The Role of Emotion

What does it take to think, talk and act like an ordinary person? This remains one of the great challenges of Artificial Intelligence and Cognitive Science. It is easier to produce a champion chess-playing program or provide university-level advice on any subject than it is to duplicate the capabilities of a typical two-year-old.

The Turing Test focuses on the brain as a computer that communicates in normal language, and in 1950 Alan Turing predicted that by the year 2000 a computer would fool 30% of the people who talked to it for 5 minutes into thinking it was a person. This was actually achieved by the winner of the Loebner Prize at the annual competition held at Flinders in 1998!

However, Turing thought that for real intelligence you needed sensors, you needed to be able to understand the world, and we built this condition into the requirements for the Loebner Prize Gold Medal - a kind of show-and-tell aspect to the Turing Test. Harnad talks about the Total Turing Test, which requires not just sensors but robotic capabilities, the ability to interact with the world and learn about the world and society, and their laws, at the same time as you learn about language, and its laws (otherwise known as grammar). This is the focus of KIT and its AI/LT Lab. But Harnad goes on to talk about the Total Total Turing Test – that is, physical indistinguishability. Why on earth would we need this?

Language and Intelligence are caught up with every aspect of who we are and how we interact with the world, our fellow humans, and the rest of creation. They are caught up with our drives and our feelings, our hungers and our pains, and unless the computer/robot/android has the same physical structure as us, the best we can do is attempt to simulate all these things. Silicon-based "lifeforms" may indeed be possible, and Androids that are superficially human-like may indeed be created by us, and may indeed be intelligent, but they are unlikely to be mistaken for humans for very long. Actually we find it unsettling to talk to or even just watch a human-like being that somehow just doesn't gel as being human - the so-called Uncanny Valley effect.

A major part of our research effort thus goes into exploring the gestures, expressions and emotions that colour our conversations, and convey information beyond the mere words of a simple Talking Head. Having a Teaching Head show appropriate expression rather than neutral expression can lead to the students learning much more, and achieving a grade point higher on average! Of course the flip side of putting emotional expression into our faces and speech is recognising emotional expression in the speech and expressions we see and hear. This is also the capability which, when it is missing, leads to a diagnosis of autism - for our AIs!

Rather than developing an Autistic Head, we are putting a lot of effort into understanding emotion and expression, and in fact designing Teaching Heads that teach a standard and specific social skills curriculum to children that suffer from Autism Spectrum Disorders, Attention Deficit Hyperactivity Disorder, or Hearing Impairment. We are also designing a Teaching Head to help health sciences students learn the skills necessary for motivational interviewing - that is, helping people understand their problems, decide they want to make changes, and determine a solution.