Automated ‘Coach’ Could Help with Social Interactions

New software system from MIT could help people improve their conversational and interview skills

CAMBRIDGE, Mass. — Social phobias affect about 15 million adults in the United States, according to the National Institute of Mental Health, and surveys show that public speaking is high on the list of such phobias. For some people, these fears of social situations can be especially acute: For example, individuals with Asperger’s syndrome often have difficulty making eye contact and reacting appropriately to social cues. But with appropriate training, such difficulties can often be overcome.

Now, new software developed at MIT can be used to help people practice their interpersonal skills until they feel more comfortable with situations such as a job interview or a first date. The software, called MACH (short for My Automated Conversation coacH), uses a computer-generated onscreen face, along with facial, speech, and behavior analysis and synthesis software, to simulate face-to-face conversations. It then provides users with feedback on their interactions.

The research was led by MIT Media Lab doctoral student M. Ehsan Hoque, who says the work could be helpful to a wide range of people. A paper documenting the software’s development and testing has been accepted for presentation at the 2013 International Joint Conference on Pervasive and Ubiquitous Computing, known as UbiComp, to be held in September.

“Interpersonal skills are the key to being successful at work and at home,” Hoque says. “How we appear and how we convey our feelings to others define us. But there isn’t much help out there to improve on that segment of interaction.”

Many people with social phobias, Hoque says, want “the possibility of having some kind of automated system so that they can practice social interactions in their own environment. … They desire to control the pace of the interaction, practice as many times as they wish, and own their data.”

The MACH software offers all those features, Hoque says. In randomized tests with 90 MIT juniors who volunteered for the research, students who practiced with MACH and received its feedback showed statistically significant improvement in their interview performance.

First, the test subjects — all of whom were native speakers of English — were randomly divided into three groups. Each group participated in two simulated job interviews, a week apart, with MIT career counselors.
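For readers curious how a random three-way split of this kind can be done, the following is a minimal sketch and not the researchers' actual procedure. The function name, the fixed seed, and the equal group sizes are assumptions made for illustration; the study description says only that the 90 subjects were randomly divided into three groups.

```python
import random

def assign_groups(participant_ids, n_groups=3, seed=0):
    """Shuffle participants, then deal them round-robin into n_groups groups.

    Hypothetical helper: the seed and equal group sizes are illustrative
    assumptions, not details reported for the MACH study.
    """
    rng = random.Random(seed)          # seeded so the split is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)              # random order removes selection bias
    return [shuffled[i::n_groups] for i in range(n_groups)]

# 90 volunteers, numbered 1 through 90
groups = assign_groups(range(1, 91))
group_sizes = [len(g) for g in groups]
```

Shuffling once and then slicing guarantees that every participant lands in exactly one group.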

But between the two interviews, unbeknownst to the counselors, the students received help: One group watched videos of interview advice, while a second group had a practice session with the MACH simulated interviewer, but received no feedback other than a video of their own performance. Finally, a third group used MACH and then saw videos of themselves accompanied by an analysis of such measures as how much they smiled, how well they maintained eye contact, how well they modulated their voices, and how often they used filler words such as “like,” “basically,” and “umm.”

Evaluations by another group of career counselors showed statistically significant improvement by members of the third group on measures including “appears excited about the job,” “overall performance,” and “would you recommend hiring this person?” In all of these categories, by comparison, there was no significant change for the other two groups.

The software behind these improvements was developed over two years as part of Hoque’s doctoral thesis work with help from his advisor, professor of media arts and sciences Rosalind Picard, as well as Matthieu Courgeon and Jean-Claude Martin from LIMSI-CNRS in France, Bilge Mutlu from the University of Wisconsin, and MIT undergraduate Sumit Gogia.

Designed to run on an ordinary laptop, the system uses the computer’s webcam to monitor a user’s facial expressions and movements, and its microphone to capture the subject’s speech. The MACH system then analyzes the user’s smiles, head gestures, speech volume and speed, and use of filler words, among other things. The automated interviewer — a life-size, three-dimensional simulated face — can smile and nod in response to the subject’s speech and motions, ask questions and give responses.
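Two of the speech measures mentioned above, filler-word use and speaking speed, can be illustrated with a short sketch. This is not MACH's code: it assumes a text transcript and a known duration are already available, and the function name, the filler list (taken from the examples quoted in this article), and the sample sentence are all hypothetical.

```python
import re

# Assumption: only the three filler words quoted in the article are counted.
FILLER_WORDS = {"like", "basically", "umm"}

def speech_metrics(transcript, duration_seconds):
    """Return word count, filler-word count, and words per minute."""
    words = re.findall(r"[a-z']+", transcript.lower())
    filler_count = sum(1 for w in words if w in FILLER_WORDS)
    words_per_minute = len(words) * 60.0 / duration_seconds
    return {
        "word_count": len(words),
        "filler_count": filler_count,
        "words_per_minute": words_per_minute,
    }

# Hypothetical six-second answer from a practice interview
metrics = speech_metrics(
    "Umm, I basically like, led the team, and I like solving problems.", 6.0
)
```

A plain word match of this kind also counts legitimate uses of “like,” as in “I like solving problems,” so a real system would need context to tell the two apart.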

While this initial implementation was focused on helping job candidates, Hoque says training with the software could be helpful in many kinds of social interactions.

After finishing his doctorate in media arts and sciences this summer, Hoque will become an assistant professor of computer science at the University of Rochester in the fall.