University of East Anglia – Avatars for Visual Communication

Communication isn’t just about words. After all, “Good job” can be transformed from a compliment to an insult with a twist of the voice. Every day, we use our tone, our hands, our faces and our body language to add to what we say, and even to alter its meaning.

It’s the same for sign language. So how do you teach a computer to read all that, and pass it on effectively?

Professor John Glauert

“People imagine that sign language is all about the hands, but quite a lot comes from the face”, says Professor John Glauert. “Your facial expression changes what you’re talking about, and whether you’re happy about something or not.

“Also, when people perform signs they make mouth movements that go with the words. Sometimes you change the meaning of a sign with a facial expression. To change the type of fish, you might sign fish, but mouth “salmon”. The only difference between the sign variations is on the face, not the hands.”

Since 1999, the Virtual Humans Group has been looking at ways to translate everyday communication into recognisable and nuanced sign language. They develop systems that can interpret speech and language, and animate a 3D character to make recognisable signs and gestures.

This requires a diverse mix of skills. Based at the University of East Anglia, the team has expertise in speech and language recognition, 3D character animation, AI and computational linguistics.

Professor Glauert says: “Around 50,000 people in Britain use British Sign Language as their first language. With a number like that, some might not be particularly motivated to produce tailored services for signing deaf people. But part of our work is to make it more cost-effective, so that people don’t have the excuse not to do it.

“One of the things that’s struck me during the time I’ve been doing this is the gap in people’s understanding of the hearing-impaired community. There’s a lot of misunderstanding, which leads to people not providing them with what they need.”

It all started back in 1999, with an avatar called Simon the Signer. Simon translated text subtitles into animated sign language. It won two Royal Television Society Awards, but it was only a rough solution.

“Simon the Signer simply spotted important words and turned them into signs. So you’re basically putting stuff out in the order it would be in English. However, sign language doesn’t use the same order as the English language.

“If you do it like that, you can certainly turn it into something that most signers can understand, but it’s like turning a German or Spanish sentence into English without changing the word order. It doesn’t look right.”

The group originally started by looking solely at the linguistics side of the problem, but later created their own platform to animate the 3D character as well. The challenge was to balance two occasionally competing priorities: to make the signing movements as quick and fluid as possible, and to pass on the full meaning and nuance of the speech, often without any other visual means of communication.

By picking out words in order, Simon the Signer could translate quickly enough for the animation to look fluid and natural. However, for the signing sequences to have the right structure, the system needed to know more about the sentence before translating.

Enter the TESSA project. TESSA was a speech recognition system designed to translate sentences and phrases into true British Sign Language by identifying a phrase as it was being said, and producing the corresponding signs with almost no delay.

Of course, it’s a huge challenge to develop a system that can predict any sentence. There are so many variables. So TESSA was developed primarily for use in customer service situations, translating phrases spoken to customers by counter staff. In these situations, there are only a limited number of essential phrases that are likely to be said, so the system could spot them much quicker. In 2000, the technology was trialled by the Post Office in the UK.

Professor Glauert says: “The system has to recognise the whole sentence, but it can start making a pretty good guess midway through, and can come out almost straight away with an answer. It’s not looking for every phrase. It starts, gets better information, and corrects itself.”
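That phrase-spotting approach can be pictured as incremental matching against a small, known phrase set: candidates are pruned as each word arrives, and translation can begin as soon as only one phrase remains. The sketch below is illustrative only — the phrases and function names are invented, and this is not the TESSA implementation.

```python
# Toy sketch of incremental phrase spotting over a fixed phrase set.
# Not the TESSA system: phrases and names are illustrative only.

PHRASES = [
    "how much is a first class stamp",
    "how much is a second class stamp",
    "please sign here",
    "your parcel will arrive tomorrow",
]

def spot(words_so_far):
    """Return the phrases still consistent with the words heard so far."""
    prefix = " ".join(words_so_far)
    return [p for p in PHRASES if p.startswith(prefix)]

heard = []
for word in "how much is a second class stamp".split():
    heard.append(word)
    candidates = spot(heard)
    if len(candidates) == 1:
        # Commit early: only one phrase can still match, so signing can
        # begin before the sentence is finished.
        print("commit:", candidates[0])
        break
```

A real recogniser would score acoustic hypotheses rather than match exact words, but the narrowing-and-committing behaviour is the same idea Professor Glauert describes.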

The team also approached the issue of how to translate those meanings that weren’t spoken. First arriving in 2000, the ViSiCAST system took a number of features of communication into account, such as eyebrow position, plural verbs, and the size and placement of gestures. In two more EU-funded projects called eSIGN and Dicta-Sign, the group has since fine-tuned algorithms that can deliver gestures that differ subtly in hand shape and location.

In order to achieve this, they used an established transcription system, the Hamburg Notation System, which describes signs phonetically. This notation is converted into computer-readable XML, which is then processed by a module that uses the information to manipulate the skeleton of an avatar character.
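The pipeline — phonetic notation in, XML in the middle, bone movements out — can be sketched with a toy XML sign description. The element and attribute names here are invented for illustration; the real system uses its own XML dialect derived from the Hamburg notation.

```python
# Toy sketch of the notation -> XML -> skeleton pipeline.
# Tag and attribute names are invented; the real system uses its own
# XML dialect derived from the Hamburg notation.
import xml.etree.ElementTree as ET

sign_xml = """
<sign gloss="FISH">
  <hand shape="flat" location="chest" movement="wiggle"/>
  <mouth pattern="salmon"/>
  <eyebrows position="neutral"/>
</sign>
"""

def to_animation_commands(xml_text):
    """Turn the XML description into per-feature animation commands."""
    root = ET.fromstring(xml_text)
    commands = []
    for feature in root:
        params = ", ".join(f"{k}={v}" for k, v in feature.attrib.items())
        commands.append(f"{feature.tag}: {params}")
    return commands

for cmd in to_animation_commands(sign_xml):
    print(cmd)
```

In the actual system each command would drive bone rotations on the avatar skeleton rather than print a label, but the structure — one sign, several independent articulators — is the same.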

“The actual speech recognition element of our work is state-of-the-art, but not groundbreaking. The more challenging part is what we do with the animation. It’s telling the system exactly how to move the bones of the fingers and arms to pass on a meaning.

“It’s working on two parallel tracks: one communicates what the hands and the body are doing; the other tracks the face, eyebrows and eyes. For the mouth, we use an animation technique called morphing, which carries a description of the mouth shape.

“You add a mesh with a skin and clothes over the top. So one of the great things about our system is that we can play the information back with different characters, from humans to robots, aliens and monkeys.”

Their research has been applied in a number of different ways. IBM called their virtual signing technology “the most advanced and flexible system available”, and integrated it into a real-time system called Say It Sign It. Thanks to a collaboration with Orange, that system was modified to work on mobile devices.

It has helped to translate pre-defined information into sign language, from train announcements to weather forecasts and warnings about avalanches. In conjunction with Action on Hearing Loss, the Virtual Humans Group has built resources that have allowed others to create sign language content for websites, including Germany’s Federal Ministry of Labour and Social Affairs and employment sites in the Netherlands.

There is also a valuable application in learning. Many children pick up a word much more easily when there is a gesture associated with it, a process known as kinesthetic learning. With this in mind, a series of animated story DVDs has been released under the LinguaSign brand, produced in English, Dutch, French and Portuguese. Following a 2013 trial with Key Stage 2 students at more than 50 UK primary schools, 62% of respondents reported that the DVDs improved children’s speaking and listening skills in a new language.

The avatar technology has already begun to supplement the interpreters we’re used to seeing on TV. It has been showcased on Dutch programme Het Zandkasteel, and on online shows such as Wicked Kids. It has also been used by cultural heritage sites to help pass on stories from history using sign language.

Having almost mastered the hands, the team hopes to take their work even further by improving their grasp of the expressive human face.

“It’s not about dragging deaf people into the hearing world, but providing them with the sort of services we take for granted on their own terms”, says Professor Glauert. “It’s what they want, rather than what we think they should get.”


About

There were 280 impact case studies submitted to the 2014 Research Excellence Framework (REF) sub-panel 11, Computer Science and Informatics, by 87 institutions. Over 80% of the case studies had some form of economic impact, including spin-out businesses created by universities; software tools and techniques, developed by research projects, which have improved the efficiency of computing practitioners in both large and small organisations; and standard security and communication protocols in daily use by millions of users. The annual revenue generated by those spin-outs that included figures in their case studies was in excess of £170 million, and they had nearly 1,900 employees. The additional sales revenue attributed to academic research in industries such as aerospace, telecommunications, computing and energy was about £400 million. Some of the impact has been in the form of public policy, for example in identifying security risks, informing healthcare decisions or shaping public debate on ethical issues. There has also been considerable social impact in the form of new healthcare procedures and treatments, as well as aids for disabled or elderly people.

The following figure indicates the main types of impact in the submitted case studies listed in the Appendix.

The sub-panel assessors, who included eight people from industry and government appointed solely to assess impact, recommended about fifty case studies as potentially suitable for publicising UK academic Computer Science impact. These were not the fifty highest-scoring case studies, but were selected for their potential interest to the general public. An initial set of twenty case studies was then chosen from these to be written up in a form accessible to non-technical readers. The selection criteria included: ease of understanding of the technology underpinning the impact; potential public interest; examples from a wide range of different types of impact, both social and economic; and evidence that excellent impact can be generated from a range of universities, with both large and small submissions to REF, including post-92 universities, the Russell Group and other universities.

The 2014 REF was the first formal assessment of impact as part of the overall research assessment of UK academic institutions. The sub-panel assessors were very impressed by the extent to which UK academic research has had social and economic impact within the UK, and often worldwide. The range of impact case studies included:

Spin-out companies from universities, some of which had then been taken over by large international companies.

Software tools and techniques, either made available open-source or licensed to particular organisations, with impact in the automotive, aerospace, energy, media, gaming, healthcare, pharmaceutical, transport and retail sectors, as well as the computing industry.

Contributions to many different international standards, e.g. in telecommunications, the web, compilers and security.

Impact on government, healthcare and security policy as well as on public awareness about ethical and social issues.

The REF criteria stated that the research underpinning the impact must have taken place between 1 January 1993 and 31 December 2013 and be of a quality recognised internationally in terms of originality, significance and rigour (i.e. at least 2* quality in REF scoring), while the actual impact must have taken place between 1 January 2008 and 31 July 2013. The underpinning research described in the case studies ranged from the development of specific protocols, to formal methods used to reason about software design, to machine learning techniques. It was often of the highest quality, with publications in top conferences and journals.

The twenty case studies selected for this report were picked to reflect the range of those submitted, and include spin-out companies, software tools and techniques, commercialisation of open-source software, and a number of healthcare-related applications and aids for people with disabilities. Some case studies indicate impact on public policy, including issues relating to electronic payments, autonomous weapons systems and the evaluation of health information systems.

The working group managing the report included Jon Crowcroft, David Duce, Ursula Martin, David Robertson and Morris Sloman. John Hill wrote the impact case study texts, in consultation with the relevant academics, and Naomi Atkinson was responsible for the design layout of the report.