Microsoft Kinect used to live-translate Chinese sign language into text

New software is bridging the gap between deaf and non-deaf people in many languages.

Researchers from the Chinese Academy of Sciences (CAS) have used a Microsoft Kinect to live-translate Chinese sign language into text.

The work, a collaboration between the CAS Institute of Computing Technology and Microsoft Research Asia, could be vital to helping deaf and non-deaf people communicate with each other.

"We ultimately hope this work can provide a daily interaction tool to bridge the gap between the hearing and the deaf and hard-of-hearing in the near future," said Guobin Wu of Microsoft Research Asia, in a blogpost about the research.

Sign language is not merely a mirror of spoken language—it has a sentence structure and grammar that can be quite different to the language it's derived from. For that reason, typing and writing in English, for example, isn't straightforward for deaf and hard-of-hearing people. For those who have been deaf their whole lives, it can be akin to learning a new language.

This means that it's currently not possible for deaf and hard-of-hearing people to communicate with each other in their native language using computers. Essentially, they have to communicate in a foreign language whenever they send a message over the Internet.

"Technology that enhances communication between non-deaf people and deaf people is to be encouraged," a spokesperson for the British Deaf Association told Wired.co.uk. "Many non-deaf people do not possess the skills of sign language and this hinders deaf people from fully participating in wider society and having equal rights."

A Scottish company, Technabling, is also working to address this issue. It has developed software similar to the Kinect solution, which it says works on any camera-enabled device, including laptops, tablets, and even mobile phones.

"The main goal is to bridge the gap [between deaf and non-deaf people] in both directions," says Technabling operations director Jacques-Yves Silvia.

The software, which will be free for individuals, will allow a deaf and a non-deaf person to overcome the communication barrier between them and chat freely. Typed text will be translated into sign language, as existing tools already do, but sign language will also be translated into text, allowing two-way conversation.

It will require a device to relay the conversation, so it will obviously not be as seamless as it would be if the non-deaf person put the effort into learning sign language.

Silvia says that the software was finished in late June and will launch later this year after further testing.

In 2012, Spanish computer systems engineer Daniel Martinez Capilla also developed sign-language translation software for the Kinect (but specifically for American Sign Language).

"Sign language is not merely a mirror of spoken language—it has a sentence structure and grammar that can be quite different to the language it's derived from."

A clarification: Sign languages are not "derived" from spoken languages; they are in fact entirely independent languages. American Sign Language is no more related to English than Navajo is. All three languages merely happen to be used in overlapping regions.

We already have rudimentary visual translators in the form of smartphone apps that let you see foreign-language signs and menus in your own language. Then of course there's Babel Fish and Google Translate for web pages.

To be able to understand and to be understood by anyone on Earth regardless of language? That would truly be worthwhile.

Oh, do I ever love the reapplication of "toys" to consequential tasks. How many problems have people solved because someone was messing around and had a flash of inspiration? It makes me wish schools had half an hour a day devoted to kids playing with electronics and such -- a few sessions devoted to learning some basics, and then split the time between free-form problem-solving and free play.

This is indeed a great development, though I wonder how they translate typed text into sign language. I assume the display shows some kind of 3D model of a person performing the signs after they're translated?

In the far future, I hope these systems are rendered obsolete because we've been able to discover a way to restore hearing to the deaf.
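On the question of rendering typed text as sign: one plausible approach, which some sign-language avatar systems use, is to map each word to a stored animation clip for an on-screen 3D model and fingerspell anything unknown. A rough sketch with hypothetical clip names, which deliberately glosses over the grammar reordering the article itself points out a real system would need:

```python
# Hypothetical sketch of the avatar approach: look up an animation clip per
# word, falling back to fingerspelling letter by letter. All names illustrative.
SIGN_CLIPS = {"hello": "clip_hello.anim", "thank": "clip_thank.anim"}
LETTER_CLIPS = {c: "clip_letter_%s.anim" % c for c in "abcdefghijklmnopqrstuvwxyz"}

def text_to_sign_sequence(text):
    clips = []
    for word in text.lower().split():
        if word in SIGN_CLIPS:
            clips.append(SIGN_CLIPS[word])
        else:
            # No whole-sign clip stored: fingerspell the word instead.
            clips.extend(LETTER_CLIPS[c] for c in word if c in LETTER_CLIPS)
    return clips

print(text_to_sign_sequence("hello world"))
# ['clip_hello.anim', 'clip_letter_w.anim', 'clip_letter_o.anim',
#  'clip_letter_r.anim', 'clip_letter_l.anim', 'clip_letter_d.anim']
```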

Unless I'm grossly mistaken, this started as an Intel China Cup 2012 entry with a group of (no joke) freshmen. I actually saw this system working at a conference in February in simplified form.

Specifically, the project I'm thinking of was one of the "First Class" winners [1]:

The Chinese University of Hong Kong: Improving Communication Ability of the Disabled -- Chinese Sign Language Recognition and Translation System

Unfortunately, I can't find too much information about it in English. It was pretty damn cool, though. Used the Kinect and a simplified subset of Chinese Sign Language (CSL). Just to reiterate, it was a group of first-year university students who did this.

"Sign language is not merely a mirror of spoken language—it has a sentence structure and grammar that can be quite different to the language it's derived from."

A clarification: Sign languages are not "derived" from spoken languages, they are in fact entirely independent languages. American Sign Language is no more related to English than Navajo is. All three languages merely happen to be spoken in overlapping regions.

You are correct, except insofar as sign languages often take both finger spellings (obviously) and many "shortcuts" for words from the spoken languages around them. Interestingly, ASL is actually "derived" from French Sign Language, in this context.

The sad part is that MS looked at Kinect and thought, "Hey, we can make this like Wii Sports!"

The rest of the world looked at Kinect and said, "Hey, maybe this would be better doing something else."

Knowing that, MS went back to the drawing board and came back with Xbone and said, "Hey, now we can make Wii Sports Resort!"

The rest of the world just shook its head. Apple then sighed and opened its wallet, grumbling, "Must we do everything?"

The sign language app that this article describes was developed by Microsoft Research itself.

As for Apple, they actually turned down PrimeSense, the company that originally developed the technology for Kinect, years ago before PrimeSense went to Microsoft. Apple is only now trying to reverse its position.

Quote:

This is indeed a great development, though I wonder how they translate typed text into sign language. I assume the display shows some kind of 3D model of a person performing the signs after they're translated?

In the far future, I hope these systems are rendered obsolete because we've been able to discover a way to restore hearing to the deaf.

They are deaf, not blind. Both parties can read text transcriptions.

I'm not an expert on sign language, but I do have some experience with ASL. I wonder how well the system could actually translate sign. A lot of the ASL communication I have seen is just like spoken word. In spoken word there are many nuances to tone, volume, hand movement, and facial expression. The hand and facial expressions are even more dramatized in ASL communication. For example, you may have a sign for "big," but there are so many other words that you could choose -- gigantic, humongous, gargantuan, etc. Someone "speaking" in ASL can communicate that nuance through their hand, body, and face.

I don't know about Chinese sign language, but I also wonder how well it could handle finger spelling. From watching experienced finger spellers I can honestly say I found it impossible to keep up with them. If you slowed it down to (literally) 1/10 the speed, I could follow. But otherwise, no chance.

I have to say, this is a genuinely great use of the technology, and Microsoft deserve the kudos for it.

Quote:

As for Apple, they actually turned down PrimeSense, the company that originally developed the technology for Kinect, years ago before PrimeSense went to Microsoft. Apple is only now trying to reverse its position.

Are Apple even in this space? What use would a Kinect device be for a Mac user, or an iOS user? I can't see how this fits with their stuff. Any links would be welcome.

This seems more of a marketing gimmick than anything else. There are serious problems with translation software, and it's barely possible to translate between, to take one of many examples, English and Chinese. How well do you think a project will progress when the computer has an extra layer of deciphering hand movements (rather than using precisely keyed-in text as the input) and far fewer people working on it? Keep in mind that sign languages are just as complex as spoken or written ones.

This is not to say it isn't worthwhile. Just that kudos aren't owed to any Microsoft division except PR.

Hey, lasers were originally a technology in search of a problem to solve; it's only with the mass market reducing costs and finding applications that they've become commonplace.

The old adage "if you build it, they will come" is very, very true; hell, 3D printers have been around (in conceptual form at least) for quite some time. These days NASA has metal 3D printers (sintering machines?), you can buy out-of-the-box solutions, and you can get to work making... whatever you want. I have a friend who is remaking their printer to use chocolate and sugars instead of plastics, with the intent of `building` plates for other desserts.

Tack on a speech synth for the text, and you've got a portable real-time sign-to-audio bidirectional translator.

How would you send text or speech to the other person? Bluetooth to the phone or does Glass have an external speaker?

I would expect the microphone to pick up the speech, and then Glass to show the sign on the internal display.

I would expect the camera to pick up the sign from the user, and then Glass to synthesise speech to be played to the hearing participant in the conversation.

The only thing missing is that I don't think Glass has an external speaker for speech synth. That might be a matter of a Bluetooth accessory for the Glass user. Perhaps a belt clip-on or something.
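Putting the pieces from this thread together, the loop would look something like the sketch below. Every component here is a hypothetical stand-in; none of these calls correspond to a real Glass or Kinect API:

```python
# Hypothetical sketch of the bidirectional loop described above: speech in ->
# sign shown on the internal display; sign in -> synthesised speech out via a
# paired speaker. Each callable is an illustrative stand-in, not a real API.
class Pipeline:
    def __init__(self, speech_to_text, sign_to_text, text_to_sign, text_to_speech):
        self.speech_to_text = speech_to_text  # microphone audio -> text
        self.sign_to_text = sign_to_text      # camera frames -> text
        self.text_to_sign = text_to_sign      # text -> sign animation for display
        self.text_to_speech = text_to_speech  # text -> audio for a speaker

    def hearing_to_deaf(self, audio):
        """Speech from the hearing speaker -> signs on the internal display."""
        return self.text_to_sign(self.speech_to_text(audio))

    def deaf_to_hearing(self, frames):
        """Signs from the deaf speaker -> audio out (e.g. Bluetooth speaker)."""
        return self.text_to_speech(self.sign_to_text(frames))

# Toy stand-ins so the flow can be exercised end to end:
pipe = Pipeline(
    speech_to_text=lambda audio: "hello",
    sign_to_text=lambda frames: "thank you",
    text_to_sign=lambda text: "[avatar signs: %s]" % text,
    text_to_speech=lambda text: "[speaker plays: %s]" % text,
)
print(pipe.hearing_to_deaf(b"...audio..."))  # [avatar signs: hello]
print(pipe.deaf_to_hearing(["frame1"]))      # [speaker plays: thank you]
```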