Artificial voice system says hello

When Hideyuki Sawada says he wants his robots to be articulate, he's not talking about computer-generated voice synthesis - he wants them to talk just like humans. So he is designing an entire artificial voice system, complete with its own lung, windpipe, vocal cords and throat.

The idea is to make interacting with robots more natural. Unlike the stilted metallic utterances of a "Robby the Robot" type droid, Sawada's creations at Kagawa University in Japan will have a natural-sounding voice that produces sound in the same way as the human vocal tract.

To mimic a human lung, Sawada uses a compressed air tank which forces air into a plastic voice-box chamber, where it makes rubber "vocal cords" vibrate (see diagram). The basic sounds generated in the voice box are then fed to a flexible tube that mimics a human vocal tract.

The sound this produces depends on the speed of the airflow, the tension of the rubber vocal cord and the shape of the vocal tract's cross-section. The tract is made from a flexible silicone tube, so that motor-powered rams positioned along it can alter its shape. Our throats and mouths work in a similar way to damp out certain frequencies generated by the vocal cords.
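
As a rough illustration of why the tube's shape matters, the short Python sketch below treats the vocal tract as an idealised uniform tube, closed at the vocal cords and open at the lips, whose resonances fall at odd multiples of the speed of sound divided by four times the tube length. This is a textbook simplification rather than a description of Sawada's hardware, which varies the cross-section along the tube. The function name, the 350 m/s speed of sound and the 17.5 cm length are assumptions chosen for the example.

```python
def tube_resonances(length_m, n_modes=3, speed_of_sound=350.0):
    """Resonant frequencies (Hz) of an idealised uniform tube.

    Assumes one end closed (the vocal cords) and one end open (the lips),
    so the k-th resonance is (2k - 1) * c / (4 * L). Real vocal tracts,
    and Sawada's silicone tube, are not uniform; this is only a sketch.
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [
        (2 * k - 1) * speed_of_sound / (4.0 * length_m)
        for k in range(1, n_modes + 1)
    ]


if __name__ == "__main__":
    # A 17.5 cm tube is a common stand-in for an adult vocal tract.
    for freq in tube_resonances(0.175):
        print(f"{freq:.0f} Hz")
```

With these assumed values the first three resonances come out at roughly 500, 1500 and 2500 Hz. Shortening the tube pushes every resonance upwards, which gives a feel for how reshaping the tract changes the sound.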

To control his robot's vocal system, Sawada uses a neural network that can learn how to produce particular sounds by listening to its own utterances and then adjusting the airflow and the shape of the vocal tract to get closer to a desired sound.
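
The listen-and-adjust idea can be sketched in a few lines of Python. The example below is not Sawada's neural network: it swaps in a much simpler random search, and it replaces the real hardware with a toy model in which a single control value (the length of a uniform tube) sets a single pitch. All names and numbers here are hypothetical and chosen only to show the feedback loop: make a sound, compare it with the target, and keep any adjustment that brings the sound closer.

```python
import random

SPEED_OF_SOUND = 350.0  # m/s, an assumed round figure


def produce_sound(tract_length_m):
    """Toy stand-in for the hardware: first resonance (Hz) of a uniform tube."""
    return SPEED_OF_SOUND / (4.0 * tract_length_m)


def learn_by_listening(target_hz, start_length_m=0.10, steps=2000, seed=0):
    """Adjust the tube length until the 'heard' pitch approaches the target.

    Uses plain random search as a simple substitute for a neural network:
    try a small random change, listen to the result, and keep the change
    only if the error against the target sound shrinks.
    """
    rng = random.Random(seed)
    length = start_length_m
    best_error = abs(produce_sound(length) - target_hz)
    for _ in range(steps):
        candidate = length + rng.uniform(-0.005, 0.005)
        if not 0.05 <= candidate <= 0.30:
            continue  # stay within an assumed physical range for the tube
        error = abs(produce_sound(candidate) - target_hz)
        if error < best_error:
            length, best_error = candidate, error
    return length, best_error


if __name__ == "__main__":
    length, error = learn_by_listening(500.0)
    print(f"settled on {length * 100:.1f} cm, error {error:.2f} Hz")
```

The real system has far more to adjust, including the airflow and the positions of several rams along the tube, which is one reason a learning method such as a neural network is a better fit than the blind search used here.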

Sawada has already taught his system a full range of vowel sounds, and has found that the shape taken up by the artificial vocal tract as it makes the sounds is similar to the shape seen in X-rays of a person making the same sounds. "It shows we are on the right track," he told New Scientist.

Consonants are harder, Sawada will tell the International Conference on Robotics and Automation in Washington, DC next week. The end of the artificial vocal tract behaves much like a pair of lips and allows it to pronounce the sounds "p" and "t".

Tongue-tied

But to extend its repertoire further, Sawada will first have to give the system a tongue, something he is not yet sure how to do. He has, however, added a vent to the resonance tube that does the same job, vocally speaking, as the nasal tract, making the vowel sounds more authentic.

Sawada's system, which he has built with Shuji Hashimoto at Waseda University, cannot match the quality of digital voice synthesisers connected to loudspeakers. But he claims that in time he will be able to make his system more realistic. Eventually he plans to install a complete version into a humanoid robot.
