A voice user interface needs to output its responses as Text-To-Speech (TTS) synthesized speech. Sometimes it is even more desirable to have the response in mixed languages. For example, a user of a car-navigation system in a foreign country who is not fluent in the local language would find it convenient to hear mixed-code instructions, with entities such as street names synthesized in the local language and routing directions in the user's native language. A mixed-code TTS system can be built easily from recordings of a truly bilingual speaker, but such voice talent is usually difficult to find. We demonstrate a new approach for turning a monolingual TTS system into a multilingual one. From a speaker's monolingual recordings, our algorithm can render speech sentences in different languages for building mixed-code, bilingual TTS systems. We have recordings in 26 languages, which are used to build TTS systems for the corresponding languages. Using the new approach, we can synthesize any mixed language pair out of the 26 languages.
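To make the car-navigation example concrete, here is a minimal sketch of how a mixed-code instruction could be split into language-tagged segments before synthesis. It only illustrates the idea and is not the demo's actual pipeline: the `Segment` class, the `tag_segments` function, the template syntax, and the language tags `en-US` and `zh-CN` are all assumptions made for this example, and no speech is synthesized.

```python
from dataclasses import dataclass
from string import Formatter


@dataclass
class Segment:
    """A piece of text plus the language whose voice should read it (hypothetical type)."""
    text: str
    lang: str


def tag_segments(template, entities, native_lang, local_lang):
    """Split a routing template into language-tagged segments.

    Literal text (the routing directions) is tagged with the user's native
    language; each {placeholder} is replaced by its entity (for example a
    street name) and tagged with the local language.
    """
    segments = []
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    for literal, field, _spec, _conv in Formatter().parse(template):
        if literal:
            segments.append(Segment(literal, native_lang))
        if field is not None:
            segments.append(Segment(entities[field], local_lang))
    return segments


if __name__ == "__main__":
    instruction = tag_segments(
        "Turn left onto {street} in 200 meters",
        {"street": "南京东路"},  # illustrative street name
        native_lang="en-US",
        local_lang="zh-CN",
    )
    for seg in instruction:
        print(f"[{seg.lang}] {seg.text}")
```

Each segment could then be passed to the voice for its language. Since the approach covers any pair among the 26 languages, that amounts to 26 × 25 / 2 = 325 unordered language pairs.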

Rick is a native English speaker. Here is a sample recording from one of his public speeches: "You know I I never when I first came to Microsoft I would have never imagined that we would be doing that and research here."

Translated into Chinese: "你知道我从来没有当我第一次来到微软，我从来没有想象，我们将在这里做和研究。"