The 9th International Symposium on Chinese Spoken Language Processing

The performance of speech and language processing technologies has improved
dramatically over the last few years, with an increasing number of systems being
deployed in a variety of languages and applications. Unfortunately, recent
methods and models rely heavily on the availability of massive amounts of
resources, which are available only for languages spoken by a large number of
people in countries of great economic interest, and by populations with immediate
information technology needs. Furthermore, today’s speech processing systems
target monolingual scenarios in which speakers are assumed to use a single
language while interacting via voice. However, I believe that today’s
globalized world requires truly multilingual speech processing systems which
support phenomena of multilingualism such as code-switching and accented
speech. As these are spoken phenomena, methods are required that perform
reliably even if only a few resources are available.

In my talk I will present ongoing work at the Cognitive Systems Lab on
applying concepts of Multilingual Speech Recognition to rapidly adapt systems
to yet unsupported or under-resourced languages. Based on these concepts, I
will describe the challenges of building a code-switching speech recognition
system using the example of Singaporean speakers code-switching between
Mandarin and English. Proposed solutions include the sharing of data and
models across both languages to build truly multilingual acoustic models,
dictionaries, and language models. Furthermore, I will describe the web-based
Rapid Language Adaptation Toolkit (RLAT, see http://csl.ira.uka.de/rlat-dev)
which lowers the overall costs for system development by automating the system
building process, leveraging crowdsourcing, and reducing the data needs
without suffering significant performance losses. The toolkit enables native
language experts to build speech recognition components without requiring
detailed technology expertise. Components can be evaluated in an end-to-end
system allowing for iterative improvements. By keeping the users in the
developmental loop, RLAT can learn from the users’ expertise to constantly
adapt and improve. This will hopefully revolutionize the system development
process for yet under-resourced languages.