BITS Unit Selection Synthesis Corpus

Corpus de synthèse de sélections d’unités BITS

BITS-US

ID:

ELRA-S0224

BITS stands for "BAS Infrastructures for Technical Speech Processing" and was funded by the German Ministry of Science and Education during 2003-2005.

The BITS synthesis corpus consists of two parts: a set of logatome recordings for controlled diphone synthesis (ELRA-S0217) and a set of sentence recordings for unit selection techniques (ELRA-S0224).

This corpus contains 6,732 recordings spoken by 4 professional German speakers covering all German diphone combinations in different prosodic contexts.

The data is stored on 4 DVDs. Each DVD contains the recordings, the annotation files and the meta data files of one of the four professional speakers, and the entire corpus' documentation. Each speaker was recorded in an insulated room with low reverberation.

Each sentence was recorded in three channels: close microphone, large membrane microphone and laryngographic signal. All recordings are segmented and labelled into phonemic units as well as annotated prosodically.

The same 4 professional speakers also spoke the BITS Logatome Synthesis Corpus (ELRA-S0217) enabling the user to combine diphone and unit selection techniques based on the same speakers.