The KiezDeutsch-Korpus

The KiezDeutsch-Korpus (KiDKo) was developed from 2008 to 2015 by project B6 (PI: Heike Wiese) of the collaborative research centre Information Structure (SFB 632) at the University of Potsdam. KiDKo is a multi-modal digital corpus of spontaneous discourse data from informal, oral peer group situations in multi- and monoethnic speech communities.

KiDKo contains audio data from self-recordings, with aligned transcriptions (i.e., at every point in a transcript, one can access the corresponding segment of the audio file). The corpus provides part-of-speech tags as well as an orthographically normalised layer (Rehbein & Schalowski 2013). A further annotation level provides information on syntactic chunks and topological fields.

KiDKo offers a new empirical resource for research in domains such as:

Kiezdeutsch as a multiethnic dialect of German

youth language in urban areas

linguistic developments in contemporary German

informal language use

KiDKo consists of two parts:

the main corpus with spontaneous conversations between young people from a multiethnic community (Berlin-Kreuzberg)

a complementary corpus with spontaneous conversations between young people from a monoethnic community with comparable socio-economic indicators (Berlin-Hellersdorf)

Supplementary corpora

The "Oral and Written Text Production" corpus

In addition to KiDKo, three subcorpora are underway that combine elicited data from Kreuzberg and Hellersdorf adolescents with data from Turkish learners of German in Turkey. These will allow further comparisons, such as oral vs. written, spontaneous vs. elicited, and German as a first, second, or foreign language.

The "Spracheinstellungen" corpus (KiDKo/E)

The corpus KiDKo/E ("KiDKo/Einstellungen") is also associated with KiDKo. It provides data on language attitudes, perceptions, and ideologies from the public discussion on Kiezdeutsch. KiDKo/E contains emails from 2009 through 2012 and readers’ comments posted between January and April 2012 on media websites.

The "Linguistic Landscapes" corpus (KiDKo/LL)

Another associated corpus is KiDKo/LL ("KiDKo/Linguistic Landscapes"). Under the title "From the ’Hood With Love", this corpus assembles photos of written language productions in public space from the context of Kiezdeutsch, for instance love notes on walls, park benches, and playgrounds, graffiti in house entrances, and scribbled messages on toilet walls.