Abstract:
The presentation focuses on code-switching (CS) in French/Algerian Arabic bilingual communities and investigates how speech technologies, such as automatic data partitioning, language identification and automatic speech recognition (ASR), can serve to analyze and classify this type of bilingual speech. A preliminary study carried out on a corpus of Maghrebian broadcast data revealed a relatively high presence of CS in Algerian Arabic compared with the neighboring countries Morocco and Tunisia. This study therefore focuses on code-switching produced by bilingual Algerian speakers who can be considered native speakers of both Algerian Arabic and French. A dedicated corpus of four hours of speech from eight bilingual French-Algerian speakers was collected. This corpus contains read speech and conversational speech in both languages and includes stretches of code-switching. We provide a linguistic description of the code-switching stretches in terms of intra-sentential and inter-sentential switches and of the speech duration in each language. We report on some initial studies to locate French, Arabic and code-switched stretches using ASR system word posteriors for this pair of languages.
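The abstract does not say how the word posteriors are turned into language labels. As a minimal sketch of one possible decision rule, assume each speech segment carries a mean word posterior from a French ASR system and from an Arabic ASR system; the segment is assigned to the language with the clearly higher posterior, and adjacent segments with the same label are merged into stretches. The function names, the tuple layout and the `margin` threshold are all hypothetical, and the numbers in the usage example are invented for illustration.

```python
from itertools import groupby


def label_segments(segments, margin=0.1):
    """Label each segment "fr", "ar" or "unsure".

    segments: list of (start, end, fr_posterior, ar_posterior) tuples.
    margin: hypothetical minimum posterior gap needed to commit to a language.
    """
    labels = []
    for start, end, fr_post, ar_post in segments:
        if fr_post - ar_post > margin:
            lang = "fr"
        elif ar_post - fr_post > margin:
            lang = "ar"
        else:
            # Posteriors too close: possibly a switch point or noisy speech.
            lang = "unsure"
        labels.append((start, end, lang))
    return labels


def merge_stretches(labels):
    """Merge consecutive segments that share a label into one stretch."""
    stretches = []
    for lang, group in groupby(labels, key=lambda item: item[2]):
        group = list(group)
        stretches.append((group[0][0], group[-1][1], lang))
    return stretches


# Invented toy input: two French-like segments, one Arabic-like, one ambiguous.
toy_segments = [
    (0.0, 1.0, 0.9, 0.2),
    (1.0, 2.0, 0.8, 0.3),
    (2.0, 3.0, 0.2, 0.9),
    (3.0, 4.0, 0.5, 0.45),
]
toy_stretches = merge_stretches(label_segments(toy_segments))
```

Under this rule, a change of label between neighboring stretches within one utterance would be a candidate code-switch point, to be checked against the manual annotation.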

Title: Schwa Realization in French: Using Automatic Speech Processing to Study Phonological and Socio-linguistic Factors in Large Corpora

Abstract:
The study investigates different factors influencing schwa realization in French: phonological factors, speech style, gender, and socio-professional status. Three large corpora are used: two of public journalistic speech (ESTER and ETAPE) and one of casual speech (NCCFr). The absence or presence of schwa is decided automatically via forced alignment, with a success rate of 95%. Only polysyllabic words with a potential schwa in the word-initial syllable are studied, in order to control for variability in word structure and position.
The effect of the left context is studied by grouping it into three classes: a word-final vowel, a word-final consonant, or a pause. Words preceded by a vowel (V#) tend to favor schwa deletion. Interestingly, words preceded by a consonant and words preceded by a pause behave similarly: speakers tend to maintain schwa in both contexts.
As can be expected, the more casual the speech, the more frequently schwa is dropped. Males tend to delete more schwas than females, and journalists are more likely to delete schwa than politicians. These results suggest that beyond phonology, other factors such as gender, style and socio-professional status influence the realization of schwa.
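The left-context analysis described above can be sketched as a simple tabulation. This is an illustrative sketch only: it assumes the forced aligner has already produced, for each target word, the final phone of the preceding word (or `None` for a pause) and a boolean saying whether the schwa variant was selected. The `VOWELS` set, the function names and the toy tokens are hypothetical and are not taken from the corpora.

```python
from collections import defaultdict

# Hypothetical, deliberately small vowel inventory for the toy example.
VOWELS = {"a", "e", "i", "o", "u", "y"}


def left_context_class(prev_final_phone):
    """Map the preceding word's final phone to "V#", "C#" or "pause"."""
    if prev_final_phone is None:
        return "pause"
    return "V#" if prev_final_phone in VOWELS else "C#"


def deletion_rates(tokens):
    """Return the schwa deletion rate per left-context class.

    tokens: iterable of (prev_final_phone, schwa_present) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # class -> [deleted, total]
    for prev_phone, schwa_present in tokens:
        ctx = left_context_class(prev_phone)
        counts[ctx][1] += 1
        if not schwa_present:
            counts[ctx][0] += 1
    return {ctx: deleted / total for ctx, (deleted, total) in counts.items()}


# Invented toy tokens, not results from ESTER, ETAPE or NCCFr.
toy_tokens = [
    ("a", False), ("i", False), ("a", True),   # after a vowel
    ("t", True), ("k", False),                 # after a consonant
    (None, True),                              # after a pause
]
toy_rates = deletion_rates(toy_tokens)
```

The same tabulation could be keyed on speech style, gender or socio-professional status instead of left context to compare the other factors named in the abstract.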

Speaker 3: Giuseppina Turco, Karim Shoul, Rachid Ridouane

Title: How are four-level length distinctions produced? Evidence from Moroccan Arabic

Abstract:
We investigate the durational properties of Moroccan Arabic identical consonant sequences contrasting singleton (S) and geminate (G) dental fricatives, in six combinations of four-level length contrasts across word boundaries (#): one timing slot for #S, two for #G and S#S, three for S#G and G#S, and four for G#G. The aim is to determine the nature of the mapping between discrete phonological timing units and phonetic durations. Acoustic results show that the largest and most systematic jump in duration occurs between the singleton fricative on the one hand and all the other sequences on the other. Among these sequences, S#S has the same duration as #G. When a geminate is within the sequence, a temporal reorganization is observed: G#S is not significantly longer than S#S and #G, and G#G is only slightly longer than S#G. Instead of a four-way hierarchy, our data point towards a possible upper limit of three-way length contrasts for consonants: S < G = S#S = G#S < S#G = G#G. The interplay of a number of factors resulting in this mismatch between phonological length and phonetic duration is discussed, and a working hypothesis is provided for why duration contrasts are rarely ternary, and almost never quaternary.
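The mismatch between phonological length and phonetic duration can be made explicit with a small lookup. The sketch below encodes only what the abstract states: the number of timing slots per sequence, and the three duration classes implied by the hierarchy S < G = S#S = G#S < S#G = G#G (where S and G stand for #S and #G). The dictionary and function names are hypothetical, and the class numbers are ordinal ranks, not measured durations.

```python
# Phonological length: number of timing slots per sequence, as stated above.
TIMING_SLOTS = {"#S": 1, "#G": 2, "S#S": 2, "S#G": 3, "G#S": 3, "G#G": 4}

# Phonetic duration: ordinal class in the reported three-way hierarchy.
DURATION_CLASS = {"#S": 1, "#G": 2, "S#S": 2, "G#S": 2, "S#G": 3, "G#G": 3}


def mismatched_sequences():
    """Return the sequences whose duration class differs from their slot count."""
    return sorted(
        seq for seq in TIMING_SLOTS if TIMING_SLOTS[seq] != DURATION_CLASS[seq]
    )


phonological_levels = len(set(TIMING_SLOTS.values()))
phonetic_levels = len(set(DURATION_CLASS.values()))
mismatches = mismatched_sequences()
```

Here the four phonological levels collapse into three phonetic ones, and the two sequences that fall short of their slot count are those beginning with a geminate, G#S and G#G, which is where the abstract locates the temporal reorganization.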